World: r3wp
[View] discuss view related issues
Anton 19-Apr-2008 [7656x2] | All white does not do any cropping. It didn't find any "content" (non-white pixels) to crop to. What do you want it to do ? |
If there are two words above and below in the image, like this: line one / line two / middle / line four / line five, then the 3 middle lines will be included, unless there is some white above the top line or some white below the bottom line (in which cases they will be included, respectively). | |
Graham 19-Apr-2008 [7658] | Hmm.... return none? |
Anton 19-Apr-2008 [7659] | Let me add that to the to-do list... Is this a common case, by the way ? |
Graham 19-Apr-2008 [7660x3] | Yes |
Let me send you some images ... that it appears to have failed on. | |
Ok, sent. Some don't have the whitespace cropped at the top. | |
Anton 19-Apr-2008 [7663] | Currently the algorithm scans downwards and upwards simultaneously, looking for non-white content. When it doesn't find any, it has nowhere to crop to, so no cropping happens. I can change it so that when the scans bump into each other they set that as the "content found" position, and, the scan lines being right next to each other, will result in a 0-height crop region. I will check for that case and return none instead. |
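
The two-ended scan described here can be sketched roughly as follows. This is a Python illustration, not Anton's REBOL code, and it leaves out the junk weighting he mentions later. The name `find_crop_rows`, the list-of-rows image representation and the `WHITE` constant are assumptions made for the example.

```python
WHITE = 255  # assumption: 8-bit greyscale pixels, 255 means white


def find_crop_rows(image):
    """Scan from the top and the bottom at the same time.

    `image` is a list of rows, each row a list of pixel values.
    Returns (top, bottom) inclusive row indices bounding the non-white
    content, or None when the two scans pass each other (blank image).
    """
    top, bottom = 0, len(image) - 1
    top_found = bottom_found = False
    while top <= bottom and not (top_found and bottom_found):
        if not top_found:
            if any(p != WHITE for p in image[top]):
                top_found = True
            else:
                top += 1
        if not bottom_found:
            if any(p != WHITE for p in image[bottom]):
                bottom_found = True
            else:
                bottom -= 1
    if top > bottom:
        # The scans bumped into each other without finding content.
        return None
    return top, bottom
```

Returning `None` for the blank case means the caller has to check the result before cropping, which is the situation that comes up later in the thread.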
Graham 19-Apr-2008 [7664] | Some still have rubbish left at the bottom. |
Anton 19-Apr-2008 [7665x3] | The first one shows this result, indeed. Let me analyse... |
I understand the bug in my code. I did not implement the weighting quite correctly. | |
hmm.. more issues... it's complex when you want to scan from top and from bottom simultaneously. | |
Graham 19-Apr-2008 [7668] | Anton, I found that the OCR engine I am using needs a white space border, so I am padding the image back again with a little white space. |
Anton 19-Apr-2008 [7669] | That would help my algorithm. Text which is right up against the edge is likely to be classified as 'junk'. When there is text at the top edge and text at the bottom edge only, then we have two possibly 'content' texts. But which one is the content and which is the junk ? The algorithm is forced to either make a choice (which it could do by choosing the larger one), or not choose at all (which is what currently happens), so both are included as the 'content'. If you put just one line of white outside the text you consider 'content' then it will be surrounded by white and the algorithm will select it as 'content'. |
Graham 19-Apr-2008 [7670] | I would always select the larger ... |
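
One way to picture "select the larger" is to split the image into unbroken runs of non-white rows and keep the tallest. This is a simplified Python sketch under stated assumptions, not the rule the actual script uses: "larger" is taken to mean "more rows", each unbroken run counts as one candidate, and the names `content_runs`, `largest_run` and `WHITE` are hypothetical.

```python
WHITE = 255  # assumption: 8-bit greyscale pixels, 255 means white


def content_runs(image):
    """Return inclusive (start, end) row ranges of consecutive non-white rows."""
    runs, start = [], None
    for y, row in enumerate(image):
        has_ink = any(p != WHITE for p in row)
        if has_ink and start is None:
            start = y                      # a new run begins
        elif not has_ink and start is not None:
            runs.append((start, y - 1))    # the run ended on the previous row
            start = None
    if start is not None:
        runs.append((start, len(image) - 1))
    return runs


def largest_run(image):
    """Pick the tallest run as the 'content'; None if the image is blank."""
    runs = content_runs(image)
    if not runs:
        return None
    return max(runs, key=lambda r: r[1] - r[0])
```

On a tie `max` keeps the first (topmost) run; a real implementation might prefer a different tie-break.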
Anton 20-Apr-2008 [7671x2] | Rewritten algorithm (selects the larger now). load-thru/update these two: http://anton.wildit.net.au/rebol/gfx/auto-crop-bitmap-text.r http://anton.wildit.net.au/rebol/gfx/demo-auto-crop-bitmap-text.r And download this new test script: http://anton.wildit.net.au/rebol/gfx/test-auto-crop-bitmap-text.r |
You can fiddle with the last script to make it load your 6 test files (which all yield correct looking results). | |
Graham 20-Apr-2008 [7673x2] | Cool. |
If the region is blank, your scan routine returns none, and then the crop errors. | |
Anton 20-Apr-2008 [7675] | Oops, forgot the simplest input. |
Anton 21-Apr-2008 [7676x2] | I've fixed that oversight. Update these files: auto-crop-bitmap-text.r and test-auto-crop-bitmap-text.r |
The above update also cleans up loose words in the auto-crop-bitmap-text.r file. | |
Graham 21-Apr-2008 [7678x2] | I added a /pad option to mine so that it returns the text with a white space border. |
which is needed for some ocr engines | |
Anton 21-Apr-2008 [7680x5] | /border makes more sense, doesn't it ? |
maybe not... | |
Updated auto-crop-bitmap-text.r: removed old code and comments (file is 3.5k smaller). | |
Updated auto-crop-bitmap-text.r again: replaced old comments with new ones. | |
Hmm.. I think the image padding might be outside the responsibility of an auto-crop function. Its job is to remove stuff, not add. It's probably better to write a small generalised function to do the padding (which could be useful elsewhere) and just feed the result of the auto-crop to it. | |
Graham 21-Apr-2008 [7685x2] | you're probably right |
though it could be auto-crop to as it were. | |
Anton 21-Apr-2008 [7687] | I think if you're going to make an "all-in-one" function, then its name should reflect that, e.g.
    crop-and-pad-ready-for-ocr: func [image][
        pad-image auto-crop-bitmap-text image 1x1
    ]
(where pad-image adds a 1x1 white border around the cropped image.) |
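
A small general-purpose padding helper of the kind suggested here might look like this in Python. It is only a sketch: `pad_image`, its `border` and `fill` parameters and the list-of-rows image representation are assumptions for illustration, and a single `border` width stands in for the 1x1 pair in the REBOL example.

```python
WHITE = 255  # assumption: 8-bit greyscale pixels, 255 means white


def pad_image(image, border=1, fill=WHITE):
    """Return a new image with `border` pixels of `fill` added on every side."""
    width = len(image[0]) if image else 0
    blank = [fill] * (width + 2 * border)
    padded = [list(blank) for _ in range(border)]        # top border rows
    for row in image:
        padded.append([fill] * border + list(row) + [fill] * border)
    padded.extend(list(blank) for _ in range(border))    # bottom border rows
    return padded
```

Keeping the padding separate means the crop routine only ever removes pixels, and the same helper can be reused wherever a border is needed.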
Graham 21-Apr-2008 [7688x2] |
    auto-crop-bitmap-text: func [
        "Returns a cropped image, or none if the input image was blank"
        image [image!]
        /local region
    ][
        if region: find-bitmap-text-crop-region image [
            copy/part skip image region/1 region/2 ; return a cropped image
        ]
    ]
Looking at this, it appears to return unset! if region is none! |
How about this:
    auto-crop-bitmap-text: func [
        "Returns a cropped image, or none if the input image was blank"
        image [image!]
        /local region
    ][
        all [
            region: find-bitmap-text-crop-region image
            region: copy/part skip image region/1 region/2
        ]
        region
    ]
| |
Anton 21-Apr-2008 [7690x2] | IF returns none when given false. |
And your code redefines the meaning of 'region (which by itself is bad because it can cause confusion later) unnecessarily. I could rewrite it more simply:
    all [
        region: find-bitmap-text-crop-region image
        copy/part skip image region/1 region/2
    ]
but that's just equivalent to my IF above. | |
Graham 21-Apr-2008 [7692x3] | Ah ... ok. |
I think it would be nice now to have the crop work on the sides as well. | |
Would it be hard to write a deskewing function? basically I guess one finds a best fit horizontal line for the base of the text one finds, and then returns the angle needed to deskew it. | |
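
The "best fit line, then return the angle" idea can be sketched with an ordinary least-squares fit. This Python example assumes the baseline points (for instance, the bottom-most dark pixel in each sampled column) have already been found somehow; that step, and the name `skew_angle_degrees`, are hypothetical and not part of any script mentioned in this thread.

```python
import math


def skew_angle_degrees(points):
    """Fit a least-squares line through (x, y) points and return its angle.

    The angle is measured from the x axis, in degrees. With image
    coordinates (y grows downwards) a positive angle means the baseline
    drops towards the right of the picture.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two baseline points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("points are vertically aligned; slope is undefined")
    # slope = sxy / sxx, so atan2(sxy, sxx) is the angle of the fitted line
    return math.degrees(math.atan2(sxy, sxx))
```

Rotating the image by the negative of this angle would level the baseline, which requires a rotation that accepts arbitrary angles.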
Anton 22-Apr-2008 [7695] | Is it really skew or do you mean rotate ? |
Graham 22-Apr-2008 [7696x2] | It's normally called skew but it's the same. |
To crop the right and left edges I think we can just rotate the image 90 deg, crop it with the existing routines and then rotate it back again | |
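
The rotate, crop, rotate-back trick can be illustrated in Python on a list-of-rows image. This is a sketch under assumptions: `crop_rows` is a deliberately naive stand-in for the real top/bottom crop routine (it just strips blank rows), and all the function names and the `WHITE` constant are hypothetical.

```python
WHITE = 255  # assumption: 8-bit greyscale pixels, 255 means white


def rotate_cw(image):
    """Rotate a list-of-rows image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotate_ccw(image):
    """Rotate a list-of-rows image 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*image)][::-1]


def crop_rows(image):
    """Naive vertical crop: drop blank rows above and below the content."""
    rows = [y for y, row in enumerate(image) if any(p != WHITE for p in row)]
    if not rows:
        return None  # blank image, nothing to crop to
    return image[rows[0]:rows[-1] + 1]


def crop_all_sides(image):
    """Crop top/bottom, then reuse the same routine for left/right."""
    cropped = crop_rows(image)
    if cropped is None:
        return None
    turned = crop_rows(rotate_cw(cropped))  # columns are now rows
    return rotate_ccw(turned)               # restore the orientation
```

Where only clockwise quarter turns are available, three of them in a row give the same result as `rotate_ccw`.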
Reichart 22-Apr-2008 [7698] | Smart... |
Anton 22-Apr-2008 [7699x2] | I was going to say that modifying the code to support horizontal cropping should be pretty easy. But that method is even easier ! |
(if a bit hackish.) | |
Graham 22-Apr-2008 [7701] | Seems to work ... but I have to rotate 270 deg and not -90 to get the original orientation back. Does effect not take a negative rotation? |
Geomol 22-Apr-2008 [7702] | Seems to only do 0, 90, 180 and 270: http://www.rebol.com/docs/view-system.html#section-9.5 |
Graham 22-Apr-2008 [7703] | Ahh.. that's no fun. So, I can't use this to deskew. |
Geomol 22-Apr-2008 [7704] | Do you know how to use DRAW to do image transformations? |
Graham 22-Apr-2008 [7705] | nope |