Programmatically divide scanned images into separate images

Question

To improve OCR quality, I need to preprocess my scanned images. Sometimes a single scan contains several separate components on the page, each at a different angle, for example, a few paper documents scanned at one time.

Is it possible to automatically split such an image into separate images, each containing one logical document, for example with a tool like ImageMagick or something else? Are there any existing solutions or techniques for this problem?

fmw42 · Accepted Answer

alexanoid wrote: I have added another image with scanning artifacts. Will this approach work on such images also?

No, it will not work well, for several reasons. The second image you provided is much larger than the first, so it would need a much larger blur. It is a JPG and has artifacts in it. JPG is not a good format for this, since its lossy compression means that 'constant' regions of the image are not really constant. The blur will pick up your artifacts, and you will need a different threshold to remove some of them. In your case, the top of the image has a good-sized artifact that will get caught as an object. Finally, the bounding boxes of your blurred and thresholded text regions overlap even though the regions themselves do not touch, so one crop may include text from other regions.
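
The bounding-box problem can be illustrated with a small sketch. This is not the ImageMagick implementation, only a pure-Python toy under stated assumptions: the grid, `label_regions`, and `boxes_overlap` are hypothetical names made up for illustration, and `1` marks a pixel that belongs to a thresholded region. Two tilted strips never touch, yet their axis-aligned bounding boxes intersect, so cropping one box also pulls in pixels from the other.

```python
from collections import deque

# Two tilted strips of "region" pixels (1) that never touch each other.
GRID = [
    [1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 1],
    [0, 0, 0, 1, 0, 0],
]

def label_regions(grid):
    """Return one (top, left, bottom, right) box per 8-connected blob of 1s."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited region pixel.
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and grid[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

def boxes_overlap(a, b):
    """True if two inclusive (top, left, bottom, right) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

boxes = label_regions(GRID)
print(boxes)                              # two separate regions
print(boxes_overlap(boxes[0], boxes[1]))  # their boxes still intersect
```

In a case like this, a plain rectangular crop per region is not enough; each region would need to be masked or deskewed before cropping.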

Here is my test command to blur and threshold your image:

convert image.jpg -blur 0x50 -auto-level -threshold 95% -type bilevel tmp.png
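
To make the steps in that command concrete, here is a rough pure-Python sketch of the same sequence: blur, stretch the levels, then threshold at 95%. It is only an approximation under stated assumptions: a simple box (mean) filter stands in for ImageMagick's Gaussian `-blur 0x50`, the one-row `PAGE` is invented toy data, and `box_blur`, `auto_level`, and `threshold` are hypothetical helper names, not ImageMagick APIs.

```python
# One row of a toy grayscale "page": 255 is white paper, 0 is dark ink.
PAGE = [[255, 255, 0, 255, 0, 255, 255, 255, 255]]

def box_blur(img, radius):
    """Mean filter; the window is truncated at the image border.
    A crude stand-in for a Gaussian blur."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total, count = 0.0, 0
            for y in range(max(0, r - radius), min(rows, r + radius + 1)):
                for x in range(max(0, c - radius), min(cols, c + radius + 1)):
                    total += img[y][x]
                    count += 1
            out[r][c] = total / count
    return out

def auto_level(img):
    """Linearly stretch values so the minimum maps to 0.0 and the maximum to 1.0."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    span = (hi - lo) or 1.0  # avoid dividing by zero on a flat image
    return [[(v - lo) / span for v in row] for row in img]

def threshold(img, cutoff):
    """Values above the cutoff become white (1), everything else black (0)."""
    return [[1 if v > cutoff else 0 for v in row] for row in img]

unblurred = threshold(auto_level(PAGE), 0.95)
blurred = threshold(auto_level(box_blur(PAGE, 1)), 0.95)
print(unblurred)  # [[1, 1, 0, 1, 0, 1, 1, 1, 1]] : two separate dark pixels
print(blurred)    # [[1, 0, 0, 0, 0, 0, 1, 1, 1]] : merged into one dark run
```

The blur spreads dark ink into the gaps, so nearby marks merge into a single black region after thresholding. The same effect applies to artifacts: any dark speck is also spread and kept, which is why a noisy image needs a different blur size and threshold.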

