How to detect Text Area from image?

Tags: c++, image-processing, text-extraction, tesseract

I want to detect the text areas in an image as a preprocessing step for the Tesseract OCR engine. The engine works well when the input contains only text, but it fails when the image also contains non-text content, so I want to pick out only the text content of the image. Any idea of how to do that would be helpful. Thanks.

834

asked Apr 18 '12 09:04

chostDevil

3 Answers

Well, I'm not very experienced in image processing, but I hope my theoretical approach can help.

In most cases, text forms parallel, horizontal rows, and the space between the rows contains mostly background pixels. This can be used to solve the problem. If you collapse all the pixel columns of the image into one (for example, by averaging each row), you get a 1-pixel-wide image as output. When the input image contains text, the output will very likely show a periodic pattern in which dark areas are followed by brighter areas repeatedly. The darker "groups" of pixels indicate the position of the text content, while the brighter "groups" indicate the gaps between the individual rows. You'll probably find that the brighter areas are much smaller than the others. Text is much more regular than most other picture elements, so it should be easy to separate.
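As a rough illustration of the row-profile idea, here is a minimal C++ sketch. It assumes a grayscale image stored as rows of 8-bit pixels with dark text on a light background. The names GrayImage, RowBand, rowProfile and darkBands, as well as the brightness threshold, are made up for this example and do not come from the answer or from any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Grayscale image stored row by row; 0 = black, 255 = white (assumed layout).
using GrayImage = std::vector<std::vector<std::uint8_t>>;

// A run of consecutive rows [first, last] whose mean brightness is below the threshold.
struct RowBand {
    std::size_t first;
    std::size_t last;
};

// Collapse all columns into a one-pixel-wide profile: mean brightness per row.
std::vector<double> rowProfile(const GrayImage& img) {
    std::vector<double> profile;
    profile.reserve(img.size());
    for (const auto& row : img) {
        double sum = 0.0;
        for (std::uint8_t px : row) sum += px;
        profile.push_back(row.empty() ? 255.0 : sum / static_cast<double>(row.size()));
    }
    return profile;
}

// Group consecutive dark rows into bands; each band is a candidate text row.
std::vector<RowBand> darkBands(const std::vector<double>& profile, double threshold) {
    std::vector<RowBand> bands;
    bool inBand = false;
    std::size_t start = 0;
    for (std::size_t y = 0; y < profile.size(); ++y) {
        const bool dark = profile[y] < threshold;
        if (dark && !inBand) { inBand = true; start = y; }
        if (!dark && inBand) { inBand = false; bands.push_back({start, y - 1}); }
    }
    if (inBand) bands.push_back({start, profile.size() - 1});
    return bands;
}
```

Calling darkBands(rowProfile(img), threshold) would give one band per candidate text row. A suitable threshold depends on the image and would need tuning.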

You have to implement a procedure to detect these periodic recurrences. Once the script can determine that the input picture has these characteristics, there's a high chance that it contains text. (However, this approach can't distinguish between actual text and simple horizontal stripes...)
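One simple way to test for such a recurrence, sketched here in C++, is to measure how evenly the detected rows are spaced. The function below takes the vertical centres of the candidate rows and checks whether the spread of the distances between neighbours is small relative to their mean. The name looksPeriodic and both default parameters are assumptions chosen for illustration. An autocorrelation of the profile would be another option.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true when the given row centres (sorted top to bottom) are spaced
// almost evenly. minLines and maxRelativeSpread are illustrative defaults.
bool looksPeriodic(const std::vector<double>& centres,
                   std::size_t minLines = 3,
                   double maxRelativeSpread = 0.2) {
    if (centres.size() < minLines || centres.size() < 2) return false;

    // Distances between neighbouring rows (the line pitch).
    std::vector<double> pitches;
    for (std::size_t i = 1; i < centres.size(); ++i)
        pitches.push_back(centres[i] - centres[i - 1]);

    double mean = 0.0;
    for (double p : pitches) mean += p;
    mean /= static_cast<double>(pitches.size());
    if (mean <= 0.0) return false;

    double variance = 0.0;
    for (double p : pitches) variance += (p - mean) * (p - mean);
    variance /= static_cast<double>(pitches.size());

    // Coefficient of variation: standard deviation relative to the mean pitch.
    return std::sqrt(variance) / mean <= maxRelativeSpread;
}
```

As the answer notes, evenly spaced horizontal stripes would pass this check just as well as text does.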

For the next step, you must find a way to determine the boundaries of the paragraphs using the method described above. I'm thinking of a fairly naive algorithm, which would divide the input image into smaller, narrow stripes (50-100 px wide) and check these areas separately. Then it would compare the results to build a map of the areas that are likely filled with text. This method wouldn't be very accurate, but that probably won't bother the OCR system.
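The stripe idea could be sketched in C++ roughly as follows. The image is cut into vertical strips of a fixed width, and each strip gets its own row profile, which is thresholded into a per-strip, per-row flag. The function name stripTextMap, the map layout and the threshold are assumptions made for this sketch. It also assumes that all image rows have the same length.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Rows of 8-bit pixels, 0 = black; all rows are assumed to have equal length.
using GrayImage = std::vector<std::vector<std::uint8_t>>;

// textMap[s][y] is true when row y is darker than the threshold inside strip s.
std::vector<std::vector<bool>> stripTextMap(const GrayImage& img,
                                            std::size_t stripWidth,
                                            double threshold) {
    std::vector<std::vector<bool>> textMap;
    if (img.empty() || img[0].empty() || stripWidth == 0) return textMap;

    const std::size_t width = img[0].size();
    for (std::size_t x0 = 0; x0 < width; x0 += stripWidth) {
        const std::size_t x1 = std::min(x0 + stripWidth, width);
        std::vector<bool> column(img.size(), false);
        for (std::size_t y = 0; y < img.size(); ++y) {
            // Mean brightness of row y restricted to the strip [x0, x1).
            double sum = 0.0;
            for (std::size_t x = x0; x < x1; ++x) sum += img[y][x];
            column[y] = (sum / static_cast<double>(x1 - x0)) < threshold;
        }
        textMap.push_back(column);
    }
    return textMap;
}
```

Each strip's flags could then be run through a periodicity test before the strip is accepted as text, as described in the previous step.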

And finally, you need to use the text map to run the OCR on the desired locations only.
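To hand the map to an OCR engine, it has to become pixel rectangles. The C++ sketch below merges vertically adjacent flagged rows into one region and spans it across every strip that was flagged in those rows. The Region struct, regionsFromMap and the textMap[strip][row] layout are hypothetical names for this example. With Tesseract, each rectangle could presumably be passed to TessBaseAPI::SetRectangle(left, top, width, height) before recognition, but that call is not part of the sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Pixel rectangle in (left, top, width, height) form.
struct Region {
    int left;
    int top;
    int width;
    int height;
};

// textMap[s][y] == true means strip s looks like text at row y (hypothetical layout).
std::vector<Region> regionsFromMap(const std::vector<std::vector<bool>>& textMap,
                                   int stripWidth, int imageWidth) {
    std::vector<Region> regions;
    if (textMap.empty()) return regions;
    const std::size_t rows = textMap[0].size();
    const int strips = static_cast<int>(textMap.size());

    bool open = false;
    int top = 0, minStrip = 0, maxStrip = 0;
    // y == rows acts as a sentinel row that closes a region still open at the bottom.
    for (std::size_t y = 0; y <= rows; ++y) {
        int first = -1, last = -1;
        if (y < rows) {
            for (int s = 0; s < strips; ++s) {
                if (textMap[s][y]) { if (first < 0) first = s; last = s; }
            }
        }
        if (first >= 0) {
            if (!open) { open = true; top = static_cast<int>(y); minStrip = first; maxStrip = last; }
            else { minStrip = std::min(minStrip, first); maxStrip = std::max(maxStrip, last); }
        } else if (open) {
            open = false;
            const int left = minStrip * stripWidth;
            const int right = std::min((maxStrip + 1) * stripWidth, imageWidth);
            regions.push_back({left, top, right - left, static_cast<int>(y) - top});
        }
    }
    return regions;
}
```

This produces roughly one rectangle per text row. Merging neighbouring rows into paragraph-sized blocks would be a natural next step.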

On the other hand, this method would fail if the input text is rotated by more than ~3-5 degrees. There's another drawback: if you have only a few rows, your pattern search will be very unreliable. More rows, more accuracy...

Regards, G.

167

answered Sep 21 '22 02:09

Gergely Lukacsy

Take a look at this bounding box technique demonstrated with OpenCV code:

Input: [image]

Eroded: [image]

Result: [image]
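The OpenCV code this answer refers to is not reproduced on this page, so the following is only a plain C++ sketch of one plausible reading of the technique: erode a binary mask so that small specks disappear, then take the bounding box of whatever foreground remains. It uses no OpenCV calls. Mask, Box, erode3x3 and boundingBox are names invented for this example, and the 3x3 kernel is an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Mask = std::vector<std::vector<bool>>;  // true = foreground (ink) pixel

struct Box {
    int left, top, right, bottom;  // inclusive pixel coordinates
    bool empty;                    // true when the mask has no foreground
};

// 3x3 binary erosion: a pixel survives only if it and all 8 neighbours are set.
// Pixels on the image border are cleared.
Mask erode3x3(const Mask& in) {
    const std::size_t h = in.size();
    const std::size_t w = h ? in[0].size() : 0;
    Mask out(h, std::vector<bool>(w, false));
    for (std::size_t y = 1; y + 1 < h; ++y) {
        for (std::size_t x = 1; x + 1 < w; ++x) {
            bool keep = true;
            for (std::size_t dy = 0; dy < 3 && keep; ++dy)
                for (std::size_t dx = 0; dx < 3 && keep; ++dx)
                    keep = in[y + dy - 1][x + dx - 1];
            out[y][x] = keep;
        }
    }
    return out;
}

// Smallest axis-aligned box containing every foreground pixel.
Box boundingBox(const Mask& mask) {
    Box box{0, 0, 0, 0, true};
    for (std::size_t y = 0; y < mask.size(); ++y) {
        for (std::size_t x = 0; x < mask[y].size(); ++x) {
            if (!mask[y][x]) continue;
            const int xi = static_cast<int>(x), yi = static_cast<int>(y);
            if (box.empty) { box = {xi, yi, xi, yi, false}; continue; }
            box.left = std::min(box.left, xi);
            box.right = std::max(box.right, xi);
            box.top = std::min(box.top, yi);
            box.bottom = std::max(box.bottom, yi);
        }
    }
    return box;
}
```

With OpenCV available, library routines for erosion and for bounding rectangles would replace both hand-written loops.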

answered Sep 20 '22 02:09

karlphillip

I am new to stackoverflow.com, but I wrote an answer to a similar question that may be useful to readers who share this one. Whether or not that question is actually a duplicate I'll leave up to others, since this one was asked first. If I should copy and paste that answer here, let me know. I also found this question on Google before the one I answered, so a link here may benefit more people, especially since that answer describes several different ways of finding text areas. For me, when I looked up this question, it did not fit my problem case.

Detect text area in an image using python and opencv

answered Sep 21 '22 02:09

prijatelj

