Extracting lines from an image to feed to OCR - Tesseract

Tags: image-processing, opencv, tesseract

I was watching this PyCon talk: http://youtu.be/B1d9dpqBDVA?t=15m34s. Around the 15:33 mark, the speaker talks about extracting lines from an image (a receipt) and then feeding them to the OCR engine so that the text can be extracted more accurately.

I have a similar need: I'm passing images to an OCR engine. However, I don't quite understand what he means by extracting lines from an image. What open source tools can I use to extract lines from an image?

474

asked Mar 28 '13 15:03

birdy

1 Answer

Take a look at the technique used to detect the skew angle of text.

<img src="https://i.stack.imgur.com/PTXnL.jpg" width="500" height="200">
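The answer does not spell out how the skew angle is computed. One common approach, assumed here purely for illustration, is to detect long, roughly horizontal line segments running along the text (for example with a probabilistic Hough transform such as OpenCV's `cv2.HoughLinesP`) and average their angles. The sketch below covers only that averaging step in plain Python; the name `estimate_skew_degrees` and the `(x1, y1, x2, y2)` segment format are hypothetical choices for this example, not something taken from the answer.

```python
import math


def estimate_skew_degrees(segments):
    """Average the angle of line segments given as (x1, y1, x2, y2) tuples.

    Image coordinates are assumed (y grows downward), so a positive result
    means the text slopes downward to the right.
    """
    angles = []
    for x1, y1, x2, y2 in segments:
        if (x1, y1) == (x2, y2):
            continue  # a zero-length segment has no direction
        # Order the endpoints left to right so the angle falls in [-90, 90].
        if x2 < x1:
            x1, y1, x2, y2 = x2, y2, x1, y1
        angles.append(math.degrees(math.atan2(y2 - y1, x2 - x1)))
    if not angles:
        raise ValueError("no usable segments")
    return sum(angles) / len(angles)
```

A plain average is only reasonable when the segments are all close to the same direction, which is the usual case for lines of text on a scanned page. Rotating the image by the negative of this angle would then straighten the text before any further processing.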

Groups of lines are used to isolate text in an image (this is the interesting part).

<img src="https://i.stack.imgur.com/Z6d90.jpg" width="500" height="200">

From this result you can easily detect the upper and lower limits of each line of text; the text itself sits between them. I've faced a similar problem before, and the code might be useful to you:

<img src="https://i.stack.imgur.com/b3RlQ.png" width="300" height="350">

All you need to do from here is crop each pair of lines and feed that as an image to Tesseract.
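To make the "find the upper/lower limits, then crop" step concrete, here is a minimal sketch. It is not the code referred to in this answer; it swaps in a simpler, related technique, a horizontal projection profile, which counts the text pixels in every row and treats each run of non-empty rows as one line of text. It assumes the image has already been deskewed and binarized so that text pixels are nonzero. The names `find_text_lines` and `crop_lines` and the `min_ink` and `min_height` parameters are hypothetical.

```python
def find_text_lines(binary, min_ink=1, min_height=1):
    """Return (top, bottom) row ranges that contain text.

    `binary` is a 2D sequence of rows in which text pixels are nonzero and
    background pixels are 0. `bottom` is exclusive, so binary[top:bottom]
    is the crop for one line of text.
    """
    lines = []
    top = None
    for y, row in enumerate(binary):
        ink = sum(1 for px in row if px)  # horizontal projection of this row
        if ink >= min_ink and top is None:
            top = y  # entering a band of text
        elif ink < min_ink and top is not None:
            if y - top >= min_height:
                lines.append((top, y))  # leaving the band
            top = None
    if top is not None and len(binary) - top >= min_height:
        lines.append((top, len(binary)))  # text touches the bottom edge
    return lines


def crop_lines(binary, lines):
    """Cut one sub-image (a sequence of rows) per detected line."""
    return [binary[top:bottom] for top, bottom in lines]


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0],
    ]
    print(find_text_lines(page))  # [(1, 3), (4, 5)]
```

With OpenCV, the binary input could come from `cv2.threshold` on a grayscale image using `cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU`. The same row ranges can then be cut out of the original image and each crop passed to Tesseract, for example through `pytesseract.image_to_string` if that wrapper is installed. Raising `min_ink` or `min_height` is one way to ignore specks of noise that would otherwise be reported as very thin lines.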

120

answered Sep 18 '22 09:09

karlphillip
