Python OCR: Converting Scanned Image Into Text For Processing

I am trying to create a Python application for marking multiple-choice answer papers. The answer sheet will be scanned into an image file (GIF, PNG, JPG, or whichever format is needed).

My app has access to the database where all the answers are stored.

So all it needs is some kind of data from the scanned image so that it can compare the answers and calculate the marks.

The answer sheet has fixed dimensions and a table format like this (candidates mark their answers with an 'X'):

[image: sample answer sheet table]

After searching the internet, I found that there are a few OCR APIs available.

The first one is Pytesser. It is very easy to use and the results are quite okay, but it only works for images that contain nothing but text. So I think it is not suitable.

The second one I found is OCRopus. It seems powerful, but its documentation says:

Windows

OCRopus relies a lot on POSIX path names and file systems. You may be able to install OCRopus on Windows using . An easier way is to install VirtualBox and run OCRopus in Ubuntu under VirtualBox.

So I think it is mostly for Linux. I could not find a detailed installation guide for the Windows platform. (I am a beginner, so I could be wrong.)

The third one I found is python-tesseract, a wrapper for Tesseract OCR. Their page provides an installation guide. Basically, I need:

python-tesseract-win32.deb
python-opencv
numpy

But I have no clue how to install .deb files on Windows. I already have OpenCV and NumPy installed.

So the following are my questions:

(1) How can I convert the table image into processable data (is it even possible)?

(2) Are there any other useful OCR APIs that I have not mentioned here that could be helpful?

(3) Finally (my silly idea), is it possible to split the image into small chunks (based on the size of the table cells, since the table dimensions are known) using PIL, use pytesser to convert each small image into text, and then process the data accordingly?

FYI: I only need it for the Windows platform, possibly Windows XP 32-bit. I am using Python 2.7.5.

asked Nov 20 '13 12:11

Chris Aung

1 Answer

Answers correspond to your numbers.

1) OCR is in general very hard, but (good news for you) for test score processing I think it is nearly a solved problem: there are tried-and-true solutions. School systems have been automating the grading of 'Scantron' tests for years, so if you have access to such resources, going that route might be your best bet. At the very least, you should check how they do it.

2) I am sure there are others, but those are the main free ones I know of.

3)a I think if you are trying to do this on a budget and time is less of an issue, your 'silly' idea is actually not silly at all. It might be the best way to do it, and it is likely that the Scantron test graders use a similar method. You know the exact dimensions of the test form, so you can work out the direct pixel mapping of where to look. You could use pytesser very easily. Keep in mind that pytesser sometimes needs you to resize the image (sometimes up, sometimes down) to get the best accuracy.
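A minimal sketch of the pixel mapping mentioned in 3)a, assuming the table is a regular grid whose top-left corner, cell size, and row/column counts are known from the form layout. The function names (`cell_boxes`, `scaled_size`) and all the numbers are hypothetical, chosen only for illustration; real scans would also need to be aligned first.

```python
def cell_boxes(origin_x, origin_y, cell_w, cell_h, n_rows, n_cols):
    """Map (row, col) to a crop box (left, upper, right, lower) in pixels.

    Assumes a regular grid: every cell has the same width and height,
    and (origin_x, origin_y) is the top-left corner of the first cell.
    """
    boxes = {}
    for row in range(n_rows):
        for col in range(n_cols):
            left = origin_x + col * cell_w
            upper = origin_y + row * cell_h
            boxes[(row, col)] = (left, upper, left + cell_w, upper + cell_h)
    return boxes


def scaled_size(box, factor):
    """Return the (width, height) of a box after scaling by `factor`.

    Useful when the OCR step works better on enlarged or reduced cells.
    """
    left, upper, right, lower = box
    width = max(1, int(round((right - left) * factor)))
    height = max(1, int(round((lower - upper) * factor)))
    return (width, height)


# Hypothetical layout: 3 questions (rows) x 4 choices (columns).
boxes = cell_boxes(origin_x=100, origin_y=200, cell_w=50, cell_h=40,
                   n_rows=3, n_cols=4)
```

Each box uses the same (left, upper, right, lower) convention as PIL's `Image.crop`, so a cell can be cut out with `image.crop(boxes[(row, col)])` and, if needed, resized with `Image.resize(scaled_size(box, factor))` before it is handed to the OCR step.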

3)b You might want to consider rolling your own solution. You could use morphological operations (numpy and other image libraries can do this nearly out of the box). You might not even need these operators: you could simply apply a binary threshold to each table row (assuming you have already cut the image into table rows), look for blobs, and take the answer as the column with the most blob pixels.
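The thresholding idea in 3)b could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested grader: each cell is taken to be a 2-D grid of greyscale values (0 = black, 255 = white) with the table borders already cropped away, and the helper names, the threshold of 128, and the `min_dark` cutoff are all hypothetical. It is written in plain Python for clarity; the same counting can be vectorized with numpy.

```python
def dark_pixel_count(cell, threshold=128):
    """Count pixels darker than `threshold` in a 2-D grid of grey values."""
    return sum(1 for row in cell for value in row if value < threshold)


def marked_column(row_cells, threshold=128, min_dark=1):
    """Return the index of the cell with the most dark pixels.

    Returns None when the row is empty or when no cell reaches
    `min_dark` dark pixels (treated here as "no answer given").
    """
    counts = [dark_pixel_count(cell, threshold) for cell in row_cells]
    if not counts:
        return None
    best = max(range(len(counts)), key=counts.__getitem__)
    return best if counts[best] >= min_dark else None


def score(detected, answer_key):
    """Count questions where the detected column matches the answer key."""
    return sum(1 for found, expected in zip(detected, answer_key)
               if found is not None and found == expected)


# Tiny synthetic row: four 4x4 cells, only the second one carries a mark.
blank = [[255] * 4 for _ in range(4)]
marked = [[255, 0, 0, 255],
          [0, 255, 255, 0],
          [255, 0, 0, 255],
          [255, 255, 255, 255]]
answer = marked_column([blank, marked, blank, blank], min_dark=3)
```

Setting `min_dark` above the noise level of a blank cell keeps stray specks from being read as an answer. A row where two cells both exceed it probably deserves manual review, which this sketch does not attempt.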

answered Sep 26 '22 01:09

Paul

Tags:

python

python-2.7

python-imaging-library

ocr

tesseract
