I have code using pytesseract that works perfectly, except when the image I try to recognize contains a single digit, 0 to 9. If the image has only one digit, it doesn't give any result.
This is a sample of the images I'm working with: https://drive.google.com/folderview?id=0B68PDhV5SW8BdFdWYVRwODBVZk0&usp=sharing
And this is the code I'm using:
import pytesseract
from PIL import Image  # provides Image.open
# OCR the image; the result is a string
varnum = pytesseract.image_to_string(Image.open('images/table/img.jpg'))
varnum = float(varnum)
print varnum
Thanks!!!!
With this code I'm able to read all the numbers:
import time
import pytesseract
from PIL import Image  # provides Image.open
start_time = time.clock()
y = pytesseract.image_to_string(Image.open('images/table/1.jpg'),config='-psm 10000')
x = pytesseract.image_to_string(Image.open('images/table/1.jpg'),config='-psm 10000')
print y
print x
y = pytesseract.image_to_string(Image.open('images/table/68.5.jpg'),config='-psm 10000')
x = pytesseract.image_to_string(Image.open('images/table/68.5.jpg'),config='-psm 10000')
print y
print x
print time.clock() - start_time, "seconds"
Result:
>>>
1
1
68.5
68.5
0.485644155358 seconds
>>>
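The values printed above are still strings, and the question converts them with float(varnum). A minimal sketch of a safer conversion, assuming the OCR output may be empty or non-numeric (as happened with the single-digit images); parse_ocr_number is a hypothetical helper name, not part of pytesseract:

```python
def parse_ocr_number(text):
    """Convert OCR output to a float, or return None if it is not numeric."""
    cleaned = text.strip()
    if not cleaned:
        # Tesseract recognized nothing, so there is nothing to convert.
        return None
    try:
        return float(cleaned)
    except ValueError:
        # The recognized text is not a valid number.
        return None
```

Calling parse_ocr_number on the string returned by image_to_string avoids the ValueError that float('') would raise when no text is recognized.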
You need to set the page segmentation mode to be able to read single characters or digits.
From the tesseract-ocr manual (tesseract is what pytesseract uses internally), you can set the page segmentation mode using:
-psm N
Set Tesseract to only run a subset of layout analysis and assume a certain form of image. The options for N are:
10 = Treat the image as a single character.
So you should set the -psm option to 10. For example:
varnum = pytesseract.image_to_string(Image.open('images/table/img.jpg'), config='-psm 10')
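One caveat on the flag spelling: the -psm form shown here is the Tesseract 3.x spelling, while Tesseract 4.0 and later use --psm. A small sketch of building the config string for either version; psm_config, read_single_char and the major_version parameter are hypothetical names introduced for illustration, not part of pytesseract:

```python
def psm_config(mode, major_version=3):
    """Build the config string that selects a page segmentation mode."""
    # Tesseract 3.x spells the flag "-psm"; 4.0 and later use "--psm".
    flag = "-psm" if major_version < 4 else "--psm"
    return "{} {}".format(flag, mode)


def read_single_char(path, major_version=3):
    """OCR an image assumed to contain exactly one character (mode 10)."""
    # Imported here so psm_config can be used without the OCR stack installed.
    import pytesseract
    from PIL import Image
    return pytesseract.image_to_string(
        Image.open(path), config=psm_config(10, major_version)
    )
```

For instance, read_single_char('images/table/img.jpg', major_version=4) would pass '--psm 10' to Tesseract, assuming the installed binary is version 4 or newer.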