I am using a combination of pyautogui and pytesseract to capture small regions on the screen and then pull the number/text out of the region. I have written script that has read the majority of captured images perfectly, but single digit numbers seem to cause an issue for it. For example small regions of an image containing numbers are saved to .png files the numbers 11, 14, and 18 were pulled perfectly, but the number 7 is just returning as a blank string.
Question: What could be causing this to happen?
Code: Scaled down drastically to make it every easy to follow:
def get_text(image):
return pytesseract.image_to_string(image)
answer2 = pyautogui.screenshot('answer2.png',region=(727, 566, 62, 48))
img = Image.open('answer2.png')
answer2 = get_text(img)
This code is repeated 4 times, once for each image, it worked for 11,14,18 but not for 7.
Just to slow the files being read here is a screenshot of the images after they were saved through the screenshot command.
https://gyazo.com/0acbf5be2d970abeb29561113c171fbe
here is a screenshot of what I am working from:
https://gyazo.com/311913217a1302382b700b07ad3e3439
I found question Why pytesseract does not recognise single digits? and in comments I found option --psm 6
.
I checked tesseract
with option --psm 6
and it can recognize single digit on your image.
tesseract --psm 6 number-7.jpg result.txt
I checked pytesseract.image_to_string()
with option config='--psm 6'
and it can recognize single digit on your image too.
#!/usr/bin/env python3
from PIL import Image
import pytesseract
img = Image.open('number-7.jpg')
print(pytesseract.image_to_string(img, config='--psm 6'))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With