Obviously this image is pretty tough, as it is low clarity and the text is not a real word. However, with this code, I'm not detecting anything close:
import pytesseract
from PIL import Image, ImageEnhance, ImageFilter
image_name = 'NedNoodleArms.jpg'
im = Image.open(image_name)
im = im.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)
im = im.convert('1')
im.save(image_name)
text = pytesseract.image_to_string(Image.open(image_name))
print(text)
outputs
, Mdfiaodfiamms
Any ideas here? The image my contrasting function produces is:
Which looks decent? I don't have a ton of OCR experience. What preprocessing would you recommend here? I've tried resizing the image larger, which helps a little but not enough, along with a bunch of different filters from PIL. Nothing is getting particularly close, though.
Tesseract does various image processing operations internally (using the Leptonica library) before doing the actual OCR. It generally does a very good job of this, but there will inevitably be cases where it isn't good enough, which can result in a significant reduction in accuracy.
You are right, Tesseract works better at higher resolutions, so resizing the image sometimes helps, but don't convert to 1-bit.
I got good results converting to grayscale, making it 3 times as large and making the letters a bit brighter:
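>>> im = Image.open('j78TY.png')
>>> # grayscale first, then upscale 3x (im.size is only available once im exists)
>>> im = im.convert('L').resize([3 * _ for _ in im.size], Image.BICUBIC)
>>> # pixels at or below 75 become 0 (black); brighter pixels are raised by 100
>>> im = im.point(lambda p: p > 75 and p + 100)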
>>> pytesseract.image_to_string(im)
'NedNoodleArms'
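For reuse, the same three steps can be wrapped in a small helper. This is only a sketch: the function names `brighten_lut` and `preprocess_for_ocr` and their parameters are made up for illustration, and the defaults (3x scale, threshold 75, boost 100) are simply the values that worked on this particular image. The lookup table clamps at 255 explicitly rather than relying on how Pillow handles out-of-range values.

```python
def brighten_lut(threshold=75, boost=100):
    """Build a 256-entry lookup table for an 8-bit grayscale image.

    Pixels at or below `threshold` map to 0 (black); brighter pixels
    are raised by `boost` and clamped to 255.
    """
    return [0 if p <= threshold else min(p + boost, 255) for p in range(256)]


def preprocess_for_ocr(path, scale=3, threshold=75, boost=100):
    """Grayscale, upscale, and brighten an image before OCR (hypothetical helper)."""
    # Imported here so brighten_lut can be used and tested without Pillow.
    from PIL import Image

    im = Image.open(path).convert('L')  # 8-bit grayscale, not 1-bit
    im = im.resize((im.width * scale, im.height * scale), Image.BICUBIC)
    # Image.point accepts a 256-entry table for single-band 8-bit images.
    return im.point(brighten_lut(threshold, boost))
```

The result can be passed straight to `pytesseract.image_to_string(preprocess_for_ocr('j78TY.png'))`. The threshold and boost will probably need tuning for other images.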
Check this jupyter notebook: