Tesseract doesn't seem to work with digits

Tags:

tesseract

I followed the FAQ to make Tesseract recognize digits, but all I get is a bunch of text in the output file, despite having only numbers in my image.

My command line looks like this:

tesseract --tessdata-dir ./ ./input.jpg ./output/output digits

Any ideas what could be happening?

Artemix asked Dec 10 '25 16:12


1 Answer

As mentioned in a Tesseract GitHub issue, you can't blacklist or whitelist characters with the Tesseract 4.0 LSTM engine. Instead, you should train the LSTM model on the characters you expect in your image.

Thanks to Shreeshrii, you can try his 'experimental' digits traineddata from here

Please note that Tesseract 4.0 is still in alpha. If you prefer, you can still use the 3.* versions of Tesseract, which support what you need out of the box. Tesseract v 3.4 tessdata is located here, and the library for Windows can be downloaded from here
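To make the two options concrete, here is a minimal Python sketch (not part of the original answer) that builds the command line for each case. It relies on the fact that a trailing `digits` argument is a config file name, while a digits-only model is selected as a language with `-l`. The helper names and the model name `digits_lstm` are hypothetical; use the actual file name (without `.traineddata`) of whatever model you download.

```python
import subprocess


def build_tesseract_command(image_path, output_base, tessdata_dir=None,
                            lang=None, configs=()):
    """Return the argv list for a tesseract run (hypothetical helper).

    configs are config file names such as "digits"; they go last.
    lang is the traineddata name passed to -l.
    """
    cmd = ["tesseract"]
    if tessdata_dir is not None:
        cmd += ["--tessdata-dir", tessdata_dir]
    cmd += [image_path, output_base]
    if lang is not None:
        cmd += ["-l", lang]
    cmd += list(configs)
    return cmd


def run_tesseract(cmd, output_base):
    """Run the command and return the recognized text.

    Tesseract writes its result to <output_base>.txt.
    """
    subprocess.run(cmd, check=True)
    with open(output_base + ".txt", encoding="utf-8") as handle:
        return handle.read()


# Option 1: a 3.* install, using the stock "digits" config as in the question.
legacy_cmd = build_tesseract_command(
    "./input.jpg", "./output/output", tessdata_dir="./", configs=["digits"])

# Option 2: a 4.0 install with a digits-only model.
# "digits_lstm" is an assumed name, standing in for the downloaded traineddata.
lstm_cmd = build_tesseract_command(
    "./input.jpg", "./output/output", tessdata_dir="./", lang="digits_lstm")
```

Calling `run_tesseract(legacy_cmd, "./output/output")` would then run the first variant and return the text. In both cases the directory given to `--tessdata-dir` has to contain the files the command refers to (the `digits` config or the traineddata file).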

Dmitrii Z. answered Dec 16 '25 12:12


