Make text in image thinner for OCR

I'm making an automated text recognition script with Python on Ubuntu.

I'm using GOCR and the recognition accuracy is too low.

Example (input image not reproduced here):

Output: _O4_4E34E_4_O4_

I suspect that the type in the image is too bold, so I'm asking whether there is a way to make it thinner using a Python library or a Linux command.

Ghilas BELHADJ asked Jan 13 '14 08:01



1 Answer

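You will probably need to apply a morphological operation such as "erosion" to your image, e.g. by using OpenCV. This makes structures thinner, though at the cost of some quality: thin strokes can break apart or disappear. Note that OpenCV's erosion shrinks bright regions, so for dark text on a light background you would invert the image first (or use dilation instead).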

Look here: https://opencv-python-tutroals.readthedocs.org/en/latest/py_tutorials/py_imgproc/py_morphological_ops/py_morphological_ops.html

synthomat answered Jan 02 '23 13:01