Different Results with tesseract on same image

Question

Hello, I am trying to run OCR on an image.

enter image description here

This is the original image. After some preprocessing (I'm skipping that part since it's not really related to my question, but I will share it if somebody needs it), I get this image:

enter image description here

When I run OCR on this image with Tesseract, I get this result:

HN'

2809

However, when I manually crop out half of the image in Photoshop,

enter image description here

I receive

HN'

Z8

as a result.

I wonder what the difference between those two images is, because one gives 2 instead of Z while the other gives Z.

I know I have to smooth the edges for more accurate results, but neither motion blur, Gaussian blur, nor an ordinary blur filter changed the results I'm getting.

karlphillip · Accepted Answer

Tesseract implements an algorithm that picks the digit 2 over the letter Z based on the number and type of characters in the neighbourhood:

In the first image, it guesses 2 rather than Z because its neighbours are all digits (809), so it assumes that the first character must also be a digit.

I had this problem before. :(
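One possible workaround for this kind of context-based guessing is to OCR each text line on its own and tell Tesseract which characters are allowed. The sketch below is only an illustration and not part of the original answer: it assumes the `pytesseract` wrapper and Pillow are installed, the helper names `build_config` and `ocr_region` are hypothetical, and how strictly `tessedit_char_whitelist` is honoured may depend on the Tesseract version and OCR engine.

```python
from typing import Optional, Tuple


def build_config(psm: int = 7, whitelist: Optional[str] = None) -> str:
    """Build a Tesseract option string for pytesseract's `config` argument.

    Page segmentation mode 7 asks Tesseract to treat the image as a
    single text line.
    """
    parts = [f"--psm {psm}"]
    if whitelist:
        # Restrict recognition to the given characters.
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def ocr_region(image, box: Tuple[int, int, int, int],
               whitelist: Optional[str] = None) -> str:
    """Crop `box` (left, upper, right, lower) from a PIL image and OCR it."""
    import pytesseract  # imported lazily so build_config works without it

    region = image.crop(box)
    text = pytesseract.image_to_string(region, config=build_config(7, whitelist))
    return text.strip()
```

If the format of a line is known in advance (for example, that its first character is always a letter), that character could be cropped and recognised separately with a letters-only whitelist, so the surrounding digits cannot influence the guess.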

By the way, I think you should flip the first part of the image so HN' becomes .NH.
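Reading "flip" here as a 180° rotation (an assumption: that is the transform that would turn HN' into .NH), the first part of the image could be rotated in place before running OCR. A minimal sketch using Pillow's `copy`, `crop`, `rotate` and `paste` methods; the function name and the box coordinates passed to it are hypothetical:

```python
def rotate_region_180(image, box):
    """Return a copy of a PIL image with the area `box` rotated by 180 degrees.

    `box` is (left, upper, right, lower) in pixels. The input image is
    left unmodified.
    """
    result = image.copy()
    # A 180-degree turn keeps the region's width and height, so it can be
    # pasted back into the same box.
    region = image.crop(box).rotate(180)
    result.paste(region, box)
    return result
```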

Tags:

image-processing

opencv

ocr

tesseract

Anar Bayramov
