What do the keys of the dictionary returned by the following tesseract code signify?

Question

I am using the following code in Python:

    import pprint
    import cv2
    import pytesseract

    img = cv2.imread('../Image_documents/6.png')
    d = pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT)
    pprint.pprint(d)

I am getting the following keys in the dictionary:

'block_num', 'conf', 'level', 'line_num', 'page_num', 'par_num', 'text', 'top', 'width', 'word_num', 'height', 'left'.

What do these keys signify?

I tried to find these in the official tesseract documentation. If you have links that explain them, please share them, or explain the keys yourself.

Dmitry Harnitski · Accepted Answer

You called an API that returns information about the text in your image.

The best way to think about the response is as a set of boxes (rectangles) on the image, each highlighting a text area.

The result set contains boxes at several different levels.

You can check the value of the level key to see which level a box belongs to. Below are the supported values:

1: page
2: block
3: paragraph
4: line
5: word

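An image can contain multiple boxes of the same level, so these attributes define a box's position in the list and in its parent hierarchy: page_num, block_num, par_num, line_num, word_num.

The left, top, width, and height values define the box geometry: left and top are the pixel coordinates of the box's top-left corner, and width and height give its size.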
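As a minimal sketch of the geometry described above, the helper below turns one entry into corner coordinates. The function name `box_corners` and the sample numbers are hypothetical and only serve as an illustration; they do not come from a real OCR run.

```python
def box_corners(d, i):
    """Return (x1, y1, x2, y2) for entry i of an image_to_data-style dict."""
    # left/top give the top-left corner of the box
    x1, y1 = d['left'][i], d['top'][i]
    # adding width/height gives the bottom-right corner
    x2, y2 = x1 + d['width'][i], y1 + d['height'][i]
    return x1, y1, x2, y2


# Hypothetical values for a single box, for illustration only.
sample = {'left': [36], 'top': [92], 'width': [60], 'height': [24]}
print(box_corners(sample, 0))  # (36, 92, 96, 116)
```

With an OpenCV image (a NumPy array), the same corners could be used to crop the region as `image[y1:y2, x1:x2]`.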

Let's look at a sample to see how it works.

Assume we have a picture with 2 words on the same line.

For that picture, tesseract returns 6 boxes: 1 for the page, 1 for the block, 1 for the paragraph, 1 for the line, and 2 for the words.

This is the data you get:

    'level': [1, 2, 3, 4, 5, 5]
    'page_num': [1, 1, 1, 1, 1, 1]
    'block_num': [0, 1, 1, 1, 1, 1]
    'par_num': [0, 0, 1, 1, 1, 1]
    'line_num': [0, 0, 0, 1, 1, 1]
    'word_num': [0, 0, 0, 0, 1, 2]
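To make the hierarchy in this sample easier to read, the sketch below labels each entry with its level name and its position path. The names `LEVEL_NAMES` and `describe` are hypothetical helpers, not part of pytesseract; the data is the sample shown above.

```python
# Level numbers as listed in the answer: 1 = page ... 5 = word.
LEVEL_NAMES = {1: 'page', 2: 'block', 3: 'paragraph', 4: 'line', 5: 'word'}

# The sample data from the answer (two words on one line).
sample = {
    'level':     [1, 2, 3, 4, 5, 5],
    'page_num':  [1, 1, 1, 1, 1, 1],
    'block_num': [0, 1, 1, 1, 1, 1],
    'par_num':   [0, 0, 1, 1, 1, 1],
    'line_num':  [0, 0, 0, 1, 1, 1],
    'word_num':  [0, 0, 0, 0, 1, 2],
}


def describe(d):
    """Return (level name, (page, block, par, line, word)) for every entry."""
    rows = []
    for i, level in enumerate(d['level']):
        path = (d['page_num'][i], d['block_num'][i], d['par_num'][i],
                d['line_num'][i], d['word_num'][i])
        rows.append((LEVEL_NAMES[level], path))
    return rows


for name, path in describe(sample):
    print(f'{name:<9} {path}')
```

In this sample, a 0 in the path appears wherever a counter is finer than the box's own level: the page entry has zeros for block, paragraph, line, and word, while the two word entries differ only in word_num.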

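The code below draws the boxes for every level on the image:

    import cv2
    import pytesseract
    from pytesseract import Output

    image = cv2.imread('../Image_documents/6.png')
    d = pytesseract.image_to_data(image, output_type=Output.DICT)
    n_boxes = len(d['level'])
    for i in range(n_boxes):
        # left/top locate the top-left corner; width/height give the box size
        x, y, w, h = d['left'][i], d['top'][i], d['width'][i], d['height'][i]
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)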
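The question also lists the 'text' and 'conf' keys. In tesseract's tabular output, 'text' holds the recognized string for word-level entries and 'conf' holds the recognition confidence, with -1 on entries that are not words. The sketch below keeps only reasonably confident words; the function name `confident_words`, the threshold of 60, and the sample values are hypothetical. Because the type of 'conf' values may differ between pytesseract versions, the code converts them with `float()`.

```python
def confident_words(d, min_conf=60.0):
    """Return (text, confidence) for word-level entries at or above min_conf."""
    words = []
    for i, text in enumerate(d['text']):
        conf = float(d['conf'][i])  # cast defensively: may be str, int, or float
        # level 5 marks a word box; skip empty strings and low-confidence words
        if d['level'][i] == 5 and text.strip() and conf >= min_conf:
            words.append((text, conf))
    return words


# Hypothetical output for a two-word image, for illustration only.
sample = {
    'level': [1, 2, 3, 4, 5, 5],
    'text':  ['', '', '', '', 'Hello', 'wor1d'],
    'conf':  [-1, -1, -1, -1, 96, 41],
}
print(confident_words(sample))  # [('Hello', 96.0)]
```

The same level check could be added to the drawing loop above to render only word boxes instead of every level.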

Tags: python-3.x, text-extraction, tesseract, python-tesseract

Mayank Kumar
