Get font size in Python with Tesseract and pyocr

Is it possible to get the font size of text in an image using pyocr or Tesseract? My code is below.

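import io
import pyocr
import pyocr.builders
from PIL import Image

tools = pyocr.get_available_tools()
tool = tools[0]  # first available OCR tool
txt = tool.image_to_string(
    Image.open(io.BytesIO(req_image)),  # req_image holds the raw image bytes
    lang=lang,
    builder=pyocr.builders.TextBuilder()
)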

Here I get the text from the image with image_to_string. My question is whether I can also get the font size (as a number) of that text.

Witcher asked Oct 30 '22 20:10



1 Answer

Using tesserocr, you can call Recognize on your image and then GetIterator to obtain a ResultIterator. Its WordFontAttributes method returns the font information for the current word, including the point size. Read the method's documentation for more info.

import io
import tesserocr
from PIL import Image

with tesserocr.PyTessBaseAPI() as api:
    image = Image.open(io.BytesIO(req_image))
    api.SetImage(image)
    api.Recognize()  # must run before the iterator has results
    iterator = api.GetIterator()
    print(iterator.WordFontAttributes())  # attributes of the first word

Example output:

{'bold': False,
 'font_id': 283,
 'font_name': u'Times_New_Roman',
 'italic': False,
 'monospace': False,
 'pointsize': 9,
 'serif': True,
 'smallcaps': False,
 'underlined': False}
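
The call above only reports the word the iterator currently points at. To collect a size for every word, something like the following sketch might work. It assumes tesserocr's iterate_level helper and RIL.WORD level, and that WordFontAttributes can return None when no font information is available for a word. The function names word_font_sizes and dominant_pointsize are hypothetical helpers made up for this illustration, not part of tesserocr.

```python
from collections import Counter


def word_font_sizes(image):
    """Return a list of (word, pointsize) pairs for a PIL image.

    Hypothetical helper. tesserocr is imported lazily so that
    dominant_pointsize below can be used without it installed.
    """
    import tesserocr
    from tesserocr import RIL, iterate_level

    sizes = []
    with tesserocr.PyTessBaseAPI() as api:
        api.SetImage(image)
        api.Recognize()  # must run before the iterator has results
        iterator = api.GetIterator()
        # Walk the results one word at a time.
        for word in iterate_level(iterator, RIL.WORD):
            attrs = word.WordFontAttributes()
            if attrs is None:  # assumed: no font info for this word
                continue
            sizes.append((word.GetUTF8Text(RIL.WORD), attrs['pointsize']))
    return sizes


def dominant_pointsize(sizes):
    """Return the most frequent pointsize, or None for an empty list."""
    counts = Counter(size for _, size in sizes)
    return counts.most_common(1)[0][0] if counts else None
```

With the image from the question, dominant_pointsize(word_font_sizes(Image.open(io.BytesIO(req_image)))) would give a single representative size for the page. Treat the reported sizes as estimates from the OCR engine, not as the exact size used when the text was typeset.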
sirfz answered Nov 15 '22 07:11
