Get orientation pytesseract Python3

I want to get the orientation of a scanned document. I saw the post "Pytesseract OCR multiple config options" and tried to use --psm 0 to get the orientation:

target = pytesseract.image_to_string(text, lang='eng', boxes=False, \
config='--psm 0 tessedit_char_whitelist=0123456789abcdefghijklmnopqrstuvwxyz')

But I get an error:

FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/jy/np7p4twj4bx_k396hyc_bnxw0000gn/T/tess_dzgtpadd_out.txt'
asked Aug 13 '18 13:08 by lads



People also ask

How do I check text orientation?

You can use the Hough Transform to detect the longest lines in your image and then find the predominant slope of those lines. If the slope is close to zero, your text is horizontal; if the slope is very large (the lines are nearly vertical), your text is vertical.
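
The classification step can be sketched in plain Python. This is only a rough illustration: it assumes the line segments have already been detected (for example by a Hough transform in an image library), and the function name dominant_orientation, the tolerance of 15 degrees, and the sample segments are all made up for the example.

```python
import math

def dominant_orientation(segments, tolerance_deg=15.0):
    """Classify text direction from line segments given as ((x1, y1), (x2, y2)).

    Hypothetical helper: each segment votes with its length for
    'horizontal' or 'vertical' if its angle is within tolerance_deg
    of that direction; segments at other angles are ignored.
    """
    horizontal = 0.0
    vertical = 0.0
    for (x1, y1), (x2, y2) in segments:
        dx, dy = x2 - x1, y2 - y1
        length = math.hypot(dx, dy)
        # Angle of the segment folded into [0, 180) degrees.
        angle = math.degrees(math.atan2(dy, dx)) % 180.0
        if min(angle, 180.0 - angle) <= tolerance_deg:
            horizontal += length   # slope close to zero
        elif abs(angle - 90.0) <= tolerance_deg:
            vertical += length     # slope very large
    if horizontal == 0.0 and vertical == 0.0:
        return "unclear"
    return "horizontal" if horizontal >= vertical else "vertical"

# Made-up segments: two long, nearly flat lines and one short upright line.
segments = [((0, 0), (100, 2)), ((0, 10), (80, 9)), ((5, 0), (5, 20))]
print(dominant_orientation(segments))
```

Weighting each vote by segment length is one possible way to let the longest lines dominate, as the paragraph above suggests; a plain count of segments would work too.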

What is OSD in Pytesseract?

The OSD mode provides us with meta-data of the text in the image, including both estimated text orientation and script/writing system detection. The text orientation refers to the angle (in degrees) of the text in the image.

Does Pytesseract need Tesseract?

Pytesseract (Python-tesseract) is an OCR tool for Python that works as a wrapper around the Tesseract OCR engine, so Tesseract itself must be installed. It can read and recognize text in images and is commonly used for image-to-text OCR tasks in Python.


1 Answers

I found another way to get the orientation using pytesseract:

print(pytesseract.image_to_osd(Image.open(file_name)))

This is the output:

Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 21.27
Script: Latin
Script confidence: 4.14
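
If you need the values rather than the printed text, the output above can be split into a dictionary. This is a minimal sketch that only assumes the "Key: value" layout shown above; parse_osd is a hypothetical helper name, not part of pytesseract, and the sample string is the output copied from this answer.

```python
def parse_osd(osd_text):
    """Turn 'Key: value' lines of OSD output into a dict (hypothetical helper)."""
    result = {}
    for line in osd_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines without a 'Key: value' shape
        key, value = key.strip(), value.strip()
        try:
            # Numbers with a dot become floats, other numbers become ints.
            result[key] = float(value) if "." in value else int(value)
        except ValueError:
            result[key] = value  # non-numeric values such as 'Latin'
    return result

# The OSD output shown above, pasted in as a string.
sample = """Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 21.27
Script: Latin
Script confidence: 4.14"""

info = parse_osd(sample)
print(info["Rotate"], info["Script"])
```

In real use you would pass the string returned by pytesseract.image_to_osd(...) to parse_osd instead of the pasted sample.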
answered Nov 12 '22 12:11 by lads
