I'm developing an Android app that uses Tesseract OCR to recognize text. My problem is that the captured image comes out rotated differently on different smartphones: on one device it is in landscape orientation right away, on another it is in portrait. I want to rotate the image intelligently so that Tesseract can recognize the text. Recognition only works in one of the two orientations, but the image may arrive in either, depending on how the user took the picture. I don't want to force the user to take the picture the same way every time; I want to rotate the image as needed, if possible without too much of a performance loss.
The auto-rotate feature of the Tesseract library does not seem to work for me in that way. Does anybody have an idea how to solve this problem?
Thanks
In OSD mode, Tesseract can detect text orientation and script type. From there, we can rotate the text back to 0° with OpenCV.
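To illustrate the OSD idea, here is a minimal Python sketch, not a definitive implementation. It assumes the `tesseract` command-line tool and its OSD training data are installed, and that the report uses the usual `Key: value` lines such as `Rotate: 90` and `Orientation confidence: 2.01`. The function names and the `min_confidence` threshold are made up for this example.

```python
import subprocess


def parse_osd(osd_text):
    """Parse Tesseract's OSD report into a dict with lower-cased keys."""
    result = {}
    for line in osd_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            result[key.strip().lower()] = value.strip()
    return result


def rotation_from_osd(osd_text, min_confidence=1.0):
    """Return the suggested rotation in degrees, or 0 if OSD is unsure.

    min_confidence is an arbitrary threshold chosen for this sketch.
    """
    osd = parse_osd(osd_text)
    rotate = int(osd.get("rotate", 0))
    confidence = float(osd.get("orientation confidence", 0.0))
    return rotate if confidence >= min_confidence else 0


def detect_rotation(image_path):
    """Run the tesseract CLI in OSD-only mode (--psm 0) and parse its report."""
    report = subprocess.run(
        ["tesseract", image_path, "stdout", "--psm", "0"],
        capture_output=True, text=True, check=True,
    ).stdout
    return rotation_from_osd(report)
```

The returned angle can then be handed to whatever rotation routine you use (for example OpenCV's `cv2.rotate` for multiples of 90°). Falling back to 0 on low confidence avoids rotating images that contain too little text for OSD to judge.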
Create a Python Tesseract script: create a project folder and add a new main.py file inside it. Once the application has access to the PDF files, their pages are extracted as images. These images are then processed to extract the text.
If this question is still relevant for you: maybe you can extract the EXIF data of the image to get its orientation?
Otherwise, this paper may help you: Combined Orientation and Script Detection using the Tesseract OCR Engine.
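For the EXIF idea, a rough sketch of the mapping from the EXIF Orientation tag (0x0112) to a correction angle could look like this. The function name is hypothetical. On Android the raw value would typically come from `ExifInterface` (the `TAG_ORIENTATION` attribute); how you read it is up to you, and some cameras do not write the tag at all.

```python
# EXIF Orientation values that describe a pure rotation.
# Value -> clockwise degrees to rotate the stored pixels so they display upright.
_EXIF_ROTATION = {1: 0, 3: 180, 6: 90, 8: 270}


def exif_rotation_degrees(orientation):
    """Map an EXIF orientation value to a clockwise correction angle.

    Returns None for mirrored (2, 4, 5, 7), missing, or unknown values,
    so the caller can fall back to another detection method.
    """
    return _EXIF_ROTATION.get(orientation)
```

Reading one tag is much cheaper than analysing the pixels, so it makes sense to try this first and only run an image-based orientation check when the function returns `None`.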
If you don't mind rolling your sleeves up, http://www.leptonica.org/ is probably a good option to evaluate the glyphs (raw Pix that is not detected as text yet) and determine orientation. I've seen references to Android bindings for Leptonica.