
How to detect image orientation (text)

My program works with fax documents stored as separate bitmaps.
I wonder if there is a way to automatically detect page orientation (vertical or horizontal) so that the image preview can be shown to the user the right way up (that is, rotated if necessary).

Any advice is much appreciated!

EDIT: Clarification:
When the fax machine receives a multi-page document, it saves each page as a separate TIFF file.
My app has a built-in viewer that displays those files. All files are scaled to A4 format and saved as TIFF (so there is no chance of detecting orientation from the height/width parameters).
My viewer displays images in portrait mode by default.

What I'd like to do is automagically detect when the original document was printed in landscape mode (e.g. wide Excel tables) and then show a rotated preview to the end user, to speed up the preview process.

Obviously there are 4 possible fax orientations: portrait / landscape x 2 kinds of rotation.

I'm even interested in a simplified solution that only detects whether the original document was landscape or portrait (I've noticed that most landscape documents need to be rotated clockwise).

EDIT2: Idea
I think this might work:
If I could draw horizontal and vertical lines and check whether each line crosses any (black) point, then we could compare which type of uncrossed line is more common (horizontal or vertical), and this decides the page orientation.
What do you think?
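The idea in EDIT2 can be tried out with a few lines of Python. This is only a minimal sketch under stated assumptions: the page is already loaded as a list of rows of 0/1 pixels (1 = black), margins are trimmed first so they do not count as blank lines, and the function names and the tiny sample page are made up for illustration. Real faxes contain noise, so an exact "no black pixel" test would probably need a tolerance.

```python
def crop_to_ink(bitmap):
    """Trim all-white margins so they are not counted as blank lines."""
    rows = [i for i, row in enumerate(bitmap) if any(row)]
    cols = [j for j, col in enumerate(zip(*bitmap)) if any(col)]
    if not rows or not cols:
        return []  # empty or completely white page
    return [row[cols[0]:cols[-1] + 1] for row in bitmap[rows[0]:rows[-1] + 1]]


def count_blank_lines(bitmap):
    """Count rows and columns that cross no black pixel."""
    blank_rows = sum(1 for row in bitmap if not any(row))
    blank_cols = sum(1 for col in zip(*bitmap) if not any(col))
    return blank_rows, blank_cols


def guess_text_direction(bitmap):
    """Return 'horizontal', 'vertical' or 'unknown' for the text lines."""
    blank_rows, blank_cols = count_blank_lines(crop_to_ink(bitmap))
    if blank_rows > blank_cols:
        return "horizontal"  # white gaps between text lines run left to right
    if blank_cols > blank_rows:
        return "vertical"    # page is stored sideways, rotate the preview
    return "unknown"


# Hypothetical sample: three "text lines", each two pixels tall,
# separated by two white rows.
_LINES = [
    [1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1],
]
_BLANK = [0] * 12
SAMPLE_PAGE = []
for line in _LINES:
    SAMPLE_PAGE += [line, line, _BLANK, _BLANK]

SIDEWAYS_PAGE = [list(col) for col in zip(*SAMPLE_PAGE)]

print(guess_text_direction(SAMPLE_PAGE))    # horizontal
print(guess_text_direction(SIDEWAYS_PAGE))  # vertical
```

Note that this only separates portrait from landscape; it cannot tell which of the two possible rotations is the right one.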

Maciej asked Apr 01 '10 09:04



3 Answers

You could perform a Fast Fourier Transform (FFT) to convert your spatial image to a frequency/angle representation. Then find the angle with the most prominent frequency. It sounds complicated but it's not that hard, it's pretty efficient, and in effect it tests every possible angle at once, instead of being a hard-coded hack that only works for specific angles. Search for a sample implementation with search terms like Numerical Recipes and FFT.
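As a rough illustration of the frequency idea, here is a much reduced sketch in plain Python. It is not a full 2-D FFT and it does not test every angle: it assumes only the two axis directions matter, takes the row and column projection profiles of a 0/1 bitmap (1 = black), and runs a naive 1-D DFT on each. By the projection-slice property, the transform of a projection corresponds to one axis of the 2-D spectrum. Evenly spaced text lines should give the profile across them a strong periodic peak. All names and the sample page are hypothetical.

```python
import cmath


def projection_profiles(bitmap):
    """Fraction of black pixels per row and per column."""
    height, width = len(bitmap), len(bitmap[0])
    row_profile = [sum(row) / width for row in bitmap]
    col_profile = [sum(col) / height for col in zip(*bitmap)]
    return row_profile, col_profile


def peak_spectral_magnitude(profile):
    """Largest non-DC magnitude of a naive O(n^2) DFT, divided by n."""
    n = len(profile)
    if n < 2:
        return 0.0
    mean = sum(profile) / n
    centered = [value - mean for value in profile]
    best = 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(value * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, value in enumerate(centered))
        best = max(best, abs(coeff) / n)
    return best


def dominant_axis(bitmap):
    """Return 'horizontal', 'vertical' or 'unknown' for the text lines."""
    if not bitmap or not bitmap[0]:
        return "unknown"
    row_profile, col_profile = projection_profiles(bitmap)
    row_peak = peak_spectral_magnitude(row_profile)
    col_peak = peak_spectral_magnitude(col_profile)
    if row_peak > col_peak:
        return "horizontal"  # periodic structure from top to bottom
    if col_peak > row_peak:
        return "vertical"    # periodic structure from left to right
    return "unknown"


# Hypothetical sample: three two-pixel-tall "text lines" with white gaps.
_LINES = [
    [1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1],
]
_BLANK = [0] * 12
SAMPLE_PAGE = []
for line in _LINES:
    SAMPLE_PAGE += [line, line, _BLANK, _BLANK]

print(dominant_axis(SAMPLE_PAGE))
print(dominant_axis([list(col) for col in zip(*SAMPLE_PAGE)]))
```

For full-size fax pages the naive DFT would be slow; an FFT routine from a numerical library would be the obvious replacement, and a 2-D transform would be needed to estimate arbitrary skew angles.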

Liudvikas Bukys answered Oct 22 '22 02:10


You'd need OCR for that. Rolling your own OCR would be a bit difficult, but there might be a library out there worth looking into. Also, even with good OCR, it's not a 100% reliable solution.

Catdirt answered Oct 22 '22 03:10


I wonder if there are some properties of text you could use to help you do this.

For instance, based on a quick glance, there are far more vertical strokes in text (l, j, k, m, n etc.) than horizontal ones, so maybe you could start with this.

But even detecting these isn't straightforward; you'd need to use some sort of edge filter such as Sobel or Prewitt. They both have horizontal and vertical versions, see here for more info.

Of course, the vertical/horizontal lines of an Excel spreadsheet would be the strongest edges, so you'd have to ignore these and look only at the text.
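To make the filter idea concrete, here is a small pure-Python sketch that applies the two standard 3x3 Sobel kernels to a 0/1 bitmap (1 = black) and compares the total response. The function names and the toy image are assumptions for illustration, and the sketch does not attempt to mask out spreadsheet grid lines, which would have to be handled separately.

```python
# Standard 3x3 Sobel kernels.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # responds to horizontal edges


def filter_abs_sum(bitmap, kernel):
    """Sum of absolute 3x3 filter responses over all interior pixels."""
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    total = 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * bitmap[y + ky - 1][x + kx - 1]
            total += abs(acc)
    return total


def stroke_direction(bitmap):
    """Return which stroke direction dominates: 'vertical' or 'horizontal'."""
    vertical = filter_abs_sum(bitmap, SOBEL_X)
    horizontal = filter_abs_sum(bitmap, SOBEL_Y)
    if vertical > horizontal:
        return "vertical"    # consistent with upright text, per the heuristic
    if horizontal > vertical:
        return "horizontal"  # page may be stored sideways
    return "unknown"


# Toy image: two vertical strokes, like the stems of "l" and "l".
STROKES = [[0, 0, 1, 0, 1, 0, 0] for _ in range(7)]

print(stroke_direction(STROKES))
print(stroke_direction([list(col) for col in zip(*STROKES)]))
```

Whether vertical strokes really dominate on a given fax page is exactly the assumption this answer makes, so it would need checking against real samples.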

Alternative: Can you not just give the user an easy way to rotate the images, like the arrows in Windows Picture Viewer, or just show 4 thumbnail previews they can click on? You might need to cache the 4 versions (if you are rotating) so it's quick, but only if speed turns out to be an issue.
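A sketch of the "cache the 4 versions" idea, assuming the page is held in memory as a list of pixel rows. The helper names are hypothetical; a real viewer would more likely use its imaging library's rotate call and cache downscaled thumbnails.

```python
def rotate_cw(bitmap):
    """Rotate a list-of-rows bitmap by 90 degrees clockwise."""
    return [list(row) for row in zip(*bitmap[::-1])]


def all_rotations(bitmap):
    """Build the four previews once, keyed by clockwise angle in degrees."""
    previews = {0: [list(row) for row in bitmap]}
    current = previews[0]
    for angle in (90, 180, 270):
        current = rotate_cw(current)
        previews[angle] = current
    return previews


page = [[1, 2],
        [3, 4]]
cache = all_rotations(page)
print(cache[90])   # [[3, 1], [4, 2]]
print(cache[180])  # [[4, 3], [2, 1]]
```

The user's click then only has to look up `cache[angle]` instead of rotating the image again.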

Matt Warren answered Oct 22 '22 01:10