Page layout analysis using Tesseract?

Tesseract 3 is able to perform page layout analysis. However, I couldn't find any sample code or documentation on how to use the library for such purposes. I hope someone here can explain how to perform layout analysis on an image and how to parse the resulting data.

Pedro asked Nov 13 '11 21:11


2 Answers

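Since version 3.04 there is an option, preserve_interword_spaces, that keeps the spacing between words in the output text, which helps the result follow the layout of the page: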

tesseract -c preserve_interword_spaces=1 test.tif test

Here is a reference to what looks like the related development thread.

Laurent answered Sep 19 '22 18:09


Tesseract can be given a page segmentation mode parameter (-psm in Tesseract 3, spelled --psm from version 4 onward), which can have the following values:

  • 0 = Orientation and script detection (OSD) only.
  • 1 = Automatic page segmentation with OSD.
  • 2 = Automatic page segmentation, but no OSD or OCR.
  • 3 = Fully automatic page segmentation, but no OSD. (Default)
  • 4 = Assume a single column of text of variable sizes.
  • 5 = Assume a single uniform block of vertically aligned text.
  • 6 = Assume a single uniform block of text.
  • 7 = Treat the image as a single text line.
  • 8 = Treat the image as a single word.
  • 9 = Treat the image as a single word in a circle.
  • 10 = Treat the image as a single character.

Example:

tesseract image.tif image -l eng -psm 0
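
For scripting, the mode numbers listed above can be given names and turned into a command line. This is only a minimal sketch: the enum member names and the build_command helper are made up for illustration, and the psm_flag parameter exists because the flag spelling depends on the installed Tesseract version.

```python
import shlex
from enum import IntEnum


class PageSegMode(IntEnum):
    """Names (hypothetical) for the -psm values listed in this answer."""
    OSD_ONLY = 0
    AUTO_OSD = 1
    AUTO_ONLY = 2
    AUTO = 3
    SINGLE_COLUMN = 4
    SINGLE_BLOCK_VERT_TEXT = 5
    SINGLE_BLOCK = 6
    SINGLE_LINE = 7
    SINGLE_WORD = 8
    CIRCLE_WORD = 9
    SINGLE_CHAR = 10


def build_command(image, output_base, psm=PageSegMode.AUTO,
                  lang="eng", psm_flag="-psm"):
    """Return the tesseract argument list for one image.

    output_base is passed without an extension because tesseract
    appends the extension itself.
    """
    return ["tesseract", image, output_base,
            "-l", lang, psm_flag, str(int(psm))]


if __name__ == "__main__":
    cmd = build_command("image.tif", "image", PageSegMode.OSD_ONLY)
    print(shlex.join(cmd))  # tesseract image.tif image -l eng -psm 0
```

The returned list can be handed to subprocess.run(cmd, check=True) on a machine where tesseract is installed.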

However, I am not sure whether the layout analysis can be run on its own, without the OCR step.
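
One possible way to get at the layout data, though not standalone since OCR still runs, is to ask for hOCR output (for example by adding the hocr config name at the end of the command) and read the bounding boxes from it. The sketch below assumes the usual hOCR class names (ocr_carea for blocks, ocr_par for paragraphs, ocr_line for lines) and a title attribute containing "bbox x0 y0 x1 y1". The SAMPLE fragment is hand-written for illustration, not real Tesseract output, and parse_layout is a hypothetical helper.

```python
import re
from html.parser import HTMLParser

# hOCR stores coordinates in the title attribute as "bbox x0 y0 x1 y1".
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrLayoutParser(HTMLParser):
    """Collect (class, bbox) pairs for block, paragraph and line elements."""

    LAYOUT_CLASSES = {"ocr_carea", "ocr_par", "ocr_line"}

    def __init__(self):
        super().__init__()
        self.regions = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        cls = attr.get("class")
        match = BBOX_RE.search(attr.get("title") or "")
        if cls in self.LAYOUT_CLASSES and match:
            bbox = tuple(int(v) for v in match.groups())
            self.regions.append((cls, bbox))


def parse_layout(hocr_text):
    """Return layout regions in document order."""
    parser = HocrLayoutParser()
    parser.feed(hocr_text)
    return parser.regions


# Illustrative fragment only; real output has more attributes and nesting.
SAMPLE = """
<div class='ocr_page' id='page_1' title='image "image.tif"; bbox 0 0 640 480'>
 <div class='ocr_carea' id='block_1_1' title="bbox 36 92 618 361">
  <p class='ocr_par' title="bbox 36 92 618 184">
   <span class='ocr_line' id='line_1_1' title="bbox 36 92 580 122">...</span>
  </p>
 </div>
</div>
"""

if __name__ == "__main__":
    for cls, bbox in parse_layout(SAMPLE):
        print(cls, bbox)
```

Each region comes back as a class name plus an (x0, y0, x1, y1) tuple in image pixels, so blocks can be cropped or sorted into reading order without parsing the recognized text.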

poiuytrez answered Sep 21 '22 18:09