Which algorithm does Google's Tesseract OCR use for recognition? Is it a neural network?
This paper in the Tesseract source provides an in-depth overview of the technology.
Notably:
Blobs are organized into text lines, and the lines and regions are analyzed for fixed pitch or proportional text.
[...]
Recognition then proceeds as a two-pass process. In the first pass, an attempt is made to recognize each word in turn. Each word that is satisfactory is passed to an adaptive classifier as training data. The adaptive classifier then gets a chance to more accurately recognize text lower down the page.
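To make the two-pass idea concrete, here is a toy sketch. It is not Tesseract's code. Each "word" is reduced to a single hypothetical number, the classifier is a plain nearest-prototype match, and `accept` is a made-up threshold for what counts as "satisfactory". I am also assuming the second pass revisits only the words that were not satisfactory the first time, which is my reading of the paper.

```python
def nearest(prototypes, feature):
    """Return (label, distance) for the prototype closest to `feature`."""
    return min(
        ((label, abs(feature - value)) for label, value in prototypes),
        key=lambda pair: pair[1],
    )


def two_pass(words, static_protos, accept=1.0):
    """Toy two-pass recognition over a page of scalar 'word' features.

    `static_protos` is a list of (label, value) pairs known before the page
    is read; `accept` is a hypothetical distance threshold for a
    satisfactory match.
    """
    adaptive_protos = []  # prototypes learned from this page only
    first_pass = []
    retry = []            # indices of words that were not satisfactory

    # Pass 1: recognize each word in turn; satisfactory words become
    # training data for the adaptive prototypes used further down the page.
    for index, feature in enumerate(words):
        label, distance = nearest(static_protos + adaptive_protos, feature)
        first_pass.append(label)
        if distance <= accept:
            adaptive_protos.append((label, feature))
        else:
            retry.append(index)

    # Pass 2: revisit the unsatisfactory words with everything learned.
    final = list(first_pass)
    for index in retry:
        final[index], _ = nearest(static_protos + adaptive_protos, words[index])
    return first_pass, final


first, final = two_pass([2.9, 4.4, 3.8], [("a", 1.0), ("b", 5.0)])
print(first)  # ['a', 'b', 'b']
print(final)  # ['b', 'b', 'b']
```

The first word is read as "a" on the first pass because nothing has been learned from the page yet. Once the later words have trained the adaptive prototypes, the second pass changes it to "b".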
[...]
Once the text lines have been found, the baselines are fitted more precisely using a quadratic spline.
[...]
The baselines are fitted by partitioning the blobs into groups with a reasonably continuous displacement for the original straight baseline. A quadratic spline is fitted to the most populous partition, (assumed to be the baseline) by a least squares fit.
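A rough sketch of that baseline step, under some loud assumptions: blobs are reduced to hypothetical `(x, y)` bottom points, `tolerance` is an invented threshold for "reasonably continuous", and I fit one quadratic polynomial by least squares rather than a multi-segment quadratic spline. It illustrates the idea and is not how Tesseract implements it.

```python
def partition_by_displacement(points, slope, intercept, tolerance=2.0):
    """Group points whose offset from the straight baseline changes smoothly."""
    partitions = []
    for x, y in sorted(points):
        offset = y - (slope * x + intercept)
        for part in partitions:
            if abs(part["last"] - offset) <= tolerance:
                part["points"].append((x, y))
                part["last"] = offset
                break
        else:
            # No existing group is close enough: start a new one.
            partitions.append({"last": offset, "points": [(x, y)]})
    return [part["points"] for part in partitions]


def fit_quadratic(points):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    s = [sum(x ** k for x, _ in points) for k in range(5)]
    t = [sum(y * x ** k for x, y in points) for k in range(3)]
    # Normal equations as an augmented matrix, unknowns ordered (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(3):
            if r != i:
                factor = m[r][i] / m[i][i]
                m[r] = [vr - factor * vi for vr, vi in zip(m[r], m[i])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def fit_baseline(points, slope, intercept, tolerance=2.0):
    """Fit the curve to the most populous partition, assumed to be the baseline."""
    partitions = partition_by_displacement(points, slope, intercept, tolerance)
    return fit_quadratic(max(partitions, key=len))
```

Points that sit well off the line, such as the bottoms of descenders, end up in their own small partitions and do not pull the fitted curve away from the baseline.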
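The paper does not describe a neural network. The classifier it covers matches shape features against stored prototypes, backed by the adaptive classifier trained during the first pass. Note, though, that Tesseract 4.0 and later add an LSTM-based recognition engine, which is a neural network, so the answer depends on the version you are running.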
For more on line-finding, see R. Smith, “A Simple and Efficient Skew Detection Algorithm via Text Row Accumulation”, Proc. of the 3rd Int. Conf. on Document Analysis and Recognition (Vol. 2), IEEE 1995, pp. 1145-1148.