I have to recognize text from an image, which is possible because there are a lot of OCR libraries available, but now I also have to find the text size and font type. I have searched a lot but found no help. I know this is possible, because there is an application called "WhatTheFont" available on the store that finds the best-matching font type. How can I do this?
I am copying this answer directly from the link in the comment (http://stackoverflow.com/questions/4601291/ocr-combined-with-font-recognition?rq=1), as the question and answer have been removed and can only be found in Google Cache. I am interested in this, so I don't want to rely on a broken link :)
Answer courtesy of Andrew Cash (https://stackoverflow.com/users/433635/andrew-cash)
This is what common OCR engines generally do. Look at ABBYY FineReader, OmniPage, Cuneiform, Google Tesseract, ExperVision, etc.
This is not as easy as it looks: many commercial OCR engines still make silly mistakes, and most engines have taken years to develop.
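For the text size part of the question, here is a rough sketch (my own illustration, not part of the original answer). It assumes you already have the pixel height of a text line's bounding box from an OCR engine and that you know the scan resolution. The function name `estimate_point_size` is hypothetical, and treating the box height as the font size is only an approximation, since the real relationship depends on the font's ascenders and descenders.

```python
def estimate_point_size(box_height_px: float, dpi: float) -> float:
    """Convert a text-line bounding-box height in pixels to typographic points.

    Assumption: the box spans roughly ascender to descender, so its height
    approximates the font size. Real fonts deviate from this.
    """
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    # One point is 1/72 inch: pixels / dpi gives inches, times 72 gives points.
    return box_height_px * 72.0 / dpi


if __name__ == "__main__":
    # A 50-pixel-high line scanned at 300 DPI.
    print(estimate_point_size(50, 300))
```

If the DPI of the image is unknown (a phone photo, for example), only relative sizes between lines can be compared this way.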
The problem of finding the paragraph bounding boxes is part of the OCR process. In your case the paragraph zoning is dead simple, but think of a page of a newspaper or magazine and the job becomes much harder.
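To show what the "dead simple" case might look like, here is a minimal sketch (my own, not from the original answer) that merges text-line boxes into paragraph boxes. It assumes a single-column page and that line boxes are already available as `(left, top, right, bottom)` pixel tuples. The function name and the `gap_factor` threshold are hypothetical choices, and this would not cope with multi-column newspaper layouts.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def group_lines_into_paragraphs(lines: List[Box], gap_factor: float = 0.5) -> List[Box]:
    """Merge text-line boxes into paragraph boxes for a single-column layout.

    Two consecutive lines join the same paragraph when the vertical gap
    between them is at most gap_factor times the previous line's height.
    """
    paragraphs: List[Box] = []
    prev_bottom = 0
    prev_height = 0
    # Process lines from top to bottom.
    for left, top, right, bottom in sorted(lines, key=lambda b: b[1]):
        if paragraphs and top - prev_bottom <= gap_factor * prev_height:
            # Small gap: grow the current paragraph box to include this line.
            l, t, r, b = paragraphs[-1]
            paragraphs[-1] = (min(l, left), t, max(r, right), max(b, bottom))
        else:
            # Large gap (or first line): start a new paragraph.
            paragraphs.append((left, top, right, bottom))
        prev_bottom, prev_height = bottom, bottom - top
    return paragraphs


if __name__ == "__main__":
    line_boxes = [(10, 10, 200, 30), (10, 34, 180, 54), (10, 90, 190, 110)]
    print(group_lines_into_paragraphs(line_boxes))
```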
The problem of background preservation is just as difficult. Simple single-colored backgrounds are easy to remove, but add something a little more complex and it gets difficult very quickly.
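As an illustration of the easy case only, the sketch below (my own, not from the original answer) separates text from a single-colored background in a grayscale image given as a list of rows. It assumes the most frequent gray level is the background. The function name and the `tolerance` value are hypothetical, and this approach breaks down on textured or gradient backgrounds.

```python
from collections import Counter
from typing import List


def remove_flat_background(gray: List[List[int]], tolerance: int = 30) -> List[List[int]]:
    """Return a binary mask: 1 where a pixel differs from the dominant gray level.

    Assumption: the background is one flat color, so the most common
    pixel value in the image is taken to be the background.
    """
    counts = Counter(value for row in gray for value in row)
    if not counts:
        return []
    background = counts.most_common(1)[0][0]
    # Mark pixels far enough from the background level as foreground (text).
    return [[1 if abs(value - background) > tolerance else 0 for value in row]
            for row in gray]


if __name__ == "__main__":
    image = [
        [255, 255, 255],
        [255, 0, 255],
        [250, 255, 10],
    ]
    for row in remove_flat_background(image):
        print(row)
```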
Combine all three problems together in the same image and it gets even more difficult. Add some lines and boxes, grayscale shading, halftoning, rotated fonts, fades and other special effects, and OCR becomes almost impossible. Many OCR engines are 100% accurate on simple pages with clearly defined text, but when you start adding more complexity to the document the reading rates start to drop quickly. Some OCR engines are much better than others.