
Improve Tesseract performance with OpenCV on Android

I am working on an Android application that does real-time OCR, using OpenCV and the Tesseract library. The performance is very poor, even on my Galaxy SIII. Are there any ways to improve it? Here is my code:

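    // Grab the current camera frame into a Mat.
    Mat mGray = new Mat();
    capture.retrieve(mGray);
    // createBitmap only allocates an empty Bitmap; the frame must be copied into it.
    Bitmap bmp = Bitmap.createBitmap(mGray.cols(), mGray.rows(), Bitmap.Config.ARGB_8888);
    Utils.matToBitmap(mGray, bmp); // org.opencv.android.Utils
    // Run OCR on the frame and log the result.
    tessBaseApi.setImage(bmp);
    String recognizedText = tessBaseApi.getUTF8Text();
    Log.i("Reg", recognizedText);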

Does passing a Bitmap to the Tesseract API slow down the OCR? What pre-processing should I perform before passing the image to Tesseract?

QuiLl HoN asked Oct 03 '12 03:10


2 Answers

One thing to try is to binarize the image using adaptive thresholding (adaptiveThreshold in OpenCV).
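To make the suggestion concrete, here is a minimal plain-Java sketch of what mean adaptive thresholding computes: each pixel is compared against the average of its own neighbourhood minus a constant, so uneven lighting does not swamp the text. The class name `AdaptiveThresholdSketch`, the `int[][]` image layout, and the sample values are assumptions made for illustration only. In an actual app you would more likely call OpenCV's `Imgproc.adaptiveThreshold(src, dst, maxValue, adaptiveMethod, thresholdType, blockSize, C)` on the grayscale Mat rather than hand-roll the loop.

```java
class AdaptiveThresholdSketch {

    /**
     * Mean adaptive threshold (binary output) on a grayscale image.
     * gray[row][col] holds values 0..255; blockSize is the odd side length
     * of the neighbourhood; c is subtracted from the local mean.
     * Pixels outside the image are replaced by the nearest edge pixel.
     */
    static int[][] threshold(int[][] gray, int blockSize, int c) {
        if (blockSize < 3 || blockSize % 2 == 0) {
            throw new IllegalArgumentException("blockSize must be odd and >= 3");
        }
        int rows = gray.length;
        int cols = gray[0].length;
        int half = blockSize / 2;
        int[][] out = new int[rows][cols];

        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                // Sum the neighbourhood, clamping coordinates at the borders.
                long sum = 0;
                for (int dy = -half; dy <= half; dy++) {
                    int yy = Math.min(rows - 1, Math.max(0, y + dy));
                    for (int dx = -half; dx <= half; dx++) {
                        int xx = Math.min(cols - 1, Math.max(0, x + dx));
                        sum += gray[yy][xx];
                    }
                }
                double localThreshold = (double) sum / (blockSize * blockSize) - c;
                // White if brighter than the local threshold, black otherwise.
                out[y][x] = gray[y][x] > localThreshold ? 255 : 0;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical frame: a dark vertical stroke (column 2) on a background
        // that gets darker from top to bottom. No single global threshold
        // separates them, because the stroke in row 0 (90) is brighter than
        // the background in row 2 (80).
        int[][] gray = {
            {200, 200, 90, 200, 200},
            {140, 140, 50, 140, 140},
            { 80,  80, 20,  80,  80},
        };
        for (int[] row : threshold(gray, 3, 25)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

With these sample values every row comes out as `[255, 255, 0, 255, 255]`: the stroke is isolated despite the lighting gradient. This naive loop costs O(blockSize²) per pixel, so it only illustrates the idea; it is not something to run per frame on a phone.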

ojs answered Oct 26 '22 10:10


You can make Tesseract run only recognition pass 1, skipping passes 2 through 9, when it calls recog_all_words().

Change the following line in baseapi.cpp and rebuild your Tesseract library project:

    if (tesseract_->recog_all_words(page_res_, monitor, NULL, NULL, 0)) {

Change it to:

    if (tesseract_->recog_all_words(page_res_, monitor, NULL, NULL, 1)) {

rmtheis answered Oct 26 '22 08:10