
Camera Preview and OCR

I am new to Android development - I'm using Xamarin.

I am trying to write an application that initiates the camera preview, and then constantly scans the incoming frames for text (I am using Xamarin.Tesseract from NuGet).

In other words, I don't want to make the user take a photo and then run the OCR analysis. Instead, I want them to just point the camera at some paper with text on it; I'll continually run the OCR analysis until I detect the specific text I'm searching for, at which point I'll give the user a big thumbs up.

This is the approach I have gone down so far:

  1. Initialise the camera and set a preview callback

    _Camera = Android.Hardware.Camera.Open();          
    _Camera.SetPreviewCallback(this); 
    _Camera.StartPreview();              
    
  2. In the callback, take the bytes representing the current frame and pass them as the input image bytes to Xamarin.Tesseract

    public async void OnPreviewFrame(byte[] data, Android.Hardware.Camera camera)
    {
        await _TesseractApi.SetImage(data); // this hangs
        string text = _TesseractApi.Text;
        // ... check text for the phrase being searched for
    }
    
    

This currently hangs when passing the byte[] into the Tesseract API. I'm pretty sure it's because the bytes in the array are in the wrong encoding, or I'm fundamentally misunderstanding the Camera API!

Can anyone give me a nudge in the right direction?

Darrell asked Jun 18 '15 16:06



1 Answer

Looking at the code for TesseractApi.SetImage(byte[]), it calls BitmapFactory.DecodeByteArray(), which expects encoded image data in a format it can decode, such as JPEG or PNG.

Unfortunately, the camera preview delivers raw YUV frames (NV21 by default), which BitmapFactory doesn't support.
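
To make the mismatch concrete, here is a minimal sketch. The helper names `Nv21BufferSize` and `LooksLikeJpeg` are hypothetical, and it assumes the preview format is left at the Android default (NV21) with even preview dimensions. A preview frame is raw pixel data with no file header, so its length follows from the preview size alone, and it will not normally begin with the JPEG start-of-image marker.

```csharp
using System;

public static class PreviewFrameChecks
{
    // NV21 stores a full-resolution Y plane followed by interleaved V/U
    // samples at half resolution in each dimension: 12 bits per pixel.
    // Assumes even width and height.
    public static int Nv21BufferSize(int width, int height)
    {
        return width * height * 3 / 2;
    }

    // JPEG data begins with the start-of-image marker 0xFF 0xD8.
    public static bool LooksLikeJpeg(byte[] data)
    {
        return data != null && data.Length >= 2
            && data[0] == 0xFF && data[1] == 0xD8;
    }
}
```

For a 640x480 preview, `Nv21BufferSize(640, 480)` is 460800. Logging `data.Length` and `LooksLikeJpeg(data)` inside OnPreviewFrame is one quick way to confirm what the camera is actually delivering.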

Here is code to convert the YUV frame to a JPEG, which you can then pass to Tesseract.

    // Requires: using System.IO; using Android.Graphics;
    private byte[] ConvertYuvToJpeg(byte[] yuvData, Android.Hardware.Camera camera)
    {
        // The preview size and pixel format describe the layout of the raw frame.
        var cameraParameters = camera.GetParameters();
        var width = cameraParameters.PreviewSize.Width;
        var height = cameraParameters.PreviewSize.Height;
        var yuv = new YuvImage(yuvData, cameraParameters.PreviewFormat, width, height, null);

        using (var ms = new MemoryStream())
        {
            var quality = 80;   // JPEG quality (0-100); adjust this as needed
            yuv.CompressToJpeg(new Rect(0, 0, width, height), quality, ms);
            return ms.ToArray();
        }
    }
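
Preview frames will likely arrive faster than OCR can process them, so it may help to skip frames while a previous one is still being analysed. The following is only a sketch under stated assumptions: `FrameScanner` and its members are hypothetical names, the `toJpeg` delegate stands in for a conversion such as ConvertYuvToJpeg, and the `recognize` delegate stands in for whatever wraps the SetImage call and the Text property from the question.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: converts a frame, runs OCR on it, and drops any
// frame that arrives while an earlier one is still being processed.
public sealed class FrameScanner
{
    private readonly Func<byte[], byte[]> _toJpeg;           // raw frame -> JPEG bytes
    private readonly Func<byte[], Task<string>> _recognize;  // JPEG bytes -> recognised text
    private readonly string _target;                         // text being searched for
    private int _busy;                                       // 0 = idle, 1 = processing

    public FrameScanner(Func<byte[], byte[]> toJpeg,
                        Func<byte[], Task<string>> recognize,
                        string target)
    {
        _toJpeg = toJpeg;
        _recognize = recognize;
        _target = target;
    }

    // Returns true when the target text was found in this frame;
    // false when the frame was skipped or contained no match.
    public async Task<bool> ScanAsync(byte[] frame)
    {
        // Atomically claim the scanner; if it is already claimed, skip the frame.
        if (Interlocked.CompareExchange(ref _busy, 1, 0) != 0)
            return false;

        try
        {
            byte[] jpeg = _toJpeg(frame);
            string text = await _recognize(jpeg);
            return text != null
                && text.IndexOf(_target, StringComparison.OrdinalIgnoreCase) >= 0;
        }
        finally
        {
            Volatile.Write(ref _busy, 0); // release for the next frame
        }
    }
}
```

OnPreviewFrame could then await `ScanAsync(data)` and show the thumbs up when it returns true. The case-insensitive substring match is just one possible policy; OCR output is noisy, so a looser comparison may work better in practice.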

Kiliman answered Sep 20 '22 15:09