OCR engines designed for screen-reading

Tags: .net, text, screenshot, ocr, imaging

Are there any OCR engines designed for identifying text in screen-captured images rather than scanned text? I have a project where I need to retrieve and identify text in an application, and none of the OCR engines I've tried so far have fared well with screenshots.

Ideally the engine should work well with color and with background noise, although I can make some allowances if nothing like that is available.

It will need to be .NET compatible; either written in .NET or having a .NET-callable API.

asked Jul 27 '10 15:07

Erik Forbes

3 Answers

I've found Tesseract OCR to be pretty solid for an open-source project. It can even read and decode simple CAPTCHAs, like Megaupload's. I'd think that with a little tweaking it could work pretty well on screenshots.

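The only pain is that it only accepts uncompressed TIFF images, which can be annoying. (That is true of the older 2.x releases, at least; later versions built on Leptonica read other formats such as PNG.)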

EDIT: Philip Daubmeier already found a .NET integration, but below is code to convert a Bitmap to uncompressed TIFF.

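// Requires: using System.Drawing; using System.Drawing.Imaging;
private void ConvertBitmapToTIF(Bitmap convert)
{
    // Look up the built-in GDI+ TIFF encoder.
    ImageCodecInfo codecInfo = GetEncoderInfo("image/tiff");
    System.Drawing.Imaging.Encoder encodeCom = System.Drawing.Imaging.Encoder.Compression;
    System.Drawing.Imaging.Encoder encodeBPP = System.Drawing.Imaging.Encoder.ColorDepth;

    // Two encoder parameters: no compression, 8 bits per pixel.
    using (EncoderParameters parms = new EncoderParameters(2))
    {
        parms.Param[0] = new EncoderParameter(encodeCom, (long)EncoderValue.CompressionNone);
        parms.Param[1] = new EncoderParameter(encodeBPP, 8L);
        convert.Save("output.tif", codecInfo, parms);
    }
}

// Helper the method above relies on: finds the encoder registered for a MIME type.
private static ImageCodecInfo GetEncoderInfo(string mimeType)
{
    foreach (ImageCodecInfo codec in ImageCodecInfo.GetImageEncoders())
        if (codec.MimeType == mimeType) return codec;
    return null;
}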

This saves to a file, but the Bitmap.Save method can write to a stream also.
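
To illustrate the stream variant, a minimal sketch along these lines keeps the TIFF in memory instead of writing it to disk. The class and method names (TiffBytes, ToUncompressedTiff) are made up for this example; the call doing the work is the Image.Save(Stream, ImageCodecInfo, EncoderParameters) overload, and System.Drawing (GDI+) is assumed to be available.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Linq;

public static class TiffBytes
{
    // Encodes the bitmap as an uncompressed TIFF and returns the raw bytes.
    public static byte[] ToUncompressedTiff(Bitmap bitmap)
    {
        // Find the built-in TIFF encoder by its format GUID.
        ImageCodecInfo tiffCodec = ImageCodecInfo.GetImageEncoders()
            .First(c => c.FormatID == ImageFormat.Tiff.Guid);

        using (var parms = new EncoderParameters(1))
        using (var stream = new MemoryStream())
        {
            parms.Param[0] = new EncoderParameter(
                System.Drawing.Imaging.Encoder.Compression,
                (long)EncoderValue.CompressionNone);

            // Same Save call as above, but targeting a stream instead of a file name.
            bitmap.Save(stream, tiffCodec, parms);
            return stream.ToArray();
        }
    }
}
```

The resulting byte array can then be handed to whichever OCR wrapper is in use, provided it accepts in-memory image data.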

answered Oct 08 '22 01:10

Nate

OCR technology is usually tuned for scanned text, which is at least 200 dpi, with 300 dpi recommended for reliable results. Screen text is typically around 96 dpi, so you will need to put some effort into tweaking settings to make an engine work on it.

ABBYY has screenshot OCR software: http://www.abbyy.com/screenshot_reader/ which shows that its technology can work well in these conditions. I use it, and it just works. So you may want to contact ABBYY about its OCR SDK: http://www.abbyy.com/ocr_sdk/ (it can be used from .NET).

It is not cheap, but it works. Disclaimer: I work for ABBYY.

answered Oct 08 '22 00:10

Tomato

You're essentially looking for the CAPTCHA circumvention tools various researchers have tried, some with success.

Another approach would be to use a smoothing algorithm to interpolate the 96 DPI captures up to 300 DPI (e.g., in Photoshop), then run standard OCR tools on the result.
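
As a rough illustration of the upscaling idea, here is a minimal sketch that enlarges a capture by 300/96 (about 3.1x) with bicubic interpolation before it is passed to an OCR engine. It assumes System.Drawing (GDI+) is available, and the names ScreenshotUpscaler, ScaledSize and Upscale are made up for this example. Whether this actually improves recognition depends on the engine and the fonts involved.

```csharp
using System;
using System.Drawing;
using System.Drawing.Drawing2D;

public static class ScreenshotUpscaler
{
    // Pixel dimensions after resampling from sourceDpi to targetDpi.
    public static Size ScaledSize(int width, int height, double sourceDpi, double targetDpi)
    {
        double scale = targetDpi / sourceDpi;   // e.g. 300 / 96 = 3.125
        return new Size((int)Math.Round(width * scale), (int)Math.Round(height * scale));
    }

    // Returns an enlarged copy of the capture; the caller is responsible for disposing it.
    public static Bitmap Upscale(Bitmap source, double sourceDpi = 96, double targetDpi = 300)
    {
        Size size = ScaledSize(source.Width, source.Height, sourceDpi, targetDpi);
        Bitmap result = new Bitmap(size.Width, size.Height);
        result.SetResolution((float)targetDpi, (float)targetDpi);

        using (Graphics g = Graphics.FromImage(result))
        {
            // Bicubic interpolation smooths edges instead of producing blocky pixels.
            g.InterpolationMode = InterpolationMode.HighQualityBicubic;
            g.PixelOffsetMode = PixelOffsetMode.HighQuality;
            g.DrawImage(source, new Rectangle(0, 0, size.Width, size.Height));
        }
        return result;
    }
}
```

A typical call would be `using (Bitmap big = ScreenshotUpscaler.Upscale(capture)) { ... }`, with the enlarged bitmap then saved or streamed to the OCR engine.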

answered Oct 07 '22 23:10

joe snyder
