How do I use MODI in an ASP.Net Web Application?

Tags: asp.net, ocr, modi

I've written an OCR wrapper library around the Microsoft Office Document Imaging COM API, and in a Console App running locally, it works flawlessly, with every test.

Sadly, things start going badly when we attempt to integrate it with a WCF service running as an ASP.Net Web Application under IIS 6. We had issues releasing the MODI COM objects, and plenty of examples on the web helped us with that.

However, problems still remain. If I restart IIS, and do a fresh deployment of the web app, the first few OCR attempts work great. If I leave it for 30 minutes or so, and then do another request, I get server failure errors like this:

The server threw an exception. (Exception from HRESULT: 0x80010105 (RPC_E_SERVERFAULT)): at MODI.DocumentClass.Create(String FileOpen)

From this point on, every request will fail to do the OCR, until I reset IIS, and the cycle begins again.

We run this application in its own App Pool, and it runs under an identity with Local Admin rights.

UPDATE: This issue can be solved by doing the OCR stuff out of process. It appears as though the MODI library doesn't play well with managed code, when it comes to cleaning up after itself, so spawning new processes for each OCR request worked well in my situation.

Here is the class that performs the OCR:

public class ImageReader : IDisposable
{
    private MODI.Document _document;
    private MODI.Images _images;
    private MODI.Image _image;
    private MODI.Layout _layout;
    private ManualResetEvent _completedOCR = new ManualResetEvent(false);

    // SNIP - Code removed for clarity

    private string PerformMODI(string fileName)
    {
        _document = new MODI.Document();
        _document.OnOCRProgress += new MODI._IDocumentEvents_OnOCRProgressEventHandler(_document_OnOCRProgress);
        _document.Create(fileName);

        _document.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);
        _completedOCR.WaitOne(5000);
        _document.Save();
        _images = _document.Images;
        _image = (MODI.Image)_images[0];
        _layout = _image.Layout;
        string text = _layout.Text;
        _document.Close(false);
        return text;
    }

    void _document_OnOCRProgress(int Progress, ref bool Cancel)
    {
        if (Progress == 100)
        {
            _completedOCR.Set();
        }
    }

    private static void SetComObjectToNull(params object[] objects)
    {
        for (int i = 0; i < objects.Length; i++)
        {
            object o = objects[i];
            if (o != null)
            {
                Marshal.FinalReleaseComObject(o);
                o = null;
            }
        }
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public void Dispose()
    {
        SetComObjectToNull(_layout, _image, _images, _document);
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
}

I then create an instance of ImageReader inside a using block (which will call IDisposable.Dispose on exit).

Calling Marshal.FinalReleaseComObject should instruct the CLR to release the COM objects, and so I'm at a loss to figure out what would be causing the symptoms we have.
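One detail in the class above: inside SetComObjectToNull, `o = null` only clears the local variable, so the fields still reference the released wrappers after Dispose. A minimal sketch of a helper that clears the caller's field through a `ref` parameter follows. The names `ComCleanup` and `ReleaseAndClear` are hypothetical, and this is not claimed to be the cause of the RPC_E_SERVERFAULT error.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ComCleanup
{
    // Hypothetical helper: releases the runtime callable wrapper (if the
    // object is one) and then clears the caller's reference, so the field
    // cannot be used again after it has been released.
    public static void ReleaseAndClear<T>(ref T comObject) where T : class
    {
        if (comObject == null)
        {
            return;
        }

        // Only real COM wrappers can be passed to FinalReleaseComObject.
        if (Marshal.IsComObject(comObject))
        {
            Marshal.FinalReleaseComObject(comObject);
        }

        // Assigning through the ref parameter clears the caller's variable.
        comObject = null;
    }
}
```

In Dispose this would be called once per field, in the same order as before, for example `ComCleanup.ReleaseAndClear(ref _layout);` followed by `_image`, `_images` and `_document`.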

For what it's worth, when this code runs outside of IIS, in say a Console App, everything seems bulletproof. It works every time.

Any tips that help me diagnose and solve this issue would be an immense help and I'll upvote like crazy! ;-)

Thanks!

Scott Ferguson asked Aug 28 '09 00:08



1 Answer

Have you thought of hosting the OCR portion of your app out of process?

Having a service can give you tons of flexibility:

  1. You can define a simple end point for your web application, and access it via remoting or WCF.
  2. If things go pear-shaped and the library is dodgy, you can have the service launch a separate process every time you need to perform OCR. This gives you extreme safety at a small extra cost. I would assume that OCR is MUCH more expensive than spinning up a process.
  3. You can keep an instance of the COM object around. If memory starts leaking, the service can restart itself without impacting the web site (if you are careful).
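Option 2 could look roughly like the sketch below. It assumes a separate console app (hypothetically `OcrWorker.exe`, not shown here) that wraps the MODI calls, takes the image path as its only argument, writes the recognised text to standard output, and exits with code 0 on success. The class name `OutOfProcessOcr` and the timeout value are also assumptions for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class OutOfProcessOcr
{
    // Starts the worker once per request and returns whatever it wrote to
    // standard output. workerPath is a hypothetical console app wrapping MODI.
    public static string Run(string workerPath, string imagePath, int timeoutMs)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = workerPath,
            // Simple quoting only: paths containing quote characters are not handled.
            Arguments = "\"" + imagePath + "\"",
            UseShellExecute = false,
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            // Read output asynchronously so a full pipe cannot block the worker.
            var output = new StringBuilder();
            process.OutputDataReceived += (sender, e) =>
            {
                if (e.Data != null)
                {
                    output.AppendLine(e.Data);
                }
            };
            process.BeginOutputReadLine();

            if (!process.WaitForExit(timeoutMs))
            {
                // A hung worker is killed; the web app itself is unaffected.
                process.Kill();
                throw new TimeoutException("OCR worker did not finish in time.");
            }

            // The parameterless overload waits for the async output handlers to drain.
            process.WaitForExit();

            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    "OCR worker failed with exit code " + process.ExitCode);
            }

            return output.ToString();
        }
    }
}
```

The web application would then call something like `OutOfProcessOcr.Run(workerPath, fileName, 60000)`. Because every request gets a fresh process, any COM state the library leaves behind disappears when the worker exits.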

Personally, I have found in the past that COM interop + IIS = grief.

Sam Saffron answered Sep 27 '22 23:09
