Does anyone know of a way to reference the Microsoft.Windows.Ocr (WindowsPreview.Media.Ocr.dll) Assembly in a server-side ASP.Net Web application such as MVC4 Web API, and use the OCR functionality in that assembly to take a photo image as input and extract the text content from it? If yes, please provide detailed instructions in your answer.
I am building a web application that takes an image uploaded to the Server (via a file upload UI screen) and then reads the text using OCR and displays the text on the next page, right next to the image that was uploaded.
Since most commercial OCR Libraries cost an arm and a leg (over $1,300 last time I checked), I thought I could try the Microsoft OCR Library, Microsoft.Windows.Ocr, which is FREE and seems to be very simple and straightforward to use.
So I tried to install the Microsoft.Windows.Ocr Nuget Package to my ASP.Net MVC4 Web API Project and that succeeded.
After that, I looked through my MVC4 Web API Project References, and to my surprise, did not find a reference to Microsoft.Windows.Ocr.dll Assembly.
So then I tried to add a reference to the x86 version of the Microsoft.Windows.Ocr.dll Assembly by browsing to that Assembly in the \packages folder, and selected the WindowsPreview.Media.Ocr.dll from the \lib\win81\x86 folder.
Note: The Assembly name is WindowsPreview.Media.Ocr.dll and not Microsoft.Windows.Ocr.dll , not sure why!
When I did that and clicked OK, I got the following Error Message.
---------------------------
Microsoft Visual Studio
---------------------------
A reference to
'D:\TestProjects\packages\Microsoft.Windows.Ocr.1.0.0\lib\win81\x86\
WindowsPreview.Media.Ocr.dll' could not be added. Please make sure
that the file is accessible, and that it is a valid assembly
or COM component.
---------------------------
OK
---------------------------
I then found out from the Nuget Page that the "Supported Platforms" are only Windows Phone 8, Windows Phone 8.1, Windows 8.1 (Windows Store apps only).
But surely, there must be a way to use this OCR dll on the Server-side in an ASP.Net Application?
Any "hacks" and/or Sample code would be much appreciated!!
Thank you!!
If you are using Visual Studio 2015 and Windows 10, Microsoft.Windows.Ocr has been moved to the Universal Windows Platform. It is available as Windows.Media.Ocr, so you need to upgrade your VS 2015 with the tools for Windows 10 enabled.
I did the following and Windows.Media.Ocr got added as a reference in my Web API.
Note: The following works only with VS 2015 and Windows 10, and VS 2015 must be updated for the Universal Windows Platform (UWP). Check this for a sample OCR.
Hope this helps.
Update: It got imported into my references but fails to load. Hope it provides a starting point for people. Thanks!
You can Skip to Update 2 below for a working solution.
IT WILL THROW A TYPE LOAD EXCEPTION. That being said, I am posting because I am trying to do the same thing but can't get the project to run. Here are some basic instructions on how to get the WinRT API into a project that is not a Windows Store app.
http://weblogs.thinktecture.com/cnagel/2012/10/calling-winrt-from-windows-desktop-apps.html
Also, don't try to reference the dll; reference the winmd file instead.
Here is a sample console app that references the OCR library, but when you run the solution it throws the type load exception (https://github.com/Xandroid4Net/MicrsoftOcrConsoleApp). It should be easy to port from a console app to an ASP.NET application. I don't know how to fix the type load exception; maybe you can get farther than I can. Please post if you do find a solution.
More digging revealed the following assembly binding error. Any idea how to set a package Id for a Process?
File: WindowsPreview.Media.Ocr!WindowsPreview.Media.Ocr.OcrEngine, Version=255.255.255.255, Culture=neutral, PublicKeyToken=null, ContentType=WindowsRuntime.htm
File Contents:
* Assembly Binder Log Entry (12/1/2014 @ 11:48:01 PM) *
The operation failed. Bind result: hr = 0x80073d54. The process has no package identity.
Assembly manager loaded from: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\clr.dll
Running under executable C:\Users\Wesley\Documents\Dev\ConsoleApplication2\Program.exe
--- A detailed error log follows.
BEGIN : Windows Runtime Type bind.
END : The process has no package identity. (Exception from HRESULT: 0x80073D54)
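The HRESULT in the log can be unpacked to see which subsystem raised it. Below is a minimal sketch in Python, assuming the standard HRESULT layout (severity in bit 31, facility in bits 16-26, code in bits 0-15). The helper name `decode_hresult` is hypothetical and not part of any library.

```python
def decode_hresult(hresult: int) -> dict:
    """Split a 32-bit HRESULT into its severity, facility and code fields."""
    hresult &= 0xFFFFFFFF  # normalise signed (negative) values to unsigned
    return {
        "failure": bool(hresult >> 31),       # bit 31: set means failure
        "facility": (hresult >> 16) & 0x7FF,  # bits 16-26: originating facility
        "code": hresult & 0xFFFF,             # bits 0-15: error code
    }


# The value reported in the assembly binder log above.
print(decode_hresult(0x80073D54))
```

Facility 7 is FACILITY_WIN32, and code 15700 is listed in winerror.h as APPMODEL_ERROR_NO_PACKAGE, which matches the "process has no package identity" text in the log.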
Update 2: This is a nasty workaround, but it worked on my Windows 8.1 Surface Pro 3 tablet. The WebOcr project is a Web Forms app, but it can easily be refactored for MVC.
1) Download https://github.com/Xandroid4Net/CommandLineOcr. This is the pseudo store app.
2) Build and run the app. It will appear to do nothing. That is okay, it will update the registry with a new key that we will need.
3) Download https://github.com/Xandroid4Net/WebOcr.
4) Find OcrCommandLineCaller.cs in the WebOcr project.
5) Locate the registry key at HKEY_CURRENT_USER\Software\Classes\ActivatableClasses\Package\Some_Sort_Of_Guid_For_Your_APP\Server\App.App....\AppUserModelId
Refer to the SO question "IApplicationActivationManager::ActivateApplication in C#?" if you need more help finding the registry key.
6) Update the appActiveManager.ActivateApplication call in OcrCommandLineCaller.cs with the strange guid app identifier found in the registry key.
7) In default.aspx.cs, replace the saveAsPath with the path on your machine. In the Windows Store app there is a static path represented by Windows.Storage.ApplicationData.Current.LocalFolder. This is the path where I saved my images, for simplicity.
8) Modify any of the code to your heart's content and let me know if you have any questions.
This is a very rough and nasty solution, but it does work.
I have been using the MODI solution packaged with MS Office for a few years, and was pretty happy with it (it was free if you bought Office). I was rather disappointed when it was discontinued. I've tried Tesseract... I really wanted to like it, but found it slow and inaccurate for good-quality Dutch machine type... and like you, I could not justify spending anything north of $200 for what was essentially a hobby project.
After a desperate search, someone on here pointed me at TOCR (a Transym product). An epiphany followed soon after. ;-) I think I need to say at this point that I am not affiliated with Transym in any way, and yes, I paid the full price... of 60 pounds! (no typo... sixty quid), which, including VAT, worked out to 113 euros.
It is essentially meant for integrators (it includes a scan/viewer/OCR app, but that is merely meant as a demo - if you buy the license, you get its source code). The API is outdated (it is, as OCR engines go, quite a mature code base), but it is fast, stable, and unexpectedly accurate. Not as accurate as the DokuStar engine or other esoteric engines, but for my application (Dutch and English machine type) it holds its own against various engines that are well north of $1000. Recognition accuracy on Dutch machine type is excellent (it doesn't do handwriting). In my opinion, in terms of value for money, it is simply ridiculously good. As to the API: I wrote a rudimentary .NET wrapper around it to suit my needs - this was done in a few evenings.
There is an eval version available on their web site (http://www.transym.com/index.htm). And no, I don't get any money if you do ;-)