
Azure Cognitive Services OCR giving differing results - how to remedy?

Azure CS has an OCR demo (westcentralus endpoint) at

https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/?v=18.05

On a poor-quality test image (which I'm afraid I can't post because it's an identity document), I get OCR results that match the actual text 100%, for three test cases in fact - remarkable.

However, when I follow the sample at the URL below, with the westeurope endpoint, I get poorer OCR results - some text is missing:

https://docs.microsoft.com/en-us/azure/cognitive-services/Computer-vision/quickstarts/python-print-text

Why is this? More to the point - how do I access the v=18.05 endpoint?

Thanks in advance for any speedy help.

asked Jun 19 '18 by jtlz2




1 Answer

I think I see what is going on: the two pages you mention are not using the same operation.

If you read the paragraph just above the working demo you mention, it says:

Get started with the OCR service in general availability, and discover below a sneak peek of the new preview OCR engine (through "Recognize Text" API operation) with even better text recognition results for English.

And if you look at the other documentation you are pointing to (this one), it uses the OCR operation:

vision_base_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/"

ocr_url = vision_base_url + "ocr"
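
For context, here is a minimal sketch of how that quickstart calls the OCR operation in a single request (the subscription key and image URL below are placeholders, and the region should be the one your key belongs to):

import requests

subscription_key = "<your-subscription-key>"  # placeholder
vision_base_url = "https://westeurope.api.cognitive.microsoft.com/vision/v2.0/"  # use your key's region
ocr_url = vision_base_url + "ocr"

image_url = "https://example.com/some-image.jpg"  # placeholder

headers = {"Ocp-Apim-Subscription-Key": subscription_key}
params = {"language": "unk", "detectOrientation": "true"}
data = {"url": image_url}

# Single call: the OCR operation returns the recognition result directly
response = requests.post(ocr_url, headers=headers, params=params, json=data)
response.raise_for_status()
print(response.json())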

So if you want to use this new preview version, change the operation to recognizeText.

It is available in the West Europe region (see here), and I made a quick test: the samples provided on the Azure demo page work with this operation, but not with the other one.

But this time the operation needs two calls (a quick sketch follows the list):

  • One POST operation to submit your request (recognizeText operation), which returns a 202 Accepted answer with an operationId
  • One GET operation to get the results (textOperations operation), with your operationId from the previous step. For example: https://westeurope.api.cognitive.microsoft.com/vision/v2.0/textOperations/yourOperationId
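
Here is a minimal sketch of that two-step flow with the requests library (the key, image URL, mode parameter and polling interval are my own assumptions, not taken from the quickstart):

import time
import requests

subscription_key = "<your-subscription-key>"  # placeholder
vision_base_url = "https://westeurope.api.cognitive.microsoft.com/vision/v2.0/"
recognize_text_url = vision_base_url + "recognizeText"

image_url = "https://example.com/some-image.jpg"  # placeholder

headers = {"Ocp-Apim-Subscription-Key": subscription_key}
params = {"mode": "Printed"}  # assumption: printed text ("Handwritten" is the other mode)
data = {"url": image_url}

# Step 1: POST the image; a 202 Accepted reply carries the result URL
# (".../textOperations/<operationId>") in the Operation-Location header
response = requests.post(recognize_text_url, headers=headers, params=params, json=data)
response.raise_for_status()
operation_url = response.headers["Operation-Location"]

# Step 2: GET that URL until the text recognition has finished
while True:
    result = requests.get(operation_url, headers=headers).json()
    if result.get("status") in ("Succeeded", "Failed"):
        break
    time.sleep(1)  # arbitrary polling interval

print(result)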

DEMO:

For the CLOSED sign from Microsoft Demos:

Result with OCR operation:

{
  "language": "unk",
  "orientation": "NotDetected",
  "textAngle": 0.0,
  "regions": []
}

Result with RecognizeText:

{
  "status": "Succeeded",
  "recognitionResult": {
    "lines": [{
      "boundingBox": [174, 488, 668, 675, 617, 810, 123, 622],
      "text": "CLOSED",
      "words": [{
        "boundingBox": [164, 494, 659, 673, 621, 810, 129, 628],
        "text": "CLOSED"
      }]
    }, {
      "boundingBox": [143, 641, 601, 811, 589, 843, 132, 673],
      "text": "WHEN ONE DOOR CLOSES, ANOTHER",
      "words": [{
        "boundingBox": [147, 646, 217, 671, 205, 698, 134, 669],
        "text": "WHEN"
      }, {
        "boundingBox": [230, 675, 281, 694, 269, 724, 218, 703],
        "text": "ONE"
      }, {
        "boundingBox": [291, 697, 359, 722, 348, 754, 279, 727],
        "text": "DOOR"
      }, {
        "boundingBox": [370, 726, 479, 767, 469, 798, 359, 758],
        "text": "CLOSES,"
      }, {
        "boundingBox": [476, 766, 598, 812, 588, 839, 466, 797],
        "text": "ANOTHER"
      }]
    }, {
      "boundingBox": [56, 668, 645, 886, 633, 919, 44, 700],
      "text": "OPENS.ALL YOU HAVE TO DO IS WALK IN",
      "words": [{
        "boundingBox": [74, 677, 223, 731, 213, 764, 65, 707],
        "text": "OPENS.ALL"
      }, {
        "boundingBox": [233, 735, 291, 756, 280, 789, 223, 767],
        "text": "YOU"
      }, {
        "boundingBox": [298, 759, 377, 788, 367, 821, 288, 792],
        "text": "HAVE"
      }, {
        "boundingBox": [387, 792, 423, 805, 413, 838, 376, 824],
        "text": "TO"
      }, {
        "boundingBox": [431, 808, 472, 824, 461, 855, 420, 841],
        "text": "DO"
      }, {
        "boundingBox": [479, 826, 510, 838, 499, 869, 468, 858],
        "text": "IS"
      }, {
        "boundingBox": [518, 841, 598, 872, 587, 901, 506, 872],
        "text": "WALK"
      }, {
        "boundingBox": [606, 875, 639, 887, 627, 916, 594, 904],
        "text": "IN"
      }]
    }]
  }
}
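
To pull the plain text out of that RecognizeText response, a short snippet like this does the job (assuming the JSON above has been parsed into a variable named result, as in the sketch earlier):

lines = [line["text"] for line in result["recognitionResult"]["lines"]]
print("\n".join(lines))
# CLOSED
# WHEN ONE DOOR CLOSES, ANOTHER
# OPENS.ALL YOU HAVE TO DO IS WALK IN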
answered Sep 21 '22 by Nicolas R