Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

OCR confidence score from Google Vision API

I am using Google Vision OCR for extracting text from images in python.
Using the following code snippet.
However, the confidence score always shows 0.0 which is definitely incorrect.

How to extract the OCR confidence score for individual char or word from the Google response?

 content = cv2.imencode('.jpg', cv2.imread(file_name))[1].tostring()
 img = types.Image(content=content)
 response1 = client.text_detection(image=img, image_context={"language_hints": ["en"]})
 response_annotations = response1.text_annotations
 for x in response1.text_annotations:
      print(x)
      print(f'confidence:{x.confidence}')

Ex: output for an iteration

description: "Date:"
bounding_poly {
  vertices {
    x: 127
    y: 11
  }
  vertices {
    x: 181
    y: 10
  }
  vertices {
    x: 181
    y: 29
  }
  vertices {
    x: 127
    y: 30
  }
}

confidence:0.0
like image 238
letsBeePolite Avatar asked Jul 01 '20 17:07

letsBeePolite


People also ask

Is Google OCR better than Tesseract?

However, the quality of the Google Vision OCR is still better, especially on difficult cases such as very small text. Since the quality is most important to us, the Google Vision OCR wins the comparison in our use case.

What does Google use for OCR?

The Google OCR API is a subset of the Google Cloud Vision API. We can use Google OCR API to extract text from JPEG, GIF, PNG, and TIFF images. A number of Google products use this OCR technology, including Gmail and Google Drive.


2 Answers

I managed to reproduce your issue. I used the following function and obtained confidence 0.0 for all items.

from google.cloud import vision

def detect_text_uri(uri):
    client = vision.ImageAnnotatorClient()
    image = vision.types.Image()
    image.source.image_uri = uri

    response = client.text_detection(image=image)
    texts = response.text_annotations
    print('Texts:')

    for text in texts:
        print('\n"{}"'.format(text.description))

        vertices = (['({},{})'.format(vertex.x, vertex.y)
                    for vertex in text.bounding_poly.vertices])

        print('bounds: {}'.format(','.join(vertices)))
        print("confidence: {}".format(text.confidence))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))

However, when using the same image with the "Try the API" option in the documentation I obtained a result with confidences non 0. This happened also when detecting text from a local image.

One should expect confidences to have the same value using both methods. I've opened an issue tracker, check it here.

like image 157
aemon4 Avatar answered Oct 18 '22 03:10

aemon4


Working code that retrieves the right confidence values of GOCR response.

(using document_text_detection() instead of text_detection())

def detect_document(path):
    """Detects document features in an image."""
    from google.cloud import vision
    import io
    client = vision.ImageAnnotatorClient()

    # [START vision_python_migration_document_text_detection]
    with io.open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.types.Image(content=content)

    response = client.document_text_detection(image=image)

    for page in response.full_text_annotation.pages:
        for block in page.blocks:
            print('\nBlock confidence: {}\n'.format(block.confidence))

            for paragraph in block.paragraphs:
                print('Paragraph confidence: {}'.format(
                    paragraph.confidence))

                for word in paragraph.words:
                    word_text = ''.join([
                        symbol.text for symbol in word.symbols
                    ])
                    print('Word text: {} (confidence: {})'.format(
                        word_text, word.confidence))

                    for symbol in word.symbols:
                        print('\tSymbol: {} (confidence: {})'.format(
                            symbol.text, symbol.confidence))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))
    # [END vision_python_migration_document_text_detection]
# [END vision_fulltext_detection]

# add your own path
path = "gocr_vision.png"
detect_document(path)
like image 32
letsBeePolite Avatar answered Oct 18 '22 03:10

letsBeePolite