I am using Google Vision OCR to extract text from images in Python, using the code snippet below.
However, the confidence score always shows 0.0, which is definitely incorrect.
How can I extract the OCR confidence score for an individual character or word from the Google response?
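import cv2
from google.cloud import vision
from google.cloud.vision import types  # available in google-cloud-vision < 2.0

client = vision.ImageAnnotatorClient()
# Re-encode the image as JPEG bytes; tobytes() replaces the deprecated tostring()
content = cv2.imencode('.jpg', cv2.imread(file_name))[1].tobytes()
img = types.Image(content=content)
response1 = client.text_detection(image=img, image_context={"language_hints": ["en"]})
response_annotations = response1.text_annotations
for x in response_annotations:
    print(x)
    print(f'confidence:{x.confidence}')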
Example output for one iteration:
description: "Date:"
bounding_poly {
  vertices {
    x: 127
    y: 11
  }
  vertices {
    x: 181
    y: 10
  }
  vertices {
    x: 181
    y: 29
  }
  vertices {
    x: 127
    y: 30
  }
}
confidence:0.0
I managed to reproduce your issue. I used the following function and obtained confidence 0.0 for all items.
from google.cloud import vision

def detect_text_uri(uri):
    client = vision.ImageAnnotatorClient()
    image = vision.types.Image()
    image.source.image_uri = uri
    response = client.text_detection(image=image)
    texts = response.text_annotations
    print('Texts:')
    for text in texts:
        print('\n"{}"'.format(text.description))
        vertices = (['({},{})'.format(vertex.x, vertex.y)
                     for vertex in text.bounding_poly.vertices])
        print('bounds: {}'.format(','.join(vertices)))
        print("confidence: {}".format(text.confidence))
    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))
However, when I used the same image with the "Try the API" option in the documentation, I obtained a result with non-zero confidences. The same happened when detecting text from a local image.
One would expect the confidences to have the same values with both methods. I've opened an issue in the issue tracker; check it here.
Working code that retrieves the correct confidence values from the Google OCR response (using document_text_detection() instead of text_detection()):
def detect_document(path):
    """Detects document features in an image."""
    from google.cloud import vision
    import io
    client = vision.ImageAnnotatorClient()

    # [START vision_python_migration_document_text_detection]
    with io.open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.types.Image(content=content)
    response = client.document_text_detection(image=image)

    for page in response.full_text_annotation.pages:
        for block in page.blocks:
            print('\nBlock confidence: {}\n'.format(block.confidence))
            for paragraph in block.paragraphs:
                print('Paragraph confidence: {}'.format(
                    paragraph.confidence))
                for word in paragraph.words:
                    word_text = ''.join([
                        symbol.text for symbol in word.symbols
                    ])
                    print('Word text: {} (confidence: {})'.format(
                        word_text, word.confidence))
                    for symbol in word.symbols:
                        print('\tSymbol: {} (confidence: {})'.format(
                            symbol.text, symbol.confidence))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))
    # [END vision_python_migration_document_text_detection]

# add your own path
path = "gocr_vision.png"
detect_document(path)
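If you want the confidences as data rather than printed output, the same pages > blocks > paragraphs > words traversal can be wrapped in a small helper. The following is only a sketch: word_confidences, low_confidence_words, the 0.8 threshold, and the fake_annotation stand-in are hypothetical names and values chosen for illustration, not part of the Vision client library. The stand-in only mimics the attributes the loop above reads, so the snippet runs without credentials; in real use you would pass response.full_text_annotation instead.

```python
from types import SimpleNamespace


def word_confidences(full_text_annotation):
    """Return (word_text, confidence) pairs by walking
    pages -> blocks -> paragraphs -> words."""
    results = []
    for page in full_text_annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    # A word's text is the concatenation of its symbols.
                    text = ''.join(symbol.text for symbol in word.symbols)
                    results.append((text, word.confidence))
    return results


def low_confidence_words(full_text_annotation, threshold=0.8):
    """Keep only words whose confidence is below the (arbitrary) threshold."""
    return [(text, conf)
            for text, conf in word_confidences(full_text_annotation)
            if conf < threshold]


def _word(text, confidence):
    """Hypothetical stand-in for a word object: one symbol per character."""
    symbols = [SimpleNamespace(text=ch, confidence=confidence) for ch in text]
    return SimpleNamespace(symbols=symbols, confidence=confidence)


# Hypothetical stand-in for response.full_text_annotation (made-up values).
fake_annotation = SimpleNamespace(pages=[
    SimpleNamespace(blocks=[
        SimpleNamespace(paragraphs=[
            SimpleNamespace(words=[_word("Date:", 0.98), _word("l0/12", 0.41)])
        ])
    ])
])

print(word_confidences(fake_annotation))
print(low_confidence_words(fake_annotation))
```

With a real response, the call would be word_confidences(response.full_text_annotation), where response comes from client.document_text_detection(image=image) as in the answer above.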