I have a project that makes use of the Google Vision API DOCUMENT_TEXT_DETECTION feature to extract text from document images.
The API often has trouble recognizing single digits, as you can see in this image:
I suspect the problem is related to some noise-removal step that treats isolated single digits as noise. Is there a way to improve the Vision response in these situations (for example, by adjusting a noise threshold or other parameters)?
At other times Vision confuses digits with letters:
But if I specify the parameter languageHints = 'en' or 'mt', these digits are ignored by the OCR. Is there a way to force the recognition of digits or Latin characters?
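For context, the request looks roughly like the sketch below, written with the Google Cloud Vision Python client; the file path and hint values are illustrative, not my exact setup:

```python
from google.cloud import vision

client = vision.ImageAnnotatorClient()

# Load the document image (path is illustrative).
with open("document.png", "rb") as f:
    image = vision.Image(content=f.read())

# DOCUMENT_TEXT_DETECTION with language hints; with 'en' or 'mt'
# set here, isolated single digits sometimes disappear from the result.
response = client.document_text_detection(
    image=image,
    image_context={"language_hints": ["en"]},
)

print(response.full_text_annotation.text)
```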
Optical character recognition (OCR) is a form of image processing that extracts text from a given image, such as a photo of a document. Various applications and technologies have been developed to aid this process, from Adobe Acrobat to ML-based tools such as Tesseract OCR.
Unfortunately, I think the Vision API is optimized for the two ends of the spectrum: dense text (DOCUMENT_TEXT_DETECTION) on one end, and arbitrary bits of text (TEXT_DETECTION) on the other. As you noted in the comments, the regular TEXT_DETECTION works better for these stray single digits, while DOCUMENT_TEXT_DETECTION works better overall.
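If it helps, one pragmatic workaround is to run both feature types on the same image and merge the results, using TEXT_DETECTION to recover the stray digits. This is a sketch with the Python client (file path is a placeholder), not an official recommendation:

```python
from google.cloud import vision

client = vision.ImageAnnotatorClient()

with open("document.png", "rb") as f:  # path is illustrative
    image = vision.Image(content=f.read())

# Dense-layout pass: better overall structure and reading order.
doc_response = client.document_text_detection(image=image)
doc_text = doc_response.full_text_annotation.text

# Sparse-text pass: tends to pick up isolated single digits.
sparse_response = client.text_detection(image=image)
# text_annotations[0] is the full block; the rest are individual words.
sparse_words = [a.description for a in sparse_response.text_annotations[1:]]

# Anything found only by TEXT_DETECTION (e.g. a stray digit) can then
# be merged back into the document result, e.g. by bounding box.
missing = [w for w in sparse_words if w not in doc_text]
print("Found only by TEXT_DETECTION:", missing)
```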
As far as I've heard, there are no current plans to try to cover both of these in a single way, but it's possible that this could improve in the future.
I think there have been other requests to do more fine-tuning and hinting on what you're looking to detect (e.g., here and here), but this doesn't seem to be available yet. Perhaps in the future you'll be able to provide more hints on the format of the text that you're looking to find in images (e.g., phone numbers, single digits, etc).