
Incomplete coordinate values for Google Vision OCR

I have a script that is iterating through images of different forms. When parsing the Google Vision Text detection response, I use the XY coordinates in the 'boundingPoly' for each text item to specifically look for data in different parts of the form.

The problem I'm having is that some of the responses come back with vertices that have only an X coordinate. Example:

{u'description': u'sometext', u'boundingPoly': {u'vertices': [{u'x': 5595}, {u'x': 5717}, {u'y': 122, u'x': 5717}, {u'y': 122, u'x': 5595}]}}

I've added a try/except (using Python 2.7) to catch this, and the exception is always the same: KeyError: 'y'. I'm iterating through thousands of forms; so far it has happened to 10 rows out of 1000.

Has anyone had this issue before? Is there a fix other than re-submitting the request whenever this error occurs?

crld asked Sep 07 '16 20:09



1 Answer

From the docs:

boundingPoly

object(BoundingPoly)

The bounding polygon around the face. The coordinates of the bounding box are in the original image's scale, as returned in ImageParams. The bounding box is computed to "frame" the face in accordance with human expectations. It is based on the landmarker results. Note that one or more x and/or y coordinates may not be generated in the BoundingPoly (the polygon will be unbounded) if only a partial face appears in the image to be annotated.

I believe this implies that the missing 'y' value is 0 or, more generally, an edge value. In other words, the API can't tell where the bounding poly truly ends: the text runs all the way to the edge of the image, so the image doesn't contain enough information to confirm that the text actually ends there. As far as the image shows, it ends at a 'y' of 0.
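If that reading is right, a missing key can be treated as 0 instead of as an error. Here is a minimal sketch, assuming the response has already been parsed into a dict shaped like the one in the question. `normalize_vertices` is a hypothetical helper name, not part of any Vision client library, and defaulting to 0 is my interpretation of the docs rather than documented behavior.

```python
def normalize_vertices(annotation, default=0):
    """Return the (x, y) corners of an annotation's bounding polygon.

    Any coordinate missing from a vertex is replaced by `default`
    (assumed here to be 0, i.e. the image edge).
    """
    vertices = annotation.get('boundingPoly', {}).get('vertices', [])
    return [(v.get('x', default), v.get('y', default)) for v in vertices]


# The item from the question, with the two missing 'y' keys left as returned.
item = {
    'description': 'sometext',
    'boundingPoly': {'vertices': [
        {'x': 5595},
        {'x': 5717},
        {'y': 122, 'x': 5717},
        {'y': 122, 'x': 5595},
    ]},
}

corners = normalize_vertices(item)
print(corners)  # [(5595, 0), (5717, 0), (5717, 122), (5595, 122)]
```

Using `dict.get` with a default avoids the KeyError without a try/except, so the rows that currently fail could be handled in the same pass instead of being re-submitted.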

CivFan answered Oct 17 '22 12:10