Is there any option in the Google Cloud Vision API to detect and return a table (rows and columns with headers) from a scanned image?

Question

We are using the Google Cloud Vision API to extract invoice fields. Does the API support detecting tables of data, or do we have to write custom code to detect tables?

Christopher P · Accepted Answer

The Google Cloud Vision API does not return data from forms in a structured way. However, the response does include the coordinates of the polygon that surrounds each piece of detected text (the boundingPoly). Take a look at this example:

{
    "description": "ABBEY",
    "boundingPoly": {
        "vertices": [
            {
                "x": 44,
                "y": 43
            }, ...
        ]
    }, ...
}

One approach is to determine the coordinates of each field on your invoice, then write code that iterates through the boundingPoly objects in the JSON response and checks whether the region enclosed by their vertices overlaps, to some degree, the region of a field. Where a boundingPoly falls in the same region as one of your fields, you can map those words to the field name, for example with a dictionary in Python.
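The approach described above could be sketched roughly as follows. This is only an illustration under several assumptions: the field names and rectangles in FIELD_REGIONS are hypothetical and would have to be measured on your own invoice layout, the 0.5 overlap threshold is arbitrary, the sample annotations (apart from the first vertex of "ABBEY") are made up, and the input is assumed to be a list of per-word annotations shaped like the JSON above.

```python
from typing import Dict, List, Tuple

# (x_min, y_min, x_max, y_max) in image pixels
Box = Tuple[int, int, int, int]

# Hypothetical field regions; measure these on your own invoice template.
FIELD_REGIONS: Dict[str, Box] = {
    "vendor": (30, 30, 200, 80),
    "invoice_number": (380, 30, 500, 80),
}


def bounding_box(annotation: dict) -> Box:
    """Reduce a boundingPoly to an axis-aligned rectangle."""
    vertices = annotation["boundingPoly"]["vertices"]
    # .get() guards against a coordinate key being absent from a vertex.
    xs = [v.get("x", 0) for v in vertices]
    ys = [v.get("y", 0) for v in vertices]
    return min(xs), min(ys), max(xs), max(ys)


def overlap_ratio(word: Box, region: Box) -> float:
    """Fraction of the word's rectangle that lies inside the region."""
    width = min(word[2], region[2]) - max(word[0], region[0])
    height = min(word[3], region[3]) - max(word[1], region[1])
    if width <= 0 or height <= 0:
        return 0.0
    word_area = (word[2] - word[0]) * (word[3] - word[1])
    return (width * height) / word_area if word_area else 0.0


def map_words_to_fields(
    annotations: List[dict],
    field_regions: Dict[str, Box],
    min_overlap: float = 0.5,  # arbitrary threshold, tune for your data
) -> Dict[str, List[str]]:
    """Assign each word to the first field region it sufficiently overlaps."""
    fields: Dict[str, List[str]] = {name: [] for name in field_regions}
    for annotation in annotations:
        box = bounding_box(annotation)
        for name, region in field_regions.items():
            if overlap_ratio(box, region) >= min_overlap:
                fields[name].append(annotation["description"])
                break
    return fields


# Made-up sample annotations in the same shape as the response excerpt.
annotations = [
    {"description": "ABBEY", "boundingPoly": {"vertices": [
        {"x": 44, "y": 43}, {"x": 110, "y": 43},
        {"x": 110, "y": 60}, {"x": 44, "y": 60}]}},
    {"description": "1234", "boundingPoly": {"vertices": [
        {"x": 400, "y": 45}, {"x": 450, "y": 45},
        {"x": 450, "y": 62}, {"x": 400, "y": 62}]}},
]

result = map_words_to_fields(annotations, FIELD_REGIONS)
print(result)
```

Measuring overlap as a fraction of the word's own area, rather than requiring the word to lie fully inside the region, tolerates small shifts between scans. The same idea could be extended to table cells by defining one region per column and grouping words by their y coordinates into rows, though that part is left out of this sketch.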

Tags:

detection

google-cloud-vision

Ajinkya Mulay
