Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get correct image orientation by Google Cloud Vision api (TEXT_DETECTION)

I tried Google Cloud Vision api (TEXT_DETECTION) on 90 degrees rotated image. It still can return recognized text correctly. (see image below)

That means the engine can recognize text even the image is 90, 180, 270 degrees rotated.

However the response result doesn't include information of correct image orientation. (document: EntityAnnotation)

Is there anyway to not only get recognized text but also get the orientation?
Could Google support it similar to (FaceAnnotation: getRollAngle)

enter image description here

like image 835
Jack Fan Avatar asked Dec 22 '16 14:12

Jack Fan


3 Answers

You can leverage the fact that we know the sequence of characters in a word to infer the orientation of a word as follows (obviously slightly different logic for non-LTR languages):

for page in annotation:
    for block in page.blocks:
        for paragraph in block.paragraphs:
            for word in paragraph.words:
                if len(word.symbols) < MIN_WORD_LENGTH_FOR_ROTATION_INFERENCE:
                    continue
                first_char = word.symbols[0]
                last_char = word.symbols[-1]
                first_char_center = (np.mean([v.x for v in first_char.bounding_box.vertices]),np.mean([v.y for v in first_char.bounding_box.vertices]))
                last_char_center = (np.mean([v.x for v in last_char.bounding_box.vertices]),np.mean([v.y for v in last_char.bounding_box.vertices]))

                #upright or upside down
                if np.abs(first_char_center[1] - last_char_center[1]) < np.abs(top_right.y - bottom_right.y): 
                    if first_char_center[0] <= last_char_center[0]: #upright
                        print 0
                    else: #updside down
                        print 180
                else: #sideways
                    if first_char_center[1] <= last_char_center[1]:
                        print 90
                    else:
                        print 270

Then you can use the orientation of individual words to infer the orientation of the document overall.

like image 54
faaez Avatar answered Nov 20 '22 11:11

faaez


As described in the Public Issue Tracker, our engineering team is now aware of this feature request, and there is currently no ETA for its implementation.

Note, orientation information may already be available in your image's metadata. An example of how to extract the metadata can be seen in this Third-party library.

A broad workaround would be to check the returned "boundingPoly" "vertices" for the returned "textAnnotations". By calculating the width and height of each detected word's rectangle, you can figure out if an image is not right-side-up if the rectangle 'height' > 'width' (aka the image is sideways).

like image 5
Jordan Avatar answered Nov 20 '22 13:11

Jordan


I post my workaround which really works for images 90, 180, 270 degrees rotated. Please see the code below.

GetExifOrientation(annotateImageResponse.getTextAnnotations().get(1));
/**
 *
 * @param ea  The input EntityAnnotation must be NOT from the first EntityAnnotation of
 *            annotateImageResponse.getTextAnnotations(), because it is not affected by
 *            image orientation.
 * @return Exif orientation (1 or 3 or 6 or 8)
 */
public static int GetExifOrientation(EntityAnnotation ea) {
    List<Vertex> vertexList = ea.getBoundingPoly().getVertices();
    // Calculate the center
    float centerX = 0, centerY = 0;
    for (int i = 0; i < 4; i++) {
        centerX += vertexList.get(i).getX();
        centerY += vertexList.get(i).getY();
    }
    centerX /= 4;
    centerY /= 4;

    int x0 = vertexList.get(0).getX();
    int y0 = vertexList.get(0).getY();

    if (x0 < centerX) {
        if (y0 < centerY) {
            //       0 -------- 1
            //       |          |
            //       3 -------- 2
            return EXIF_ORIENTATION_NORMAL; // 1
        } else {
            //       1 -------- 2
            //       |          |
            //       0 -------- 3
            return EXIF_ORIENTATION_270_DEGREE; // 6
        }
    } else {
        if (y0 < centerY) {
            //       3 -------- 0
            //       |          |
            //       2 -------- 1
            return EXIF_ORIENTATION_90_DEGREE; // 8
        } else {
            //       2 -------- 3
            //       |          |
            //       1 -------- 0
            return EXIF_ORIENTATION_180_DEGREE; // 3
        }
    }
}

More info
I found I have to add language hint to make annotateImageResponse.getTextAnnotations().get(1) always follow the rule.

Sample code to add language hint

ImageContext imageContext = new ImageContext();
String [] languages = { "zh-TW" };
imageContext.setLanguageHints(Arrays.asList(languages));
annotateImageRequest.setImageContext(imageContext);
like image 5
Jack Fan Avatar answered Nov 20 '22 11:11

Jack Fan