
Calculating aspect ratio of Perspective Transform destination image

I've recently implemented a perspective transform with OpenCV in my Android app. Almost everything works as expected, but one aspect still needs considerably more work.

The problem is that I don't know how to calculate the correct aspect ratio of the destination image of the perspective transform (it should not have to be set manually), so that it matches the aspect ratio of the real object regardless of the camera angle. Note that the starting coordinates do not necessarily form a trapezoid; they form an arbitrary quadrangle.

Say I have a photograph of a book taken from roughly 45 degrees, and I want the destination image's aspect ratio to be pretty much the same as the book's. That is hard to do from a single 2D photo, yet the CamScanner app does it perfectly. I've implemented a very simple way of computing the size of my destination image (with no expectation that it would solve this problem), but it makes the image from a 45 degree angle about 20% too short, and as the angle decreases the image height shrinks significantly, while CamScanner keeps the proportions right regardless of the angle:

[Image: photo of a book taken at an angle, next to CamScanner's rectified output]

Here, CamScanner keeps the aspect ratio of the destination image (the second one) essentially the same as the book's, and it does so quite accurately even at a ~20 degree angle.

Meanwhile, my code looks like this (the destination-size calculation below makes no attempt to solve the problem described in this question):

public static Mat PerspectiveTransform(Point[] cropCoordinates, float ratioW, float ratioH, Bitmap croppedImage)
{
    if (cropCoordinates.length != 4) return null;

    double width1, width2, height1, height2, avgw, avgh;
    Mat src = new Mat();
    List<Point> startCoords = new ArrayList<>();
    List<Point> resultCoords = new ArrayList<>();

    Utils.bitmapToMat(croppedImage, src);

    for (int i = 0; i < 4; i++)
    {
        if (cropCoordinates[i].y < 0) cropCoordinates[i] = new Point(cropCoordinates[i].x, 0); // clamp negative y to the top edge
        startCoords.add(new Point(cropCoordinates[i].x * ratioW, cropCoordinates[i].y * ratioH));
    }

    // side lengths of the quadrangle; opposite sides are averaged
    width1 = Math.sqrt(Math.pow(startCoords.get(2).x - startCoords.get(3).x, 2) + Math.pow(startCoords.get(2).y - startCoords.get(3).y, 2));
    width2 = Math.sqrt(Math.pow(startCoords.get(1).x - startCoords.get(0).x, 2) + Math.pow(startCoords.get(1).y - startCoords.get(0).y, 2));
    height1 = Math.sqrt(Math.pow(startCoords.get(1).x - startCoords.get(2).x, 2) + Math.pow(startCoords.get(1).y - startCoords.get(2).y, 2));
    height2 = Math.sqrt(Math.pow(startCoords.get(0).x - startCoords.get(3).x, 2) + Math.pow(startCoords.get(0).y - startCoords.get(3).y, 2));
    avgw = (width1 + width2) / 2;
    avgh = (height1 + height2) / 2;

    resultCoords.add(new Point(0, 0));
    resultCoords.add(new Point(avgw-1, 0));
    resultCoords.add(new Point(avgw-1, avgh-1));
    resultCoords.add(new Point(0, avgh-1));

    // getPerspectiveTransform expects both point sets as CV_32FC2;
    // vector_Point2f_to_Mat already produces that type
    Mat start = Converters.vector_Point2f_to_Mat(startCoords);
    Mat result = Converters.vector_Point2f_to_Mat(resultCoords);

    Mat mat = new Mat();
    Mat perspective = Imgproc.getPerspectiveTransform(start, result);
    Imgproc.warpPerspective(src, mat, perspective, new Size(avgw, avgh));

    return mat;
}

And from roughly the same angle, my method produces this result:

[Image: output of the method above; the image is noticeably shortened]

What I want to know is how this can be done. How did they manage to estimate the proportions of the object from nothing but the coordinates of its 4 corners? If possible, please provide some code, a mathematical explanation, or articles on the same or a similar technique.

Thank you in advance.

asked Jul 09 '16 by Dainius Šaltenis

1 Answer

This has come up a few times before on SO, but I've never seen a full answer, so here goes. The implementation shown here is based on this paper, which derives the full equations: http://research.microsoft.com/en-us/um/people/zhang/papers/tr03-39.pdf

Essentially, it shows that under a pinhole camera model it is possible to recover the aspect ratio of a projected rectangle (though not its scale, unsurprisingly): one first solves for the focal length, then computes the aspect ratio. Here's a sample implementation in Python using OpenCV. Note that you need to supply the 4 detected corners in the right order or it won't work (the order is a zigzag: top-left, top-right, bottom-left, bottom-right). The reported error rates are in the 3-5% range.
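
For reference, these are the relations from the paper that the code below implements. With $m_1,\dots,m_4$ the detected corners in homogeneous coordinates (zigzag order, third component 1) and $(u_0, v_0)$ the image center:

$$k_2 = \frac{(m_1 \times m_4)\cdot m_3}{(m_2 \times m_4)\cdot m_3},\qquad k_3 = \frac{(m_1 \times m_4)\cdot m_2}{(m_3 \times m_4)\cdot m_2},\qquad n_2 = k_2\,m_2 - m_1,\qquad n_3 = k_3\,m_3 - m_1$$

$$f = \sqrt{\left|\,\frac{1}{n_{23}\,n_{33}}\Big[\big(n_{21}n_{31} - (n_{21}n_{33}+n_{23}n_{31})\,u_0 + n_{23}n_{33}\,u_0^2\big) + \big(n_{22}n_{32} - (n_{22}n_{33}+n_{23}n_{32})\,v_0 + n_{23}n_{33}\,v_0^2\big)\Big]\right|}$$

$$\left(\frac{w}{h}\right)^2 = \frac{n_2^{\top} A^{-\top} A^{-1}\, n_2}{n_3^{\top} A^{-\top} A^{-1}\, n_3},\qquad A = \begin{pmatrix} f & 0 & u_0\\ 0 & f & v_0\\ 0 & 0 & 1 \end{pmatrix}$$

where $n_{ij}$ denotes the $j$-th component of $n_i$, and $A$ is the assumed intrinsic matrix (square pixels, principal point at the image center).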

import math
import cv2
import scipy.spatial.distance
import numpy as np

img = cv2.imread('img.png')
(rows,cols,_) = img.shape

#image center
u0 = (cols)/2.0
v0 = (rows)/2.0

#detected corners on the original image
p = []
p.append((67,74))
p.append((270,64))
p.append((10,344))
p.append((343,331))

#widths and heights of the projected image
w1 = scipy.spatial.distance.euclidean(p[0],p[1])
w2 = scipy.spatial.distance.euclidean(p[2],p[3])

h1 = scipy.spatial.distance.euclidean(p[0],p[2])
h2 = scipy.spatial.distance.euclidean(p[1],p[3])

w = max(w1,w2)
h = max(h1,h2)

#visible aspect ratio
ar_vis = float(w)/float(h)

#make numpy arrays and append 1 for linear algebra
m1 = np.array((p[0][0],p[0][1],1)).astype('float32')
m2 = np.array((p[1][0],p[1][1],1)).astype('float32')
m3 = np.array((p[2][0],p[2][1],1)).astype('float32')
m4 = np.array((p[3][0],p[3][1],1)).astype('float32')

#calculate the focal distance
k2 = np.dot(np.cross(m1,m4),m3) / np.dot(np.cross(m2,m4),m3)
k3 = np.dot(np.cross(m1,m4),m2) / np.dot(np.cross(m3,m4),m2)

n2 = k2 * m2 - m1
n3 = k3 * m3 - m1

n21 = n2[0]
n22 = n2[1]
n23 = n2[2]

n31 = n3[0]
n32 = n3[1]
n33 = n3[2]

f = math.sqrt(np.abs( (1.0/(n23*n33)) * ((n21*n31 - (n21*n33 + n23*n31)*u0 + n23*n33*u0*u0) + (n22*n32 - (n22*n33+n23*n32)*v0 + n23*n33*v0*v0))))

A = np.array([[f,0,u0],[0,f,v0],[0,0,1]]).astype('float32')

At = np.transpose(A)
Ati = np.linalg.inv(At)
Ai = np.linalg.inv(A)

#calculate the real aspect ratio
ar_real = math.sqrt(np.dot(np.dot(np.dot(n2,Ati),Ai),n2)/np.dot(np.dot(np.dot(n3,Ati),Ai),n3))

if ar_real < ar_vis:
    W = int(w)
    H = int(W / ar_real)
else:
    H = int(h)
    W = int(ar_real * H)

pts1 = np.array(p).astype('float32')
pts2 = np.float32([[0,0],[W,0],[0,H],[W,H]])

#project the image with the new w/h
M = cv2.getPerspectiveTransform(pts1,pts2)

dst = cv2.warpPerspective(img,M,(W,H))

cv2.imshow('img',img)
cv2.imshow('dst',dst)
cv2.imwrite('orig.png',img)
cv2.imwrite('proj.png',dst)

cv2.waitKey(0)
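
One caveat worth flagging: if the detected quadrilateral is (nearly) a parallelogram, e.g. a close-to-fronto-parallel shot, then k2 ≈ k3 ≈ 1, the third components of n2 and n3 vanish, and the focal-length expression divides by zero (the result degenerates to inf/NaN). Below is a minimal sketch of a guarded version of that step; the helper name and the fallback to the visible ratio are my own additions, not from the paper or the code above:

import math
import numpy as np

def real_aspect_ratio(n2, n3, u0, v0, w, h, eps=1e-6):
    # Hypothetical helper wrapping the focal-length step above.
    # If the quadrilateral is (nearly) a parallelogram, k2 ~ k3 ~ 1 and the
    # third components of n2 and n3 vanish, so the focal length cannot be
    # recovered; fall back to the visible side-length ratio, which is what
    # you'd expect for a fronto-parallel shot.
    if abs(n2[2]) < eps or abs(n3[2]) < eps:
        return float(w) / float(h)
    f = math.sqrt(np.abs((1.0 / (n2[2] * n3[2])) *
        ((n2[0] * n3[0] - (n2[0] * n3[2] + n2[2] * n3[0]) * u0 + n2[2] * n3[2] * u0 * u0) +
         (n2[1] * n3[1] - (n2[1] * n3[2] + n2[2] * n3[1]) * v0 + n2[2] * n3[2] * v0 * v0))))
    A = np.array([[f, 0, u0], [0, f, v0], [0, 0, 1]], dtype=np.float64)
    Ai = np.linalg.inv(A)
    Q = Ai.T.dot(Ai)  # A^-T A^-1
    return math.sqrt(n2.dot(Q).dot(n2) / n3.dot(Q).dot(n3))

With the variables from the script above, this is a drop-in replacement for the f/ar_real computation: ar_real = real_aspect_ratio(n2, n3, u0, v0, w, h).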

Original:

[Image: original input]

Projected (the resolution is very low since I cropped the image from your screenshot, but the aspect ratio seems correct):

[Image: projected result]

answered Oct 10 '22 by yhenon