 

How to calculate a specific distance inside of a picture?

Sorry for my bad English. I have the following problem:


Let's say the camera of my mobile device is showing this picture.

In the picture you can see 4 different positions. Every position is known to me (longitude, latitude).

Now I want to know where in the picture a specific position is. For example, I want to place a rectangle 20 meters in front of me and 5 meters to my left. I only know the latitude/longitude of this point, but I don't know where to place it inside the picture (x, y). For example, POS3 is at (0, 400) in my view, POS4 is at (600, 400), and so on.

Where do I have to put the new point, which is 20 meters in front of me and 5 meters to my left? (So my input is (LatXY, LonXY) and my result should be (x, y) on the screen.)

I also have the height of the camera and its rotation angles about the x, y, and z axes.

Can I use simple mathematical operations to solve this problem?

Thank you very much!

asked Mar 27 '13 by Frame91

1 Answer

The answer you want will depend on the accuracy of the result you need. As danaid pointed out, nonlinearity in the image sensor and other factors, such as atmospheric distortion, may introduce errors, and these would be difficult to correct for across different cameras on different devices. So let's start by getting a reasonable approximation which can be tweaked as more accuracy is needed.

First, you may be able to ignore the directional information from the device, if you choose. If you have the five locations (POS1-POS4 and the camera) in a consistent coordinate system, you have all you need. In fact, you don't even need all of those points.

A note on consistent coordinates: at this scale, once you convert latitude and longitude to meters, using cos(lat) as the scaling factor for longitude, you should be able to treat everything from a "flat earth" perspective. You then just need to remember that the camera's x-y plane is roughly the global x-z plane.
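For example, a minimal sketch of that conversion in Python (function and variable names are illustrative, not from the original post):

import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; adequate at this scale

def latlon_to_meters(lat, lon, ref_lat, ref_lon):
    # Flat-earth projection of (lat, lon) relative to a reference point,
    # e.g. the camera position. Valid only over short distances, as here.
    dy = math.radians(lat - ref_lat) * EARTH_RADIUS_M
    dx = math.radians(lon - ref_lon) * EARTH_RADIUS_M * math.cos(math.radians(ref_lat))
    return dx, dy  # east and north offsets in meters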

Conceptual background

The diagram below lays out the projection of the points onto the image plane. The dz used for perspective can be derived directly from the proportion between the pixel distance separating far points vs. near points and their physical distance. In the simple case where the line POS1-POS2 is parallel to the line POS3-POS4, the perspective factor is just the ratio of the scaling of the two lines:

Scale (POS1, POS2) = pixel distance (pos1, pos2) / Physical distance (POS1, POS2)
Scale (POS3, POS4) = pixel distance (pos3, pos4) / Physical distance (POS3, POS4)
Perspective factor = Scale (POS3, POS4) / Scale (POS1, POS2)

So the perspective factor to apply to a vertex of your rect is the proportion of that vertex's distance between the two lines. Simplifying:

Factor(rect) ~= [(Rect.z - (POS3, POS4).z) / ((POS1, POS2).z - (POS3, POS4).z)] * Perspective factor
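As a minimal sketch of the arithmetic above, in Python (all names are illustrative; interpolating the pixels-per-meter scale linearly between the two lines is my reading of the factor formula, and is only a first-order approximation):

import math

def dist(a, b):
    # Euclidean distance between two 2-D points (pixels or meters).
    return math.hypot(a[0] - b[0], a[1] - b[1])

def interpolated_scale(px_far, world_far, px_near, world_near, z_far, z_near, z):
    # Pixels-per-meter scale at depth z, blended linearly between the scales
    # measured on the near line (POS3-POS4) and the far line (POS1-POS2).
    # px_* and world_* are endpoint pairs, e.g. (pos1, pos2) and (POS1, POS2).
    scale_far = dist(*px_far) / dist(*world_far)
    scale_near = dist(*px_near) / dist(*world_near)
    # Perspective factor, as defined above: scale_near / scale_far
    t = (z - z_near) / (z_far - z_near)  # 0 at the near line, 1 at the far line
    return scale_near + t * (scale_far - scale_near)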

Answer

A perspective transformation is linear with respect to the distance from the focal point in the direction of view. The diagram below is drawn with the X axis parallel to the image plane, and the Y axis pointing in the direction of view. In this coordinate system, for any point P and an image plane at any distance from the origin, the projected point p has an X coordinate p.x which is proportional to P.x/P.y. These values can be linearly interpolated.

In the diagram, tp is the desired projection of the target point. To get tp.x, interpolate between, for example, pos1.x and pos3.x, adjusting for distance, as follows:

tp.x = pos1.x + (pos3.x - pos1.x) * ((TP.x/TP.y) - (POS1.x/POS1.y)) / ((POS3.x/POS3.y) - (POS1.x/POS1.y))
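In Python, a direct transcription of this formula might look like the following sketch (the coordinate convention matches the diagram: x across the view, y along the view direction; names are illustrative):

def interpolate_tp_x(pos1_x, pos3_x, POS1, POS3, TP):
    # Screen x of the target point TP, interpolated between the known screen
    # x-coordinates of POS1 and POS3. Each point is (x, y) in camera-frame
    # meters, with y the depth along the view direction; P.x / P.y is
    # proportional to the projected x, so it can be interpolated linearly.
    r1 = POS1[0] / POS1[1]
    r3 = POS3[0] / POS3[1]
    rt = TP[0] / TP[1]
    return pos1_x + (pos3_x - pos1_x) * (rt - r1) / (r3 - r1)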

The advantage of this approach is that it does not require any prior knowledge of the angle viewed by each pixel, and it will be relatively robust against reasonable errors in the location and orientation of the camera.

Further refinement

Using more data means being able to compensate for more errors. With multiple points in view, the camera location and orientation can be calibrated using the Tienstra method. A concise proof of this approach (using barycentric coordinates) can be found here.

Since the transformations required are all linear in homogeneous coordinates, you could apply barycentric coordinates to interpolate based on any three or more points, given their X, Y, Z, W coordinates in homogeneous 3-space and their (x, y) coordinates in image space. The closer the points are to the destination point, the less significant the nonlinearities are likely to be, so in your example you would use POS1 and POS3, since the rect is on the left, and POS2 or POS4 depending on the relative distance.

(Barycentric coordinates are likely most familiar as the method used to interpolate colors on a triangle (fragment) in 3D graphics.)

Edit: Barycentric coordinates still require the W homogeneous coordinate factor, which is another way of expressing the perspective correction for the distance from the focal point. See this article on GameDev for more details.
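As an illustrative sketch (my own derivation from the pinhole model, not code from the linked articles), perspective-correct interpolation from world-space barycentric weights looks like this, assuming each reference point's camera-space depth (the W factor) is known:

def barycentric_weights(p, a, b, c):
    # Barycentric weights of 2-D point p with respect to triangle (a, b, c).
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    l1 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    l2 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return l1, l2, 1.0 - l1 - l2

def project_via_barycentric(target_world, refs):
    # refs: three (world_xy, screen_xy, depth) tuples, depth being the
    # reference point's camera-space distance along the view direction.
    # Weighting each screen point by (barycentric weight * depth) and
    # renormalizing reproduces the pinhole division by depth exactly.
    l = barycentric_weights(target_world, refs[0][0], refs[1][0], refs[2][0])
    wts = [li * depth for li, (_, _, depth) in zip(l, refs)]
    total = sum(wts)
    sx = sum(w * scr[0] for w, (_, scr, _) in zip(wts, refs)) / total
    sy = sum(w * scr[1] for w, (_, scr, _) in zip(wts, refs)) / total
    return sx, sy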

Two related SO questions: perspective correction of texture coordinates in 3d and Barycentric coordinates texture mapping. This diagram may help in explaining the interpolation of image coordinates based on global coordinates.

answered Oct 31 '22 by Steven McGrath