 

3D trajectory reconstruction from video (taken by a single camera)

I am currently trying to reconstruct the 3D trajectory of a falling object, like a ball or a rock, from a sequence of images taken from an iPhone video.

Where should I start looking? I know I have to calibrate the camera (I think I'll use the MATLAB calibration toolbox by Jean-Yves Bouguet) and then find the vanishing point from the same sequence, but after that I'm really stuck.

asked Feb 13 '12 by ripkars


1 Answer

Read this: http://www.cs.auckland.ac.nz/courses/compsci773s1c/lectures/773-GG/lectA-773.htm. It explains 3D reconstruction using two cameras. For a simple summary, look at this figure from that site:

[Figure: 3D reconstruction from stereo vision]

You only know pr/pl, the image points. By tracing a line from their respective focal points Or/Ol through those image points, you get two lines (Pr/Pl) that both contain the point P. Because you know the origin and orientation of both cameras, you can construct 3D equations for these lines. Their intersection is the 3D point; voila, it's that simple.
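
If you want to code that intersection directly, here is a minimal NumPy sketch (function and variable names are my own, nothing standard). With noisy image points the two lines rarely cross exactly, so a common trick is to take the midpoint of the shortest segment between them:

import numpy as np

def intersect_rays(o_l, d_l, o_r, d_r):
    # o_l, o_r: the focal points Ol and Or (3-vectors)
    # d_l, d_r: direction vectors of the lines Pl and Pr
    # Returns the midpoint of the shortest segment between the two lines.
    d_l = d_l / np.linalg.norm(d_l)
    d_r = d_r / np.linalg.norm(d_r)
    # Find scalars s, t minimising |(o_l + s*d_l) - (o_r + t*d_r)|
    A = np.stack([d_l, -d_r], axis=1)          # 3x2 system matrix
    b = o_r - o_l
    (s, t), *_ = np.linalg.lstsq(A, b, rcond=None)
    return 0.5 * ((o_l + s * d_l) + (o_r + t * d_r))   # estimate of P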

But when you discard one camera (let's say the left one), you only know for sure the line Pr. What's missing is depth. Luckily, you know the radius of your ball; this extra information can give you the missing depth. See the next figure (don't mind my paint skills): [Figure: ball projection]

Now you can compute the depth using the intercept theorem.
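
As a rough sketch of that step, assuming you know the focal length in pixels and the real ball radius (the names here are mine, just for illustration):

import numpy as np

# Intercept theorem: a ball of real radius ball_radius at depth Z appears with
# radius image_radius pixels when the focal length is f_pixels:
#   ball_radius / Z = image_radius / f_pixels
def depth_from_radius(f_pixels, ball_radius, image_radius):
    return f_pixels * ball_radius / image_radius      # Z, in the same unit as ball_radius

def backproject(K, u, v, depth):
    # Back-project pixel (u, v) to a 3D point in the camera frame at the given depth.
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])    # ray with z-component 1
    return depth * ray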

I see one last issue: the shape of the ball changes when it is projected at an angle (i.e. not perpendicular to your image plane). However, you do know the angle, so compensation is possible, but I leave that up to you :p

edit: @ripkars' comment (comment box was too small)

1) ok

2) Aha, the correspondence problem :D It is typically solved by correlation analysis or by matching features (mostly matching followed by tracking in a video); other methods exist too. I haven't used the image/vision toolbox myself, but there should definitely be some things in there to help you on the way.
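
If you end up using OpenCV's Python bindings instead of MATLAB, something along these lines could track the ball; cv2.HoughCircles gives you the centre and the apparent radius in one go (the parameter values below are illustrative guesses you will have to tune):

import cv2

def track_ball(video_path):
    # Detect the ball centre (u, v) and apparent radius r in every frame.
    cap = cv2.VideoCapture(video_path)
    detections = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.medianBlur(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), 5)
        circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.2, minDist=100,
                                   param1=100, param2=30, minRadius=5, maxRadius=200)
        if circles is None:
            detections.append(None)              # ball not found in this frame
        else:
            u, v, r = circles[0, 0]              # strongest circle
            detections.append((u, v, r))
    cap.release()
    return detections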

3) = calibration of your cameras. Normally you only need to do this once, when installing the cameras (and again whenever you change their relative pose).
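
For reference, a minimal calibration sketch with OpenCV and a printed chessboard (the pattern size and square size below are assumptions you must adapt to your own board; the Bouguet toolbox does the same job in MATLAB):

import glob
import cv2
import numpy as np

def calibrate(images_glob, pattern_size=(9, 6), square_size=0.025):
    # 3D coordinates of the chessboard corners (the board lies in its own z = 0 plane).
    objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2) * square_size

    obj_points, img_points, image_size = [], [], None
    for path in glob.glob(images_glob):
        gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
        image_size = gray.shape[::-1]
        found, corners = cv2.findChessboardCorners(gray, pattern_size)
        if found:
            obj_points.append(objp)
            img_points.append(corners)

    # K is the calibration matrix, dist the lens distortion coefficients.
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(obj_points, img_points, image_size, None, None)
    return K, dist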

4) Yes, just put the Longuet-Higgins equations to work, i.e. solve:

P = C1 + mu1*R1*K1^(-1)*p1
P = C2 + mu2*R2*K2^(-1)*p2

with:
P = the 3D point to find
C = camera centre (vector)
R = rotation matrix expressing the orientation of the camera in the world frame
K = calibration matrix of the camera (containing the internal parameters of the camera, not to be confused with the external parameters contained in R and C)
p1 and p2 = the image points
mu = scalar parameter giving the position of P along the projection line through the camera centre C (if I'm correct, R*K^-1*p is a direction vector pointing from C towards P)

These are 6 equations in 5 unknowns: mu1, mu2 and the three coordinates of P.
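
Written out as a small least-squares solve, it could look like this NumPy sketch (my own naming; p1 and p2 are taken as homogeneous pixel coordinates [u, v, 1]):

import numpy as np

def triangulate(C1, R1, K1, p1, C2, R2, K2, p2):
    d1 = R1 @ np.linalg.inv(K1) @ p1      # direction of the projection line of camera 1
    d2 = R2 @ np.linalg.inv(K2) @ p2      # direction of the projection line of camera 2

    # Stack the 6 equations P - mu1*d1 = C1 and P - mu2*d2 = C2 as A x = b,
    # with x = [P, mu1, mu2] (5 unknowns), and solve in least squares.
    A = np.zeros((6, 5))
    A[:3, :3] = np.eye(3); A[:3, 3] = -d1
    A[3:, :3] = np.eye(3); A[3:, 4] = -d2
    b = np.concatenate([C1, C2])
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x[:3]                          # the 3D point P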

edit: @ripkars' comment (comment box too small once again): the only computer vision library that pops into my mind is OpenCV (http://opencv.willowgarage.com/wiki). But that's a C library, not MATLAB... I guess Google is your friend ;)

About the calibration: yes, if those two images contain enough information to match some features. If you change the relative pose of the cameras, you'll have to recalibrate of course.

The choice of the world frame is arbitrary; it only becomes important when you want to analyse the retrieved 3D data afterwards. For example, you could align one of the world planes with the plane of motion, which gives a simpler motion equation if you want to fit one. The world frame is just a reference frame, and you can switch to another one with a change-of-reference-frame transformation (a translation and/or rotation).
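
Such a change of reference frame is just one rotation plus a translation applied to each reconstructed point; a tiny NumPy sketch (names are mine):

import numpy as np

def change_frame(points, R, t):
    # Rigid change of reference frame: a point P expressed in the old frame becomes
    # P' = R @ P + t in the new frame (R is a 3x3 rotation matrix, t a translation).
    points = np.asarray(points)           # shape (N, 3), e.g. the reconstructed trajectory
    return points @ R.T + t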

answered Oct 13 '22 by Gunther Struyf