I am very new to OpenCV and have limited experience with OpenGL. I want to overlay a 3D object on a calibrated image of a checkerboard. Any tips or guidance?
Figure 1: OpenCV can be used to apply augmented reality to real-time video streams. The very reason the OpenCV library exists is to facilitate real-time image processing. The library accepts input images/frames, processes them as quickly as possible, and then returns the results.
OpenCV already supports OpenGL for image output by itself. No need to write this yourself!
Augmented reality starts with a camera-equipped device—such as a smartphone, a tablet, or smart glasses—loaded with AR software. When a user points the device and looks at an object, the software recognizes it through computer vision technology, which analyzes the video stream.
The basic idea is that you have two cameras: the physical one (the one you are retrieving the images from with OpenCV) and the OpenGL one. You have to align the two.
To do that, you need to calibrate the physical camera.
First, you need the distortion parameters (because every lens has more or less optical distortion) and, together with them, the so-called intrinsic parameters. You get both by printing a chessboard on paper, taking some images of it, and calibrating the camera. The internet is full of nice tutorials about that, and from your question it seems you already have them. That's nice.
Then you have to calibrate the position of the camera. This is done with the so-called extrinsic parameters, which encode the position and the rotation of the camera in the 3D world.
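To make the role of these parameters concrete, here is a minimal sketch of how a point in camera coordinates becomes a pixel. It is plain Python with no OpenCV dependency so the arithmetic stays visible. The function name project_point and the numbers at the bottom are made up for illustration; the coefficient order (k1, k2, p1, p2, k3) is the one OpenCV uses for its basic distortion model.

```python
def project_point(point_cam, fx, fy, cx, cy, dist=(0.0, 0.0, 0.0, 0.0, 0.0)):
    """Project a 3D point in camera coordinates to pixel coordinates.

    fx, fy, cx, cy are the intrinsic parameters (in pixels).
    dist is (k1, k2, p1, p2, k3): radial and tangential distortion.
    """
    X, Y, Z = point_cam
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")

    # Normalised image coordinates of the ideal pinhole camera.
    x, y = X / Z, Y / Z

    # Lens distortion applied to the normalised coordinates.
    k1, k2, p1, p2, k3 = dist
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    x_d = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_d = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y

    # Intrinsics: scale by the focal lengths, shift by the principal point.
    return fx * x_d + cx, fy * y_d + cy


# Hypothetical values for a 640x480 camera, only to show the call.
u, v = project_point((0.1, -0.05, 2.0), fx=800.0, fy=800.0, cx=320.0, cy=240.0)
```

Calibration is the inverse problem: from many chessboard images it estimates the fx, fy, cx, cy and dist values that make this projection match what the camera actually sees.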
The intrinsic parameters are needed by the OpenCV function cv::solvePnP, which computes the extrinsic parameters. It returns the rotation as a rotation vector, which cv::Rodrigues converts into a rotation matrix. cv::solvePnP takes as input two sets of corresponding points: some known 3D points and their 2D projections. That's why many augmented reality applications use markers: usually the markers are square, so after detecting one you know the 2D projections of the points P1(0,0,0), P2(0,1,0), P3(1,1,0), P4(1,0,0) that form a square, and you can find the plane they lie on.
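As a rough illustration of what the extrinsic parameters do with those four corners, the sketch below converts a rotation vector to a rotation matrix with the Rodrigues formula and moves the marker corners into the camera frame. It is written by hand in plain Python to show the math; in a real application cv::Rodrigues does this conversion for you. The pose values (rvec, tvec) are invented for the example, and in practice they would come out of cv::solvePnP.

```python
import math


def rodrigues(rvec):
    """Rotation vector (unit axis times angle in radians) -> 3x3 matrix."""
    rx, ry, rz = rvec
    theta = math.sqrt(rx * rx + ry * ry + rz * rz)
    if theta < 1e-12:
        # No rotation at all: return the identity.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = rx / theta, ry / theta, rz / theta
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    # R = cos(theta) I + (1 - cos(theta)) k k^T + sin(theta) [k]_x
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]


def world_to_camera(point, R, t):
    """Apply the extrinsics: X_cam = R * X_world + t."""
    return tuple(
        R[i][0] * point[0] + R[i][1] * point[1] + R[i][2] * point[2] + t[i]
        for i in range(3)
    )


# The square marker from the text, in marker units.
MARKER_CORNERS = [(0, 0, 0), (0, 1, 0), (1, 1, 0), (1, 0, 0)]

# Hypothetical pose: marker turned 90 degrees about the optical axis,
# 5 units in front of the camera.
rvec = (0.0, 0.0, math.pi / 2)
tvec = (0.0, 0.0, 5.0)

R = rodrigues(rvec)
corners_cam = [world_to_camera(p, R, tvec) for p in MARKER_CORNERS]
```

Pose estimation runs this the other way round: given the corner positions detected in the image, it searches for the rotation and translation that reproduce them.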
Once you have the extrinsic parameters, the rest is straightforward: you set up a perspective projection in OpenGL with the FoV (the aperture angle of the camera) computed from the intrinsic parameters, and you place the OpenGL camera at the position and orientation given by the extrinsic parameters.
Of course, if you want to (and you should) understand and handle each step of this process correctly, there is a lot of math: matrices, angles, quaternions, matrices again, and... matrices again. You can find a reference in the famous Multiple View Geometry in Computer Vision by R. Hartley and A. Zisserman.
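For the projection step, one common way to do it (a sketch under assumptions, not the only option) is to build the OpenGL projection matrix directly from fx, fy, cx, cy instead of going through a FoV angle, since that also handles a principal point that is not exactly in the image centre. The function names below are made up, the near and far planes are arbitrary choices, and I assume the image origin is top-left as in OpenCV and the default OpenGL clip space of [-1, 1].

```python
import math


def vertical_fov_degrees(fy, height):
    """Vertical field of view implied by focal length fy (pixels)."""
    return math.degrees(2.0 * math.atan(height / (2.0 * fy)))


def projection_from_intrinsics(fx, fy, cx, cy, width, height, near, far):
    """Row-major 4x4 OpenGL projection matrix matching a pinhole camera.

    Expects points in OpenGL eye coordinates (x right, y up, looking
    down -z). Transpose (or flatten column by column) before uploading.
    """
    return [
        [2.0 * fx / width, 0.0, 1.0 - 2.0 * cx / width, 0.0],
        [0.0, 2.0 * fy / height, 2.0 * cy / height - 1.0, 0.0],
        [0.0, 0.0, -(far + near) / (far - near),
         -2.0 * far * near / (far - near)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def to_pixel(P, eye_point, width, height):
    """Push an eye-space point through P and the viewport transform.

    Returns pixel coordinates with a top-left origin, like OpenCV.
    """
    v = (eye_point[0], eye_point[1], eye_point[2], 1.0)
    clip = [sum(P[r][c] * v[c] for c in range(4)) for r in range(4)]
    ndc_x, ndc_y = clip[0] / clip[3], clip[1] / clip[3]
    return (ndc_x + 1.0) * 0.5 * width, (1.0 - ndc_y) * 0.5 * height


# Hypothetical intrinsics for a 640x480 image.
P = projection_from_intrinsics(800.0, 800.0, 300.0, 250.0,
                               640, 480, near=0.1, far=100.0)
fov = vertical_fov_degrees(800.0, 480)
```

A quick sanity check is to take a point, project it once with the OpenCV pinhole formula and once with to_pixel (after flipping the sign of y and z to get into OpenGL eye coordinates), and confirm that both give the same pixel.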
Moreover, to handle the OpenGL part correctly you have to deal with so-called "Modern OpenGL" (remember that glLoadMatrix is deprecated) and a little bit of shader code for loading the camera matrices (for me this was a problem because I didn't know anything about it).
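The part that tends to bite here is the difference in conventions: the OpenCV camera looks down +Z with Y pointing down, while the OpenGL camera looks down -Z with Y pointing up, and OpenGL expects matrices in column-major order. The sketch below, with a made-up function name, turns a rotation matrix R and translation t in the OpenCV convention into a flat list of 16 floats that could be passed to glUniformMatrix4fv with transpose set to GL_FALSE. It assumes R is a 3x3 nested list and t a 3-element sequence.

```python
def view_matrix_column_major(R, t):
    """OpenCV extrinsics [R | t] -> OpenGL view matrix, column-major.

    Rows 1 and 2 are negated to flip the Y and Z axes between the
    OpenCV and OpenGL camera conventions.
    """
    flip = (1.0, -1.0, -1.0)
    rows = [
        [flip[i] * R[i][0], flip[i] * R[i][1], flip[i] * R[i][2],
         flip[i] * t[i]]
        for i in range(3)
    ]
    rows.append([0.0, 0.0, 0.0, 1.0])
    # Flatten column by column, which is the layout OpenGL reads.
    return [rows[r][c] for c in range(4) for r in range(4)]


# Hypothetical pose: no rotation, marker origin at (1, 2, 5) in the
# OpenCV camera frame.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
view = view_matrix_column_major(identity, (1.0, 2.0, 5.0))
```

In the vertex shader the usual pattern is then to multiply projection * view * vec4(position, 1.0), with both matrices declared as mat4 uniforms.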
I dealt with this some time ago and I have some code, so feel free to ask about any problems you run into. Here are some links I found interesting:
Please read them before anything else. As usual, once you get the concept it is easy; you just need to bang your head against the wall a little. Just don't be scared of all that math : )