Augmented Reality OpenGL+OpenCV

Question

The basic idea is that you have 2 cameras: one is the physical one (the one where you are retriving the images with opencv) and one is the opengl one. You have to align those two matrices.

To do that, you need to calibrate the physical camera.

First. You need a distortion parameters (because every lens more or less has some optical distortion), and build with those parameters the so called intrinsic parameters. You do this with printing a chessboard in a paper, using it for get some images and calibrate the camera. It's full of nice tutorial about that on the internet, and from your answer it seems you have them. That's nice.

Then. You have to calibrate the position of the camera. And this is done with the so called extrinsic parameters. Those parameters encoded the position and the rotation the the 3D world of those camera.

The intrinsic parameters are needed by the OpenCV methods cv::solvePnP and cv::Rodrigues and that uses the rodrigues method to get the extrinsic parameters. This method get in input 2 set of corresponding points: some 3D knowon points and their 2D projection. That's why all augmented reality applications need some markers: usually the markers are square, so after detecting it you know the 2D projection of the point P1(0,0,0) P2(0,1,0) P3(1,1,0) P4(1,0,0) that forms a square and you can find the plane lying on them.

Once you have the extrinsic parameters all the game is easily solved: you just have to make a perspective projection in OpenGL with the FoV and the aperture angle of the camera from intrinsic parameter and put the camera in the position given by the extrinsic parameters.

Of course, if you want (and you should) understand and handle each step of this process correctly.. there is a lot of math - matrices, angles, quaternion, matrices again, and.. matrices again. You can find a reference in the famous Multiple View Geometry in Computer Vision from R. Hartley and A. Zisserman.

Moreover, to handle correctly the opengl part you have to deal with the so called "Modern OpenGL" (remember that glLoadMatrix is deprecated) and a little bit of shader for loading the matrices of the camera position (for me this was a problem because I didn't knew anything about it).

I have dealt with this some times ago and I have some code so feel free to ask any kind of problems you have. Here some links I found interested:

http://ksimek.github.io/2012/08/14/decompose/ (really good explanation)
Camera position in world coordinate from cv::solvePnP (a question I asked about that)
http://www.morethantechnical.com/2010/11/10/20-lines-ar-in-opencv-wcode/ (fabulous blog about computer vision)
http://spottrlabs.blogspot.it/2012/07/opencv-and-opengl-not-always-friends.html (nice tricks) http://strawlab.org/2011/11/05/augmented-reality-with-OpenGL/
http://www.songho.ca/opengl/gl_projectionmatrix.html (very good explanation on opengl camera settings basics)
some other random usefull stuffs: http://docs.opencv.org/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html (documentation, always look at the docs!!!) Determine extrinsic camera with opencv to opengl with world space object Rodrigues into Eulerangles and vice versa Python Opencv SolvePnP yields wrong translation vector http://answers.opencv.org/question/23089/opencv-opengl-proper-camera-pose-using-solvepnp/

Please read them before anything else. As usual, once you got the concept it is an easy joke, need to crash your brain a little bit against the wall. Just don't be scared from all those math : )