Question

At the moment I am implementing the calibration method(s) for stereo vision. I am using the OpenCV library.

There is an example in the sample folder, but I have some questions about the implementation:

What are these arrays for, and what do the CvMat variables do?

// ARRAY AND VECTOR STORAGE:
double M1[3][3], M2[3][3], D1[5], D2[5];
double R[3][3], T[3], E[3][3], F[3][3];
CvMat _M1 = cvMat(3, 3, CV_64F, M1 );
CvMat _M2 = cvMat(3, 3, CV_64F, M2 );
CvMat _D1 = cvMat(1, 5, CV_64F, D1 );
CvMat _D2 = cvMat(1, 5, CV_64F, D2 );
CvMat _R = cvMat(3, 3, CV_64F, R );
CvMat _T = cvMat(3, 1, CV_64F, T );
CvMat _E = cvMat(3, 3, CV_64F, E );
CvMat _F = cvMat(3, 3, CV_64F, F );

In other examples I see this code:

//--------Find and Draw chessboard--------------------------------------------------    

    if((frame++ % 20) == 0)
    {
        //----------------CAM1-------------------------------------------------------------------------------------------------------
        result1 = cvFindChessboardCorners( frame1, board_sz,&temp1[0], &count1,CV_CALIB_CB_ADAPTIVE_THRESH|CV_CALIB_CB_FILTER_QUADS);
        cvCvtColor( frame1, gray_fr1, CV_BGR2GRAY );

What exactly does the if statement do? Why % 20?

Thank you in advance!


Update:

I have two questions about some implementation code: link

-1: The nx and ny variables are declared on line 18 and used for the board_sz variable on line 25. Are nx and ny the rows and columns, or the corners of the chessboard pattern? (I think they are the rows and columns, because cvSize takes width and height parameters.)

-2: What are these CvMat variables for (lines 143 - 146)?

CvMat _objectPoints = cvMat(1, N, CV_32FC3, &objectPoints[0] );
CvMat _imagePoints1 = cvMat(1, N, CV_32FC2, &points[0][0] );
CvMat _imagePoints2 = cvMat(1, N, CV_32FC2, &points[1][0] );
CvMat _npoints = cvMat(1, npoints.size(), CV_32S, &npoints[0] );

Solution

Each of those matrices has a meaning in epipolar geometry. They describe the relation between your two cameras in 3D space and between the images they record.

In your example, they are:

  • M1 - the camera intrinsics matrix of your left camera
  • M2 - the camera intrinsics matrix of your right camera
  • D1 - the distortion coefficients of your left camera
  • D2 - the distortion coefficients of your right camera
  • R - the rotation matrix from the right to your left camera
  • T - the translation vector from the right to your left camera
  • E - the essential matrix of your stereo setup
  • F - the fundamental matrix of your stereo setup

On the basis of these matrices, you can undistort and rectify your images, which allows you to extract the depth of a point you see in both images by way of their disparity (the difference in x, basically). Finding a point in both images is called matching, and is generally the last step after rectification.
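The depth-from-disparity relation mentioned above can be sketched in plain C++. This is a minimal illustration under the usual rectified-stereo assumptions (pinhole cameras, disparity measured along x); depthFromDisparity is a hypothetical helper, not an OpenCV function:

```cpp
#include <stdexcept>

// For rectified stereo images, depth follows from similar triangles:
//   Z = f * B / d
// where f is the focal length in pixels, B is the baseline (the
// distance between the cameras) and d is the disparity
// (x_left - x_right) in pixels.
double depthFromDisparity(double focalPx, double baseline, double disparityPx)
{
    if (disparityPx <= 0.0)
        throw std::invalid_argument("disparity must be positive");
    return focalPx * baseline / disparityPx;
}
```

Note how depth is inversely proportional to disparity: nearby points have large disparities, distant points have small ones, which is why depth resolution degrades with distance.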

Any good introduction to epipolar geometry and stereo vision will probably be better than anything I could type up here. I recommend the Learning OpenCV book from which your example code is taken and which goes into great detail explaining the basics.

The second part of your question has already been answered in a comment: (frame++ % 20) is 0 for every 20th frame recorded from your webcam, so the code in the if-clause is executed once per 20 frames.
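The frame-skipping idiom can be isolated into a tiny predicate (a sketch; the helper name is made up for illustration):

```cpp
// (frame % n) == 0 holds only for frames 0, n, 2n, ... so with
// n = 20 the guarded code runs once every 20 captured frames.
bool isEveryNthFrame(int frame, int n)
{
    return (frame % n) == 0;
}
```

Skipping frames like this gives the person holding the chessboard time to move it between snapshots, so the calibration views are not near-duplicates.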


Response to your update:

nx and ny are the number of corners in the chessboard pattern in your calibration images. On a "normal" 8x8 chessboard, nx = ny = 7. You can see this in lines 138-139, where the points of one ideal chessboard are created by offsetting nx*ny points by squareSize, the size of one square on your chessboard.
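The construction of those ideal chessboard points can be sketched in a few lines of dependency-free C++ (a hypothetical re-creation of what the sample does, not the original code; the Point3f struct stands in for OpenCV's CvPoint3D32f):

```cpp
#include <vector>

struct Point3f { float x, y, z; };

// Build the ideal planar chessboard: nx*ny inner corners spaced
// squareSize apart, all with z = 0 (the board is assumed flat).
std::vector<Point3f> makeObjectPoints(int nx, int ny, float squareSize)
{
    std::vector<Point3f> pts;
    pts.reserve(static_cast<std::size_t>(nx) * ny);
    for (int j = 0; j < ny; ++j)          // rows of corners
        for (int i = 0; i < nx; ++i)      // columns of corners
            pts.push_back({ i * squareSize, j * squareSize, 0.0f });
    return pts;
}
```

Because the board defines its own coordinate system, the actual value of squareSize only fixes the metric scale of the calibration; the pattern geometry is otherwise identical.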

The CvMat variables "objectPoints", "imagePoints" and "npoints" are passed into the cvStereoCalibrate function.

  • objectPoints contains the points of your calibration object (the chessboard)
  • imagePoints1/2 contain these points as seen by each of your cameras
  • npoints just contains the number of points in each image (as an M-by-1 matrix) - feel free to ignore it, it's not used in the OpenCV C++ API any more anyway.

Basically, cvStereoCalibrate fits the imagePoints to the objectPoints and returns (1) the distortion coefficients, (2) the intrinsic camera matrices, and (3) the spatial relation of the two cameras as the rotation matrix R and the translation vector T. The first undistort your images, the second relate your pixel coordinates to real-world coordinates, and the third lets you rectify your two images.
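The role of R and T can be made concrete with a small sketch: a point p expressed in one camera's coordinate system maps to R*p + T in the other's. This is plain C++ with fixed-size arrays standing in for CvMat, and transformPoint is an illustrative helper, not an OpenCV call:

```cpp
#include <array>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// Apply the rigid transform q = R*p + T, i.e. express a 3D point
// given in one camera's frame in the other camera's frame.
Vec3 transformPoint(const Mat3& R, const Vec3& T, const Vec3& p)
{
    Vec3 q{};
    for (int i = 0; i < 3; ++i)
        q[i] = R[i][0] * p[0] + R[i][1] * p[1] + R[i][2] * p[2] + T[i];
    return q;
}
```

With R set to the identity, the cameras differ only by the translation T, which is the idealized side-by-side stereo rig that rectification tries to approximate.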

As a side note: I remember having trouble with stereo calibration because the chessboard orientation could be detected differently in the left and right camera images. This shouldn't be a problem unless you have a large angle between your cameras (which isn't a great idea) or you incline your chessboards a lot (which isn't necessary), but it's still worth keeping an eye on.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow