Basically, you need to perform a change of 3D coordinate system: convert the 3D points seen by the depth camera into 3D points expressed in the RGB camera's frame.
You cannot use the function reprojectImageTo3D() because it expects a matrix Q that you do not have. Instead, you should convert your disparity map to a depthmap using the function raw_depth_to_meters provided on the page you linked.
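For reference, a minimal sketch of that function, following the formula given on that page (the two constants come from that page's calibration, so treat them as defaults and recalibrate your own device if possible):

float raw_depth_to_meters(int raw_depth)
{
    if(raw_depth < 2047) // 2047 is the Kinect's "no measurement" value
        return 1.0f / (raw_depth * -0.0030711016f + 3.3309495161f);
    return 0.0f; // invalid raw value -> no depth
}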
Then, for each pixel of the depthmap, you need to compute the associated 3D point, denoted by P3D on the page you linked (see § "Mapping depth pixels with color pixels"). Then you need to apply the provided 3D rotation matrix R and 3D translation vector T, which represent the transformation from the depth camera to the RGB camera, to each 3D point P3D, in order to obtain the associated new 3D point P3D'. Finally, using the calibration matrix of the RGB camera, you can project the new 3D points into the RGB image and assign the associated depth to the obtained pixel, in order to generate a new depthmap aligned with the RGB image.
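If I recall the linked page correctly, the corresponding formulas look roughly like this, where (fx_d, fy_d, cx_d, cy_d) are the depth camera intrinsics and (fx_rgb, fy_rgb, cx_rgb, cy_rgb) are the RGB camera intrinsics:

P3D.x = (x_d - cx_d) * depth(x_d, y_d) / fx_d
P3D.y = (y_d - cy_d) * depth(x_d, y_d) / fy_d
P3D.z = depth(x_d, y_d)

P3D' = R . P3D + T

P2D_rgb.x = (P3D'.x * fx_rgb / P3D'.z) + cx_rgb
P2D_rgb.y = (P3D'.y * fy_rgb / P3D'.z) + cy_rgb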
Note that you necessarily lose accuracy in the process, since you need to handle occlusions (by keeping only the minimum depth seen by each pixel) and image interpolation (since in general, the projected 3D points won't be associated with integer pixel coordinates in the RGB image). Concerning image interpolation, I recommend you use the nearest neighbor approach, otherwise you might end up with weird behavior at depth boundaries: for instance, bilinearly interpolating between a foreground pixel at 1 m and a background pixel at 3 m would produce phantom depths around 2 m that correspond to no real surface.
Edit following the question update
Here is a model of what you should be doing in order to remap the Kinect depthmap to the RGB cam point of view:
cv::Mat_<float> pt(3,1), R(3,3), t(3,1);
// Initialize R & t here

depthmap_rgbcam = cv::Mat::zeros(height,width,CV_32FC1); // Initialize the depthmap to all zeros
float *depthmap_rgbcam_buffer = (float*)depthmap_rgbcam.data;
for(int row=0; row<height; ++row)
{
    for(int col=0; col<width; ++col)
    {
        // Convert the raw Kinect disparity to a metric depth
        float raw_disparity = kinect_disparity_map_buffer[width*row+col];
        float depth_depthcam = disparity_to_depth(raw_disparity); // e.g. raw_depth_to_meters above
        if(depth_depthcam<=0) // Skip invalid measurements
            continue;
        // Map the depthcam pixel to a 3D point in the depthcam frame
        pt(0) = depth_depthcam*(col-cx_depthcam)/fx_depthcam; // No need for a 3D point buffer
        pt(1) = depth_depthcam*(row-cy_depthcam)/fy_depthcam; // here, unless you need one.
        pt(2) = depth_depthcam;
        // Rotate and translate the 3D point into the rgbcam frame
        pt = R*pt+t;
        // Project the 3D point to the rgbcam image. If required, apply the
        // rgbcam lens distortion model to the normalized coordinates
        // pt(0)/pt(2) and pt(1)/pt(2) here, before applying the intrinsics.
        float x_rgbcam = fx_rgbcam*pt(0)/pt(2)+cx_rgbcam;
        float y_rgbcam = fy_rgbcam*pt(1)/pt(2)+cy_rgbcam;
        // "Interpolate" pixel coordinates (Nearest Neighbors, as discussed above)
        int px_rgbcam = cvRound(x_rgbcam);
        int py_rgbcam = cvRound(y_rgbcam);
        // Discard points that project outside the rgbcam image
        if(px_rgbcam<0 || px_rgbcam>=width || py_rgbcam<0 || py_rgbcam>=height)
            continue;
        // Handle 3D occlusions: keep only the closest depth seen by each pixel
        float &depth_rgbcam = depthmap_rgbcam_buffer[width*py_rgbcam+px_rgbcam];
        if(depth_rgbcam==0 || depth_depthcam<depth_rgbcam)
            depth_rgbcam = depth_depthcam;
    }
}
This is the idea, modulo possible typos. You can also consistently change the data type as you like. Concerning your comment, I don't think there is any builtin OpenCV function for that purpose yet.
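In case it helps, the R & t initialization might look like the following; the numeric values are pure placeholders that you should replace with your own depth-to-RGB extrinsic calibration (for example, the values given on the page you linked):

cv::Mat_<float> R(3,3), t(3,1);
R << 1.f, 0.f, 0.f,    // placeholder: identity rotation
     0.f, 1.f, 0.f,
     0.f, 0.f, 1.f;
t << 0.025f, 0.f, 0.f; // placeholder: ~2.5 cm horizontal baseline, in meters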