Problem

Use-case


  • An object is rotating around its center at varying speed
  • A fixed camera is looking at the object
  • Given 2D image point correspondences, reconstruct the 3D point cloud
  • As the object rotates, a different part of it becomes visible to the camera, and thus different points & correspondences are detected


Scene


  a. N images
  b. N-1 image pairs
  c. N-1 sets of 2D point correspondences (two arrays of 2D points each)


Implementation


For each of the N-1 2D point correspondences (see the sketch after this list):

  1. Compute the relative camera pose
  2. Triangulate to obtain the 3D points
  3. For each pair of 3D point arrays, derive the 3D correspondence using the 2D correspondences given at [c]
  4. Using the 3D correspondences derived at [3], derive the track of each of the object's 3D points, resulting in a single track for each of the object's points/vertices
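A minimal sketch of steps 1 and 2 in OpenCV, assuming a calibrated camera with intrinsic matrix K and matched point arrays pts1, pts2 (Nx2, float32) — the names and shapes are assumptions, not from the question:

    import numpy as np
    import cv2

    def pose_and_triangulate(pts1, pts2, K):
        # Step 1: relative camera pose via the essential matrix (RANSAC).
        E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
        _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

        # Step 2: triangulate; t is only known up to scale, so every image
        # pair yields a point cloud with its own arbitrary scale.
        P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
        P2 = K @ np.hstack([R, t])
        pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
        pts3d = (pts4d[:3] / pts4d[3]).T  # homogeneous -> Euclidean, Nx3
        return R, t, pts3d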


Result:


N-2 arrays of 3D points, their correspondences, camera poses, and tracks (one track for each object point)


Approach considered to resolve the problem:


Given that each triangulation result is accurate only up to scale, calculate the point cloud as follows (a sketch of the rescaling follows this list):
  A. Each of the triangulation results and relative camera translations is
      expressed at its own arbitrary scale (each result has a different scale).
  B. Under the assumption that the object is rigid and its structure does not change,
      the distance of each 3D point to the object's center should be identical across all camera poses.
  C. With [B] in mind, all triangulated 3D points from [A] and the camera translations
      can be rescaled to a single common scale.
  D. Select one of the camera poses and transform the first point in each track (defined at [4])
      to that camera pose (transform by the inverse of the accumulated camera
      pose), resulting in the expected point cloud.
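A minimal sketch of the rescaling idea in [B]/[C], assuming cloud_a and cloud_b are corresponding Nx3 arrays of the same tracked points from two consecutive reconstructions (the names and the per-point correspondence are assumptions):

    import numpy as np

    def rigid_scale_ratio(cloud_a, cloud_b):
        # For a rigid object the point-to-centroid distances are preserved,
        # so the ratio of those distances gives the relative scale factor.
        da = np.linalg.norm(cloud_a - cloud_a.mean(axis=0), axis=1)
        db = np.linalg.norm(cloud_b - cloud_b.mean(axis=0), axis=1)
        return np.median(da / db)  # median is robust to a few bad points

    # Bring reconstruction b (points and camera translation) to a's scale:
    # s = rigid_scale_ratio(cloud_a, cloud_b); cloud_b *= s; t_b *= s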

Is the above the right approach to generate the point cloud from 2D point correspondences?


Solution

It is the right procedure to reconstruct an object. I worked on this topic last year in a project at our university. My experience was that it isn't easy to reconstruct an object with a hand-moved camera.

Matching

First you have to think about the matching of interest points. SURF and SIFT are good matching methods for these points. When the object rotates by less than 15° you can consider U-SURF, which is a bit faster than normal SURF (for more details, see the SURF paper). In our project we decided to use optical flow in OpenCV; it seemed a bit slower but was more robust against outliers. Since your object is only rotating, you could consider using it too.
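A minimal sketch of that optical-flow matching, assuming two consecutive grayscale frames img1 and img2 (hypothetical names):

    import cv2

    # Detect interest points in the first frame (Shi-Tomasi corners).
    pts1 = cv2.goodFeaturesToTrack(img1, maxCorners=500,
                                   qualityLevel=0.01, minDistance=7)

    # Track them into the second frame with pyramidal Lucas-Kanade flow.
    pts2, status, err = cv2.calcOpticalFlowPyrLK(img1, img2, pts1, None)

    # Keep only the successfully tracked correspondences.
    good1 = pts1[status.ravel() == 1].reshape(-1, 2)
    good2 = pts2[status.ravel() == 1].reshape(-1, 2)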

Evaluation of Matrix

Next is evaluating the results of the new camera matrix. Do you have a way to find out how much the object was rotated (e.g., with a stepper motor or similar)? Then you can compare your computed results with the motor steps. If the difference is higher than a threshold, you know the computation was bad. But be careful: the precision of some stepper motors is not very good, though some experiments could give more information about that.
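A minimal sketch of such a check, assuming R is the recovered relative rotation matrix; the motor step size and tolerance below are made-up values:

    import numpy as np
    import cv2

    rvec, _ = cv2.Rodrigues(R)                       # matrix -> axis-angle
    computed_deg = np.degrees(np.linalg.norm(rvec))  # rotation angle

    motor_step_deg = 5.0   # assumed step size of the motor
    tolerance_deg = 1.0    # assumed threshold
    if abs(computed_deg - motor_step_deg) > tolerance_deg:
        print("pose estimate looks bad: %.2f deg" % computed_deg)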

Evaluation of Cloud

There are some nice ways to evaluate the computed cloud. The easiest is to compute the reprojection error of the cloud: reverse your reconstruction and measure how far the computed image points lie from the original corresponding points. Another test is to check whether all points lie in front of the camera. During computation it can happen that points end up both in front of and behind the camera; as I understand it, this occurs when the two cameras are too close to each other, even though the triangulation still terminates.
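A minimal sketch of both checks, assuming pts3d (Nx3) is the triangulated cloud, pts2 (Nx2) the matched points in the second image, and R, t, K the second camera's pose and intrinsics (all names assumed):

    import numpy as np
    import cv2

    # Reprojection error: project the cloud back into the second image and
    # compare with the original corresponding points.
    rvec, _ = cv2.Rodrigues(R)
    proj, _ = cv2.projectPoints(pts3d, rvec, t, K, None)
    err = np.linalg.norm(proj.reshape(-1, 2) - pts2, axis=1)
    print("mean reprojection error: %.2f px" % err.mean())

    # Cheirality check: depth in the second camera must be positive.
    depths = (R @ pts3d.T + t)[2]
    print("points behind the camera:", int((depths <= 0).sum()))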

First Image Pair

I am not sure whether this step is necessary with a static camera. But first of all we had to calculate a fundamental matrix. Our experience was that extracting it from the image pair with the most matches, using the RANSAC version, gives the best results. Maybe you can also place the object so that it shows the most interest points to the camera in the first shot.
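A minimal sketch of the RANSAC estimation in OpenCV, assuming the matched arrays good1, good2 (Nx2, float32) from the matching step above:

    import cv2

    # RANSAC fundamental-matrix estimation; the mask marks the inliers.
    F, inlier_mask = cv2.findFundamentalMat(good1, good2, cv2.FM_RANSAC,
                                            3.0, 0.99)

    # Keep only the inlier correspondences for the reconstruction.
    inl1 = good1[inlier_mask.ravel() == 1]
    inl2 = good2[inlier_mask.ravel() == 1]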

Following Image Pairs

What worked really well was extracting the new camera positions from the existing point cloud computed from the earlier image pairs. For that you have to remember the 2D-3D correspondences from the previous images. This is called Perspective-n-Point camera pose estimation (PnP).
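A minimal sketch of the PnP step, assuming obj_pts are 3D points already in the cloud and img_pts their 2D observations in the new image (names assumed):

    import numpy as np
    import cv2

    # 3D points already in the cloud and their 2D observations in the new
    # image, collected via the remembered 2D-3D correspondences.
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        obj_pts.astype(np.float32), img_pts.astype(np.float32), K, None)

    if ok:
        R_new, _ = cv2.Rodrigues(rvec)  # new camera pose in cloud coordinates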

In the end we had some good and some bad results, depending on the scanned object. Here are some papers which helped me:

Modeling The World

Live Metric 3D-Reconstruction
