Question

Please help me understand this idea from the paper "Scene Summarization for Online Image Collections" by Ian Simon, Noah Snavely, and Steven M. Seitz (University of Washington).

Computing the Feature-Image Matrix:
We first transform the set of views into a feature-image incidence matrix. To do so, we use the SIFT keypoint detector to find feature points in all of the images in V. The feature points are represented using the SIFT descriptor. Then, for each pair of images, we perform feature matching on the descriptors to extract a set of candidate matches. We further prune the set of candidates by estimating a fundamental matrix using RANSAC and removing all inconsistent matches.

After the previous step is complete for all images, we organize the matches into tracks, where a track is a connected component of features. We remove tracks containing fewer than two features total, or at least two features in the same image. At this point, we consider each track as corresponding to a single 3D point in S. From the set of tracks, it is easy to construct the |S|-by-|V| feature-image incidence matrix.

The part I am confused about is the italicized one (the paragraph about tracks).
How do we organize the matches into tracks?
And how do we construct the feature-image incidence matrix?

Please help me.


Solution

Example: building a track across 3 images.

  1. Detect features

  2. Perform matching for each image pair (1-2, 2-3, 1-3). Now you have correspondences such as FeatureA_img1 = FeatureB_img2, FeatureC_img2 = FeatureD_img3, FeatureE_img1 = FeatureF_img3.

  3. Check whether two correspondences share a feature: if FeatureA_img1 == FeatureB_img2 AND FeatureB_img2 == FeatureC_img3, then you have the same feature in 3 images. Save it in the array:

    img1      img2      img3      ...  imgn
    FeatureA  FeatureB  FeatureC  ...

Repeat this for all correspondences. The rows in this table are the tracks you are looking for.
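
The chaining in step 3 can be sketched in code. This is only a minimal illustration, not the paper's implementation: it assumes the RANSAC-filtered matches are already available as pairs of hypothetical `(image_index, feature_index)` identifiers, and it uses a union-find structure as one common way to compute the "connected components" the paper mentions. The function names `build_tracks` and `incidence_matrix` and the toy data are made up for this example. The incidence matrix then has one row per surviving track and one column per image, with a 1 wherever the track has a feature in that image.

```python
from collections import defaultdict


def build_tracks(matches):
    """Group matched features into tracks (connected components).

    matches: iterable of ((img, feat), (img, feat)) pairs, assumed to be
    the correspondences that survived the RANSAC pruning step.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Every match links two features into the same component.
    for a, b in matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(list)
    for node in list(parent):
        groups[find(node)].append(node)

    tracks = []
    for members in groups.values():
        images = [img for img, _ in members]
        # Filter from the paper: drop tracks with fewer than two features,
        # or with two or more features in the same image.
        if len(members) >= 2 and len(set(images)) == len(images):
            tracks.append(sorted(members))
    return sorted(tracks)


def incidence_matrix(tracks, num_images):
    """Rows are tracks (points in S), columns are images (views in V)."""
    matrix = [[0] * num_images for _ in tracks]
    for row, track in enumerate(tracks):
        for img, _ in track:
            matrix[row][img] = 1
    return matrix


# Hypothetical toy input for 3 images (indices 0, 1, 2).
matches = [
    ((0, 0), (1, 3)),  # a feature in img1 matches one in img2
    ((1, 3), (2, 5)),  # the same img2 feature matches one in img3
    ((0, 1), (2, 7)),  # a separate img1-img3 match
    ((0, 2), (1, 4)),
    ((1, 4), (0, 9)),  # chains two features of img1 together: inconsistent
]
tracks = build_tracks(matches)
matrix = incidence_matrix(tracks, num_images=3)
print(tracks)  # [[(0, 0), (1, 3), (2, 5)], [(0, 1), (2, 7)]]
print(matrix)  # [[1, 1, 1], [1, 0, 1]]
```

In the toy data, the last two matches form a component with two features from the same image, so that track is discarded by the filter and only two rows remain in the matrix.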

Licensed under: CC-BY-SA with attribution