Question

I am learning OpenCV and have started exploring the SURF algorithm for image matching. I have created a sample image library by modifying the default images that ship with Microsoft Windows 7.

Each image has a rotated, scaled, blurred and skewed version in the same folder.

My code for finding matching images is shown below. As can be seen, the distance is computed as dis/objectDescriptors->total, and the similarity as 100 - (dis/objectDescriptors->total) * 100.

Unfortunately, this gives me some weird false positives. For example, it matches image1 with a completely different image2 (85% similarity) but shows only 60% similarity with a slightly blurred version of image1.

How do I get rid of false positives?

The code below was inspired by this page: http://opencvuser.blogspot.in/2012/07/surf-source-code-part-2.html

#include <cv.h>
#include <highgui.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h> 
#include <iostream>
#include <vector>

using namespace std;

static double dis = 0; // Accumulates the nearest-neighbour distances over all descriptors


IplImage *image = 0;

double
compareSURFDescriptors( const float* d1, const float* d2, double best, int length )
{
    double total_cost = 0;
    assert( length % 4 == 0 );
    for( int i = 0; i < length; i += 4 )
    {
        double t0 = d1[i] - d2[i];
        double t1 = d1[i+1] - d2[i+1];
        double t2 = d1[i+2] - d2[i+2];
        double t3 = d1[i+3] - d2[i+3];
        total_cost += t0*t0 + t1*t1 + t2*t2 + t3*t3;

        if( total_cost > best )
            break;
    }
    return total_cost;
}


int
naiveNearestNeighbor( const float* vec, int laplacian,
                      const CvSeq* model_keypoints,
                      const CvSeq* model_descriptors )
{
    int length = (int)(model_descriptors->elem_size/sizeof(float));
    int i, neighbor = -1;
    double d, dist1 = 1e6, dist2 = 1e6;
    CvSeqReader reader, kreader;
    cvStartReadSeq( model_keypoints, &kreader, 0 );
    cvStartReadSeq( model_descriptors, &reader, 0 );

    for( i = 0; i < model_descriptors->total; i++ )
    {
        const CvSURFPoint* kp = (const CvSURFPoint*)kreader.ptr;
        const float* mvec = (const float*)reader.ptr;
        CV_NEXT_SEQ_ELEM( kreader.seq->elem_size, kreader );
        CV_NEXT_SEQ_ELEM( reader.seq->elem_size, reader );
        if( laplacian != kp->laplacian )
            continue;
        d = compareSURFDescriptors( vec, mvec, dist2, length );

        if( d < dist1 )
        {
            dist2 = dist1;
            dist1 = d;
            neighbor = i;
        }
        else if ( d < dist2 )
            dist2 = d;
    }
    dis = dis + dist1;
    /* We accumulate the distance from every descriptor of the probe image to its
       nearest descriptor in the gallery image. In findPairs, this sum is divided
       by the total number of descriptors to get the average of all the distances. */
    if ( dist1 < 0.6*dist2 )
        return neighbor;
    return -1;
}

void
findPairs( const CvSeq* objectKeypoints, const CvSeq* objectDescriptors,
           const CvSeq* imageKeypoints, const CvSeq* imageDescriptors, vector<int>& ptpairs )
{
    int i;
    CvSeqReader reader, kreader;
    cvStartReadSeq( objectKeypoints, &kreader, 0 );
    cvStartReadSeq( objectDescriptors, &reader, 0 );
    ptpairs.clear();

    for( i = 0; i < objectDescriptors->total; i++ )
    {
        const CvSURFPoint* kp = (const CvSURFPoint*)kreader.ptr;
        const float* descriptor = (const float*)reader.ptr;
        CV_NEXT_SEQ_ELEM( kreader.seq->elem_size, kreader );
        CV_NEXT_SEQ_ELEM( reader.seq->elem_size, reader );
        // For every descriptor of the probe image, find its nearest neighbour in the gallery image
        int nearest_neighbor = naiveNearestNeighbor( descriptor, kp->laplacian, imageKeypoints, imageDescriptors );
        if( nearest_neighbor >= 0 )
        {
            ptpairs.push_back(i);
            ptpairs.push_back(nearest_neighbor);
        }
    }

printf("\n%lf\n",(dis/objectDescriptors->total));////Here's where I am outputting the distance between the images
/*Dileep: If you are using this for recognition, write this distance to a file along with the name of the image you are matching against. After doing this for several images, you can then sort them in ascending order to find the best possible match - the one with the smallest distance. Here, I am outputting the distance to stdout
*/
}

int main(int argc, char** argv)
{
    const char* object_filename = argc == 3 ? argv[1] : "box.png";
    const char* scene_filename = argc == 3 ? argv[2] : "box_in_scene.png";
    // Dileep: When executing the binary, use: ./objectfile probe_image gallery_image
    /* Dileep:
       probe_image   - the image for which you need to find a match
       gallery_image - one image from the set you are matching against

       Keep the probe image fixed, repeatedly changing the gallery image, and output
       the distance in the format <gallery_name distance> to a file. Finally, sort
       the distances in ascending order; the one with the smallest distance is the
       best possible match.

       It may become tedious to type the same command repeatedly with a different
       gallery file name, so try a shell script with a for loop. */
    CvMemStorage* storage = cvCreateMemStorage(0);

    IplImage* object = cvLoadImage( object_filename, CV_LOAD_IMAGE_GRAYSCALE );
    IplImage* image = cvLoadImage( scene_filename, CV_LOAD_IMAGE_GRAYSCALE );
    if( !object || !image )
    {
        fprintf( stderr, "Can not load %s and/or %s\n"
            "Usage: find_obj [<object_filename> <scene_filename>]\n",
            object_filename, scene_filename );
        exit(-1);
    }

    CvSeq *objectKeypoints = 0, *objectDescriptors = 0;
    CvSeq *imageKeypoints = 0, *imageDescriptors = 0;
    int i;
    CvSURFParams params = cvSURFParams(500, 1);

    double tt = (double)cvGetTickCount();
    cvExtractSURF( object, 0, &objectKeypoints, &objectDescriptors, storage, params );
    printf("Object Descriptors: %d\n", objectDescriptors->total);
    cvExtractSURF( image, 0, &imageKeypoints, &imageDescriptors, storage, params );
    printf("Image Descriptors: %d\n", imageDescriptors->total);
    tt = (double)cvGetTickCount() - tt;
    printf( "Extraction time = %gms\n", tt/(cvGetTickFrequency()*1000.));




    vector<int> ptpairs;

    findPairs( objectKeypoints, objectDescriptors, imageKeypoints, imageDescriptors, ptpairs );


    return 0;
}

Solution

Remember that the usual pipeline is this:

  1. Get features from two images.
  2. Compute putative correspondences with the neighbor ratio (as you do), or with cross matching (@user2746401).
  3. Find a subset of consistent correspondences that support some geometrical relation between the images.

You have done steps 1 and 2, but 3 is missing. Note that putative correspondences are very noisy. You will obtain a lot of correspondences, many of them wrong but some correct. It is difficult to tell for sure whether they are correctly computed just by drawing them. So it is normal that you see apparently weird correspondences.

To do step 3, in your case you can find a homography with RANSAC, as @user3481173 says. That is very easy, because OpenCV already provides the function cvFindHomography, which does both things at once. A homography should work well in your case, since you are dealing with perspective transformations of planar images.
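Here is a minimal sketch of that step, assuming the OpenCV 2.x C++ interface and that the matched keypoint coordinates from ptpairs have already been collected into two point vectors; the function name and the 3-pixel reprojection threshold are illustrative, not part of the original code:

#include <opencv2/core/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <vector>

// Keep only the correspondences consistent with a single homography.
// Returns the number of RANSAC inliers, which is a much more robust
// similarity score than the average descriptor distance.
int countHomographyInliers( const std::vector<cv::Point2f>& ptsObject,
                            const std::vector<cv::Point2f>& ptsScene )
{
    // findHomography needs at least 4 point pairs
    if( ptsObject.size() < 4 )
        return 0;
    std::vector<unsigned char> inlierMask;
    cv::findHomography( ptsObject, ptsScene, CV_RANSAC, 3.0, inlierMask );
    int inliers = 0;
    for( size_t i = 0; i < inlierMask.size(); i++ )
        if( inlierMask[i] )
            inliers++;
    return inliers;
}

Ranking gallery images by the inlier count (or the inlier fraction) instead of by dis/objectDescriptors->total should remove most of the false positives.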

By the way, in the future it will be much easier for you to use the C++ API of OpenCV. The same code will probably take up half the lines you have now.
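For illustration only, a compressed sketch of the same extract-and-match pipeline in the C++ API, assuming OpenCV 2.4, where SURF lives in the nonfree module; the hessian threshold 500 mirrors cvSURFParams(500, 1) above and the 0.6 ratio mirrors naiveNearestNeighbor:

#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/nonfree/features2d.hpp> // SURF lives here in OpenCV 2.4
#include <cstdio>
#include <vector>

int main( int argc, char** argv )
{
    if( argc != 3 )
        return -1;
    cv::Mat object = cv::imread( argv[1], CV_LOAD_IMAGE_GRAYSCALE );
    cv::Mat scene  = cv::imread( argv[2], CV_LOAD_IMAGE_GRAYSCALE );
    if( object.empty() || scene.empty() )
        return -1;

    // Detect keypoints and compute descriptors in one call
    cv::SURF surf( 500 ); // hessian threshold, as in the C version
    std::vector<cv::KeyPoint> kp1, kp2;
    cv::Mat desc1, desc2;
    surf( object, cv::Mat(), kp1, desc1 );
    surf( scene,  cv::Mat(), kp2, desc2 );

    // Brute-force matching with a nearest/second-nearest ratio test
    cv::BFMatcher matcher( cv::NORM_L2 );
    std::vector<std::vector<cv::DMatch> > knn;
    matcher.knnMatch( desc1, desc2, knn, 2 );

    int good = 0;
    for( size_t i = 0; i < knn.size(); i++ )
        if( knn[i].size() == 2 && knn[i][0].distance < 0.6f * knn[i][1].distance )
            good++;
    printf( "good matches: %d\n", good );
    return 0;
}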

Other tips

I don't think you can easily match two images with simple pairwise matching alone. The cost of doing this is enormous, which is why you might want to consider using RANSAC.

I can recommend an example in Matlab by Zisserman and Kovesi.

You will need this function for RANSAC by Peter Kovesi.

I don't think there is "an" answer to your question, sorry. I can offer some suggestions for reducing the false positives:

  • Do cross-checked matching. In this scheme you find the nearest match for a feature from image A in image B, then take that matched feature from image B and find its nearest neighbour back in image A. If they match both ways, you consider it a match (sketches of both schemes follow this list).

  • Do ratio matching. Here you track the nearest and second-nearest distances between feature descriptors. If the ratio of the nearest to the second-nearest is below some threshold (say, 0.8), keep it as a good match.
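
Both schemes are only a few lines each with cv::BFMatcher, again assuming the OpenCV 2.x C++ API; descA and descB stand for precomputed SURF descriptor matrices and are illustrative names:

#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <vector>

// Cross-checked matching: with crossCheck=true, BFMatcher only returns
// pairs (i, j) where j is i's nearest neighbour AND i is j's.
std::vector<cv::DMatch> crossCheckedMatches( const cv::Mat& descA, const cv::Mat& descB )
{
    cv::BFMatcher matcher( cv::NORM_L2, true ); // second argument enables cross-checking
    std::vector<cv::DMatch> matches;
    matcher.match( descA, descB, matches );
    return matches;
}

// Ratio matching: keep a match only if the best distance is well below
// the second-best distance (threshold 0.8, as suggested above).
std::vector<cv::DMatch> ratioMatches( const cv::Mat& descA, const cv::Mat& descB )
{
    cv::BFMatcher matcher( cv::NORM_L2 );
    std::vector<std::vector<cv::DMatch> > knn;
    matcher.knnMatch( descA, descB, knn, 2 );
    std::vector<cv::DMatch> good;
    for( size_t i = 0; i < knn.size(); i++ )
        if( knn[i].size() == 2 && knn[i][0].distance < 0.8f * knn[i][1].distance )
            good.push_back( knn[i][0] );
    return good;
}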

These should make your feature matching keep only "good" matches between images. Once you have these good matches, you can see which images have the best average distance between features. You could either threshold (as you are doing now) or, again, do a ratio test to ensure the image you select is sufficiently better than the alternatives.

If you want feature matching to do image retrieval, then search for papers on this topic... it's a very open question, so keep experimenting!

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow