Question

I want to use bag-of-words (BoW) for content-based image retrieval (CBIR), but I'm confused about how to apply it. To clarify:

I've trained my program by extracting SURF features and computing BoW descriptors from them. I feed these descriptors to a support vector machine as training data. Then, given a query image, the support vector machine can predict which class the image belongs to.

In other words, given a query image it can find a class. For example, given a query image of a car, the program would return 'car'. How would one find similar images?

Would I, given the class, return images from the training set? Or would the program - given a query image - also return a subset of a test-set on which the SVM predicts the same class?


Solution

The title only mentions BoW, but in your text you also use SVMs.

I think the core idea of CBIR is to find the most similar images according to some distance measure. You can do this directly with the BoW features; the SVM is not necessary.
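
As a rough illustration of retrieval without any classifier, here is a minimal sketch that ranks stored images by the distance between their BoW histograms and the query's histogram. The chi-squared distance, the function names, and the toy 4-word histograms are all assumptions made for the example; any histogram distance could be swapped in, and plain Python lists are used only so that it runs without dependencies.

```python
def l1_normalise(hist):
    """Scale a histogram so its bins sum to 1 (empty histograms are returned unchanged)."""
    total = sum(hist)
    return [v / total for v in hist] if total > 0 else list(hist)


def chi2_distance(a, b, eps=1e-12):
    """Chi-squared distance between two equal-length histograms (assumed metric)."""
    return 0.5 * sum((x - y) ** 2 / (x + y + eps) for x, y in zip(a, b))


def rank_by_similarity(query_hist, db_hists, top_k=3):
    """Return (index, distance) pairs for the top_k closest database images."""
    q = l1_normalise(query_hist)
    scored = [(i, chi2_distance(q, l1_normalise(h))) for i, h in enumerate(db_hists)]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:top_k]


# Toy BoW histograms over a hypothetical 4-word visual vocabulary.
db_hists = [
    [9, 1, 0, 0],
    [8, 2, 1, 0],
    [0, 1, 9, 8],
    [1, 0, 7, 9],
]
query_hist = [10, 1, 0, 0]

results = rank_by_similarity(query_hist, db_hists, top_k=2)
for index, distance in results:
    print(index, round(distance, 4))
```

With real data, each histogram would be the BoW descriptor you already compute from the SURF features, and the returned indices would point back to the stored images.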

The main purpose of adding a classification step is to speed up the search: once you have a class label for your query image, you only need to search that subgroup of your images for the best match. And of course, if the SVM is better at distinguishing certain classes than your distance measure is, it might also help to reduce errors.

So the standard workflow would be:

  • obtain the class
  • return the best match from the training samples of this class
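
The two steps above can be sketched as follows. This is only an illustration under stated assumptions: `predict_class` is a hypothetical stand-in for your trained SVM's prediction function, `toy_predict_class` is a made-up rule used so the example runs on its own, and the chi-squared distance and toy histograms are likewise assumptions.

```python
def chi2_distance(a, b, eps=1e-12):
    """Chi-squared distance between two equal-length histograms (assumed metric)."""
    return 0.5 * sum((x - y) ** 2 / (x + y + eps) for x, y in zip(a, b))


def retrieve_within_class(query_hist, predict_class, db_hists, db_labels, top_k=3):
    """Predict the query's class, then rank only the stored images of that class."""
    # Step 1: obtain the class (predict_class stands in for the trained SVM).
    label = predict_class(query_hist)
    # Step 2: restrict the search to training samples with that label.
    candidates = [i for i, l in enumerate(db_labels) if l == label]
    ranked = sorted(candidates, key=lambda i: chi2_distance(query_hist, db_hists[i]))
    return label, ranked[:top_k]


def toy_predict_class(hist):
    """Hypothetical classifier: 'car' if the first two visual words dominate."""
    return "car" if hist[0] + hist[1] >= hist[2] + hist[3] else "tree"


# Toy, already-normalised BoW histograms with their training labels.
db_hists = [
    [0.9, 0.1, 0.0, 0.0],
    [0.7, 0.2, 0.1, 0.0],
    [0.0, 0.1, 0.5, 0.4],
    [0.1, 0.0, 0.4, 0.5],
]
db_labels = ["car", "car", "tree", "tree"]
query_hist = [0.8, 0.2, 0.0, 0.0]

label, matches = retrieve_within_class(
    query_hist, toy_predict_class, db_hists, db_labels, top_k=2
)
print(label, matches)
```

Note that the distance is only computed for images in the predicted class, which is where the speed-up comes from; the trade-off is that a wrong class prediction excludes the correct matches entirely.
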
Licensed under: CC-BY-SA with attribution