Question

I have read about SVMs and understand that, for data that is not linearly separable, an SVM conceptually maps the data into a higher-dimensional space in which it becomes linearly separable. In practice this is achieved with kernel functions, which compute similarities (inner products) in that higher-dimensional space without ever transforming the data explicitly.
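To make the kernel trick concrete, here is a minimal sketch (assuming scikit-learn is available): a linear SVM cannot separate two concentric circles, while an RBF-kernel SVM separates them without ever materializing the higher-dimensional feature space.

```python
# Assumption: scikit-learn is installed; make_circles is a toy dataset,
# not related to the dog/cat question itself.
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings: not linearly separable in 2-D.
X, y = make_circles(n_samples=500, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Same data, two kernels: only the RBF kernel finds a separating boundary.
linear_acc = SVC(kernel="linear").fit(X_train, y_train).score(X_test, y_test)
rbf_acc = SVC(kernel="rbf").fit(X_train, y_train).score(X_test, y_test)
print(f"linear: {linear_acc:.2f}  rbf: {rbf_acc:.2f}")
```

The RBF kernel scores near-perfectly here while the linear kernel hovers around chance, which is exactly the "implicit higher-dimensional mapping" at work.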

But how does this work in the case of an image classifier? Let's say we need to classify pictures as either dog or cat. In this scenario, a CNN would learn features like ear size, face shape, and nose shape from the training set to distinguish dogs from cats. But what does an SVM learn during training, and how does it work in this case?


Solution

There are a few points I would like to mention that, I believe, answer the question:

1. An SVM works only the way we already know: by finding the maximum-margin separating hyperplane. It therefore treats an image as a flattened "1 x N"-dimensional vector, just like any other data.
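A minimal sketch of this point (assuming scikit-learn): each 8x8 digit image is flattened into a 64-dimensional row vector before being handed to the SVM, so the model never sees any 2-D spatial structure.

```python
# Assumption: scikit-learn is installed; load_digits is used as a small
# stand-in for "image data" since it ships with the library.
from sklearn.datasets import load_digits
from sklearn.svm import SVC

digits = load_digits()                               # images of shape (8, 8)
X = digits.images.reshape(len(digits.images), -1)    # flatten -> shape (n, 64)

# Train on the first 1000 flattened vectors, test on the rest.
clf = SVC(kernel="rbf").fit(X[:1000], digits.target[:1000])
acc = clf.score(X[1000:], digits.target[1000:])
print(f"accuracy: {acc:.3f}")
```

Note that shuffling the 64 pixel columns (the same permutation for every image) would not change the result at all, which is the clearest contrast with a CNN, whose convolutions depend on pixel neighborhoods.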

2. It performs well on sparse, high-dimensional data (when the data volume is small) compared to other classifiers, which is typical of image data. So if you try it on MNIST (~10K samples), it will perform better than a decision tree.
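This comparison can be sketched as follows (assuming scikit-learn; the small `load_digits` set stands in for MNIST to keep the example self-contained):

```python
# Assumption: scikit-learn is installed. On ~1800 samples with 64 features,
# an SVM typically beats a single decision tree, illustrating the point above.
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
svm_acc = cross_val_score(SVC(), X, y, cv=5).mean()
tree_acc = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5).mean()
print(f"SVM: {svm_acc:.3f}  decision tree: {tree_acc:.3f}")
```

The gap tends to widen as dimensionality grows relative to sample count, which is exactly the regime the answer describes.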

3. In the classical image-processing approach, we first extract key-points and then use classifiers on those features to identify images. Such feature data is not sparse, so I think which model performs best depends on the ratio of features to data count. [Example] — here SVM works better than KNN. Still, SVM has other benefits, e.g. sparseness of the solution and a mathematically derived (convex) optimum, i.e. it is not solved for a local optimum.
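A toy sketch of this extract-then-classify pipeline (assuming scikit-learn and NumPy; `hog_lite` below is my own simplified HOG-style descriptor written for illustration, not a library API):

```python
# Assumption: NumPy and scikit-learn are installed. hog_lite is a toy,
# hand-rolled gradient-orientation descriptor, not a real HOG implementation.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def hog_lite(img, bins=8):
    """Gradient-orientation histograms over four 4x4 cells of an 8x8 image."""
    gy, gx = np.gradient(img.astype(float))
    mag, ang = np.hypot(gx, gy), np.arctan2(gy, gx)
    feats = []
    for i in (0, 4):
        for j in (0, 4):
            h, _ = np.histogram(ang[i:i + 4, j:j + 4], bins=bins,
                                range=(-np.pi, np.pi),
                                weights=mag[i:i + 4, j:j + 4])
            feats.append(h)
    return np.concatenate(feats)   # 4 cells x 8 bins = 32 features

digits = load_digits()
X = np.array([hog_lite(im) for im in digits.images])  # pixels -> features
acc = cross_val_score(SVC(), X, digits.target, cv=5).mean()
print(f"SVM on extracted features: {acc:.3f}")
```

The SVM here never touches raw pixels; it only sees the 32 hand-crafted features, which is what "classical" pipelines (SIFT/HOG + SVM) did before CNNs learned the features themselves.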

References:
- Researchgate
- Stats.SE

Other tips

A quick look (this post, for example) suggests that to use an SVM for image classification, you need to extract features beforehand rather than running it directly on raw pixel data.

Licensed under: CC-BY-SA with attribution