Question

What are the hallmarks or properties that indicate that a certain learning problem can be tackled using support vector machines?

In other words, what is it that, when you see a learning problem, makes you go "oh I should definitely use SVMs for this'' rather than Neural networks or Decision trees or anything else?


Solution

SVMs can be used for classification (distinguishing between several groups or classes) and for regression (obtaining a mathematical model to predict something). They can be applied to both linear and non-linear problems.
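As a minimal sketch of both uses (the datasets and parameters here are illustrative choices, not from the original answer), scikit-learn provides `SVC` for classification and `SVR` for regression, each accepting linear or non-linear kernels:

```python
# Sketch: SVMs for classification and regression with scikit-learn.
from sklearn.datasets import make_moons, make_regression
from sklearn.svm import SVC, SVR

# Non-linear classification: an RBF kernel separates the two "moons".
X_clf, y_clf = make_moons(noise=0.1, random_state=0)
clf = SVC(kernel="rbf").fit(X_clf, y_clf)
print("classification accuracy:", clf.score(X_clf, y_clf))

# Regression: SVR fits a model to predict a continuous target.
X_reg, y_reg = make_regression(n_samples=100, n_features=3, random_state=0)
reg = SVR(kernel="linear").fit(X_reg, y_reg)
print("regression predictions:", reg.predict(X_reg)[:3])
```

Swapping `kernel="rbf"` for `kernel="linear"` (or vice versa) is how the same estimator covers linear and non-linear problems.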

Until around 2006 they were arguably the best general-purpose algorithm for machine learning. I was trying to find a paper that compared many implementations of the best-known algorithms (SVMs, neural nets, trees, etc.), but I couldn't locate it again, so you'll have to take my word for it. In that paper the algorithm with the best performance was the SVM, using the library libsvm.

In 2006 Hinton introduced deep learning with deep neural nets, improving the then-current state of the art by at least 30%, which is a huge advancement. However, deep learning only achieves good performance with huge training sets. If you have a small training set, I would suggest using an SVM.

Furthermore, scikit-learn provides a useful infographic about when to use different machine learning algorithms. However, to the best of my knowledge there is no consensus in the scientific community that "if a problem has features X, Y and Z, then an SVM is the better choice". I would suggest trying different methods. Also, please keep in mind that an SVM or a neural net is just a method to compute a model; the features you use are just as important.

OTHER TIPS

Let's assume that we are in a classification setting.

For SVMs, feature engineering is the cornerstone:

  • The classes have to be linearly separable; otherwise the data needs to be transformed (e.g. using kernels). This is not done by the algorithm itself and it might blow up the number of features.
  • I would say that SVM performance suffers faster than other methodologies (e.g. tree ensembles) as we increase the number of dimensions. This is due to the constrained optimization problem that backs SVMs. Sometimes feature reduction is feasible, sometimes it is not, and that is when we can't really pave the way for an effective use of SVMs.
  • An SVM will likely struggle with a dataset where the number of features is much larger than the number of observations. This, again, can be understood by looking at the constrained optimization problem.
  • Categorical variables are not handled out of the box by the SVM algorithm.
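The last point is easy to work around in practice: one-hot encode the categorical columns (and scale the numeric ones, since the SVM optimization is scale-sensitive) before fitting. A minimal sketch with a toy dataset of my own (the column layout and labels are invented for illustration):

```python
# Sketch: feeding mixed numeric/categorical features to an SVM by
# one-hot encoding categoricals and standardizing numerics.
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

# Toy data: column 0 is numeric, column 1 is categorical.
X = np.array([[1.0, "red"], [2.0, "blue"],
              [3.0, "red"], [4.0, "blue"]], dtype=object)
y = np.array([0, 0, 1, 1])

pre = ColumnTransformer([
    ("num", StandardScaler(), [0]),   # scale the numeric column
    ("cat", OneHotEncoder(), [1]),    # one-hot the categorical column
])
model = make_pipeline(pre, SVC(kernel="rbf"))
model.fit(X, y)
print(model.predict(X))
```

The same pipeline pattern keeps the encoding and scaling inside cross-validation, so the transformation is learned only from training folds.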
Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange