Question

I am currently testing a few different ensemble methods on my dataset. I've heard that support vector machines can also be used as base learners in boosting and bagging methods, but I am not sure which methods allow this. For example, with XGBClassifier I tried both trees and SVMs as base learners and got exactly the same results on five different performance metrics, which made me question the results and suspect that the option only accepts trees as base learners. I didn't find much information in the documentation, or at least not in all of the documentation. I am interested in AdaBoostClassifier(), BaggingClassifier() and XGBClassifier(). Does anybody know the details, and whether or not I can use SVMs as base learners here?


Solution

In short: Yes.

Conceptually, bagging and boosting are model-agnostic techniques, meaning that they work regardless of the learner.

Bagging is essentially the following (see the sketch after the list):

  • create multiple predictors (they can even be hard-coded!)
  • gather the learners' predictions and aggregate them into a single prediction (e.g., by majority vote)
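
Here is a minimal from-scratch sketch of that idea (illustrative, not the sklearn API), assuming a toy binary dataset from make_classification and SVCs as the learners; the aggregation step is a simple majority vote:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, random_state=0)
    rng = np.random.default_rng(0)

    # Create multiple predictors, each trained on a bootstrap resample
    predictors = []
    for _ in range(10):
        idx = rng.integers(0, len(X), size=len(X))  # sample with replacement
        predictors.append(SVC().fit(X[idx], y[idx]))

    # Gather predictions from the learners and majority-vote per sample
    votes = np.array([p.predict(X) for p in predictors])  # shape (10, n_samples)
    majority = (votes.mean(axis=0) > 0.5).astype(int)     # ties resolve to class 0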

Boosting can be seen as the following loop (again, sketched in code after the list):

  • train a predictor
  • find where the predictor makes mistakes
  • put more emphasis on these mistakes
  • repeat until satisfactory
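
A minimal sketch of that loop, assuming an SVC base learner (whose fit method accepts sample_weight). Note the weight-update rule here is deliberately simplified; real AdaBoost uses a specific exponential re-weighting scheme:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, random_state=0)
    w = np.full(len(X), 1 / len(X))  # start with uniform sample weights

    learners = []
    for _ in range(5):
        clf = SVC().fit(X, y, sample_weight=w)  # train a predictor
        mistakes = clf.predict(X) != y          # find where it makes mistakes
        w[mistakes] *= 2.0                      # put more emphasis on those mistakes
        w /= w.sum()                            # keep the weights normalized
        learners.append(clf)                    # repeat until satisfactory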

Regarding the specific scikit-learn implementations, here are the base learners that you can use:

  • AdaBoostClassifier()

The documentation says: "Support for sample weighting is required, as well as proper classes_ and n_classes_ attributes."

This means that you can use any model whose fit method accepts sample weights (e.g., SVC, decision trees, logistic regression). Note that sklearn's KNeighborsClassifier does not accept sample weights in fit, so it does not qualify.
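
A minimal sketch with an SVC base learner. Two version-dependent details to be aware of: in scikit-learn >= 1.2 the constructor argument is estimator (older versions use base_estimator), and the discrete "SAMME" algorithm is used because the default real-valued variant needs predict_proba, which SVC only provides with probability=True:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, random_state=0)

    # SVC works as a base learner because its fit() accepts sample_weight
    ada = AdaBoostClassifier(estimator=SVC(), algorithm="SAMME", n_estimators=10)
    ada.fit(X, y)
    print(ada.score(X, y))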

  • BaggingClassifier()

This is a plain bagging strategy, so any estimator that implements fit and predict can be used here, SVMs included.
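
For instance, wrapping an SVC (again, the argument is estimator in scikit-learn >= 1.2, base_estimator before that):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, random_state=0)

    # Each copy of the estimator is trained on a bootstrap resample
    bag = BaggingClassifier(estimator=SVC(), n_estimators=10, random_state=0)
    bag.fit(X, y)
    print(bag.score(X, y))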

  • GradientBoostingClassifier()

Here it is the loss function, not the base learners, that must be differentiable: each new learner is fit to the negative gradient of the loss. In practice, scikit-learn's implementation only supports regression trees as base learners, and the same holds for XGBoost's tree boosters, so you cannot plug an SVM in here.
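
As for XGBClassifier specifically: its booster parameter only accepts "gbtree" and "dart" (both tree-based) or "gblinear" (linear models); there is no SVM option. Depending on the XGBoost version, an unrecognized setting may be silently ignored rather than raise an error, which would explain why your tree and "SVM" runs produced identical metrics: presumably the same tree model was trained twice. A quick sanity check is to compare two boosters that actually exist and confirm the results differ:

    from sklearn.datasets import make_classification
    from xgboost import XGBClassifier

    X, y = make_classification(n_samples=200, random_state=0)

    # Tree booster vs. linear booster: the only families XGBoost supports
    tree_model = XGBClassifier(booster="gbtree").fit(X, y)
    linear_model = XGBClassifier(booster="gblinear").fit(X, y)
    print(tree_model.score(X, y), linear_model.score(X, y))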

Licensed under: CC-BY-SA with attribution