Question

It is possible to fit decision-tree ensembles with other decision-tree ensembles as their base estimators. For example:

from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier

adaclassification = AdaBoostClassifier(RandomForestClassifier(n_jobs=-1))
adaclassification.fit(X_train, y_train)

I got better results with a random forest on its own, so I improved the AdaBoost result by giving it the random forest as its base classifier. However, I don't understand what's happening here. It sounds simple: AdaBoost uses a random forest to fit its classification. But what is mathematically going on? AdaBoost fits a sequence of models to the residuals (boosting), while a random forest (bagging) builds a forest out of independent trees.


Solution

Your description is apt. There isn't anything especially "mathematical" happening here, aside from the AdaBoost algorithm itself.

In pseudocode, something like this is happening:

For n in 1 .. n_estimators do
  Train classifier Tn on data X with sample weights W
  Compute the weighted error E of Tn on X
  Update W based on E, increasing the weights of the samples Tn misclassified
  Renormalize W
end
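To make that loop concrete, here is a rough sketch of a simplified two-class AdaBoost in Python. It is my own illustration, not scikit-learn's actual implementation; the function name simple_adaboost and the stopping rule are choices made for the example.

import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier

def simple_adaboost(base_estimator, X, y, n_estimators=10):
    # Simplified two-class AdaBoost loop mirroring the pseudocode above
    n_samples = len(y)
    w = np.full(n_samples, 1 / n_samples)    # start with uniform sample weights
    models, alphas = [], []
    for _ in range(n_estimators):
        model = clone(base_estimator)
        model.fit(X, y, sample_weight=w)     # train this round's classifier on weighted data
        miss = model.predict(X) != y
        err = np.sum(w * miss)               # weighted error rate (weights sum to 1)
        if err <= 0 or err >= 0.5:           # stop if the learner is perfect or no better than chance
            break
        alpha = np.log((1 - err) / err)      # this classifier's vote weight
        w = w * np.exp(alpha * miss)         # up-weight the samples it got wrong
        w = w / w.sum()                      # renormalize
        models.append(model)
        alphas.append(alpha)
    return models, alphas

# e.g. simple_adaboost(DecisionTreeClassifier(max_depth=1), X_train, y_train)
# or pass a RandomForestClassifier as the base estimator, as in the question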

In your case, Tn would be a Random Forest model, which is itself an ensemble based on bagging. So at each iteration of the "outer" AdaBoost model, an entire Random Forest model is being trained, i.e. several decision trees are fitted on random sub-samples of data points and features.
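As a rough illustration (the n_estimators values below are made up, not taken from the question), the two ensemble sizes multiply: with 10 boosting rounds and 100 trees per forest, on the order of 1,000 decision trees get fitted in total.

from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier

# 10 boosting rounds, each training a fresh 100-tree forest on the reweighted data
nested = AdaBoostClassifier(
    RandomForestClassifier(n_estimators=100, n_jobs=-1),
    n_estimators=10,
)
# nested.fit(X_train, y_train)   # roughly 10 * 100 = 1000 trees overall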

Of course, this is an unusual setup for a boosting model: boosting usually combines weak, high-bias learners such as decision stumps or shallow trees, whereas a Random Forest is already a strong ensemble in its own right. But there's no conceptual or computational reason why you couldn't run the algorithm this way.

If you are curious about how exactly the weights are computed and updated, scikit-learn uses the SAMME algorithm, which is based on, but not identical to, the original AdaBoost. SAMME is described in "Multi-Class AdaBoost" by Zhu, Zou, Rosset, and Hastie (2006).
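As a toy illustration of that update (my own sketch based on the formulas in the paper, not scikit-learn's internal code), one round of the SAMME weight update for a 3-class problem might look like this:

import numpy as np

K = 3                                            # number of classes
w = np.full(5, 1 / 5)                            # current sample weights
y_true = np.array([0, 1, 2, 1, 0])
y_pred = np.array([0, 2, 2, 1, 0])               # this round's predictions: one mistake

miss = y_pred != y_true
err = np.sum(w * miss) / np.sum(w)               # weighted error rate
alpha = np.log((1 - err) / err) + np.log(K - 1)  # estimator weight (SAMME)
w = w * np.exp(alpha * miss)                     # up-weight the misclassified sample
w = w / w.sum()                                  # renormalize
print(alpha, w)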

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange