Question

After studying decision trees for a while, I noticed there is a technique called boosting. In most cases it seems to improve the accuracy of a decision tree.

So I am wondering: why don't we simply incorporate boosting into every decision tree we build? Since boosting is currently treated as a separate technique, are there any disadvantages to using boosting compared to a single decision tree?

Thanks for helping me out here!

Solution

Boosting is a technique that can sit on top of any learning algorithm. It is most effective when the base classifier performs only slightly better than random. If your decision tree is already quite good, boosting may not make much difference, but it does carry a performance penalty: if you run boosting for 100 iterations, you have to train and store 100 decision trees.

Usually people boost decision stumps (decision trees with a single split, i.e. depth one) and get results as good as boosting with full decision trees.

I've done some experiments with boosting and found it to be fairly robust: better than a single-tree classifier, but also slower (I used 10 iterations), and not as good as some of the simpler learners (to be fair, it was an extremely noisy dataset).

OTHER TIPS

There are several disadvantages to boosting:

1. It is harder to implement.
2. It needs more extensive training, with larger training sets, than a single decision tree does.
3. Worst of all, most boosting algorithms require a threshold value that is usually not easy to find; it takes extensive trial-and-error tests, and the overall performance of the boosting algorithm depends on this threshold.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow