Question

I went through an AdaBoost tutorial, and below is my simplified understanding:

  1. An equal sample weight is given to every sample in the dataset.
  2. Stumps are created, each of which uses only one feature from the dataset.
  3. Using the total error and the sample weights, each stump's importance is calculated.
  4. Sample weights are updated: samples that were predicted wrongly by stumps with high importance get their sample weights increased, and the remaining sample weights are decreased correspondingly.
  5. The process is repeated for the specified number of iterations, with the sample weights guiding each round of training (see the sketch after this list).
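For concreteness, here is a minimal sketch of those five steps, assuming binary labels encoded as -1/+1 and using scikit-learn's DecisionTreeClassifier with max_depth=1 as the stump learner; names like adaboost_fit and the n_rounds parameter are my own illustration, not part of any library API:

```python
# A minimal sketch of discrete AdaBoost with decision stumps.
# Assumes binary labels y encoded as -1/+1.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=10):
    n = len(y)
    w = np.full(n, 1.0 / n)                 # step 1: equal weights for all samples
    stumps, alphas = [], []
    for _ in range(n_rounds):               # step 5: repeat for a fixed number of rounds
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)    # step 2: fit a one-split stump on the weighted data
        pred = stump.predict(X)
        err = w[pred != y].sum()            # step 3: weighted total error ...
        alpha = 0.5 * np.log((1 - err) / (err + 1e-10))  # ... gives the stump's importance
        w *= np.exp(-alpha * y * pred)      # step 4: raise weights of misclassified samples,
        w /= w.sum()                        #         lower the rest, then renormalize
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(X, stumps, alphas):
    # Final prediction: sign of the importance-weighted vote of all stumps.
    scores = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(scores)
```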

Does AdaBoost contain multiple stumps with different splitting values for a single feature? And are the stumps all created as a first step, or is their creation a continuous process?

Solution

First, I want to note that AdaBoost is an ensemble of stumps, and each stump is added to the ensemble sequentially, trying to compensate for the errors of the existing ensemble.

That said:

  • In every step you have a new weighted dataset, and the AdaBoost algorithm tries to fit the stump that best splits this new dataset. So yes, it is possible, and it does happen, that a single feature is used more than once (but the splitting value found by the new stump should be different, in order to reduce the ensemble's error).

  • Stumps' creation is a continuous process: a new stump is fitted in every step on the newly weighted dataset, as the sketch below illustrates.
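To see both points in practice, here is a small sketch that fits scikit-learn's AdaBoostClassifier on synthetic data and prints the feature index and threshold of each stump's root split; with several rounds, the same feature index typically appears more than once, each time with a different threshold. The dataset and parameter values are illustrative, and the estimator keyword assumes scikit-learn >= 1.2 (older versions call it base_estimator):

```python
# Inspect the stumps of a fitted AdaBoost ensemble: each estimator is a
# depth-1 tree, so tree_.feature[0] / tree_.threshold[0] describe its split.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
model = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # stumps as weak learners
    n_estimators=10,
    random_state=0,
).fit(X, y)

for i, stump in enumerate(model.estimators_):
    print(f"stump {i}: feature={stump.tree_.feature[0]}, "
          f"threshold={stump.tree_.threshold[0]:.3f}")
```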

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange