Question

I went through adaboost tutorial and below are my simplified understanding:

  1. An equal sample weight is given to every sample in the dataset.
  2. Stumps are created, each using only one feature from the dataset.
  3. Each stump's importance is calculated from its total (weighted) error and the sample weights.
  4. Sample weights are then updated: samples that were predicted wrongly by stumps with high importance have their weights increased, and the weights of the remaining samples are decreased correspondingly.
  5. The process is repeated for the specified number of iterations, with the sample weights guiding each round of training.
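The steps above can be sketched as follows — an illustrative toy implementation of discrete AdaBoost with exhaustive stump search, not the tutorial's exact code. Labels are assumed to be encoded as {-1, +1}, and the function names are my own:

```python
import numpy as np

def best_stump(X, y, w):
    """Steps 2-3: find the (feature, threshold, polarity) stump that
    minimizes the weighted classification error under weights w."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, T=10):
    n = X.shape[0]
    w = np.full(n, 1.0 / n)                      # step 1: equal sample weights
    stumps = []
    for _ in range(T):                           # step 5: repeat for T rounds
        err, j, thr, pol = best_stump(X, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)     # avoid log(0) for perfect stumps
        alpha = 0.5 * np.log((1 - err) / err)    # step 3: stump importance
        pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
        w *= np.exp(-alpha * y * pred)           # step 4: up-weight mistakes,
        w /= w.sum()                             #         down-weight the rest
        stumps.append((alpha, j, thr, pol))
    return stumps

def predict(stumps, X):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(a * np.where(p * (X[:, j] - t) >= 0, 1, -1)
                for a, j, t, p in stumps)
    return np.sign(score)
```

On a trivially separable toy set `predict` recovers the labels after a few rounds; the round count `T` and the exhaustive threshold search are simplifying choices for illustration.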

Does AdaBoost contain multiple stumps with different splitting values for a single feature? As mentioned, are the stumps all created as a first step, or is their creation a continuous process?


Solution

First, I want to note that AdaBoost is an ensemble of stumps, and each stump is added to the ensemble sequentially, trying to compensate for the errors of the existing ensemble.


With that said:

  • At every step you have a new weighted dataset, and the AdaBoost algorithm tries to fit the stump that splits this new dataset best. So yes, it is possible, and it does happen, that a single feature is used more than once (but the splitting value chosen by the new stump should be different in order to reduce the ensemble's error).

  • Stump creation is a continuous process: a new stump is fit at every step on the newly reweighted dataset.
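The feature-reuse point can be demonstrated with a small self-contained experiment — a toy discrete-AdaBoost loop with exhaustive stump search; the 1-D dataset and round count are my own illustrative choices. The positive class is an interval, so no single stump can fit it, and boosting must revisit the lone feature with different thresholds:

```python
import numpy as np

def fit_stump(X, y, w):
    # Exhaustive search over (feature, threshold, polarity) for the
    # stump with the lowest weighted error.
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

# 1-D dataset whose positive class is the interval {2, 3}.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([-1, 1, 1, -1, -1])

w = np.full(len(y), 1.0 / len(y))
chosen = []                          # (feature, threshold) picked each round
for _ in range(5):
    err, j, thr, pol = fit_stump(X, y, w)
    err = np.clip(err, 1e-10, 1 - 1e-10)
    alpha = 0.5 * np.log((1 - err) / err)
    pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
    w = w * np.exp(-alpha * y * pred)   # reweight for the next round
    w /= w.sum()
    chosen.append((j, thr))

# Every round uses feature 0 (there is no other), but the recorded
# thresholds are not all the same: the same feature is split at
# different values as the sample weights shift.
```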

License: CC-BY-SA with attribution. Not affiliated with datascience.stackexchange.