Question

I am trying to implement a decision tree classifier using the ID3 algorithm. According to Artificial Intelligence: A Modern Approach, the information gain of an attribute A is given by:

Gain(A) = B(p / (p + n)) - Remainder(A)

where B(q) is the entropy of a Boolean random variable that is true with probability q, and p and n are the numbers of positive and negative examples in the training set.
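For reference, here is a minimal sketch of how I currently read this formula (the representation of examples as dicts with a Boolean label key, and the helper names B, remainder, and gain, are my own; it is written over a generic list of examples, so it works whether that list is the full dataset or one partition):

```python
import math

def B(q):
    """Entropy of a Boolean random variable that is true with probability q."""
    if q in (0.0, 1.0):
        return 0.0
    return -(q * math.log2(q) + (1 - q) * math.log2(1 - q))

def remainder(attribute, examples, label):
    """Expected entropy remaining after splitting `examples` on `attribute`."""
    total = len(examples)
    rem = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e for e in examples if e[attribute] == value]
        pk = sum(1 for e in subset if e[label])   # positives in this branch
        nk = len(subset) - pk                     # negatives in this branch
        rem += (len(subset) / total) * B(pk / (pk + nk))
    return rem

def gain(attribute, examples, label):
    """Gain(A) = B(p / (p + n)) - Remainder(A), computed over `examples`."""
    p = sum(1 for e in examples if e[label])
    n = len(examples) - p
    return B(p / (p + n)) - remainder(attribute, examples, label)
```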

My question is:

Do p and n always refer to the examples in the full dataset, or to the remaining examples in the current partition of the set?

If the former applies, the value of B(p / (p + n)) would remain fixed throughout the training procedure. Is this correct?

No correct solution
