Question

After plotting a sklearn decision tree, I looked at what each box says, and there is one entry, "value", whose meaning I am not sure about.

The first line is the feature (column) and the threshold at which the node splits, gini is the impurity (the "disorder") of the data in the node, and samples is the number of samples in the node.

But value?

enter image description here


Solution

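value represents the number of training samples of each class that reach that node, listed in class order. If sample or class weights are used, these are weighted counts.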

If you look at the top node, you should view it as:

There are:

  • 35100 samples of class 0
  • 16288 samples of class 1
  • which sum to 51388 samples in total
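
To check this reading yourself, you can compare the root node of a fitted tree with the class counts of the training labels. The following is only a minimal sketch: the data is synthetic (made with make_classification, not the data from the question), and the variable names are made up for illustration. It also assumes that, depending on the scikit-learn release, tree_.value may hold either per-class counts or per-class fractions, so it converts to counts in both cases.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced two-class data (a stand-in, not the asker's data).
X, y = make_classification(
    n_samples=1000, n_features=5, weights=[0.7, 0.3], random_state=0
)

clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

tree = clf.tree_
root_value = tree.value[0, 0]          # per-class entry for the root node
root_samples = tree.n_node_samples[0]  # the "samples" line of the root node

# Assumption: some releases store counts, others store fractions.
# Convert to integer counts in either case.
if np.isclose(root_value.sum(), 1.0):
    root_counts = np.rint(root_value * root_samples).astype(int)
else:
    root_counts = np.rint(root_value).astype(int)

print("value at root (as counts):", root_counts)
print("class counts in y:        ", np.bincount(y))
print("samples at root:          ", root_samples)
```

With no sample weights, the root counts should equal np.bincount(y), and their sum should equal the samples figure of the root node, just as 35100 + 16288 = 51388 above.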
Licensed under: CC-BY-SA with attribution