Value on Decision Tree plot
11-12-2020
Question
After plotting a sklearn decision tree, I looked at what each box says, and there is one field, "value", that I am not sure how to interpret.
The first line is the feature and the threshold at which the node splits, gini is the "disorder" (impurity) of the data, and samples is the number of samples in the node.
But what does value refer to?
Solution
value represents the number of samples of each class in the node.
If you look at the top node, you should read it as:
- 35100 samples of class 0
- 16288 samples of class 1
- which sum to 51388 samples in total
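
To check this reading yourself, here is a minimal sketch that recounts the per-class samples in every node and compares them with the tree's own sample totals. The toy dataset, the tree depth and the variable names are assumptions made for illustration; they are not from the original question. It counts samples directly through decision_path rather than reading tree_.value, since newer scikit-learn releases may report value as class fractions instead of raw counts.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data; sizes and seeds are arbitrary choices for illustration.
X, y = make_classification(n_samples=500, n_features=4, random_state=0)

clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# decision_path marks, for each sample, every node it passes through.
node_indicator = clf.decision_path(X)        # sparse, shape (n_samples, n_nodes)
one_hot = np.eye(len(clf.classes_))[y]       # shape (n_samples, n_classes)

# Row i holds the number of training samples of each class that reach node i.
class_counts = np.asarray(node_indicator.T @ one_hot).astype(int)

for node_id, counts in enumerate(class_counts):
    print(f"node {node_id}: value = {counts.tolist()}, samples = {counts.sum()}")
```

Node 0 is the top node, so its counts are simply the class counts of the whole training set, and in every node the per-class counts add up to the samples figure shown in the plot (assuming no sample weights or class weights are used).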
Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange