Decision Tree learning algorithm
-
29-09-2019 - |
Question
I want to preface this by saying that this is a homework assignment.
I am given a set of Q binary input variables that will be used to classify output of Y which is also binary.
The first part of the question is: at most how many examples do I need to enumarate all possibile combinations of Q? I am currently think that since it asks for at most I will need Q as it is possible that all values up to Q-1 are the same for instance 1 and the item at Q is 0 .
The second part of the question is: at most how many leaf nodes can the tree have given Z examples?
My current answer is that at most the tree would have 2 leaf nodes, one representing true and one representing false since it is dealing with binary inputs and binary outputs.
Is this the correct way of examining this problem or am I generalizing my answers too deeply?
Edit
After looking at Cameron's response, I would now turn my first answer into 2^Q and to build on his example of Q = 3, I would get 2^3 or 8 (2*2*2). Please correct if that is incorrect thinking.
Edit #2
The second part of the question it appears as though it should be (2^Q) * Z or to provide an example: (2^3) * 3) or 8*3 = 24 leaf nodes. To recap if I have 3 inputs that are binary I would initially take 2^3 and get 8 now I want to go over 3 examples. Therefore I should get 8*3 or 24.
Edit #3
In hindsight it seems that no matter how many examples I use the number of leaf nodes should never increase, as it is a per tree basis.
Solution
I'd suggest you approach the problem by working out small example cases by hand.
For the first part, choose a small value for Q
, say 3, and write down all possible combinations of Q
. Then you can figure out how many examples you need. Increase Q
and do it again.
For the second part of your question, pick a small Z
and run the decision tree algorithm by hand. See how many leaves you get. Then pick another Z
and see if/how it changes. Try generating different examples (with the same Z
) and see if you can change the number of leaves.