If you want to implement a decision tree from scratch, I recommend building it with classes. A tree is composed of nodes: each node contains further nodes recursively, and leaves are the terminal nodes. For a binary tree, these classes could look something like this:
class Node(object):
    def __init__(self):
        self.split_variable = None
        self.left_child = None
        self.right_child = None

    def get_name(self):
        return 'Node'


class Leaf(object):
    def __init__(self):
        self.value = None

    def get_name(self):
        return 'Leaf'
For the Node class: 'split_variable' holds the name of the variable used in the split (e.g. one of [a, t, g, c]), and 'left_child' and 'right_child' are new instances of Node or Leaf. The True/False presence of that variable maps to the left/right child respectively. (For a split on a continuous variable, as is typical in a regression tree, you'll need to add a fourth attribute to the Node class, 'split_value', and map less/more than this value to the left/right children.)
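The threshold variant mentioned in parentheses could be sketched roughly like this. `ThresholdNode`, `goes_left` and the column name `gc_content` are hypothetical names for illustration, and sending values less than or equal to the threshold to the left is just an assumed convention:

```python
class ThresholdNode(object):
    """Hypothetical node for a split on a continuous variable."""

    def __init__(self, split_variable, split_value):
        self.split_variable = split_variable  # name of the column to test
        self.split_value = split_value        # threshold used by the test
        self.left_child = None                # taken when value <= split_value
        self.right_child = None               # taken when value > split_value

    def get_name(self):
        return 'Node'

    def goes_left(self, row):
        # Assumed convention: less than or equal goes left, greater goes right.
        return row[self.split_variable] <= self.split_value


node = ThresholdNode('gc_content', 0.5)  # 'gc_content' is a made-up column name
print(node.goes_left({'gc_content': 0.42}))  # True
print(node.goes_left({'gc_content': 0.77}))  # False
```

Keeping `get_name()` returning 'Node' means a tree walker that only checks for 'Node' versus 'Leaf' would not need to change, apart from the comparison itself.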
For the Leaf class: 'value' holds the prediction assigned to that leaf for the target variable (i.e. the majority class for a discrete variable, or the mean for a continuous one).
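As a rough illustration of how a leaf's value could be computed from the target values of the training rows that end up in it, here is a minimal sketch. The helper name `leaf_value` and its `discrete` flag are assumptions of mine, not part of the classes above:

```python
from collections import Counter


def leaf_value(targets, discrete=True):
    """Return the value a leaf would store for the rows that reach it."""
    if discrete:
        # Majority vote: the most frequent class among the targets.
        return Counter(targets).most_common(1)[0][0]
    # Continuous target: arithmetic mean of the targets.
    return sum(targets) / float(len(targets))


print(leaf_value(['a', 't', 'a', 'g']))             # a
print(leaf_value([1.0, 2.0, 6.0], discrete=False))  # 3.0
```

How ties in the majority vote are broken is left to `Counter.most_common` here; you may want an explicit rule in a real implementation.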
To complete your implementation you'll need functions that walk the tree to evaluate and/or visualise it. These functions call themselves recursively until the whole path through the tree has been walked. This is where the get_name() methods of the classes come in, to tell nodes from leaves. How you implement this part really depends on how you store your data; I suggest using pandas DataFrames, which are like tables. A sample evaluate function could be (pseudocode):
def evaluate_tree(your_data, node):
    # your_data is a single row; a truthy value sends it to the left child
    if your_data[node.split_variable]:
        if node.left_child.get_name() == 'Node':
            # return the recursive result, otherwise the caller gets None
            return evaluate_tree(your_data, node.left_child)
        elif node.left_child.get_name() == 'Leaf':
            return node.left_child.value
    else:
        if node.right_child.get_name() == 'Node':
            return evaluate_tree(your_data, node.right_child)
        elif node.right_child.get_name() == 'Leaf':
            return node.right_child.value
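To see the pieces working together, here is a small self-contained sketch that builds a toy tree by hand, evaluates two rows and produces a text view of the tree. The classes are repeated so the snippet runs on its own; `make_leaf`, `predict` and `describe` are hypothetical helper names, plain dicts stand in for DataFrame rows, and `predict` uses a loop rather than recursion, which should give the same result:

```python
class Node(object):
    def __init__(self):
        self.split_variable = None
        self.left_child = None
        self.right_child = None

    def get_name(self):
        return 'Node'


class Leaf(object):
    def __init__(self):
        self.value = None

    def get_name(self):
        return 'Leaf'


def make_leaf(value):
    # Hypothetical helper to keep the example short.
    leaf = Leaf()
    leaf.value = value
    return leaf


def predict(row, node):
    # Walk down until a Leaf is reached; True goes left, False goes right.
    while node.get_name() == 'Node':
        node = node.left_child if row[node.split_variable] else node.right_child
    return node.value


def describe(node, depth=0):
    # Return the tree as indented text lines, one per node or leaf.
    pad = '  ' * depth
    if node.get_name() == 'Leaf':
        return [pad + 'Leaf: ' + str(node.value)]
    lines = [pad + 'Node: ' + node.split_variable]
    lines += describe(node.left_child, depth + 1)
    lines += describe(node.right_child, depth + 1)
    return lines


# Toy tree: split on 'a' first, then on 'g' when 'a' is absent.
root = Node()
root.split_variable = 'a'
root.left_child = make_leaf('yes')
root.right_child = Node()
root.right_child.split_variable = 'g'
root.right_child.left_child = make_leaf('maybe')
root.right_child.right_child = make_leaf('no')

print(predict({'a': True, 'g': False}, root))  # yes
print(predict({'a': False, 'g': True}, root))  # maybe
print('\n'.join(describe(root)))
```

The leaf labels ('yes', 'maybe', 'no') are made up for the example; in practice they would come from your training data.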
Good luck!