A tree structure is good. It may be easier to you allow more than two children for each node.
An integral might have three nodes: top and bottom limits, and integrand. You might have a special type of node to indicate an infinite limit.
You might want special types of nodes for matrices. While you could represent them as a list of lists, it may prove easier to use a 2D array of nodes. There is a certain trade off the more special types of node you have the more complex the code.
Some CAS systems (Maple) use a Directed Acyclic Graph this allows common sub expressions to be represented more efficiently. There is a considerably increased complexity in this.