Question

I need to store and then operate on a tree (add new nodes, search through it, etc.) in which every node is a pair of x, y coordinates. I found the ete2 module for working with trees, but I can't figure out how to store a node as a tuple or list of coordinates. Is this possible with ete2?

Edit:

I followed the tutorial at http://pythonhosted.org/ete2/tutorial/tutorial_trees.html#trees to create a simple tree:

t1 = Tree("(A:1,(B:1,(E:1,D:1):0.5):0.5);" )

where A, B, E and D are node names and each number is a distance.

or

t2 = Tree( "(A,B,(C,D));" )

I don't need names or distances, but a tree of tuples or lists, something like:

t3 = Tree("([12.01, 10.98], [15.65, 12.10],([21.32, 6.31], [14.53, 10.86]));")

But the last input produces a syntax error, and I couldn't find a similar example in the ete2 tutorials. As an alternative, I think I could save the coordinates as attributes, but attributes are stored as strings. I need to operate on the coordinates, and it's awkward to convert them from string to float and vice versa every time.


Solution

You can annotate ete trees using any type of data. Just give a name to every node, create a tree structure using such names, and annotate the tree with the coordinates.

from ete2 import Tree

name2coord = {
'a': [1, 1], 
'b': [1, 1], 
'c': [1, 0], 
'd': [0, 1], 
}

# Use format 1 to read node names of all internal nodes from the newick string
t = Tree('((a:1.1, b:1.2)c:0.9, d:0.8);', format=1)     

for n in t.get_descendants():
    n.add_features(coord=name2coord[n.name])

# Now you can operate with the tree and node coordinates in a very easy way: 
for leaf in t.iter_leaves():
    print leaf.name, leaf.coord
# a [1, 1]
# b [1, 1]
# d [0, 1]

print t.search_nodes(coord=[1,0])
# [Tree node 'c' (0x2ea635)]
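
As far as I can tell, search_nodes(coord=[1,0]) matches by exact equality, which can be brittle once the coordinates are measured floats such as 12.01 or 10.98. Below is a minimal sketch of a tolerance-based lookup. The helper find_near and the stand-in Node type are hypothetical names used for illustration and are not part of ete2; the function only assumes that each node may carry a coord attribute, as set by add_features above.

```python
import math
from collections import namedtuple


def find_near(nodes, target, tol=1e-9):
    """Return the nodes whose .coord lies within tol (Euclidean) of target."""
    matches = []
    for node in nodes:
        coord = getattr(node, "coord", None)
        if coord is None:
            continue  # skip nodes without coordinates, e.g. an unannotated root
        if math.hypot(coord[0] - target[0], coord[1] - target[1]) <= tol:
            matches.append(node)
    return matches


# Stand-in node type so the sketch runs without ete2 installed (hypothetical).
Node = namedtuple("Node", ["name", "coord"])
nodes = [
    Node("a", [12.01, 10.98]),
    Node("b", [15.65, 12.10]),
    Node("c", [21.32, 6.31]),
    Node("d", [14.53, 10.86]),
]

exact = find_near(nodes, (15.65, 12.10))          # default tolerance
nearby = find_near(nodes, (14.0, 11.0), tol=1.0)  # anything within 1 unit
```

With an annotated ete tree, the same helper should work by passing t.traverse() as the nodes argument, since it only reads the coord attribute.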

You can copy, save and restore annotated trees using pickle:

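t_copy = t.copy('cpickle')  # returns a copy of the tree, features included
# or
import cPickle
# Open pickle files in binary mode so this also works on Windows
with open('mytree.pkl', 'wb') as f:
    cPickle.dump(t, f)
with open('mytree.pkl', 'rb') as f:
    tree = cPickle.load(f)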
Licensed under: CC-BY-SA with attribution