For an application I'm working on I need to create lists from nested tuples, representing the data contained in each branch.
For reference the tuples represent a Huffman tree, an example is:
tree = (1.0, (0.5, (0.25, (0.125, 'd'),(0.125, 'c')), (0.25, 'b')), (0.5,'a'))
This was created from a Huffman routine with the following probabilities:
a:0.5, b:0.25, c:0.125, d:0.125
I would like to out put a list which looks like
[['a'],['b','c','d']]
I've tried the following code:
def makeList(tree):
if len(tree) == 2:
return [tree[0]]
else:
rightlist = []
leftlist = []
right = list(tree[1])
left = list(tree[2])
for i in range(1, len(right)):
rightlist.append(right[i])
for i in range(1, len(left)):
leftlist.append(left[i])
return [rightlist, leftlist]
However this returns
[['a'],[(0.25, (0.125, 'd'),(0.125,'c')),(0.25,'b')]
Which isn't quite what I want.
How could I go about modifying my code above to produce the output I want?
EDIT
I have produced some code which given a balanced input:
('a',0.25), ('b', 0.25), ('c', 0.25), ('d',0.25)
produces the output I want:
[['a','b'], ['c','d']]
def makeList(tree):
if len(tree) == 2:
print("I'm in here")
return [tree[1]]
else:
right = tree[1]
left = tree[2]
rightlist = []
leftlist = []
for i in range(0, len(right)):
if type(right[i]) == tuple:
print('right: ' + str(right[i]))
rightlist.append(right[i][1])
for i in range(0, len(left)):
if type(left[i]) == tuple:
print('left: ' + str(left[i]))
leftlist.append(left[i][1])
return [rightlist, leftlist]
However, it fails on the following inputs (output below):
exampleData = [(0.5, 'a'), (0.5,'b')]
[[],[[]]
exampleData = [(0.5, 'a'), (0.25,'b'), (0.25,'c')]
[[],['b'.'c']]
exampleData = [(0.5,'a'), (0.25,'b'), (0.125,'c'), (0.125,'d')]
[[]],['b',(0.125, 'd')]]
However, the gold-standard test that this needs to pass is creating these lists for random trees:
probs = np.random.dirichlet([1]*4).tolist()
indices = range(0,4)
exampleData = zip(probs, indices)
huffTree = makeHuffmanTree(exampleData)
groups = makeLists(groups)