Question

How would you aggregate the value at index 3 in the following list when a sublist has the same key at index 1 as another sublist?

 lst = [['aaa','key1','abc',4],['aaa','key2','abc',4],['ddd','key3','abc',4],['eas','key1','abc',4],['aaa','key1','abc',2],['aaa','key2','abc',10]]

I would like to aggregate the value at index 3 across only the sublists that have the same key. For example, the above list has key1 at index 1 across three sublists. I'd like to add the 4, 4, and 2 together.

Desired_List = [['aaa','key1','abc',10],['aaa','key2','abc',14],['ddd','key3','abc',4]]

The other items in the list are irrelevant.

Solution

Well this didn't come out too readable - but here's a pretty compact way with itertools.groupby and reduce:

from functools import reduce  # required on Python 3; harmless on Python 2.6+
from itertools import groupby
from operator import itemgetter as ig

[reduce(lambda x,y: x[:-1] + [x[-1] + y[-1]], g) for k,g in groupby(sorted(lst, key=ig(1)), ig(1))]
Out[26]: 
[['aaa', 'key1', 'abc', 10],
 ['aaa', 'key2', 'abc', 14],
 ['ddd', 'key3', 'abc', 4]]
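
The sorted call in the one-liner matters because groupby only merges consecutive items with equal keys. A minimal sketch of that behaviour, using a shortened version of the list (the names rows, unsorted_keys and sorted_keys are illustrative, not from the original answer):

```python
from itertools import groupby
from operator import itemgetter

# Shortened sample: 'key1' appears twice, but not next to each other.
rows = [['aaa', 'key1', 'abc', 4],
        ['aaa', 'key2', 'abc', 4],
        ['eas', 'key1', 'abc', 4]]

# Without sorting, 'key1' shows up as two separate groups.
unsorted_keys = [k for k, _ in groupby(rows, key=itemgetter(1))]

# After sorting on the key, each key forms exactly one group.
sorted_keys = [k for k, _ in groupby(sorted(rows, key=itemgetter(1)),
                                     key=itemgetter(1))]

print(unsorted_keys)  # ['key1', 'key2', 'key1']
print(sorted_keys)    # ['key1', 'key2']
```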

Things get better if you pull out the lambda into a helper function:

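def helper(agg,x):
    # Build a new list rather than doing agg[-1] += x[-1],
    # which would modify the sublists of lst in place.
    return agg[:-1] + [agg[-1] + x[-1]]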

[reduce(helper,g) for k,g in groupby(sorted(lst, key=ig(1)), ig(1))]
Out[30]: 
[['aaa', 'key1', 'abc', 10],
 ['aaa', 'key2', 'abc', 14],
 ['ddd', 'key3', 'abc', 4]]

Note that in Python 3 you'll need from functools import reduce, since reduce got banished from the builtins (sad face).
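
For reference, here is one way the pieces above might be put together as a self-contained Python 3 script. This is only a sketch: the function name aggregate_by_key and its key_index/value_index parameters are made up for illustration, and the merge step copies each row so that lst is left unchanged.

```python
from functools import reduce
from itertools import groupby
from operator import itemgetter

lst = [['aaa', 'key1', 'abc', 4], ['aaa', 'key2', 'abc', 4],
       ['ddd', 'key3', 'abc', 4], ['eas', 'key1', 'abc', 4],
       ['aaa', 'key1', 'abc', 2], ['aaa', 'key2', 'abc', 10]]


def aggregate_by_key(rows, key_index=1, value_index=3):
    """Sum rows[*][value_index] per distinct rows[*][key_index].

    Hypothetical helper, not part of the original answer.
    """
    get_key = itemgetter(key_index)

    def merge(agg, row):
        # Copy before adding so the input rows are never modified.
        merged = list(agg)
        merged[value_index] += row[value_index]
        return merged

    # groupby needs its input sorted on the same key it groups by.
    ordered = sorted(rows, key=get_key)
    return [reduce(merge, group) for _, group in groupby(ordered, key=get_key)]


print(aggregate_by_key(lst))
# [['aaa', 'key1', 'abc', 10], ['aaa', 'key2', 'abc', 14], ['ddd', 'key3', 'abc', 4]]
```

Because sorted is stable, the non-key fields of each result row come from the first sublist with that key in the original order.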

OTHER TIPS

Pretty ugly, but this works:

lst = [['aaa','key1','abc',4],['aaa','key2','abc',4],['ddd','key3','abc',4],['eas','key1','abc',4],['aaa','key1','abc',2],['aaa','key2','abc',10]]
newlst = []
searched = []
for i, sublist1 in enumerate(lst):  # include the last sublist too, in case its key is unique
    if sublist1[1] not in searched:
        searched.append(sublist1[1])
        total = 0
        for sublist2 in lst[i+1:]:
            if sublist1[1] == sublist2[1]:
                total += int(sublist2[3])
        newlst.append([sublist1[0], sublist1[1], sublist1[2], total + sublist1[3]])

print(newlst)

gives:

[['aaa', 'key1', 'abc', 10], ['aaa', 'key2', 'abc', 14], ['ddd', 'key3', 'abc', 4]]

In case the index of the key or the index of the value changes, I put both in variables so the code is easy to adapt:

lst = [['aaa','key1','abc',4],['aaa','key2','abc',4],['ddd','key3','abc',4],['eas','key1','abc',4],['aaa','key1','abc',2],['aaa','key2','abc',10]]
# Index of the key value to sort on
key_index = 1
# Index of the value to aggregate
value_index = 3

list_dict = {}

# Iterate over each sublist and uniquely identify it by its key value.
for sublist in lst:
    key = sublist[key_index]
    if key not in list_dict:
        # First occurrence: store a copy, so lst is not modified
        # and the first value is not counted twice.
        list_dict[key] = list(sublist)
    else:
        # Add the value of the list to the unique entry in the dict
        list_dict[key][value_index] += sublist[value_index]

# Now turn it into a list. This is not needed but I was trying to match the output
desired_list = list(list_dict.values())

Output (in insertion order on Python 3.7+; older versions may order the keys differently):

[['aaa', 'key1', 'abc', 10], ['aaa', 'key2', 'abc', 14], ['ddd', 'key3', 'abc', 4]]
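
If only the per-key totals are needed (the question says the other items are irrelevant), a shorter variant is possible. The following is a rough sketch using collections.defaultdict; the name totals is illustrative and not from any of the answers:

```python
from collections import defaultdict

lst = [['aaa', 'key1', 'abc', 4], ['aaa', 'key2', 'abc', 4],
       ['ddd', 'key3', 'abc', 4], ['eas', 'key1', 'abc', 4],
       ['aaa', 'key1', 'abc', 2], ['aaa', 'key2', 'abc', 10]]

# Missing keys start at int() == 0, so no "first occurrence" branch is needed.
totals = defaultdict(int)
for row in lst:
    totals[row[1]] += row[3]

print(dict(totals))  # {'key1': 10, 'key2': 14, 'key3': 4}
```

This makes a single pass over the list and needs no sorting, at the cost of dropping the non-key fields.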

lst = [['aaa','key1','abc',4],['aaa','key2','abc',4],['ddd','key3','abc',4],['eas','key1','abc',4],['aaa','key1','abc',2],['aaa','key2','abc',10]]
result = []
for item in lst:
    # Use the first three fields as a tuple key and map it to the value
    t = tuple(item[:3])
    d = {t:item[-1]}
    result.append(d)
print(result)
Licensed under: CC-BY-SA with attribution