Question

I have a list that looks like

[(1,2,5),(2,10,13),(5,24,56),(1,8,10),(2,3,11)]

How can I produce a dictionary by grouping by first element of tuples and finding min element in second elements and max element in third elements:

{1:(2,10),2:(3,13),5:{24,56}]
Was it helpful?

Solution

You can, if you first sort on the grouping element, then use itertools.groupby() to group your elements and test for minimum and maximum values. Because the groups are generators, you need to jump through an extra hoop to turn that into a list you can reuse for the min() and max() functions:

from itertools import groupby
from operator import itemgetter

result = {k: (min(item[1] for item in gv), max(item[2] for item in gv))
         for k, g in groupby(sorted(inputlist, key=itemgetter(0)), itemgetter(0))
         for gv in (list(g),)}

Note that this sorts (O(NlogN)) then loops over each group twice to find the minimum and maximum of each group, adding another 2N to the total.

The additional for gv in (list(g),) loop assigns a list to gv containing all elements from the g group.

The simple loop version would be:

result = {}
for key, v1, v2 in inputlist:
    minimum, maximum = result.get(key, (float('inf'), float('-inf')))
    if v1 < minimum:
        minimum = v1
    if v2 > maximum:
        maximum = v2
    result[key] = (minimum, maximum)

This is a straightforward O(N) loop, and more readable to boot.

Demo of the two approaches:

>>> from itertools import groupby
>>> from operator import itemgetter
>>> inputlist = [(1,2,5),(2,10,13),(5,24,56),(1,8,10),(2,3,11)]
>>> {k: (min(item[1] for item in gv), max(item[2] for item in gv))
...          for k, g in groupby(sorted(inputlist, key=itemgetter(0)), itemgetter(0))
...          for gv in (list(g),)}
{1: (2, 10), 2: (3, 13), 5: (24, 56)}

and

>>> result = {}
>>> for key, v1, v2 in inputlist:
...     minimum, maximum = result.get(key, (float('inf'), float('-inf')))
...     if v1 < minimum:
...         minimum = v1
...     if v2 > maximum:
...         maximum = v2
...     result[key] = (minimum, maximum)
... 
>>> result
{1: (2, 10), 2: (3, 13), 5: (24, 56)}

OTHER TIPS

In [9]: ll = [(1,2,5),(2,10,13),(5,24,56),(1,8,10),(2,3,11)]

In [10]: {k[0]: ( min([kk[1] for kk in ll if kk[0] == k[0]]), max([kk[2] for kk in ll if kk[0] == k[0]]) ) for k in ll}
Out[10]: {1: (2, 10), 2: (3, 13), 5: (24, 56)}

or

In [11]: {k[0]: (v[0], v[-1]) for k in ll for v in [sorted([x for y in [[kk[1], kk[2]] for kk in ll if kk[0] == k[0] ] for  x in y])] }
Out[11]: {1: (2, 10), 2: (3, 13), 5: (24, 56)}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top