Consider a list of dicts:

items = [
    {'a': 1, 'b': 9, 'c': 8},
    {'a': 1, 'b': 5, 'c': 4},
    {'a': 2, 'b': 3, 'c': 1},
    {'a': 2, 'b': 7, 'c': 9},
    {'a': 3, 'b': 8, 'c': 2}
]

Is there a Pythonic way to extract and group these items by their 'a' field, such that:

result = {
    1: [{'b': 9, 'c': 8}, {'b': 5, 'c': 4}],
    2: [{'b': 3, 'c': 1}, {'b': 7, 'c': 9}],
    3: [{'b': 8, 'c': 2}]
}

References to any similar Pythonic constructs are appreciated.

Solution

Use itertools.groupby:

>>> from itertools import groupby
>>> from operator import itemgetter
>>> {k: list(g) for k, g in groupby(items, itemgetter('a'))}
{1: [{'a': 1, 'c': 8, 'b': 9},
     {'a': 1, 'c': 4, 'b': 5}],
 2: [{'a': 2, 'c': 1, 'b': 3},
     {'a': 2, 'c': 9, 'b': 7}],
 3: [{'a': 3, 'c': 2, 'b': 8}]}

If the items are not already sorted by 'a', you can either sort them first and then use groupby (which only groups consecutive items with equal keys), or use collections.OrderedDict (if order matters) or collections.defaultdict to group them in O(N) time:

>>> from collections import OrderedDict
>>> d = OrderedDict()
>>> for item in items:
...     d.setdefault(item['a'], []).append(item)
...     
>>> dict(d.items())
{1: [{'a': 1, 'c': 8, 'b': 9},
     {'a': 1, 'c': 4, 'b': 5}],
 2: [{'a': 2, 'c': 1, 'b': 3},
     {'a': 2, 'c': 9, 'b': 7}],
 3: [{'a': 3, 'c': 2, 'b': 8}]}
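The collections.defaultdict variant mentioned above is not shown, so here is a minimal sketch of it. It assumes Python 3.7+, where a plain dict preserves insertion order; the input is deliberately unsorted, and the name grouped is only illustrative.

```python
from collections import defaultdict

# Deliberately unsorted input: no sorting is needed for this approach.
items = [
    {'a': 2, 'b': 3, 'c': 1},
    {'a': 1, 'b': 9, 'c': 8},
    {'a': 2, 'b': 7, 'c': 9},
    {'a': 1, 'b': 5, 'c': 4},
    {'a': 3, 'b': 8, 'c': 2},
]

# Missing keys start out as an empty list, so append always works.
grouped = defaultdict(list)
for item in items:
    grouped[item['a']].append(item)

# Convert back to a plain dict so later lookups of missing keys raise KeyError.
grouped = dict(grouped)
```

Each item is visited once, so this is a single O(N) pass, and the groups appear in the order in which each key was first seen.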

Update:

I see that you want only the keys that were not used for grouping to be returned. For that, you'll need something like this:

>>> group_keys = {'a'}
>>> {k: [{f: d[f] for f in d if f not in group_keys} for d in g]
...  for k, g in groupby(items, itemgetter(*group_keys))}
{1: [{'c': 8, 'b': 9},
     {'c': 4, 'b': 5}],
 2: [{'c': 1, 'b': 3},
     {'c': 9, 'b': 7}],
 3: [{'c': 2, 'b': 8}]}

Other tips

Note: This code assumes that the data is already sorted by a. If it is not, we have to sort it first.

from itertools import groupby
print({key: list(grp) for key, grp in groupby(items, key=lambda x: x["a"])})

Output

{1: [{'a': 1, 'b': 9, 'c': 8}, {'a': 1, 'b': 5, 'c': 4}],
 2: [{'a': 2, 'b': 3, 'c': 1}, {'a': 2, 'b': 7, 'c': 9}],
 3: [{'a': 3, 'b': 8, 'c': 2}]}
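The sortedness assumption matters because groupby only merges adjacent items with equal keys. The following sketch, using a small made-up list (unsorted_items is hypothetical, not from the question), shows what happens when the input is not sorted and how sorting first fixes it.

```python
from itertools import groupby
from operator import itemgetter

by_a = itemgetter('a')

# Hypothetical unsorted input: the two a == 1 items are not adjacent.
unsorted_items = [
    {'a': 1, 'b': 9},
    {'a': 2, 'b': 3},
    {'a': 1, 'b': 5},
]

# groupby starts a new group every time the key changes,
# so key 1 shows up in two separate runs.
runs = [(k, list(g)) for k, g in groupby(unsorted_items, by_a)]

# In a dict comprehension the second run for key 1 overwrites the first,
# silently dropping {'a': 1, 'b': 9}.
lossy = {k: list(g) for k, g in groupby(unsorted_items, by_a)}

# Sorting by the same key first makes equal keys adjacent.
correct = {k: list(g) for k, g in groupby(sorted(unsorted_items, key=by_a), by_a)}
```

Because sorted is stable, items within each group keep their original relative order.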

To get the result in the same format you asked for:

from itertools import groupby
from operator import itemgetter
a_getter, getter, keys = itemgetter("a"), itemgetter("b", "c"), ("b", "c")

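# Rebuild a dict containing only the "b" and "c" fields of a single item
def recon_dicts(item):
    return dict(zip(keys, getter(item)))

# list() is needed on Python 3, where map is lazy and each group is
# invalidated as soon as groupby advances to the next key
{key: list(map(recon_dicts, grp)) for key, grp in groupby(items, key=a_getter)}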

Output

{1: [{'c': 8, 'b': 9}, {'c': 4, 'b': 5}],
 2: [{'c': 1, 'b': 3}, {'c': 9, 'b': 7}],
 3: [{'c': 2, 'b': 8}]}

If the data is not already sorted, you can either use the defaultdict method in this answer, or use the sorted function to sort by a first, like this:

{key: list(map(recon_dicts, grp))
   for key, grp in groupby(sorted(items, key=a_getter), key=a_getter)}
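As an alternative to sorting, the requested format can also be produced in a single pass. This is only a sketch assuming Python 3; the helper name group_without_key is made up for illustration and does not come from the answers above.

```python
from collections import defaultdict

def group_without_key(items, field):
    """Group dicts by item[field], omitting that field from each grouped dict."""
    grouped = defaultdict(list)
    for item in items:
        # Copy every entry except the one used as the grouping key.
        rest = {k: v for k, v in item.items() if k != field}
        grouped[item[field]].append(rest)
    return dict(grouped)

items = [
    {'a': 1, 'b': 9, 'c': 8},
    {'a': 1, 'b': 5, 'c': 4},
    {'a': 2, 'b': 3, 'c': 1},
    {'a': 2, 'b': 7, 'c': 9},
    {'a': 3, 'b': 8, 'c': 2},
]

result = group_without_key(items, 'a')
```

Unlike the itemgetter("b", "c") version, this does not need the remaining field names to be listed in advance, and the original dicts are left unmodified.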

References:

  1. operator.itemgetter

  2. itertools.groupby

  3. zip, map, dict, sorted

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow