Question

I have two lists of dictionaries in Python 2.6, and I want to merge them so that, for each value of one key, only the dictionary with the highest value of another key is kept. The lists are like this:

[{shape: square, color: red, priority: 2},
{shape: circle, color: blue, priority: 2},
{shape: triangle, color: green, priority: 2}]

[{shape: square, color: green, priority: 3},
{shape: circle, color: red, priority: 1}]

I'm trying to get an output like this:

[{shape: square, color: green, priority: 3},
{shape: circle, color: blue, priority: 2},
{shape: triangle, color: green, priority: 2}]

(The order of the items is not important.)

In other words, I'd like to go through both lists and, for each value of 'shape', keep the dictionary ('color', 'shape', and 'priority') whose 'priority' is the highest.

I've been searching and trying different things on SO for a few days on and off, and I'm finally giving in to ask. I've tried various versions of max, key, lambda, etc. but all the threads I can find here don't seem to be what I'm looking for.

Thanks in advance!

Solution

Merge the two lists, sort the result by priority, and store each dict in a new dictionary keyed by shape:

li1=[{'shape': 'square', 'color': 'red', 'priority': 2},
{'shape': 'circle', 'color': 'blue', 'priority': 2},
{'shape': 'triangle', 'color': 'green', 'priority': 2}]

li2=[{'shape': 'square', 'color': 'green', 'priority': 3},
{'shape': 'circle', 'color': 'red', 'priority': 1}]

res={}
for di in sorted(li1+li2, key=lambda d: d['priority']):
    res[di['shape']]=di

print res.values()  

Prints:

[{'color': 'blue', 'priority': 2, 'shape': 'circle'}, 
 {'color': 'green', 'priority': 3, 'shape': 'square'}, 
 {'color': 'green', 'priority': 2, 'shape': 'triangle'}]

Since a dictionary has unique keys, a later item with a given shape replaces any earlier item with the same shape. Because the items are sorted by ascending priority, {'shape': 'square', 'color': 'red', 'priority': 2} in the res dictionary is replaced by {'shape': 'square', 'color': 'green', 'priority': 3} since 3 > 2, and so on.

You can do all of this in a single line with a dict comprehension in Python 2.7+:

{di['shape']:di for di in sorted(li1+li2, key=lambda d: d['priority'])}.values()
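
Dict comprehensions were added in Python 2.7, so on Python 2.6 (the version in the question) the one-liner above is a syntax error. A minimal sketch of an equivalent, assuming the same li1 and li2 as above, passes a generator of (key, value) pairs to dict() instead. The names by_shape and merged are chosen here for illustration only:

```python
li1 = [{'shape': 'square', 'color': 'red', 'priority': 2},
       {'shape': 'circle', 'color': 'blue', 'priority': 2},
       {'shape': 'triangle', 'color': 'green', 'priority': 2}]

li2 = [{'shape': 'square', 'color': 'green', 'priority': 3},
       {'shape': 'circle', 'color': 'red', 'priority': 1}]

# dict() accepts an iterable of (key, value) pairs; later pairs overwrite
# earlier ones, so the highest-priority dict for each shape survives.
by_shape = dict((di['shape'], di)
                for di in sorted(li1 + li2, key=lambda d: d['priority']))

# list() gives a real list on Python 3 too, where values() returns a view.
merged = list(by_shape.values())
```

The behaviour should match the loop version: the sort is ascending, so the last dict written for each shape is the one with the highest priority.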

OTHER TIPS

Here is a plan. It assumes you don't care about the order of the dicts, but you can modify it to preserve order.

Let's look at what we have. First, it doesn't matter which list a resulting dict comes from, so we can just chain the lists. Second, from every group of dicts with the same shape we select exactly one. So we need to group all dicts by shape and then select the dict with the highest priority from each group.

The obvious way would be to group with collections.defaultdict and then use max in a list comprehension to select the best dict in each group. A slightly trickier way would be to sort by shape and negated priority, group by shape with itertools.groupby, and then take the first element of each group:

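from itertools import chain, groupby

# list1 and list2 are the two input lists from the question.
# Sort by shape, then by descending priority within each shape.
sorted_dicts = sorted(chain(list1, list2),
                      key=lambda d: (d['shape'], -d['priority']))
groups = groupby(sorted_dicts, key=lambda d: d['shape'])
# The first dict in each group has the highest priority for that shape.
merged = [next(g) for _, g in groups]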
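
The first approach mentioned above (group with collections.defaultdict, then pick with max) is described but not shown. A minimal sketch, assuming list1 and list2 hold the data from the question and using groups and merged as illustrative names, might look like this:

```python
from collections import defaultdict

list1 = [{'shape': 'square', 'color': 'red', 'priority': 2},
         {'shape': 'circle', 'color': 'blue', 'priority': 2},
         {'shape': 'triangle', 'color': 'green', 'priority': 2}]

list2 = [{'shape': 'square', 'color': 'green', 'priority': 3},
         {'shape': 'circle', 'color': 'red', 'priority': 1}]

# Collect every dict under its shape.
groups = defaultdict(list)
for d in list1 + list2:
    groups[d['shape']].append(d)

# From each group keep the dict with the highest priority.
merged = [max(group, key=lambda d: d['priority'])
          for group in groups.values()]
```

This avoids sorting altogether. One difference to be aware of: when two dicts of the same shape have equal priority, max keeps the first one it encounters, whereas the sort-and-overwrite approach keeps the last.
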
Licensed under: CC-BY-SA with attribution