Question

I'm trying to turn the following code into something more readable.

for x in list_of_dicts:
    for y in header:
        if y not in x.keys():
            x[y] = ''

It takes a list of dictionaries and adds key:value pairs with the default value = '' for any
keys that do not already exist in the current dictionary.

I'm still new to Python, so any help would be greatly appreciated. I tried:

return [x[y] = '' for x in list_of_dicts for y in header if y not in x.keys()]  

But I'm thinking you can't have "=" inside a list comprehension.


Solution

This is not a problem you should solve with a list comprehension. You can improve on your existing code using some set operations:

for x in list_of_dicts:
    x.update((y, '') for y in header.viewkeys() - x)

This achieves the same effect: keys from header that are missing are added with empty strings as values. For Python 3, replace viewkeys() with keys().

This makes use of dictionary view objects to give us set-like views on the dictionary keys; in Python 3, keys() returns such a view by default.

If I read your question wrong and header is not a dictionary as well, make it an explicit set to get the same benefits:

header_set = set(header)
for x in list_of_dicts:
    x.update((y, '') for y in header_set.difference(x))

Using set operations makes the code more readable and efficient, pushing any loops to determine the set difference into optimized C routines.
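
For reference, here is a minimal Python 3 sketch of the set-difference approach. It assumes header is a plain list of key names (the question does not say), and the helper name fill_missing, its default parameter, and the sample data are made up for illustration:

```python
def fill_missing(list_of_dicts, header, default=''):
    """Add every key from header that a dict lacks, in place."""
    # Build the set once so each dict only pays for the difference.
    header_set = set(header)
    for row in list_of_dicts:
        # set.difference accepts any iterable; iterating a dict yields its keys.
        row.update((key, default) for key in header_set.difference(row))


rows = [{'a': 1}, {'b': 2, 'x': 'kept'}]
fill_missing(rows, ['x', 'y'])
print(rows)
```

Existing values such as 'kept' are left alone because their keys never appear in the difference. The order in which the missing keys are added follows set iteration order, so it may differ from the order in header.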

Other tips

You can't use dict comprehension to add items to a dict; dict comprehension creates a new dict separate from pre-existing ones, and if you want to combine the new and the old, you have to do so explicitly, such as:

for x in list_of_dicts:
    x.update({y: '' for y in header if y not in x})

(Note that y not in x.keys() is unnecessary when dealing with dicts, since you can just do y not in x.)

If you're dead-set on getting rid of that outer for, the way to do it is to create a new list of new dicts:

list_of_dicts2 = [dict(x, **{y: '' for y in header if y not in x}) for x in list_of_dicts]
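
To make the difference from the in-place loop concrete, here is a small sketch with made-up sample data. The comprehension builds new dicts and leaves the originals untouched, so any other reference to the original dicts will not see the added keys:

```python
header = ['x', 'y']
list_of_dicts = [{'a': 1}, {'x': 'kept'}]   # sample data, not from the question

# Each element is a brand-new dict: the original items plus the missing defaults.
list_of_dicts2 = [dict(x, **{y: '' for y in header if y not in x})
                  for x in list_of_dicts]

print(list_of_dicts2)   # [{'a': 1, 'x': '', 'y': ''}, {'x': 'kept', 'y': ''}]
print(list_of_dicts)    # [{'a': 1}, {'x': 'kept'}]  (originals unchanged)
```

The printed key order assumes Python 3.7 or later, where dicts preserve insertion order.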

There are many ways you can do this better, primarily by thinking more carefully about what you are trying to do.

And what are you trying to do? You can think of it this way: you want to add defaults to some dicts. The dict.setdefault() method immediately comes to mind:

for d in list_of_dicts:
    for h in header:
        d.setdefault(h, '')

You can also think of it slightly differently: there is a set of defaults that needs to be applied to all of the dicts. Now constructing a defaults dict first and then merging it in feels natural:

defaults = dict.fromkeys(header, '')
list_of_dicts = [dict(defaults, **d) for d in list_of_dicts]

Note that we are rebuilding each dict here, not updating it in place. This is the right approach when using comprehensions. It will probably also make sense to merge that last line into the code that constructs list_of_dicts in the first place (I can't say for sure without seeing it).
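
One caveat worth sketching: dict(defaults, **d) passes d as keyword arguments, so in Python 3 it only works when every key of d is a string. The header and rows below are hypothetical and include a non-string key purely to show the failure; the {**defaults, **d} spelling (Python 3.5+) is a possible substitute, not something from the original answer:

```python
header = ['name', 2024]               # hypothetical header with one non-string key
defaults = dict.fromkeys(header, '')
rows = [{'name': 'a'}, {2024: 10}]    # made-up sample data

try:
    dict(defaults, **rows[1])         # keyword arguments must be strings
except TypeError as exc:
    print('dict(defaults, **d) failed:', exc)

# Unpacking inside a dict display has no such restriction on key types.
merged = [{**defaults, **d} for d in rows]
print(merged)   # [{'name': 'a', 2024: ''}, {'name': '', 2024: 10}]
```

In both spellings the values from d win over the defaults, because d is applied last.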

You can use a list comprehension for this, but you shouldn't:

[x.setdefault(y, '') for x in list_of_dicts for y in header]

The reason you shouldn't is that this creates a big old list that you don't need but that takes time and memory.

You can consume a generator comprehension without creating a big old list:

import collections

def consume(iterator):
    # A zero-length deque discards each item, exhausting the iterator without storing anything.
    collections.deque(iterator, maxlen=0)

consume(x.setdefault(y, '') for x in list_of_dicts for y in header)

Arguably you shouldn't do this either, since readers don't really expect comprehensions to have side-effects so the code might frighten and confuse them.
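
A short sketch of why this can surprise readers, using made-up sample data: building the generator changes nothing, and the dicts are only modified once something consumes it.

```python
import collections


def consume(iterator):
    # A zero-length deque throws each item away as it arrives.
    collections.deque(iterator, maxlen=0)


list_of_dicts = [{'a': 1}, {'x': 0}]   # made-up sample data
header = ['x', 'y']

gen = (x.setdefault(y, '') for x in list_of_dicts for y in header)
before = [dict(d) for d in list_of_dicts]
print(before)          # [{'a': 1}, {'x': 0}]  (no setdefault call has run yet)

consume(gen)
print(list_of_dicts)   # [{'a': 1, 'x': '', 'y': ''}, {'x': 0, 'y': ''}]
```

Note that setdefault keeps the existing 0 under 'x'; it only fills in keys that are absent.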

You are correct that you can't do x[y] = '' in a comprehension, since it is a statement not an expression. It just so happens that x.setdefault(y, '') does what you want, but if there was no such convenient function then you could write one. And come to think of it, by doing that you can eliminate the comprehension as well as your original loop:

import itertools

def set_default(x, y):
    # Same effect as x.setdefault(y, ''), written out as a statement.
    if y not in x:
        x[y] = ''

consume(itertools.starmap(set_default, itertools.product(list_of_dicts, header)))

Again though, some kind of warning about using generators for their side-effects should probably apply.
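
Put together as one self-contained script, the starmap/product variant might look like this. The sample data is made up, and the parameter names d and key are just more descriptive stand-ins for x and y:

```python
import collections
import itertools


def consume(iterator):
    # Exhaust the iterator without keeping any of its items.
    collections.deque(iterator, maxlen=0)


def set_default(d, key):
    if key not in d:
        d[key] = ''


list_of_dicts = [{'a': 1}, {'b': 2, 'x': None}]   # made-up sample data
header = ['x', 'y']

# product yields (dict, key) pairs; starmap unpacks each pair into set_default.
# set_default is not called until consume pulls the pairs through.
pairs = itertools.product(list_of_dicts, header)
consume(itertools.starmap(set_default, pairs))

print(list_of_dicts)
# [{'a': 1, 'x': '', 'y': ''}, {'b': 2, 'x': None, 'y': ''}]
```

It works, but the plain nested for loop says the same thing with less machinery.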

>>> d1={'a':1}
>>> d2={'b':2}
>>> d3={'c':3}
>>> listofdict=[d1, d2, d3]
>>> listofdict
[{'a': 1}, {'b': 2}, {'c': 3}]
>>> header = ['x', 'y']
>>> header
['x', 'y']
>>> [x.update({k: ''}) for x in listofdict for k in header if k not in x]
[None, None, None, None, None, None]
>>> listofdict
[{'a': 1, 'x': '', 'y': ''}, {'y': '', 'x': '', 'b': 2}, {'y': '', 'x': '', 'c': 3}]
>>> d1
{'a': 1, 'x': '', 'y': ''}
>>> d2
{'y': '', 'x': '', 'b': 2}
>>> d3
{'y': '', 'x': '', 'c': 3}
>>> 
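
When filling in defaults like this, a truthiness test such as not x.get(k) and a membership test such as k not in x are not interchangeable: the first also overwrites keys that exist but hold a falsy value (0, None, '' and so on). A small sketch with made-up data:

```python
header = ['x', 'y']

by_truthiness = {'x': 0}
by_membership = {'x': 0}

for k in header:
    # Truthiness test: an existing 0 counts as "missing" and gets replaced.
    if not by_truthiness.get(k):
        by_truthiness[k] = ''
    # Membership test: only keys that are really absent get the default.
    if k not in by_membership:
        by_membership[k] = ''

print(by_truthiness)   # {'x': '', 'y': ''}
print(by_membership)   # {'x': 0, 'y': ''}
```

If the goal is only to add keys that do not exist yet, the membership test (or setdefault) is the safer choice.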
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow