I'm transforming fields from an XML document so I can load them into a normal relational DB. I've converted the XML document into a set of nested dictionaries. Some of the values I want to extract sit inside nested dictionaries, so I need to flatten the structure first.

Easy enough, but I'd like to create a mapping that lets me specify upfront what to extract.

Example

input_dict = {
'authors': [{'name': u'Google, Inc.'}],
'islink': False,
}

mapping = ['islink',<???>]

Desired output

In: tuple(input_dict[key] for key in mapping)
Out: (False, 'Google, Inc.')

This obviously doesn't work:

In: [input_dict[key] for key in ['islink',['authors'][0]['name']]]
Out: TypeError: string indices must be integers, not str

Solution

What about this:

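# Each entry is the path of keys and list indices leading to one value.
indices = [['islink'], ['authors', 0, 'name']]
result = []
for index in indices:
    value = input_dict
    # Walk down the nested structure one key or index at a time.
    for single_index in index:
        value = value[single_index]
    result.append(value)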
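
The same path-walking idea can be wrapped in a small helper that returns the tuple asked for in the question. This is only a sketch: the function name `extract` is made up for illustration, and it assumes every path in the mapping exists in the input (a missing key or index raises `KeyError` or `IndexError`).

```python
from functools import reduce
from operator import getitem


def extract(data, paths):
    """Return a tuple holding the value found at each path in `paths`.

    Each path is a sequence of dict keys and list indices, applied
    left to right starting from `data`.
    """
    return tuple(reduce(getitem, path, data) for path in paths)


input_dict = {'authors': [{'name': 'Google, Inc.'}], 'islink': False}
mapping = [('islink',), ('authors', 0, 'name')]

print(extract(input_dict, mapping))  # (False, 'Google, Inc.')
```

Because the mapping is plain data, it can be defined once up front and reused for every record.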

Other tips

And what about:

from collections.abc import Iterable

def flatten(x):
    result = []
    if isinstance(x, dict):
        x = x.values()
    for el in x:
        if isinstance(el, Iterable) and not isinstance(el, str):
            result.extend(flatten(el))
        else:
            result.append(el)
    return result

which, this time, is Python 3 friendly ;-)

>>> dd = {'a': 42, 'c': 12, 'b': [{1: 2, 2: 3, 3: 4}]}
>>> flatten(dd)
[42, 12, 2, 3, 4]

here's a version that supports key filtering:

def flatten(x, keys=None):
    result = []
    if isinstance(x, dict):
        if keys is None:
            x = x.values()
        else:
            x = [value for key, value in x.items() if key in keys]
    for el in x:
        if isinstance(el, Iterable) and not isinstance(el, str):
            result.extend(flatten(el, keys))
        else:
            result.append(el)
    return result

results:

>>> flatten(dd, keys=['a'])
[42]
>>> flatten(dd, keys=['a','b']) # even though 'b' is selected, no subkey has been selected
[42]
>>> flatten(dd, keys=['a','b',1]) # to get a subkey, 'b' needs to be selected
[42, 2]
>>> flatten(dd,keys=['a',1]) # if you don't then there's no subkey selected
[42]
>>> flatten(dd, keys=['a','b',2,3])
[42, 3, 4]

and for your use case:

>>> input_dict = {'authors': [{'name': 'Google, Inc.'}], 'islink': False}
>>> flatten(input_dict)  # on Python 3.7+, values follow the dict's insertion order
['Google, Inc.', False]
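
If the flattened values need to be picked out by an upfront mapping, as the question asks, one option is to keep the path to each leaf while flattening. The following is a rough sketch under that assumption, not part of the original answer; the name `iter_paths` is hypothetical, and only dicts, lists and tuples are treated as containers.

```python
def iter_paths(obj, prefix=()):
    """Yield (path, leaf) pairs; path is a tuple of keys and list indices."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, (list, tuple)):
        items = enumerate(obj)
    else:
        # Anything else (str, bool, int, ...) is treated as a leaf value.
        yield prefix, obj
        return
    for key, value in items:
        yield from iter_paths(value, prefix + (key,))


input_dict = {'authors': [{'name': 'Google, Inc.'}], 'islink': False}

# Flat lookup table keyed by the full path to each leaf.
flat = dict(iter_paths(input_dict))

# The mapping decides both which values are extracted and their order.
mapping = [('islink',), ('authors', 0, 'name')]
row = tuple(flat[path] for path in mapping)
print(row)  # (False, 'Google, Inc.')
```

Unlike the plain `flatten`, the result does not depend on the order in which the dictionary happens to store its keys.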

N.B.: I adapted my answer from that answer about list flattening

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow