Question

I've found how to split a delimited string into key:value pairs in a dictionary elsewhere, but I have an incoming string that also includes two parameters that amount to dictionaries themselves: parameters with one or three key:value pairs inside:

clientid=b59694bf-c7c1-4a3a-8cd5-6dad69f4abb0&keyid=987654321&userdata=ip:192.168.10.10,deviceid:1234,optdata:75BCD15&md=AMT-Cam:avatar&playbackmode=st&ver=6&sessionid=&mk=PC&junketid=1342177342&version=6.7.8.9012

Obviously these are dummy parameters to obfuscate proprietary code, here. I'd like to dump all this into a dictionary with the userdata and md keys' values being dictionaries themselves:

requestdict {'clientid' : 'b59694bf-c7c1-4a3a-8cd5-6dad69f4abb0', 'keyid' : '987654321', 'userdata' : {'ip' : '192.168.10.10', 'deviceid' : '1234', 'optdata' : '75BCD15'}, 'md' : {'Cam' : 'avatar'}, 'playbackmode' : 'st', 'ver' : '6', 'sessionid' : '', 'mk' : 'PC', 'junketid' : '1342177342', 'version' : '6.7.8.9012'}

Can I take the slick two-level delimitation parsing command that I've found:

requestDict = dict(line.split('=') for line in clientRequest.split('&'))

and add a third level to it to handle & preserve the 2nd-level dictionaries? What would the syntax be? If not, I suppose I'll have to split by & and then check & handle splits that contain : but even then I can't figure out the syntax. Can someone help? Thanks!

Was it helpful?

Solution

I basically took Kyle's answer and made it more future-friendly:

def dictelem(input):   
    parts   = input.split('&')
    listing = [part.split('=') for part in parts]

    result = {}
    for entry in listing:
        head, tail = entry[0], ''.join(entry[1:])
        if ':' in tail:
            entries = tail.split(',')
            result.update({ head : dict(e.split(':') for e in entries) })
        else:
            result.update({head: tail})

    return result

OTHER TIPS

Here's a two-liner that does what I think you want:

dictelem = lambda x: x if ':' not in x[1] else [x[0],dict(y.split(':') for y in x[1].split(','))]
a = dict(dictelem(x.split('=')) for x in input.split('&'))

Can I take the slick two-level delimitation parsing command that I've found:

requestDict = dict(line.split('=') for line in clientRequest.split('&'))

and add a third level to it to handle & preserve the 2nd-level dictionaries?

Of course you can, but (a) you probably don't want to, because nested comprehensions beyond two levels tend to get unreadable, and (b) this super-simple syntax won't work for cases like yours, where only some of the data can be turned into a dict.

For example, what should happen with 'PC'? Do you want to make that into {'PC': None}? Or maybe the set {'PC'}? Or the list ['PC']? Or just leave it alone? You have to decide, and write the logic for that, and trying to write it as an expression will make your decision very hard to read.

So, let's put that logic in a separate function:

def parseCommasAndColons(s):
    bits = [bit.split(':') for bit in s.split(',')]
    try:
        return dict(bits)
    except ValueError:
        return bits

This will return a dict like {'ip': '192.168.10.10', 'deviceid': '1234', 'optdata': '75BCD15'} or {'AMT-Cam': 'avatar'} for cases where each comma-separated component has a colon inside it, but a list like ['1342177342'] for cases where any of them don't.

Even this may be a little too clever; I might make the "is this in dictionary format" check more explicit instead of just trying to convert the list of lists and see what happens.

Either way, how would you put that back into your original comprehension?

Well, you want to call it on the value in the line.split('='). So let's add a function for that:

def parseCommasAndColonsForValue(keyvalue):
    if len(keyvalue) == 2:
        return keyvalue[0], parseCommasAndColons(keyvalue[1])
    else:
        return keyvalue

requestDict = dict(parseCommasAndColonsForValue(line.split('=')) 
                   for line in clientRequest.split('&'))

One last thing: Unless you need to run on older versions of Python, you shouldn't often be calling dict on a generator expression. If it can be rewritten as a dictionary comprehension, it will almost certainly be clearer that way, and if it can't be rewritten as a dictionary comprehension, it probably shouldn't be a 1-liner expression in the first place.

Of course breaking expressions up into separate expressions, turning some of them into statements or even functions, and naming them does make your code longer—but that doesn't necessarily mean worse. About half of the Zen of Python (import this) is devoted to explaining why. Or one quote from Guido: "Python is a bad language for code golf, on purpose."

If you really want to know what it would look like, let's break it into two steps:

>>> {k: [bit2.split(':') for bit2 in v.split(',')] for k, v in (bit.split('=') for bit in s.split('&'))}
{'clientid': [['b59694bf-c7c1-4a3a-8cd5-6dad69f4abb0']],
 'junketid': [['1342177342']],
 'keyid': [['987654321']],
 'md': [['AMT-Cam', 'avatar']],
 'mk': [['PC']],
 'playbackmode': [['st']],
 'sessionid': [['']],
 'userdata': [['ip', '192.168.10.10'],
              ['deviceid', '1234'],
              ['optdata', '75BCD15']],
 'ver': [['6']],
 'version': [['6.7.8.9012']]}

That illustrates why you can't just add a dict call for the inner level—because most of those things aren't actually dictionaries, because they had no colons. If you changed that, then it would just be this:

{k: dict(bit2.split(':') for bit2 in v.split(',')) for k, v in (bit.split('=') for bit in s.split('&'))}

I don't think that's very readable, and I doubt most Python programmers would. Reading it 6 months from now and trying to figure out what I meant would take a lot more effort than writing it did.

And trying to debug it will not be fun. What happens if you run that on your input, with missing colons? ValueError: dictionary update sequence element #0 has length 1; 2 is required. Which sequence? No idea. You have to break it down step by step to see what doesn't work. That's no fun.

So, hopefully that illustrates why you don't want to do this.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top