Question

I am accessing a service that returns JSON like this:

{
    "A": "A value",
    "B": {
        "B1": "B1 value",
        "B2": "B2 value"
    },
    "C": {
        "c_url": "http:\/\/someurl:someport\/somefolder\/somefile",
    }
}

What I want to do is parse this JSON and extract the URL and the somefile portion of its path as key-value pairs.

So, essentially, once my script has processed the JSON, it will write the URL and the file name to a file in some delimited format.

For the JSON above, the output would be:

url: http://someurl:someport/somefolder/somefile
file: somefile

I am pretty sure there are numerous JSON parsers in Python that will parse the JSON, but how should I deal with the URL string, which has been pre-processed with escape characters? Do I need to write my own URL decoder to strip the escape characters from the URL string?

Also, I need to tokenize the individual components of the URL to get to the 'file' portion. Are there any libraries that can help with that?

Thanks

Solution

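The trailing comma after the "c_url" key-value pair makes your example invalid JSON, so remove it. After that, the standard json module parses the document and unescapes the URL for you: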

>>> import json
>>> st = r'{"A":"A value","B":{ "B1":"B1 value", "B2":"B2 value" },"C":{ "c_url":"http:\/\/someurl:someport\/somefolder\/somefile" }}'
>>> json.loads(st)
{'A': 'A value', 'B': {'B1': 'B1 value', 'B2': 'B2 value'}, 'C': {'c_url': 'http://someurl:someport/somefolder/somefile'}}
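
To see why the trailing comma has to go, here is a minimal sketch that feeds a trimmed-down copy of the response (the variable names bad and parsed_ok are made up for illustration) to the standard json module with the comma left in. The parser should reject it instead of silently ignoring the comma:

```python
import json

# Hypothetical, trimmed-down copy of the response with the trailing comma left in.
bad = r'{"C":{"c_url":"http:\/\/someurl:someport\/somefolder\/somefile",}}'

try:
    json.loads(bad)
    parsed_ok = True
except json.JSONDecodeError as exc:
    # The standard parser is strict: a comma must be followed by another key.
    parsed_ok = False
    print("invalid JSON:", exc)
```

If the service really does send the comma, the response would need to be repaired before parsing; if it only crept into the question, nothing extra is needed.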

And to get just the 'somefile' part of the URL:

url = json.loads(st)['C']['c_url']
url.split('/')[-1]
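
If you would rather not split the string by hand, the standard library can tokenize the URL for you. The following is only a sketch for Python 3, assuming the response always has a "C" object with a "c_url" key; the function name summarize and the variables raw and report are invented for this example:

```python
import json
import posixpath
from urllib.parse import urlsplit


def summarize(raw_json):
    """Return the 'url: ...' and 'file: ...' lines for one response (hypothetical helper)."""
    data = json.loads(raw_json)  # json.loads turns every \/ back into /
    url = data["C"]["c_url"]
    # Look only at the path component, so a query string or fragment
    # would not leak into the file name.
    filename = posixpath.basename(urlsplit(url).path)
    return "url: {}\nfile: {}\n".format(url, filename)


# Raw string so the backslashes reach the JSON parser untouched.
raw = r'{"A":"A value","C":{"c_url":"http:\/\/someurl:someport\/somefolder\/somefile"}}'
report = summarize(raw)
print(report, end="")
```

The returned text can be written to a file as is, for example with open("out.txt", "w").write(report), where out.txt is just a placeholder name.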

OTHER TIPS

As per http://www.quora.com/Why-does-the-cjson-Python-module-not-correctly-unescape-reverse-solidus-solidus

simplejson unescapes these sequences correctly. The escaping is not actually buggy: \/ is a legal escape for / in JSON. Consider:

#!/usr/bin/env python
import simplejson

# Raw string so the backslashes reach the JSON parser untouched.
print(simplejson.loads(r'"http:\/\/someurl:someport\/somefolder\/somefile"'))

Note that simplejson is not part of the standard library, but it can be installed with easy_install or pip.

Sample output:

$ python unescape.py
http://someurl:someport/somefolder/somefile
Licensed under: CC-BY-SA with attribution