Question

i need to store Python structures made of lists / dictionaries, tuples into a human readable format. The idea is like using something similar to pickle, but pickle is not human-friendly. Other options that come to my mind are YAML (through PyYAML and JSON (through simplejson) serializers.

Any other option that comes to your mind?

Thanks in advance.

Was it helpful?

Solution

For simple cases pprint() and eval() come to mind.

Using your example:

>>> d = {'age': 27,
...  'name': 'Joe',
...  'numbers': [1, 
...              2, 
...              3,
...              4,
...              5],
...  'subdict': {
...              'first': 1, 
...              'second': 2,
...               'third': 3
...              }
... }
>>> 
>>> from pprint import pprint
>>> pprint(d)
{'age': 27,
 'name': 'Joe',
 'numbers': [1, 2, 3, 4, 5],
 'subdict': {'first': 1, 'second': 2, 'third': 3}}
>>> 

I would think twice about fixing two requirements with the same tool. Have you considered using pickle for the serializing and then pprint() (or a more fancy object viewer) for humans looking at the objects?

OTHER TIPS

If its just Python list, dictionary and tuple object. - JSON is the way to go. Its human readable, very easy to handle and language independent too.

Caution: Tuples will be converted to lists in simplejson.

In [109]: simplejson.loads(simplejson.dumps({'d':(12,3,4,4,5)}))
Out[109]: {u'd': [12, 3, 4, 4, 5]}

If you're after more representations than are covered by JSON, I highly recommend checking out PyON (Python Object Notation)...although I believe it's restricted to 2.6/3.0 and above, as it relies on the ast module. It handles custom class instances and recursive data types, amongst other features, which is more than is provided by JSON.

You should check out jsonpickle (https://github.com/jsonpickle/jsonpickle). It will write out any python object into a json file. You can then read that file back into a python object. The nice thing is the inbetween file is very readable because it's json.

What do you mean this is not human-readable??? ;)

>>> d = {'age': 27, 
...   'name': 'Joe',
...   'numbers': [1,2,3,4,5],
...   'subdict': {'first':1, 'second':2, 'third':3}
... }
>>> 
>>> import pickle
>>> p = pickle.dumps(d)      
>>> p
"(dp0\nS'age'\np1\nI27\nsS'subdict'\np2\n(dp3\nS'second'\np4\nI2\nsS'third'\np5\nI3\nsS'first'\np6\nI1\nssS'name'\np7\nS'Joe'\np8\nsS'numbers'\np9\n(lp10\nI1\naI2\naI3\naI4\naI5\nas."

Ok, well, maybe it just takes some practice… or you could cheat...

>>> import pickletools 
>>> pickletools.dis(p)
    0: (    MARK
    1: d        DICT       (MARK at 0)
    2: p    PUT        0
    5: S    STRING     'age'
   12: p    PUT        1
   15: I    INT        27
   19: s    SETITEM
   20: S    STRING     'subdict'
   31: p    PUT        2
   34: (    MARK
   35: d        DICT       (MARK at 34)
   36: p    PUT        3
   39: S    STRING     'second'
   49: p    PUT        4
   52: I    INT        2
   55: s    SETITEM
   56: S    STRING     'third'
   65: p    PUT        5
   68: I    INT        3
   71: s    SETITEM
   72: S    STRING     'first'
   81: p    PUT        6
   84: I    INT        1
   87: s    SETITEM
   88: s    SETITEM
   89: S    STRING     'name'
   97: p    PUT        7
  100: S    STRING     'Joe'
  107: p    PUT        8
  110: s    SETITEM
  111: S    STRING     'numbers'
  122: p    PUT        9
  125: (    MARK
  126: l        LIST       (MARK at 125)
  127: p    PUT        10
  131: I    INT        1
  134: a    APPEND
  135: I    INT        2
  138: a    APPEND
  139: I    INT        3
  142: a    APPEND
  143: I    INT        4
  146: a    APPEND
  147: I    INT        5
  150: a    APPEND
  151: s    SETITEM
  152: .    STOP
highest protocol among opcodes = 0
>>> 

You'd still have to read the pickled object from a file, however you wouldn't need to load it. So, if it's a "dangerous" object, you still might be able to figure that out before doing the load. If you are stuck with a pickle, it might be a good option for deciphering what you have.

To use simplejson first easy_install simplejson:

import simplejson
my_structure = {"name":"Joe", "age":27, "numbers":[1,2,3,4,5], "subdict":{"first":1, "second":2, "third": 3}}
json = simplejson.dumps(my_structure)

results in json being:

{"age": 27, "subdict": {"second": 2, "third": 3, "first": 1}, "name": "Joe", "numbers": [1, 2, 3, 4, 5]}

Notice that its hardly changed the format of the dictionary at all, but you should run it through this step to ensure valid JSON data.

You can further pretty print the result:

import pprint
pprint.pprint(my_structure)

results in:

{'age': 27,
 'name': 'Joe',
 'numbers': [1, 2, 3, 4, 5],
 'subdict': {'first': 1, 'second': 2, 'third': 3}}

There is AXON (textual) format that combine the best of JSON, XML and YAML. AXON format is quite readable and relatively compact.

The python (2.7/3.3-3.7) module pyaxon supports load(s)/dump(s) functionality, including iterative loading/dumping. It's sufficiently fast in order to be useful.

Consider simple example:

>>> d = {
     'age': 27, 'name': 'Joe', 
     'numbers': [1, 2, 3, 4, 5], 
     'subdict': {'first': 1, 'second': 2, 'third': 3}
    }
# pretty form
>>> axon.dumps(d, pretty=1)
{ age: 27
  name: "Joe"
  numbers: [1 2 3 4 5]
  subdict: {
    first: 1
    second: 2
    third: 3}}
# compact form
>>> axon.dumps(d)
{age:27 name:"Joe" numbers:[1 2 3 4 5] subdict:{first:1 second:2 third:3}}

It also can handle multiple objects in the message:

>>> msg = axon.dumps([{'a':1, 'b':2, 'c':3}, {'a':2, 'b':3, 'c':4}])
>>> print(msg)
{a:1 b:2 c:3} 
{a:2 b:3 c:4}
{a:3 b:4 c:5}

and then load them iteratively:

for d in axon.iloads(msg):
   print(d)
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top