Question

I want to pickle a list because it takes a long time for me to build. The list consists of n 2-tuples, where the first element is a large dictionary (1k to 10k keys) and the second is a string. n can be in the hundreds to thousands. I would like to save this data structure so that I can load it at my convenience. Human-readable output would be nice, but not at the cost of performance.

I know about shelve, PyYAML, cPickle, and JSON, but I am unsure which to use given my data. Initial reading of other threads on this site and elsewhere suggests that cPickle may be the best choice. Any thoughts from the gurus here?

Solution

I would use cPickle; it works fine:

# Dummy data: a list of (dictionary, string) 2-tuples, roughly matching
# the structure described in the question
from random import randint as r

a, b = 97, 123

d = [(dict([(chr(r(a, b)), j) for j in range(1000)]),
      ''.join([chr(r(a, b)) for i in range(5)]))
                              for j in range(100)]

# Pickle it
import cPickle as pickle

f = open('store.dat', 'wb')                  # binary mode for pickled data
pickle.dump(d, f, pickle.HIGHEST_PROTOCOL)   # binary protocol: faster and smaller
f.close()
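
Loading the list back later is the mirror image (a minimal sketch using the same file name as above):

f = open('store.dat', 'rb')
d = pickle.load(f)    # restores the full list of (dict, string) tuples
f.close()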

I would also consider using something like dumbdbm.

Added later

Following on from the example above, you can do something like this:

import dumbdbm as dbm

# Store each (dict, string) tuple as a pickled value, keyed by its index
g = dbm.open('store.db')
g.update([(str(i), pickle.dumps(j)) for i, j in enumerate(d)])
g.close()
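
Individual entries can then be read back on demand with a key lookup and pickle.loads, without unpickling the whole list at once (a small sketch reusing the names above):

g = dbm.open('store.db')
first = pickle.loads(g['0'])   # the (dict, string) tuple stored under key '0'
g.close()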