Question

I'm getting this exception when trying to open shelve-persisted files over a certain size, which is actually pretty small (< 1MB), but I'm not sure what the exact threshold is. Now, I know pickle is sort of the bastard child of Python and shelve isn't thought of as a particularly robust solution, but it happens to solve my problem wonderfully (in theory), and I haven't been able to find a reason for this exception.

Traceback (most recent call last):
  File "test_shelve.py", line 27, in <module>
    print len(f.keys())
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shelve.py", line 101, in keys
    return self.dict.keys()
SystemError: Negative size passed to PyString_FromStringAndSize

I can reproduce it consistently, but I haven't found much on Google. Here's a script that reproduces it.

import shelve
import random
import string
import pprint

f = shelve.open('test')
# f = {}

def rand_list(list_size=20, str_size=40):
    return [''.join([random.choice(string.ascii_uppercase + string.digits) for j in range(str_size)]) for i in range(list_size)]

def recursive_dict(depth=3):
    if depth==0:
        return rand_list()
    else:
        d = {}
        for k in rand_list():
            d[k] = recursive_dict(depth-1)
        return d

for k,v in recursive_dict(2).iteritems():
    f[k] = v

f.close()

f = shelve.open('test')
print len(f.keys())

Solution

Regarding the error itself:

The idea circulating on the web is that the data size exceeded the largest integer representable on that machine (the largest signed 32-bit integer is 2 147 483 647), so Python interpreted it as a negative size.
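To make the wrap-around concrete, here is a minimal sketch of how a value just past the signed 32-bit maximum reads back as negative. It only illustrates the arithmetic; it does not confirm that this is what happens inside the dbm backend. The helper name as_signed_32 is made up for this example.

```python
import struct

INT32_MAX = 2**31 - 1  # 2 147 483 647, the largest signed 32-bit integer


def as_signed_32(n):
    """Reinterpret the low 32 bits of n as a signed 32-bit integer.

    Hypothetical helper for illustration only: pack the value as an
    unsigned 32-bit integer, then unpack the same bytes as signed.
    """
    return struct.unpack('<i', struct.pack('<I', n & 0xFFFFFFFF))[0]


print(as_signed_32(INT32_MAX))      # still positive
print(as_signed_32(INT32_MAX + 1))  # one past the maximum reads back as negative
```

A length field that wraps like this would reach PyString_FromStringAndSize as a negative number, which matches the wording of the SystemError.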

Your code runs with 2.7.3, so this may be a bug that has since been fixed.

OTHER TIPS

The code "works" if I change the depth from 2 to 1, or if I run it under Python 3 (after fixing the print statements and using items() instead of iteritems()). However, the list of keys is clearly not the set of keys found while iterating over the return value of recursive_dict().
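For anyone who wants to check this themselves, here is a rough Python 3 port of the question's script that keeps the keys it wrote and compares them with the keys read back. The compare_keys function and the temporary directory are additions for this sketch, not part of the original script, and the result will depend on which dbm backend your Python uses.

```python
import os
import random
import shelve
import string
import tempfile


def rand_list(list_size=20, str_size=40):
    # Same idea as the question's helper: a list of random strings.
    alphabet = string.ascii_uppercase + string.digits
    return [''.join(random.choice(alphabet) for _ in range(str_size))
            for _ in range(list_size)]


def recursive_dict(depth=3):
    # Nested dicts keyed by random strings, with lists at the leaves.
    if depth == 0:
        return rand_list()
    return {k: recursive_dict(depth - 1) for k in rand_list()}


def compare_keys(depth=2):
    """Write a nested dict to a shelf, reopen it, and return
    (keys_written, keys_read_back) as two sets."""
    data = recursive_dict(depth)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'test')
        with shelve.open(path) as f:
            for k, v in data.items():
                f[k] = v
        with shelve.open(path) as f:
            read_back = set(f.keys())
    return set(data), read_back
```

Calling compare_keys(2) and looking at written - read_back and read_back - written shows whether any keys were lost or altered on your setup.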

The following restriction from the shelve documentation may apply (emphases mine):

The choice of which database package will be used (such as dbm, gdbm or bsddb) depends on which interface is available. Therefore it is not safe to open the database directly using dbm. The database is also (unfortunately) subject to the limitations of dbm, if it is used — this means that (the pickled representation of) the objects stored in the database should be fairly small, and in rare cases key collisions may cause the database to refuse updates.
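Since the limits depend on which backend shelve picked, it can help to find out which one that is. The sketch below uses dbm.whichdb from the Python 3 standard library (in Python 2.7 the equivalent lives in the separate whichdb module). The helper name shelf_backend and the throwaway shelf are assumptions for this example.

```python
import dbm
import os
import shelve
import tempfile


def shelf_backend(path):
    """Return the name of the dbm module that can open the shelf at
    `path`, e.g. 'dbm.gnu' or 'dbm.dumb'. dbm.whichdb returns None if
    the file cannot be read and '' if the format is not recognised."""
    return dbm.whichdb(path)


# Create a throwaway shelf and ask which backend produced it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'test')
    with shelve.open(path) as f:
        f['key'] = 'value'
    backend = shelf_backend(path)
    print(backend)
```

Running this on the machine that raises the exception, with the path of the real shelf file, tells you which backend's size limits are in play.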

Licensed under: CC-BY-SA with attribution