Question

The shelve documentation says:

The choice of which database package will be used (such as dbm, gdbm or bsddb) depends on which interface is available.

What does that mean? How can I determine which package was chosen? How can I force a specific one to be chosen? Which database implementation is best to use?


Solution

Found it here:
http://www.gossamer-threads.com/lists/python/python/13891

import shelve
import gdbm  # Python 2's GNU dbm bindings

def gdbm_shelve(filename, flag="c"):
    # Hand shelve.Shelf a gdbm database object directly,
    # bypassing anydbm's backend autodetection.
    return shelve.Shelf(gdbm.open(filename, flag))

db = gdbm_shelve("dbfile")
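
The resulting shelf behaves like any other; a quick usage sketch continuing from the line above (the stored value is just illustrative):

db["config"] = {"retries": 3, "verbose": True}  # any picklable value
db.close()

db = gdbm_shelve("dbfile")  # reopen and read back
print db["config"]
db.close()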

PS: On the linked page someone reports having found this snippet elsewhere, but that further link is dead.

OTHER TIPS

I think there is no way to specify the underlying database yourself directly. shelve uses anydbm, and anydbm uses the whichdb module, which tries the following underlying implementations in this order:

  • dbhash
  • gdbm
  • dbm
  • dumbdbm

You may use the shelve.BsdDbShelf subclass of Shelf to force the use of the bsddb implementation, as in the sketch below.
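
A minimal sketch, assuming Python 2's bsddb module is installed (the file name test.db and the stored value are just illustrative):

import shelve
import bsddb  # Python 2's BSD DB bindings; assumed to be available

# BsdDbShelf wraps a bsddb-style object directly, bypassing anydbm's
# autodetection, and exposes the extra bsddb cursor methods
# (first, next, previous, last, set_location).
db = shelve.BsdDbShelf(bsddb.hashopen("test.db", "c"))
db["key"] = ["any", "picklable", "object"]
print db.first()  # -> ('key', ['any', 'picklable', 'object'])
db.close()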

How can I determine which package was chosen?

The built-in module whichdb may be used for that. For example:

In [33]: import anydbm

In [34]: db = anydbm.open('test.db', 'c')

In [35]: db['test'] = '123'

In [36]: db.close()

In [37]: import whichdb

In [38]: dir(whichdb)
Out[38]: 
['__builtins__',
 '__doc__',
 '__file__',
 '__name__',
 '__package__',
 '_dbmerror',
 'dbm',
 'os',
 'struct',
 'sys',
 'whichdb']

In [39]: whichdb.whichdb('test.db')
Out[39]: 'dbhash'

Which database implementation is best to use?

The shelve documentation mentions some restrictions when the underlying DB engine is dbm (i.e., the Python module called dbm, which interfaces with Unix ndbm, or with the BSD DB or GNU GDBM compatibility interfaces for ndbm):

[...] this means that (the pickled representation of) the objects stored in the database should be fairly small, and in rare cases key collisions may cause the database to refuse updates.

It's not clear whether this applies only to ndbm proper or also to the compatibility interfaces, what "fairly small" means in numbers, or how "rare" those cases are.

Actually, Ruby, which also has bindings for DBM, has this to say:

Original Berkeley DB was limited to 2GB of data. Dbm libraries also sometimes limit the total size of a key/value pair, and the total size of all the keys that hash to the same value. These limits can be as little as 512 bytes. That said, gdbm and recent versions of Berkeley DB do away with these limits.

I'm assuming this is nothing to worry about, because it's quite unlikely that ndbm will be used, and because hitting any of these limitations would (hopefully) raise a descriptive exception, at which point we'd need to dig further; a defensive sketch follows.
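
A minimal sketch of that defensive stance (the file name and the ~1 MB value are illustrative; catching anydbm.error relies on the assumption that the backend's refusal surfaces as one of the error classes collected in that tuple in Python 2):

import shelve
import anydbm

db = shelve.open("bigvalues.db", flag="c")
try:
    # Whether ~1 MB exceeds a limit depends entirely on which
    # dbm implementation whichdb picked for this file.
    db["big"] = "x" * (1024 * 1024)
except anydbm.error, e:
    # anydbm.error is a tuple of the possible backend error classes
    print "backend refused the update:", e
finally:
    db.close()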
