Question

I've spent hours searching for examples of how to use the bsddb module and the only ones that I've found are these (from here):

data = mydb.get(key)
if data:
    doSomething(data)
#####################
rec = cursor.first()
while rec:
    print rec
    rec = cursor.next()
#####################
rec = mydb.set()
while rec:
    key, val = rec
    doSomething(key, val)
    rec = mydb.next()

Does anyone know where I could find more (practical) examples of how to use this package?

Or would anyone mind sharing code that they've written themselves that used it?

Edit:

The reason I chose Berkeley DB was its scalability. I'm working on a latent semantic analysis of about 2.2 million web pages. My simple test on 14 web pages generates around 500,000 records, so scaling that up (500,000 / 14 × 2.2 million) gives about 78.6 billion records in my table.

If anyone knows of another efficient, scalable database model that I can access from Python, please let me know about it! (lt_kije has brought it to my attention that bsddb is deprecated in Python 2.6 and will be gone in 3.*)


Solution

These days, most folks use the anydbm meta-module to interface with DBM-style databases. Its API is essentially dict-like; see PyMOTW for some examples. Note that bsddb is deprecated as of Python 2.6 and is removed in 3.x. Switching to anydbm will make the upgrade easier; switching to sqlite (in the standard library as the sqlite3 module since Python 2.5) will give you a much more flexible store.
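To give an idea of what "dict-like" means here, below is a minimal sketch rather than a definitive recipe. It assumes Python 3, where anydbm was renamed to dbm; the function names, keys, and file path are made up for illustration. Keys and values come back as bytes, so the sketch decodes them and stores the "real" data as JSON.

```python
import dbm          # Python 3 name of the Python 2 anydbm module
import json
import os
import tempfile


def record_visit(path, key, remote_ip, timestamp):
    """Store one JSON-encoded record under `key` (hypothetical helper)."""
    # 'c' opens the database read/write and creates it if it does not exist.
    with dbm.open(path, 'c') as db:
        db[key] = json.dumps({'remote_ip': remote_ip, 'timestamp': timestamp})


def read_visits(path):
    """Return all records as a plain dict of decoded key -> decoded JSON value."""
    with dbm.open(path, 'r') as db:
        # keys() is available on every dbm backend; entries are bytes.
        return {k.decode(): json.loads(db[k].decode()) for k in db.keys()}


# Small demonstration in a throwaway directory (the path is an assumption).
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, 'visitor')
    record_visit(demo_path, '1400000000.25', '192.0.2.1', '14-05-2014 10:00:00 am')
    for visit_key, visit in read_visits(demo_path).items():
        print(visit_key, visit)
```

Which on-disk format you get depends on the backend dbm picks on your platform, so don't count on the file being readable by Berkeley DB tools.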

OTHER TIPS

Download the source from http://pypi.python.org/pypi/bsddb3/ and look at the Lib3/bsddb/test directory.

The current distribution contains the following tests that are very helpful to start working with bsddb3:

test_all.py
test_associate.py
test_basics.py
test_compare.py
test_compat.py
test_cursor_pget_bug.py
test_dbenv.py
test_dbobj.py
test_db.py
test_dbshelve.py
test_dbtables.py
test_distributed_transactions.py
test_early_close.py
test_fileid.py
test_get_none.py
test_join.py
test_lock.py
test_misc.py
test_pickle.py
test_queue.py
test_recno.py
test_replication.py
test_sequence.py
test_thread.py

I'm assuming this thread is still active, so here we go. This is rough code and there's no error checking, but it may be useful as a starting point.

I wanted to use PHP's built-in DBA functions and then read the database using a Python (2.x) script. Here's the PHP script that creates the database:

<?php 
$id=dba_open('visitor.db', 'c', 'db4');
dba_optimize($id);
dba_close($id);
?>

Now, here's the PHP code to insert an entry. I use JSON to hold the "real" data:

<?php 
/* 
    record a visit in a BSD DB
*/
$id=dba_open('visitor.db', 'w', 'db4');
if (!$id) {
    /* dba_open failed */
    exit;
}
/* cast to string: a float used as an array key is truncated to whole seconds */
$key  = (string) $_SERVER['REQUEST_TIME_FLOAT'];
$rip  = $_SERVER['REMOTE_ADDR'];
$now  = date('d-m-Y h:i:s a', time());
$data = json_encode( array('remote_ip' => $rip, 'timestamp' => $now) );
$userdata=array($key => $data);
foreach ($userdata as $key=>$value) {
    dba_insert($key, $value, $id);
}
dba_optimize($id);
dba_close($id);
?>

Now, here's the code that you and I are actually interested in, and it uses Python's bsddb3 module.

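#!/usr/bin/env python
# Python 2.x: dump every record of the database written by the PHP script above
from bsddb3 import db
import json

visitorDB = db.DB()
visitorDB.open('visitor.db', None, db.DB_BTREE, db.DB_DIRTY_READ)
cursor = visitorDB.cursor()
rec = cursor.first()  # a (key, value) tuple

while rec:
    print rec
    visitordata = rec[1]  # the value is the JSON string stored by PHP
    print '\t' + visitordata
    jvdata = json.loads(visitordata)
    print jvdata
    rec = cursor.next()  # the loop relies on this being falsy after the last record
    print '\n\n'
print '----'

cursor.close()
visitorDB.close()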

Searching for "import bsddb", I get:

...but personally I'd heavily recommend you use sqlite instead of bsddb, people are using the former a lot more for a reason.
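If you go the sqlite route, a key/value store along the lines of what bsddb gives you can be sketched with the standard-library sqlite3 module. This is only a rough starting point: the table name kv, the helper functions, and the JSON encoding of values are my own assumptions, not anything bsddb or sqlite prescribes.

```python
import json
import sqlite3


def open_store(path=':memory:'):
    """Open (or create) a one-table key/value store; `kv` is a made-up table name."""
    conn = sqlite3.connect(path)
    conn.execute(
        'CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)'
    )
    return conn


def put(conn, key, value):
    """Insert or overwrite the JSON-encoded value stored under `key`."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            'INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)',
            (key, json.dumps(value)),
        )


def get(conn, key, default=None):
    """Return the decoded value for `key`, or `default` if the key is absent."""
    row = conn.execute('SELECT value FROM kv WHERE key = ?', (key,)).fetchone()
    return default if row is None else json.loads(row[0])


def items(conn):
    """Yield (key, decoded value) pairs in key order, like walking a cursor."""
    for key, value in conn.execute('SELECT key, value FROM kv ORDER BY key'):
        yield key, json.loads(value)


store = open_store()
put(store, 'page:1', {'term': 'python', 'count': 3})
put(store, 'page:2', {'term': 'berkeley', 'count': 1})
for record_key, record in items(store):
    print(record_key, record)
```

Once the data is in a table like this you can also add columns and indexes and query with SQL, which is where the extra flexibility comes from.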

The Gramps genealogy program uses bsddb for its database, so its source code may serve as a larger real-world example.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow