Question

I have a script that uses a hash with four strings as keys, whose values are hashes. Those hashes also have four strings as keys and hashes as values. This pattern continues for n-1 levels, where n is determined at run-time. The nth-level hashes contain integer values (rather than the usual hash references).

I installed the BerkeleyDB module for Perl so I can use disk space instead of RAM to store this hash. I assumed that I could simply tie the hash to a database, and it would work, so I added the following to my code:

my %tags = () ; 
my $file = "db_tags.db" ; 
unlink $file; 


tie %tags, "BerkeleyDB::Hash", 
        -Filename => $file, 
        -Flags => DB_CREATE
     or die "Cannot open $file\n" ;

However, I get the error:

Can't use string ("HASH(0x1a69ad8)") as a HASH ref while "strict refs" in use at getUniqSubTreeBDB.pl line 31, line 1.

To test, I created a new script with the code (above) that ties the hash to a file. Then I added the following:

my $href = \%tags; 
$tags{'C'} = {} ;

And it ran fine. Then I added:

$tags{'C'}->{'G'} = {} ;

And it gave pretty much the same error. I am thinking that BerkeleyDB cannot handle the type of data structure I am creating. Maybe it was able to handle the first level (C->{}) in my test because it was just a regular key -> scalar?

Anyways, any suggestions or affirmations of my hypothesis would be appreciated.


Solution

Use DBM::Deep.

use DBM::Deep;

my $db = DBM::Deep->new( "foo.db" );

$db->{mykey} = "myvalue";
$db->{myhash} = {};
$db->{myhash}->{subkey} = "subvalue";

print $db->{myhash}->{subkey} . "\n";

The code I provided yesterday would work fine with this.

sub get_node {
   my $p = \shift;
   $p = \( ($$p)->{$_} ) for @_;
   return $p;
}

my @seqs = qw( CG CA TT CG );

my $tree = DBM::Deep->new("foo.db");
++${ get_node($tree, split //) } for @seqs;

Other answers

No. BerkeleyDB stores pairs of one key and one value, where both are arbitrary bytestrings. If you store a hashref as the value, it'll store the string representation of a hashref, which isn't very useful when you read it back (as you noticed).
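To see why the stored value is useless on the way back, here is a minimal sketch that reproduces the effect without any DBM module. The plain hash %plain only stands in for the tied hash, and the explicit stringification mimics what happens when a reference is written to a byte-oriented store; both are assumptions made for illustration.

```perl
use strict;
use warnings;

my %plain;                      # stand-in for the tied hash (assumption)
my $inner = { G => 1 };

# A DBM file can only keep bytes, so a reference is stringified on store.
my $stored = "$inner";          # looks like "HASH(0x...)"
$plain{C} = $stored;

# What comes back is an ordinary string, not a reference.
my $is_ref = ref $plain{C} ? 1 : 0;

# Dereferencing that string is what raises the "strict refs" error.
my $ok    = eval { my $x = $plain{C}->{G}; 1 };
my $error = $ok ? '' : $@;

print "is a reference: $is_ref\n";
print "error: $error";
```

The error message captured in $error has the same shape as the one quoted in the question.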

The MLDBM module can do something like you describe, but it works by serializing the top-level hashref to a string and storing that in the DBM file. This means it has to read/write the entire top-level hashref every time you access or change a value in it.

Depending on your application, you may be able to combine your keys into a single string, and use that as the key for your DBM file. The main limitation with that is that it's difficult to iterate over the keys of one of your interior hashes.

You might use the semi-obsolete multidimensional array emulation for this. $foo{$a,$b,$c} is interpreted as $foo{join($;, $a, $b, $c)}, and that works with tied hashes also.
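A minimal sketch of the combined-key idea, using the sequences from the accepted answer. A plain hash is used here so the example runs anywhere; the assumption is that a tied DBM hash would behave the same way, since only flat string keys are involved.

```perl
use strict;
use warnings;

my %count;                      # could equally be a tied DBM hash
my @seqs = qw( CG CA TT CG );
my $sep  = $;;                  # subscript separator, "\034" by default

# Depth is decided at run time: join each sequence's letters into one key.
$count{ join $sep, split //, $_ }++ for @seqs;

# The list-subscript syntax builds exactly the same key.
my $cg   = $count{'C', 'G'};
my $same = $cg == $count{ join $sep, 'C', 'G' } ? 1 : 0;

# Iterating over an "interior hash" means scanning every key for a prefix.
my @under_c;
for my $key (sort keys %count) {
    my ($first, $second) = split /\Q$sep\E/, $key;
    push @under_c, $second if $first eq 'C';
}

print "CG seen $cg times\n";
print "children of C: @under_c\n";
```

The loop at the end shows the limitation mentioned above: listing the children of one node requires a pass over all keys.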

No; it can only store strings. But you can use the ->filter_fetch_value and ->filter_store_value methods to define "filters" that will automatically freeze arbitrary structures to strings before storing, and convert them back when fetching. There are analogous hooks for marshalling and unmarshalling non-string keys.

Beware though: using this method to store objects that share subobjects will not preserve the sharing. For example:

$a = [1, 2, 3];
$g = { array => $a };
$h = { array => $a };
$db{g} = $g;
$db{h} = $h;

@$a = ();
push @{$db{g}{array}}, 4;

print @{$db{g}{array}};  # prints 123, not 4: the push only changed a temporary thawed copy
print @{$db{h}{array}};  # prints 123, not 4

%db here is a tied hash; if it were an ordinary hash the two prints would both print 4.
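The filter hooks described in this answer could be wired up roughly as follows. This is a sketch, not a tested recipe: it assumes the BerkeleyDB module is installed, uses Storable for freezing (any serializer would do), and omits -Filename on the assumption that this yields an in-memory database. With these filters every value must be a reference, because freeze rejects plain scalars.

```perl
use strict;
use warnings;
use BerkeleyDB;
use Storable qw(freeze thaw);

my %db;
my $obj = tie %db, 'BerkeleyDB::Hash',
        -Flags => DB_CREATE     # no -Filename: assumed to be in-memory
    or die "Cannot create database: $BerkeleyDB::Error\n";

# The filters receive the value in $_ and replace it in place.
$obj->filter_store_value(sub { $_ = freeze($_) });
$obj->filter_fetch_value(sub { $_ = thaw($_) });

$db{C} = { G => { T => 1 } };

# Each fetch thaws a fresh copy, so change the copy and store it back.
my $copy = $db{C};
$copy->{G}{T}++;
$db{C} = $copy;

my $result = $db{C}{G}{T};
print "C/G/T = $result\n";
```

The fetch, modify, store sequence matters: changing a nested value directly through %db would only alter a temporary copy.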

While you can't store normal multidimensional hashes in a BerkeleyDB tied hash, you can use emulated multidimensional hashes with a syntax like $tags{ 'C', 'G'}. This creates a single key that looks like ('C' . $; . 'G')

I had the same question, found this. Might be useful for you as well.

Storing data structures as values in BDB

Often, we want to store complex data structures: arrays, hash tables, and so on, whose elements can be simple values or references to other data structures. To do this, we need to serialize the data structure: convert it to a string that can be stored in the database and later converted back into the original data structure by a deserialization procedure.

There are several Perl modules available to perform this serialization and deserialization. One of the most popular is JSON::XS. The next example shows how to use this module:

use JSON::XS;

# Data to be stored
my %structure;

# Convert the data into a json string
my $json = encode_json(\%structure);   # encode_json expects a reference

# Save it in the database
$dbh->db_put($key,$json);

To retrieve the original structure, we perform the inverse operation:

# Retrieve the json string from the database
$dbh->db_get($key, $json);

# Deserialize the json string into a data structure
my $hr_structure = decode_json($json);
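The round trip can be tried without a database. The sketch below swaps in JSON::PP, the core module that exports the same encode_json and decode_json functions as JSON::XS, and uses a plain hash %store as a stand-in for the db_put and db_get calls; the key name and sample data are made up for illustration.

```perl
use strict;
use warnings;
use JSON::PP;                   # core module, same function names as JSON::XS

my %structure = ( C => { G => 2, A => 1 }, T => { T => 1 } );
my %store;                      # stand-in for the database (assumption)

# Serialize and "put".
$store{tags} = encode_json(\%structure);

# "Get", deserialize, change a nested value, and write the whole thing back.
my $hr_structure = decode_json($store{tags});
$hr_structure->{C}{G}++;
$store{tags} = encode_json($hr_structure);

my $final = decode_json($store{tags})->{C}{G};
print "C/G count: $final\n";
```

As with any serialize-the-whole-value approach, every update rewrites the complete structure stored under that key.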

In Perl you can do this with MLDBM, which lets you use references beyond the first level.

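use GDBM_File;
use Storable;
use MLDBM qw(GDBM_File Storable);   # GDBM_File for storage, Storable for serialization

# Tie the top-level hash to a file (the file name is only an example).
tie my %hash, 'MLDBM', 'nested.db', GDBM_WRCREAT, 0640
    or die "Cannot open nested.db: $!\n";

my %level_3_hash1 = (key1 => 'x', key2 => 'y', key3 => 'z');
my %level_3_hash2 = (key10 => 'a', key20 => 'b', key30 => 'c');
my %level_2_hash  = (keyA => \%level_3_hash1, keyB => \%level_3_hash2);
$hash{key} = \%level_2_hash;        # the nested structure is serialized on store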

This is covered in chapter 13 of the online book Beginning Perl.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow