NAME
DBM::Deep::Cookbook
DESCRIPTION
This is the Cookbook for DBM::Deep. It contains useful tips and tricks, plus some examples of how to do common tasks.
RECIPES
UTF8 data
When you're using UTF8 data, you may run into the "Wide character in print" warning. To fix that in 5.8+, do the following:
    my $db = DBM::Deep->new( ... );
    binmode $db->_fh, ":utf8";
In 5.6, you will have to do the following:
    my $db = DBM::Deep->new( ... );
    $db->set_filter( 'store_value'    => sub { pack "U0C*", unpack "C*", $_[0] } );
    $db->set_filter( 'retrieve_value' => sub { pack "C*", unpack "U0C*", $_[0] } );
In a future version, you will be able to specify utf8 => 1 and DBM::Deep will do these things for you.
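Until then, the same round trip can also be written with the core Encode module. The following is a minimal sketch, not DBM::Deep's own mechanism: the helper names utf8_store and utf8_fetch are hypothetical, and it assumes they would be installed as the filter_store_value and filter_fetch_value callbacks used in the recipes below. It only demonstrates that encoding to bytes before storing and decoding after fetching preserves the text.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper names; neither is part of DBM::Deep.
sub utf8_store { return encode( 'UTF-8', $_[0] ) }    # characters -> bytes
sub utf8_fetch { return decode( 'UTF-8', $_[0] ) }    # bytes -> characters

my $text  = "caf\x{e9} \x{263A}";    # includes a wide character (U+263A)
my $bytes = utf8_store($text);       # what would be written to disk
my $back  = utf8_fetch($bytes);      # what would be handed back on fetch

printf "%d characters, %d bytes, round trip %s\n",
    length $text, length $bytes, ( $back eq $text ? 'ok' : 'FAILED' );
```

Because the stored value is a plain byte string, nothing wider than one byte ever reaches the file handle, which is the condition that triggers the warning.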
Real-time Encryption Example
NOTE: This is just an example of how to write a filter. This most definitely should NOT be taken as a proper way to write a filter that does encryption.
Here is a working example that uses the Crypt::Blowfish module to do real-time encryption / decryption of keys & values with DBM::Deep Filters. Please visit http://search.cpan.org/search?module=Crypt::Blowfish for more on Crypt::Blowfish. You'll also need the Crypt::CBC module.
    use DBM::Deep;
    use Crypt::Blowfish;
    use Crypt::CBC;

    # One cipher object shared by all four filters
    my $cipher = Crypt::CBC->new({
        'key'            => 'my secret key',
        'cipher'         => 'Blowfish',
        'iv'             => '$KJh#(}q',
        'regenerate_key' => 0,
        'padding'        => 'space',
        'prepend_iv'     => 0
    });

    # Encrypt on store, decrypt on fetch, for both keys and values
    my $db = DBM::Deep->new(
        file               => "foo-encrypt.db",
        filter_store_key   => \&my_encrypt,
        filter_store_value => \&my_encrypt,
        filter_fetch_key   => \&my_decrypt,
        filter_fetch_value => \&my_decrypt,
    );

    $db->{key1} = "value1";
    $db->{key2} = "value2";

    print "key1: " . $db->{key1} . "\n";
    print "key2: " . $db->{key2} . "\n";

    undef $db;
    exit;

    sub my_encrypt {
        return $cipher->encrypt( $_[0] );
    }

    sub my_decrypt {
        return $cipher->decrypt( $_[0] );
    }
Real-time Compression Example
Here is a working example that uses the Compress::Zlib module to do real-time compression / decompression of keys & values with DBM::Deep Filters. Please visit http://search.cpan.org/search?module=Compress::Zlib for more on Compress::Zlib.
    use DBM::Deep;
    use Compress::Zlib;

    # Compress on store, decompress on fetch, for both keys and values
    my $db = DBM::Deep->new(
        file               => "foo-compress.db",
        filter_store_key   => \&my_compress,
        filter_store_value => \&my_compress,
        filter_fetch_key   => \&my_decompress,
        filter_fetch_value => \&my_decompress,
    );

    $db->{key1} = "value1";
    $db->{key2} = "value2";

    print "key1: " . $db->{key1} . "\n";
    print "key2: " . $db->{key2} . "\n";

    undef $db;
    exit;

    sub my_compress {
        return Compress::Zlib::memGzip( $_[0] );
    }

    sub my_decompress {
        return Compress::Zlib::memGunzip( $_[0] );
    }
Note: Filtering of keys only applies to hashes. Array "keys" are actually numerical index numbers, and are not filtered.
Custom Digest Algorithm
DBM::Deep by default uses the Message Digest 5 (MD5) algorithm for hashing keys. However, you can override this and use another algorithm (such as SHA-256) or even write your own. Please note that DBM::Deep currently expects zero collisions, so your algorithm has to be perfect, so to speak. Collision detection may be introduced in a later version.
You can specify a custom digest algorithm by passing it into the parameter list for new(), passing a reference to a subroutine as the 'digest' parameter, and the length of the algorithm's hashes (in bytes) as the 'hash_size' parameter. Here is a working example that uses a 256-bit hash from the Digest::SHA256 module. Please see http://search.cpan.org/search?module=Digest::SHA256 for more information.
    use DBM::Deep;
    use Digest::SHA256;

    my $context = Digest::SHA256::new(256);

    # 'file' matches the parameter name used by the other recipes
    my $db = DBM::Deep->new(
        file      => "foo-sha.db",
        digest    => \&my_digest,
        hash_size => 32,
    );

    $db->{key1} = "value1";
    $db->{key2} = "value2";

    print "key1: " . $db->{key1} . "\n";
    print "key2: " . $db->{key2} . "\n";

    undef $db;
    exit;

    # Truncate to exactly hash_size (32) bytes
    sub my_digest {
        return substr( $context->hash( $_[0] ), 0, 32 );
    }
Note: Your returned digest strings must be EXACTLY the number of bytes you specify in the hash_size parameter (in this case 32). Undefined behavior will occur otherwise.
Note: If you do choose to use a custom digest algorithm, you must set it every time you access this file. Otherwise, the default (MD5) will be used.
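Where Digest::SHA256 is not available, a digest callback with the same shape can be sketched with the core Digest::SHA module instead. This is a swapped-in module and an illustration only, not the method the example above uses. It relies on sha256() returning the raw 32-byte binary digest (not the 64-character hex form), which is what hash_size => 32 requires.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256);

# Digest callback: receives the key in $_[0] and returns the raw binary
# SHA-256 digest, which is always 32 bytes long.
sub my_digest {
    return sha256( $_[0] );
}

my $digest = my_digest('key1');
printf "digest length: %d bytes\n", length $digest;    # prints 32

# Assumed usage, mirroring the constructor call shown above:
#   my $db = DBM::Deep->new(
#       file      => "foo-sha.db",
#       digest    => \&my_digest,
#       hash_size => 32,
#   );
```

Checking the digest length once at startup is a cheap guard against the length mismatch described in the notes above. Note that sha256() croaks on strings containing wide characters, so such keys would need to be encoded to bytes first.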
PERFORMANCE
Because DBM::Deep is a concurrent datastore, every change is flushed to disk immediately and every read goes to disk. This means that DBM::Deep functions at the speed of disk (generally 10-20ms) rather than the speed of RAM (generally 50-70ns), making it at least 150-200x slower than the comparable in-memory data structure in Perl.
There are several techniques you can use to speed up how DBM::Deep functions.
Put it on a ramdisk
The easiest and quickest way to make DBM::Deep run faster is to create a ramdisk and locate the DBM::Deep file there. Doing this as an option may become a feature of DBM::Deep, assuming there is a good ramdisk wrapper on CPAN.
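As a rough illustration of locating the file on a ramdisk, the sketch below picks a RAM-backed directory when one is present. It assumes a Linux-style system where /dev/shm is a tmpfs mount; that path and the file name foo-ram.db are assumptions for the example, not anything DBM::Deep provides.

```perl
use strict;
use warnings;
use File::Spec;

# Assumption: on many Linux systems /dev/shm is a RAM-backed tmpfs mount.
# Fall back to the ordinary temp directory when it is missing or read-only.
my $ramdisk = '/dev/shm';
my $dir     = ( -d $ramdisk && -w $ramdisk ) ? $ramdisk : File::Spec->tmpdir;

# "foo-ram.db" is a hypothetical file name.
my $path = File::Spec->catfile( $dir, 'foo-ram.db' );
print "DBM::Deep file would live at: $path\n";

# The path would then be handed to the constructor, for example:
#   my $db = DBM::Deep->new( file => $path );
```

Anything kept on a ramdisk disappears on reboot, so this only suits data that can be rebuilt or is copied to permanent storage separately.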
Work at the tightest level possible
It is much faster to assign the level of your db that you are working with to an intermediate variable than to look it up again every time. For example:
    # BAD
    while ( my ($k, $v) = each %{$db->{foo}{bar}{baz}} ) {
        ...
    }

    # GOOD
    my $x = $db->{foo}{bar}{baz};
    while ( my ($k, $v) = each %$x ) {
        ...
    }
Make your file as tight as possible
If you know that you are not going to use more than 65K in your database, consider using the pack_size => 'small' option. This instructs DBM::Deep to use 16-bit addresses, so seek times will be shorter.
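A minimal sketch of passing the option to the constructor follows. The temporary directory and the file name small.db are hypothetical, and the sketch assumes the setting is chosen when the file is first created rather than applied to an existing file.

```perl
use strict;
use warnings;
use DBM::Deep;
use File::Temp qw(tempdir);

# Scratch directory so the example leaves nothing behind.
my $dir = tempdir( CLEANUP => 1 );

# Assumption: pack_size is given at creation time, alongside the file name.
my $db = DBM::Deep->new(
    file      => "$dir/small.db",    # hypothetical file name
    pack_size => 'small',            # 16-bit addresses
);

$db->{counter} = 1;
print "counter: " . $db->{counter} . "\n";
```

With 16-bit addresses the file cannot grow past 65,536 bytes, so this only fits databases that are certain to stay small.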