NAME

DirDB - use a directory as a persistence back end for (multi-level) (blessed) hashes

SYNOPSIS

  use DirDB;
  tie my %session, 'DirDB', "./data/session";
  $session{$sessionID}{email} = get_emailaddress();
  $session{$sessionID}{objectcache}{fribble} ||= fribble->new;

DESCRIPTION

DirDB is a package that lets you access a directory as a hash. The final directory in the path will be created if it does not exist, but the directories leading to it will not.

The empty string, used as a key, will be translated into ' EMPTY' for purposes of storage and retrieval. File names beginning with a space are reserved for metadata for subclasses, such as object type or array size or whatever. Key names beginning with a space get an additional space prepended to the name for purposes of naming the file to store that value.
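The naming rules above can be sketched as a small function. This is only an illustration of the rules as described, not DirDB's actual code; the name key_to_filename is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the naming rules described above.
sub key_to_filename {
    my ($key) = @_;
    return ' EMPTY' if $key eq '';      # the empty string gets a reserved name
    return " $key"  if $key =~ /^ /;    # leading space: prepend one more space
    return $key;                        # everything else is used as-is
}

# Show the mapping for an empty key, an ordinary key, and a key
# that begins with a space.
printf "[%s] -> [%s]\n", $_, key_to_filename($_) for '', 'email', ' LOCK';
```

Under these rules a user key such as ' EMPTY' would be stored as '  EMPTY' (two leading spaces), so it should not collide with the reserved name for the empty string.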

As of version 0.05, DirDB can store hash references. References to tied hashes are recursively copied; references to plain hashes are first tied to DirDB and then recursively copied. Storing a circular hash reference structure will cause DirDB to croak.
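One way a recursive copy can notice a circular structure is to remember which hashes are on the path currently being copied. The following is a minimal sketch of that idea in core Perl, not DirDB's implementation; copy_tree is a hypothetical name, and it copies into memory rather than into a directory.

```perl
use strict;
use warnings;
use Carp qw(croak);
use Scalar::Util qw(refaddr);

# Hypothetical recursive copy that croaks on circular hash references.
sub copy_tree {
    my ($src, $on_path) = @_;
    $on_path ||= {};
    my $id = refaddr($src);
    croak "circular reference detected" if $on_path->{$id};
    $on_path->{$id} = 1;              # this hash is now on the current path
    my %copy;
    for my $key (keys %$src) {
        my $value = $src->{$key};
        $copy{$key} = ref($value) eq 'HASH'
            ? copy_tree($value, $on_path)
            : $value;
    }
    delete $on_path->{$id};           # finished with this hash
    return \%copy;
}

my %loop = (name => 'a');
$loop{self} = \%loop;                 # the hash now contains itself
my $ok = eval { copy_tree(\%loop); 1 };
print "croaked: $@" unless $ok;
```

Because the mark is removed once a hash has been copied, a hash that is merely referenced twice (without a cycle) is copied twice rather than rejected.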

As of version 0.06, DirDB now recursively copies subdirectory contents into an in-memory hash and returns a reference to that hash when a previously stored hash reference is deleted in non-void context.

As of version 0.07, non-HASH references are stored using Storable.

As of version 0.08, non-HASH references cause croaking again: the Storable functionality has been moved to DirDB::Storable.

As of version 0.10, DirDB stores and retrieves blessed hash references, blessing them back into the package they were in when they were stored.

DirDB will croak if it can't open an existing file system entity.

 tie my %d => 'DirDB', '/tmp/foodb';
 
 $d{ref1}->{ref2}->{ref3}->{ref4} = 'something'; 
 # 'something' is now stored in /tmp/foodb/ref1/ref2/ref3/ref4
 
 my %e = (1 => 2, 2 => 3);
 $d{e} = \%e;
 # %e is now tied to /tmp/foodb/e, and 
 # /tmp/foodb/e/1 and /tmp/foodb/e/2 now contain 2 and 3, respectively

 $d{f} = \%e;
 # like `cp -R /tmp/foodb/e /tmp/foodb/f`

 $e{destination} = 'Kashmir';
 # sets /tmp/foodb/e/destination
 # leaves /tmp/foodb/f alone
 
 my %g = (1 => 2, 2 => 3);
 $d{g} = {%g};
 # %g has been copied into /tmp/foodb/g/ without tying %g.
 

Named pipes and other special files are opened for reading and read from on FETCH, and clobbered on STORE.

The underlying object is a scalar containing the path to the directory. Keys are names within the directory, values are the contents of the files.

STOREMETA and FETCHMETA methods are provided for subclasses that wish to store and fetch metadata (such as array size) which will not appear in the data returned by NEXTKEY and which cannot be accessed directly through STORE or FETCH. Currently one metadatum, 'BLESS', is used to indicate what package to bless a tied hashref into.

storing and retrieving blessed objects

Blessed objects can now be stored, as long as their underlying representation is a hash. This may change. The root of a DirDB tree will not get blessed, but all blessed hash-reference branches will be blessed on fetch into the package they were in when stored.
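The idea of recording a package name at store time and re-blessing at fetch time can be sketched in core Perl. This is not DirDB's code: the functions store_node and fetch_node and the in-memory 'meta'/'data' layout are assumptions made for illustration, standing in for a directory and its 'BLESS' metadatum.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical in-memory stand-in for one stored directory:
# 'meta' plays the role of metadata, 'data' the role of ordinary keys.
sub store_node {
    my ($hashref) = @_;
    my %node = (meta => {}, data => { %$hashref });
    my $class = blessed($hashref);
    $node{meta}{BLESS} = $class if defined $class;   # remember the package
    return \%node;
}

# Rebuild the hash and bless it again if a package name was recorded.
sub fetch_node {
    my ($node) = @_;
    my $copy  = { %{ $node->{data} } };
    my $class = $node->{meta}{BLESS};
    return defined $class ? bless($copy, $class) : $copy;
}

my $stored  = store_node(bless({ colour => 'red' }, 'Fribble'));
my $fetched = fetch_node($stored);
print ref($fetched), " / $fetched->{colour}\n";
```

Keeping the package name apart from the ordinary keys means it never shows up when iterating over the stored data, which matches the role the text gives to metadata.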

RISKS

stale lock risk

"mkdir locking" is used to protect incomplete directories from being accessed while they are being written. It is conceivable that your program might catch a signal and die while inside a critical section. If this happens, a simple

    find /your/data -type d -name ' LOCK*'

at the command line will identify what you need to delete.

Only the very end of the write operation is protected by the locking: during a write, other processes will be able to read the old data. They will also be able to start their own overwrites.

DirDB attempts to guarantee that written data is complete (not partial).

DirDB does not attempt to guarantee atomicity of updates.
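The general "mkdir locking" idiom relies on mkdir failing when the directory already exists, so only one process at a time can create the lock directory. The following is a minimal sketch of that idiom in core Perl, not DirDB's own code; the helper name with_mkdir_lock, the retry count and delay, and the lock name ' LOCKdemo' are all assumptions for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Time::HiRes qw(sleep);

# Hypothetical helper: run $code while holding a lock directory.
sub with_mkdir_lock {
    my ($lockdir, $code) = @_;
    my $tries = 50;
    until (mkdir $lockdir) {          # fails if the directory already exists
        die "could not lock $lockdir: $!" unless $tries--;
        sleep 0.1;                    # someone else holds the lock; retry
    }
    my @result = eval { $code->() };
    my $err = $@;
    rmdir $lockdir;                   # release the lock even if $code died
    die $err if $err;
    return @result;
}

my $dir = tempdir(CLEANUP => 1);
with_mkdir_lock("$dir/ LOCKdemo", sub { print "in critical section\n" });
print "lock released\n" unless -d "$dir/ LOCKdemo";
```

A process that is killed between the mkdir and the rmdir never reaches the cleanup step, which is how a stale lock directory of the kind described above can be left behind.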

unexpected persistence

Untied hash references assigned into a DirDB tied hash will become tied to the file system at the point they are first assigned. This has the potential to cause confusion.

Tied hash references are recursively copied. This includes hash references tied due to being assigned into a DirDB tied hash.

EXPORT

None by default.

AUTHOR

David Nicol, davidnicol@cpan.org

Assistance

Version 0.04 QA was provided by members of Kansas City Perl Mongers, including Andrew Moore and Craig S. Cottingham.

LICENSE

GPL/Artistic (the same terms as Perl itself)

SEE ALSO

Read perltie before trying to extend this module.

DirDB::Storable uses Storable for storing and retrieving arbitrary types

DirDB::FTP provides complete DirDB function over the FTP protocol

Tie::Dir is concerned with accessing stat information, not file contents