NAME

DBIx::FileStore - Module to store files in a DBI backend

VERSION

Version 0.05

SYNOPSIS

Ever wanted to store files in a database? Yeah, it's probably a bad idea, but maybe you want to do it anyway.

This code helps you do that.

Internally all the fdb tools in script/ use this library to get at file names and contents in the database.

To get started, see the files QUICKSTART.txt and README from the DBIx-FileStore distribution. This document details the module's implementation.

FILENAME NOTES

The name of the file in the filestore cannot contain spaces.

The maximum length of the name of a file in the filestore is 75 characters.

You can store files under any name you wish in the filestore. The name need not correspond to the original name on the filesystem.

All filenames in the filestore are in one flat address space. You can use / in filenames, but it does not represent an actual directory. (Although fdbls has some support for viewing files in the filestore as if they were in folders. See the docs on 'fdbls' for details.)

IMPLEMENTATION CAVEAT

NOTE THAT THIS IS A PROOF-OF-CONCEPT DEMO.

THIS WAS NOT DESIGNED AS A PRODUCTION EXAMPLE!!!!!

IN PARTICULAR, a production design wouldn't have one row in the 'files' table for each block in the 'fileblocks' table; it would have one row per file.

It would also probably use a unique ID to address the blocks in the fileblocks table, instead of the 'name' field that's currently used.

That having been said, this example works quite nicely, and altering the DB schema and code would not be a large effort.

IMPLEMENTATION

The data is stored in the database using two tables: 'files' and 'fileblocks'. All meta-data is stored in the 'files' table, and the file contents are stored in the 'fileblocks' table.

fileblocks table

The fileblocks table has only three fields:

name

The name of the block. Always looks like "filename.txt <BLOCKNUMBER>", for example "filestorename.txt 00000".

block

The contents of the named block. Each block can currently be a maximum of 512K.

timestamp

The timestamp of when this block was inserted into the DB or updated.
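The block naming and size limit described above can be sketched as follows. This is only an illustration, not the module's code: the helper names block_name and split_into_blocks are hypothetical, the five-digit zero-padded suffix is inferred from the "filestorename.txt 00000" example, and the sketch assumes an empty file produces no blocks.

```perl
use strict;
use warnings;

# Assumed block size for this sketch; the text gives 512K as the maximum.
my $BLOCK_SIZE = 512 * 1024;

# Hypothetical helper: build a block name like "filestorename.txt 00000".
sub block_name {
    my ($fdbname, $block_num) = @_;
    return sprintf("%s %05d", $fdbname, $block_num);
}

# Hypothetical helper: split content into [name, block] pairs,
# one pair per row that would go into the fileblocks table.
sub split_into_blocks {
    my ($fdbname, $content, $size) = @_;
    $size ||= $BLOCK_SIZE;
    my @rows;
    my $num = 0;
    for (my $off = 0; $off < length($content); $off += $size) {
        push @rows, [ block_name($fdbname, $num++), substr($content, $off, $size) ];
    }
    return @rows;
}

# Usage: a tiny block size makes the splitting visible.
my @rows = split_into_blocks("filestorename.txt", "x" x 10, 4);
print "$_->[0] (", length($_->[1]), " bytes)\n" for @rows;
```

With a block size of 4, the 10-byte content yields three blocks numbered 00000 through 00002, the last one holding the remaining 2 bytes.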

files table

The files table has several fields. There is one row in the files table for each row in the fileblocks table, not one per file (see IMPLEMENTATION CAVEAT, above). The fields in the files table are:

name

The name of the block, exactly as used in the fileblocks table. Always looks like "filename.txt <BLOCKNUMBER>", for example "filestorename.txt 00000".

c_len

The content length of the whole file (sum of length of all relevant blocks).

b_num

The number of the block this row represents. The b_num is repeated as a five digit number at the end of the name field (see above).

b_md5

The md5 checksum for the block (b is for 'block') represented by this row.

c_md5

The md5 checksum for the whole file (c is for 'content') represented by this row.

lasttime

The timestamp of when this row was inserted into the DB or updated.

See the file 'table-definitions.sql' for more details about the db schema used.
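To make the per-block layout of the files table concrete, here is a rough sketch that builds one metadata record per block. The helper name files_rows is hypothetical, and hex-encoded digests are an assumption (the schema file is the authority on the real encoding). The lasttime field is left out on the assumption that the database fills it in.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: given a filestore name and the block contents,
# return one hashref per block with the fields described above.
sub files_rows {
    my ($fdbname, @blocks) = @_;
    my $content = join '', @blocks;
    my $c_len   = length($content);    # length of the whole file
    my $c_md5   = md5_hex($content);   # checksum of the whole file
    my @rows;
    for my $b_num (0 .. $#blocks) {
        push @rows, {
            name  => sprintf("%s %05d", $fdbname, $b_num),
            c_len => $c_len,
            b_num => $b_num,
            b_md5 => md5_hex($blocks[$b_num]),   # checksum of this block only
            c_md5 => $c_md5,
        };
    }
    return @rows;
}

# Usage: two blocks give two rows that share c_len and c_md5.
for my $row (files_rows("filestorename.txt", "hello ", "world")) {
    print join(" | ", map { "$_=$row->{$_}" } qw(name b_num c_len b_md5 c_md5)), "\n";
}
```

Note how c_len and c_md5 are repeated in every row; this is the duplication the IMPLEMENTATION CAVEAT section refers to.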

METHODS

my $filestore = DBIx::FileStore->new()

Returns a new DBIx::FileStore object.

my $bytecount = $filestore->read_from_db( "filesystemname.txt", "filestorename.txt" );

Copies the file 'filestorename.txt' from the filestore to the file 'filesystemname.txt' on the local filesystem.

my $bytecount = $filestore->_read_blocks_from_db( $callback_function, $fdbname );

** Intended for internal use by this module. **

Fetches the blocks from the database for the file stored under $fdbname, and calls the $callback_function on each one after it is read.

Locks the relevant tables while data is extracted. Locking should probably be configurable by the caller.

It also confirms that the MD5 checksums for each block and for the file contents as a whole are correct. Dies with an error if a checksum doesn't match.
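The callback-and-verify pattern described here might look roughly like the sketch below. It uses an in-memory list of rows in place of the database and does no locking. The function name read_blocks, the row layout, and the hex digest encoding are all assumptions for illustration. The order in which the real method verifies checksums and invokes the callback may differ.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical stand-in for the database read: each row is a hashref
# holding one block plus the checksums stored alongside it.
sub read_blocks {
    my ($callback, @rows) = @_;
    my $whole = Digest::MD5->new;   # running checksum over all blocks
    my $bytes = 0;
    for my $row (@rows) {
        die "block checksum mismatch for '$row->{name}'\n"
            unless md5_hex($row->{block}) eq $row->{b_md5};
        $whole->add($row->{block});
        $callback->($row->{block});  # hand the verified block to the caller
        $bytes += length($row->{block});
    }
    if (@rows) {
        die "content checksum mismatch\n"
            unless $whole->hexdigest eq $rows[0]{c_md5};
    }
    return $bytes;
}

# Usage: build two consistent rows, then collect the content via a callback.
my @blocks = ("hello ", "world");
my $c_md5  = md5_hex(join '', @blocks);
my @rows;
for my $i (0 .. $#blocks) {
    push @rows, {
        name  => sprintf("demo.txt %05d", $i),
        block => $blocks[$i],
        b_md5 => md5_hex($blocks[$i]),
        c_md5 => $c_md5,
    };
}
my $out   = '';
my $bytes = read_blocks(sub { $out .= $_[0] }, @rows);
print "read $bytes bytes: $out\n";
```

Because the whole-file checksum can only be confirmed after the last block, a caller that writes blocks out as they arrive should treat the output as unverified until the call returns.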

my $bytecount = $filestore->write_to_db( $localpathname, $filestorename );

Copies the file $localpathname from the filesystem to the name $filestorename in the filestore.

Locks the relevant tables while data is inserted. Locking should probably be configurable by the caller.

Returns the number of bytes written. Dies with a message if the source file could not be read.

Note that it currently reads the file twice: once to compute the MD5 checksum before inserting it, and a second time to insert the blocks.
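The two-pass approach mentioned in the note can be sketched like this. It is not the module's implementation: two_pass_write is a hypothetical name, the $store_block callback stands in for the database insert, and the hex digest and the 512K default are assumptions taken from the descriptions above.

```perl
use strict;
use warnings;
use Digest::MD5;
use File::Temp qw(tempfile);

my $BLOCK_SIZE = 512 * 1024;   # assumed default, per the fileblocks description

# Hypothetical helper: $store_block->($b_num, $block, $c_md5) stands in
# for inserting one block into the database.
sub two_pass_write {
    my ($path, $store_block, $block_size) = @_;
    $block_size ||= $BLOCK_SIZE;

    # Pass 1: checksum the whole file so every row can carry it.
    open my $fh, '<:raw', $path or die "Can't read $path: $!\n";
    my $c_md5 = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;

    # Pass 2: read the file again, one block at a time.
    open $fh, '<:raw', $path or die "Can't read $path: $!\n";
    my ($bytes, $b_num) = (0, 0);
    while (my $n = read($fh, my $buf, $block_size)) {
        $store_block->($b_num++, $buf, $c_md5);
        $bytes += $n;
    }
    close $fh;
    return $bytes;
}

# Usage: write a small temp file, then "store" it with a 4-byte block size.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "abcdefghij";
close $tmp_fh;

my @stored;
my $written = two_pass_write($tmp_path, sub { push @stored, [@_] }, 4);
print "wrote $written bytes in ", scalar(@stored), " blocks\n";
```

The first pass is needed because each row carries the whole-file checksum, which is not known until the entire file has been read.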

FUNCTIONS

my $filename_ok = DBIx::FileStore::name_ok( $fdbname )

Checks that the name $fdbname is acceptable for use as a name in the filestore. The name must not contain spaces or be longer than 75 characters.
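A minimal sketch of the two documented rules, for illustration only. The name sketch_name_ok is hypothetical and this is not the module's own name_ok. Rejecting undefined or empty names is an extra assumption that the documentation does not state.

```perl
use strict;
use warnings;

my $MAX_NAME_LENGTH = 75;   # limit given in FILENAME NOTES

# Hypothetical re-implementation of the documented rules.
sub sketch_name_ok {
    my ($fdbname) = @_;
    # Assumption: undefined or empty names are not acceptable.
    return 0 unless defined($fdbname) && length($fdbname);
    return 0 if $fdbname =~ / /;                        # no spaces
    return 0 if length($fdbname) > $MAX_NAME_LENGTH;    # at most 75 characters
    return 1;
}

# Usage: slashes are fine, spaces and over-long names are not.
for my $name ("dir/filestorename.txt", "has space.txt", "x" x 76) {
    printf "%-25.25s %s\n", $name, sketch_name_ok($name) ? "ok" : "rejected";
}
```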

AUTHOR

Josh Rabinowitz

SUPPORT

You should probably read the documentation for the various filestore command-line tools:

  fdbcat, fdbcp, fdbget, fdbls, fdbmv, fdbput, fdbrm, fdbstat, and fdbtidy.
  fdbslurp (which is the reverse of fdbcat) was not completed.

LICENSE AND COPYRIGHT

Copyright 2010 Josh Rabinowitz.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.