DBIx::TableHash - Tie a hash to a mysql table + SQL utils

        use DBIx::TableHash;
        my $DBHash = DBIx::TableHash->create_or_die
        (my $Params =
             DBIDriver      => 'mysql', 
             Database       => 'mydatabase',
             HostName       => 'localhost', 
             Port           => undef,
             Login          => '',
             Password       => '',

             TableName      => 'SalesPeople',
             KeyField       => 'FullName',  

             ## For multi-key lookup:
             FixedKeys      => {AreaCode   => 415,
                                StatusCode => 'Active',
                                RecordType => 'Primary'},
             ## To retrieve a single value:
             ValueField     => 'PhoneNumber',

             ## ... or for "multi-value" retrieval...
             ValueField     => undef,

             ## ... optionally specifying...
             RetrieveFields => [qw(Title Territory Quota)],
             ## For caching:
             CacheMode => 'CacheBeforeIterate'
             ## or...
             CacheMode => 'CacheOneTime'
             ## or...
             CacheMode => 'CacheNone'

        my %DBHash; tie(%DBHash, 'DBIx::TableHash', $Params);

        my $DBHash = DBIx::TableHash->create($Params) or die "Help!";
        my $DBHash = DBIx::TableHash->create_or_die($Params);

        my $DBHash = DBIx::TableHash->create_copy($Params) or die "Help!";
        my $DBHash = DBIx::TableHash->create_copy_or_die($Params);

    All parameters are passed via a single anonymous hash.

    All parameters are optional, but you'll almost always need to specify
    Database, TableName, and KeyField.

    Omitting ValueField puts the hash in "multi-value" mode, where you
    store/retrieve a hash of fields/values instead of a single value. In
    "multi-value" mode all fields in each record are retrieved on every
    fetch; RetrieveFields limits fields retrieved to a specified list.

    Specifying FixedKeys puts the hash in "multi-key" mode, in which only a
    subset of the database table, corresopnding to records that match the
    spec in FixedKeys, is operated on.

    Cache modes reduce querying, but lose synchronization and hog memory.

    The object is designed to be easy subclass. Try making a subclass that
    sets defaults for all or most of the parameters, so the caller doesn't
    have to supply any at instantiation time.

    "create_copy" methods efficiently create and return potentially huge
    untied hash "snapshot" of the same data that would have been retrieved
    by the corresponding tied hash.

    The DBHash object is designed to tie a hash to a table or a subset of
    records in a table in a DBI database (only tested with mysql in the
    current version, but expected to work with any vendor).

    If the table only has a single KeyField, which this modules assumes to
    be a unique key field in the table, then the hash keys are stored and
    retrieved in that field, and the values are saved in and returned from
    the ValueField. Records are automatically created and deleted as the
    hash is used like any other hash. (If the table is read-only, be sure
    not to try to store into the tied hash!)

    To access only a subset of the records in a table, you may specify hash
    of "FixedKeys", which is a hash mapping OTHER field names to fixed
    values that those fields must have for all lookups, updates, and inserts
    done via the hash interface. This sets up a virtual hash corresponding
    to a subset of the table where N key fields are fixed at given values
    and a different key field may vary.

    There are several ways to use this module.

    Quick and dirty mode (single-key, single-value) using tie:

        use DBIx::TableHash;
        my %PhoneNumbers;
        tie (%PhoneNumbers, 'DBIx::TableHash', 
             {Database      => 'mydatabase', 
              TableName     => 'SalesPeople',
              KeyField      => 'FullName',
              ValueField    => 'PhoneNumber'})
            or die "Failed to connect to database";

    Then you can use %PhoneNumbers like any hash mapping FullName to
    PhoneNumber. Any retrieval of data results in corresponding SQL queries
    being made to the database. Modifying the hash modifies the database.

    Even quicker mode using create():

    For convenience, you can use the create() class method to do the tying
    for you. It creates an anonymous hash, ties it and, returns it. It takes
    the same parameters as new() and tie().

        use DBIx::TableHash;
        my $PhoneNumbers = DBIx::TableHash->create(......)
            or die "Failed to connect to database";

    Quicker still using create_or_die():

        use DBIx::TableHash;
        my $PhoneNumbers = DBIx::TableHash->create_or_die(......);

    create() carps and returns undef if it can't connect to the database.

    create_or_die() croaks (dies) your program, with the same error message
    as create() would have given.

    Normally, create() will carp() (warn) you with an error message upon
    failure, and return undef.

    If you would have handled your error by saying "or die", and ar
    comfortable with create's error message rather than your own, then
    create_or_die() is for you.

    Using one of the create() methods instead of new() and tie() works with
    all of the different modes discussed below, and all parameters are the
    same either way.

    Cooler subclassing mode:

    You can create a simple subclass that provides default parmas in an
    initialize method so they don't have to be provided by the caller ...

        ### MyCompany/


        use strict;

        package   MyCompany::SalesPhoneHash;
        use       vars qw(@ISA);
        use       DBIx::TableHash;
        @ISA = qw(DBIx::TableHash);

        sub initialize
            my $this = shift;

            $this->{Database}   ||= 'mydatabase';   ## Name of database to connect to.
            $this->{TableName}  ||= 'SalesPeople';  ## Table in which to store the data
            $this->{KeyField}   ||= 'FullName';     ## Name of the key field
            $this->{ValueField} ||= 'PhoneNumber';  ## Name of the value field.


    Then to use the object, your script merely does:

        use MyCompany::SalesPhoneHash;
        my %PhoneNumbers = MyCompany::SalesPhoneHash->create_or_die();

    Of course, when instantiating a subclass, if you wish you can still
    override any parameters you wish, as long as the initialize() method in
    the subclass uses ||= rather than = to set defaults for any unspecified

    Multi-key mode:

    You may also use the "FixedKeys" parameter to specify a hash of some
    additional key fields and their fixed values that must match exactly for
    any records that are retrieved, deleted, or created by the tied object,
    effectively allowing the hash to operate on only a subset of the data in
    the database. This is typically helpful in a multi-keyed table where,
    for the purposes of your script, all key values should be fixed except
    one (and that one is the hash key).

        use DBIx::TableHash;
        my $PhoneNumbers = 
                              {Database     => 'mydatabase', 
                               TableName    => 'SalesPeople',
                               KeyField     => 'FullName',
                               ValueField   => 'PhoneNumbers',
                               FixedKeys        =>
                               {AreaCode        => 415,
                                StatusCode      => 'Active',
                                RecordType      => 'Primary'}});

    Multi-value mode:

    If instead of getting and setting a single value, you'd like to get or
    set a hash of all fields in the record, simply don't specify ValueField,
    and the object will use "multi-value" mode, where an entire record, as a
    hash, is gotten or set on each fetch or store. Feel free to combine this
    mode with multi-key mode.

    When storing a record in multi-value mode, if the record already exists,
    only the specified fields are overwritten. If it did not already exist,
    then only the specified fields will be written and the others will be
    NULL or defaulted according to the table schema.

    When storing a record in multi-value mode, you can't change the values
    of the primary key field or any other key field specified in FixedKeys
    (if any), since that would mess up the whole point of this module which
    is to leave the main key and fixed keys fixed while mucking with the
    other values in the record. Any changed values in key fields are simply

        use DBIx::TableHash;
        my $SalesPeopleTable = 
                              {Database     => 'mydatabase', 
                               TableName    => 'SalesPeople',
                               KeyField     => 'FullName'});

        my $SalesPersonFullName = "Joe Jones";
        my $EntireRecord = $SalesPeopleTable->{$SalesPersonFullName};

    When fetching records in multi-value mode, you can limit the list of
    returned fields to a subset of all available fields in case there might
    be some very big ones that you don't want to waste bandwidth getting.
    Just set the RetrieveFields parameter to an anonymous list of the fields
    you care to retrieve. (This setting does not limit the fields you can
    SET, just the ones that get retrieved.)

        use DBIx::TableHash;
        my $SalesPeopleTable = 
                              {Database     => 'mydatabase', 
                               TableName    => 'SalesPeople',
                               KeyField     => 'FullName',
                               RetrieveFields=> [qw(Territory Quota)]});


    In multi-value mode, you might expect that this:

        $Hash->{$MyKey}->{FieldX} = 'foo';

    would set the value of the FieldX field in the appropriate record in the
    database. IT DOES NOT.

    This is because the anonymous hash returned by $Hash->{$MyKey} is not in
    any way tied back to the database. You'd have to retrieve the record
    hash, change any value in it, and then set $Hash->{$MyKey} back to it.

    Making the above syntax work with a multi-valued tied hash to set a
    value in the database is a possible future enhancement under
    consideration by the author. Let me know if you would like to have that

    In the meanwhile, here's how you do could do it:

        (my $Record = $Hash->{$MyKey})->{FieldX} = 'foo';
        $Hash->{$MyKey}->{FieldX} = $Record;

    WARNING: If you use the above approach to update a record in multi-value
    mode, beware that there's potentially a race condition in the above code
    if someone else updates the same record after you've copied it but
    before you've modified and set it. So use this technique with caution
    and understanding. If in doubt, don't use this module and instead use an
    SQL query to update the record in a single transaction. Only you know
    the usage patterns of your database, the concurrency issues, and the
    criticality of errors.

    Caching modes:

    The object has several ways it can cache data to help minimize the
    number of SQL queries to the database, at the expense of potentially
    dramatically increased memory usage. The following cache parameters can
    be specified to enable caching:

        CacheMode => 'CacheBeforeIterate'

        CacheMode => 'CacheOneTime'

    To disable caching, specify:

        CacheMode => 'CacheNone'

    (You can also assign undef to CacheMode, but you'll get warnings.)

    Normally, every time you fetch a value from the hash, it makes an SQL
    query to the database. This, of course, is the intended and normal mode
    of operation.

    Unfortunately, in Perl, just calling values(%Hash) or each(%Hash) or
    even copying the hash with {%Hash} results in a separate FETCH, and
    consequently, a separate SQL query made by this module, for each item.
    This could result in thousands of queries just to fetch all the values
    from a thousand-item table in the database.

    However, often you want to iterate over all the elements of a hash
    without it having to go back to the database and issue another query for
    each item that you retrieve.

    Using the 'CacheBeforeIterate' mode, all keys and values are cached upon
    each call to FIRSTKEYS (i.e. at the start of any iteration or
    enumeration). Then, any subsequent calls to FETCH data from the hash
    retrieve it from the cache instead of doing an SQL query. STORING or
    DELETING any items from the hash results in them being stored and
    deleted from both the database and the cache.

    Using the CacheOneTime mode, the full cache is built at object
    instantiation and time never fully rebuilt. In fact, its contents never
    change unless you make alterations by using it to store into and/or
    delete from the database.

    CACHE WARNING: With both caching modes, of course, you must be
    comfortable with the fact that the data being retrieved is a "snapshot"
    of the database and consequently will not reflect updates done by other
    parties during the lifetime of the object; it will only reflect updates
    that you make by storing or deleting values from it. If other people are
    using the database simultaneously, your cache and the actual data could
    "drift" out of agreement. This is mainly dangerous to you, not others,
    unless you then go make updates to the data based on potentially
    outdated values.

    All modes may be combined...

    All modes and parameters are orthogonal, so any combination of
    parameters may be specified, with the exception that the RetrieveFields
    parameter is only meaningful when ValueField is not unspecified.

    With subclassing, you may create objects that pre-specify any
    parameters, even those that affect the major modes of operation. For
    example, you may combine the subclassing technique and the multi-key
    mode to make an object that accesses only the appropriate subset of a
    multi-keyed table without requiring any parameters to be supplied by the

    Getting a COPY of the data instead of a tied hash:

    What if you just want a big copy -- a snapshot -- of the data in the
    table, in a regular old hash that's no longer tied to the database at
    all? (Memory constraints be damned!)

    Just use the create_copy() or create_copy_or_die() methods. They work
    just like create() and create_or_die(), but instead of returning a tied
    object, they just return a potentially huge hash containing a copy of
    all the data.

    In other words:

       create_copy()        is equivalent to: {%{create() || {} }}
       create_copy_or_die() is equivalent to: {%{create_or_die()}}

    ... but the _copy methods are more efficient because internally, a
    caching mode is used to minimize the queries to the database and
    generate the hash as efficiently as possible.

    In all other respects, create_copy() and create_copy_or_die() perform
    exactly like their non-copying namesakes, taking all the same
    parameters, except CacheMode which is not relevant when making a static

    Please remember that the object returned by the _copy methods is no
    longer tied to the database.

    The full list of recognized parameters is:

    DBI Parameters

        Param       Default         Description
        DBIDriver   'mysql'         Name of DBI driver to try to use (only
                                        mysql has currently been tested by the

        HostName    'localhost'     Host name containing the database and table;

        Port        undef           Port number if different from the standard.
        Login       ''              Login to use when connecting, if any.

        Password       ''               Password to use when connecting, if any.

    SQL Parameters

        Param       Default         Description
        Database    ''              Name of database to connect to.

        TableName   ''              Table to connect to.

        KeyField    ''              Name of field in which lookup key is found.

        ValueField  ''              Name of field to pull value from.
                                    If empty or undef, then a
                                    multi-value hash is used both for
                                    saving and retrieving.  This is
                                    called "multi-value mode".

    Module Parameters

        Param       Default         Description
        FixedKeys   {}              If supplied, gives names and
                                    fixed, hardcoded values that other
                                    keys in the table must have; this
                                    effectively limits the scope of
                                    the tied hash from operating over
                                    the entire table to operating over
                                    just the subset of records that
                                    match the values in FixedKeys.
                                    This is called "multi-key mode".

        RetrieveFields  []          In multi-value mode, limits the
                                    fields that are retrieved; default
                                    is all fields in the record.

    I am unable to provide any technical support for this module. The whole
    reason I had to make it was that I was way too busy (lazy?) to write all
    that SQL code...

    But you are encouraged to send patches, bug warnings, updates, thanks,
    or suggestions for improvements to the author as listed below.

    Just be aware that I may not have time to respond. Please be sure to put
    the name of this module somewhere in the Subject line.

    The code is a pretty simple tied hash implementation, so you're on your
    own to debug it. If you're having trouble debugging via the "tie"
    interface, try instantiating an object directly (or retrieving it when
    you tie (see perltie)) and calling its methods individually. Use the
    debugger or Data::Dumper to dump intermediate values at key points, or
    whatever it takes. Use your database server logs if you want to see what
    SQL code is getting generated. Or contribute a debugging mode to this
    module which prints out or logs the SQL statements before executing

    Problem: If you iterate or enumerate the hash, all keys get pulled in
    from the database and stay stored in memory for the lifetime of the
    object. FIRSTKEY, which is called every time you do a keys(), each() or
    any full iteration or enumeration over the tied hash (such as copying
    it) retrieves and hangs on to a full list of all keys in KeyField. If
    the keys are long or there are lots of them, this could be a memory
    problem. (Don't confuse this with CacheMode in which BOTH keys AND
    values are stored in memory.)


        1) Don't iterate or enumerate.  Just fetch and store.
        2) Only iterate or enumerate on short tables. 
        3) LValue or RValue hash slices should be safe to do.

    Using CPAN module:

        perl -MCPAN -e 'install DBIx::TableHash'

    Or manually:

        tar xzvf DBIx-TableHash*gz
        cd DBIx-TableHash-?.??
        perl Makefile.PL
        make test
        make install

    The DBIx::TableHash home page:

    The implementation in

    The perlref and perltie manual pages.

    The mysql home page:

    Thanks to Mark Leighton Fisher <> for providing a patch
    to fix -w support (change in standard "none" setting of CacheMode from
    undef to CacheNone).

    Chris Thorman <>

    Copyright (c) 1995-2002 Chris Thorman. All rights reserved.

    This program is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.