
NAME

Tie::RDBM::Cached - Tie hashes to relational databases.

SYNOPSIS

DESCRIPTION

In addition to the functionality of Tie::RDBM, this module provides one of two caching methods for fast storage and retrieval of data. Users could easily achieve the same result without resorting to this module. I wrote it because I like the hash interface, and once written it is useful forever.

For more information, please see the documentation for Tie::RDBM. I document here only where this module adds functionality to the base class or deviates from base class usage.

TIEING A DATABASE

   tie %VARIABLE,Tie::RDBM::Cached,DSN [,\%OPTIONS]

You tie a variable to a database by providing the variable name, the tie interface (always "Tie::RDBM::Cached"), the data source name, and an optional hash reference containing various options to be passed to the module and the underlying database driver.

The data source may be a valid DBI-style data source string of the form "dbi:driver:database_name[:other information]", or a previously-opened database handle. See the documentation for DBI and your DBD driver for details. Because the initial "dbi" is always present in the data source, Tie::RDBM::Cached will automatically add it for you.
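The automatic prefixing described above can be pictured with a short sketch. This is an illustration of the described behaviour only, not the module's actual code; normalize_dsn is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: prepend "dbi:" when the caller left it off.
# A sketch of the behaviour described above, not the module's source.
sub normalize_dsn {
    my ($dsn) = @_;
    return $dsn if ref $dsn;    # an already-opened handle is passed through
    return ($dsn =~ /^dbi:/i) ? $dsn : "dbi:$dsn";
}

print normalize_dsn('mysql:test'), "\n";        # dbi:mysql:test
print normalize_dsn('dbi:mysql:test'), "\n";    # dbi:mysql:test
```

Under this assumption, both spellings of the data source end up as the same string before it reaches DBI.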

The options hash contains a set of option/value pairs. If not provided, defaults are assumed. The options with defaults are:

user ['']

Account name to use for database authentication, if necessary. Default is an empty string (no authentication necessary).

password ['']

Password to use for database authentication, if necessary. Default is an empty string (no authentication necessary).

db ['']

The data source, if not provided in the argument. This allows an alternative calling style:

   tie(%h,Tie::RDBM,{db=>'dbi:mysql:test',create=>1});

table ['pdata']

The name of the table in which the hash key/value pairs will be stored.

key ['pkey']

The name of the column in which the hash key will be found. If not provided, defaults to "pkey".

value ['pvalue']

The name of the column in which the hash value will be found. If not provided, defaults to "pvalue".

frozen ['pfrozen']

The name of the column that stores the boolean information indicating that a complex data structure has been "frozen" using Storable's freeze() function. If not provided, defaults to "pfrozen".

NOTE: if this field is not present in the database table, or if the database is incapable of storing binary structures, Storable features will be disabled.
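To make the role of the frozen column concrete, here is a minimal sketch of how a value and its flag might be prepared for storage and restored afterwards. It uses Storable's real freeze() and thaw() functions; the helper names encode_value and decode_value are hypothetical and are not part of Tie::RDBM or this module.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical helper: return (stored_value, frozen_flag) for a hash value.
# References are serialized with freeze(); plain scalars are stored as-is.
sub encode_value {
    my ($value) = @_;
    return ref($value) ? (freeze($value), 1) : ($value, 0);
}

# Hypothetical helper: reverse encode_value() using the frozen flag.
sub decode_value {
    my ($stored, $frozen) = @_;
    return $frozen ? thaw($stored) : $stored;
}

my ($stored, $frozen) = encode_value({ colour => 'red', sizes => [1, 2, 3] });
my $copy = decode_value($stored, $frozen);
print "frozen=$frozen colour=$copy->{colour}\n";    # frozen=1 colour=red
```

The flag is what tells the reader of a row whether the value column holds a plain string or a serialized structure, which is why Storable features cannot work without it.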

create [0]

If set to a true value, allows the module to create the database table if it does not already exist. The module emits a CREATE TABLE command and gives the key, value and frozen fields the data types most appropriate for the database driver (from a lookup table maintained in a package global, see DATATYPES below).

The success of table creation depends on whether you have table create access for the database.

The default is not to create a table. In that case, if the table does not exist, tie() will fail with a fatal error.

drop [0]

If the indicated database table exists, but does not have the required key and value fields, Tie::RDBM can try to add the required fields to the table. Currently it does this by the drastic expedient of DROPPING the table entirely and creating a new empty one. If the drop option is set to true, Tie::RDBM will perform this radical restructuring. Otherwise tie() will fail with a fatal error. "drop" implies "create". This option defaults to false.

A future version of Tie::RDBM may implement a less radical restructuring method; differences in DBI drivers and database capabilities make this task harder than it would seem.

autocommit [1]

If set to a true value, the "autocommit" option causes the database driver to commit after every SQL statement. If set to a false value, nothing is committed to the database until you explicitly call the Tie::RDBM::Cached commit() method. Because of the way the cache works, this option does not mean that a value is inserted into the database every time you add one to the tied hash.

The autocommit option defaults to true.

DEBUG [0]

When the "DEBUG" option is set to a true value the module will echo the contents of SQL statements and other debugging information to standard error.

cache_type ['HASH'|'BERKELEYDB']

You will eventually have a choice between using a HASH or a BerkeleyDB file as the cache. Both offer a good speed improvement.

cache_size [0]

This option allows you to specify the size the cache is allowed to grow to before it is automatically committed to the database. I am not yet sure how I am going to offer this when using BerkeleyDB.
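One way to picture the behaviour of cache_size is a write-back buffer that flushes itself once it holds more than a given number of entries. The sketch below illustrates that general technique under stated assumptions and is not the module's implementation: the package name Sketch::WriteBack is hypothetical, a plain hash stands in for the database table, and treating the size as an entry count is a guess.

```perl
use strict;
use warnings;

package Sketch::WriteBack;    # hypothetical name, for illustration only

sub new {
    my ($class, %args) = @_;
    return bless {
        cache      => {},                       # pending writes
        backend    => {},                       # stands in for the database table
        cache_size => $args{cache_size} || 0,   # assumed to be an entry count
        flushes    => 0,                        # how many times we wrote through
    }, $class;
}

sub store {
    my ($self, $key, $value) = @_;
    $self->{cache}{$key} = $value;
    # Flush once the buffer has grown past the configured size.
    $self->flush if scalar(keys %{ $self->{cache} }) > $self->{cache_size};
    return;
}

sub fetch {
    my ($self, $key) = @_;
    # Pending writes take priority over what has already been written through.
    return exists($self->{cache}{$key}) ? $self->{cache}{$key}
                                        : $self->{backend}{$key};
}

sub flush {
    my ($self) = @_;
    return unless %{ $self->{cache} };
    # Copy every pending pair to the backend in one pass, then empty the buffer.
    @{ $self->{backend} }{ keys %{ $self->{cache} } } = values %{ $self->{cache} };
    $self->{cache} = {};
    $self->{flushes}++;
    return;
}

package main;

my $buf = Sketch::WriteBack->new(cache_size => 2);
$buf->store($_, $_ * 10) for 1 .. 5;
printf "flushes=%d pending=%d\n",
    $buf->{flushes}, scalar keys %{ $buf->{cache} };    # flushes=1 pending=2
```

In this sketch five stores cost one write-through instead of five, which is the kind of saving a cache of this sort aims for when many writes fall inside the buffer.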

USING THE TIED HASH

The standard fetch, store, keys(), values() and each() functions will work as expected on the tied hash. In addition, the following methods are available on the underlying object, which you can obtain with the standard tied() operator:

commit()
   (tied %h)->commit();

This function has been overridden. It flushes the cache and then commits to the database; otherwise it performs the same function as in the base class. When using a database with the autocommit option turned off, values stored into the hash do not become permanent until commit() is called. Without a commit() they are lost when the application terminates or the hash is untied.

Some SQL databases don't support transactions, in which case you will see a warning message if you attempt to use this function.

rollback()
   (tied %h)->rollback();

When using a database with the autocommit option turned off, this function will roll back changes to the database to the state they were in at the last commit(). This function has no effect on databases that don't support transactions.
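The calling convention for these methods can be tried without a database. The sketch below uses a toy tied class, Sketch::TxHash (a hypothetical name, not part of this distribution), that keeps a snapshot in memory; it only illustrates how tied() reaches the object behind a hash and what commit() and rollback() mean in outline.

```perl
use strict;
use warnings;

package Sketch::TxHash;    # hypothetical stand-in for a tied database class

sub TIEHASH  { my ($class) = @_; return bless { live => {}, saved => {} }, $class }
sub STORE    { $_[0]{live}{ $_[1] } = $_[2] }
sub FETCH    { $_[0]{live}{ $_[1] } }
sub EXISTS   { exists $_[0]{live}{ $_[1] } }
sub DELETE   { delete $_[0]{live}{ $_[1] } }
sub FIRSTKEY { my $reset = keys %{ $_[0]{live} }; each %{ $_[0]{live} } }
sub NEXTKEY  { each %{ $_[0]{live} } }

# Remember the current contents as the last committed state.
sub commit   { my ($self) = @_; $self->{saved} = { %{ $self->{live} } }; return }

# Discard everything stored since the last commit().
sub rollback { my ($self) = @_; $self->{live} = { %{ $self->{saved} } }; return }

package main;

tie my %h, 'Sketch::TxHash';
$h{apple} = 'red';
(tied %h)->commit();       # 'red' is now the committed value
$h{apple} = 'green';
$h{pear}  = 'yellow';
(tied %h)->rollback();     # both uncommitted changes are dropped
print join(',', map { "$_=$h{$_}" } sort keys %h), "\n";    # apple=red
```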

PERFORMANCE

What is the performance hit when you use this module? That depends heavily on how you use the data. If you are doing raw inserts of large amounts of data, I do not recommend this module because performance is very slow. If, however, you are doing a large number of updates and most of them fall inside the cache, this module can speed up those operations considerably. One tactic is to preload the cache with your data, carry out the updates, and then commit the data.

Unfortunately, deletes gain no performance from this module. Certain types of operation are slower because, when a hash is tied to a database, we need to check for existence before we can carry out an insert or update. This adds an extra SQL statement to the operation.
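The extra statement comes from a check-then-write pattern: look for the key first, then choose between an insert and an update. The sketch below counts simulated statements to make the cost visible. The helper name store_pair and the in-memory table are hypothetical, and the SQL in the comments uses the default table and column names purely as an illustration.

```perl
use strict;
use warnings;

my %table;             # stands in for the database table
my $statements = 0;    # number of simulated SQL statements issued

# Hypothetical helper showing the check-then-write pattern.
sub store_pair {
    my ($key, $value) = @_;
    $statements++;     # e.g. SELECT pkey FROM pdata WHERE pkey = ?
    my $found = exists $table{$key};
    $statements++;     # UPDATE if the key was found, INSERT otherwise
    $table{$key} = $value;
    return $found ? 'update' : 'insert';
}

store_pair(apple => 'red');      # returns 'insert'
store_pair(apple => 'green');    # returns 'update'
print "statements=$statements\n";    # statements=4
```

Two writes cost four statements here, which is why raw bulk inserts through a tied hash are slow compared with issuing the INSERT statements directly.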

I will compile some performance metrics for people's perusal at some point. I am not entirely happy with the implementation yet and may change it a bit, so I am not running any performance tests just yet, although indications are that there is significant improvement when carrying out particular operations.

TO DO LIST

    - Add BerkeleyDB as a caching method.
    - Produce some performance metrics. 
    - Write tests for release.

BUGS

Of that I am sure.

AUTHOR

Harry Jackson, harry@hjackson.org

COPYRIGHT

  Copyright (c) 2003, Harry Jackson

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

AVAILABILITY

The latest version can be obtained from:

SEE ALSO

perl(1), Tie::RDBM, DBI(3), Storable(3)