
NAME

Search::InvertedIndex::Simple::BerkeleyDB - Build indexes for a set of search keys; Search using BerkeleyDB

Synopsis

        my($dataset) = [
                       { # Index: 0.
                           address => 'Here',
                           event   => 'End',
                           time    => 'Time',
                       },
                       { # Index: 1.
                           address => 'Heaven',
                           event   => 'Exit',
                           time    => 'Then',
                       },
                       { # Index: 2.
                           address => 'House',
                           event   => 'Finish',
                           time    => 'Thus',
                       }
                       ];
        my($keyset)  = [qw/address time/];
        my($db)      = Search::InvertedIndex::Simple::BerkeleyDB -> new
                       (
                           dataset => $dataset,
                           keyset  => $keyset,
                       );

        $db -> db_put();

        my($result)     = $db -> db_get({address => 'Hea', time => 'T'}); # Returns a hashref.
        my($set)        = $db -> inflate($result);                        # Returns a Set::Array object.

        print $set ? join(',', $set -> print() ) : 'Search did not find any matching records', ". \n";

See t/test.t for a complete program.

Description

Search::InvertedIndex::Simple::BerkeleyDB is a pure Perl module.

See the parent module Search::InvertedIndex::Simple for an explanation of the options dataset and keyset passed in to new().

db_put() writes the index built by Search::InvertedIndex::Simple to an in-RAM database managed by BerkeleyDB.
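To make the idea of an index concrete, here is a minimal sketch in core Perl of a prefix-based inverted index over a dataset and keyset like those in the Synopsis. The real index format is defined by Search::InvertedIndex::Simple and may differ; the function name build_prefix_index and the prefix scheme are assumptions made purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of this module: map every leading
# substring (prefix) of each value to the indexes of the records
# that contain it.
sub build_prefix_index
{
    my($dataset, $keyset) = @_;
    my(%index);

    for my $i (0 .. $#$dataset)
    {
        for my $key (@$keyset)
        {
            my($value) = $$dataset[$i]{$key};

            next if (! defined $value);

            for my $length (1 .. length $value)
            {
                push @{$index{$key}{substr($value, 0, $length)} }, $i;
            }
        }
    }

    return \%index;
}

my($dataset) = [
    {address => 'Here',   time => 'Time'},
    {address => 'Heaven', time => 'Then'},
    {address => 'House',  time => 'Thus'},
];
my($index) = build_prefix_index($dataset, [qw/address time/]);

print join(',', @{$$index{address}{He} }), "\n"; # Records 0 and 1 start with 'He'.
print join(',', @{$$index{time}{T} }), "\n";     # All three records start with 'T'.
```

In this sketch a search for a prefix is a single hash lookup, which is the property that makes an inverted index worth building before any searching starts.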

db_get($key) returns the results of a search as a hashref.

inflate($result) converts the result hashref into a single object of type Set::Array.

Distributions

This module is available both as a Unix-style distro (*.tgz) and an ActiveState-style distro (*.ppd). The latter is shipped in a *.zip file.

See http://savage.net.au/Perl-modules.html for details.

See http://savage.net.au/Perl-modules/html/installing-a-module.html for help on unpacking and installing each type of distro.

Constructor and initialization

new(...) returns an object of type Search::InvertedIndex::Simple::BerkeleyDB.

This is the class's constructor.

Parameters to new():

dataset

This is an arrayref of hashrefs containing the data to be processed.

This parameter is mandatory.

keyset

This is an arrayref of keys used to extract values from the hashrefs in the dataset.

This parameter is mandatory.

lower_case

This parameter takes the values 0 and 1.

If 0, keys put into the database with db_put are not converted to lower case.

If 1, keys are converted to lower case.

Warning: Take care when the index generated by Search::InvertedIndex::Simple contains keys which differ only in case, such as 'A' and 'a'. Setting this option to 1 converts the 'A' into 'a', so both end up as the same key, potentially creating a hard-to-find source of confusion.

The default value is 0.

This parameter is optional.

separator

This sets the separator used in inflate() to split() the values returned by the search.

See inflate() below for a discussion of when to use this option.

The default value is a comma.

This parameter is optional.
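The lower_case warning above can be illustrated with a small sketch in core Perl. The hash %index and its values are hypothetical, and what the module itself does when two keys collide is not stated in this document; in this sketch the later key simply overwrites the earlier one.

```perl
use strict;
use warnings;

# Hypothetical index fragment whose keys differ only in case.
my(%index) = (A => '0,2', a => '1');
my(%lowered);

# Sorted so the outcome is deterministic: 'A' sorts before 'a'.
for my $key (sort keys %index)
{
    $lowered{lc $key} = $index{$key}; # Both keys map to 'a'.
}

print scalar(keys %lowered), "\n"; # One key is left where there were two.
```

If your data can contain such keys, it may be worth checking for them before setting lower_case to 1.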

Method: db_get($key)

The $key parameter is a hashref.

The keys of this hashref are some or all of the values in the keyset parameter passed in to new().

The values are the strings to be searched for.

This method returns a hashref of search results.

The keys are the keys of the %$key parameter passed in to db_get().

The values are either undef or the data corresponding to the search key.

If you used join() to create the data values stored in the database with db_put(), consider using inflate() to run split() on all the results returned by db_get().
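The shape of the argument and of the result can be sketched without a database. In this minimal sketch the hash %store stands in for the BerkeleyDB database and lookup() stands in for db_get(); both names, the stored strings, and the storage layout are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the database: search key => {search string => data}.
my(%store) =
(
    address => {Hea => '1', He => '0,1'},
    time    => {T   => '0,1,2'},
);

# Hypothetical stand-in for db_get(): one result per search key,
# undef when nothing was stored under the search string.
sub lookup
{
    my($key) = @_;
    my(%result);

    for my $name (keys %$key)
    {
        $result{$name} = $store{$name}{$$key{$name} };
    }

    return \%result;
}

my($found)  = lookup({address => 'Hea', time => 'T'});
my($missed) = lookup({address => 'Zzz'});

print "$_ => $$found{$_}\n" for sort keys %$found;
print defined $$missed{address} ? 'Found' : 'Not found', "\n";
```

Note that the result always has the same keys as the argument, so a caller has to test each value with defined() rather than test for the presence of a key.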

Method: db_print()

This 'prints' the database by reading all records and converting all key+data pairs to strings.

The result is returned as an arrayref, which you can print with:

        print map{"$_\n"} @{$db -> db_print()};

Method: db_put()

This writes the index built by Search::InvertedIndex::Simple to an in-RAM database managed by BerkeleyDB.

Method: inflate($result)

Calling inflate() usually makes sense when you have used join() to create strings of data which are then put into the database with db_put().

The split() in inflate() reverses the effect of the join(), and inflates the strings recovered by db_get() into objects of type Set::Array, one per search key.

This split() uses the separator value you passed in to new(). The default separator is a comma.

inflate() then finds the elements of the inflated results which are common to all search keys, by using the intersection() method in the class Set::Array. It returns the result as an object of type Set::Array, or undef if any search key failed to find anything.
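The split-then-intersect logic described above can be sketched in core Perl. This is not the module's code: plain arrays stand in for Set::Array objects, the function name inflate_sketch is hypothetical, and the sample result hashrefs assume that the stored data are comma-joined record indexes.

```perl
use strict;
use warnings;

# Hypothetical re-creation of the logic: split each result string,
# then keep only the elements present under every search key.
sub inflate_sketch
{
    my($result, $separator) = @_;
    my(@key) = keys %$result;
    my(%seen);

    for my $key (@key)
    {
        return undef if (! defined $$result{$key}); # A failed search key.

        # De-duplicate within one key so each key counts an element once.
        my(%unique) = map{($_ => 1)} split(/\Q$separator\E/, $$result{$key});

        $seen{$_}++ for keys %unique;
    }

    # An element is common to all keys if every key contributed it.
    my(@common) = grep{$seen{$_} == @key} keys %seen;

    return [sort @common];
}

my($common) = inflate_sketch({address => '0,1', time => '0,1,2'}, ',');
my($none)   = inflate_sketch({address => undef, time => '0,1,2'}, ',');

print join(',', @$common), "\n";
print defined $none ? 'Match' : 'Search did not find any matching records', "\n";
```

The separator is quoted with \Q...\E in the sketch so that a separator such as '|' or '.' is treated as a literal string rather than as a regular expression.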

Example code

See t/test.t for a complete program.

Author

Search::InvertedIndex::Simple::BerkeleyDB was written by Ron Savage <ron@savage.net.au> in 2005.

Home page: http://savage.net.au/index.html

Copyright

Australian copyright (c) 2005, Ron Savage. All Programs of mine are 'OSI Certified Open Source Software'; you can redistribute them and/or modify them under the terms of The Artistic License, a copy of which is available at: http://www.opensource.org/licenses/index.html