NAME
Search::Indexer::Incremental::MD5::Indexer - Incrementally index your files
SYNOPSIS
use File::Find::Rule ;
use Readonly ;
use Search::Indexer::Incremental::MD5::Indexer ;
Readonly my $DEFAULT_MAX_FILE_SIZE_INDEXING_THRESHOLD => 300 << 10 ; # 300KB
my $indexer
= Search::Indexer::Incremental::MD5::Indexer->new
(
USE_POSITIONS => 1,
INDEX_DIRECTORY => 'text_index',
get_perl_word_regex_and_stop_words(),
) ;
my @files = File::Find::Rule
->file()
->name( '*.pm', '*.pod' )
->size( "<=$DEFAULT_MAX_FILE_SIZE_INDEXING_THRESHOLD" )
->not_name(qr[auto | unicore | DateTime/TimeZone | DateTime/Locale]x)
->in('.') ;
$indexer->add_files(FILES => \@files) ;
$indexer->add_files(FILES => \@more_files) ;
$indexer = undef ;
DESCRIPTION
This module implements an incremental text indexer and searcher based on Search::Indexer.
DOCUMENTATION
Given a list of files, this module will allow you to create an indexed text database that you can later query for matches. You can also use the siim command line application installed with this module.
SUBROUTINES/METHODS
new(%named_arguments)
Creates a Search::Indexer::Incremental::MD5::Indexer object.
my $indexer = Search::Indexer::Incremental::MD5::Indexer->new(%named_arguments) ;
Arguments - %named_arguments
Returns - A Search::Indexer::Incremental::MD5::Indexer object
Exceptions -
Incomplete argument list
Error creating index directory
Error creating index metadata database
Error creating a Search::Indexer object
add_files($self, %named_arguments)
Adds the contents of the files passed as arguments to the index database. Files that are already indexed are checked and re-indexed only if their content has changed.
Arguments %named_arguments
- FILES - Array reference - a list of files to add to the index. The file can either be a:
- MAXIMUM_DOCUMENT_SIZE - Integer - a warning is displayed for documents larger than this size
- DONE_ONE_FILE_CALLBACK - sub reference - called every time a file is handled
Returns - Hash reference keyed on the file name
STATE - Boolean -
ID - integer - document id
TIME - Float - re-indexing time
Exceptions
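The "re-indexed only if their content has changed" behaviour described for add_files can be pictured with a small sketch. This is not the module's implementation, only an illustration of MD5-based change detection using the core Digest::MD5 module. The helper names file_md5 and needs_reindexing, and the %known_digests hash, are hypothetical and not part of this module's API.

```perl
use strict ;
use warnings ;
use Digest::MD5 ;
use File::Temp qw(tempfile) ;

# Hypothetical helper: hex MD5 digest of a file's content.
sub file_md5
{
my ($file_name) = @_ ;

open my $fh, '<', $file_name or die "Can't open '$file_name': $!" ;
binmode $fh ;
my $digest = Digest::MD5->new->addfile($fh)->hexdigest ;
close $fh ;

return $digest ;
}

# Hypothetical helper: returns 1 if the file is new or its content differs
# from the digest stored in $known_digests (and stores the new digest),
# 0 if the content is unchanged.
sub needs_reindexing
{
my ($known_digests, $file_name) = @_ ;

my $digest = file_md5($file_name) ;
my $previous = $known_digests->{$file_name} ;

return 0 if defined $previous && $previous eq $digest ;

$known_digests->{$file_name} = $digest ;
return 1 ;
}

# Demonstration on a temporary file.
my ($fh, $file_name) = tempfile(UNLINK => 1) ;
print {$fh} "package Example ;\n" ;
close $fh ;

my %known_digests ;
print needs_reindexing(\%known_digests, $file_name) ? "index\n" : "skip\n" ; # first time seen
print needs_reindexing(\%known_digests, $file_name) ? "index\n" : "skip\n" ; # content unchanged
```

Comparing digests rather than modification times means a file that was touched but not edited is skipped, at the cost of reading every file once per run.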
add_file($self, $name, $description)
Arguments
Returns - Hash reference containing
STATE - Boolean -
ID - integer - document id
TIME - Float - re-indexing time
Exceptions
remove_files(%named_arguments)
Removes the contents of the files passed as arguments from the index database.
Arguments %named_arguments
- FILES - Array reference - a list of files to remove from the index
- DONE_ONE_FILE_CALLBACK - sub reference - called every time a file is handled
Returns - Hash reference keyed on the file name
STATE - Boolean -
ID - integer - document id
TIME - Float - re-indexing time
Exceptions
remove_document_with_id($id, $content)
Removes the document with the given id from the index database.
Arguments
$id - The id of the document to remove from the database
$content - The contents of the document or undef
Returns - Nothing
Exceptions - None
check_indexed_files(%named_arguments)
Checks the index database contents.
Arguments %named_arguments
- DONE_ONE_FILE_CALLBACK - sub reference - called every time a file is handled
Returns - Hash reference keyed on the file name or nothing in void context
STATE - Boolean -
ID - integer - document id
TIME - Float - check time
Exceptions - None
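The methods above return a hash reference keyed on the file name, each entry holding STATE, ID and TIME. A minimal sketch of how such a result could be summarised follows. The helper name summarize_results and the sample data are hypothetical; the sample values are invented for illustration and were not produced by the module.

```perl
use strict ;
use warnings ;

# Hypothetical helper: summarise a result hash shaped as documented,
# that is: file name => { STATE => ..., ID => ..., TIME => ... }.
sub summarize_results
{
my ($results) = @_ ;

my ($true_states, $total_time) = (0, 0) ;

for my $file_name (sort keys %{$results})
	{
	my $entry = $results->{$file_name} ;
	$true_states++ if $entry->{STATE} ;
	$total_time += $entry->{TIME} ;
	}

my %summary =
	(
	FILES => scalar(keys %{$results}),
	TRUE_STATES => $true_states,
	TOTAL_TIME => $total_time,
	) ;

return \%summary ;
}

# Invented sample data, in the documented shape.
my $results =
	{
	'lib/A.pm' => {STATE => 1, ID => 1, TIME => 0.25},
	'lib/B.pm' => {STATE => 0, ID => 2, TIME => 0.5},
	} ;

my $summary = summarize_results($results) ;
print "$summary->{FILES} files, $summary->{TOTAL_TIME} seconds\n" ;
```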
remove_reference_to_unexisting_documents()
Checks the index database contents and removes any reference to documents that don't exist.
Arguments - None
Returns - Array reference containing the names of the documents that don't exist
Exceptions - None
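The idea of detecting documents that no longer exist can be sketched with a plain file test. This only illustrates the existence check, not how the module stores or removes its references. The helper name find_missing_documents is hypothetical.

```perl
use strict ;
use warnings ;
use File::Temp qw(tempfile) ;

# Hypothetical helper: returns an array reference containing the names
# that no longer exist on disk.
sub find_missing_documents
{
my (@document_names) = @_ ;

return [ grep { ! -e $_ } @document_names ] ;
}

# Demonstration: one existing temporary file and one name that is
# assumed not to exist.
my ($fh, $existing) = tempfile(UNLINK => 1) ;
close $fh ;
my $gone = "$existing.removed" ;

my $missing = find_missing_documents($existing, $gone) ;
print "missing: $_\n" for @{$missing} ;
```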
BUGS AND LIMITATIONS
None so far.
AUTHOR
Nadim ibn hamouda el Khemir
CPAN ID: NKH
mailto: nadim@cpan.org
LICENSE AND COPYRIGHT
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Search::Indexer::Incremental::MD5
You can also look for information at:
AnnoCPAN: Annotated CPAN documentation
RT: CPAN's request tracker
Please report any bugs or feature requests to <bug-search-indexer-incremental-md5@rt.cpan.org>.
We will be notified, and then you'll automatically be notified of progress on your bug as we make changes.
Search CPAN