SWISH::Prog::Aggregator - document aggregation base class
    package MyAggregator;
    use strict;
    use base qw( SWISH::Prog::Aggregator );

    sub get_doc {
        my ($self, $url) = @_;

        # do something to create a SWISH::Prog::Doc object from $url

        return $doc;
    }

    sub crawl {
        my ($self, @where) = @_;

        foreach my $place (@where) {

            # do something to search $place for docs to pass to get_doc()

        }
    }

    1;
SWISH::Prog::Aggregator is a base class that defines the basic API for writing an aggregator. Only two methods are required: get_doc() and crawl(). See the SYNOPSIS for the prototypes.
See SWISH::Prog::Aggregator::FS and SWISH::Prog::Aggregator::Spider for examples of aggregators that crawl the filesystem and web, respectively.
Set object flags per SWISH::Prog::Class API. These are also accessors, and include:
This will set the parser() value in swish_filter() based on the MIME type of the doc_class() object.
A SWISH::Prog::Indexer object.
The name of the SWISH::Prog::Doc-derived class to use in get_doc(). Default is SWISH::Prog::Doc.
Returns the SWISH::Prog::Config object being used. This is a read-only method (accessor not mutator).
Returns the total number of doc_class() objects returned by get_doc().
Override this method in your subclass. It does the aggregation, and passes each doc_class() object from get_doc() to indexer->process().
Override this method in your subclass. Should return a doc_class() object.
Passes the content() of the SWISH::Prog::Doc (SPD) object through SWISH::Filter, transforming it into something indexable. Returns the doc_class() object, filtered.
NOTE: All aggregators should call this method after get_doc() and before passing the doc to the indexer().
See the SWISH::Filter documentation.
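The call order described above (get_doc(), then swish_filter(), then the indexer) can be sketched as follows. This is a minimal, self-contained illustration that does not load SWISH::Prog: Sketch::Doc, Sketch::Indexer and Sketch::Aggregator are hypothetical stand-ins, and the tag-stripping swish_filter() here only marks the place where SWISH::Filter would run in a real aggregator.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a SWISH::Prog::Doc-like object.
package Sketch::Doc;

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}
sub url { $_[0]->{url} }

sub content {
    my $self = shift;
    $self->{content} = shift if @_;
    return $self->{content};
}

# Hypothetical stand-in for an indexer; it only records the URLs it is given.
package Sketch::Indexer;

sub new {
    my ($class) = @_;
    return bless { seen => [] }, $class;
}

sub process {
    my ($self, $doc) = @_;
    push @{ $self->{seen} }, $doc->url;
    return $doc;
}

# Hypothetical aggregator that follows the order described in the text.
package Sketch::Aggregator;

sub new {
    my ($class, %args) = @_;
    return bless { count => 0, %args }, $class;
}
sub indexer { $_[0]->{indexer} }
sub count   { $_[0]->{count} }

# Builds one doc per URL and counts it.
sub get_doc {
    my ($self, $url) = @_;
    $self->{count}++;
    return Sketch::Doc->new(url => $url, content => "<p>raw text for $url</p>");
}

# Placeholder for the filtering step: strips tags instead of calling SWISH::Filter.
sub swish_filter {
    my ($self, $doc) = @_;
    (my $text = $doc->content) =~ s/<[^>]+>//g;
    $doc->content($text);
    return $doc;
}

# get_doc() first, then swish_filter(), then hand the doc to the indexer.
sub crawl {
    my ($self, @where) = @_;
    foreach my $place (@where) {
        my $doc = $self->get_doc($place);
        $self->swish_filter($doc);
        $self->indexer->process($doc);
    }
    return $self->count;
}

package main;

my $indexer    = Sketch::Indexer->new;
my $aggregator = Sketch::Aggregator->new(indexer => $indexer);
my $total      = $aggregator->crawl('a.html', 'b.html');
print "indexed $total docs: @{ $indexer->{seen} }\n";
```

In a real subclass only get_doc() and crawl() would be written by hand; the filtering and indexing calls would go to the inherited swish_filter() and the configured indexer().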
Use code_ref as the doc_class filter. This method is called by init() if the filter param is set in the constructor.
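The relationship between the constructor's filter param, init() and set_filter() might look roughly like the sketch below. It is self-contained and does not load SWISH::Prog: Sketch::Filtering and its apply_filter() method are hypothetical, and the way the code ref is invoked (with the doc as its only argument) is an assumption made for illustration, not something the text above confirms.

```perl
use strict;
use warnings;

# Hypothetical stand-in class; it does not load SWISH::Prog.
package Sketch::Filtering;

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    $self->init(%args);
    return $self;
}

# Mirrors the text: init() hands a constructor 'filter' param to set_filter().
sub init {
    my ($self, %args) = @_;
    $self->set_filter($args{filter}) if exists $args{filter};
    return $self;
}

# Stores the code ref after checking that it really is one.
sub set_filter {
    my ($self, $code_ref) = @_;
    die "filter must be a CODE reference\n" unless ref($code_ref) eq 'CODE';
    $self->{filter} = $code_ref;
    return $self;
}

# Assumption: the stored code ref is called with the doc as its only argument.
sub apply_filter {
    my ($self, $doc) = @_;
    $self->{filter}->($doc) if $self->{filter};
    return $doc;
}

package main;

# A plain hash ref stands in for a doc_class() object here.
my $doc = { url => 'a.html', content => "  padded text  " };

my $agg = Sketch::Filtering->new(
    filter => sub {
        my ($d) = @_;
        $d->{content} =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
    },
);
$agg->apply_filter($doc);
print "[$doc->{content}]\n";
```

Validating the argument in set_filter() means a bad filter value fails at construction time rather than partway through a crawl.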
doc_class
filter
Peter Karman, <perl@peknet.com>
Copyright 2008 by Peter Karman
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.