NNexus::Index::Dispatcher - High-level dispatcher to the correct domain indexer classes.
  use NNexus::Index::Dispatcher;
  $dispatcher = NNexus::Index::Dispatcher->new(db=>$db,domain=>$domain,verbosity=>0|1);
  $invalidated_URLs = $dispatcher->index_step(%options);
  while (my $payload = $dispatcher->index_step ) {
    push @$invalidated_URLs, @{$payload};
  }
The NNexus::Index::Dispatcher class provides a high-level API for indexing web domains.
It requires that each $domain has its own NNexus::Index::$domain indexer plug-in, named according to the ucfirst(lc($domain)) convention.
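To illustrate the naming convention, here is a minimal sketch of how a domain label maps to a plug-in class name. The helper indexer_class_for is hypothetical and not part of NNexus; it only applies the ucfirst(lc($domain)) rule stated above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of NNexus): derive the indexer class name
# from a domain label using the ucfirst(lc(...)) convention.
sub indexer_class_for {
  my ($domain) = @_;
  return 'NNexus::Index::' . ucfirst(lc($domain));
}

print indexer_class_for('PLANETMATH'), "\n";  # NNexus::Index::Planetmath
print indexer_class_for('wikipedia'), "\n";   # NNexus::Index::Wikipedia
print indexer_class_for('MathWorld'), "\n";   # NNexus::Index::Mathworld
```

Note that mixed-case labels are flattened: only the first letter stays upper-case, so a plug-in for "MathWorld" would have to be named with a lower-case "w" under this rule.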
Additionally, NNexus::Index::Dispatcher computes the concept diffs when re-indexing an already visited page, and updates the database as needed. Lastly, the return value of an indexing step is a list of suggested URLs to be relinked, a process called "invalidation" in previous NNexus releases.
$dispatcher = NNexus::Index::Dispatcher->new(domain=>$domain,db=>$db,verbosity=>0|1, start=>$url, dom=>$dom);
The object constructor prepares a domain crawler object (NNexus::Index::ucfirst(lc($domain))) and requires a NNexus::DB object, $db, for database interactions.
The returned dispatcher object can be used to iteratively index the domain, via the index_step method.
The constructor accepts the following options:
start: the initial URL, required for first invocation
dom: optional, provides a Mojo::DOM object for the current URL instead of performing an HTTP GET to retrieve it.
verbosity: 0 for quiet, 1 for detailed progress messages
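The iterative use of index_step can be tried out without a database or network access. The sketch below uses a hypothetical StubDispatcher in place of a real NNexus::Index::Dispatcher, and assumes, as the synopsis loop suggests, that index_step returns an array reference per step and a false value once nothing is left to index. The de-duplication of URLs is an addition for illustration, not documented NNexus behaviour.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a dispatcher: each index_step call returns an
# array reference of URLs, then undef once the batches are exhausted.
package StubDispatcher;
sub new {
  my ($class, @batches) = @_;
  return bless { batches => [@batches] }, $class;
}
sub index_step {
  my ($self) = @_;
  return shift @{ $self->{batches} };
}

package main;

# Drain the dispatcher, keeping the first occurrence of each URL.
sub collect_invalidated {
  my ($dispatcher) = @_;
  my (%seen, @urls);
  while (my $payload = $dispatcher->index_step) {
    push @urls, grep { !$seen{$_}++ } @$payload;
  }
  return \@urls;
}

my $stub = StubDispatcher->new(
  ['http://example.org/a', 'http://example.org/b'],
  ['http://example.org/b', 'http://example.org/c'],
);
my $urls = collect_invalidated($stub);
print "$_\n" for @$urls;
```

With a real dispatcher, the object passed to collect_invalidated would be the one returned by the constructor described above.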
$invalidated_URLs = $dispatcher->index_step(%options);
Performs an indexing step as follows:
Dispatches a crawl request to the domain indexer
Computes a diff over the previously and currently indexed concepts for the given object/URL
Updates the database tables
Computes and returns an impact graph of previously linked objects (aka "invalidation")
Accepts no options; all customization is done through the new constructor.
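The diff step listed above can be pictured with a small set comparison. This is a sketch under simplifying assumptions, not the NNexus implementation: concept_diff is a hypothetical helper, and concepts are modelled as plain strings, whereas the real records presumably carry more fields.

```perl
use strict;
use warnings;

# Hypothetical diff helper (not the NNexus implementation): compares the
# concepts stored for a URL with the concepts found on the current crawl.
sub concept_diff {
  my ($old, $new) = @_;
  my %in_old = map { $_ => 1 } @$old;
  my %in_new = map { $_ => 1 } @$new;
  # Concepts that disappeared from the page since the last crawl.
  my @removed = grep { !$in_new{$_} } sort keys %in_old;
  # Concepts that are new on the page.
  my @added = grep { !$in_old{$_} } sort keys %in_new;
  return { added => \@added, removed => \@removed };
}

my $diff = concept_diff(
  ['abelian group', 'ring', 'field'],
  ['abelian group', 'field', 'module'],
);
print "added: @{ $diff->{added} }\n";     # added: module
print "removed: @{ $diff->{removed} }\n"; # removed: ring
```

In this picture, the added and removed sets are what drive the database update, and the objects previously linked to the affected concepts are the candidates for invalidation.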
Deyan Ginev <d.ginev@jacobs-university.de>
Research software, produced as part of work done by the KWARC group at Jacobs University Bremen. Released under the MIT License (MIT).