WARC::Collection - Interface to a group of WARC files
    use WARC::Collection;

    $collection = assemble WARC::Collection ($index_1, $index_2, ...);
    $collection = assemble WARC::Collection from => ($index_1, ...);

    $record = $collection->search(url => $url, time => $when);
    @records = $collection->search(url => $url, time => $when);
The WARC::Collection class is the primary means by which user code is expected to use the WARC library. This class uses indexes to efficiently search for records in one or more WARC files.
The search method accepts a list of parameters as key => value pairs with each pair narrowing the search, sorting the results, or both, indicated in the following list with "[N ]", "[ S]", or "[NS]", respectively.
The same search keys documented here are used for searching indexes, since WARC::Collection is a wrapper around one or more indexes.
The keys supported are:
An exact match for a URL.
A prefix match for a URL. Prefers records with shorter URLs.
Prefer records collected nearer to the requested time.
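The difference between a narrowing key and a sorting key can be illustrated with a small stand-alone sketch. It does not use the WARC library and says nothing about how the library is implemented; the record hashes, the toy_search function, and the use of epoch seconds for times are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical in-memory records; real searches return WARC::Record objects.
my @records = (
    { url => 'http://example.test/a', time => 1_000 },
    { url => 'http://example.test/a', time => 5_000 },
    { url => 'http://example.test/b', time => 2_000 },
);

# Toy model: "url" narrows (exact match), "time" sorts (nearest first).
sub toy_search {
    my (%param) = @_;
    my @found = @records;
    if (defined $param{url}) {
        # Narrowing: discard records whose URL does not match exactly.
        @found = grep { $_->{url} eq $param{url} } @found;
    }
    if (defined $param{time}) {
        # Sorting: order the survivors by distance from the requested time.
        @found = sort {
            abs($a->{time} - $param{time}) <=> abs($b->{time} - $param{time})
        } @found;
    }
    return @found;
}

my @hits = toy_search(url => 'http://example.test/a', time => 4_000);
print scalar(@hits), " hits, nearest collected at $hits[0]{time}\n";
```

In this toy model the url key removes the record for a different URL, while the time key only reorders the two records that remain.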
Assemble a collection of WARC files from one index or multiple indexes, specified either as objects derived from WARC::Index or filenames.
While multiple indexes can be used in a collection, note that searching a collection requires individually searching every index in the collection.
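The cost noted above can be pictured with a stand-alone sketch in which each index is consulted once per query. The code refs standing in for indexes and the toy_collection_search function are hypothetical and are not part of the WARC library.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for indexes: each is a code ref that returns
# the entries it holds for the requested URL.
my @indexes = (
    sub { my ($url) = @_; grep { $_ eq $url } ('http://example.test/a') },
    sub { my ($url) = @_; grep { $_ eq $url } ('http://example.test/b') },
    sub { my ($url) = @_; grep { $_ eq $url } ('http://example.test/a') },
);

# Toy collection search: every index is consulted once per query, so the
# work grows with the number of indexes in the collection.
sub toy_collection_search {
    my ($url) = @_;
    my ($consulted, @found) = (0);
    for my $index (@indexes) {
        $consulted++;
        push @found, $index->($url);
    }
    return ($consulted, @found);
}

my ($consulted, @found) = toy_collection_search('http://example.test/a');
print "consulted $consulted indexes, found ", scalar(@found), " entries\n";
```

Even though only two of the three stand-in indexes hold a match, all three are consulted, which is why combining indexes before assembling a collection may be worthwhile when that is possible.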
Search the indexes for records matching the parameters and return the best match in scalar context or a list of all matches in list context. The returned values are WARC::Record objects.
See "Search Keys" for more information about the parameters.
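Context-sensitive return values of this kind are commonly written in Perl with wantarray. The following stand-alone sketch shows the general idiom only; toy_matches and its canned results are hypothetical, and the sketch is not taken from the WARC source.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of WARC: returns the first (best) match
# in scalar context and every match in list context.
sub toy_matches {
    my @ranked = ('best', 'second', 'third');   # assumed sorted, best first
    return wantarray ? @ranked : $ranked[0];
}

my $one = toy_matches();      # scalar context
my @all = toy_matches();      # list context
print "scalar: $one; list: @all\n";
```

Note that in Perl an assignment such as my ($record) = $collection->search(...) evaluates the call in list context because of the parentheses on the left-hand side; omit them to request scalar context.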
Jacob Bachmeyer, <jcb@cpan.org>
WARC
Copyright (C) 2019 by Jacob Bachmeyer
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.