XML::Tape::Index - an XMLtape indexer
    use XML::Tape::Index qw(:all);

    # Build the index only when it does not exist yet
    unless (indexexists('ex/tape.xml')) {
        $x = indexopen('ex/tape.xml', 'w');
        $x->reindex;
        $x->indexclose();
    }

    $x = indexopen('ex/tape.xml', 'r');

    # Walk the index, passing each record's token to fetch the next one
    for (my $rec = $x->list_identifiers();
         defined($rec);
         $rec = $x->list_identifiers($rec->{token})) {
        printf "id : %s\n",     $rec->{identifier};
        printf "date : %s\n",   $rec->{date};
        printf "start : %s\n",  $rec->{start};
        printf "length : %s\n", $rec->{len};
    }

    # Look up a single index record and the XML document itself
    my $rec = $x->get_identifier('oai:arXiv.org:hep-th:0208183');
    my $xml = $x->get_record('oai:arXiv.org:hep-th:0208183');
This module creates an index on XMLtapes to enable fast retrieval of XML documents from the archive. The index files are stored next to the XMLtape.
This function opens an index for reading or writing. The parameter tape_file is the location of an XMLtape archive. The flag is "w" when creating a new index or "r" when reading an index. An XML::Tape::Index instance is returned on success, or undef on failure.
This method reads the XMLtape, extracts all identifiers and datestamps from it, and stores the byte positions of all records in the index.
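To illustrate what storing byte positions buys, here is a minimal sketch in core Perl. It is not the module's implementation: build_offset_index and read_at are hypothetical helpers, and the `<record id="..." date="...">` layout is a simplified stand-in for the real XMLtape schema. The point is only that a stored start and len let a document be read back with one seek and one read.

```perl
use strict;
use warnings;

# Build an in-memory index: identifier => { identifier, date, start, len }.
# ASSUMPTION: each document is a <record id="..." date="..."> element.
# A real XMLtape uses its own schema; this layout is for illustration only.
sub build_offset_index {
    my ($tape_file) = @_;
    open(my $fh, '<:raw', $tape_file) or die "Cannot open $tape_file: $!";
    my $data = do { local $/; <$fh> } // '';   # slurp the file as bytes
    close($fh);

    my %index;
    while ($data =~ m{<record id="([^"]*)" date="([^"]*)">.*?</record>}gs) {
        $index{$1} = {
            identifier => $1,
            date       => $2,
            start      => $-[0],           # byte offset where the match begins
            len        => $+[0] - $-[0],   # length of the match in bytes
        };
    }
    return \%index;
}

# Fetch one document using only its stored position and length.
sub read_at {
    my ($tape_file, $rec) = @_;
    open(my $fh, '<:raw', $tape_file) or die "Cannot open $tape_file: $!";
    seek($fh, $rec->{start}, 0) or die "Cannot seek in $tape_file: $!";
    my $n = read($fh, my $xml, $rec->{len});
    close($fh);
    return (defined $n && $n == $rec->{len}) ? $xml : undef;
}
```

The file is opened with the `:raw` layer so that the offsets are byte offsets rather than character offsets, which is what seek expects.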
Use this method to iterate through the index and return all records. It returns an index record on success, or undef when no more records are available. Each index record is a HASH reference containing the fields 'identifier', 'date', 'start' (the starting byte of the XML document in the XMLtape), 'len' (the length of the XML document in the XMLtape) and 'token'. Pass the 'token' field to the next list_identifiers call to obtain the next index record. To filter the returned index records, pass two arguments, 'from' and 'until', to the first list_identifiers invocation. Subsequent list_identifiers calls then return only index records with dates greater than or equal to 'from' and less than 'until'. E.g.
    # Return all index records...
    for (my $r = $x->list_identifiers();
         defined($r);
         $r = $x->list_identifiers($r->{token})) {
    }

    # Return all index records with dates from 2001-01-01 up to 2005-12-31...
    for (my $r = $x->list_identifiers(
                     '2001-01-01T00:00:00Z',
                     '2005-12-31T23:59:59Z'
                 );
         defined($r);
         $r = $x->list_identifiers($r->{token})) {
    }
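When the index is small enough to hold in memory, the token loop above can be wrapped in a helper that returns a plain list. The following is a sketch only: collect_index_records is a hypothetical name, not part of XML::Tape::Index, and it assumes $index is an object returned by indexopen($tape_file, 'r') that behaves as described above.

```perl
use strict;
use warnings;

# Collect index records into a list by following each record's 'token'.
# ASSUMPTION: $index provides list_identifiers() as documented above.
# collect_index_records is a hypothetical helper, not a module function.
sub collect_index_records {
    my ($index, $from, $until) = @_;
    my @records;

    # The date range is passed on the first call only.
    my $rec = (defined $from && defined $until)
        ? $index->list_identifiers($from, $until)
        : $index->list_identifiers();

    # Every later call passes the token of the previous record.
    while (defined $rec) {
        push @records, $rec;
        $rec = $index->list_identifiers($rec->{token});
    }
    return @records;
}
```

A call such as `my @recs = collect_index_records($x, '2001-01-01T00:00:00Z', '2005-12-31T23:59:59Z');` would then gather the filtered records in one step. For large tapes the explicit loop is preferable, since it never holds more than one index record at a time.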
This method returns the earliest date in the index file.
This method returns the name of the tape file associated with this index.
This method returns the number of records in an index.
This method returns an index record given an identifier as argument. When no matching index record can be found undef will be returned. The index record is a HASH reference containing the fields 'identifier', 'date', 'start' and 'len' (see above).
This method returns an XML document from the XMLtape given an identifier as argument. When no matching record can be found undef will be returned.
Closes the XMLtape index.
This class method returns true when an index on the XMLtape with location $tape_file exists, returns false otherwise.
This class method deletes the index associated with the XMLtape with location $tape_file.
XML::Tape::Index doesn't lock the XMLtape before writing. It is possible to overwrite an index while another process is reading it.
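One way to work around this from the calling side is to have every cooperating process take an advisory lock before touching the index. The sketch below uses Perl's built-in flock. The helper name with_index_lock and the separate lock file are assumptions made for illustration; the module provides neither, and the lock only protects against processes that take the same lock.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Run $code while holding an advisory lock on $lock_file.
# $mode is LOCK_EX for writers (for example around reindex) or
# LOCK_SH for readers.
# ASSUMPTION: with_index_lock and the lock file are hypothetical and
# are not provided by XML::Tape::Index.
sub with_index_lock {
    my ($lock_file, $mode, $code) = @_;
    open(my $fh, '>>', $lock_file) or die "Cannot open $lock_file: $!";
    flock($fh, $mode) or die "Cannot lock $lock_file: $!";

    my @result = eval { $code->() };
    my $err = $@;
    close($fh);              # closing the handle releases the lock
    die $err if $err;        # re-throw any error raised inside $code
    return wantarray ? @result : $result[0];
}

# Hypothetical usage around a rebuild:
#   with_index_lock('ex/tape.xml.lock', LOCK_EX, sub {
#       my $x = indexopen('ex/tape.xml', 'w');
#       $x->reindex;
#       $x->indexclose();
#   });
```

Readers would wrap their indexopen and lookup calls in the same helper with LOCK_SH, so that several readers can proceed together while a writer waits for exclusive access.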
XMLtape archives were developed by the Digital Library Research & Prototyping team at Los Alamos National Laboratory.
XML::Tape
Patrick Hochstenbach <Patrick.Hochstenbach@UGent.be>