NAME

WARC::Volume - Web ARChive file access for Perl

SYNOPSIS

  use WARC::Volume;

  $volume = mount WARC::Volume ($filename);

  $filename = $volume->filename;

  $handle = $volume->open;

  $record = $volume->first_record;

  $record = $volume->record_at($offset);

DESCRIPTION

A WARC::Volume object represents a WARC file in the filesystem and provides access to the WARC records within as WARC::Record objects.

Methods

$volume = mount WARC::Volume ($filename)

Construct a WARC::Volume object. The parameter is the name of an existing WARC file. An exception is raised if the first record does not have a valid WARC header.

$volume->filename

Return the filename for this volume.

$volume->open

Return a readable and seekable file handle for this volume. The returned value may be a tied handle. Do not assume that it is an IO::Handle.
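Because the handle may be tied, the core I/O functions (seek, read, tell) are a safer way to use it than method calls such as $handle->seek. The following is a minimal sketch under that assumption. An in-memory handle stands in for the value returned by $volume->open, and read_bytes_at is a hypothetical helper, not part of WARC::Volume.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Hypothetical helper (not part of WARC::Volume): read up to $length
# bytes starting at $offset.  Only core functions are used, so this
# also works when the handle is tied rather than an IO::Handle.
sub read_bytes_at {
    my ($handle, $offset, $length) = @_;
    seek($handle, $offset, SEEK_SET)
        or die "seek to $offset failed: $!";
    my $got = read($handle, my $buffer, $length);
    die "read failed: $!" unless defined $got;
    return $buffer;
}

# Stand-in for $volume->open: an in-memory handle over sample data.
my $data = "WARC/1.0\r\nWARC-Type: warcinfo\r\n";
open(my $handle, '<', \$data)
    or die "cannot open in-memory handle: $!";

my $version = read_bytes_at($handle, 0, 8);
print "$version\n";
```

With a real volume, the same helper could be passed the result of $volume->open unchanged, since it makes no assumption about the handle's class.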

$volume->first_record

Construct and return a WARC::Record object representing the first WARC record in $volume. This is normally a "warcinfo" record, but that is not required.

$volume->record_at($offset)

Construct and return a WARC::Record object representing the WARC record beginning at $offset within $volume. An exception is raised if an appropriate magic number is not found at $offset.
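The kind of check described above can be sketched as follows. This is an illustration only, not the module's implementation: it assumes that an "appropriate magic number" means either the "WARC/" prefix that begins every WARC version line or the two-byte gzip signature found at the start of a compressed record. The name looks_like_record_start is hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Hypothetical check (not the module's code): does a plausible record
# start appear at $offset?  Accepts the "WARC/" version-line prefix or
# the two-byte gzip signature used by compressed WARC files.
sub looks_like_record_start {
    my ($handle, $offset) = @_;
    seek($handle, $offset, SEEK_SET) or return 0;
    my $got = read($handle, my $magic, 5);
    return 0 unless defined $got && $got >= 2;
    return 1 if $magic eq 'WARC/';
    return 1 if substr($magic, 0, 2) eq "\x1f\x8b";
    return 0;
}

# Sample data standing in for an uncompressed WARC file.
my $data = "WARC/1.0\r\nWARC-Type: warcinfo\r\nContent-Length: 0\r\n\r\n\r\n\r\n";
open(my $handle, '<', \$data)
    or die "cannot open in-memory handle: $!";
binmode($handle);

for my $offset (0, 3) {
    printf "offset %d: %s\n", $offset,
        looks_like_record_start($handle, $offset)
            ? 'plausible record start'
            : 'no magic number';
}
```

A caller that obtains offsets from an untrusted index could run a check like this first, or simply wrap record_at in an eval block and handle the exception.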

CAVEATS

The internal tags used to distinguish volumes assume that only Unix-like systems have hard links. On all other platforms, the absolute filename is used.
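To illustrate why hard links matter here, the sketch below builds an identity tag in two ways. It is an assumption about how such tags could be formed, not the module's actual code: volume_tag is a hypothetical function, the device and inode pair is assumed as the Unix-like identity, and testing $^O against 'MSWin32' is a crude stand-in for "not Unix-like".

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical tag function (not the module's code).  Where inode
# numbers are assumed meaningful, two hard links to one file get the
# same tag; elsewhere the absolute filename is the only identity.
sub volume_tag {
    my ($filename) = @_;
    if ($^O ne 'MSWin32') {
        my ($dev, $ino) = stat($filename)
            or die "cannot stat $filename: $!";
        return "inode:$dev:$ino";
    }
    return 'path:' . File::Spec->rel2abs($filename);
}

# Create an empty file and a second hard link to it.
my $dir    = tempdir(CLEANUP => 1);
my $first  = File::Spec->catfile($dir, 'a.warc');
my $second = File::Spec->catfile($dir, 'b.warc');
open(my $out, '>', $first) or die "cannot create $first: $!";
close($out);

if (link($first, $second)) {
    print volume_tag($first) eq volume_tag($second)
        ? "links share one tag\n"
        : "links have different tags\n";
}
else {
    print "hard links are not available here: $!\n";
}
```

On a platform that supports hard links but falls into the filename branch, the two names produce different tags even though they refer to the same file, which is the situation this caveat warns about.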

AUTHOR

Jacob Bachmeyer, <jcb@cpan.org>

SEE ALSO

WARC

COPYRIGHT AND LICENSE

Copyright (C) 2019 by Jacob Bachmeyer

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.