WARC::Record - one record from a WARC file
    use WARC;               # or ...
    use WARC::Volume;       # or ...
    use WARC::Collection;

    # WARC::Record objects are returned from ->record_at and ->search methods

    # Construct a record, as when preparing a WARC file
    $warcinfo = new WARC::Record (type => 'warcinfo');

    # Accessors
    $value = $record->field($name);

    $version = $record->protocol;   # analogous to HTTP::Message::protocol
    $volume = $record->volume;
    $offset = $record->offset;
    $record = $record->next;

    $fields = $record->fields;
WARC::Record objects come in two flavors with a common interface. Records read from WARC files are read-only and have meaningful return values from the methods listed in "Methods on records from WARC files". Records constructed in memory can be updated and those same methods all return undef.
- $record->fields
Get the internal WARC::Fields object that contains WARC record headers.
- $record->field( $name )
Get the value of the WARC header named $name from the internal WARC::Fields object.
- $record <=> $other_record
- $record->compareTo( $other_record )
Compare two WARC::Record objects according to a simple total order: ordering by starting offset for two records in the same file, and by filename of the containing WARC::Volume objects for records in different files. Constructed WARC::Record objects are assumed to come from a volume named "" (the empty string) for this purpose, and are ordered in an arbitrary but stable manner amongst themselves. Constructed WARC::Record objects never compare as equal.
Perl constructs a == operator using this method, so WARC record objects will compare as equal if and only if they refer to the same physical record.
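The relationship between the comparison method and the derived == operator can be sketched without the library. The package My::Rec below is a hypothetical stand-in (it is not WARC::Record and does not reflect its internals); it orders objects by volume name and then by offset, and relies on the fallback setting of the overload pragma so that Perl derives == from <=>.

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration only; this is NOT WARC::Record.
package My::Rec;

use overload
    '<=>'    => \&compareTo,
    fallback => 1;    # lets Perl derive ==, <, >, etc. from <=>

sub new {
    my ($class, %args) = @_;
    return bless { volume => $args{volume}, offset => $args{offset} }, $class;
}

# Order by volume name first, then by starting offset within a volume.
sub compareTo {
    my ($self, $other, $swapped) = @_;
    my $cmp = $self->{volume} cmp $other->{volume}
           || $self->{offset} <=> $other->{offset};
    return $swapped ? -$cmp : $cmp;
}

package main;

my $first  = My::Rec->new(volume => 'a.warc', offset => 0);
my $second = My::Rec->new(volume => 'a.warc', offset => 512);
my $other  = My::Rec->new(volume => 'b.warc', offset => 0);
my $again  = My::Rec->new(volume => 'a.warc', offset => 512);

# sort uses the overloaded <=>; == is derived from it automatically.
my @sorted = sort { $a <=> $b } ($other, $second, $first);
print join(' ', map { "$_->{volume}\@$_->{offset}" } @sorted), "\n";
print "same physical record\n" if $second == $again;
```

In this sketch two distinct objects that describe the same volume and offset compare as equal, which mirrors the "same physical record" rule for records read from files. The special handling of constructed records is not modeled here.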
'WARC-Date' field as a
These methods all return undef if called on a WARC::Record object that does not represent a record in a WARC file.
- $record->protocol
Return the format and version tag for this record. For WARC 1.0, this method returns 'WARC/1.0'.
- $record->volume
Return the WARC::Volume object representing the file in which this record is located.
- $record->offset
Return the file offset at which this record can be found.
- $record->next
Return the next WARC::Record in the WARC file that contains this record. Returns an undefined value if called on the last record in a file.
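Because the last record in a file yields an undefined value, a file can be walked with a simple loop. The sketch below uses only the ->next and ->offset accessors described here; walk_records and the My::StubRecord package are hypothetical names introduced so the example runs without a WARC file.

```perl
use strict;
use warnings;

# Walk forward from $record, calling $visit on each record until
# ->next returns undef (the end of the file).
sub walk_records {
    my ($record, $visit) = @_;
    my $count = 0;
    while (defined $record) {
        $visit->($record);
        $count++;
        $record = $record->next;
    }
    return $count;
}

# Hypothetical stub so the sketch runs without a WARC file; it mimics
# only the documented ->offset and ->next accessors.
package My::StubRecord {
    sub new    { my ($class, %args) = @_; return bless {%args}, $class }
    sub offset { return $_[0]{offset} }
    sub next   { return $_[0]{next} }
}

my $last  = My::StubRecord->new(offset => 900);
my $first = My::StubRecord->new(offset => 0, next => $last);

my @offsets;
my $seen = walk_records($first, sub { push @offsets, $_[0]->offset });
print "visited $seen records at offsets @offsets\n";
```

With real records, the starting point would presumably come from a ->record_at or ->search call as mentioned in the synopsis.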
Return a tied filehandle that reads the WARC record block.
The WARC record block is the content of a WARC record, analogous to the entity body in an HTTP message.
- $record->replay( as => $type )
Return a protocol-specific object representing the record contents.
This method returns undef if the library does not recognize the protocol message stored in the record and croaks if a requested conversion is not supported.
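Since an unrecognized message yields undef while an unsupported conversion croaks, a caller may want to tell the two apart. The following is a minimal sketch under stated assumptions: try_replay and My::ReplayStub are hypothetical names, and the stub only imitates the undef and croak behavior described above rather than the library's actual implementation.

```perl
use strict;
use warnings;
use Carp ();

# Call ->replay and separate the three documented outcomes:
# an object, undef (unrecognized content), or an exception (croak).
sub try_replay {
    my ($record, @options) = @_;
    my $object = eval { $record->replay(@options) };
    return ('error', $@)           if $@;
    return ('unrecognized', undef) unless defined $object;
    return ('ok', $object);
}

# Hypothetical stub standing in for a record read from a WARC file.
package My::ReplayStub {
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub replay {
        my ($self, %options) = @_;
        Carp::croak("unsupported conversion: $options{as}")
            if defined $options{as} && $options{as} ne 'http';
        return $self->{message};    # undef when nothing is recognized
    }
}

my $known   = My::ReplayStub->new(message => { status => 200 });
my $unknown = My::ReplayStub->new();

my ($state_a)       = try_replay($known);
my ($state_b)       = try_replay($unknown);
my ($state_c, $why) = try_replay($known, as => 'gopher');
print "$state_a $state_b $state_c\n";
```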
A record with Content-Type "application/http" with an appropriate "msgtype" parameter produces an HTTP::Response object. An unknown "msgtype" on "application/http" produces a generic HTTP::Message. The returned object may be a subclass to support deferred loading of entity bodies.
A request to replay a record "as => http" attempts to convert whatever is stored in the record to an HTTP exchange, analogous to the "everything is HTTP" interface that
Return a tied filehandle that reads the WARC record payload.
The WARC record payload is defined as the decoded content of the protocol response or other resource stored in the record. This method returns undef if called on a WARC record that has no payload or that has content the library does not recognize.
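Both the block and the payload are exposed as filehandles, so they can be consumed in fixed-size chunks rather than loaded by a dedicated call. The sketch below assumes the tied handle supports the standard read operation; slurp_handle is a hypothetical helper, and an in-memory filehandle stands in for the handle a record would return.

```perl
use strict;
use warnings;

# Read everything from a filehandle in fixed-size chunks and return it
# as one string.  Works on any handle that supports read().
sub slurp_handle {
    my ($fh, $chunk_size) = @_;
    $chunk_size ||= 8192;
    my $data = '';
    while (1) {
        my $got = read($fh, my $buffer, $chunk_size);
        die "read failed: $!" unless defined $got;
        last if $got == 0;    # end of the block or payload
        $data .= $buffer;
    }
    return $data;
}

# Stand-in for a record block: an in-memory filehandle over a string.
my $sample = "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nhi";
open(my $fh, '<', \$sample) or die "cannot open in-memory handle: $!";
my $block = slurp_handle($fh, 16);
close $fh;

printf "read %d bytes\n", length $block;
```

Reading in chunks keeps memory use bounded when the caller processes each chunk instead of accumulating it, which matters for large archived resources.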
- $record = new WARC::Record (key => value, ...)
Construct a fresh WARC record, suitable for use with
Jacob Bachmeyer, <email@example.com>
Copyright (C) 2019 by Jacob Bachmeyer
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.