Parse::File::Metadata - For plain-text files that contain both metadata and data records, parse metadata first
    use Parse::File::Metadata;

    $metaref = {};
    @rules = (
        {
            rule  => sub { exists $metaref->{d}; },
            label => q{'d' key must exist},
        },
        {
            rule  => sub { $metaref->{d} =~ /^\d+$/; },
            label => q{'d' key must be non-negative integer},
        },
        {
            rule  => sub { exists $metaref->{f}; },
            label => q{'f' key must exist},
        },
    );

    $self = Parse::File::Metadata->new( {
        file         => 'path/to/myfile',
        header_split => '\s*=\s*',
        metaref      => $metaref,
        rules        => \@rules,
    } );

    $dataprocess = sub {
        my @fields = split /,/, $_[0], -1;
        print "@fields\n";
    };

    $self->process_metadata_and_proceed( $dataprocess );

    $self->process_metadata_only();

    $metadata_out = $self->get_metadata();

    $exception = $self->get_exception();
This module is useful when you have to parse a plain-text file that meets the following conditions:
The file consists of two types of records:
A header section consisting of key-value pairs which constitute, in some sense, metadata.
A body section consisting mainly or entirely of data records, which may be either delimited or fixed-width.
The header and the body are separated by one or more empty records.
Your program must parse the metadata first, then make a decision on the basis of the metadata whether to proceed with parsing of the data. The metadata may or may not be used in the parsing of the data.
Below is a plain-text file in which the header consists of key-value pairs delimited by = signs. The key is the text to the left of the first delimiter. Everything to the right is part of the value (including any additional delimiter characters).
The body consists of comma-delimited strings. Whether in the body or the header, comments begin with a # sign and are ignored.
    # comment
    a=alpha
    b=beta,charlie,delta
    c=epsilon zeta eta
    d=1234567890
    e=This is a string
    f=,

    some,body,loves,me
    I,wonder,wonder,who
    could,it,be,you
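The key/value rule described above can be illustrated with a minimal sketch. It assumes the header_split pattern from the synopsis and relies on Perl's built-in split with a limit of 2, so only the first delimiter separates key from value. The %meta hash, the variable names and the g=x=y record are hypothetical, and this is not necessarily how the module splits header records internally.

```perl
use strict;
use warnings;

# Pattern taken from the synopsis; kept as a string, as header_split is.
my $header_split = '\s*=\s*';

# A few header records from the sample file, plus one hypothetical record.
my @header = (
    '# comment',
    'a=alpha',
    'b=beta,charlie,delta',
    'e=This is a string',
    'f=,',
    'g=x=y',    # hypothetical: the value itself contains the delimiter
);

my %meta;    # hypothetical container for the parsed metadata
for my $line (@header) {
    next if $line =~ /^#/;    # comments are ignored

    # A limit of 2 splits on the first delimiter only, so any further
    # delimiter characters remain part of the value.
    my ( $key, $value ) = split /$header_split/, $line, 2;
    $meta{$key} = $value;
}

print "$_ => $meta{$_}\n" for sort keys %meta;
```

Without the limit, the g record would be split into three pieces and the text after the second = sign would be lost.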
Suppose you are told that you should proceed to parse the body if and only if the following conditions are met in the header:
There must be a metadata element keyed on d.
The value of metadata element d must be a non-negative integer.
There must be a metadata element keyed on f.
This file would meet all three criteria and the program would proceed to parse the three data records.
If, however, metadata element f were commented out:
    #f=,
the file would no longer meet the criteria and the program would cease before parsing the data records.
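The decision described above can be sketched without the module. The helpers read_header and header_ok are hypothetical names introduced only for this illustration; the sketch assumes that the first empty record ends the header and that comment lines start with a # sign, as stated earlier.

```perl
use strict;
use warnings;

# Hypothetical helper: read header records up to the first empty record
# and return a reference to a hash of the metadata found there.
sub read_header {
    my ($text) = @_;
    my %meta;
    for my $line ( split /\n/, $text, -1 ) {
        last if $line eq '';      # an empty record ends the header
        next if $line =~ /^#/;    # comments are ignored
        my ( $key, $value ) = split /\s*=\s*/, $line, 2;
        $meta{$key} = $value;
    }
    return \%meta;
}

# Hypothetical helper: the three criteria from the text, written out directly.
sub header_ok {
    my ($meta) = @_;
    return 0 unless exists $meta->{d};
    return 0 unless $meta->{d} =~ /^\d+$/;
    return 0 unless exists $meta->{f};
    return 1;
}

my $good = "# comment\na=alpha\nd=1234567890\nf=,\n\nsome,body,loves,me\n";
( my $bad = $good ) =~ s/^f=/#f=/m;    # comment out metadata element f

for my $text ( $good, $bad ) {
    my $meta = read_header($text);
    print header_ok($meta) ? "proceed to body\n" : "stop before body\n";
}
```

Commenting out the f record removes the key from the parsed hash, so the third criterion fails and the body is never reached.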
new()
Purpose
Parse::File::Metadata constructor. Validates input.
Arguments
    $self = Parse::File::Metadata->new( {
        file         => 'path/to/myfile',
        header_split => '\s*=\s*',
        metaref      => $metaref,
        rules        => \@rules,
    } );
Single hash reference. Hash has the following elements:
file
Path, relative or absolute, to the file needing parsing.
header_split
Hard-quoted string holding a Perl 5 regex to be used for parsing metadata records.
metaref
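Reference to an empty hash. The rule subroutines read the parsed header metadata through this same reference, as $metaref->{d} does in the example rules.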
rules
Reference to an array of hashrefs. Each such hashref has two elements:
rule
Reference to a subroutine describing a criterion which the header must pass before parsing of the body begins. The subroutine returns a true value when the criterion is met and an undefined value when the criterion is not met.
label
A human-friendly string which will be used to populate exceptions if the criteria are not met.
The rules are applied in the order specified in the array.
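As a rough sketch of how an ordered rule list of this shape can be evaluated, the code below runs each rule in array order and keeps the label of every rule that fails. The metadata values and the @failed array are hypothetical. Collecting every failure, rather than stopping at the first, is an assumption made for this sketch.

```perl
use strict;
use warnings;

# Metadata as it might look after the header is parsed. The values are
# hypothetical: 'd' is not an integer and 'f' is missing.
my $metaref = { a => 'alpha', d => '12x4' };

# Same shape as the rules in the synopsis: each rule closes over $metaref.
my @rules = (
    {
        rule  => sub { exists $metaref->{d} },
        label => q{'d' key must exist},
    },
    {
        rule  => sub { $metaref->{d} =~ /^\d+$/ },
        label => q{'d' key must be non-negative integer},
    },
    {
        rule  => sub { exists $metaref->{f} },
        label => q{'f' key must exist},
    },
);

# Evaluate the rules in array order and keep the label of each failure.
my @failed;
for my $r (@rules) {
    push @failed, $r->{label} unless $r->{rule}->();
}

print "failed: $_\n" for @failed;
```

Order matters in practice: placing the existence check for d before the integer check makes the more basic failure appear first in the list of labels.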
Return Value
Parse::File::Metadata object.
process_metadata_and_proceed()
Process metadata rows found in file header and test the resulting hash against the criteria specified in the rules. If all criteria are met, proceed to parse the data rows with the subroutine specified as argument to this method.
    $dataprocess = sub {
        my @fields = split /,/, $_[0], -1;
        print "@fields\n";
    };

    $self->process_metadata_and_proceed( $dataprocess );
Return Values
None. Use get_metadata() and get_exception() methods to obtain that data.
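The data-processing subroutine in the example passes -1 as the third argument to split. The sketch below shows what that limit changes, assuming the callback is invoked once per data record with the record as its first argument, as the use of $_[0] suggests. The record string is hypothetical, and the callback returns its fields instead of printing them so the result can be inspected.

```perl
use strict;
use warnings;

# Same shape as $dataprocess in the example, but returning the fields
# instead of printing them.
my $dataprocess = sub {
    my @fields = split /,/, $_[0], -1;
    return @fields;
};

my $record = 'some,body,,';    # hypothetical record with trailing empty fields

my @kept    = $dataprocess->($record);    # limit -1 keeps trailing empty fields
my @dropped = split /,/, $record;         # default limit discards them

printf "with -1: %d fields; without: %d fields\n",
    scalar @kept, scalar @dropped;
```

For delimited records with a fixed number of columns, keeping the trailing empty fields means every record yields the same field count.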
process_metadata_only()
Same as process_metadata_and_proceed, except that it returns before beginning any processing of the data records.
    $self->process_metadata_only();
None.
get_metadata()
Access metadata in file's header section.
    $metadata_out = $self->get_metadata();
Hash reference holding the metadata found in file's header.
get_exception()
Access reasons, if any, why file failed to meet specified criteria.
    $exception = $self->get_exception();
Reference to an array holding the labels of the rules that the metadata failed.
Bugs can be reported through the CPAN request tracker at https://rt.cpan.org
    James E Keenan
    CPAN ID: jkeenan
    Perl Seminar NY
    jkeenan@cpan.org
    http://thenceforward.net/perl/modules/Parse-File-Metadata
Copyright 2010 James E Keenan
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
The full text of the license can be found in the LICENSE file included with this module.
perl(1).