
NAME

PICA::Path - PICA path expression to match field and subfield values

SYNOPSIS

    use PICA::Path;
    use PICA::Parser::Plain;

    # extract URLs from PICA records read from STDIN
    my $urlpath = PICA::Path->new('009P$a');
    my $parser = PICA::Parser::Plain->new(\*STDIN);
    while ( my $record = $parser->next ) {
        print "$_\n" for $urlpath->record_subfields($record);
    }

DESCRIPTION

PICA path expressions can be used to match fields and subfields of PICA::Data records or equivalent record structures. An instance of PICA::Path is a blessed array reference with the following elements:

  • regular expression to match field tags against

  • regular expression to match occurrences against, or undefined

  • regular expression to match subfields against

  • substring start position

  • substring end position

METHODS

new( $expression )

Create a PICA path by parsing the path expression. The expression consists of

  • A tag, consisting of three digits, the first 0 to 2, followed by an uppercase letter or @ (e.g. 009P or 003@). The character * can be used as wildcard.

  • An optional occurrence, given by two digits (or * as wildcard) in brackets, e.g. [12] or [0*].

  • An optional list of subfields. Allowed subfield codes include _A-Za-z0-9.

  • An optional position, preceded by /. Both single characters (e.g. /0 for the first character) and character ranges (such as 2-4, -3, 2-...) are supported.
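The grammar above can be illustrated with a small standalone sketch. It is not the module's own parser: the helper name parse_pica_path, the hash it returns, and the exact regular expression are assumptions made for illustration, and an open-ended range is written here simply as a trailing dash.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of PICA::Path: splits a path expression
# into tag, occurrence, subfield codes and position. Returns nothing if
# the expression does not fit the grammar sketched in the list above.
sub parse_pica_path {
    my ($expr) = @_;
    return unless $expr =~ m{
        \A
        ( [012*] [0-9*] [0-9*] [A-Z\@*] )                # tag, wildcards allowed
        (?: \[ ( [0-9*]{2} ) \] )?                       # occurrence in brackets
        (?: \$? ( [_A-Za-z0-9]+ ) )?                     # subfield codes
        (?: / ( [0-9]+ (?: - [0-9]* )? | - [0-9]+ ) )?   # single position or range
        \z
    }x;
    return { tag => $1, occurrence => $2, subfields => $3, position => $4 };
}

my $parts = parse_pica_path('021A[0*]$ad/2-4');
printf "tag=%s occurrence=%s subfields=%s position=%s\n",
    @{$parts}{qw(tag occurrence subfields position)};
# prints: tag=021A occurrence=0* subfields=ad position=2-4
```

Parts that are absent from the expression come back undefined, which mirrors the "or undefined" wording used for the occurrence element in the description.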

match_field( $field )

Check whether a given PICA field matches the field and occurrence of this path. Returns the $field on success.
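As a rough illustration of this check, the sketch below tests a field against a tag pattern and an optional occurrence pattern. The function field_matches is hypothetical and does not use PICA::Path. It assumes a field is an array reference of the form [ tag, occurrence, code, value, ... ] and that an undefined occurrence pattern places no restriction on the occurrence.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the check described above.
# Assumed field layout: [ tag, occurrence, code, value, ... ].
sub field_matches {
    my ( $tag_re, $occ_re, $field ) = @_;
    my ( $tag, $occ ) = @$field;
    return unless $tag =~ $tag_re;

    # Assumption: no occurrence pattern means any occurrence is accepted.
    return if defined $occ_re && ( $occ // '' ) !~ $occ_re;

    return $field;    # the field itself signals success
}

my $field = [ '009P', '03', 'a', 'http://example.org/' ];
print "tag matches\n"
    if field_matches( qr/\A009P\z/, undef, $field );
print "occurrence differs\n"
    unless field_matches( qr/\A009P\z/, qr/\A12\z/, $field );
```

Returning the field rather than a plain true value lets the caller chain the check, for example inside grep.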

filter_record_fields( $record )

Returns an array reference with the fields of a PICA::Data record that match the path. Subfield codes are ignored.
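The filtering step amounts to a grep over the record's fields. The sketch below is a simplified stand-in rather than the module's code: it assumes a record is a plain array reference of fields, matches on the tag only, and uses the hypothetical name fields_matching.

```perl
use strict;
use warnings;

# Assumed layout: a record is an array reference of fields, each field
# being [ tag, occurrence, code, value, ... ].
my $record = [
    [ '003@', '',   '0', '12345' ],
    [ '009P', '03', 'a', 'http://example.org/' ],
    [ '009P', '05', 'a', 'http://example.com/' ],
];

# Hypothetical filter: keep fields whose tag matches; subfields are ignored.
sub fields_matching {
    my ( $tag_re, $record ) = @_;
    return [ grep { $_->[0] =~ $tag_re } @$record ];
}

my $urls = fields_matching( qr/\A009P\z/, $record );
print scalar(@$urls), " matching fields\n";    # prints "2 matching fields"
```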

match_subfields( $field )

Returns a list of matching subfields (optionally trimmed by the path's start position and length) without inspecting field and occurrence values.
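A minimal sketch of this behaviour, independent of the module: the hypothetical helper subfield_values walks the code/value pairs of a field, keeps values whose code matches, and optionally cuts each value with substr. It assumes the field layout [ tag, occurrence, code, value, ... ] and zero-based positions, so a range such as 2-4 would correspond to offset 2 and length 3.

```perl
use strict;
use warnings;

# Hypothetical helper: tag and occurrence (the first two elements) are
# skipped without inspection; only subfield codes are compared.
sub subfield_values {
    my ( $code_re, $field, $from, $length ) = @_;
    my @values;
    for ( my $i = 2 ; $i < $#$field ; $i += 2 ) {
        my ( $code, $value ) = @{$field}[ $i, $i + 1 ];
        next unless $code =~ $code_re;
        if ( defined $from ) {
            next if $from >= length $value;    # nothing left to extract
            $value =
              defined $length
              ? substr( $value, $from, $length )
              : substr( $value, $from );
        }
        push @values, $value;
    }
    return @values;
}

my $field = [ '021A', '', 'a', 'Hello World', 'h', 'Jane Doe', 'a', 'Again' ];
my @all = subfield_values( qr/\A[a]\z/,  $field );          # ('Hello World', 'Again')
my @cut = subfield_values( qr/\A[ah]\z/, $field, 0, 4 );    # ('Hell', 'Jane', 'Agai')
print join( '|', @cut ), "\n";                              # prints "Hell|Jane|Agai"
```

Values shorter than the start offset are dropped in this sketch rather than returned as empty strings; that choice is an assumption, not documented behaviour.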

stringify( [ $short ] )

Stringifies the PICA path to its normalized form. Subfields are separated with $, unless called as stringify(1) or the first subfield is $.

SEE ALSO

Catmandu::Fix::pica_map