    Parse::Nibbler - Parse huge files using grammars written in pure perl.
    Copyright (C) 2001  Greg London

    This library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
    License as published by the Free Software Foundation; either
    version 2.1 of the License, or (at your option) any later version.

    This library is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
    Lesser General Public License for more details.

    You should have received a copy of the GNU Lesser General Public
    License along with this library; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

NAME

Parse::Nibbler - Parse huge files using grammars written in pure perl.

SYNOPSIS

    {
    package MyGrammar;

    use Parse::Nibbler;
    our @ISA = qw( Parse::Nibbler );

    ###############################################################################
    Register ( 'McCoy', sub
      {
        my $p = $_[0];
        $p->AlternateRules( 'DeclareProfession', 'MedicalDiagnosis' );
      }
    );

    ###############################################################################
    # DeclareProfession :
    #   [Dammit,Gadammit] <name> , I'm a doctor not a [Bricklayer,Ditchdigger] !
    ###############################################################################
    Register ( 'DeclareProfession', sub
      {
        my $p = $_[0];
        $p->AlternateValues('Dammit', 'Gadammit');
        $p->Name;
        $p->ValueIs(",");
        $p->ValueIs("Ima");
        $p->ValueIs("doctor");
        $p->ValueIs("not");
        $p->ValueIs("a");
        $p->AlternateValues('Bricklayer', 'Ditchdigger');
        $p->ValueIs("!");
      }
    );

    ###############################################################################
    # MedicalDiagnosis :
    #   [He's,She's] dead, <name> !
    ###############################################################################
    Register ( 'MedicalDiagnosis', sub
      {
        my $p = $_[0];
        $p->AlternateValues("He", "She");
        $p->ValueIs("is");
        $p->ValueIs("dead");
        $p->ValueIs(",");
        $p->Name;
        $p->ValueIs("!");
      }
    );

    ###############################################################################
    Register ( 'Name', sub
      {
        my $p = $_[0];
        $p->AlternateValues( 'Jim', 'Scotty', 'Spock' );
      }
    );

    } # end package MyGrammar

    use Data::Dumper;

    ###############################################################################
    # call the constructor to create a parser
    ###############################################################################
    my $p = MyGrammar->new('transcript.txt');

    ###############################################################################
    # call the top-level rule of the grammar on the parser object
    ###############################################################################
    $p->McCoy;

    print Dumper $p;

DESCRIPTION

Create a parser object using the ->new method. This method is provided by the Parse::Nibbler module and should not be overridden.

The main functionality of the Parse::Nibbler module is the Register subroutine. This subroutine is used to define the rules of your grammar. The Register subroutine takes two parameters: A string and a code reference.

The string is the name of the rule (i.e. the name of the subroutine/method)

The code reference is a reference to the code to execute for this rule.

The Register subroutine takes the code reference, wraps it in another subroutine that acts as a closure, and then installs that wrapper as a subroutine whose name matches the given string.

The wrapper code (the closure) is the same for every rule. The wrapper code handles quantifiers, calls the rule, and decides what to do based on the rule passing or failing.
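
The wrap-and-install step described above can be sketched in a few lines of plain Perl. This is not the module's actual implementation: register_rule, the log field, and the Greeting rule are hypothetical names chosen for illustration, and the sketch assumes that a failing rule signals failure by calling die. Quantifier handling is left out.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Register: wrap a rule body in a closure and
# install the wrapper as a named subroutine in the caller's package.
sub register_rule {
    my ( $name, $body ) = @_;
    my $package = caller;    # package that called register_rule

    my $wrapper = sub {
        my ( $p, @args ) = @_;

        # Run the rule body; a failing rule is assumed to die.
        my $ok = eval { $body->( $p, @args ); 1 };

        # Record the outcome so the caller can see what happened.
        push @{ $p->{log} }, $name . ( $ok ? ' passed' : ' failed' );
        return $ok ? 1 : 0;
    };

    # Install the wrapper under the given name via the symbol table.
    no strict 'refs';
    *{"${package}::${name}"} = $wrapper;
    return;
}

# Register a trivial rule that fails when there is no input left.
register_rule( 'Greeting', sub {
    my $p = shift;
    die "no input\n" unless @{ $p->{tokens} };
} );

# Any object blessed into the registering package can now call the rule.
my $parser = bless { tokens => ['hello'], log => [] }, 'main';
my $result = $parser->Greeting;
print "$parser->{log}[0]\n";
```

Because every rule passes through the same wrapper, behaviour such as quantifier handling only needs to be written once, which is the design point the paragraph above makes.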

A rule is a code reference with a given string name that has been passed to Register. Here is an example of a rule:

    Register ( 'Name', sub
      {
        my $p = shift;
        $p->AlternateValues( 'Jim', 'Scotty', 'Spock' );
      }
    );

The parser object will always be passed in as the first parameter to your rule. You must pass this into any further rules or any Parse::Nibbler methods.

In the above example, the rule "Name" is registered. "Name" calls one of the built-in methods, AlternateValues, defined below. Once a rule is registered, other rules can call it:

    Register ( 'MedicalDiagnosis', sub
      {
        my $p = shift;
        $p->AlternateValues("He's", "She's");
        $p->ValueIs("dead");
        $p->ValueIs(",");
        $p->Name;
        $p->ValueIs("!");
      }
    );

This code registers a rule called "MedicalDiagnosis". It uses some builtin methods, but it also calls the rule just registered, "Name".

Once a user defines a rule, they can use it in other rules by simply calling it as they would call a method.

Rules registered with the Parse::Nibbler module can be called with quantifiers. Quantifiers are passed into the Rule when you call it in your grammar by passing in a string that matches the format described here.

Quantifiers allow you to specify the quantity of rules present. Quantifiers also allow you to specify whether multiple rules have separators.

Quantifiers are specified using the following string format:

     {quantifier}

This indicates that there are zero or one Name rules expected: $p->Name('{?}');

This indicates that there are zero or more Name rules expected: $p->Name('{*}');

This indicates that there are one or more Name rules expected: $p->Name('{+}');

This indicates that there are exactly three Name rules expected: $p->Name('{3}');

This indicates there are 1 to 3 Name rules expected: $p->Name('{1:3}');

This indicates there are at least 2 Name rules expected: $p->Name('{2:}');

Separators are specified using the following string format:

     /separator/

This indicates 1 or more Name rules, each separated by a comma:

$p->Name('{1:}/,/');

It is the job of the Register function to make sure this additional functionality is provided transparently and automagically to you.

If you call a rule with no quantifier and no separator, the rule will assume the quantifier is 1 and there is no separator.
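
To make the string format concrete, here is a minimal sketch of how such a string could be split into a minimum count, a maximum count, and a separator. The helper parse_quantifier is hypothetical and is not part of Parse::Nibbler; it only covers the forms listed above and applies the stated default (quantifier 1, no separator) when nothing is given. An undefined maximum stands for "no upper bound".

```perl
use strict;
use warnings;

# Hypothetical helper: turn a quantifier/separator string into
# ( min, max, separator ). max is undef when there is no upper bound.
sub parse_quantifier {
    my ($spec) = @_;
    $spec = '' unless defined $spec;

    # Optional {quantifier} followed by an optional /separator/.
    my ( $quant, $sep ) =
        $spec =~ m!^ (?: \{ ([^\}]*) \} )? (?: / (.+) / )? $!x
        or die "bad quantifier spec '$spec'\n";

    # Default when no quantifier is given: exactly one.
    my ( $min, $max ) = ( 1, 1 );

    if ( defined $quant ) {
        if    ( $quant eq '?' )         { ( $min, $max ) = ( 0, 1 ) }
        elsif ( $quant eq '*' )         { ( $min, $max ) = ( 0, undef ) }
        elsif ( $quant eq '+' )         { ( $min, $max ) = ( 1, undef ) }
        elsif ( $quant =~ /^(\d+)$/ )   { ( $min, $max ) = ( $1, $1 ) }
        elsif ( $quant =~ /^(\d+):(\d*)$/ ) {
            # An empty upper bound, as in {2:}, means "at least min".
            ( $min, $max ) = ( $1, length($2) ? $2 : undef );
        }
        else { die "bad quantifier '$quant'\n" }
    }

    return ( $min, $max, $sep );
}

# Show how each form from the text is interpreted.
for my $spec ( '', '{?}', '{*}', '{+}', '{3}', '{1:3}', '{2:}', '{1:}/,/' ) {
    my ( $min, $max, $sep ) = parse_quantifier($spec);
    printf "%-10s min=%s max=%s separator=%s\n",
        "'$spec'", $min,
        defined $max ? $max : 'unbounded',
        defined $sep ? "'$sep'" : 'none';
}
```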

Additional Parse::Nibbler methods are provided to simplify rule definition and to provide smart, automatic error handling. Your grammars should only call other rules that you have defined, or the methods explained below.

(Note: these methods do not take quantifiers)

############### Method: ValueIs ###############

Parameters: One parameter, required. A string containing the expected value.

Example: $p->ValueIs( 'stringvalue' );

Description:

This method will look at the next lexical and determine if its value matches that of the stringvalue given as a parameter. If it does not match, an exception is raised and the rule fails.

If the values do match, then the parser stores the lexical, and the rule continues.

####################### Method: AlternateValues #######################

Parameters: A list of string parameters, at least two values.

Example: $p->AlternateValues( 'value1', 'value2' );

Description:

This method behaves like the ValueIs method, except that it takes a list of allowed alternate values. The first match that succeeds causes the rule to pass and return.

If no match occurs, then an exception is raised and the rule aborts.

If a match does occur, the parser stores the lexical, and the rule continues.
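
The match-or-raise behaviour described for ValueIs and AlternateValues can be illustrated with a toy parser over a plain list of strings. This is a sketch only: TinyParser, alternate_values, value_is, and the tokens, pos, and stored fields are hypothetical names, and the real module reads lexicals from a file through its lexer rather than from an array.

```perl
use strict;
use warnings;

# Hypothetical miniature parser: the "lexicals" are a plain array of strings.
package TinyParser;

sub new {
    my ( $class, @tokens ) = @_;
    return bless { tokens => [@tokens], pos => 0, stored => [] }, $class;
}

# Accept the next lexical if it equals any expected value, otherwise die.
sub alternate_values {
    my ( $p, @expected ) = @_;
    my $next = $p->{tokens}[ $p->{pos} ];

    for my $value (@expected) {
        if ( defined $next && $next eq $value ) {
            push @{ $p->{stored} }, $next;    # store the lexical
            $p->{pos}++;                      # and move past it
            return 1;                         # first match wins
        }
    }

    my $got = defined $next ? "'$next'" : 'end of input';
    die "expected one of [@expected] but found $got\n";
}

# A single expected value is the one-element case of the same check.
sub value_is {
    my ( $p, $value ) = @_;
    return $p->alternate_values($value);
}

package main;

my $p = TinyParser->new( 'She', 'is', 'dead' );
$p->alternate_values( 'He', 'She' );    # matches 'She'
$p->value_is('is');                     # matches 'is'

# A mismatch raises an exception, which the caller can trap with eval.
my $ok = eval { $p->value_is('alive'); 1 };
print "stored: @{ $p->{stored} }\n";
print "error: $@" unless $ok;
```

In this sketch a failed match leaves the position untouched, so the caller is free to try something else at the same point in the input.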

############## Method: TypeIs ##############

Parameters: One parameter, required. A string containing the expected type.

Example: $p->TypeIs( 'Identifier' );

Description:

This method will look at the next lexical item, and determine if the lexical type matches the type given as a parameter.

Valid type values depend on the Lexer that you use, but possible values may include "Identifier" and "Number", etc.

Use this in a case where your rule requires an identifier type, for example, but it does not care what the name of the identifier is for the rule.

If a match occurs, the parser stores the lexical and the rule continues.

If a match does not occur, an exception is raised, and the rule aborts.

###################### Method: AlternateRules ######################

Parameters: A list of string parameters, at least two.

Example: $p->AlternateRules( 'Rule1', 'Rule2' );

Description:

You can describe rule alternation in your rule by calling this method. The method takes a list of strings whose string values match the names of the valid alternate rule names.

In the above example, the McCoy rule is either a declaration of profession or a medical diagnosis. These are two rules that are defined in the same package. The AlternateRules method allows you to define multiple rules that may be valid at the same point in the text.

If a rule in the parameter list succeeds, the AlternateRules method succeeds and returns immediately.

If no rule succeeds, an exception is thrown, and the rule aborts.

This rule expects either a "DeclareProfession" rule or a "MedicalDiagnosis" rule to be present.

    Register ( 'McCoy', sub
      {
        my $p = shift;
        $p->AlternateRules( 'DeclareProfession', 'MedicalDiagnosis' );
      }
    );

You can specify quantifiers as part of the alternate rule strings.

    $p->AlternateRules( 'DeclareProfession({+})', 'MedicalDiagnosis' );

The above example indicates that you can have one or more DeclareProfession rules OR ALTERNATELY you can have exactly one MedicalDiagnosis rule.
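
Rule alternation of this kind can be sketched as "try each named rule in turn and keep the first one that does not raise an exception". The code below is an illustration, not the module's implementation: TinyGrammar, expect, alternate_rules, and the two sample rules are hypothetical, quantifiers inside the rule strings are not handled, and rewinding the input position before trying the next alternative is an assumption about how a failed alternative is undone.

```perl
use strict;
use warnings;

# Hypothetical miniature grammar class over a plain array of tokens.
package TinyGrammar;

sub new {
    my ( $class, @tokens ) = @_;
    return bless { tokens => [@tokens], pos => 0 }, $class;
}

# Consume the next token if it matches, otherwise die.
sub expect {
    my ( $p, $value ) = @_;
    my $next = $p->{tokens}[ $p->{pos} ];
    die "expected '$value'\n" unless defined $next && $next eq $value;
    $p->{pos}++;
    return 1;
}

# Try each named rule in order; the first that does not die wins.
sub alternate_rules {
    my ( $p, @rule_names ) = @_;
    my $start = $p->{pos};

    for my $name (@rule_names) {
        return $name if eval { $p->$name(); 1 };
        $p->{pos} = $start;    # assumed: rewind before the next alternative
    }

    die "none of [@rule_names] matched\n";
}

# Two sample rules that share a prefix and differ in the last token.
sub Diagnosis {
    my $p = shift;
    $p->expect('He'); $p->expect('is'); $p->expect('dead');
}

sub Profession {
    my $p = shift;
    $p->expect('He'); $p->expect('is'); $p->expect('busy');
}

package main;

my $p = TinyGrammar->new( 'He', 'is', 'busy' );
my $matched = $p->alternate_rules( 'Diagnosis', 'Profession' );
print "matched rule: $matched\n";
```

Here Diagnosis consumes two tokens before failing on the third, which is why the sketch restores the position before Profession is tried.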

EXPORT

     Register, used to register the rules in your grammar.

AUTHOR

    Parse::Nibbler - Parse huge files using grammars written in pure perl.
    Copyright (C) 2001  Greg London

    contact the author via http://www.greglondon.com

SEE ALSO
