NAME

Parse::Stallion::EBNF - Output/Input parser in Extended Backus Naur Form.

SYNOPSIS

  #Output
  use Parse::Stallion;
  $parser = new Parse::Stallion(...);

  use Parse::Stallion::EBNF;
  $ebnf_form = ebnf Parse::Stallion::EBNF($parser);

  print $ebnf_form;

  #Input
  my $rules = '
    start = (number qr/\s*\+\s*/ number)
     S{return $number->[0] + $number->[1]}S;
    number = qr/\d+/;
  ';

  my $rule_parser = ebnf_new Parse::Stallion::EBNF($rules);

  my $value = $rule_parser->parse_and_evaluate('1 + 6');
  # $value should be 7

DESCRIPTION

Output

Given a parser from Parse::Stallion, creates a string that is the parser's grammar in EBNF.

If LEAF_DISPLAY is passed in as a parameter to a LEAF rule, that is also part of the output of a leaf node. This can be useful, for instance, to display a description of the code of a PARSE_FORWARD routine.

The following are appended to rules that have them defined:

        -MATCH_MIN_FIRST-
        -EVALUATION-
        -UNEVALUATION-
        -USE_STRING_MATCH-
        -MATCH_ONCE-
        -RULE_INFO-

Input

Use Parse::Stallion for more complicated grammars.

Given a string of simple grammar rules, a parser is returned.

Each rule must be terminated by a semicolon.

Each rule name must consist of word characters (\w).

Format:

   <rule_name> = <rule_def>;

Four types of rules: 'and', 'or', 'leaf', 'multiple'/'optional'

Rule names and aliases must start with a letter or underscore though may contain digits as well. They are case sensitive.

AND

'and' rule, the rule_def must be rule names separated by whitespace.

OR

'or' rule, the rule_def must be rule names separated by single pipes (|).

LEAF

A 'leaf' rule can be defined on a string via 'qr' or 'q', or as a parse_forward routine optionally combined with a parse_backtrack routine.

For a 'leaf' rule, the rule_def can be 'qr' or 'q' followed by a non-space, non-word character (\W), then text up to the next occurrence of that character. What lies between the delimiters is treated as either a regular expression (if 'qr') or a string (if 'q'). Additionally, text within single or double quotes is treated as a string. The following are the same:

  q/x\x/, q'x\x', 'x\x', "x\x",  qr/x\\x/, qr'x\\x'

The qr of a leaf is not the same as a Perl regexp declaration. Notably, one cannot escape the delimiting characters. That is,

     qr/\//

is valid Perl but not valid here; one could instead use

     qr+/+

which is also valid perl.

Modifiers are allowed and are inserted into the regexp via an extended regex sequence:

         qr/abc/i

internally becomes

         qr/(?i)abc/
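The delimiter and modifier rules above can be illustrated in plain Perl. The following is a minimal sketch only: leaf_to_regex is a hypothetical helper written for this illustration, not a function of Parse::Stallion::EBNF, and restricting modifiers to i, m, s and x is an assumption of the sketch.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Parse::Stallion::EBNF): turn the text of
# a 'qr' leaf definition into a compiled regexp, following the rules above.
sub leaf_to_regex {
    my ($definition) = @_;

    # 'qr', one non-space non-word delimiter, a body that may not contain
    # the delimiter (so it cannot be escaped), the delimiter again, and
    # optional modifiers (assumed here to be limited to i, m, s, x).
    my ($delimiter, $body, $modifiers) =
        $definition =~ /\Aqr([^\s\w])((?:(?!\1).)*)\1([imsx]*)\z/s
        or return;

    # Modifiers are moved inside the pattern as an extended sequence.
    my $pattern = length $modifiers ? "(?$modifiers)$body" : $body;
    return qr/$pattern/;
}

print "qr/abc/i matches ABC\n"  if 'xABCx' =~ leaf_to_regex('qr/abc/i');
print "qr/\\// is rejected\n"   unless defined leaf_to_regex('qr/\//');
print "qr+/+ matches a slash\n" if '/' =~ leaf_to_regex('qr+/+');
```

Because the body is read only up to the next occurrence of the delimiter, a definition that tries to escape the delimiter leaves trailing text and is rejected, which is why a different delimiter is the way out.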

MULTIPLE/Optional

A 'multiple' rule is a rule name enclosed within curly braces {}. A minimum and maximum number of occurrences may optionally be given by following the closing brace with an asterisk and then min,max. For example:

   multiple_rule = {ruleM}*5,0;

would require at least 5 occurrences of ruleM. When bounds are given the maximum is required, and 0 sets it to unlimited.

Optional rules can be specified within square brackets. The following are the same:

  {rule_a}*0,1

  [rule_a]

To try parsing with the minimum number of occurrences of a multiple rule first and then proceed in increasing order, add a '?' after the right curly brace:

  multiple_rule2 ={ruleX}?;

  multiple_rule ={ruleX}?*3,9;
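To make the notation concrete, here is a small plain-Perl sketch that reads the repetition forms described above and reports the bounds they ask for. repetition_bounds is a hypothetical helper for illustration only; it is not how the module parses its input, and it leaves the bounds undefined when none are written rather than guessing the module's defaults.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: interpret the repetition
# part of a rule definition.
sub repetition_bounds {
    my ($definition) = @_;

    # [rule] is shorthand for {rule}*0,1
    if ($definition =~ /\A\[(\w+)\]\z/) {
        return { rule => $1, min => 0, max => 1, min_first => 0 };
    }

    # {rule}, an optional '?', then an optional '*min,max'
    my ($rule, $min_first, $min, $max) =
        $definition =~ /\A\{(\w+)\}(\?)?(?:\*(\d+),(\d+))?\z/
        or return;

    return {
        rule      => $rule,
        min       => $min,    # undef when no bounds were written
        max       => $max,    # 0 means unlimited
        min_first => defined $min_first ? 1 : 0,
    };
}

my $bounds = repetition_bounds('{ruleX}?*3,9');
printf "%s: min=%d max=%d min_first=%d\n",
    @{$bounds}{qw(rule min max min_first)};
# prints "ruleX: min=3 max=9 min_first=1"
```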

SUBRULES

Subrules may be specified within a rule by enclosing the subrule within parentheses.

ALIAS

An alias may be specified by writing the alias name followed by a dot in front of the rule name, leaf, or subrule it refers to. For example:

    rule_1 = rule_2 S{print $rule_2;}S;

    rule_3 = alias.rule_2 S{print $alias;}S;

    alias.qr/regex/

    alias.(rule1 rule2)

    alias.(rule1 | rule2)

EVALUATION

For the evaluation phase (see Parse::Stallion), any rule can be enclosed within parentheses and followed by an evaluation subroutine, which should be enclosed within S{ and }S, or else S[ and ]S. The 'sub ' declaration is done internally.

Internally all subrules have variables created that contain their evaluated values. If a subrule's name may occur more than once it is passed in an array reference. See Parse::Stallion for details on parameters passed to evaluation routine. This saves on having to create code for reading in the parameters.

Examples:

   rule = (number plus number) S{subroutine}S;

will create an evaluation subroutine string and eval:

  sub {
  my $number = $_[0]->{number};
  my $plus = $_[0]->{plus};
  my $_matched_string = MATCHED_STRING($_[1]);
  subroutine
  }

$number is an array ref, $plus is the returned value from subrule plus.

  number = (/\d+/) S{subroutine}S;

is a leaf rule, which only gets one argument to its subroutine:

  sub {
  my $_ = $_[0];
  my $_matched_string = MATCHED_STRING($_[1]);
  subroutine
  }

The variable $_matched_string is set to the matched string of the rule and the rule's descendants. For leaf rules this is the same as $_[0].

Evaluation is done only after parsing; the option of evaluating during parsing, found in Parse::Stallion, is not available here.
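The wrapping of an S{...}S body into a subroutine can be sketched in plain Perl. This is an illustration of the idea under stated assumptions, not the module's code: build_evaluation_sub is a hypothetical helper, and the $_matched_string line is left out because MATCHED_STRING is supplied by Parse::Stallion.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a body string in a sub that first copies each
# named subrule value out of the parameter hash into a lexical variable of
# the same name, then compile the result with a string eval.
sub build_evaluation_sub {
    my ($body, @subrule_names) = @_;

    my $code = "sub {\n";
    $code .= sprintf("  my \$%s = \$_[0]->{%s};\n", $_, $_) for @subrule_names;
    $code .= "  $body\n}\n";

    my $sub = eval $code;
    die "could not compile evaluation sub: $@" unless $sub;
    return $sub;
}

# 'number' occurs twice in the rule, so its values arrive as an array ref;
# 'plus' occurs once and arrives as a plain value.
my $add = build_evaluation_sub(
    'return $number->[0] + $number->[1];', 'number', 'plus');

print $add->({ number => [1, 6], plus => '+' }), "\n";   # prints 7
```

Generating the variable declarations this way is what lets the body refer to $number and $plus directly instead of unpacking the parameters itself.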

STRING_MATCH, MATCH_ONCE, MATCH_MIN_FIRST

By putting =SM within a rule (or subrule), the string match is used instead of the returned or generated values.

   ab = (x.({qr/\d/} =SM) qr/\d/) S{$x}S; #Will return a string

   cd = (y.{qr/\d/} qr/\d/) S{$y}S; #Will return hash ref to an array ref

Likewise, =MO does MATCH_ONCE and =MMF does MATCH_MIN_FIRST. These are described in the Parse::Stallion documentation.

COMMENTS

Comments may be placed on lines after a hash ('#'):

    rule = (sub1 # comment
    sub2 #comment
    sub3) S{}S;
    # comment

PARSE_FORWARD

As in Parse::Stallion, a PARSE_FORWARD routine may be declared via F{ sub {your routine} }F (or F[ followed by ]F). A PARSE_BACKTRACK routine can follow via a B{ sub {...}}B.

VERSION

0.7

AUTHOR

Arthur Goldstein, <arthur@acm.org>

ACKNOWLEDGEMENTS

Julio Otuyama

COPYRIGHT AND LICENSE

Copyright (C) 2007-9 by Arthur Goldstein. All Rights Reserved.

This module is free software. It may be used, redistributed and/or modified under the terms of the Perl Artistic License (see http://www.perl.com/perl/misc/Artistic.html)

SEE ALSO

example/calculator_ebnf.pl

t/ebnf_in.t in the test cases for examples.

Parse::Stallion