The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.

NAME

Parse::Eyapp::Driver - The LALR parser

INTRODUCTION

This class has the method YYParse implementing the LR generic parsing algorithm plus the methods that give support to the generated parser.

THE YYParse METHOD

The YYParse methods implements the generic LR parsing algorithm. It very much works Parse::Yapp::YYParse and as yacc/bison yyparse. It accepts almost the same arguments as Class->new (Being Class the name of the generated class).

The parser uses two tables and a stack. The two tables are called the action table and the goto table. The stack is used to keep track of the states visited.

At each step the generated parser consults the action table and takes one decision: To shift to a new state consuming one token (and pushing the current state in the stack) or to reduce by some production rule. In the last case the parser pops from its stack as many states as symbols are on the right hand side of the production rule. Here is a Perl/C like pseudocode summarizing the activity of YYParse:

     1   my $parser = shift; # The parser object
     2   push(@stack, $parser->{startstate});
     3   $b = $parser->YYLexer(); # Get the first token
     4   FOREVER: {
     5     $s = top(0);  # Get the state on top of the stack
     6     $a = $b;
     7     switch ($parser->action[$s->state][$a]) {
     8       case "shift t" : 
     9         my $t;
    10         $t->{state} = t;
    11         $t->{attr}  = $a->{attr};
    12         push($t); 
    13         $b = $parser->YYLexer(); # Call the lexical analyzer
    14         break;
    15       case "reduce A->alpha" : 
    16         # Call the semantic action with the attributes of the rhs as args
    17         my $semantic  = $parser->Semantic{A ->alpha}; # The semantic action
    18         my $r;
    19         $r->{attr} = $semantic->($parser, top(|alpha|-1)->attr, ... , top(0)->attr); 
    20  
    21         # Pop as many states as symbols on the rhs of A->alpha
    22         pop(|alpha|);  
    23  
    24         # Goto next state 
    25         $r->{state} = $parser->goto[top(0)][A]; 
    26         push($r); 
    27         break;
    28       case "accept" : return (1); 
    29       default : $parser->YYError("syntax error"); 
    30     }
    31     redo FOREVER;
    32   }

Here |alpha| stands for the length of alpha. Function top(k) returns the state in position k from the top of the stack, i.e. the state at depth k. Function pop(k) extracts k states from the stack. The call $state->attr returns the attribute associated with $state. The call $parser->Semantic{A ->alpha} returns the semantic action associated with production A ->alpha.

Let us see a trace for the small gramar in examples/aSb.yp:

  pl@nereida:~/LEyapp/examples$ /usr/local/bin/paste.pl aSb.yp aSb.output | head -5
  %%                                             | Rules:
  S:                 { print "S -> epsilon\n" }  | ------
      |   'a' S 'b'  { print "S -> a S b\n" }    | 0:    $start -> S $end
  ;                                              | 1:    S -> /* empty */
  %%                                             | 2:    S -> 'a' S 'b'

The tables in file aSb.output describe the actions and transitions to take:

  pl@nereida:~/LEyapp/examples$ cat -n aSb.output
     .  .........................................
     7  States:
     8  -------
     9  State 0:
    10
    11          $start -> . S $end      (Rule 0)
    12
    13          'a'     shift, and go to state 2
    14
    15          $default        reduce using rule 1 (S)
    16
    17          S       go to state 1
    18
    19  State 1:
    20
    21          $start -> S . $end      (Rule 0)
    22
    23          $end    shift, and go to state 3
    24
    25  State 2:
    26
    27          S -> 'a' . S 'b'        (Rule 2)
    28
    29          'a'     shift, and go to state 2
    30
    31          $default        reduce using rule 1 (S)
    32
    33          S       go to state 4
    34
    35  State 3:
    36
    37          $start -> S $end .      (Rule 0)
    38
    39          $default        accept
    40
    41  State 4:
    42
    43          S -> 'a' S . 'b'        (Rule 2)
    44
    45          'b'     shift, and go to state 5
    46
    47  State 5:
    48
    49          S -> 'a' S 'b' .        (Rule 2)
    50
    51          $default        reduce using rule 2 (S)
    52
    53
    54  Summary:
    55  --------
    56  Number of rules         : 3
    57  Number of terminals     : 3
    58  Number of non-terminals : 2
    59  Number of states        : 6

When executed with yydebug set and input aabb we obtain the following output:

  pl@nereida:~/LEyapp/examples$ use_aSb.pl
  ----------------------------------------
  In state 0:
  Stack:[0]
  aabb                       <----------- user input
  Need token. Got >a<
  Shift and go to state 2.  
  ----------------------------------------
  In state 2:
  Stack:[0,2]
  Need token. Got >a<
  Shift and go to state 2.
  ----------------------------------------
  In state 2:
  Stack:[0,2,2]
  Need token. Got >b<
  Reduce using rule 1 (S --> /* empty */): S -> epsilon
  Back to state 2, then go to state 4.

The output S-> epsilon is consequence of the semantic action associated with such production rule.

  ----------------------------------------
  In state 4:
  Stack:[0,2,2,4]
  Shift and go to state 5.
  ----------------------------------------
  In state 5:
  Stack:[0,2,2,4,5]
  Don't need token.
  Reduce using rule 2 (S --> a S b): S -> a S b
  Back to state 2, then go to state 4.

As a result of reducing by rule 2 the semantic action is executed

                { print "S -> a S b\n" }

and the three last visited states are popped from the stack, and the stack becomes [0,2]. But that means that we are now in state 2 seeing a S. If you look at the table above being in state2 and seeing a S we go to state 4.

  ----------------------------------------
  In state 4:
  Stack:[0,2,4]
  Need token. Got >b<
  Shift and go to state 5.
  ----------------------------------------
  In state 5:
  Stack:[0,2,4,5]
  Don't need token.
  Reduce using rule 2 (S --> a S b): S -> a S b
  Back to state 0, then go to state 1.
  ----------------------------------------
  In state 1:
  Stack:[0,1]
  Need token. Got ><
  Shift and go to state 3.
  ----------------------------------------
  In state 3:
  Stack:[0,1,3]
  Don't need token.
  Accept.

METHODS IN THE GENERATED CLASS: Parse::Eyapp::Driver METHODS

The class containing the parser generated by Parse::Eyapp inherits from Parse::Eyapp::Driver. Therefore all the methods in Parse::Eyapp::Driver are avaialbe in the generated class.

This section describes the methods and objects belonging to the class generated either using eyapp or Parse::Eyapp->new_grammar. In the incoming paragraphs we will assume that Class was the value selected for the classname argument when Parse::Eyapp->new_grammar was called. Objects belonging to Class are the actual parsers for the input grammar.

Class->new

The method Class->new returns a new LALR parser object. Here Class stands for the name of the class containing the parser. See an example of call:

  my $parser = main->new(yyprefix => 'Parse::Eyapp::Node::',
                         yylex    => \&main::_Lexer,
                         yyerror  => \&main::_Error,
                         yydebug => 0x1F,
  );

The meaning of the arguments used in the example are as follows:

- yyprefix

Used with %tree or %metatree. When used, the type names of the nodes of the syntax tree will be build prefixing the value associated to yyprefix to the name of the production rule. The name of the production rule is either explicitly given through a %name directive or the concatenation of the left hand side of the rule with the ordinal of the right hand side of the production. See section "Compiling with eyapp and treereg" in Parse::Eyapp for an example.

- yylex

Reference to the lexical analyzer subroutine

- yyerror

Reference to the error subroutine. The error subroutine receives as first argument the reference to the Class parser object. This way it can take advantage of methods like YYCurval and YYExpect (see below):

  sub _Error {
    my($token)=$_[0]->YYCurval;
    my($what)= $token ? "input: '$token'" : "end of input";
    my @expected = $_[0]->YYExpect();

    local $" = ', ';
    die "Syntax error near $what. Expected one of these tokens: @expected\n";
  }
- yydebug

Controls the level of debugging. Must be a number.

The package produced from the grammar has several methods.

The parser object has the following methods that work at parsing time exactly as in Parse::Yapp. These methods can be found in the module Parse::Eyapp::Driver. Assume you have in $parser the reference to your parser object:

$parser->YYParse()

It very much works Parse::Yapp::YYParse and as yacc/bison yyparse. It accepts almost the same arguments as Class->new with the exception of yyprefix which can be used only with new.

$parser->YYErrok

Works as yacc/bison yyerrok. Modifies the error status so that subsequent error messages will be emitted.

$parser->YYError

Works as yacc/bison YYERROR. Pretends that a syntax error has been detected.

$parser->YYNberr

The current number of errors

$parser->YYAccept

Works as yacc/bison YYACCEPT. The parser finishes returning the current semantic value to indicate success.

$parser->YYAbort

Works as yacc/bison YYABORT. The parser finishes returning undef to indicate failure.

$parser->YYBuildingTree

Influences the semantic of list operators. If true the action associated with X+ will be to build a Parse::Eyapp::Node node with all the attributes of the elements in the list as children. This is the appropriate semantic when working under the %tree directive. If set to false the semantic action will return an anonymous list with the attributes associated with the X in the plus list. Same thing with the operators * and ?.

$parser->YYRecovering

Works as yacc/bison YYRECOVERING. Returns TRUE if the parser is recovering from a syntax error.

$parser->YYCurtok

Gives the current token

$parser->YYCurval

Gives the attribute associated with the current token

$parser->YYExpect

Returns the list of tokens the parser expected when the failure occurred

 pl@nereida:~/src/perl/YappWithDefaultAction/examples$ \
                            sed -ne '26,33p' Postfix.eyp
 sub _Error {
   my($token)=$_[0]->YYCurval;
   my($what)= $token ? "input: '$token'" : "end of input";
   my @expected = $_[0]->YYExpect();

   local $" = ', ';
   die "Syntax error near $what. Expected one of these tokens: @expected\n";
 }

$parser->YYLexer

Returns a reference to the lexical analyzer

$parser->YYLhs

Returns the identifier of the left hand side of the current production (the one that is being used for reduction/antiderivation. An example of use can be found in examples/Eyapp/Lhs1.yp:

  %defaultaction { print $_[0]->YYLhs,"\n" }

$parser->YYRuleindex

Returns the index of the production rule, counting the super rule as rule 0. To know the numbers have a look at the .output file. To get a .output file use the option -v of eyapp or the outputfile parameter when using method new_grammar (see the documentation for eyapp).

$parser->YYRightside

Returns an array of strings describing the right hand side of the rule

$parser->YYIsterm

Returns TRUE if the symbol given as argument is a terminal. Example:

  DB<0> x $self->YYIsterm('exp')
 0  ''
  DB<1> x $self->YYIsterm('*')
 0  1

An example of combined use of YYRightside, YYRuleindex, YYLhs and YYIsterm can be found examples/Eyapp/Rule3.yp:

 nereida:~/src/perl/YappWithDefaultAction/examples> sed -n -e '4,22p' Rule3.yp | cat -n
  1  sub build_node {
  2    my $self = shift;
  3    my @children = @_;
  4    my @right = $self->YYRightside();
  5    my $var = $self->YYLhs;
  6    my $rule = $self->YYRuleindex();
  7
  8    for(my $i = 0; $i < @right; $i++) {
  9      $_ = $right[$i];
 10      if ($self->YYIsterm($_)) {
 11        $children[$i] = bless { token => $_, attr => $children[$i] },
 12                                            __PACKAGE__.'::TERMINAL';
 13      }
 14    }
 15    bless {
 16            children => \@children,
 17            info => "$var -> @right"
 18          }, __PACKAGE__."::${var}_$rule"
 19  }

when executed an output similar to this is produced:

 nereida:~/src/perl/YappWithDefaultAction/examples> userule3.pl
 2*3
 $VAR1 = bless( {
   'info' => 'exp -> exp * exp',
   'children' => [
     bless( {
       'info' => 'exp -> NUM',
       'children' => [ bless( { 'attr' => '2', 'token' => 'NUM' }, 'Rule3::TERMINAL' ) ]
     }, 'Rule3::exp_6' ),
     bless( { 'attr' => '*', 'token' => '*' }, 'Rule3::TERMINAL' ),
     bless( {
       'info' => 'exp -> NUM',
       'children' => [ bless( { 'attr' => '3', 'token' => 'NUM' }, 'Rule3::TERMINAL' )
       ]
     }, 'Rule3::exp_6' )
   ]
 }, 'Rule3::exp_11' );

$parser->YYRule

Return the list of rules. The following debugger session illustrates its use:

  pl@europa:~/LEyapp/examples/recycle$ perl -wd usepostfix.pl
  main::(usepostfix.pl:5):        my $parser = new Postfix();
  main::(usepostfix.pl:6):        $parser->Run;
  0  ARRAY(0xa068e0)
     0  '$start'
     1  2
     2  undef
  1  ARRAY(0xa06940)
     0  'line'
     1  1
     2  CODE(0xc22360)
        -> &Postfix::__ANON__[Postfix.eyp:10] in Postfix.eyp:227-10
  ... etc, etc.

Each item has three components: the LHS of the production, the number os symbols in the RHS and the CODE reference to the semantic action.

If an index is specified as argument it returns the corresponding item:

     DB<2> x $parser->YYRule(7)
  0  'exp'
  1  3
  2  CODE(0xc1fce0)
     -> &Postfix::__ANON__[Postfix.eyp:7] in Postfix.eyp:276-7

To know to what production an item is associated we can use the YYGrammar method:

     DB<3> x $parser->YYGrammar('exp_7')
  0  'exp_7'
  1  'exp'
  2  ARRAY(0xa05290)
     0  'exp'
     1  '*'
     2  'exp'
  3  0

We can also use the name of the rule to get the item:

   DB<4> x $parser->YYRule('exp_7')
  0  'exp'
  1  3
  2  CODE(0xc1fce0)
     -> &Postfix::__ANON__[Postfix.eyp:7] in Postfix.eyp:276-7

$parser->YYState

YYState returns a reference to the list of states containing the LALR(1) tables: the action and GOTO tables. Each state is an anonymous hash:

  DB<4> x $parser->YYState(2)
  0  HASH(0xfa7120)
     'ACTIONS' => HASH(0xfa70f0) # token => state
           ':' => '-7'
     'DEFAULT' => '-6'

A negative number means reduction using the corresponding production rule (opposite) number. The former example tells to reduce by rule 7 when in state 2 and seeing token ':'. By default, the action when in state 2 is to reduce by rule number 6.

There are three keys: ACTIONS, GOTOS and DEFAULT

  DB<7> x $parser->YYState(13)
 0  HASH(0xfa8b50)
    'ACTIONS' => HASH(0xfa7530)
       'VAR' => 17
    'GOTOS' => HASH(0xfa8b20)
       'type' => 19

The GOTOS tables contains the DFA transition tables for the syntactic variables. The former example tells to move to state 19 when in state 13 after seeing the syntactic variable type (i.e. if after reducing by a rule of type we are in state 13).

$parser->YYGrammar

Return the list of grammar items. Each item is an anonymous list containing

  • The name of the production

  • The LHS of the production

  • An anonymous list containing the symbols in the RHS

If it receives an index as argument returns the corresponding item The following debugger session explain its use:

  pl@europa:~/LEyapp/examples/recycle$ perl -wd usepostfix.pl
  main::(usepostfix.pl:5):        my $parser = new Postfix();
    DB<1> n
  main::(usepostfix.pl:6):        $parser->Run;
    DB<1> x $parser->YYGrammar
  0  ARRAY(0xde5e20)
     0  '_SUPERSTART'
     1  '$start'
     2  ARRAY(0xc85e80)
        0  'line'
        1  '$end'
     3  0
  1  ARRAY(0xe2b6b0)
     0  'line_1'
     1  'line'
     2  ARRAY(0xe3abc0)
        0  'exp'
     3  0
  2  ARRAY(0xa05530)
     0  'exp_2'
     1  'exp'
     2  ARRAY(0x75bdc0)
        0  'NUM'
     3  0

     ...  etc, etc

If an index is provided it returns the item for such number:

    DB<2> x $parser->YYGrammar(10)
  0  'exp_10'
  1  'exp'
  2  ARRAY(0xa05f80)
     0  '('
     1  'exp'
     2  ')'
  3  0

You can also use a production name as argument:

    DB<3> x $parser->YYGrammar('exp_7')
  0  'exp_7'
  1  'exp'
  2  ARRAY(0xa05890)
     0  'exp'
     1  '*'
     2  'exp'
  3  0

$parser->YYNames

Return the list of production names. In a scalar context returns a reference to such list.

  pl@europa:~/LEyapp/examples/recycle$ eyapp Postfix
  pl@europa:~/LEyapp/examples/recycle$ perl -wd usepostfix.pl
  main::(usepostfix.pl:5):        my $parser = new Postfix();
  main::(usepostfix.pl:6):        $parser->Run;
  DB<1> x $parser->YYNames
  0  '_SUPERSTART'
  1  'line_1'
  2  'exp_2'
  3  'exp_3'
  4  'exp_4'
  5  'exp_5'
  6  'exp_6'
  7  'exp_7'
  8  'exp_8'
  9  'exp_9'
  10  'exp_10'

$parser->YYIndex

Receives the name of production (right hand side). Returns the index in the grammar of the production with such name. When called in a list context and without a name return the hash containing the relation

           production name => production index

The following debugger session illustrates its use:

  pl@europa:~/LEyapp/examples/recycle$ perl -wd usepostfix.pl
  main::(usepostfix.pl:5):        my $parser = new Postfix();
  main::(usepostfix.pl:6):        $parser->Run;
  DB<1> x $parser->YYIndex
  0  'line_1'
  1  1
  2  'exp_3'
  3  3
  4  'exp_6'
  5  6
  6  'exp_4'
  7  4
  8  'exp_10'
  9  10
  10  'exp_8'
  11  8
  12  'exp_5'
  13  5
  14  'exp_7'
  15  7
  16  'exp_2'
  17  2
  18  '_SUPERSTART'
  19  0
  20  'exp_9'
  21  9

We can specify a list of names:

  DB<2> x $parser->YYIndex(qw{exp_4 exp_7})
  0  4
  1  7
  DB<3> x $parser->YYIndex(qw{exp_4})
  0  4

$parser->YYAction

Receives the name of a production and a subroutine reference implementing the new semantic action. If no subroutine reference is set returns the reference to the current semantic action

$parser->YYSetaction

Receives a hash with keys the names of the production rules (right hand sides) and values the new semantic actions. Used to reuse a grammar without overwriting all the semantic actions. See section Reusing Grammars by Dynamic Substitution of Semantic Actions in Parse::Eyapp::defaultactionsintro.

$parser->YYIssemantic

Returns TRUE if the terminal is semantic. Semantics token can be declared using the directive %semantic token. The opposite of a Semantic token is a Syntactic token. Syntactic tokens can be declared using the directive %syntactic token.

When using the %tree directive all the nodes corresponding to syntactic tokens are pruned from the tree. Under this directive tokens in the text delimited by simple quotes (like '+') are, by default, considered syntactic tokens.

When using the %metatree directive all the tokens are considered, by default, semantic tokens. Thus, no nodes will be - by default- pruned when construction the code augmented tree. The exception are string tokens used as separators in the definition of lists, like in S <* ';'>. If you want the separating string token to appear include an explicit semantic declaration for it (example %semantic token ';').

$parser->YYName

Returns the name of the current rule (The production whose reduction gave place to the execution of the current semantic action).

  DB<12> x $self->YYName
 0  'exp_11'

$parser->YYPrefix

Return and/or sets the yyprefix attribute. This a string that will be concatenated as a prefix to any Parse::Eyapp::Node nodes in the syntax tree.

$parser->YYDelegateaction

Use it as defaultaction if you want to recycle your grammar. It is equivalent to:

  sub YYDelegateaction {
    my $self = shift;

    my $action = $self->YYName;

    $self->$action(@_);
  }

For a full example illustrating how to use it, see files examples/recycle/NoacInh.eyp and examples/recycle/icalcu_and_ipost.pl in the Parse::Eyapp distribution

$parser->YYBypass

Returns TRUE if running under the %tree bypass clause

$parser->YYBypassrule

Returns TRUE if the production being used for reduction was marked to be bypassed.

$parser->YYFirstline

First line of the input string describing the grammar

Parse::Eyapp::Driver::BeANode

Is not a method. Receives as input a Class name. Introduces Parse::Eyapp::Node as an ancestor class of Class. To work correctly, objects belonging to Class must be hashes with a children key whose value must be a reference to the array of children. The children must be also Parse::Eyapp::Node nodes. Actually you can circumvent this call by directly introducing Parse::Eyapp::Node in the ancestors of Class:

         push @{$class."::ISA"}, "Parse::Eyapp::Node" 

$parser->YYBuildAST

Sometimes the best time to decorate a node with some attributes is just after being built. In such cases the programmer can take manual control building the node with YYBuildAST to inmediately proceed to decorate it.

The following example from the file lib/Simple/Types.eyp in the tarball in examples/typechecking/Simple-Types-XXX.tar.gz illustrates the idea:

 Variable:
     %name  VARARRAY
     $ID ('[' binary ']') <%name INDEXSPEC +>
       {
         my $self = shift;
         my $node =  $self->YYBuildAST(@_);
         $node->{line} = $ID->[1];
         return $node;
       }

Actually, the %tree directive is semantically equivalent to:

  %default action { goto &Parse::Eyapp::Driver::YYBuildAST }

$parser->YYBuildTS

Similar to $parser->YYBuildAST but builds nodes for translation schemes.

AUTHOR

Casiano Rodriguez-Leon (casiano@ull.es)

ACKNOWLEDGMENTS

This work has been supported by CEE (FEDER) and the Spanish Ministry of Educacion y Ciencia through Plan Nacional I+D+I number TIN2005-08818-C04-04 (ULL::OPLINK project http://www.oplink.ull.es/). Support from Gobierno de Canarias was through GC02210601 (Grupos Consolidados). The University of La Laguna has also supported my work in many ways and for many years.

A large percentage of code is verbatim taken from Parse::Yapp 1.05. The author of Parse::Yapp is Francois Desarmenien.

I wish to thank Francois Desarmenien for his Parse::Yapp module, to my students at La Laguna and to the Perl Community. Special thanks to my family and Larry Wall.

LICENCE AND COPYRIGHT

Copyright (c) 2006-2008 Casiano Rodriguez-Leon (casiano@ull.es). All rights reserved.

Parse::Yapp copyright is of Francois Desarmenien, all rights reserved. 1998-2001

These modules are free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.