Parse::Eyapp::Scope - Support for Scope Analysis
# Fragment of the grammar lib/Simple/Types.eyp # in examples/typechecking/Simple-Types-XXX.tar.gz funcDef: $ID { $ids->begin_scope(); } '(' $params ')' $block { my $st = $block->{symboltable}; my @decs = $params->children(); $block->{parameters} = []; while (my ($bt, $id, $arrspec) = splice(@decs, 0, 3)) { $bt = ref($bt); # The string 'INT', 'CHAR', etc. my $name = $id->{attr}[0]; my $type = build_type($bt, $arrspec); $type{$type} = Parse::Eyapp::Node->hnew($type); # control duplicated declarations die "Duplicated declaration of $name at line $id->{attr}[1]\n" if exists($st->{$name}); $st->{$name}->{type} = $type; $st->{$name}->{param} = 1; $st->{$name}->{line} = $id->{attr}[1]; push @{$block->{parameters}}, $name; } $block->{function_name} = $ID; $block->type("FUNCTION"); my ($nodec, $dec) = $ids->end_scope($st, $block, 'type'); # Type checking: add a direct pointer to the data-structure # describing the type $_->{t} = $type{$_->{type}} for @$dec; return $block; } ; ... Primary: %name INUM INUM | %name CHARCONSTANT CHARCONSTANT | $Variable { $ids->scope_instance($Variable); return $Variable } | '(' expression ')' { $_[2] } | $function_call { $ids->scope_instance($function_call); return $function_call # bypass } ;
The examples used in this document can be found in the file examples/typechecking/Simple-Types-XXX.tar.gz. This distribution contains the front-end of a compiler (lexical analysis, syntax analysis, scope analysis and type checking) for a small subset of the C language. The language has characters, integers, arrays and functions. Here is a small example:
examples/typechecking/Simple-Types-XXX.tar.gz
pl@nereida:~/Lbook/code/Simple-Types/script$ cat prueba03.c int a,b,e[10]; g() {} int f(char c) { char d; c = 'X'; e[d] = 'A'+c; { int d; d = a + b; } a = b * 2; return c; }
You can find more examples in the script/ directory. The front-end provided analyzes the input program
script/
pl@nereida:~/Lbook/code/Simple-Types/script$ usetypes.pl prueba03.c
and produces the decorated abstract tree, i.e. s.t. like:
PROGRAM^{0}( FUNCTION[g]^{1}, FUNCTION[f]^{2}( ASSIGNCHAR( VAR( TERMINAL[c:7]), CHARCONSTANT( TERMINAL['X':7]) ), ASSIGNINT( VARARRAY( TERMINAL[e:8], INDEXSPEC(CHAR2INT(VAR(TERMINAL[d:8])))), PLUS( CHAR2INT(CHARCONSTANT(TERMINAL['A':8])), CHAR2INT(VAR(TERMINAL[c:8])) ) ), BLOCK[9:3:f]^{3}( ASSIGNINT( VAR(TERMINAL[d:11]), PLUS(VAR(TERMINAL[a:11]),VAR( TERMINAL[b:11]))) ), ASSIGNINT( VAR(TERMINAL[a:13]), TIMES(VAR(TERMINAL[b:13]),INUM(TERMINAL[2:13]))), RETURNINT(CHAR2INT(VAR(TERMINAL[c:14]))) ) ) ...... # More descriptions
A scope manager helps to compute the mapping function that maps the uses (instances) of source objects to their definitions. For instance,
When dealing with identifier scope analysis the problem is to associate each occurrence of an identifier with the declaration that applies to it.
Another example is loop scope analysis where the problem is to associate each occurrence of a CONTINUE or BREAK node with the shallowest LOOP that encloses it.
CONTINUE
BREAK
LOOP
Or label scope analysis, the problem to associate a GOTO node with the node to jump to, that is, the one with the STATEMENT associated with the label.
GOTO
STATEMENT
The scope analysis start by creating the Parse::Eyapp::Scope objects:
Parse::Eyapp::Scope
program: { reset_file_scope_vars(); } definition<%name PROGRAM +>.program { .......... # Semantic actions } ;
Before the analysis of the whole program we call reset_file_scope_vars which is in charge to create the scope analyzers for identifier scope analysis and loop scope analysis:
reset_file_scope_vars
sub reset_file_scope_vars { %st = (); # reset symbol table ($tokenbegin, $tokenend) = (1, 1); %type = ( INT => Parse::Eyapp::Node->hnew('INT'), CHAR => Parse::Eyapp::Node->hnew('CHAR'), VOID => Parse::Eyapp::Node->hnew('VOID'), ); $depth = 0; $ids = Parse::Eyapp::Scope->new( SCOPE_NAME => 'block', ENTRY_NAME => 'info', SCOPE_DEPTH => 'depth', ); $loops = Parse::Eyapp::Scope->new( SCOPE_NAME => 'exits', ); $ids->begin_scope(); $loops->begin_scope(); # just for checking }
To take advantage of Parse::Eyapp::Scope, the compiler writer must mark at the appropriate time (for example a new block or new subroutine for identifier scope analysis, a new loop for loop scope analysis, etc.) the beginning of a new scope calling the method begin_scope. For example, the following code deals with the declaration of functions
begin_scope
funcDef: $ID { $ids->begin_scope(); } '(' $params ')' $block { ........ # semantic action code } ;
The call
$ids->begin_scope
marks the beginning of a new identifier scope.
From that point on, any ocurring instance of an object (for example, variables in expressions for identifier scope analysis, breaks and continues for loop scope analysis, etc.) must be declared calling the method scope_instance. For example, the following rules deal with the use of of variables and functions inside expressions:
scope_instance
Primary: ........... # Other production rules | $Variable { $ids->scope_instance($Variable); return $Variable } | $function_call { $ids->scope_instance($function_call); return $function_call # bypass } ;
The programmer must also mark the end of the current scope at the appropriate time. After the processing of the block following a function declaration an identifier scope has finished and we call end_scope:
block
end_scope
funcDef: $ID { $ids->begin_scope(); } '(' $params ')' $block { ............................... my ($nodec, $dec) = $ids->end_scope($st, $block, 'type'); # Type checking: add a direct pointer to the data-structure # describing the type $_->{t} = $type{$_->{type}} for @$dec; return $block; } ;
This call is made after each end of scope, including the end of the program:
program: { reset_file_scope_vars(); } definition<%name PROGRAM +>.program { $program->{symboltable} = { %st }; # creates a copy of the s.t. $program->{depth} = 0; $program->{line} = 1; $program->{types} = { %type }; $program->{lines} = $tokenend; my ($nondec, $declared) = $ids->end_scope($program->{symboltable}, $program, 'type'); if (@$nondec) { warn "Identifier ".$_->key." not declared at line ".$_->line."\n" for @$nondec; die "\n"; } # Type checking: add a direct pointer to the data-structure # describing the type $_->{t} = $type{$_->{type}} for @$declared; my $out_of_loops = $loops->end_scope($program); if (@$out_of_loops) { warn "Error: ".ref($_)." outside of loop at line $_->{line}\n" for @$out_of_loops; die "\n"; } # Check that are not dangling breaks reset_file_scope_vars(); $program; } ;
There are three ways of calling $scope->end_scope. The first one is for Scope Analysis Problems where a symbol table is needed (for example in identifier scope analysis and label scope analysis and there is a Parse::Eyapp::Node node that owns the scope.
$scope->end_scope
Parse::Eyapp::Node
For each ocurring instance of an object $x that occurred since the last call to begin_scope the call to
$x
$scope->end_scope(\%symboltable, $definition_node, 'attr1', 'attr2', ... )
decorates the ocurring instance $x with several attributes:
An entry $x->{SCOPE_NAME} is built that will reference $definition_node.
$x->{SCOPE_NAME}
$definition_node
An entry $x->{ENTRY_NAME} is built. That entry references $symboltable{$x->key} (to have a faster access from the instance to the attributes of the object). The instantiated nodes must have a $x->key method which provides the entry for the node in the symbol table:
$x->{ENTRY_NAME}
$symboltable{$x->key}
$x->key
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ sed -ne '651,657p' Types.eyp sub VAR::key { my $self = shift; return $self->child(0)->{attr}[0]; } *VARARRAY::key = *FUNCTIONCALL::key = \&VAR::key;
For each aditional arguments attr#k an entry $x->{attr#k} will be built. That entry references $symboltable{$x->key}{attr#k}. Therefore the entry for $x in the symbol table must already have a field named attr#k. If the hash referenced by $symboltable{$x->key} does not have a key attr#k no reference is built.
attr#k
$x->{attr#k
$symboltable{$x->key}{attr#k}
In a list context $scope>end_scope returns two references. The first one is a reference to a list of node instantiated that weren't defined in the current scope. The second is a reference to a list of nodes that were defined in this scope. In a scalar context returns the first of these two. An instance $x is defined if, and only if, exists $symboltable{$_->key}.
$scope>end_scope
exists $symboltable{$_->key}
$scope->end_scope(\%symboltable, 'attr1', 'attr2', ... )
An entry $x->{ENTRY_NAME} is built. That entry references $symboltable{$x->key} (to have a faster access from the instance to the attributes of the object). The instantiated nodes must have a $x->key method which provides the entry for the node in the symbol table.
Some scope analysis problems do not require the existence of a symbol table (for instance, the problem of associating a RETURN node with the FUNCTION that encloses it). For such kind of problems $scope>end_scope provides a second form of call.
RETURN
FUNCTION
The second way to call $scope>end_scope is
$declared = $scopemanager->end_scope($definition_node);
The only argument is the reference to the node that controls/defines the scope. The method returns a reference to the declared nodes. Any node instanced with scope_instance since the last call to begin_scope is considered declared.
The scope node $definition_node is decorated with an attribute with name the value of the attribute SCOPE_NAME of the scope manager $scopemanager. The value of the attribute is the anonymous list of references to the instances declared in the scope of $definition_node (i.e. the same list referenced by $declared).
SCOPE_NAME
$scopemanager
$declared
The scope instances in @$declared are decorated with an attribute with name the value of the attribute SCOPE_NAME of the scope manager. The value is a reference to the scope node $definition_node.
@$declared
Marks the beginning of an scope. Example (file Types.eyp in examples/typechecking/Simple-Types-XXX.tar.gz):
Types.eyp
loopPrefix: $WHILE '(' expression ')' { $loops->begin_scope; $_[3]->{line} = $WHILE->[1]; # Save the line for error diagostic $_[3] }
Declares the node argument to be an occurring instance of the scope:
nereida:~/doc/casiano/PLBOOK/PLBOOK/code> \ sed -ne '375,380p' Simple6.eyp | cat -n 1 $Variable '=' binary 2 { 3 my $parser = shift; 4 $ids->scope_instance($Variable); 5 $parser->YYBuildAST(@_); # "Manually" build the node 6 }
new
Parse::Eyapp::Scope->new returns a scope management object. The scope mapping function is implemented by Parse::Eyapp::Scope through a set of attributes that are added to the nodes involved in the scope analysis. The names of these attributes can be specified using the parameters of Parse::Eyapp::Scope->new. The arguments of new are:
Parse::Eyapp::Scope->new
SCOPE_NAME is the name chosen for the attribute of the node instance which will held the reference to the definition node. If not specified it will take the value "scope".
"scope"
ENTRY_NAME is the name of the attribute of the node instance which will held the reference to the symbol table entry. By default takes the value "entry".
ENTRY_NAME
"entry"
SCOPE_DEPTH is the name for an attribute of the definition node. Optional. If not specified it will not be defined.
SCOPE_DEPTH
The project home is at http://code.google.com/p/parse-eyapp/. Use a subversion client to anonymously check out the latest project source code:
svn checkout http://parse-eyapp.googlecode.com/svn/trunk/ parse-eyapp-read-only
Parse::Eyapp, Parse::Eyapp::eyapplanguageref, Parse::Eyapp::debuggingtut, Parse::Eyapp::defaultactionsintro, Parse::Eyapp::translationschemestut, Parse::Eyapp::Driver, Parse::Eyapp::Node, Parse::Eyapp::YATW, Parse::Eyapp::Treeregexp, Parse::Eyapp::Scope, Parse::Eyapp::Base, Parse::Eyapp::datagenerationtut
The pdf file in http://nereida.deioc.ull.es/~pl/perlexamples/languageintro.pdf
The pdf file in http://nereida.deioc.ull.es/~pl/perlexamples/debuggingtut.pdf
The pdf file in http://nereida.deioc.ull.es/~pl/perlexamples/eyapplanguageref.pdf
The pdf file in http://nereida.deioc.ull.es/~pl/perlexamples/Treeregexp.pdf
The pdf file in http://nereida.deioc.ull.es/~pl/perlexamples/Node.pdf
The pdf file in http://nereida.deioc.ull.es/~pl/perlexamples/YATW.pdf
The pdf file in http://nereida.deioc.ull.es/~pl/perlexamples/Eyapp.pdf
The pdf file in http://nereida.deioc.ull.es/~pl/perlexamples/Base.pdf
The pdf file in http://nereida.deioc.ull.es/~pl/perlexamples/translationschemestut.pdf
The pdf file in http://nereida.deioc.ull.es/~pl/perlexamples/MatchingTrees.pdf
The tutorial Parsing Strings and Trees with Parse::Eyapp (An Introduction to Compiler Construction in seven pages) in http://nereida.deioc.ull.es/~pl/eyapsimple/
Parse::Eyapp
perldoc eyapp,
perldoc treereg,
perldoc vgg,
The Syntax Highlight file for vim at http://www.vim.org/scripts/script.php?script_id=2453 and http://nereida.deioc.ull.es/~vim/
Analisis Lexico y Sintactico, (Notes for a course in compiler construction) by Casiano Rodriguez-Leon. Available at http://nereida.deioc.ull.es/~pl/perlexamples/ Is the more complete and reliable source for Parse::Eyapp. However is in Spanish.
Parse::Yapp,
Man pages of yacc(1) and bison(1), http://www.delorie.com/gnu/docs/bison/bison.html
Language::AttributeGrammar
Parse::RecDescent.
HOP::Parser
HOP::Lexer
ocamlyacc tutorial at http://plus.kaist.ac.kr/~shoh/ocaml/ocamllex-ocamlyacc/ocamlyacc-tutorial/ocamlyacc-tutorial.html
The classic Dragon's book Compilers: Principles, Techniques, and Tools by Alfred V. Aho, Ravi Sethi and Jeffrey D. Ullman (Addison-Wesley 1986)
CS2121: The Implementation and Power of Programming Languages (See http://www.cs.man.ac.uk/~pjj, http://www.cs.man.ac.uk/~pjj/complang/g2lr.html and http://www.cs.man.ac.uk/~pjj/cs2121/ho/ho.html) by Pete Jinks
Hal Finkel http://www.halssoftware.com/
G. Williams http://kasei.us/
Thomas L. Shinnick http://search.cpan.org/~tshinnic/
Casiano Rodriguez-Leon (casiano@ull.es)
This work has been supported by CEE (FEDER) and the Spanish Ministry of Educacion y Ciencia through Plan Nacional I+D+I number TIN2005-08818-C04-04 (ULL::OPLINK project http://www.oplink.ull.es/). Support from Gobierno de Canarias was through GC02210601 (Grupos Consolidados). The University of La Laguna has also supported my work in many ways and for many years.
A large percentage of code is verbatim taken from Parse::Yapp 1.05. The author of Parse::Yapp is Francois Desarmenien.
I wish to thank Francois Desarmenien for his Parse::Yapp module, to my students at La Laguna and to the Perl Community. Thanks to the people who have contributed to improve the module (see "CONTRIBUTORS" in Parse::Eyapp). Thanks to Larry Wall for giving us Perl. Special thanks to Juana.
Copyright (c) 2006-2008 Casiano Rodriguez-Leon (casiano@ull.es). All rights reserved.
Parse::Yapp copyright is of Francois Desarmenien, all rights reserved. 1998-2001
These modules are free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
To install Parse::Eyapp, copy and paste the appropriate command in to your terminal.
cpanm
cpanm Parse::Eyapp
CPAN shell
perl -MCPAN -e shell install Parse::Eyapp
For more information on module installation, please visit the detailed CPAN module installation guide.