Parse::Flex - The Fastest Lexer in the West


 # First, you must create your custom lexer:
 $ -n Flex01  grammar.l

 # Then, interface with the lexer you just created:

 use Flex01;
 my $w = gen_walker( 'data.txt' );
 print $w->();
 print $w->();

 use Flex01;
 walkthrough( 'data.txt' );


Parse::Flex works similarly to Parse::Lex, but it uses XS for faster performance.

This module allows you to construct a lexical analyzer with your custom rules. Parse::Flex is not intended to be used directly; instead, use the script to submit your grammar file. The output of the script is a custom shared library and a custom .pm module which, among other things, will transparently load the library and provide an interface to your (custom) lexer. In other words, you supply a grammar.l file to the script and you receive a shared library and a .pm module. Then use only the .pm module, since it will automatically load the library.

The grammar.l file uses the same syntax as flex(1); that is, the actions are written in C. See the flex(1) documentation to learn the syntax, or look at the sample t/grammar.l file inside this package.

Interfacing with Parsers

Almost all Perl parsers expect your lexer to provide either a list of two values (token type and token value) or a reference to such a list. Parse::Flex can provide the right response in both cases, since it consults the calling context via wantarray.
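
The calling convention described above can be illustrated in plain Perl. The following is only a sketch: make_lexer is a hypothetical stand-in, not part of Parse::Flex, and the real lexer does this work in XS. It assumes tokens are (type, value) pairs and shows how one closure can serve both kinds of callers.

```perl
use strict;
use warnings;

# Hypothetical pure-Perl stand-in for a generated lexer; it only illustrates
# the calling convention, not how Parse::Flex implements it in XS.
sub make_lexer {
    my @tokens = @_;                  # each element: [ type, value ]
    return sub {
        my $tok = shift @tokens;
        return unless defined $tok;   # end of input: empty list or undef
        return wantarray ? @$tok      # list context: (type, value)
                         : [@$tok];   # scalar context: reference to the pair
    };
}

my $lex = make_lexer( [ NUM => 1 ], [ PLUS => '+' ] );

my ($type, $value) = $lex->();        # called in list context
print "$type $value\n";

my $pair = $lex->();                  # called in scalar context
print "$pair->[0] $pair->[1]\n";
```

A parser that wants a flat list and one that wants a reference can therefore share the same lexer without any adapter code.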

In the particular case of interfacing with Parse::Yapp, you could also use this interface (if you already have a custom parser):

 my $p = yapp_new  'MyParser'  ;
 print $p->yapp_parse(  'data.txt' ) ;
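
If you wire a lexer into a Parse::Yapp parser by hand instead, the yylex callback that Parse::Yapp calls must return a (token, value) pair and signal end of input with ('', undef). The sketch below shows one way to adapt a token source to that convention. Both make_mock_walker and yapp_lexer_from are hypothetical helpers written for this illustration, and the assumption that the walker yields [type, value] references and undef at end of input is ours, not something this package documents.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a token walker; assumed to yield [type, value]
# references and undef once the input is exhausted.
sub make_mock_walker {
    my @tokens = @_;
    return sub { return shift @tokens };
}

# Wrap a walker so that it follows the Parse::Yapp yylex convention:
# (token, value) for each token, ('', undef) at end of input.
sub yapp_lexer_from {
    my ($walker) = @_;
    return sub {
        my $tok = $walker->();
        return ( '', undef ) unless defined $tok;
        return @$tok;
    };
}

my $yylex = yapp_lexer_from( make_mock_walker( [ NUM => 42 ], [ PLUS => '+' ] ) );

while (1) {
    my ( $type, $value ) = $yylex->();
    last if $type eq '';              # end-of-input marker
    print "$type => $value\n";
}
```

The resulting closure could then be handed to a generated parser as the yylex argument of its YYParse method.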


Loading the Custom Library

As mentioned earlier, your custom .pm module will use bootstrap to automatically load the shared library. Keep the .so where bootstrap can find it; the current directory is always a good option.


All the following methods are defined inside the shared library for every custom lexer. Most methods take arguments that are not shown below; these are identical to the arguments of the equivalent flex(1) functions. For now, consult the source code at lib/Parse/Flex/ (the $xs_content variable).

yylex() Fetches the next token. (Your grammar should return 0 at the end of input.)
yyin() Sets the next input file.
yyset_in() Sets the next input file.
yyget_in() Returns a glob for whatever yyin currently points to.
yyout() Sets where the output of the ECHO command should go.
yyset_out() Sets where the output of the ECHO command should go.
yyget_out() Returns a glob for whatever yyout currently points to.
yyget_text() The semantic value of the token.
yyget_leng() The length of the string that holds the semantic value.
yyget_lineno() The current value of yylineno. (But first enable it with the corresponding %option.)
yyset_lineno() Sets the value of yylineno. (But first enable it with the corresponding %option.)
yyset_debug() If set to non-zero (which is already the default), the scanner will output debugging info (provided, of course, that you have also enabled it with the corresponding %option).
yyget_debug() Fetches the current value of yy_flex_debug.
yy_scan_string() The lexer will read its data from this string.
yy_scan_bytes() The lexer will read n bytes from this string (which could contain nulls).
yyrestart() Restart scanning from this file.
create_push_buffer() Creates a buffer and pushes it onto buffer_state. Don't forget to pop it from within your grammar rules (when you have to).
yypop_buffer_state() Removes the top buffer from buffer_state.

Internal Methods


Is an internal method of no interest to the user.




Ioannis Tambouras, <>


flex(1), Parse::Lex