Gary Spivey


EP3 - The Extensible Perl PreProcessor


  use Text::EP3;
  [use Text::EP3::{Extension}] # Language Specific Modules
  my $object = new Text::EP3 file;
     [other methods that can be invoked]
     $object->ep3_process([$filename, [$condition]]);


EP3 is a Perl5 program that preprocesses STDIN or some set of input files and produces an output file. It works only on files: it seems to me that if you want to preprocess arrays or somesuch, you should be using perl. EP3 was first developed to provide a flexible preprocessor for the Verilog hardware description language. Verilog presents some problems that were not easily solved by using cpp or m4. I wanted to be able to use a normal preprocessor, but extend its functionality. So I wrote EP3 - the Extensible Perl PreProcessor.

The main difference between EP3 and other preprocessors is its built-in extensibility. Every directive in EP3 is really a method defined in EP3, one of its submodules, or embedded in the file that is being processed. Because each directive name is linked to the method of the same name, new directives can be added simply by adding methods, thus extending the preprocessor.

Many of the features of EP3 can be modified via command line switches. For every command line switch, there is also an accessor method.

Directives and Method Invocation

Directives are preceded by a user-defined delimiter. The default delimiter is `@'. This delimiter was chosen to avoid conflicts with other preprocessor delimiters (`#' and the Verilog backtick), as well as Verilog syntax that might be found at the beginning of a line (`$', `&', etc.). A directive is defined in Perl as the beginning of the line, any amount of whitespace, and the delimiter immediately followed by Perl word characters (0-9A-Za-z_).
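The definition above can be written as a Perl regular expression. The following is a minimal sketch, not EP3's actual source; the helper name parse_directive is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of EP3): returns the directive name if $line
# is a directive line for the given delimiter, otherwise undef.
sub parse_directive {
    my ($line, $delimiter) = @_;
    $delimiter = '@' unless defined $delimiter;
    # Start of line, optional whitespace, the delimiter, then word characters.
    return $line =~ /^\s*\Q$delimiter\E([0-9A-Za-z_]+)/ ? $1 : undef;
}

for my $line ('@define WIDTH 8', '   @include "file.V"', 'wire a; // no directive') {
    my $name = parse_directive($line);
    printf "%-26s => %s\n", $line, defined $name ? $name : '(none)';
}
```

The delimiter is passed through \Q...\E so that a multi-character or punctuation delimiter is matched literally.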

EP3 looks for directives, strips off the delimiter, and then invokes a method of the same name. The standard directives are defined within the EP3 program. Library or user-defined directives may be loaded as Perl modules either via the use command or from a command line switch for inclusion at the beginning of the EP3 run. Using the "include" directive coupled with the "perl_begin/end" directives, Perl subroutines (and hence EP3 directives) may be dynamically included during the EP3 run.
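The strip-and-invoke step can be illustrated with Perl's can method. This is only a sketch of the idea under stated assumptions: the package Sketch::Preprocessor and its greet and dispatch methods are hypothetical, and EP3's real internals may differ.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a preprocessor class; not EP3's implementation.
package Sketch::Preprocessor;

sub new {
    my $class = shift;
    return bless { log => [] }, $class;
}

# A "directive" is just a method with the same name.
sub greet {
    my ($self, $args) = @_;
    push @{ $self->{log} }, "greet called with '$args'";
}

# Strip the delimiter, then invoke the method named by the directive.
# Returns 1 if a directive was handled, 0 otherwise.
sub dispatch {
    my ($self, $line) = @_;
    my ($name, $args) = $line =~ /^\s*\@([0-9A-Za-z_]+)\s*(.*)$/ or return 0;
    my $method = $self->can($name) or return 0;   # unknown directive
    $self->$method($args);
    return 1;
}

package main;

my $pp = Sketch::Preprocessor->new;
$pp->dispatch('@greet world');
print "$_\n" for @{ $pp->{log} };
```

Because the lookup goes through normal method resolution, any method the object inherits becomes a directive, which is what makes the extension mechanisms described in this document possible.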

Directive Extension Method 1: The use command.

A module may be included with the use statement provided that it pushes its package name onto EP3's @ISA array (thus telling EP3 to inherit its methods). For a Verilog module with the package name Text::EP3::Verilog, the following line must be included in the module:

    push (@Text::EP3::ISA, qw(Text::EP3::Verilog));

This package can then be simply included in whatever script you are using to call EP3 with the line:

    use Text::EP3::Verilog;

All methods within the module are now available to EP3 as directives.
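The effect of pushing a package name onto another package's @ISA can be seen in isolation. The sketch below uses two hypothetical packages, Sketch::Base and Sketch::Extension, as stand-ins for Text::EP3 and an extension module; it does not load EP3 itself.

```perl
use strict;
use warnings;

# Hypothetical base class standing in for Text::EP3.
package Sketch::Base;
our @ISA;
sub new {
    my $class = shift;
    return bless {}, $class;
}

# Hypothetical extension standing in for a module such as Text::EP3::Verilog.
package Sketch::Extension;
# The extension adds itself to the base class's @ISA, as described above.
push @Sketch::Base::ISA, 'Sketch::Extension';
sub signal { return "signal directive from " . __PACKAGE__ }

package main;

my $obj = Sketch::Base->new;
if ($obj->can('signal')) {
    my $out = $obj->signal;
    print "$out\n";
}
```

Note the direction of inheritance: the base class inherits from the extension, so an object of the base class can call the extension's methods.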

Directive Extension Method 2: The command line switch.

A module can be included at run time with the -module modulename switch on the command line (assuming the ep3_parse_command_line method is invoked). The modulename is assumed to have a .pm extension and exist somewhere in the directories specified in @INC. All methods within the module are now available to EP3 as directives.

Directive Extension Method 3: The ep3_modules accessor method.

Modules can be added by using the accessor method ep3_modules.

    $object->ep3_modules("module1","module2", ....);

All methods within the module are now available to EP3 as directives.

Directive Extension Method 4: Embedded in the source code or included files.

Using the perl_begin and perl_end directives to delineate Perl sections, subroutines can be declared (as methods) anywhere in a processed file or in a file that the processed file includes. In this way, runtime methods are made available to EP3. For example ...

    1 Text to be printed ...
    @perl_begin
    sub hello {
        my $self = shift;
        print "Hello there\n";
    }
    @perl_end
    2 Text to be printed ...
    @hello
    3 Text to be printed ...

would result in

    1 Text to be printed ...
    2 Text to be printed ...
    Hello there
    3 Text to be printed ...

Using this method, libraries of directives can be built and included with the include directive (but it is recommended that they be moved into a module when they become static).

Input Files and Processing

Input files are processed one line at a time. The EP3 engine attempts to perform substitutions with elements stored in macro/define/replace lists. All directive lines are preprocessed before being evaluated (the only exception being the key portions of the if[n]def and define directives). Directive lines can be extended across multiple lines by placing the `\' character at the end of each line. Comments are normally protected from the preprocessor, but protection can be dynamically turned off and then back on. From a command line switch, comments can also be deleted from the output.
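Line continuation can be sketched as follows. This is an illustration only: the helper name join_continued is hypothetical, and it assumes the trailing backslash is dropped and the next line is appended directly, which the text above does not spell out. Unlike EP3, the sketch joins any line ending in a backslash, not only directive lines.

```perl
use strict;
use warnings;

# Hypothetical helper, not EP3's code: joins lines that end in a backslash
# with the line that follows. Input lines are expected without newlines.
sub join_continued {
    my @lines = @_;
    my (@joined, $pending);
    for my $line (@lines) {
        $pending = defined $pending ? $pending . $line : $line;
        # A trailing backslash means "keep accumulating".
        next if $pending =~ s/\\\s*$//;
        push @joined, $pending;
        undef $pending;
    }
    push @joined, $pending if defined $pending;   # unterminated continuation
    return @joined;
}

my @logical = join_continued(
    '@define LONG first \\',
    'second',
    'plain text',
);
print "$_\n" for @logical;
```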

Output Files

EP3 typically writes output to Perl's STDOUT, but output can be directed to any file. EP3 can also be run in "dependency check" mode via a command line switch. In this mode, normal output is suppressed, and all dependent files are output in the order accessed. NOTE! EP3 uses the select call to change the default output file for included Perl blocks. If you are invoking EP3 through its methods from your own script, the default output for the rest of your script will be changed as well. (This can be easily worked with, but should be known beforehand.)
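One way to work with the select behavior is to remember the currently selected filehandle before the call and restore it afterwards. The sketch below does not call EP3; an in-memory filehandle stands in for whatever output file EP3 would select.

```perl
use strict;
use warnings;

# Remember whichever filehandle is currently selected (normally STDOUT).
my $previous = select();

# Stand-in for a call that changes the default output handle, as the note
# above says EP3 does. Here an in-memory handle is selected instead.
open(my $capture, '>', \my $buffer) or die "cannot open in-memory file: $!";
select($capture);
print "preprocessed text\n";      # goes to $capture, not STDOUT

# Restore the original default so the rest of the script prints normally.
select($previous);
close($capture);
print "back on the original handle\n";
```

In a real script, the lines between the two select calls would be replaced by the EP3 method invocation.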

Most parameters can be modified before invoking EP3, including the directive string, comment delimiters, comment protection and inclusion, include path, and startup defines.

Standard Directives

EP3 defines a standard set of preprocessor directives with a few special additions that integrate the power of Perl into the coded language.

The define directive

    @define key definition

The define directive assigns the definition to the key. The definition can contain any character including whitespace. The key is searched for as an individual word (i.e., the input to be searched is tokenized on Perl word boundaries). The definition contains everything from the whitespace following the key until the end of the line.

The replace directive

    @replace key definition

The replace directive is identical to the define directive except that the substitution is performed if the key exists anywhere, not just on word boundaries.
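The difference between define and replace comes down to whether the key is anchored on word boundaries. A minimal sketch, with hypothetical helper names apply_define and apply_replace that are not part of EP3:

```perl
use strict;
use warnings;

# Hypothetical helpers illustrating the two matching rules; not EP3's code.
sub apply_define {
    my ($text, $key, $definition) = @_;
    $text =~ s/\b\Q$key\E\b/$definition/g;   # whole words only
    return $text;
}

sub apply_replace {
    my ($text, $key, $definition) = @_;
    $text =~ s/\Q$key\E/$definition/g;       # anywhere in the text
    return $text;
}

my $line = 'assign BUS = BUS_IN;';
print apply_define($line, 'BUS', 'data[7:0]'), "\n";    # BUS_IN is left alone
print apply_replace($line, 'BUS', 'data[7:0]'), "\n";   # BUS_IN is rewritten too
```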

The macro directive

    @macro key(value[,value]*) definition

The macro directive tokenizes in the same way as the define directive. When key(value,...) is found in the input, it is replaced with the definition and the value list from the input is saved. The definition is then parsed and the original macro values are replaced with the saved values.
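Macro expansion can be sketched as below. The helper names substitute_values and expand_macro are hypothetical, and the sketch handles only a flat argument list with no nested parentheses; it is not EP3's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the formal values in $definition with the
# actual values found in the call. All names are substituted in one pass so
# that an actual value is never substituted a second time.
sub substitute_values {
    my ($definition, $params, $arg_text) = @_;
    return $definition unless @$params;
    $arg_text =~ s/^\s+|\s+$//g;
    my @actual = split /\s*,\s*/, $arg_text;
    my %value;
    @value{@$params} = map { defined $_ ? $_ : '' } @actual[0 .. $#$params];
    my $names = join '|', map { quotemeta } @$params;
    (my $body = $definition) =~ s/\b($names)\b/$value{$1}/g;
    return $body;
}

# Hypothetical helper: expand every call of one macro in $text.
sub expand_macro {
    my ($text, $key, $params, $definition) = @_;
    $text =~ s/\b\Q$key\E\(([^()]*)\)/substitute_values($definition, $params, $1)/ge;
    return $text;
}

# Corresponds to a line such as:  @macro MAX(a,b) ((a) > (b) ? (a) : (b))
my $out = expand_macro('x = MAX(p, q);', 'MAX', ['a', 'b'],
                       '((a) > (b) ? (a) : (b))');
print "$out\n";
```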

The eval directive

    @eval key expr

The eval directive first evaluates the expr using Perl. Any valid Perl expr is accepted. The key is then defined with the result of the evaluation.
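The evaluate-then-define sequence can be sketched with Perl's string eval. The helper eval_define and the plain hash used as a define list are hypothetical; EP3's own storage may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not EP3's code: evaluate $expr as Perl and store the
# result under $key in a plain hash standing in for the define list.
sub eval_define {
    my ($defines, $key, $expr) = @_;
    my $result = eval $expr;
    die "cannot evaluate '$expr': $@" if $@;
    $defines->{$key} = $result;
    return $result;
}

my %defines;
# Corresponds to a line such as:  @eval ADDR_BITS 2 ** 4 - 1
eval_define(\%defines, 'ADDR_BITS', '2 ** 4 - 1');
print "ADDR_BITS = $defines{ADDR_BITS}\n";
```

Since the expression is executed as Perl code, anything Perl can do is possible here, so input files should come from a trusted source.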

The include directive

    @include <file> or "file" [condition]

The include directive looks for "file" in the present directory, and for <file> anywhere in the include path (definable via command line switch). Included files are recursively evaluated by the preprocessor. If the optional condition is specified, only those lines in between the text strings "@mark condition_BEGIN" and "@mark condition_END" will be included. The condition can be any string. For example, if the file "file.V" contains the following lines:

    1 Stuff before
    @mark PORT_BEGIN
    2 Stuff middle
    @mark PORT_END
    3 Stuff after

Then any file with the following line:

    @include "file.V" PORT 

will include the following line from file.V

    2 Stuff middle

This is useful for partial inclusion of files (like port list specifications in Verilog).
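The selection of lines between the two marks can be sketched as a small filter. The helper lines_for_condition is hypothetical and works on a list of lines rather than a file; it is not EP3's code.

```perl
use strict;
use warnings;

# Hypothetical helper: returns only the lines lying between
# "@mark <condition>_BEGIN" and "@mark <condition>_END".
sub lines_for_condition {
    my ($condition, @lines) = @_;
    my ($inside, @kept) = (0);
    for my $line (@lines) {
        if    ($line =~ /^\s*\@mark\s+\Q$condition\E_BEGIN\b/) { $inside = 1 }
        elsif ($line =~ /^\s*\@mark\s+\Q$condition\E_END\b/)   { $inside = 0 }
        elsif ($inside)                                         { push @kept, $line }
    }
    return @kept;
}

my @file = (
    '1 Stuff before',
    '@mark PORT_BEGIN',
    '2 Stuff middle',
    '@mark PORT_END',
    '3 Stuff after',
);
print "$_\n" for lines_for_condition('PORT', @file);
```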

The enum directive

    @enum a,b,c,d,...

The enum directive generates multiple defines, with each sequential element receiving a value one greater than the previous element. Counting starts at 0 by default. If any element is a number, the enum value will be set to that value.
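The counting rule can be sketched as follows. The helper enum_defines is hypothetical, and it rests on one assumption about the wording above: a numeric element is taken to reset the running counter, so that the next name receives that value.

```perl
use strict;
use warnings;

# Hypothetical helper, not EP3's code. Assumption: a numeric element resets
# the counter, and the following name receives that value.
sub enum_defines {
    my ($list) = @_;
    my %define;
    my $next = 0;
    for my $element (split /\s*,\s*/, $list) {
        if ($element =~ /^\d+$/) { $next = $element; next }
        $define{$element} = $next++;
    }
    return %define;
}

# Corresponds to a line such as:  @enum IDLE,READ,WRITE,8,ERROR,DONE
my %define = enum_defines('IDLE,READ,WRITE,8,ERROR,DONE');
print "$_ = $define{$_}\n" for sort { $define{$a} <=> $define{$b} } keys %define;
```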

The ifdef and ifndef directives

    @ifdef key
    @ifndef key

Conditional compilation directives. The key is defined if it was placed in the define/replace list by define, replace, or any command that generates a define or replace.

The if directive

    @if expr

The expression is evaluated using Perl. The expression can be any valid Perl expression. This allows for a wide range of conditional compilation.

The elif [elsif] directive

    @[elif|elsif] key | expr

The else if directive. Used for either "if[n]def" or "if".

The else directive

    @else

The else directive. Used for either "if[n]def" or "if".

The endif directive

    @endif

The conclusion of any "if[n]def" or "if" block.

The comment directive

    @comment on|off|default|previous

The comment switch can be one of "on", "off", "default", or "previous". It turns comments on or off in the resultant file. This directive is very useful when including other files with commented header descriptions. By surrounding a header with "comment off" and "comment previous", the included file's comments are kept out of the output. Using "comment on" with "comment previous" ensures that comments are included (as in an attached synthesis directive file). The default comment setting is on. This can be altered by a command line switch. The "comment default" directive will restore the comment setting to the EP3 invocation default.

The ep3 directive

    @ep3 on|off

The "ep3 off" directive turns off preprocessing until the "ep3 on" directive is encountered. This can greatly speed up processing of large files where preprocessing is only necessary in small chunks.

The perl_begin and perl_end directives

    @perl_begin
    perl code here ....
    (Single line and multi-line output mechanisms are available)
    @> text to be output after variable interpolation
    or
    @>> text to be output
        after variable interpolation
    @<<
    @perl_end

The "perl" directives provide the underlying language with all of the power of Perl, embedded in the preprocessed code. Anything enclosed within the "perl_begin" and "perl_end" directives will be evaluated as a Perl script. This can be used to include a subroutine that can later be called as a directive. Using this type of extension, directive libraries can be developed and included to perform a variety of powerful source code development features. This construct can also be used to mimic and expand the VHDL generate capabilities. Within a perl_[begin|end] block, the "@>" and "@>> @<<" directives direct EP3 to perform variable interpolation on the given line or lines and then print the result to the output.

The debug directive

    @debug on|off|value

The debug directive enables debug statements to go to the output file. The debug statements are preceded by the Line Comment string. Currently the debug values that will enable printouts are the following:

    0x01  1  - Primary messages (Entering Subroutines)
    0x02  2  - ep3_process Engine
    0x04  4  - define (replace, macro, eval, enum)
    0x08  8  - include
    0x10  16 - if (else, ifdef, etc.)
    0x20  32 - perl_begin/end
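The values in the table are powers of two, which suggests they are bit flags that can be combined, although the text does not say so explicitly. Under that assumption, a debug value can be decoded as in the sketch below; the flag names and the helper enabled_classes are hypothetical labels, not EP3 identifiers.

```perl
use strict;
use warnings;

# Flag names are hypothetical labels for the values in the table above.
my %DEBUG_FLAG = (
    primary     => 0x01,
    engine      => 0x02,
    define      => 0x04,
    include     => 0x08,
    conditional => 0x10,
    perl        => 0x20,
);

# Returns the names of all message classes enabled by a debug value,
# assuming the values combine as a bit mask.
sub enabled_classes {
    my ($value) = @_;
    return grep { $value & $DEBUG_FLAG{$_} }
           sort { $DEBUG_FLAG{$a} <=> $DEBUG_FLAG{$b} } keys %DEBUG_FLAG;
}

# 0x04 | 0x08 = 12 would correspond to "@debug 12" in a source file.
my $value = $DEBUG_FLAG{define} | $DEBUG_FLAG{include};
print "$value enables: @{[ enabled_classes($value) ]}\n";
```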

EP3 Methods

EP3 defines several methods that can be invoked by the user.


ep3_execute sets up EP3 to act like a Perl script. It parses the command line, loads any specified modules (including those named on the command line), applies any predefined defines, sets up the output files, and then processes the input. Sort of the whole shebang.


ep3_parse_command_line does just that - parses the command line looking for EP3 options. It uses the Getopt::Long module.


ep3_modules will find and include any modules specified as arguments. It expects just the name and will append .pm to it before doing a require. The method returns the methods specified in the object's methods array.


ep3_output_file determines what the output should be (either the processed text or a list of dependencies) and where it should go. It then proceeds to open the required output files. NOTE! - this method uses select to change the default output file. The method returns the output filename.


ep3_reset resets all of the internal EP3 lists (defines, replaces, keycounts, etc.) so that a user can do multiple files independently from within one script.

ep3_process([$filename, [$condition]])

ep3_process is the guts of the whole thing. It takes a filename as input and produces the specified output. This is the method that is called recursively by the include directive. A null filename will cause ep3_process to look for filenames in ARGV.


This method will add the specified directories to the ep3 include path.


This method will initialize defines with string1 defined as string2. It initializes all of the defines in the object's Defines array.


This method sets the end_comment string to the value specified. If null, the method returns the current value.


This method sets the start_comment string to the value specified. If null, the method returns the current value.


This method sets the line_comment string to the value specified. If null, the method returns the current value.


This method sets the delimiter string to the value specified. If null, the method returns the current value.


This method enables/disables dependency list generation. When gen_depend_list is 1, a dependency list is generated. When it is 0, normal operation occurs. If null, the method returns the current value.


This method sets the keep_comments variable to the value specified. If null, the method returns the current value.


This method sets the protect_comments variable to the value specified. If null, the method returns the current value.

EP3 Options

EP3 Options can be set from the command line (if ep3_execute or ep3_parse_command_line is invoked) or the internal variables can be explicitly set.

    Should comments be protected from substitution?
    Default: 1
    Should comments be passed to the output?
    Default: 1
    Are we generating a dependency list or simply processing?
    Default: 0
-delimeter string
    The directive delimiter - can be a string
    Default: @
-define string1=string2
    Defines from the command line. 
    Multiple -define options can be specified
    Default: ()
-includes directory
    Where to look for include files. 
    Multiple -includes options can be specified
    Default: ()
-output_filename filename
    Where to place the output. 
    Default: STDOUT
-modules filename
    Modules to load (just the module name, expecting to find it somewhere in @INC).
    Multiple -modules options can be specified
    Default: ()
-line_comment string
    The Line Comment string. 
    Default: //
-start_comment string
    The Start Comment string. 
    Default: /*
-end_comment string
    The End Comment string. 
    Default: */


Gary Spivey, Dept. of Defense, Ft. Meade, MD.

Many thanks to Steve Bresson for his help, ideas, and code ...