Pen - Perl Embedding Notation (Yet another parser for embedding Perl in HTML)


  use Pen ;
  Pen->new( $htmlfilename ) ;
  Pen->new( $htmlfilename, { path => ..., blockend => ..., errorlog => ... } ) ;
  Pen->new( { request => ..., path => ..., blockend => ..., errorlog => ... } ) ;


Pen performs simple in-line substitution of Perl code. Its syntax is consistent with SGML and HTML, but Pen can be used on any file type.

Pen recognizes the following syntax as a Perl expression and performs a literal interpretation.

    &subroutine( args1, args2 )

The entire expression is replaced by the subroutine's return value. The syntax reflects a usage which distinguishes defined subroutines from Perl's internal functions. For example,

    &join( '/', @path )

fails unless join() is explicitly defined. Some predefined functions deliberately overlap with internal functions, for example, eval() and undef(). In this documentation, the term subroutine reflects an explicit definition using Perl's sub keyword and avoids ambiguity with internal Perl functions.

Pen functions eval() and undef() are the most basic. Everything delimited by the parentheses is interpreted as a Perl expression. eval() returns output and undef() does not.

The subroutine expression must be enclosed within an SGML or HTML tag. The following defines two complete Perl expressions in Pen:

    <!-- &eval( 'Hello World' ) -->
    <td style="background: &eval( $rowct++ %2? '#ffffff': '#cccccc' )">

returns:

    <!-- Hello World -->
    <td style="background: #cccccc">

Both of these examples demonstrate how Pen performs in-line substitution. As HTML, the first example is pretty useless. But the second example demonstrates Pen's effectiveness for granular in-line substitution.
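As a rough illustration of the substitution idea, and not Pen's actual parser, the following plain Perl sketch replaces each &name( 'literal' ) expression with the return value of a matching subroutine. The names %subs, greet and substitute() are hypothetical, and the sketch assumes a single quoted-string argument and an empty replacement for undefined names.

```perl
use strict;
use warnings;

# Hypothetical dispatch table standing in for subroutines defined in the Pen namespace.
my %subs = (
    greet => sub { "Hello $_[0]" },
);

# Replace each &name( 'literal' ) with the named subroutine's return value.
# Names missing from %subs are replaced by an empty string (an assumption of this sketch).
sub substitute {
    my ($line) = @_;
    $line =~ s{&(\w+)\(\s*'([^']*)'\s*\)}
              { exists $subs{$1} ? $subs{$1}->($2) : '' }gex;
    return $line;
}

print substitute(q{<!-- &greet( 'World' ) -->}), "\n";
```

Only the matched expression is replaced; the surrounding tag is left intact, which is the behavior described above for expressions without the colon switch.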

The colon switch changes the substitution pattern to slurp everything between the <> delimiters.

    <!--: &eval( 'Hello World' ) -->
    <: testing &eval( 'Hello World' )>
    <: &( 'Hello World' )>

returns three identical lines:

    Hello World
    Hello World
    Hello World

The first example hides the Perl expression inside an SGML comment, which may be useful for WYSIWYG editors. The second example demonstrates that the SGML comment formatting is optional, and includes the term testing to underscore everything that the Pen interpreter discards. And the third example illustrates that the eval() subroutine can be invoked implicitly.

Everything is processed within the Pen namespace. Non-localized variables persist until the Pen object is destroyed, normally at the end of the document. Then the entire Pen namespace is cleared. Nothing persists after the HTTP transaction is complete.


The Pen package predefines the following subroutines:

eval or evaluate

eval() takes a Perl expression as an argument, with no delimiters other than the subroutine parentheses. The output, which replaces the in-line expression, is the evaluated result.

If the expression fails, nothing is returned. The error is written into the file specified by the errorlog definition in the constructor, or $pen->{errorlog}.
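The failure behavior can be pictured with Perl's own string eval. This is a minimal sketch under assumptions, not Pen's implementation: eval_quietly() and its $log argument are hypothetical names, and the log is simply appended to.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Evaluate a string of Perl. On failure, append the error to $log and
# return an empty string; otherwise return the result ('' if undefined).
# eval_quietly() is a hypothetical helper, not part of Pen.
sub eval_quietly {
    my ( $code, $log ) = @_;
    my $result = eval $code;
    if ( $@ ) {
        open my $fh, '>>', $log or die "cannot open $log: $!";
        print {$fh} $@;
        close $fh;
        return '';
    }
    return defined $result ? $result : '';
}

# A temporary file stands in for the error log in this demonstration.
my ( undef, $logfile ) = tempfile();
print eval_quietly( '6 * 7', $logfile ), "\n";
```

A failing expression such as die "boom\n" would return an empty string here and leave the message in the log file.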


evalError() is nearly identical, except it returns any thrown errors.


Many Perl expressions return a value as a side effect. For example, the following assignment outputs 0:

    <: &( $ctr = 0 )>



The following two lines illustrate alternative solutions that print no output:

    <: &( $ctr = 0 ; undef )>
    <: &undef( $ctr = 0 )>


The include() subroutine performs like the traditional SSI directive, and outputs a separate file as an in-line substitution. Its argument is either a file name with a fully qualified path, or a filename whose path is supplied to the constructor, stored as $pen->{path}.

include() takes a positive integer as an optional second argument. Pen will skip this number of lines at the beginning of the included file. This option makes it possible to include the same file in a variety of contexts if the included file starts with a series of skip() directives.
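The line-skipping option can be sketched in plain Perl as follows. include_lines() is a hypothetical helper written for illustration; it reads a file and drops a leading number of lines, and makes no claim about how Pen's include() is implemented.

```perl
use strict;
use warnings;

# Return the lines of $file, dropping the first $skip lines (default 0).
# include_lines() is a hypothetical helper, not Pen's include().
sub include_lines {
    my ( $file, $skip ) = @_;
    $skip ||= 0;
    open my $fh, '<', $file or die "cannot open $file: $!";
    my @lines = <$fh>;
    close $fh;
    splice @lines, 0, $skip if $skip > 0;
    return @lines;
}
```

Called as include_lines( $file, 2 ), the sketch returns everything after the second line, so a file whose first lines are only meaningful in another context can be reused.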


do() also takes a file argument and interprets that file directly as a Perl script.

do() takes an optional Boolean as a second argument. If true, any encountered errors will be returned.


filePath() takes a file name as an argument and returns its fully qualified path, using $pen->{path} if necessary.
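A path lookup of this kind might look like the sketch below, which uses the core File::Spec module. resolve_path() and its $base argument are hypothetical; $base plays the role of $pen->{path}.

```perl
use strict;
use warnings;
use File::Spec;

# Absolute names pass through unchanged; relative names are joined to
# the supplied base directory. resolve_path() is a hypothetical helper.
sub resolve_path {
    my ( $name, $base ) = @_;
    return $name if File::Spec->file_name_is_absolute( $name );
    return File::Spec->catfile( $base, $name );
}
```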


Pen processes files line by line. Enclosing HTML or SGML brackets must be on a single line, as is common practice. Several Pen functions operate on lines of text.

For example, comment() slurps the remainder of the line.


skip() slurps the remainder of the line plus an additional number of lines as specified by its argument. skip( 0 ) is equivalent to comment(), used to swallow linefeeds for fussy file formats other than HTML. Another idiom, skip( -1 ), slurps all remaining lines in the current file.

skip() should be used carefully. Any argument greater than two or three becomes a headache. See the block() subroutine as an alternative. skip() is a useful tool for commenting multiple lines and also provides if/then functionality.

    <: &is( ! ref $users->{$userid} )><: &skip(2)>
      Welcome <: &( $users->{$userid}->{name} )>!
    <: &skip(1)>
      Unknown User
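The line-counting behavior of skip() can be modeled outside Pen with a short loop. In this sketch the marker "SKIP n" is made up for illustration and is not Pen syntax, and apply_skips() is a hypothetical name; the conditional is() test is not modeled.

```perl
use strict;
use warnings;

# Walk a list of lines. A line of the form "SKIP n" (a made-up marker)
# drops itself and the next n lines; n = -1 drops everything that remains.
sub apply_skips {
    my @lines = @_;
    my @kept;
    while ( @lines ) {
        my $line = shift @lines;
        if ( $line =~ /^SKIP (-?\d+)$/ ) {
            last if $1 < 0;
            splice @lines, 0, $1;
            next;
        }
        push @kept, $line;
    }
    return @kept;
}
```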


Two commands take the line remainder as part of the argument as shown in these examples:

    <: evalLine()>print "Hello World"

returns:

    Hello World


printf() returns a line for each element in its array argument. This argument can be a one- or two-dimensional array. An example of a two-dimensional array:

    <: &undef( @data = ( [ -1 => 'New User' ], [ 1 => 'Jim S' ] ) )>
    <: printf( @data )><select value="%s">%s</select>

returns:

    <select value="-1">New User</select>
    <select value="1">Jim S</select>
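The same row-by-row formatting can be reproduced in plain Perl with sprintf. This is only a sketch of the idea: format_rows() is a hypothetical name, and the template is passed as an argument rather than taken from the remainder of the line.

```perl
use strict;
use warnings;

# Format one output line per element. Array-reference elements are expanded
# into several sprintf arguments; plain scalars are passed as a single argument.
# format_rows() is a hypothetical name, not a Pen subroutine.
sub format_rows {
    my ( $template, @data ) = @_;
    return map { sprintf( $template, ref($_) eq 'ARRAY' ? @$_ : $_ ) . "\n" } @data;
}

my @data = ( [ -1 => 'New User' ], [ 1 => 'Jim S' ] );
print format_rows( '<select value="%s">%s</select>', @data );
```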


block( *BLOCK, 'end' ) also slurps lines and saves them in a specified array. The first argument is an array reference. The glob style reference is easiest to use. The second argument is the closing delimiter, which appears immediately after the specified lines. This delimiter must be at the beginning of the line and followed by optional whitespace. The delimiter argument is optional if $pen->{blockend} is defined.

    <: &block( *BLOCK, 'end' )><!--
    end -->

This block of code prints out absolutely nothing and creates the array @BLOCK consisting of five one-word lines. Note that each element corresponds to a line of text, including the terminating newline. block() is intended to reuse portions of the file. Here's a common example:


    <: &block( *BADLOGIN, 'end' )><!--
      <script type="text/javascript">
        alert( "Invalid Login" ) ;
        history.back() ;
      </script>
    end -->
    <: &is( ! ref $users->{$userid} )><: &displayBlock( @BADLOGIN )>
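The way a block gathers lines up to its closing delimiter can be sketched in plain Perl. capture_block() is a hypothetical helper that works on an array of lines instead of a file, and it assumes the delimiter line is consumed and discarded.

```perl
use strict;
use warnings;

# Remove lines from @$source up to a line that begins with $delimiter and
# return them; the delimiter line itself is consumed and discarded.
# capture_block() is a hypothetical helper, not Pen's block().
sub capture_block {
    my ( $source, $delimiter ) = @_;
    my @block;
    while ( @$source ) {
        my $line = shift @$source;
        last if $line =~ /^\Q$delimiter\E(?:\s|$)/;
        push @block, $line;
    }
    return @block;
}
```

Each captured element keeps its terminating newline, and the lines after the delimiter stay in the source for normal processing.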

Blocks of HTML and, in particular, Pen HTML, are useful for conditionally displaying content. The iterate() subroutine is used to display a block repeatedly over a data set.


There are three techniques for representing Perl script in a Pen document:

1. Inline, evaluating a single subroutine or expression.
2. Using do() to evaluate an entire Perl script file.
3. Defining multiple lines of script as a block.

The last technique is implemented as follows:

    <: &block( *PERL, 'end' )><!--
      use Pen::ContentManager ;
      $doc = new Pen::ContentManager $docid ;
      @pages = $doc->pages() ;
    end -->

Use evalBlock() to process a block of script:

    <: &evalBlock( @PERL )>


script() combines definition and evaluation of the block script. This example requires that $pen->{blockend} be defined, normally as an argument to the constructor.

    <: &script()><%
      use Pen::ContentManager ;
      $doc = new Pen::ContentManager $docid ;
      @pages = $doc->pages() ;


Pen makes it easy to send email from a website. mailBlock() demonstrates a simple implementation. The email headers are embedded inside the block:

    <: &block( *EMAIL, 'end' )><!--
    To: "<: &( $$user{firstname}.' '.$$user{lastname} )>" <: &skip(0)>
    <&( $$user{email} )>
    From: "Do Not Reply" <>
    Subject: Confirmation of your website visit

    <: &include('confirmation.txt')>
    end -->

    <: &mailBlock( @EMAIL )>

The interpreted block can be piped into any application defined by the mailprogram configuration, $env->{mailprogram}.


MIME::Lite is a useful tool for sending email as an HTML attachment. loadBlock() dumps the Pen output into a referenced variable instead of printing it to the output stream:

    <: &block( *HTML, 'end' )><!--
    <: &include( 'confirmation.htm' )>
    end -->

    <: &loadBlock( \$html, @HTML )>


    <: &( $html = loadBlock( @HTML ) )>

    <: &( $mime->attach( $html, 'text/html' ) )>


iterate() takes two arguments. The first is an iterator; the second is a block array. The iterator is a glob that represents a data array. The block is interpreted repeatedly over each data element.


Now somewhat archaic, globs provide an alternative technique for passing a variable by reference. For example, the two statements below are equivalent:

    <: &block( *EVENT )>
    <: &block( \@EVENT )>

(Uppercase names are recommended for block definitions.)
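The equivalence of the two calling styles can be checked in plain Perl. In this sketch, array_slot() is a hypothetical helper that accepts either a typeglob or an array reference and returns the underlying array reference; it is not taken from Pen.

```perl
use strict;
use warnings;

our @EVENT;

# Accept either a typeglob (*EVENT) or an array reference (\@EVENT) and
# return the underlying array reference. array_slot() is a hypothetical name.
sub array_slot {
    my ($arg) = @_;
    return ref( \$arg ) eq 'GLOB' ? *{$arg}{ARRAY} : $arg;
}

# Both calls append to the same package array.
push @{ array_slot( *EVENT ) },  "first line\n";
push @{ array_slot( \@EVENT ) }, "second line\n";
print scalar(@EVENT), "\n";
```

The glob form also gives the callee access to the scalar of the same name, which is the property iterators rely on.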

A Pen iterator is always passed as a glob. An iterator must be declared using the iterator() subroutine, which also defines the iterator with additional arguments: either array elements or an array reference:

    <: &iterator( *event, @data )>
    <: &iterator( *event, \@data )>

Although not strictly a reference, either definition has the effect: @event = @data.

The simplest example is an iterator representing an array of scalars:

    ( @states = qw( Alabama .. Wyoming ) )

    <: &iterator( *states, @states )>

In this case, since *states is already the glob equivalent of the second argument @states, iterator() can be called as a declaration with a single argument:

    <: &iterator( *states )>

Here's the rest of the example:

    <: &block( *SELECT, 'end' )><!--
      <option><: &( $states )></option>
    end -->
    <: &iterate( *states, @SELECT )>

iterator() references each data set element as the scalar version of the glob. Since the glob is *states, each element is accessed as a scalar with the same name, $states.

As a slightly more complicated example, define @states this way:

    @states = ( [ AL => "Alabama" ] .. [ WY => "Wyoming" ] )

    <: &iterator( *states )>
    <: &block( *SELECT, 'end' )><!--
      <option value="&( $$states[0] )"><: &( $$states[1] )></option>
    end -->
    <: &iterate( *states, @SELECT )>

Or if each element is a hash reference:

    <: &block( *SELECT, 'end' )><!--
      <option value="&( $$states{abbreviation} )">
        <: &( $$states{name} )></option>
    end -->
    <: &iterate( *states, @SELECT )>

Regardless of whether the data set elements are scalars or referenced data structures, a one-dimensional array is fairly easy to implement as an iterator.

Iterators are designed to handle multi-dimensional arrays as well. To display the 50 states in a table of 10 rows and 5 columns, first construct a tabular data set, then define an HTML table consisting of a block of rows and a block of columns:

    <: &script('end')><%
      @states = qw( Alabama .. Wyoming ) ;
      @table = () ;
      push @table, [ splice @states, 0, 5 ] while @states ;
      iterator( *states, @table ) ; ## warning: obliterates @states
    end %>

    <: &block( *COLUMN, 'end' )><%
      <td><: &( $states )></td>
    end %>

    <: &block( *ROW, 'end' )><%
      <!-- &( $states ) - displays "ARRAY(0xa19b0a8)" -->
      <tr><: &iterate( *states, @COLUMN )></tr>
    end %>

      <: &iterate( *states, @ROW )>

This example illustrates blocks that are nested to correspond with the data set, in parent-child relationships. Each block recurses by calling iterate() with the common iterator and the name of the child block.
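The recursion can be modeled in plain Perl to show how one scalar serves as the cursor at every level. This is a sketch under assumptions, not Pen's code: iterate_sketch() is a hypothetical stand-in for iterate(), code references play the role of blocks, and a shortened list of six names is grouped into rows of three.

```perl
use strict;
use warnings;

our $states;    # scalar slot of the *states glob, used as the cursor

# Hypothetical stand-in for iterate(): $render is a code reference playing
# the role of a block. The cursor must hold an array reference, otherwise
# nothing is produced.
sub iterate_sketch {
    my ( $glob, $render ) = @_;
    my $cursor = *{$glob}{SCALAR};    # reference to $states
    my $data   = $$cursor;
    return '' unless ref($data) eq 'ARRAY';
    my $out = '';
    for my $element (@$data) {
        $$cursor = $element;          # each element becomes the current $states
        $out .= $render->();
    }
    $$cursor = $data;                 # restore the cursor for the caller
    return $out;
}

# Build a tabular data set: rows of three names each.
my @names = qw( Alabama Alaska Arizona Arkansas California Colorado );
my @table;
push @table, [ splice @names, 0, 3 ] while @names;
$states = \@table;

my $column = sub { "<td>$states</td>" };
my $row    = sub { '<tr>' . iterate_sketch( *states, $column ) . "</tr>\n" };
print iterate_sketch( *states, $row );
```

At the row level $states holds an array reference, so the nested call iterates over its cells; at the cell level $states holds a plain string and can be printed directly.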

HTML::Pen::Iterator illustrates a relatively sophisticated example: a four-dimensional calendar data set to be presented as a table. The iterator data consists of an array of weeks; each week consists of an array of days; each day consists of an array of times; each time consists of an array of events. Each event is represented by an event object that is a blessed hash reference.

With a Pen iterator, the rendering code is a few simple lines of HTML. The complexity is absorbed in the data structure, which should be defined as follows:

  \@week -> \@day -> \@time -> \%event

In this example, each day has an integer property (1-31) and each time object has a property (hh:mm) which cannot be included within an array definition. The solution is for each event object to inherit all the properties of its forebears, so that it includes both day and time values.


The advantage of this approach is that every recursion can access the same property values, regardless of its position in the stack, by calling iteratorValue(). This subroutine takes the iterator as its first argument, and a property string as its second. iteratorValue() returns the corresponding value of the bottom-most hash object.
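The lookup of the bottom-most hash object can be illustrated as follows. bottom_value() is a hypothetical stand-in for iteratorValue() that takes the data directly rather than an iterator glob, and the sample event with its day, time and title keys is invented for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw( reftype );

# Follow the first element of each nested array until something other than
# an array is reached, then read $key from it if it is hash-based (blessed
# or not). bottom_value() is a hypothetical stand-in for iteratorValue().
sub bottom_value {
    my ( $node, $key ) = @_;
    $node = $node->[0] while ( reftype($node) || '' ) eq 'ARRAY';
    return ( reftype($node) || '' ) eq 'HASH' ? $node->{$key} : undef;
}

# One week holding one day, one time slot, and one event that carries the
# day and time properties of its forebears.
my $event = bless { day => 14, time => '09:30', title => 'Standup' }, 'Event';
my $week  = [ [ [ $event ] ] ];    # \@week -> \@day -> \@time -> \%event
print bottom_value( $week, 'day' ), "\n";
```

Because the event carries its forebears' properties, the same call works whether it starts from the week, the day, or the time level.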

The disadvantage is that every object's properties are replicated across all its descendants. The redundancy cost in the size of the data footprint may be quite high. Alternatively, define each object as a hash reference, and maintain its descendants in an array reference named elements. Then redefine the scalar before calling iterate():

    <: &comment()><!-- for every block -->
    <: &undef( $event = $event->{elements} )>
    <: &iterate( *event, @CHILDBLOCK )>

Note: iteratorValue() is not extended to cover these more complex data structures.

When iterate() is called, the scalar representation of the iterator glob must be an array reference or the subroutine does nothing. This mechanism ensures that no output is displayed for empty data sets.


Pen exports nothing by default.




Jim Schueler, <>


Copyright (C) 2011 by Jim Schueler

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.9 or, at your option, any later version of Perl 5 you may have available.