NAME

Filter::Simple - Simplified source filtering

SYNOPSIS

 # in MyFilter.pm:

         package MyFilter;

         use Filter::Simple;
         
         FILTER { ... };

         # or just:
         #
         # use Filter::Simple sub { ... };

 # in user's code:

         use MyFilter;

         # this code is filtered

         no MyFilter;

         # this code is not

DESCRIPTION

The Problem

Source filtering is an immensely powerful feature of recent versions of Perl. It allows one to extend the language itself (e.g. the Switch module), to simplify the language (e.g. Language::Pythonesque), or to completely recast the language (e.g. Lingua::Romana::Perligata). Effectively, it allows one to use the full power of Perl as its own, recursively applied, macro language.

The excellent Filter::Util::Call module (by Paul Marquess) provides a usable Perl interface to source filtering, but it is often too powerful and not nearly as simple as it could be.

To use Filter::Util::Call directly, it is necessary to do the following:

  1. Download, build, and install the Filter::Util::Call module.

  2. Set up a module that does a use Filter::Util::Call.

  3. Within that module, create an import subroutine.

  4. Within the import subroutine do a call to filter_add, passing it a subroutine reference.

  5. Within the subroutine reference, call filter_read or filter_read_exact to "prime" $_ with source code data from the source file that will use your module. Check the status value returned to see if any source code was actually read in.

  6. Process the contents of $_ to change the source code in the desired manner.

  7. Return the status value.

  8. If the act of unimporting your module (via a no) should cause source code filtering to cease, create an unimport subroutine, and have it call filter_del. Make sure that the call to filter_read or filter_read_exact in step 5 will not accidentally read past the no. Effectively this limits source code filters to line-by-line operation, unless the import subroutine does some fancy pre-pre-parsing of the source code it's filtering.

For example, here is a minimal source code filter in a module named BANG.pm. It simply converts every occurrence of the sequence BANG\s+BANG to the sequence die 'BANG' if $BANG in any piece of code following a use BANG; statement (until the next no BANG; statement, if any):

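        package BANG;

        use Filter::Util::Call;

        sub import {
            my ($class) = @_;       # the package being imported, i.e. "BANG"
            filter_add( sub {
                my ($status, $no_seen);
                my $data = "";
                # Accumulate source lines until EOF or a "no BANG;" line
                while (($status = filter_read()) > 0) {
                        if (/^\s*no\s+\Q$class\E\s*;\s*?$/) {
                                $no_seen=1;
                                last;
                        }
                        $data .= $_;
                        $_ = "";
                }
                return $status if $status < 0;   # propagate read errors
                $_ = $data;
                s/BANG\s+BANG/die 'BANG' if \$BANG/g;
                # Re-emit the terminator so that unimport is called
                $_ .= "no $class;\n" if $no_seen;
                return length($_);               # 0 once the input is exhausted
            })
        }

        sub unimport {
            filter_del();
        }

        1;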

This level of sophistication puts filtering out of the reach of many programmers.

A Solution

The Filter::Simple module provides a simplified interface to Filter::Util::Call; one that is sufficient for most common cases.

Instead of the above process, with Filter::Simple the task of setting up a source code filter is reduced to:

  1. Set up a module that does a use Filter::Simple and then calls FILTER { ... }.

  2. Within the anonymous subroutine or block that is passed to FILTER, process the contents of $_ to change the source code in the desired manner.

In other words, the previous example would become:

        package BANG;
        use Filter::Simple;
        
        FILTER {
            s/BANG\s+BANG/die 'BANG' if \$BANG/g;
        };

        1 ;
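
To see what the body of the FILTER block does to a piece of source text, the same substitution can be applied to an ordinary string, outside of any filtering machinery. This is only an illustrative sketch: the sample text in $source and the variable names are assumptions, and no filter is actually installed.

```perl
use strict;
use warnings;

# Hypothetical sample of user code, held in a string for illustration.
my $source = <<'END_SOURCE';
my $BANG = 1;
BANG BANG;
print "not reached\n";
END_SOURCE

# The same substitution that the FILTER block applies to $_.
(my $filtered = $source) =~ s/BANG\s+BANG/die 'BANG' if \$BANG/g;

print $filtered;
```

Only the second line of the sample is rewritten, because the pattern requires two occurrences of BANG separated by whitespace.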

Disabling or changing no behaviour

By default, the installed filter filters only up to a line of the form:

        no ModuleName;

but this can be altered by passing a second argument to use Filter::Simple.

That second argument may be either a qr'd regular expression (which is then used to match the terminator line), or a defined false value (which indicates that no terminator line should be looked for).

For example, to cause the previous filter to filter only up to a line of the form:

        GNAB esu;

you would write:

        package BANG;
        use Filter::Simple;
        
        FILTER {
                s/BANG\s+BANG/die 'BANG' if \$BANG/g;
        }
        qr/^\s*GNAB\s+esu\s*;\s*?$/;

and to prevent the filter's being turned off in any way:

        package BANG;
        use Filter::Simple;
        
        FILTER {
                s/BANG\s+BANG/die 'BANG' if \$BANG/g;
        }
        "";
        # or: 0;
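
The effect of the terminator argument can be modelled on a plain string. The sketch below is a simplified model and not the module's actual implementation: split_at_terminator is a hypothetical helper, and the sample source text is an assumption. It shows which lines a filter would receive for a given terminator pattern, and that a false terminator hands over everything.

```perl
use strict;
use warnings;

# Hypothetical helper: model which lines a filter would receive.
# $terminator is a qr// pattern, or a false value meaning "never stop".
sub split_at_terminator {
    my ($source, $terminator) = @_;
    my (@to_filter, @rest);
    my $stopped = 0;
    for my $line (split /^/, $source) {    # split into lines, keeping newlines
        $stopped = 1 if !$stopped && $terminator && $line =~ $terminator;
        if ($stopped) { push @rest, $line } else { push @to_filter, $line }
    }
    return (join('', @to_filter), join('', @rest));
}

# Assumed sample input: one line before the terminator, one after it.
my $source = "BANG BANG;\nGNAB esu;\nBANG BANG;\n";

my ($filtered, $rest) =
    split_at_terminator($source, qr/^\s*GNAB\s+esu\s*;\s*?$/);
$filtered =~ s/BANG\s+BANG/die 'BANG' if \$BANG/g;

print $filtered, $rest;
```

In this model the terminator line and everything after it are left unfiltered, so the second BANG BANG survives unchanged.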

All-in-one interface

Separating the loading of Filter::Simple:

        use Filter::Simple;

from the setting up of the filtering:

        FILTER { ... };

is useful because it allows other code (typically parser support code or caching variables) to be defined before the filter is invoked. However, there is often no need for such a separation.

In those cases, it is easier to just append the filtering subroutine and any terminator specification directly to the use statement that loads Filter::Simple, like so:

        use Filter::Simple sub {
                s/BANG\s+BANG/die 'BANG' if \$BANG/g;
        };

This is exactly the same as:

        use Filter::Simple;
        BEGIN {
                Filter::Simple::FILTER {
                        s/BANG\s+BANG/die 'BANG' if \$BANG/g;
                };
        }

except that the FILTER subroutine is not exported by Filter::Simple.

How it works

The Filter::Simple module exports into the package that calls FILTER (or uses it directly) -- such as package "BANG" in the above example -- two automagically constructed subroutines -- import and unimport -- which take care of all the nasty details.

In addition, the generated import subroutine passes its own argument list to the filtering subroutine, so the BANG.pm filter could easily be made parametric:

        package BANG;
 
        use Filter::Simple;
        
        FILTER {
            my ($die_msg, $var_name) = @_;
            s/BANG\s+BANG/die '$die_msg' if \${$var_name}/g;
        };

        # and in some user code:

        use BANG "BOOM", "BAM";  # "BANG BANG" becomes: die 'BOOM' if ${BAM}

The specified filtering subroutine is called every time a use BANG is encountered, and passed all the source code following that call, up to either the next no BANG; (or whatever terminator you've set) or the end of the source file, whichever occurs first. By default, any no BANG; call must appear by itself on a separate line, or it is ignored.
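
The parametric substitution can be tried out on a string without installing a filter. In this sketch apply_bang is a hypothetical stand-in for the FILTER block: it takes the source text followed by the arguments that would be given after use BANG, and returns the rewritten text.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the FILTER block (not part of Filter::Simple):
# receives the source text plus the arguments that would follow "use BANG".
sub apply_bang {
    my ($source, $die_msg, $var_name) = @_;
    # \$ yields a literal dollar sign; {$var_name} interpolates inside braces.
    $source =~ s/BANG\s+BANG/die '$die_msg' if \${$var_name}/g;
    return $source;
}

print apply_bang("BANG BANG;\n", "BOOM", "BAM");
```

The generated code refers to the variable as ${BAM}, which Perl treats the same as $BAM.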

AUTHOR

Damian Conway (damian@conway.org)

COPYRIGHT

 Copyright (c) 2000, Damian Conway. All Rights Reserved.
 This module is free software. It may be used, redistributed
 and/or modified under the terms of the Perl Artistic License
 (see http://www.perl.com/perl/misc/Artistic.html)