Pod::POM::View::HTML::Filter - Use filters on sections of your pod documents
In your POD:
Some coloured Perl code: =begin filter perl # now in full colour! $A++; =end filter =for filter=perl $A++; # this works too This should read C<bar bar bar>: =begin filter foo bar foo bar =end filter
In your code:
my $view = Pod::POM::View::HTML::Filter->new; $view->add( foo => { code => sub { my $s = shift; $s =~ s/foo/bar/gm; $s }, # other options are available } ); my $pom = Pod::POM->parse_file( '/my/pod/file' ); $pom->present($view);
The resulting HTML will look like this (modulo the stylesheet):
# now in full colour! $A++;
$A++; # this works too
This should read bar bar bar:
bar bar bar
This module is a subclass of Pod::POM::View::HTML that support the filter extension. This can be used in =begin / =end and =for pod blocks.
Pod::POM::View::HTML
filter
=begin
=end
=for
Please note that since the view maintains an internal state, only an instance of the view can be used to present the POM object. Either use:
my $view = Pod::POM::View::HTML::Filter->new; $pom->present( $view );
or
$Pod::POM::DEFAULT_VIEW = Pod::POM::View::HTML::Filter->new; $pom->present;
Even though the module was specifically designed for use with Perl::Tidy, you can write your own filters quite easily (see "Writing your own filters").
Perl::Tidy
The whole idea of this module is to take advantage of all the syntax colouring modules that exist (actually, Perl::Tidy was my first target) to produce colourful code examples in a POD document (after conversion to HTML).
Filters can be used in two different POD constructs:
=begin filter filter
The data in the =begin filter ... =end filter region is passed to the filter and the result is output in place in the document.
=begin filter
=end filter
The general form of a =begin filter block is as follow:
=begin filter lang optionstring # some text to process with filter "lang" =end filter
The optionstring is trimed for whitespace and passed as a single string to the filter routine which must perform its own parsing.
=for filter=filter
=for filters work just like =begin/C=<end> filters, except that a single paragraph is the target.
The general form of a =for filter block is as follow:
=for filter
=for filter=lang:option1:option2 # some code in language lang
The option string sent to the filter lang would be option1 option2 (colons are replaced with spaces).
lang
option1 option2
Some filters may accept options that alter their behaviour. Options are separated by whitespace, and appear after the name of the filter. For example, the following code will be rendered in colour and with line numbers:
=begin filter perl -nnn $a = 123; $b = 3; print $a * $b; # prints 369 print $a x $b; # prints 123123123 =end filter
=for filters can also accept options, but the syntax is less clear. (This is because =for expects the formatname to match \S+.)
\S+
The syntax is the following:
=for filter=html:nnn=1 <center><img src="camel.png" /> A camel</center>
In summary, options are separated by space for =begin blocks and by colons for =for paragraphs.
The options and their paramater depend on the filter, but they cannot contain the pipe (|) or colon (:) character, for obvious reasons.
|
:
Having filter to modify a block of text is usefule, but what's more useful (and fun) than a filter? Answer: a stack of filters piped together!
Take the imaginary filters foo (which does a simple s/foo/bar/g) and bang (which does an even simpler tr/r/!/). The following block
foo
s/foo/bar/g
bang
tr/r/!/
=begin filter foo|bar foo bar baz =end
will become ba! ba! baz.
ba! ba! baz
And naturally,
=for filter=bar|foo foo bar baz
will return bar ba! baz.
bar ba! baz
Note: The fact that I mention verbatim and paragraph in this section is due to an old bug in Pod::POM, which parses the content of begin/end sections as the usual POD paragraph and verbatim blocks. This is a bug in Pod::POM, around which Pod::POM::View::HTML::Filter tries to work around.
Pod::POM
begin
end
Pod::POM::View::HTML::Filter
As from version 0.06, Pod::POM::View::HTML::Filter gets to the original text contained in the =begin / =end block (it was easier than I thought, actually) and put that string throught all the filters.
If any filter in the stack is defined as verbatim, or if Pod::POM detect any block in the =begin / =end block as verbatim, then the output will be produced between <pre> and </pre> tags. Otherwise, no special tags will be added (his is left to the formatter).
verbatim
<pre>
</pre>
An example of the power of pipes can be seen in the following example. Take a bit of Perl code to colour:
=begin filter perl "hot cross buns" =~ /cross/; print "Matched: <$`> $& <$'>\n"; # Matched: <hot > cross < buns> print "Left: <$`>\n"; # Left: <hot > print "Match: <$&>\n"; # Match: <cross> print "Right: <$'>\n"; # Right: < buns> =end
This will produce the following HTML code:
<pre> <span class="q">"hot cross buns"</span> =~ <span class="q">/cross/</span><span class="sc">;</span> <span class="k">print</span> <span class="q">"Matched: <$`> $& <$'>\n"</span><span class="sc">;</span> <span class="c"># Matched: <hot > cross < buns></span> <span class="k">print</span> <span class="q">"Left: <$`>\n"</span><span class="sc">;</span> <span class="c"># Left: <hot ></span> <span class="k">print</span> <span class="q">"Match: <$&>\n"</span><span class="sc">;</span> <span class="c"># Match: <cross></span> <span class="k">print</span> <span class="q">"Right: <$'>\n"</span><span class="sc">;</span> <span class="c"># Right: < buns></span></pre>
Which your browser will render as:
"hot cross buns" =~ /cross/; print "Matched: <$`> $& <$'>\n"; # Matched: <hot > cross < buns> print "Left: <$`>\n"; # Left: <hot > print "Match: <$&>\n"; # Match: <cross> print "Right: <$'>\n"; # Right: < buns>
Now if you want to colour and number the HTML code produced, it's as simple as tackling the html on top of the perl filter:
html
perl
=begin filter perl | html nnn=1 "hot cross buns" =~ /cross/; print "Matched: <$`> $& <$'>\n"; # Matched: <hot > cross < buns> print "Left: <$`>\n"; # Left: <hot > print "Match: <$&>\n"; # Match: <cross> print "Right: <$'>\n"; # Right: < buns> =end
Which produces the rather unreadable piece of HTML:
<pre><span class="h-lno"> 1</span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span><span class="h-ent">&quot;</span>hot cross buns<span class="h-ent">&quot;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> =~ <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span>/cross/<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span><span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"sc</span>"<span class="h-ab">></span>;<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-lno"> 2</span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"k</span>"<span class="h-ab">></span>print<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span><span class="h-ent">&quot;</span>Matched: <span class="h-ent">&lt;</span>$`<span class="h-ent">&gt;</span> $<span class="h-ent">&amp;</span> <span class="h-ent">&lt;</span>$'<span class="h-ent">&gt;</span>\n<span class="h-ent">&quot;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span><span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"sc</span>"<span class="h-ab">></span>;<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"c</span>"<span class="h-ab">></span># Matched: <span class="h-ent">&lt;</span>hot <span class="h-ent">&gt;</span> cross <span class="h-ent">&lt;</span> buns<span class="h-ent">&gt;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-lno"> 3</span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"k</span>"<span class="h-ab">></span>print<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span><span class="h-ent">&quot;</span>Left: <span class="h-ent">&lt;</span>$`<span class="h-ent">&gt;</span>\n<span class="h-ent">&quot;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span><span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"sc</span>"<span class="h-ab">></span>;<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"c</span>"<span class="h-ab">></span># Left: <span class="h-ent">&lt;</span>hot <span class="h-ent">&gt;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-lno"> 4</span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"k</span>"<span class="h-ab">></span>print<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span><span class="h-ent">&quot;</span>Match: <span class="h-ent">&lt;</span>$<span class="h-ent">&amp;</span><span class="h-ent">&gt;</span>\n<span class="h-ent">&quot;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span><span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"sc</span>"<span class="h-ab">></span>;<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"c</span>"<span class="h-ab">></span># Match: <span class="h-ent">&lt;</span>cross<span class="h-ent">&gt;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-lno"> 5</span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"k</span>"<span class="h-ab">></span>print<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span><span class="h-ent">&quot;</span>Right: <span class="h-ent">&lt;</span>$'<span class="h-ent">&gt;</span>\n<span class="h-ent">&quot;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span><span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"sc</span>"<span class="h-ab">></span>;<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"c</span>"<span class="h-ab">></span># Right: <span class="h-ent">&lt;</span> buns<span class="h-ent">&gt;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span></pre>
But your your browser will render it as:
1 <span class="q">"hot cross buns"</span> =~ <span class="q">/cross/</span><span class="sc">;</span> 2 <span class="k">print</span> <span class="q">"Matched: <$`> $& <$'>\n"</span><span class="sc">;</span> <span class="c"># Matched: <hot > cross < buns></span> 3 <span class="k">print</span> <span class="q">"Left: <$`>\n"</span><span class="sc">;</span> <span class="c"># Left: <hot ></span> 4 <span class="k">print</span> <span class="q">"Match: <$&>\n"</span><span class="sc">;</span> <span class="c"># Match: <cross></span> 5 <span class="k">print</span> <span class="q">"Right: <$'>\n"</span><span class="sc">;</span> <span class="c"># Right: < buns></span>
There were a few things to keep in mind when mixing verbatim and text paragraphs in a =begin block. These problems do not exist any more as from version 0.06.
Because the =begin / =end block is now processed as a single string of text, the following block:
=begin filter html B<foo> =end
will not be transformed into <bfoo</b> > before being passed to the filters, but will produce the expected:
<b
<pre>B<span class="h-ab"><</span><span class="h-tag">foo</span><span class="h-ab">></span></pre>
This will be rendered by your web browser as:
B<foo>
And the same text in a verbatim block
will produce the same results.
<pre> B<span class="h-ab"><</span><span class="h-tag">foo</span><span class="h-ab">></span></pre>
Which a web browser will render as:
Which looks quite the same, doesn't it?
As seen in "A note on verbatim and text blocks", the filter now processes the begin block as a single string of text. So, if you have a filter that replace each * character with an auto-incremented number in square brackets, like this:
*
$view->add( notes => { code => sub { my ( $text, $opt ) = @_; my $n = $opt =~ /(\d+)/ ? $1 : 1; $text =~ s/\*/'[' . $n++ . ']'/ge; $text; } } );
And you try to process the following block:
=begin filter notes 2 TIMTOWDI*, but your library should DWIM* when possible. You can't always claims that PICNIC*, can you? =end filter
You'll get the expected result (contrary to previous versions):
<p>TIMTOWDI[2], but your library should DWIM[3] when possible. You can't always claims that PICNIC[4], can you?</p>
The filter was really called only once, starting at 2, just like requested.
2
Future versions of Pod::POM::View::HTML::Filter may support init, begin and end callbacks to run filter initialisation and clean up code.
init
The following methods are available:
add( lang => { options }, ... )
Add support for one or more languages. Options are passed in a hash reference.
The required code option is a reference to the filter routine. The filter must take a string as its only argument and return the formatted HTML string (coloured accordingly to the language grammar, hopefully).
code
Available options are:
Name Type Content ---- ---- ------- code CODEREF filter implementation verbatim BOOLEAN if true, force the full content of the =begin/=end block to be passed verbatim to the filter requires ARRAYREF list of required modules for this filter
Note that add() is both a class and an instance method.
add()
When used as a class method, the new language is immediately available for all future and existing instances.
When used as an instance method, the new language is only available for the instance itself.
delete( $lang )
Remove the given language from the list of class or instance filters. The deleted filter is returned by this method.
delete() is both a class and an instance method, just like add().
delete()
filters()
Return the list of languages supported.
know( $lang )
Return true if the view knows how to handle language $lang.
$lang
The following Pod::POM::View::HTML methods are overridden in Pod::POM::View::HTML::Filter:
new()
The overloaded constructor initialises some internal structures. This means that you'll have to use a instance of the class as a view for your Pod::POM object. Therefore you must use new.
new
$Pod::POM::DEFAULT_VIEW = 'Pod::POM::View::HTML::Filter'; # WRONG $pom->present( 'Pod::POM::View::HTML::Filter' ); # WRONG # this is CORRECT $Pod::POM::DEFAULT_VIEW = Pod::POM::View::HTML::Filter->new; # this is also CORRECT my $view = Pod::POM::View::HTML::Filter->new; $pom->present( $view );
The only option at this time is auto_unindent, which is enabled by default. This option remove leading indentation from all verbatim blocks within the begin blocks, and put it back after highlighting.
auto_unindent
view_begin()
view_for()
These are the methods that support the filter format.
Pod::POM::View::HTML::Filter is shipped with a few built-in filters.
The name for the filter is obtained by removing _filter from the names listed below (except for default):
_filter
default
This filter is called when the required filter is not known by Pod::POM::View::HTML::Filter. It does nothing more than normal POD processing (POD escapes for text paragraphs and <pre> for verbatim paragraphs.
You can use the delete() method to remove a filter and therefore make it behave like default.
This filter does Perl syntax highlighting with a lot of help from Perl::Tidy.
It accepts options to Perl::Tidy, such as -nnn to number lines of code. Check Perl::Tidy's documentation for more information about those options.
-nnn
This filter does Perl syntax highlighting using PPI::HTML, which is itself based on the incredible PPI.
PPI::HTML
PPI
It accepts the same options as PPI::HTML, which at this time solely consist of line_numbers to, as one may guess, add line numbers to the output.
line_numbers
This filter does HTML syntax highlighting with the help of Syntax::Highlight::HTML.
Syntax::Highlight::HTML
The filter supports Syntax::Highlight::HTML options:
=begin filter html nnn=1 <p>The lines of the HTML code will be numbered.</p> <p>This is line 2.</p> =end filter
See Syntax::Highlight::HTML for the list of supported options.
This filter does shell script syntax highlighting with the help of Syntax::Highlight::Shell.
Syntax::Highlight::Shell
The filter supports Syntax::Highlight::Shell options:
=begin filter shell nnn=1 #!/bin/sh echo "This is a foo test" | sed -e 's/foo/shell/' =end filter
See Syntax::Highlight::Shell for the list of supported options.
This filter support syntax highlighting for numerous languages with the help of Syntax::Highlight::Engine::Kate.
Syntax::Highlight::Engine::Kate
The filter supports Syntax::Highlight::Engine::Kate languages as options:
=begin filter kate Diff Index: lib/Pod/POM/View/HTML/Filter.pm =================================================================== --- lib/Pod/POM/View/HTML/Filter.pm (revision 99) +++ lib/Pod/POM/View/HTML/Filter.pm (working copy) @@ -27,6 +27,11 @@ requires => [qw( Syntax::Highlight::Shell )], verbatim => 1, }, + kate => { + code => \&kate_filter, + requires => [qw( Syntax::Highlight::Engine::Kate )], + verbatim => 1, + }, ); my $HTML_PROTECT = 0; =end filter
Check the Syntax::Highlight::Engine::Kate documentation for the full list of supported languages. Please note that some of them aren't well supported yet (by Syntax::Highlight::Engine::Kate), so the output may not be what you expect.
Here is a list of languages we have successfully tested with Syntax::Highlight::Engine::Kate version 0.02: C, Diff, Fortran, JavaScript, LDIF, SQL.
C
Diff
Fortran
JavaScript
LDIF
SQL
This filter converts the wiki format parsed by Text::WikiFormat in HTML.
Text::WikiFormat
The supported options are: prefix, extended, implicit_links, absolute_links. The option and value are separated by a = character, as in the example below:
prefix
extended
implicit_links
absolute_links
=
=begin filter wiki extended=1 [link|title] =end
This filter converts the wiki format parsed by Text::MediawikiFormat in HTML.
Text::MediawikiFormat
The supported options are: prefix, extended, implicit_links, absolute_links and process_html. The option and value are separated by a = character.
process_html
Write a filter is quite easy: a filter is a subroutine that takes two arguments (text to parse and option string) and returns the filtered string.
The filter is added to Pod::POM::View::HTML::Filter's internal filter list with the add() method:
$view->add( foo => { code => \&foo_filter, requires => [], } );
When presenting the following piece of pod,
=begin filter foo bar baz Some text to filter. =end filter
the foo_filter() routine will be called with two arguments, like this:
foo_filter()
foo_filter( "Some text to filter.", "bar baz" );
If you have a complex set of options, your routine will have to parse the option string by itself.
Please note that in a =for construct, whitespace in the option string must be replaced with colons:
=for filter=foo:bar:baz Some text to filter.
The foo_filter() routine will be called with the same two arguments as before.
Each filter uses its own CSS classes, so that one can define their favourite colours in a custom CSS file.
Perl::Tidy's HTML code looks like:
<span class="i">$A</span>++<span class="sc">;</span>
Here are the classes used by Perl::Tidy:
n numeric p paren q quote s structure c comment v v-string cm comma w bareword co colon pu punctuation i identifier j label h here-doc-target hh here-doc-text k keyword sc semicolon m subroutine pd pod-text
ppi
PPI::HTML uses the following CSS classes:
comment double heredoc_content interpolate keyword for language keywords (my, use line_number number operator for language operators pragma for pragmatas (strict, warnings) single structure for syntaxic symbols substitute symbol word for module, function and method names words match
Syntax::Highlight::HTML makes use of the following classes:
h-decl declaration # declaration <!DOCTYPE ...> h-pi process # process instruction <?xml ...?> h-com comment # comment <!-- ... --> h-ab angle_bracket # the characters '<' and '>' as tag delimiters h-tag tag_name # the tag name of an element h-attr attr_name # the attribute name h-attv attr_value # the attribute value h-ent entity # any entities: é «
shell
Syntax::Highlight::Shell makes use of the following classes:
s-key # shell keywords (like if, for, while, do...) s-blt # the builtins commands s-cmd # the external commands s-arg # the command arguments s-mta # shell metacharacters (|, >, \, &) s-quo # the single (') and double (") quotes s-var # expanded variables: $VARIABLE s-avr # assigned variables: VARIABLE=value s-val # shell values (inside quotes) s-cmt # shell comments
kate
Output formatted with Syntax::Highlight::Engine::Kate makes use of the following classes:
k-alert # Alert k-basen # BaseN k-bstring # BString k-char # Char k-comment # Comment k-datatype # DataType k-decval # DecVal k-error # Error k-float # Float k-function # Function k-istring # IString k-keyword # Keyword k-normal # Normal k-operator # Operator k-others # Others k-regionmarker # RegionMarker k-reserved # Reserved k-string # String k-variable # Variable k-warning # Warning
The goal behind this module was to produce nice looking HTML pages from the articles the French Perl Mongers are writing for the French magazine GNU/Linux Magazine France (http://www.linuxmag-france.org/).
The resulting web pages can be seen at http://articles.mongueurs.net/magazines/.
Philippe "BooK" Bruhat, <book@cpan.org>
<book@cpan.org>
Many thanks to Sébastien Aperghis-Tramoni (Maddingue), who helped debugging the module and wrote Syntax::Highlight::HTML and Syntax::Highlight::Shell so that I could ship PPVHF with more than one filter. He also pointed me to Syntax::Highlight::Engine::Kate, which led me to clean up PPVHF before adding support for SHEK.
Perl code examples where borrowed in Amelia, aka Programming Perl, 3rd edition.
There are a few other syntax highlighting modules on CPAN, which I should try to add support for in Pod::POM::View::HTML::Filter:
Syntax::Highlight::Universal
Syntax::Highlight::Mason
Syntax::Highlight::Perl (seems old)
Syntax::Highlight::Perl
Syntax::Highlight::Perl::Improved
Please report any bugs or feature requests to bug-pod-pom-view-html-filter@rt.cpan.org, or through the web interface at http://rt.cpan.org. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
bug-pod-pom-view-html-filter@rt.cpan.org
Copyright 2004 Philippe "BooK" Bruhat, All Rights Reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
1 POD Error
The following errors were encountered while parsing the POD:
Non-ASCII character seen before =encoding in 'Sébastien'. Assuming CP1252
To install Pod::POM::View::HTML::Filter, copy and paste the appropriate command in to your terminal.
cpanm
cpanm Pod::POM::View::HTML::Filter
CPAN shell
perl -MCPAN -e shell install Pod::POM::View::HTML::Filter
For more information on module installation, please visit the detailed CPAN module installation guide.