

Syntax::Highlight::HTML - Highlight HTML syntax


Version 0.04


    use Syntax::Highlight::HTML;

    my $highlighter = Syntax::Highlight::HTML->new;
    my $output = $highlighter->parse($html);

If $html contains the following HTML fragment:

    <!-- a description list -->
    <dl compact="compact">
      <dt>some word</dt>
      <dd>the description of the word. Plus some <a href="/definitions/other_word"
      >reference</a> towards another definition. </dd>

then the resulting HTML contained in $output will render like this:

    <!-- a description list -->
    <dl compact="compact">
      <dt>some word</dt>
      <dd>the description of the word. Plus some <a href="/definitions/other_word"
      >reference</a> towards another definition. </dd>


This module is designed to take raw HTML input and highlight it (using a CSS stylesheet, see "Notes" for the classes). The returned HTML code is ready for inclusion in a web page.

It is intended to be used as a highlighting filter, and as such does not reformat or reindent the original HTML code.



new() is the constructor. It returns a Syntax::Highlight::HTML object, which derives from HTML::Parser. As such, any HTML::Parser method can be called on this object (except for parse(), which is overridden here).


  • nnn - Activate line numbering. Default value: 0 (disabled).

  • pre - Surround the result with <pre>...</pre> tags. Default value: 1 (enabled).


To avoid surrounding the result with the <pre>...</pre> tags:

    my $highlighter = Syntax::Highlight::HTML->new(pre => 0);
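
The nnn option is passed the same way. The following is a minimal sketch, assuming the module is installed; the HTML fragment in $html is made up for illustration. With nnn enabled, each line of the output should be prefixed with a line number styled by the .h-lno class.

```perl
use strict;
use warnings;
use Syntax::Highlight::HTML;

# Illustrative input only; any HTML fragment will do.
my $html = <<'END_HTML';
<ul class="menu">
  <li><a href="/">Home</a></li>
</ul>
END_HTML

# nnn => 1 turns line numbering on; pre keeps its default value (enabled).
my $highlighter = Syntax::Highlight::HTML->new(nnn => 1);
my $output = $highlighter->parse($html);

# $output is an HTML string ready to be embedded in a page.
print $output;
```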

parse() parses the HTML code given as argument and returns the highlighted HTML code, ready for inclusion in a web page.


    my $output = $highlighter->parse("<p>Hello, world.</p>");

Internal Methods

The following methods are for internal use only.


HTML::Parser tag handler: highlights a tag.


HTML::Parser text handler: highlights text.


The resulting HTML uses CSS to colourize the syntax. Here are the classes that you can define in your stylesheet.

  • .h-decl - for a markup declaration; in an HTML document, the only markup declaration is the DOCTYPE, like: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">

  • .h-pi - for a processing instruction like <?html ...> or <?xml ...?>

  • .h-com - for a comment, <!-- ... -->

  • .h-ab - for the characters '<' and '>' as tag delimiters

  • .h-tag - for the tag name of an element

  • .h-attr - for the attribute name

  • .h-attv - for the attribute value

  • .h-ent - for any entities: &eacute; &#171;

  • .h-lno - for the line numbers

An example stylesheet can be found in eg/html-syntax.css.
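
For readers without the distribution at hand, here is a rough sketch of what such a stylesheet could look like, written as a small Perl script that prints one CSS rule per class listed above. The colours and the stylesheet() helper are arbitrary assumptions chosen for illustration; they are not the contents of eg/html-syntax.css.

```perl
use strict;
use warnings;

# Class names are the ones documented above; every colour below is a
# placeholder chosen for this sketch, not taken from the distribution.
my @rules = (
    [ 'h-decl' => 'color: #336699; font-style: italic' ],
    [ 'h-pi'   => 'color: #336699' ],
    [ 'h-com'  => 'color: #338833; font-style: italic' ],
    [ 'h-ab'   => 'color: #000000; font-weight: bold' ],
    [ 'h-tag'  => 'color: #993399; font-weight: bold' ],
    [ 'h-attr' => 'color: #000000; font-weight: bold' ],
    [ 'h-attv' => 'color: #333399' ],
    [ 'h-ent'  => 'color: #cc3333' ],
    [ 'h-lno'  => 'color: #aaaaaa; background: #f7f7f7' ],
);

# Hypothetical helper: turns [class, declarations] pairs into CSS text.
sub stylesheet {
    my @pairs = @_;
    return join '', map { sprintf ".%s { %s; }\n", $_->[0], $_->[1] } @pairs;
}

my $css = stylesheet(@rules);
print $css;
```

The printed rules can be saved to a file and linked from the page that embeds the highlighted output.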


Here is an example of generated HTML output. It was generated with the script eg/

The following HTML fragment (which is the beginning of

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
    <html>
     <head>
      <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
      <link rel="stylesheet" href="/s/style.css" type="text/css">
      <title> S&#233;bastien Aperghis-Tramoni</title>
     </head>
     <body id="cpansearch">
    <center><div class="logo"><a href="/"><img src="/s/img/cpan_banner.png" alt="CPAN"></a></div></center>
    <div class="menubar">
     <a href="/">Home</a>
    &middot; <a href="/author/">Authors</a>
    &middot; <a href="/recent">Recent</a>
    &middot; <a href="/news">News</a>
    &middot; <a href="/mirror">Mirrors</a>
    &middot; <a href="/faq.html">FAQ</a>
    &middot; <a href="/feedback">Feedback</a>
    </div>
    <form method="get" action="/search" name="f" class="searchbox">
    <input type="text" name="query" value="" size="35">
    <br>in <select name="mode">
     <option value="all">All</option>
     <option value="module" >Modules</option>
     <option value="dist" >Distributions</option>
     <option value="author" >Authors</option>
    </select>&nbsp;<input type="submit" value="CPAN Search">
    </form>

will be rendered like this (using the CSS stylesheet eg/html-syntax.css):

  1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
  2 <html>
  3  <head>
  4   <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  5   <link rel="stylesheet" href="/s/style.css" type="text/css">
  6   <title> S&#233;bastien Aperghis-Tramoni</title>
  7  </head>
  8  <body id="cpansearch">
  9 <center><div class="logo"><a href="/"><img src="/s/img/cpan_banner.png" alt="CPAN"></a></div></center>
 10 <div class="menubar">
 11  <a href="/">Home</a>
 12 &middot; <a href="/author/">Authors</a>
 13 &middot; <a href="/recent">Recent</a>
 14 &middot; <a href="/news">News</a>
 15 &middot; <a href="/mirror">Mirrors</a>
 16 &middot; <a href="/faq.html">FAQ</a>
 17 &middot; <a href="/feedback">Feedback</a>
 18 </div>
 19 <form method="get" action="/search" name="f" class="searchbox">
 20 <input type="text" name="query" value="" size="35">
 21 <br>in <select name="mode">
 22  <option value="all">All</option>
 23  <option value="module" >Modules</option>
 24  <option value="dist" >Distributions</option>
 25  <option value="author" >Authors</option>
 26 </select>&nbsp;<input type="submit" value="CPAN Search">
 27 </form>


Syntax::Highlight::HTML relies on HTML::Parser for parsing the HTML and therefore suffers from the same limitations.




Sébastien Aperghis-Tramoni


Please report any bugs or feature requests to, or through the web interface at I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.


Copyright (C)2004 Sébastien Aperghis-Tramoni, All Rights Reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.