Simon Wistow


Text::Chump - a module for parsing Chump-like syntax


        use Text::Chump;

        my $tc = Text::Chump->new();

        $tc->chump('[all mine!|]');
        # returns <a href=''>all mine!</a>

        $tc->chump('+[all mine!|]');
        # returns <img src='' alt='all mine!'>

        # returns <a href=''>http://</a>

        my $tc = Text::Chump->new({images=>0});

        $tc->chump('+[all mine!|]');
        # returns '+[all mine!|]'

        sub foo {
                my ($url, $label) = @_;

                return "$label ($url)";
        }

        # returns 'foo ('

        sub quirka {
                my ($opts, $match, $label) = @_;

                # qq{} avoids clashing with the double quotes inside the HTML
                return qq{<a href="blog.cgi?entry=$match">$label</a>};
        }


        # returns "<a href="blog.cgi?entry=4444">stuff</a>"


Chump is an IRC bot that allows people to post links and comments onto a website from within an IRC channel. Some people call this a blog, but I hate that term. Hate it. *HATE IT*! ... *cough* ... so I'll avoid it from now on.

The Chump is based on an original idea by Bijan Parsia. Bijan wrote a bot in Squeak called DiaWebLogBot, which powers the Monkeyfist Daily Churn, and subsequently Useful Inc. "stole all his good ideas". The Chump syntax is therefore derived and extended from DiaWebLogBot's.

The bot is available from and the original page that uses this form of markup is

The items which are displayed on the page can have a special format. These, in turn, get marked up as HTML (by default). Essentially this provides a simple markup language. Yes, they could have used XML and been fully buzzword compliant (it uses XML in the backend, if that's any help), but they didn't.

Since then the syntax has been appropriated by a number of projects including one of my own, so, like the good little coder that I am, it all went in a module.

Which I may as well release because somebody else wants to release a module which depends on it and it might be useful to someone else.

Alternatives to this module include Text::WikiFormat and HTML::FromText although they do subtly different things. In fact you could probably chain them together - especially HTML::FromText with uri set to 0.


As described here

  • Links :


    This creates an inline link (i.e. turning a word into a link). So, for example

      They also have [another site|]

    will make the words "another site" appear as a hyperlink to the URL

  • Images :


    This creates an inline image in some text. By providing some text you can supply an alt attribute, which is considered a good thing to do.

      +[This is the alt text|]

    By providing a url in the middle

      +[This is the alt text||]

    you can turn the image into a clickable link.

  • Urls :

    This will be turned into a clickable link.
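As a rough illustration of the link and image forms above, here is a minimal, self-contained sketch of a pattern that picks such markup apart. It is not the parser Text::Chump uses; the helper name classify_markup and the placeholder targets (example.png and friends) are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Text::Chump: classifies one piece of
# markup as an image (+[...]) or a link ([...]) and splits it on '|'.
sub classify_markup {
    my ($text) = @_;

    # Optional leading '+', then '[', the bracket contents, then ']'.
    return unless $text =~ /^(\+?)\[([^\]]*)\]$/;
    my ($plus, $body) = ($1, $2);

    # A limit of -1 keeps trailing empty fields, so '[a|]' gives ('a', '').
    my @parts = split /\|/, $body, -1;

    return {
        type   => ($plus ? 'image' : 'link'),
        label  => $parts[0],
        # With three parts the middle one is the optional click-through URL.
        middle => (@parts == 3 ? $parts[1] : undef),
        target => $parts[-1],
    };
}

my $img = classify_markup('+[This is the alt text|example.png]');
print "$img->{type}: $img->{label} -> $img->{target}\n";
```

Anything that does not look like bracketed markup makes the helper return nothing, which is roughly how untouched text would pass through.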


new <opts>

Can take a hashref of options (target defaults to nothing, border defaults to 0, everything else defaults to 1 == yes).

  • target :

    A default target for a URL (such as _blank)

  • border :

    Whether inline images should have a border

  • images :

    Whether to process image markup

  • links :

    Whether to process link markup

  • urls :

    Whether to process urls

new_type [name] [char] [coderef] <regexp>

Installs a new type so that if the parser comes across


then the parts will be passed to the coderef in the normal way. If you pass in a regexp then that will be used to determine the match, just as when you install a new handler.

In order to turn off handling of the new type pass in

        $opt->{"${name}s"} = 0;

as the options to chump(). So

        my $text = 'foo bar %[foo|]';

        $mc->new_type('percent','%', sub { return $_[1] });
        $mc->chump($text);

returns

        'foo bar'

and

        my $text = 'foo bar %[foo|]';

        $mc->new_type('percent','%', sub { return $_[1] }, 'foo');
        $mc->chump($text);

returns

        'foo bar foo'

but

        my $text = 'foo bar %[foo|]';

        $mc->new_type('percent','%', sub { return $_[1] }, 'foo');
        $mc->chump($text, { 'percents' => 0 });

returns

        'foo bar %[foo|]'

So that's all clear then :)

chump [text]

Takes some text to munge and returns it, fully chumped. Can optionally take a hashref with the same options as new except that these options will only apply to this bit of text.
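To make the layering of options concrete, here is a small sketch of how defaults, constructor options and per-call options could combine. It is an illustration only, not the module's code: the helper name effective_options is invented, and representing "target defaults to nothing" as an empty string is an assumption.

```perl
use strict;
use warnings;

# Defaults as described for new(): no target, border 0, everything else on.
# (The empty string for target is an assumption made for this sketch.)
my %defaults = (
    target => '',
    border => 0,
    images => 1,
    links  => 1,
    urls   => 1,
);

# Hypothetical helper: later hashes win, so per-call options override
# constructor options, which in turn override the defaults.
sub effective_options {
    my ($ctor_opts, $call_opts) = @_;
    return { %defaults, %{ $ctor_opts || {} }, %{ $call_opts || {} } };
}

my $ctor = { images => 0 };

# Options passed alongside one piece of text apply to that call only ...
my $for_one_call = effective_options($ctor, { images => 1, target => '_blank' });
# ... and the next call falls back to what the constructor was given.
my $afterwards   = effective_options($ctor);

print "one call: images=$for_one_call->{images} target=$for_one_call->{target}\n";
print "afterwards: images=$afterwards->{images} target='$afterwards->{target}'\n";
```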

install [type] [coderef] <regexp>

If you pass in either 'image', 'link' or 'url' and a valid coderef then that coderef will be called on the original string instead of the default behaviour.

This is useful for outputting something other than HTML.

And, in a special, one time only offer, if you optionally pass in a regexp then you can add your own handlers. So, for example, if you did:

        $tc->install('link', sub { return 'foo' }, '\d{4}');
        print $tc->chump('[test|1234]'); # prints "foo"

However, your regexps are checked in the reverse of the order they were installed, so if you then do:

        $tc->install('link', sub { return 'bar' }, '\d{5}');

then :

        print $tc->chump('[test|1234]');  # prints "foo"
        print $tc->chump('[test|12345]'); # prints "bar"

Note: all regexps are assumed to be case insensitive.
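The ordering rule is easier to see in isolation. The following sketch mimics the described behaviour (most recently installed regexp checked first, case-insensitive matching) with a plain array. It is not the module's implementation, and the names install_sketch and dispatch_sketch are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical registry: an array of [regexp, coderef] pairs in install order.
# (The module itself keeps its handlers in a Tie::IxHash; a plain array is
# used here only to keep the sketch dependency-free.)
my @handlers;

sub install_sketch {
    my ($regexp, $code) = @_;
    push @handlers, [ $regexp, $code ];
}

# Try the most recently installed regexp first, matching case-insensitively.
sub dispatch_sketch {
    my ($match) = @_;
    for my $handler (reverse @handlers) {
        my ($regexp, $code) = @$handler;
        return $code->($match) if $match =~ /$regexp/i;
    }
    return;    # no handler matched
}

install_sketch('\d{4}', sub { 'foo' });
install_sketch('\d{5}', sub { 'bar' });

print dispatch_sketch('1234'),  "\n";
print dispatch_sketch('12345'), "\n";
```

In this sketch the order matters because the unanchored '\d{4}' would also match '12345'; checking the later, more specific pattern first is what lets the five-digit handler win.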

If you want to monkey around with the ordering post install then the IxHash object that they're installed in can be found in


For a link or an image, the values passed to the coderef are a hashref of options, then the match, then the label and then, optionally, a middle value.

If no label is passed then it will be set to the same value as the link.

So for these


a sub will be passed

        my ($opt, $link, $label, $middle) = @_;
        # $opt    = hashref of options
        # $link   =
        # $label  = foo
        # $middle = bar

and for


you'll get

        # $opt    = hashref of options
        # $link   =
        # $label  =
        # $middle = undef

For a url you'll only get passed an opt and the original string.
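As an example of the calling convention for links and images, here is a sketch of a handler that emits plain text rather than HTML. The sub name plain_text_link and the sample values are made up; only the argument order comes from the description above.

```perl
use strict;
use warnings;

# A handler with the argument order described above. The name
# plain_text_link and the sample values are made up for illustration.
sub plain_text_link {
    my ($opt, $link, $label, $middle) = @_;

    # Mirror the documented fallback in case the sub is called directly.
    $label = $link unless defined $label && length $label;

    my $out = "$label ($link)";
    $out .= " [$middle]" if defined $middle;
    return $out;
}

print plain_text_link({}, 'page.html', 'foo', 'bar'), "\n";
print plain_text_link({}, 'page.html'), "\n";
```

A sub like this could then be handed to install as the coderef for 'link', for instance with $tc->install('link', \&plain_text_link).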

_order_params [function] [@params]

Given a function and an array of params it will return the first parameter that matches the function.

It checks the last element of the array first and then the first element.

Why this weird order? Because it's more natural to write


or, at least, that seems to be the behaviour I've observed.

A typical function would look like this

        sub {
                return $_[0] =~ /\d+/;
        }
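The "last element, then first element" rule can be sketched as follows. This is a hypothetical re-implementation for illustration (the name order_params_sketch is invented), and the real _order_params may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical re-implementation for illustration; the real _order_params
# may differ in detail.
sub order_params_sketch {
    my ($function, @params) = @_;
    return unless @params;

    # Check the last element first, then the first element.
    my @candidates = ($params[-1], $params[0]);
    for my $candidate (@candidates) {
        return $candidate if defined $candidate && $function->($candidate);
    }
    return;    # nothing matched
}

my $has_digits = sub { return $_[0] =~ /\d+/ };

# The matching parameter is found whichever end it sits at.
print order_params_sketch($has_digits, 'label', '4444'), "\n";
print order_params_sketch($has_digits, '4444', 'label'), "\n";
```

When both ends match, the last element wins in this sketch, which is the point of checking it first.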

Just in case you want to call this from your own plugin, this is the default action for links.

Calls _make_link internally.

Ditto, but for images.


        <img src='$url' alt='$label' title='$label' $opts->{border} />

optionally wrapping it in an href to <link>

_chump_url [opts] [text]

Calls _make_link for each URL it finds.


        <a href='$link' target='$opts->{target}'>$text</a>

_is_url [text]

Returns 1 if the text is a url or 0 if it isn't.


Not that I know of.

Oh, wait - maybe it should URL escape any entities in the text but you should probably do that yourself.
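If you do want to escape things yourself, one reading of the note above is plain HTML entity escaping. The sketch below shows only that step; escape_html is a hypothetical helper, not something Text::Chump provides.

```perl
use strict;
use warnings;

# Hypothetical helper, not provided by Text::Chump: escapes the characters
# that are special in HTML text and in quoted attribute values.
sub escape_html {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&#39;',
    );
    # A single pass, so an '&' produced by one replacement is never re-escaped.
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

print escape_html(q{Tom & Jerry's <best> "bits"}), "\n";
```

Whether to escape before or after chumping depends on where the untrusted text ends up, so treat this as the escaping step alone rather than a complete recipe.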


(c)opyright 2002, Simon Wistow

Distributed under the same terms as Perl itself.

This software is under no warranty and will probably ruin your life, kill your friends, burn your house and bring about the apocalypse.


Copyright 2003, Simon Wistow

SEE ALSO, Bot::Basic::Pluggable::Blog, Template::Plugin::Chump, Text::WikiFormat, HTML::FromText, Tie::IxHash