Jan Oldřich Krůza


CSS::Adaptor::Whitelist -- filter out potentially dangerous CSS


 use CSS;
 use CSS::Adaptor::Whitelist;
 my $css = CSS->new({ adaptor => 'CSS::Adaptor::Whitelist' });
 $css->parse_string( <<EOCSS );
    body {
        margin: 0;
        background-image: url(javascript:alert("I am an evil hacker"));
    }
    #main {
        background-color: yellow;
        content-after: '<img src="http://example.com/xxx-rated-picture.jpg">';
    }
 EOCSS
 print $css->output;
 # prints:
 # body {
 #     margin: 0;
 # }
 # #main {
 #     background-color: yellow;
 # }
 # allow the foo property, but only with value "bar" or "baz"
 # 1) regex way
 $CSS::Adaptor::Whitelist::whitelist{foo} = qr/^ba[rz]$/;
 # 2) hash way
 $CSS::Adaptor::Whitelist::whitelist{foo} = {bar => 1, baz => 1};
 # 3) sub way
 $CSS::Adaptor::Whitelist::whitelist{foo} = sub {
    return ($_[0] eq 'bar' or $_[0] eq 'baz');
 };


This is a subclass of CSS::Adaptor that paranoidly prunes anything from the input CSS code that it doesn't recognize as standard.

It is intended for validating user-supplied CSS before letting it onto your site. The allowed CSS properties and corresponding values were mostly taken from w3schools.com/css.


The allowed constructs are given in the %CSS::Adaptor::Whitelist::whitelist hash. The keys are the allowed properties and the values can be 1) regular expressions, 2) code refs or 3) hash refs.

Each CSS property is looked up in the whitelist. If it is not found, it is discarded.

Each CSS value found is checked. If it passes the test, then it is output in standard indentation, otherwise a message is passed to the log method.

If the rule is a regexp, the value is matched against it. If it matches, the value passes.

If the rule is a subroutine, the value is passed to it as the only argument. If the sub returns a true value, the CSS value passes.

If the rule is a hash, the CSS value passes if it is a key in the hash associated with a true value.
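The three kinds of rule can be pictured with a small dispatcher. This is only a sketch: value_passes is a hypothetical helper written for this illustration, not a function exported by the module, and the module's own checking code may differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): applies one whitelist rule
# to one CSS value and returns 1 if the value passes, 0 otherwise.
sub value_passes {
    my ($rule, $value) = @_;
    my $type = ref $rule;
    return $value =~ $rule ? 1 : 0 if $type eq 'Regexp';   # regexp rule
    return $rule->($value) ? 1 : 0 if $type eq 'CODE';     # subroutine rule
    return $rule->{$value} ? 1 : 0 if $type eq 'HASH';     # hash rule
    return 0;    # unknown kind of rule: reject
}

# The same "bar or baz" rule written in all three styles.
my %rules = (
    regex => qr/^ba[rz]$/,
    hash  => { bar => 1, baz => 1 },
    code  => sub { $_[0] eq 'bar' or $_[0] eq 'baz' },
);

for my $style (sort keys %rules) {
    for my $value (qw(bar qux)) {
        printf "%-5s %-3s %s\n", $style, $value,
            value_passes($rules{$style}, $value) ? 'passes' : 'fails';
    }
}
```

All three styles accept bar and reject qux here; which one to use is a matter of convenience.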

Overriding defaults

You are invited to modify the rules, particularly the ones that allow URLs. See set_url_re for a convenient way.

Also the font-family (and thus also font) properties are quite generous. Feel free to allow just a list of expected font families:

 $CSS::Adaptor::Whitelist::whitelist{'font-family'} = qr/^(?:arial|verdana|...)$/;



list2hash simplifies giving values in the hash way. It returns a hashref.

 list2hash('foo', 'bar', 'baz') # returns {foo => 1, bar => 1, baz => 1}

 space_sep_res($string, $regex, $regex, ...) # returns 1 or 0

SPACE-SEParated Regular ExpressionS. Given a string like 1px solid #CCFF55 and regular expressions for CSS dimension, border type and CSS color, space_sep_res checks whether the string matches these regexps piece by piece.

It will fail if one of the regexps matches too small a chunk, for example:

 space_sep_res('solid #CCFF55', qr/solid|dotted/, qr/#[A-F\d]{3}|#[A-F\d]{6}/)

will return 0 because the latter regexp stops after matching #CCF.

Also beware that the regular expressions provided MUST NOT contain capturing parentheses, otherwise the function will not work. Use (?: ... ) for non-capturing grouping.
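The piece-by-piece matching can be sketched as follows. space_sep_res_sketch is a hypothetical re-implementation written for this illustration; it is not the module's code, and unlike the real function it happens to tolerate capturing parentheses. It does reproduce the "too small a chunk" failure described above.

```perl
use strict;
use warnings;

# Hypothetical re-implementation, for illustration only.
sub space_sep_res_sketch {
    my ($string, @regexes) = @_;
    pos($string) = 0;
    for my $i (0 .. $#regexes) {
        my $re = $regexes[$i];
        # Each regexp must match exactly at the current position ...
        return 0 unless $string =~ /\G(?:$re)/gc;
        # ... and every piece except the last must be followed by whitespace.
        if ($i < $#regexes) {
            return 0 unless $string =~ /\G\s+/gc;
        }
    }
    # Nothing may be left over after the last piece.
    return pos($string) == length($string) ? 1 : 0;
}

my $border = qr/solid|dotted/;

# The short alternative comes first, so only "#CCF" is consumed
# and "F55" is left over.
print space_sep_res_sketch('solid #CCFF55', $border,
    qr/#[A-F\d]{3}|#[A-F\d]{6}/), "\n";

# With the long alternative first, the whole color is consumed.
print space_sep_res_sketch('solid #CCFF55', $border,
    qr/#[A-F\d]{6}|#[A-F\d]{3}/), "\n";
```

In this sketch each piece is matched on its own, without backtracking into an earlier piece, which is why the order of alternatives matters.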


set_url_re sets the regular expression that URLs are checked against, including the url( ) wrapper. You are encouraged to use it to provide a regexp that only allows URLs to domains you control.


Notice that the regexp should not be anchored (no ^ and $ at the edges). It is used in the rules for properties whose values may contain URLs.
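A regexp of the kind described here might look like the sketch below. The host name static.example.com is made up, and the way the URL regexp is embedded in a larger, anchored rule is an assumption made for illustration; the call to set_url_re itself is left out because its exact calling form is not shown in this document.

```perl
use strict;
use warnings;

# Hypothetical host name, used only for illustration.
my $host = quotemeta 'static.example.com';

# Unanchored, no capturing parentheses, url( ) wrapper included.
my $url_re = qr{url\(\s*https?://$host/[^\s"'()]*\s*\)};

# Assumption: a property rule embeds the URL regexp in a larger, anchored one,
# which is why the URL regexp itself carries no ^ or $.
my $background_image_rule = qr/^(?:none|$url_re)$/;

for my $value (
    'url(http://static.example.com/bg.png)',
    'url(javascript:alert(1))',
    'url(http://evil.example.org/bg.png)',
) {
    printf "%-40s %s\n", $value,
        $value =~ $background_image_rule ? 'passes' : 'fails';
}
```

Requiring a slash right after the host name keeps look-alike hosts such as static.example.com.evil.org from matching.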


log is a method that stores messages about things being filtered out in the @CSS::Adaptor::Whitelist::message_log array.

You are encouraged to override or redefine this method to treat the log messages in accordance with your logging habits.
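Overriding could look roughly like the sketch below. To stay runnable without the module installed, it uses a stand-in base class, Demo::Adaptor, in place of CSS::Adaptor::Whitelist. Both class names are hypothetical, and the assumption that log receives the message as its only argument after the invocant is not confirmed by this document.

```perl
use strict;
use warnings;

# Stand-in for the real adaptor (hypothetical), mimicking the documented
# behavior of collecting messages in a package array.
package Demo::Adaptor;
our @message_log;
sub new { return bless {}, shift }
sub log {
    my ($self, $msg) = @_;    # assumed signature
    push @message_log, $msg;
    return;
}

# Hypothetical subclass that routes messages to its own sink instead.
package Demo::Adaptor::Tagged;
our @ISA = ('Demo::Adaptor');
our @seen;
sub log {
    my ($self, $msg) = @_;
    push @seen, "[css-filter] $msg";
    return;
}

package main;

my $adaptor = Demo::Adaptor::Tagged->new;
$adaptor->log('value rejected');
print "$_\n" for @Demo::Adaptor::Tagged::seen;
```

With the real module, the subclass would inherit from CSS::Adaptor::Whitelist and be named as the adaptor when constructing the CSS object.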


Oldrich Kruza <sixtease@cpan.org>



Copyright (c) 2009 Oldrich Kruza. All rights reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.