
NAME

String::EscapeCage - Cage and escape strings to prevent injection attacks

VERSION

Version 0.02

SYNOPSIS

The String::EscapeCage module puts dangerous strings in a cage. It eases escaping to various encodings, helps developers track what data are dangerous, and prevents injection attacks.

    use String::EscapeCage qw( cage uncage escapehtml );

    my $name = cage $cgi->param('name');
    print "Hello, ", $name, "\n";  # croaks to avoid HTML injection attack
    print "Hello, ", escapehtml($name), "\n";  # nice and safe
    print "Hello, ", uncage($name), "\n";  # remove protection

DESCRIPTION

After the cage function cages a string, the uncage method releases it, and the escapehtml, escapecstring, etc. methods safely escape (transform) it. If an application cages all user-supplied strings, a run-time exception stops application code from accidentally allowing an SQL, shell, cross-site scripting, cat -v, or similar injection attack. String::EscapeCage's paranoia is meant to be adjustable for development, although the settings are not yet implemented. The concept is similar to "tainted" data, but it is implemented by overloading the '""' (stringify) operator on blessed scalar references.
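The mechanism described above can be sketched in a few lines. This is only an illustration under assumptions, not String::EscapeCage's actual source: the package name My::Cage, the error message, and the set of characters escaped by escapehtml are all hypothetical.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in class; not the real String::EscapeCage code.
    package My::Cage;
    use Carp qw(croak);

    # Only '""' is overloaded, so any operation that needs the string
    # value (print, interpolation, concatenation, eq) ends up here.
    use overload '""' => sub { croak "attempt to stringify a caged string" };

    # The cage is a blessed reference to a scalar holding the string.
    sub new    { my ($class, $str) = @_; return bless \$str, $class }
    sub uncage { my ($self) = @_; return ${$self} }

    # Assumed escaping rules for illustration only.
    my %entity = (
        '&' => '&amp;',  '<' => '&lt;', '>' => '&gt;',
        '"' => '&quot;', "'" => '&#39;',
    );
    sub escapehtml {
        my ($self) = @_;
        (my $out = ${$self}) =~ s/([&<>"'])/$entity{$1}/g;
        return $out;
    }
}

package main;

my $name = My::Cage->new('<b>Bob</b>');

# Interpolating the cage triggers the overloaded '""' and dies.
my $ok = eval { my $s = "Hello, $name"; 1 };
print "interpolation blocked\n" unless $ok;

print "Hello, ", $name->escapehtml(), "\n";  # prints: Hello, &lt;b&gt;Bob&lt;/b&gt;
```

Because no fallback is declared, Perl derives concatenation and string comparison from the '""' handler, so those croak as well.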

By default String::EscapeCage does not export any subroutines. The subroutines below are available for import, as methods, or both:

cage STRING / new STRING

Return a new EscapeCage object holding the given string. cage is only available as an exported function; new is only available as a class method.

uncage CAGE

Returns the string that had been "caged" in the given EscapeCage object. It will be untainted, since you presumably know what you're doing with it. Available as an exported function or an object method.

re CAGE REGEXP

Applies the REGEXP to the string that had been "caged", taking the place of the regular expression binding operator =~.

I want to overload =~ and let an EscapeCage uncage and untaint itself just as if it were a tainted string, but overload doesn't support =~. So this is an ugly workaround to get a little brevity and to mark the places to revisit once we figure out overloading. It doesn't set the (implicitly local()ized) numbered match variables (e.g. $1) the way you would want.
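The limitation on numbered match variables follows from Perl's scoping rules: a match performed inside a subroutine sets $1 only for that subroutine's dynamic scope. The sketch below shows the effect with a hypothetical helper, match_captures, which takes a plain string rather than a cage and is not part of the module. Returning the captures in list context is one way around the limitation.

```perl
use strict;
use warnings;

# Hypothetical helper standing in for a method like re.
sub match_captures {
    my ($string, $regexp) = @_;
    # In list context a successful match returns its capture groups;
    # a failed match returns the empty list.
    return $string =~ $regexp;
}

my ($who) = match_captures('user=alice', qr/^user=(\w+)$/);
print "captured: $who\n";    # prints: captured: alice

# The match ran inside the sub, so the caller's $1 was never set.
print defined $1 ? "\$1 is set\n" : "\$1 is not set in the caller\n";
```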

escapecstring CAGE

Returns the C-string-escaped transformation of the string that had been "caged" in the given EscapeCage object. It will be untainted, since it should be safe to print now. Available as an exported function or an object method.

escapepercent CAGE

Returns the URL percent-escaped transformation of the string that had been "caged" in the given EscapeCage object. It will be untainted, since it should be safe to print now. Available as an exported function or an object method.
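For orientation, the two transformations might look roughly like the following. These are assumption-laden sketches, not the module's implementations: the function names percent_escape and cstring_escape are hypothetical, the input is assumed to be a byte string, and the exact set of characters each one escapes is a guess (the percent version keeps only the RFC 3986 unreserved characters).

```perl
use strict;
use warnings;

# Hypothetical: percent-escape every byte outside A-Z a-z 0-9 - . _ ~
sub percent_escape {
    my ($bytes) = @_;
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
    return $bytes;
}

# Named escapes assumed for the C-string sketch.
my %c_named = (
    "\\" => '\\\\', '"'  => '\\"',
    "\n" => '\\n',  "\t" => '\\t', "\r" => '\\r',
);

# Hypothetical: escape backslash, double quote, and common control
# characters by name, then any remaining non-printable byte as \ooo.
sub cstring_escape {
    my ($bytes) = @_;
    $bytes =~ s/([\\"\n\t\r])/$c_named{$1}/g;
    $bytes =~ s/([^\x20-\x7E])/sprintf '\\%03o', ord $1/ge;
    return $bytes;
}

print percent_escape('a b&c'), "\n";             # prints: a%20b%26c
print cstring_escape(qq{say "hi"\n}), "\n";      # prints: say \"hi\"\n
```

Three-digit octal escapes are used for the remaining bytes because C reads at most three octal digits, so a following digit cannot be absorbed into the escape.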

ADDING STRING::ESCAPECAGE TO AN EXISTING PROJECT

  • Turn global paranoia off (not yet implemented); cage all incoming strings.

  • Over time, in each package, turn local paranoia on (not yet implemented); escape strings in the package's code and cage new strings.

  • When done, turn global paranoia back on.

  • Remove explicit local paranoia setting if desired.

CAVEATS

  • ref() and blessed() behave differently on a cage than on a plain string, because a cage is a blessed scalar reference.

  • Doesn't protect against strings you build yourself, e.g. building a URL string by manually decoding hex digits. (May I suggest that the decoding function return a cage?)
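Both caveats can be seen in one short sketch. It assumes a minimal hypothetical cage class (My::Cage, not the module's own code) and a hypothetical decoder, decode_percent_caged, that puts its result straight back into a cage.

```perl
use strict;
use warnings;

{
    # Hypothetical minimal cage; not the real String::EscapeCage code.
    package My::Cage;
    use Carp qw(croak);
    use overload '""' => sub { croak "attempt to stringify a caged string" };
    sub new    { my ($class, $str) = @_; return bless \$str, $class }
    sub uncage { my ($self) = @_; return ${$self} }
}

# Hypothetical decoder: decoded data is as untrusted as its input, so it
# is returned in a cage rather than as a plain string.
sub decode_percent_caged {
    my ($encoded) = @_;
    (my $decoded = $encoded) =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return My::Cage->new($decoded);
}

my $value = decode_percent_caged('%3Cscript%3E');

# A plain string gives ref() eq ''; a cage reports its class instead.
print ref($value), "\n";    # prints: My::Cage
```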

COMPARISON WITH TAINT

  • Taint checking (for setuid etc) distrusts the invoking user; String::EscapeCage focuses its distrust on explicitly marked data (usually input).

  • A tainted value may be print()ed or syswrite()d; an attempt to print a caged value will croak.

  • Tainting lacks granularity; EscapeCages may be explicitly wrapped around some data but not others.

  • A tainted value may be used as a method name or symbolic sub; String::EscapeCage disallows this.

  • Taintedness can (essentially only) be removed via regular expressions or hash keys; a cage can only be removed by an explicit call to uncage, re, escapehtml, etc.

  • String::EscapeCage doesn't do the cleanup that the -T taint flag enables: @INC, $ENV{PERL5LIB} and $ENV{PERLLIB}, $ENV{PATH}, any setuid/setgid issues.

BUGS

  • The interface was designed without input from a real project and is subject to change.

  • You can't apply a regular expression directly to a caged string with =~; use re instead.

Please report any bugs or feature requests to bug-escapecage at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=String-EscapeCage. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

TODO

  • Define the interface. Until this is used in a real project, it's tough to say what the optimal interface would be.

  • Provide different levels of strictness/fatality.

  • Provide levels of debugging. Notate cages with information for humans: place where caged, reason, etc.

  • Give formally precise implementations of current escaping schemes: percent, html, cstring.

  • Add other escaping schemes: shell, sql, http header, cat -v, lots more.

  • Add a nice mechanism by which other modules can add other escaping schemes.

  • Make wrappers of standard libraries that perform caging. For example: a wrapper class for an IO::Handle object whose readline returns caged strings or whose print etc. automatically applies escapehtml to caged strings; a sub that changes all the values in an Apache::Request object into caged values; validation routines that "see through" cages.

  • Optimize. Maybe memoize escaped values, either by object or by value. Maybe add the ability to turn off error checking. Write faster implementations of each escaping scheme.

AUTHOR

    Mark P Sullivan
    CPAN ID: msulliva
    Zeroth Solutions

COPYRIGHT

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The full text of the license can be found in the LICENSE file included with this module.

SEE ALSO

taint in perlsec, Apache::TaintRequest