NAME

Taint - Perl utility extensions for tainted data

SYNOPSIS

  use Taint;
  warn "Oops"
    if tainted $num, @ids;      # Test for tainted data
  kill $num, @ids;              # before using it

  use Carp;
  use Taint;
  sub baz { croak "Insecure request" if tainted @_; ... }

  use Taint qw(taint);
  taint @list, $item;           # Intentionally taint data

  use Taint qw(:ALL);
  $pi = 3.14159 + tainted_zero; # I don't trust irrational numbers

DESCRIPTION

Perl has the ability to mark data as 'tainted', as described in perlsec(1). Perl will prevent tainted data from being used for some operations, and you may wish to add such caution to your own code. The routines in this module provide convenient ways to taint data and to check data for taint. To remove the taint from data, use the method described in perlsec(1), or use the make_extractor routine.
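
The perlsec(1) method mentioned above launders a value by matching it against a pattern and keeping only what the capture groups return. Here is a minimal sketch of that approach; the sub name clean_username and the allowed character set are illustrative assumptions, not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper: return a laundered copy of a username, or
# nothing if the value contains anything outside the allowed set.
sub clean_username {
    my ($raw) = @_;
    return unless defined $raw;
    # Captured submatches are untainted (unless "use locale" or
    # "use re 'taint'" is in effect), so keep the pattern strict.
    if ($raw =~ /\A([A-Za-z0-9_]{1,32})\z/) {
        return $1;
    }
    return;
}

my $user = clean_username('rootbeer');
print defined $user ? "accepted: $user\n" : "rejected\n";
```

The anchors \A and \z matter here: a pattern that matches only part of the input would quietly launder a prefix of bad data.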

Please read "COPYRIGHT" and "DISCLAIMER".

ROUTINES

tainted LIST
is_tainted EXPR
any_tainted LIST
all_tainted LIST

Test one or more items for taint. tainted is an alias for any_tainted, provided for convenience. (Also, tainted is exported by default.) is_tainted is prototyped to take a single scalar argument; the others take lists. (If you're not sure which one to use, use tainted.) When taint checks are off, these always return false.
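
The difference between the "any" and "all" forms can be sketched with core modules alone. This is not how the module implements them: the tainted imported below is the one from Scalar::Util (a different routine that happens to share the name), any and all come from List::Util, and the _sketch sub names are made up for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);   # core module, not this one
use List::Util   qw(any all);   # needs List::Util 1.33 or later

# Hypothetical stand-ins that mirror the documented meanings.
sub any_tainted_sketch { return (any { tainted($_) } @_) ? 1 : 0 }
sub all_tainted_sketch { return (all { tainted($_) } @_) ? 1 : 0 }

# Under -T, $ENV{PATH} is tainted and the literal is not.
my @args = ('literal', $ENV{PATH} // '');
printf "any: %d, all: %d\n",
    any_tainted_sketch(@args), all_tainted_sketch(@args);
```

Without -T both results are 0, which matches the rule above that the tests always return false when taint checks are off.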

taintedness LIST

This is a utility function, mostly useful for authors of subroutines in modules. It is possible that an algorithm, by its nature, doesn't propagate taintedness as it should. This routine returns the taintedness of its parameters in the form of a null string which is either tainted or not. (When taint checking is off, the return value is always an untainted null string.) That string may be (for example) appended to a return value to taint it if needed.

    sub frobnicate {
        my($taintedness) = taintedness @_;      # save it
        # ...do some stuff which may or may not
        # properly propagate taint...
        return undef if $you_want_to;
        return $taintedness . $return_value;    # restore it
    }

taint LIST

If taint checks are turned on, marks each (apparently) taintable argument in LIST as being tainted. (References and undef are never taintable and are left unchanged. Some tied and magical variables may fail to be tainted by this routine, try as it may.)

To taint (the values of) an entire hash, use this idiom.

    taint @hash{ keys %hash };          # taint values of %hash
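
The slice is needed because a hash slice passed to a subroutine is aliased through @_, so a routine that modifies its arguments in place changes the hash values themselves. The sketch below shows that aliasing with a hypothetical mark_in_place routine, a harmless stand-in for taint that appends a character instead of tainting.

```perl
use strict;
use warnings;

# Hypothetical stand-in for taint: modifies each argument in place
# through the @_ aliases instead of tainting it.
sub mark_in_place {
    $_ .= '!' for @_;
    return;
}

my %hash = (user => 'tom', host => 'teleport');
mark_in_place @hash{ keys %hash };      # values change, keys do not
print "$_ => $hash{$_}\n" for sort keys %hash;
```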

tainted_null
tainted_zero

If you'd rather taint your data yourself, these constants will let you do it. tainted_null is a tainted null string, which may be appended to any data to taint it. (Of course, that will also stringify the data, if needed.) tainted_zero is (surprise) a tainted zero, which may be added to any number to taint it. Note that when taint checking is off, nothing can be tainted, so then these are merely mundane '' and 0 values.
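
Because these constants are plain '' and 0 when taint checking is off, ordinary values are enough to show the side effect each operation has on the data itself. In this sketch, $null and $zero are stand-ins for the two constants, not the constants themselves.

```perl
use strict;
use warnings;

my $null = '';      # what tainted_null is when taint checking is off
my $zero = 0;       # what tainted_zero is when taint checking is off

my $list_ref = [1, 2, 3];
my $as_text  = $list_ref . $null;   # now a plain string, no longer a reference
my $number   = '42' + $zero;        # numeric 42

print ref($as_text) ? "still a reference\n" : "stringified: $as_text\n";
print "number: $number\n";
```

So appending tainted_null to a reference gives you a tainted string, not a tainted reference, which is consistent with references never being taintable.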

taint_checking

This constant tells whether taint checks are in use. This is usually only useful in connection with the allow_no_taint option (see "allow_no_taint").

    print LOG "Warning: Taint checks not enabled\n"
        unless taint_checking;

make_extractor EXPR

This routine returns a coderef for a subroutine which untaints its arguments according to the pattern passed in the string EXPR. Although the argument to this routine must be untainted, the arguments to the generated code may be tainted or not. When taint checking is off, this routine and its generated code behave in essentially the same way, even though neither their parameters nor return values are tainted.

Note: When untainting data, it's often easier to use the method described in perlsec(1), especially if you're unfamiliar with constructing strings to be used as regular expressions.

Here's one way this routine might be used. This example is part of a server (similar in some ways to fingerd; see fingerd(8)) which, when given a username, runs the Unix who command, extracts and untaints some information about that user, and reports it. Note that the regular expression is compiled just once per request (within the make_extractor routine), not once per line of who output, even though the username may change every time through the main loop.

    while () {  # The server runs in an infinite loop
        my $username = &get_next_request;
        # $username must already be untainted! (But let's not
        # assume it doesn't have metacharacters, even though
        # Unix usernames can't have any.)
        my $pattern =
            '^' .
            quotemeta($username) .
            '\s+(\S+)\s+(.+)$';
        my $get_who = make_extractor $pattern;

        my %info = ();
        for (`who`) {
            # $_ has lines of tainted information
            my($tty, $date) = &$get_who($_);
            # but $tty and $date are untainted
            $info{$tty} = $date;
        }
        # %info now has untainted information
        ...
    }

Any items which need to be extracted should be within memory parens. Because of that, the string should normally have at least one set of memory parens. The pattern will be applied to each of the arguments in turn, returning a list of all matched items in memory parens. Any arguments which fail to match will add no items to the list. If called in a scalar context, the generated sub will return just the first untainted item in the list. No locale is used; see "SECURITY" in perllocale.
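
The list and scalar behavior described above can be imitated in plain Perl. The sketch below is a hypothetical stand-in named extractor_sketch, not the module's code; in particular it makes no attempt to refuse a tainted pattern, as make_extractor does.

```perl
use strict;
use warnings;

# Hypothetical stand-in for make_extractor; not the module's code.
sub extractor_sketch {
    my ($pattern) = @_;
    my $re = qr/$pattern/;          # compiled once; dies if invalid
    return sub {
        my @found;
        for my $arg (@_) {
            # In list context a match returns its captures; a failed
            # match returns an empty list and so contributes nothing.
            push @found, $arg =~ $re;
        }
        return wantarray ? @found : $found[0];
    };
}

my $get_pair = extractor_sketch('^(\w+)=(\d+)$');
my @all   = $get_pair->('a=1', 'bogus', 'b=22');   # ('a', '1', 'b', '22')
my $first = $get_pair->('bogus', 'b=22');          # 'b'
```

With a pattern that has no capturing parens at all, a plain Perl match in list context returns 1 on success, which is one reason to make sure the pattern captures something.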

Note that the pattern may need to be written a little differently than usual, since it's going to be passed as a string. For example, it's not necessary to backwhack forward slashes in the pattern, since those aren't regexp metacharacters. Also, if the pattern is built up in an expression, it's important that the components all be untainted! And, of course, it needs to be a valid regular expression; otherwise, it causes an immediate error which may be trapped with eval.

For a case-insensitive match, which would usually be indicated with the /i modifier, use the embedded (?i) modifier, as described in perlre(1). The other embeddable modifiers also work.
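
A small sketch of the embedded form, using qr// directly instead of make_extractor so that it runs without this module; the pattern itself is only an example.

```perl
use strict;
use warnings;

# The pattern is an ordinary string, so there is no place to hang a
# trailing /i; the embedded form goes inside the string instead.
my $pattern = '(?i)^(yes|no)$';
my $re      = qr/$pattern/;

my ($answer) = 'YES' =~ $re;    # matches despite the different case
print "answer: $answer\n" if defined $answer;
```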

If the pattern contains backslashes, as many do, it is especially problematic. For example, these attempts to make a pattern aren't doing what they might look like.

    $pattern1 = "(\w+)";        # effectively /(w+)/

    $pattern2 = '\Q' . $foo;    # doesn't use quotemeta

Usually, though, single quotes will do what you expect (and double quotes will confuse you). To help in debugging, you may set $Taint::DEBUGGING = 1 before calling make_extractor, which will produce an allegedly-helpful debugging message as a warning. This message will have a form of the regular expression passed, like /(w+)/ for $pattern1 above.
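
The quoting differences are easy to check by printing the strings before they ever reach a regular expression. In this sketch the value of $foo is an arbitrary example.

```perl
use strict;
use warnings;
no warnings 'misc';     # "\w" in double quotes would otherwise warn

my $foo = 'a.b';

my $double = "(\w+)";               # backslash is eaten:  (w+)
my $single = '(\w+)';               # backslash survives:  (\w+)
my $not_q  = '\Q' . $foo;           # literal \Q, then a.b unquoted
my $quoted = quotemeta($foo);       # metacharacters escaped: a\.b

print "$_\n" for $double, $single, $not_q, $quoted;
```

Under "use warnings" the double-quoted form draws an "Unrecognized escape" warning, which is a useful hint that the string is not what it looks like.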

unconditional_untaint LIST

By unpopular request, this routine is included. Don't use it. Use the method described in perlsec(1) instead. You'd have to be crazy to use this routine. (If you are, read the module itself to see how to enable it. I'm not gonna tell you here.)

Given a list of possibly tainted lvalues, this untaints each of them without any regard for whether they should be untainted or not.

allow_no_taint

By default, importing symbols from this module requires taint checks to be turned on. If you wish to use this module without requiring taint checking (for example, if writing a module which may or may not be run under -T) either import this pseudo-item...

    use Taint qw(allow_no_taint);       # allow to run without -T
    use Taint;                          # default import list

or avoid importing any symbols by explicitly passing an empty import list.

    use Taint ();       # importing no symbols

If you use either of these methods to allow taint checks not to be required, you may want to use the constant taint_checking (see "taint_checking") to determine whether checks are on.

It may be helpful to allow checks to be off during development, but be sure to require them after release!
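
If you want to make the same check without importing anything, Perl's own ${^TAINT} variable reports the taint mode. The sketch below uses that core variable; whether this module's taint_checking constant is built on it is not stated here, and the sub name taint_mode_label is hypothetical.

```perl
use strict;
use warnings;

# ${^TAINT} is 1 under -T, -1 under -t, and 0 when checks are off.
sub taint_mode_label {
    return ${^TAINT} > 0 ? 'enforced'
         : ${^TAINT} < 0 ? 'warnings only'
         :                 'off';
}

warn "Warning: Taint checks not enabled\n"
    if taint_mode_label() eq 'off';
```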

Exports

The only routine exported by default is tainted(). Fortunately, this is the only one most folks need. Other routines may be imported by name, or with the pseudo-import tag :ALL, or the other pseudo-import tags defined in Exporter.

NOTES

Tainting may be explicitly turned on with the -T invocation option (see "-T" in perlrun). Perl will force taint checking to be on if a process was started with setuid or setgid privileges. By default, this module requires taint checking to be on (but see allow_no_taint).

A set-id script may not necessarily run with privileges; that depends upon your system, the privileges of the user running the script, and possibly upon the configuration of perl. This means that if a set-id script is run by its own id(s), it won't have any taint checks - so your script may fail, but only when you run it!

If you're having trouble getting your script to work when taint checks are on, you should remember that Perl will automatically take some extra precautions. By default, Perl doesn't use some environment variables that it normally would, using locales may cause data to be tainted, and the current directory ('.') won't be included in the @INC list. See perlsec(1) for the full list.
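
One precaution from perlsec(1) trips up many scripts: under taint checks, Perl refuses to run external commands while $ENV{PATH} and a few related variables are still tainted. The sketch below is adapted from the idiom in perlsec(1); the sub name and the PATH value are assumptions, and the PATH should list only directories you trust on your system.

```perl
use strict;
use warnings;

# Adapted from perlsec(1). The PATH value is an assumption; use
# directories appropriate to your own system.
sub scrub_environment {
    $ENV{PATH} = '/bin:/usr/bin';
    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};
    return;
}

scrub_environment();    # call this before backticks, system, or exec
```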

DIAGNOSTICS

Attempt to taint read-only value

Just what it sounds like. taint is not able to taint something which can't be modified, such as an expression or a constant.

Pattern was /.../o

When $Taint::DEBUGGING is set to a true value, this message will be issued as a warning for each pattern passed to make_extractor(). This sub will make an attempt to represent the pattern in the "traditional" /foo/ format, although there are some differences. For example, some escapes, such as \Q, aren't really part of the regular expression engine. So, if this shows a regular expression as /\Q/, that means that it's trying to match a backslash followed by a capital Q. Also, this format does backwhack the slash mark itself (since it'll be quoted in the string by slashes), even though you don't want to pass a backslash before a true slash in the pattern. The represented pattern always ends in /o, since that option is always used internally in make_extractor().

sub unconditional_untaint() not properly imported

You should read perlsec(1) again to see how to untaint your data. Repeat as needed.

Can't make code from tainted string

You tried to pass a tainted string to make_extractor(). You should be ashamed of yourself.

Wrong way to import unconditional_untaint()

You should read perlsec(1) again to see how to untaint your data. Repeat as needed.

Can't redefine

You already had a subroutine with the same name as the unconditional_untaint() routine you were trying to import. How many of these do you need?

Taint checks not enabled

Just what it sounds like. Somehow, you didn't have taint checks turned on, and (since you're using this module) you probably were counting on them. Possible reasons: You thought your script would be run set-id, but it wasn't. You forgot to put -T on the top of your script. You're using a module which uses this one, and you didn't know that that module expects taint checks to be on. (If you wish to allow taint checks to be either on or off, see "allow_no_taint".)

Disabled option requested

You tried to use the unconditional_untaint() routine, but whoever installed this module thought you shouldn't. You should read perlsec(1) again to see how to untaint your data. Repeat as needed.

Unexpected error

Something went wrong when trying to taint some data, probably because you tried to taint the untaintable. (For example, a tied variable.) If this happens, please let the author of this module know the circumstances and the error message so that I can try to get a better error message into a future version.

BUGS

We have no way to enforce understanding the docs.

Debugging a program which uses taint checks can be problematic.

Some modules aren't compatible with taint checking. Write to their authors and offer to help improve the modules. Modules which implement tied variables often need help.

The look of some of this module's internal code makes some people think its author was smoking crack. But some people think that when they see any Perl code.

is_tainted @foo isn't what you might think. And it don't use no good grammars, neither, if you asks me.

taint %bar doesn't do anything good. (Hey, I'd make an error message if I knew how to detect it.)

There is no routine which will taint all the taintable parts of a structure more complex than a simple list.

Taint checking is a largely-unexplored area of Perl. It's not unlikely that there are as-yet undiscovered bugs in Perl's tainting code. While working on this module and its tests, the author found three bugs in Perl's internal taint handling. (Using taint checking is like using a safety net with holes. At least it's better than no net at all.) Most new versions of Perl (and even many subversions) fix at least one tainting-related bug. The moral of the story: Stay on alert for announcements about new versions of Perl and vital modules like this one. (Watch comp.lang.perl.announce.)

no Taint; doesn't turn off taint checks (lexically or otherwise), and use Taint; doesn't turn them on. Dang.

Some bugs are documented only in this sentence. (Please send documentation patches and other corrections to the author.)

The following data can never be tainted: references, undef, hash keys, anything which is not a scalar, and some magical or tied variables. Attempting to taint some of these may cause interesting and educational results. (The module which implements a tied variable may allow (or even force) tainting. (For that matter, a tied hash could conceivably have tainted keys! But untainting those would be ...interesting.) Although a reference can't be tainted, it may reference a thingy which is tainted in whole or in part.)

There's no routine which taints data "in passing". That is, there's nothing to which you can pass @foo and get back a tainted copy of it, leaving @foo unmodified. I have a wonderful reason for this, but there's not enough room to write it here in the margin.

Some bugs should be construed as features, and vice versa. This may be one of them.

AUTHOR

Tom Phoenix, <rootbeer@teleport.com>

COPYRIGHT

This entire module is Copyright (C) 1997 by Tom Phoenix. This module is experimental and may be installed for TESTING AND DEVELOPMENT PURPOSES ONLY. It may not be distributed or redistributed except through the Comprehensive Perl Archive Network (CPAN). A modified or partial copy of this package must not be redistributed without prior written permission. In particular, this module and Perl's taint checking may not do what you want, and they may do what you do not want; using this module in any way without understanding that fact is strictly forbidden.

DISCLAIMER

THIS ENTIRE MODULE, INCLUDING ITS DOCUMENTATION AND ALL OTHER FILES, IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

You must read and understand all appropriate documentation, especially including perlsec(1) and this manpage. I say again, this module and Perl's taint checking may not do what you want, and they may do what you do not want; using this module in any way without understanding that fact is strictly forbidden.

Although all reasonable efforts have been made to ensure its quality, utility, and accuracy, it is the users' responsibility to decide whether this is suitable for any given purpose. You runs your code and you takes your chances.

Okay, this is a heck of a disclaimer. Try not to be too scared; the author uses this code himself (when not writing about himself in the third person). Watch the newsgroup comp.lang.perl.announce for announcements of new versions of this module and other cool stuff.

SEE ALSO

perlsec(1) and perlre(1).
