NAME

String::BlackWhiteList - match a string against a blacklist and a whitelist

SYNOPSIS

    use String::BlackWhiteList;
    use Test::More;

    use constant BLACKLIST => (
        'POST',
        'PO',
        'P O',
        'P O BOX',
        'P.O.',
        'P.O.B.',
        'P.O.BOX',
        'P.O. BOX',
        'P. O.',
        'P. O.BOX',
        'P. O. BOX',
        'POBOX',
    );

    use constant WHITELIST => (
        'Post Road',
        'Post Rd',
        'Post Street',
        'Post St',
        'Post Avenue',
        'Post Av',
        'Post Alley',
        'Post Drive',
    );

    my @ok = (
        'Post Road 123',
        'Post Rd 123',
        'Post Street 123',
        'Post St 123',
        'Post Avenue 123',
    );

    my @not_ok = (
        'Post',
        'P.O. BOX 37',
        'P.O. BOX 37, Post Drive 9',
        'Post Street, P.O.B.',
    );

    plan tests => @ok + @not_ok;

    my $matcher = String::BlackWhiteList->new(
        blacklist => [ BLACKLIST ],
        whitelist => [ WHITELIST ]
    )->update;

    ok( $matcher->valid($_), "[$_] valid")   for @ok;
    ok(!$matcher->valid($_), "[$_] invalid") for @not_ok;

DESCRIPTION

Using this class you can match strings against a blacklist and a whitelist. The matching algorithm is explained in the valid() method's documentation.

METHODS

new
    my $obj = String::BlackWhiteList->new;
    my $obj = String::BlackWhiteList->new(%args);

Creates and returns a new object. The constructor accepts a list of key/value pairs that map component names to initial values. For each pair, the named component is initialized by calling the method of the same name with the given value. If called with a single hash reference, it is dereferenced and its key/value pairs are set in the same way.

black_re
    my $value = $obj->black_re;
    $obj->black_re($value);

A basic getter/setter method. If called without an argument, it returns the value. If called with a single argument, it sets the value.

black_re_clear
    $obj->black_re_clear;

Clears the value.

blacklist
    my @values    = $obj->blacklist;
    my $array_ref = $obj->blacklist;
    $obj->blacklist(@values);
    $obj->blacklist($array_ref);

Get or set the array values. If called without arguments, it returns the array in list context, or a reference to the array in scalar context. If called with arguments, it expands array references found therein and sets the values.
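The context-sensitive behaviour described here can be illustrated without the module. The following is a minimal sketch in plain Perl; sketch_list_accessor() and @store are hypothetical names, and returning the current values after a set is an assumption rather than something this documentation states.

```perl
use strict;
use warnings;

my @store;    # stands in for the object's internal array

# Hypothetical stand-in for an accessor such as blacklist().
sub sketch_list_accessor {
    # With arguments: flatten any array references and replace the contents.
    @store = map { ref($_) eq 'ARRAY' ? @$_ : $_ } @_ if @_;

    # List context gets the values, scalar context gets a reference.
    return wantarray ? @store : \@store;
}

sketch_list_accessor('POST', [ 'PO', 'P.O.' ]);
my @values    = sketch_list_accessor();    # three plain strings
my $array_ref = sketch_list_accessor();    # reference to the stored array
```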

blacklist_clear
    $obj->blacklist_clear;

Deletes all elements from the array.

blacklist_count
    my $count = $obj->blacklist_count;

Returns the number of elements in the array.

blacklist_index
    my $element   = $obj->blacklist_index(3);
    my @elements  = $obj->blacklist_index(@indices);
    my $array_ref = $obj->blacklist_index(@indices);

Takes a list of indices and returns the elements indicated by those indices. If only one index is given, the corresponding array element is returned. If several indices are given, the result is returned as an array in list context or as an array reference in scalar context.
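The three return shapes can be reproduced with an ordinary array slice. This is a sketch under assumptions, not the module's code: sketch_index() is a hypothetical function that takes the array as an explicit reference instead of reading it from an object.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a method such as blacklist_index().
sub sketch_index {
    my ($array_ref, @indices) = @_;
    my @elements = @{$array_ref}[@indices];    # array slice

    # A single index yields the element itself.
    return $elements[0] if @indices == 1;

    # Several indices: list in list context, reference in scalar context.
    return wantarray ? @elements : \@elements;
}

my @list      = ('POST', 'PO', 'P O', 'P O BOX');
my $element   = sketch_index(\@list, 3);       # 'P O BOX'
my @elements  = sketch_index(\@list, 0, 2);    # ('POST', 'P O')
my $array_ref = sketch_index(\@list, 0, 2);    # ['POST', 'P O']
```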

blacklist_pop
    my $value = $obj->blacklist_pop;

Pops the last element off the array, returning it.

blacklist_push
    $obj->blacklist_push(@values);

Pushes elements onto the end of the array.

blacklist_set
    $obj->blacklist_set(1 => $x, 5 => $y);

Takes a list of index/value pairs and for each pair it sets the array element at the indicated index to the indicated value. Returns the number of elements that have been set.

blacklist_shift
    my $value = $obj->blacklist_shift;

Shifts the first element off the array, returning it.

blacklist_splice
    $obj->blacklist_splice(2, 1, $x, $y);
    $obj->blacklist_splice(-1);
    $obj->blacklist_splice(0, -1);

Takes up to three arguments: an offset, a length and a list.

Removes the elements designated by the offset and the length from the array, and replaces them with the elements of the list, if any. In list context, returns the elements removed from the array. In scalar context, returns the last element removed, or undef if no elements are removed. The array grows or shrinks as necessary. If the offset is negative then it starts that far from the end of the array. If the length is omitted, removes everything from the offset onward. If the length is negative, removes the elements from the offset onward except for -length elements at the end of the array. If both the offset and the length are omitted, removes everything. If the offset is past the end of the array, it issues a warning, and splices at the end of the array.
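The description reads like that of Perl's built-in splice, so the call shapes shown for this method can be tried on a plain array. The sketch below does exactly that; whether the method agrees with the built-in in every edge case is an assumption based on the wording above.

```perl
use strict;
use warnings;

my @list = qw(a b c d e);

# Offset 2, length 1, replacement list: 'c' is replaced by 'x' and 'y'.
my @removed = splice(@list, 2, 1, 'x', 'y');    # @list is now a b x y d e

# Negative offset, no length: removes everything from the last element onward.
my $last = splice(@list, -1);                   # @list is now a b x y d

# Negative length: removes from offset 0 but keeps the final element.
my @front = splice(@list, 0, -1);               # @list is now d
```

In scalar context, as in the second call, the last removed element is returned rather than a count.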

blacklist_unshift
    $obj->blacklist_unshift(@values);

Unshifts elements onto the beginning of the array.

clear_black_re
    $obj->clear_black_re;

Clears the value.

clear_blacklist
    $obj->clear_blacklist;

Deletes all elements from the array.

clear_is_literal_text
    $obj->clear_is_literal_text;

Clears the boolean value by setting it to 0.

clear_white_re
    $obj->clear_white_re;

Clears the value.

clear_whitelist
    $obj->clear_whitelist;

Deletes all elements from the array.

count_blacklist
    my $count = $obj->count_blacklist;

Returns the number of elements in the array.

count_whitelist
    my $count = $obj->count_whitelist;

Returns the number of elements in the array.

index_blacklist
    my $element   = $obj->index_blacklist(3);
    my @elements  = $obj->index_blacklist(@indices);
    my $array_ref = $obj->index_blacklist(@indices);

Takes a list of indices and returns the elements indicated by those indices. If only one index is given, the corresponding array element is returned. If several indices are given, the result is returned as an array in list context or as an array reference in scalar context.

index_whitelist
    my $element   = $obj->index_whitelist(3);
    my @elements  = $obj->index_whitelist(@indices);
    my $array_ref = $obj->index_whitelist(@indices);

Takes a list of indices and returns the elements indicated by those indices. If only one index is given, the corresponding array element is returned. If several indices are given, the result is returned as an array in list context or as an array reference in scalar context.

is_literal_text
    $obj->is_literal_text($value);
    my $value = $obj->is_literal_text;

If called without an argument, returns the boolean value (0 or 1). If called with an argument, it normalizes it to the boolean value. That is, the values 0, undef and the empty string become 0; everything else becomes 1.
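The normalization rule corresponds to Perl's own notion of truth. A minimal sketch, with sketch_normalize_boolean() as a hypothetical name that is not part of the module:

```perl
use strict;
use warnings;

# Hypothetical helper: collapse any value to 0 or 1 using Perl truthiness.
sub sketch_normalize_boolean {
    my ($value) = @_;
    return $value ? 1 : 0;
}

my @normalized = map { sketch_normalize_boolean($_) } (0, undef, '', 'yes', 42);
```

In this sketch the string '0' also normalizes to 0, because Perl treats it as false.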

is_literal_text_clear
    $obj->is_literal_text_clear;

Clears the boolean value by setting it to 0.

is_literal_text_set
    $obj->is_literal_text_set;

Sets the boolean value to 1.

pop_blacklist
    my $value = $obj->pop_blacklist;

Pops the last element off the array, returning it.

pop_whitelist
    my $value = $obj->pop_whitelist;

Pops the last element off the array, returning it.

push_blacklist
    $obj->push_blacklist(@values);

Pushes elements onto the end of the array.

push_whitelist
    $obj->push_whitelist(@values);

Pushes elements onto the end of the array.

set_blacklist
    $obj->set_blacklist(1 => $x, 5 => $y);

Takes a list of index/value pairs and for each pair it sets the array element at the indicated index to the indicated value. Returns the number of elements that have been set.

set_is_literal_text
    $obj->set_is_literal_text;

Sets the boolean value to 1.

set_whitelist
    $obj->set_whitelist(1 => $x, 5 => $y);

Takes a list of index/value pairs and for each pair it sets the array element at the indicated index to the indicated value. Returns the number of elements that have been set.

shift_blacklist
    my $value = $obj->shift_blacklist;

Shifts the first element off the array, returning it.

shift_whitelist
    my $value = $obj->shift_whitelist;

Shifts the first element off the array, returning it.

splice_blacklist
    $obj->splice_blacklist(2, 1, $x, $y);
    $obj->splice_blacklist(-1);
    $obj->splice_blacklist(0, -1);

Takes up to three arguments: an offset, a length and a list.

Removes the elements designated by the offset and the length from the array, and replaces them with the elements of the list, if any. In list context, returns the elements removed from the array. In scalar context, returns the last element removed, or undef if no elements are removed. The array grows or shrinks as necessary. If the offset is negative then it starts that far from the end of the array. If the length is omitted, removes everything from the offset onward. If the length is negative, removes the elements from the offset onward except for -length elements at the end of the array. If both the offset and the length are omitted, removes everything. If the offset is past the end of the array, it issues a warning, and splices at the end of the array.

splice_whitelist
    $obj->splice_whitelist(2, 1, $x, $y);
    $obj->splice_whitelist(-1);
    $obj->splice_whitelist(0, -1);

Takes up to three arguments: an offset, a length and a list.

Removes the elements designated by the offset and the length from the array, and replaces them with the elements of the list, if any. In list context, returns the elements removed from the array. In scalar context, returns the last element removed, or undef if no elements are removed. The array grows or shrinks as necessary. If the offset is negative then it starts that far from the end of the array. If the length is omitted, removes everything from the offset onward. If the length is negative, removes the elements from the offset onward except for -length elements at the end of the array. If both the offset and the length are omitted, removes everything. If the offset is past the end of the array, it issues a warning, and splices at the end of the array.

unshift_blacklist
    $obj->unshift_blacklist(@values);

Unshifts elements onto the beginning of the array.

unshift_whitelist
    $obj->unshift_whitelist(@values);

Unshifts elements onto the beginning of the array.

white_re
    my $value = $obj->white_re;
    $obj->white_re($value);

A basic getter/setter method. If called without an argument, it returns the value. If called with a single argument, it sets the value.

white_re_clear
    $obj->white_re_clear;

Clears the value.

whitelist
    my @values    = $obj->whitelist;
    my $array_ref = $obj->whitelist;
    $obj->whitelist(@values);
    $obj->whitelist($array_ref);

Get or set the array values. If called without arguments, it returns the array in list context, or a reference to the array in scalar context. If called with arguments, it expands array references found therein and sets the values.

whitelist_clear
    $obj->whitelist_clear;

Deletes all elements from the array.

whitelist_count
    my $count = $obj->whitelist_count;

Returns the number of elements in the array.

whitelist_index
    my $element   = $obj->whitelist_index(3);
    my @elements  = $obj->whitelist_index(@indices);
    my $array_ref = $obj->whitelist_index(@indices);

Takes a list of indices and returns the elements indicated by those indices. If only one index is given, the corresponding array element is returned. If several indices are given, the result is returned as an array in list context or as an array reference in scalar context.

whitelist_pop
    my $value = $obj->whitelist_pop;

Pops the last element off the array, returning it.

whitelist_push
    $obj->whitelist_push(@values);

Pushes elements onto the end of the array.

whitelist_set
    $obj->whitelist_set(1 => $x, 5 => $y);

Takes a list of index/value pairs and for each pair it sets the array element at the indicated index to the indicated value. Returns the number of elements that have been set.

whitelist_shift
    my $value = $obj->whitelist_shift;

Shifts the first element off the array, returning it.

whitelist_splice
    $obj->whitelist_splice(2, 1, $x, $y);
    $obj->whitelist_splice(-1);
    $obj->whitelist_splice(0, -1);

Takes up to three arguments: an offset, a length and a list.

Removes the elements designated by the offset and the length from the array, and replaces them with the elements of the list, if any. In list context, returns the elements removed from the array. In scalar context, returns the last element removed, or undef if no elements are removed. The array grows or shrinks as necessary. If the offset is negative then it starts that far from the end of the array. If the length is omitted, removes everything from the offset onward. If the length is negative, removes the elements from the offset onward except for -length elements at the end of the array. If both the offset and the length are omitted, removes everything. If the offset is past the end of the array, it issues a warning, and splices at the end of the array.

whitelist_unshift
    $obj->whitelist_unshift(@values);

Unshifts elements onto the beginning of the array.

black_re

The actual regular expression (preferably created by qr//) used for blacklist testing.

white_re

The actual regular expression (preferably created by qr//) used for whitelist testing.

update

Takes the blacklist from blacklist(), generates a regular expression that matches any string in the blacklist and sets the regular expression on black_re().

Also takes the whitelist from whitelist(), generates a regular expression that matches any string in the whitelist and sets the regular expression on white_re().

The individual entries of blacklist() and whitelist() are assumed to be regular expressions. If you mix regular expressions and literal strings, wrap the literal strings in \Q...\E. If all your entries are literal strings, set is_literal_text().

If you set a black_re() and a white_re() yourself, you shouldn't use update(), of course.
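The difference between regular-expression entries and literal entries can be seen in a small sketch. It does not reproduce how update() assembles its patterns, which this documentation does not specify; sketch_list_to_re() is a hypothetical helper, and the case-insensitive flag is an assumption suggested by the "SYNOPSIS", where the blacklist entry 'POST' rejects 'Post'.

```perl
use strict;
use warnings;

# Hypothetical helper: join list entries into one alternation.
# When $is_literal is true, every entry is escaped with quotemeta(),
# which has the same effect as wrapping it in \Q...\E.
sub sketch_list_to_re {
    my ($is_literal, @entries) = @_;
    my @parts = $is_literal ? (map { quotemeta($_) } @entries) : @entries;
    my $alternatives = join '|', @parts;
    return qr/(?:$alternatives)/i;
}

my $as_regex   = sketch_list_to_re(0, 'P.O.', 'POBOX');
my $as_literal = sketch_list_to_re(1, 'P.O.', 'POBOX');

# As a regular expression, each '.' in 'P.O.' matches any character.
print 'PxOy' =~ $as_regex   ? "regex entry matches\n"   : "regex entry does not match\n";
print 'PxOy' =~ $as_literal ? "literal entry matches\n" : "literal entry does not match\n";
```

Treated as a regular expression, 'P.O.' also accepts strings such as 'PxOy'; treated as literal text, it accepts only the dotted form.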

valid

Takes a string and tries to determine whether it is valid according to the blacklist and the whitelist. This is the algorithm used to determine validity:

If the string matches the whitelist, then the part of the string that didn't match the whitelist is checked against the blacklist. If the remainder matches the blacklist, the string is still considered invalid. If not, it is considered valid.

Consider the example 'P.O. BOX 37, Post Drive 9' from the "SYNOPSIS". The 'Post Drive' part matches the whitelist, but the 'P.O. BOX' part matches the blacklist, so the string is still considered invalid.

If the string doesn't match the whitelist, but it matches the blacklist, then it is considered invalid.

If the string matches neither the whitelist nor the blacklist, it is considered valid.

Undefined values and empty strings are considered valid. This may seem strange, but there is no indication that they are invalid and in dubio pro reo.
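The rules above can be sketched in plain Perl. This is an illustration under assumptions, not the module's implementation: sketch_build_re() and sketch_valid() are hypothetical names, the entries are treated as literal text, and the case-insensitive matching with lookaround boundaries is a choice made for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: one case-insensitive pattern from literal strings.
# The lookarounds keep an entry like 'PO' from matching inside 'Pond'.
sub sketch_build_re {
    my $alternatives = join '|', map { quotemeta($_) } @_;
    return qr/(?<!\w)(?:$alternatives)(?!\w)/i;
}

# Hypothetical stand-in for valid(); returns 1 for valid, 0 for invalid.
sub sketch_valid {
    my ($string, $white_re, $black_re) = @_;

    # Undefined values and empty strings are considered valid.
    return 1 unless defined $string && length $string;

    if ($string =~ $white_re) {
        # Remove the whitelisted parts, then test only what is left.
        (my $remainder = $string) =~ s/$white_re//g;
        return $remainder =~ $black_re ? 0 : 1;
    }

    # No whitelist match: the blacklist alone decides.
    return $string =~ $black_re ? 0 : 1;
}

my $white_re = sketch_build_re('Post Street', 'Post Drive');
my $black_re = sketch_build_re('POST', 'P.O. BOX', 'P.O.B.');

for my $candidate ('Post Street 123', 'Post', 'P.O. BOX 37, Post Drive 9') {
    my $verdict = sketch_valid($candidate, $white_re, $black_re) ? 'valid' : 'invalid';
    printf "%-28s %s\n", "[$candidate]", $verdict;
}
```

Removing the whitelisted part before the blacklist test is what lets 'Post Street 123' pass even though 'POST' is blacklisted.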

valid_relaxed

Like valid(), but once a string passes the whitelist, it is not checked against the blacklist anymore. That is, if a string matches the whitelist, it is valid. If not, it is checked against the blacklist - if it matches, it is invalid. If it matches neither whitelist nor blacklist, it is valid.
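The relaxed rule can be sketched the same way. The patterns and the name sketch_valid_relaxed() are hypothetical, and treating undefined or empty input as valid is an assumption carried over from the description of valid().

```perl
use strict;
use warnings;

# Hand-written stand-ins for white_re() and black_re().
my $white_re = qr/\bPost (?:Road|Drive)\b/i;
my $black_re = qr/\bP\.O\. BOX\b/i;

# Hypothetical stand-in for valid_relaxed(); returns 1 for valid, 0 for invalid.
sub sketch_valid_relaxed {
    my ($string) = @_;
    return 1 unless defined $string && length $string;

    # A whitelist match settles the question; the blacklist is not consulted.
    return 1 if $string =~ $white_re;

    return $string =~ $black_re ? 0 : 1;
}

my $mixed     = sketch_valid_relaxed('P.O. BOX 37, Post Drive 9');
my $blacklist = sketch_valid_relaxed('P.O. BOX 37');
```

Here $mixed is 1 although the string contains 'P.O. BOX', which is the difference from the stricter rule of valid().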

String::BlackWhiteList inherits from Class::Accessor::Complex.

The superclass Class::Accessor::Complex defines these methods and functions:

    mk_abstract_accessors(), mk_array_accessors(), mk_boolean_accessors(),
    mk_class_array_accessors(), mk_class_hash_accessors(),
    mk_class_scalar_accessors(), mk_concat_accessors(),
    mk_forward_accessors(), mk_hash_accessors(), mk_integer_accessors(),
    mk_new(), mk_object_accessors(), mk_scalar_accessors(),
    mk_set_accessors(), mk_singleton()

The superclass Class::Accessor defines these methods and functions:

    _carp(), _croak(), _mk_accessors(), accessor_name_for(),
    best_practice_accessor_name_for(), best_practice_mutator_name_for(),
    follow_best_practice(), get(), make_accessor(), make_ro_accessor(),
    make_wo_accessor(), mk_accessors(), mk_ro_accessors(),
    mk_wo_accessors(), mutator_name_for(), set()

The superclass Class::Accessor::Installer defines these methods and functions:

    install_accessor()

TAGS

If you talk about this module in blogs, on del.icio.us or anywhere else, please use the stringblackwhitelist tag.

VERSION

This document describes version 0.05 of String::BlackWhiteList.

BUGS AND LIMITATIONS

No bugs have been reported.

Please report any bugs or feature requests to <bug-string-blackwhitelist@rt.cpan.org>, or through the web interface at http://rt.cpan.org.

INSTALLATION

See perlmodinstall for information and options on installing Perl modules.

AVAILABILITY

The latest version of this module is available from the Comprehensive Perl Archive Network (CPAN). Visit <http://www.perl.com/CPAN/> to find a CPAN site near you. Or see <http://www.perl.com/CPAN/authors/id/M/MA/MARCEL/>.

AUTHOR

Marcel Grünauer, <marcel@cpan.org>

COPYRIGHT AND LICENSE

Copyright 2005-2008 by Marcel Grünauer

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.