NAME

Dios::Types::Pure - Type checking for the Dios framework (and everyone else too)

VERSION

This document describes Dios::Types::Pure version 0.000001

SYNOPSIS

    use Dios::Types::Pure 'validate';

    # Throw an exception if the VALUE doesn't conform to the specified TYPE
    validate($TYPE, $VALUE);

    # Same, but report errors using the specified MESSAGE
    validate($TYPE, $VALUE, $MESSAGE);

    # Same, but VALUE must satisfy every one of the CONSTRAINTS as well
    validate($TYPE, $VALUE, $DESC, @CONSTRAINTS);

    # If you don't want exceptions in response to type mismatches, use an eval
    if (!eval{ validate($TYPE, $VALUE) }) {
        warn "$VALUE not of type $TYPE. Proceeding anyway.";
    }

    use Dios::Types::Pure 'validator_for';

    # Same, but prebuild validator for faster checking...
    my $check = validator_for($TYPE, $DESC, @CONSTRAINTS);

    for my $VALUE (@MANY_VALUES) {
        $check->($VALUE);
    }

DESCRIPTION

Standard types

This module implements type-checking for all of the following types...

Any

Accepts any Perl value.

Bool

Accepts any Perl value that can be used as a boolean. So effectively: any Perl value (just like Any).

This type exists mainly to allow you to be more specific about using a value as a boolean.

Undef

Accepts any value that is undefined. In other words, only the value undef.

Def

Accepts any value that is defined. That is, any value except undef.

Value

Accepts any value that is defined...but not a reference. For example: 7 or 0x093FA3D7 or 'word'.

Num

Accepts any value that is defined and also something for which Scalar::Util::looks_like_number() returns true.

However, unlike looks_like_number(), this type does not accept the special value 'NaN'. (I mean, what part of "not a number" does that function not understand???)

Note that this type does accept other special values like "Inf"/"Infinity", as well as objects with numeric overloadings.
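As a rough illustration of the rule above (and not the module's actual implementation), the following sketch uses only the core Scalar::Util module. The helper name is_num_like() is hypothetical, and the NaN test assumes that your Perl numifies the string 'NaN' to an IEEE NaN, which never compares equal to itself.

```perl
use strict;
use warnings;
use Scalar::Util 'looks_like_number';

# Hypothetical helper: true for defined, number-like values other than NaN
sub is_num_like {
    my ($value) = @_;

    return 0 if !defined $value;
    return 0 if !looks_like_number($value);

    # Assumption: NaN is the only numeric value that is not equal to itself
    my $number = 0 + $value;
    return $number == $number ? 1 : 0;
}

for my $candidate ('7', '1e3', 'Inf', 'word') {
    printf "%-5s => %s\n", $candidate, is_num_like($candidate) ? 'accepted' : 'rejected';
}
```

In this sketch '7', '1e3', and 'Inf' are accepted, and 'word' is rejected.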

Int

Accepts any value for which Scalar::Util::looks_like_number() returns true and which also matches the regex:

     /
        \A
        \s*                     # optional leading space
        [+-]?                   # optional sign
        (?:                     # either...
            \d++                #     digits
            (\.0*)?             #     plus optional decimal zeroes
        |                       # or...
            (?i) inf(?:inity)?  #     some "infinity" variant
        )                       #
        \s*                     # optional trailing space
        \Z
     /x

Note that this type also accepts objects with numeric overloadings that produce integers.
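To see how the two conditions interact, here is a minimal sketch that applies them by hand, using only core Perl. It is not the module's own code: the helper name is_int_like() is hypothetical, and the pattern is simply the one shown above with its comments removed.

```perl
use strict;
use warnings;
use Scalar::Util 'looks_like_number';

# The pattern from the description above, reproduced without its comments
my $INT_PATTERN = qr/
    \A
    \s* [+-]?
    (?: \d++ (\.0*)? | (?i) inf(?:inity)? )
    \s*
    \Z
/x;

# Hypothetical helper: true only if both documented conditions hold
sub is_int_like {
    my ($value) = @_;

    return 0 if !defined $value;
    return 0 if !looks_like_number($value);
    return $value =~ $INT_PATTERN ? 1 : 0;
}

for my $candidate ('42', '3.00', '3.5', '1e3', 'Inf', 'word') {
    printf "%-6s => %s\n", $candidate, is_int_like($candidate) ? 'accepted' : 'rejected';
}
```

Here '42', '3.00', and 'Inf' are accepted, whereas '3.5', '1e3', and 'word' are rejected. Note that '1e3' does look like a number, but fails the pattern.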

Str

Accepts any value that is a string, or a non-reference that can be converted to a string (e.g. a number), or any object with a stringification overloading.

Empty

Accepts any value that is a string, or a non-reference that can be converted to a string (e.g. a number), or any object with a stringification overloading, provided the resulting string in each case is of zero length.

Also accepts empty arrays and hashes (see below).

Class

Accepts any value that's a string that is the name of a symbol-table entry containing at least one of: $VERSION, @ISA, or some CODE entry.

In other words, the value must be the name of a package that is plausibly also a class...either because it has a version number, or because it inherits from some other class, or because it has at least one method defined.
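The "plausibly a class" test can be approximated with a direct symbol-table lookup. The following is only a sketch of that idea, not the module's implementation: the helper name looks_like_class() and the Demo::* packages are hypothetical, invented purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical packages, each "plausibly a class" for a different reason
{ package Demo::Shape;  our $VERSION = '1.0';           }
{ package Demo::Circle; our @ISA     = ('Demo::Shape'); }
{ package Demo::Util;   sub helper { return 1 }         }

# Hypothetical helper: does the named package have a $VERSION,
# a non-empty @ISA, or at least one defined subroutine?
sub looks_like_class {
    my ($name) = @_;

    return 0 if !defined $name || ref $name;
    return 0 if $name !~ /\A \w+ (?: :: \w+ )* \z/x;

    no strict 'refs';
    return 1 if defined ${"${name}::VERSION"};
    return 1 if @{"${name}::ISA"};

    for my $entry (keys %{"${name}::"}) {
        next if $entry =~ /::\z/;                    # skip nested packages
        return 1 if defined &{"${name}::$entry"};
    }
    return 0;
}

for my $name (qw( Demo::Shape Demo::Circle Demo::Util Demo::Nonexistent )) {
    printf "%-18s => %s\n", $name, looks_like_class($name) ? 'class' : 'not a class';
}
```

The first three names are reported as classes (because of a version, a parent class, and a method, respectively). The last is not, because nothing has been defined in that package.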

Ref and Ref[T]

Accepts any value that is a reference of some kind (including objects).

The parameterized form specifies what kind(s) of reference the value must be:

    Ref[Str]       # accepts only a reference to a string
    Ref[Int]       # accepts only a reference to an integer
    Ref[Array]     # accepts only a reference to an array
    Ref[Hash]      # accepts only a reference to a hash
    Ref[Code]      # accepts only a reference to a subroutine
    Ref[Str|Num]   # accepts only a reference to a string or number

This implies that an unparameterized Ref is just a shorthand for Ref[Any].

Scalar

Accepts any value that is a reference to a scalar. For example: \1, \2.34e56, \"foo", etc.

Regex

Accepts any value that is a reference to a Regexp object (i.e. the value created by a qr/.../).

Code

Accepts any value that is a reference to a subroutine. Either: \&named_sub or sub {...}.

Glob

Accepts any value that is a reference to a typeglob.

IO

Accepts any value that is a reference to an open filehandle of some kind (as tested by Scalar::Util::openhandle()).

Obj

Accepts any value that is a reference to an object (i.e. anything blessed).

Array and Array[T]

Accepts any value that is a reference to an array.

The parameterized form specifies what kind of values the array must contain:

    Array[Str]         # reference to array containing only strings

    Array[Hash]        # reference to array containing only hash refs

    Array[Code|Array]  # reference to array containing
                       # subroutine refs and/or array refs

Hence an unparameterized Array is just a shorthand for Array[Any].

The module also allows List as a synonym for Array.

Empty

Accepts any value that is a reference to an array that contains no elements.

Also accepts empty strings and hashes (see above and below).

Tuple[T1, T2, T3, ...]

Accepts any value that is a reference to an array in which the sequence of array elements is of the specified types (in order). For example:

    Tuple[Str, Int, Int, Hash]    # accepts: ["Foo", 1, 2,   {bar=>1}]
                                  # but not: ["Foo", 1, 2.1, {bar=>1}]
                                  # and not: [1, 2,  "Foo",  {bar=>1}]

If the final specified type is followed by ..., the remainder of the elements may be any number of values (including none) of that type. For example:

    Tuple[Str, Hash, Str...]  # accepts:  ["Foo", {bar=>1}]
                              # and also: ["Foo", {bar=>1}, 'cat']
                              # and also: ["Foo", {bar=>1}, 'cat', 'dog']
                              # et cetera...

If the last component of a tuple's type list is just ... by itself, the remainder of the elements may be anything (or nothing)...

    Tuple[Str, Hash, ...]     # accepts:  ["Foo", {bar=>1}]
                              # and also: ["Foo", {bar=>1}, 'etc']
                              # and also: ["Foo", {bar=>1}, 3, 4.5]
                              # et cetera...

That is, a trailing ... is just shorthand for a trailing Any...
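The matching rules for tuples (a fixed prefix, plus an optional repeated trailing type) can be sketched in plain Perl as follows. This is an illustration only, under the assumption that each component type is represented by a predicate subroutine; the helper is_tuple() and the predicates are hypothetical, not part of the module.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the Str, Hash, and Any types
my $is_str  = sub { defined $_[0] && !ref $_[0] };
my $is_hash = sub { ref($_[0]) eq 'HASH' };
my $is_any  = sub { 1 };

# Hypothetical helper: $fixed_tests is an array ref of predicates for the
# leading elements; $rest_test (optional) is applied to every later element
sub is_tuple {
    my ($value, $fixed_tests, $rest_test) = @_;

    return 0 if ref($value) ne 'ARRAY';
    return 0 if @$value < @$fixed_tests;                 # too few elements
    return 0 if !$rest_test && @$value > @$fixed_tests;  # too many elements

    for my $i (0 .. $#{$value}) {
        my $test = $i < @$fixed_tests ? $fixed_tests->[$i] : $rest_test;
        return 0 if !$test->($value->[$i]);
    }
    return 1;
}

# Roughly: Tuple[Str, Hash, Str...]
print is_tuple(['Foo', {bar=>1}, 'cat', 'dog'], [$is_str, $is_hash], $is_str), "\n";

# Roughly: Tuple[Str, Hash, ...]
print is_tuple(['Foo', {bar=>1}, 3, 4.5],       [$is_str, $is_hash], $is_any), "\n";
```

Both calls print 1. Passing no third argument corresponds to a tuple type with no trailing ... at all, in which case the array must contain exactly as many elements as there are component types.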

Hash, Hash[T], and Hash[T=>T]

The unparameterized type accepts any value that is a reference to a hash.

The singly parameterized form additionally constrains what kind of values the hash may contain:

    Hash[Str]         # Each hash value must be a string

    Hash[Hash]        # Each hash value must be a hash reference

    Hash[Code|Array]  # Each hash value must be a subroutine or array reference

Hence an unparameterized Hash is just a shorthand for Hash[Any].

The doubly parameterized form additionally constrains the type of keys the hash may contain. The type specified before the arrow is the type of each key; the type after the arrow is the type of each value:

    Hash[ Not[Empty] => Str ]  # Each key must be at least one character long
                               # and each value must be a string

    Hash[ Match[^q] => Any ]   # Each key must start with a 'q'
                               # but values can be of any type

    Hash[ Class => Obj|Undef ] # Each key must be the name of a class;
                               # Each value must be an object or undef

Hence an unparameterized Hash is also a shorthand for Hash[Str=>Any].

Empty

Accepts any value that is a reference to a hash that contains no entries.

Also accepts empty strings and arrays (see above).

Dict[ k, k => T, k? => T, ...]

Accepts any value that is a reference to a hash containing specific keys (and optionally with those keys having values of specific types).

Keys may be required or optional, and the corresponding values may be typed or untyped (i.e. Any). The set of keys listed may specify the only permitted keys...or allow other keys as well. The following examples cover the various possibilities.

To specify a reference to a hash with only four permitted keys ('name', 'rank', 'ID', and 'notes'), all of which must be present in the hash:

    Dict[ name, rank, ID, notes ]

To specify a reference to a hash with four permitted keys, only two of which are required to be present in the hash:

    Dict[ name, rank?, ID, notes? ]   # may have 'rank' and 'notes' entries
                                      # but not required to

To specify a reference to a hash with two to four permitted keys, with values of specific types:

    Dict[ name => Str, rank? => Rank, ID => Int, notes? => Array ]

To specify a reference to a hash with two to four permitted keys, only some of which have values of specific types:

    Dict[ name, rank? => Rank, ID => Int, notes? ]  # 'name' and 'notes' entries
                                                    # can be of any type

To specify a reference to a hash with two to four specific keys, some with specific types, and with any number of other keys also allowed:

    Dict[ name, rank? => Rank, ID => Int, notes?, ... ]

More complex relationships between keys and types can be specified using disjunctive types. For example, a reference to a hash with required 'ID' and 'name' entries and an optional 'rank' entry...but if the 'rank' entry is present, there must also be a 'notes' array:

    Dict[name,ID]|Dict[name,ID,rank,notes=>Array]
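The key-related rules above (required keys, optional keys, and whether other keys are allowed) can be sketched in plain Perl. This is only an illustration of the logic, ignoring the value types; the helper has_permitted_keys() and its named arguments are hypothetical, not part of the module's interface.

```perl
use strict;
use warnings;

# Hypothetical helper:
#   required => [...]  keys that must be present
#   optional => [...]  keys that may be present
#   open     => 1      other keys are also allowed (like a trailing ...)
sub has_permitted_keys {
    my ($hash, %spec) = @_;
    my @required = @{ $spec{required} // [] };
    my @optional = @{ $spec{optional} // [] };

    return 0 if ref($hash) ne 'HASH';

    for my $key (@required) {
        return 0 if !exists $hash->{$key};
    }
    return 1 if $spec{open};

    my %permitted = map { $_ => 1 } @required, @optional;
    for my $key (keys %$hash) {
        return 0 if !$permitted{$key};
    }
    return 1;
}

# Roughly: Dict[ name, rank?, ID, notes? ]
my %spec = ( required => [qw( name ID )], optional => [qw( rank notes )] );

print has_permitted_keys({ name => 'Kirk', ID => 1701 },           %spec), "\n";
print has_permitted_keys({ name => 'Kirk', ID => 1701, age => 34 }, %spec), "\n";
```

The first call prints 1; the second prints 0, because 'age' is not among the permitted keys. Adding open => 1 to the specification would make the second hash acceptable too.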

Eq[STR]

Accepts a value whose stringification is eq to 'STR'.

The string is always assumed to be non-interpolating.

Note that this type does not accept objects unless those objects overload stringification...even if the string specified would match the default 'MyClass=HASH(0x1d15ed17)' stringification of objects.

Match[PATTERN]

Accepts a value whose stringification matches the regex: m[PATTERN]x

The pattern is always assumed to have the /x modifier in effect. If you don't want that, you need to turn it off within the pattern:

    Match[      a b c ]     # accepts "abc"
    Match[(?-x) a b c ]     # accepts " a b c "

Note that this type does not accept objects unless those objects overload stringification...even if the pattern specified would match the default 'MyClass=HASH(0x1d15ed17)' stringification of objects.
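The effect of the implicit /x, and of switching it off with (?-x), can be checked with ordinary Perl regexes, without the module at all. In this small sketch the two helper subroutines are hypothetical names chosen for the illustration.

```perl
use strict;
use warnings;

# Like Match[      a b c ]: under /x, whitespace in the pattern is ignored
my $extended = qr[      a b c ]x;

# Like Match[(?-x) a b c ]: after (?-x), the spaces must match literally
my $literal  = qr[(?-x) a b c ]x;

sub matches_extended { return $_[0] =~ $extended ? 1 : 0 }
sub matches_literal  { return $_[0] =~ $literal  ? 1 : 0 }

for my $string ('abc', ' a b c ') {
    printf "%-9s  extended: %d  literal: %d\n",
        "'$string'", matches_extended($string), matches_literal($string);
}
```

The string 'abc' matches only the first pattern, and ' a b c ' matches only the second.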

Can[METHODNAME1, METHODNAME2, ETC]

Accepts any value that is either an object or a classname (i.e. Obj|Class) and for which $VALUE->can('METHODNAME') returns true for each of the methodnames specified.

If you need to be more specific as to whether the value itself is an object or a class, use a conjunction:

      Obj&Can[dump]    # i.e. $object->can('dump') returns true

    Class&Can[dump]    # i.e. MyClass->can('dump') returns true
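The underlying test is just Perl's built-in can() method, applied once per method name. Here is a minimal sketch of that idea; the helper can_all() and the Demo::Store class are hypothetical, and the module's real check may well differ in detail.

```perl
use strict;
use warnings;
use Scalar::Util 'blessed';

# Hypothetical class with two methods
{
    package Demo::Store;
    sub new     { return bless {}, shift }
    sub save    { return 1 }
    sub restore { return 1 }
}

# Hypothetical helper: true if $value is an object or a class name
# and can() finds every one of the named methods
sub can_all {
    my ($value, @method_names) = @_;

    return 0 if !defined $value || !length $value;
    return 0 if ref $value && !blessed $value;      # unblessed reference

    for my $method_name (@method_names) {
        return 0 if !eval { $value->can($method_name) };
    }
    return 1;
}

print can_all(Demo::Store->new, 'save', 'restore'), "\n";   # an object
print can_all('Demo::Store',    'save', 'restore'), "\n";   # a class name
print can_all('Demo::Store',    'save', 'delete'),  "\n";   # no delete() method
```

The first two calls print 1 and the third prints 0.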

Overloads[OP1, OP2, ETC]

Accepts any value that is either an object or a classname (i.e. Obj|Class) and for which overload::Method($VALUE,'OP') returns true for each of the ops specified.

If you need to be more specific as to whether the value itself is an object or a class, use a conjunction:

      Obj&Overloads["", 0+]   # object with overloaded stringification and numerification

    Class&Overloads["", 0+]   # class with overloaded stringification and numerification
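Because the test is defined in terms of overload::Method(), you can try it directly using only the core overload module. The following sketch assumes a hypothetical class Demo::Temperature and a hypothetical helper overloads_all(); neither is part of the module.

```perl
use strict;
use warnings;
use overload ();

# Hypothetical class that overloads stringification and numification
{
    package Demo::Temperature;
    use overload
        '""'     => sub { return $_[0]{degrees} . ' C' },
        '0+'     => sub { return $_[0]{degrees} },
        fallback => 1;

    sub new {
        my ($class, $degrees) = @_;
        return bless { degrees => $degrees }, $class;
    }
}

# Hypothetical helper: true if overload::Method() finds an
# implementation for every one of the listed operators
sub overloads_all {
    my ($value, @ops) = @_;

    return 0 if !defined $value;
    for my $op (@ops) {
        return 0 if !overload::Method($value, $op);
    }
    return 1;
}

my $temperature = Demo::Temperature->new(21);

print overloads_all($temperature,        '""', '0+'), "\n";  # an object
print overloads_all('Demo::Temperature', '""', '0+'), "\n";  # a class name
print overloads_all($temperature,        '+'),        "\n";  # '+' not overloaded
```

The first two calls print 1 and the third prints 0, since the class provides no explicit implementation of '+'.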

T1&T2

Accepts any value that both type T1 and type T2 individually accept. For example:

    Obj&Device                # blessed($VALUE) && $VALUE->isa('Device')

    Class&Match[^Internal::]  # an actual class whose name begins: Internal::

Note that there cannot be space between the & and either typename.

The & is associative, so you can add as many types as needed. For example, to accept only a hash-based object from a class in the Storable hierarchy, which must also have a valid restore() method:

    Obj&Hash&Storable&Can[restore]

The component type tests are performed left-to-right and short-circuit on any failure (like the normal Perl && operator), so it will often be an optimization to put the most expensive type tests at the end.

T1|T2

Accepts any value that either type T1 or type T2 individually accepts. For example:

    Str|Obj       # accepts either a string or an object

    Num|Undef     # accepts either a number or undef

    Array|Hash    # accepts either an array or hash reference

Note that there cannot be space between the | and either typename.

The | is associative, so you can add as many type checks as needed. For example, to accept a number or a specific string or a hash of integers:

    Num|Match[quit]|Hash[Int]

The component type tests are performed left-to-right and short-circuit on any success (like the normal Perl || operator), so it will often be an optimization to put the most expensive type tests at the end.

The | and & type compositors have the usual precedences, so you can combine them as expected. For example, to accept an object (of any kind), or else the name of a class in the Storable hierarchy:

    Obj|Class&Storable

If you need to circumvent the usual precedence, then use an Is[...].
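The left-to-right, short-circuiting behaviour of & and | can be modelled with a pair of small combinators over predicate subroutines. This is only a sketch of the evaluation order, assuming each type is represented by a predicate; all of the names in it (traced(), all_of(), any_of(), run_traced()) are hypothetical.

```perl
use strict;
use warnings;
use Scalar::Util qw( blessed reftype );

my @trace;   # records which component tests actually ran

# Wrap a predicate so that each call to it is recorded under a label
sub traced {
    my ($label, $test) = @_;
    return sub { push @trace, $label; return $test->($_[0]) ? 1 : 0 };
}

# Like T1&T2&...: stop at the first failure
sub all_of {
    my @tests = @_;
    return sub { for my $test (@tests) { return 0 if !$test->($_[0]) } return 1 };
}

# Like T1|T2|...: stop at the first success
sub any_of {
    my @tests = @_;
    return sub { for my $test (@tests) { return 1 if $test->($_[0]) } return 0 };
}

# Run a check and return its result followed by the tests it invoked
sub run_traced {
    my ($check, $value) = @_;
    @trace = ();
    my $result = $check->($value);
    return ($result, @trace);
}

my $is_obj   = traced(Obj   => sub { defined blessed($_[0]) });
my $is_array = traced(Array => sub { (reftype($_[0]) // '') eq 'ARRAY' });
my $is_hash  = traced(Hash  => sub { (reftype($_[0]) // '') eq 'HASH'  });

my $array_or_hash = any_of($is_array, $is_hash);   # roughly: Array|Hash
my $obj_and_hash  = all_of($is_obj,   $is_hash);   # roughly: Obj&Hash

print join(' ', run_traced($array_or_hash, [])), "\n";   # Hash test never runs
print join(' ', run_traced($obj_and_hash,  {})), "\n";   # Hash test never runs
```

The first line printed is "1 Array" and the second is "0 Obj": in each case the second component test is skipped once the outcome is known. In this model, Obj|Class&Storable corresponds to any_of($obj, all_of($class, $storable)), because & binds more tightly than |.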

Is[T]

Accepts any value that type T itself would accept.

This construct may be used anywhere within a typename, but is mainly useful for "bracketing" types when composing them with | and &.

For example, to match an object of any class in the Storable or Disposable hierarchies, or any object that has a reset() method, using normal &/| precedence, you'd have to write:

    Obj&Storable|Obj&Disposable|Obj&Can[reset]

With Is[...], that's just:

    Obj&Is[Storable|Disposable|Can[reset]]

Not[T]

Accepts any value that type T itself would not accept.

For example:

    Not[Num]             # Anything except a number

    Not[Ref]             # Anything except a reference (i.e. a Value or undef)

    Not[Obj]             # Anything unblessed

    Not[Match[error]]    # Anything that doesn't match /error/x

    Not[Obj|Class]       # Anything you can't call methods on

    Not[Obj&Storable]    # Anything that isn't an object of class Storable
                         # (could still be an object of some other hierarchy
                         #  or else a classname in the Storable hierarchy)

User-defined types

Any other type specification that is a valid Perl identifier or qualified identifier is treated as a classname.

If the corresponding class exists, such a "classname type" accepts an object or classname in the corresponding class hierarchy. For example:

    Storable               # object or classname in the Storable hierarchy

    Disk::DVD::Rewritable  # object or classname in D::D::R hierarchy

Such user-defined types can be composed with each other and with all the other type specifiers listed above:

    Storable|Disk::DVD::Rewritable  # object or classname from either hierarchy

    Storable&Can[restore]           # a Storable with a restore() method

    Obj&Disk::DVD::Rewritable       # an object of the hierarchy

Type relationships

Most of the standard types and type compositors listed in the previous section form a single hierarchy, like so:

    Any
      \__Bool
           |___Undef
           |
            \__Def
                |__Value
                |     |___Num
                |     |     \__Int
                |     |
                |      \__Str
                |          |___Empty
                |           \__Class
                |
                 \__Ref
                     |___Ref[<T>]
                     |___Scalar
                     |___Regex
                     |___Code
                     |___Glob
                     |___IO
                     |___Obj
                     |___Array
                     |      |___Empty
                     |      |___Array[<T>]
                     |       \__Tuple[<T>, <T>, <T>, ...]
                     |
                      \__Hash
                           |___Empty
                           |___Hash[<T>]
                            \__Dict[<k> => <T>, <k>? => <T>, ...]

That is, a value that is accepted by any specific type in this diagram will also be accepted by all of its ancestral types. So, for example, the type Tuple[Str,Int] accepts the value ['A',1], so that same value will also be accepted by all of the following types (amongst many others): Tuple[Value,Int], Tuple[Def,Num], Tuple[Any,Bool], Array, Ref, Def, Bool, or Any.

However, the converse is not generally true: a value that is accepted by a "parent" type may not be accepted by all (or any) of its descendants. So while the type Array accepts the value ['A',{}], that same value will not be accepted by any of the "child" types: Empty, Array[Int], or Tuple[Int,Str].

INTERFACE

use Dios::Types::Pure 'validate';

The validate() subroutine is not exported by default, but must be explicitly requested.

use Dios::Types::Pure 'validator_for';

The validator_for() subroutine is not exported by default, but must be explicitly requested.

use Dios::Types::Pure 'validate' => 'OTHER_NAME', 'validator_for' => 'ANOTHER_NAME';

When importing validate() or validator_for(), you can request that the module rename it, by passing the desired alternative name as a second argument. For example:

    use Dios::Types::Pure 'validate' => 'typecheck';

    # and later...

    typecheck('Array', $data);

validate($type, $value, $value_desc, @constraint_subs)

This subroutine requires its first two arguments: a type specification and a scalar value. If the type accepts the value, the subroutine returns true. If the type doesn't accept the value, an exception is thrown.

For example:

    # Die if number of matches isn't an integer...
    validate('Int', $matches);

    # Die if any element isn't an open filehandle...
    validate('Array[IO]', \@filehandles);

    # Validate subroutine args...
    sub fill_text {
        validate('Str',                my $text  = shift);
        validate('Int',                my $width = shift);
        validate('Dict[fill?, just?]', my $opts  = shift);
        ...
    }

If you don't want the exception on failure, use an eval to defuse it:

    while (1) {
        say 'Enter an integer: ';
        my $input = readline;

        last if eval{ validate('Int', $input) };

        say "Warning: $@";
        redo;
    }

Describing the value passed to validate()

You can also pass one or more extra strings to validate(), which are used to improve the error messages produced for unacceptable values. Any extra arguments passed to the subroutine (that are not references) are concatenated together and used as the description of the value in the exception message. For example:

    my $input = 'seven';

    validate('Int', $input);
    # dies with: "Value ("seven") is not of type Int"

    validate('Int', $input, 'Error count reported by ', get_user_name());
    # dies with: "Error count reported by root is not of type Int"

If the description string contains a %s, it is used as a sprintf format, and the value itself interpolated for the %s. For example:

    validate('Int', 7.5, 'Error count (%s) reported by ', get_user_name());
    # dies with: "Error count (7.5) reported by root is not of type Int"

Constraining the value passed to validate()

Any other extra arguments must be subroutine references, and these are used as additional constraints on the type-checking.

That is, if the specified type accepts the value, that value is then passed to each constraint subroutine in turn. If any of those subroutines returns false or throws an exception, then the type is considered not to have matched the value.

For example:

    # Is $data a non-empty array of ints?
    validate('Array[Int]', $data, sub{ @{$_[0]} > 0 });

    # Is $filename a string in 8.3 format?
    validate('Str', $filename, sub{ $_[0] =~ /^\w{1,8}\.\w{3}$/ });

    # Is $config a valid and normalized hash?
    validate('Hash', $config, \&is_valid, \&is_normalized);

When the constraint subroutines are called, the value being validated is also temporarily aliased to $_, which sometimes simplifies the constraint:

    # Is $data a non-empty array of ints?
    validate('Array[Int]', $data, sub{ @$_ > 0 });

    # Is $filename a string in 8.3 format?
    validate('Str', $filename, sub{ /^\w{1,8}\.\w{3}$/ });

    # Is $ID an unused integer?
    validate('Int', $ID, sub{ !$used_ID[$_] });
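The way constraints are applied (each one receives the value as its argument and as $_, and either a false result or an exception counts as failure) can be sketched like so. This is an illustration of the described behaviour only, not the module's code; the helper name satisfies() is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if every constraint accepts the value
sub satisfies {
    my ($value, @constraints) = @_;

    for my $constraint (@constraints) {
        local $_ = $value;                              # alias-like access via $_
        my $passed = eval { $constraint->($value) };    # exceptions yield undef
        return 0 if !$passed;
    }
    return 1;
}

# Constraint reads its argument from @_
print satisfies([1, 2, 3], sub { @{$_[0]} > 0 }), "\n";

# Constraint reads the value from $_
print satisfies('README.txt', sub { /^\w{1,8}\.\w{3}$/ }), "\n";

# Constraint throws an exception, which is treated as a failure
print satisfies([], sub { @$_ > 0 or die 'must not be empty' }), "\n";
```

The three calls print 1, 1, and 0 respectively.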

When a constraint test fails, validate() does its best to produce a meaningful error message. For example, when $data isn't long enough:

    my $data = [];

    validate('Array[Int]', $data, sub{ @$_ > 0 });

...then the exception thrown is:

    Value ([]) did not satisfy the constraint: { @$_ > 0; }

which is accurate, but maybe not sufficiently enlightening for all users.

There are two ways of improving the message produced. If a constraint is specified as a named subroutine, as in the earlier example:

    validate('Hash', $config, \&is_valid, \&is_normalized);

then validate() attempts to convert the subroutine name into a description of the constraint:

    Value ({ a=>1, b=>2, c=>1 }) did not satisfy the constraint: is normalized

Alternatively, if a constraint subroutine throws an exception on failure, the text of the exception is used as the description of the constraint:

    validate('Array[Int]', $data, sub{ @$_ > 0 or die 'must not be empty' });

Now the exception thrown is:

    Value ([]) did not satisfy the constraint: must not be empty

Note that the two kinds of extra arguments to validate() (i.e. value description strings and constraint subroutines) can be passed in any order, or even intermixed, as there is no ambiguity in the meaning of sub references vs non-references.
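Separating the two kinds of extra arguments needs nothing more than a test for references. The following sketch shows one way that might be done; the helper split_extras() is hypothetical, and the module itself may treat non-CODE references differently.

```perl
use strict;
use warnings;

# Hypothetical helper: split validate()'s extra arguments into a
# description string and a list of constraint subroutines
sub split_extras {
    my @extras = @_;

    my $description = join '', grep { !ref($_) } @extras;
    my @constraints = grep { ref($_) eq 'CODE' } @extras;

    return ($description, @constraints);
}

# Descriptions and constraints intermixed, in any order
my ($description, @constraints) = split_extras(
    'Error count ',
    sub { $_ >= 0 },
    'reported by root',
);

print "Description: $description\n";
print "Constraints: ", scalar(@constraints), "\n";
```

Here the description becomes "Error count reported by root", and exactly one constraint subroutine is extracted.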

validator_for($type, $value_desc, @constraint_subs)

This subroutine requires its first argument: a type specification. It also accepts one or more additional arguments, specifying a description of the value being checked, and any constraints. All these arguments are exactly the same as for validate().

The validator_for() subroutine returns a reference to an anonymous subroutine that should be called with a single value, to check that the specified type accepts that value. If the type accepts the value, the anonymous subroutine returns true. If the type doesn't accept the value, an exception is thrown.

In other words, validator_for() returns the same subroutine that validate() would use to validate a value against a type. That is:

    validate($type, $value, $desc, @constraints);

is just a shorthand for:

    my $check = validator_for($type, $desc, @constraints);
    $check->($value);

Because validator_for() precompiles much of the checking API, it is usually a more efficient choice when you want to perform the same type check repeatedly. For example, to add type checking to a subroutine parameter, instead of:

    sub delay {
        my $wait = shift;
        validate('Int', $wait, sub { $_ > 0 });

        my $code = shift;
        validate('Code', $code);

        sleep $wait;
        goto &$code;
    }

you could precompile each parameter's type check (using state variables, which require use feature 'state' or use v5.10 or later):

    sub delay {
        state $check_wait = validator_for('Int', sub { $_ > 0 });
        $check_wait->( my $wait = shift );

        state $check_code = validator_for('Code');
        $check_code->( my $code = shift );

        sleep $wait;
        goto &$code;
    }

which would make the checking approximately three times faster.

DIAGNOSTICS

Can't export %s

The module exports only two subroutines: validate() and validator_for(). You asked it to export something else, which confused it.

If you were trying to export validate() under a different name, then you need:

    use Dios::Types::Pure validate => '<name>';

Two type specifications for key %s in Dict[%s]

The Dict[...] type allows you to specify that a value must be of type Hash, and must only contain specific keys.

You're supposed to list each such key just once inside the square brackets but you listed a key twice (or more). Delete all the repetitions.

If you repeated a key because you were trying to allow its value to have two or more alternative types, like so:

    Dict[name => Str, name => Undef]

then you need to write that using a single junctive type instead:

    Dict[name => Str|Undef]

Incomprehensible type name: %s

The type you specified wasn't one that the module understands. Review the syntax for standard types and user-defined types.

Invalid regex syntax in Match[%s]: %s

The contents of the square brackets must be a valid regex specification (i.e. something you could validly put in an m/.../ or a qr/.../).

The full error message should point to the bad regex syntax. If that message doesn't help, see perlre for details of the standard Perl regex syntax.

Missing specification for constraint: %s

You passed a constraint to validate, but it was not a subroutine reference. Every constraint must be specified as a reference to a subroutine that expects one argument (the value) and returns a boolean value indicating whether the value satisfied the constraint.

%s is not of type %s

This is the default message of the exception thrown by validate() if the value passed as its second argument doesn't match the type passed as its first argument.

%s did not satisfy the constraint: %s

This is the default message of the exception thrown by validate() if the value passed as its second argument failed to satisfy one of the constraint subroutines that were also passed to it.

CONFIGURATION AND ENVIRONMENT

Dios::Types::Pure requires no configuration files or environment variables.

DEPENDENCIES

Requires Perl 5.14 or later.

Requires the Data::Dump module.

If typed attributes or parameters are used, also requires the Variable::Magic module.

INCOMPATIBILITIES

None reported.

BUGS AND LIMITATIONS

No bugs have been reported.

Please report any bugs or feature requests to bug-dios-types@rt.cpan.org, or through the web interface at http://rt.cpan.org.

AUTHOR

Damian Conway <DCONWAY@cpan.org>

LICENCE AND COPYRIGHT

Copyright (c) 2015, Damian Conway <DCONWAY@cpan.org>. All rights reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.

DISCLAIMER OF WARRANTY

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.