Michael McClennen

NAME

HTTP::Validate - validate and clean HTTP parameter values according to a set of rules

Version 0.982

DESCRIPTION

This module provides validation of HTTP request parameters against a set of clearly defined rules. It is designed to work with Dancer, Mojolicious, Catalyst, and similar web application frameworks, both for interactive apps and for data services. It can also be used with CGI, although the use of CGI::Fast or another similar solution is recommended to avoid paying the penalty of loading this module and initializing all of the rulesets over again for each request. Both an object-oriented interface and a procedural interface are provided.

The rule definition mechanism is very flexible. A ruleset can be defined once and used with multiple URL paths, and rulesets can be combined using the rule types require and allow. This allows a complex application that accepts many different paths to apply common rule patterns. If the parameters fail the validation test, an error message is provided which tells the client how to amend the request in order to make it valid. A suite of built-in validator functions is available, and you can also define your own.

This module also provides a mechanism for generating documentation about the parameter rules. The documentation is generated in Pod format, which can then be converted to HTML, TeX, nroff, etc. as needed.

SYNOPSIS

    package MyWebApp;
    
    use HTTP::Validate qw{:keywords :validators};
    
    define_ruleset( 'filters' => 
        { param => 'lat', valid => DECI_VALUE('-90.0','90.0') },
            "Return all datasets associated with the given latitude.",
        { param => 'lng', valid => DECI_VALUE('-180.0','180.0') },
            "Return all datasets associated with the given longitude.",
        { together => ['lat', 'lng'], errmsg => "you must specify 'lng' and 'lat' together" },
            "If either 'lat' or 'lng' is given, the other must be as well.",
        { param => 'id', valid => POS_VALUE },
            "Return the dataset with the given identifier",
        { param => 'name', valid => ANY_VALUE },
            "Return all datasets with the given name");
    
    define_ruleset( 'display' => 
        { optional => 'full', valid => FLAG_VALUE },
            "If specified, then the full dataset descriptions are returned.  No value is necessary",
        { optional => 'short', valid => FLAG_VALUE },
            "If specified, then a brief summary of the datasets is returned.  No value is necessary",
        { at_most_one => ['full', 'short'] },
        { optional => 'limit', valid => [POS_ZERO_VALUE, ENUM_VALUE('all')], default => 'all',
          errmsg => "acceptable values for 'limit' are either 'all', 0, or a positive integer" },
            "Limits the number of results returned.  Acceptable values are 'all', 0, or a positive integer.");
    
    define_ruleset( 'dataset_query' =>
        "This URL queries for stored datasets.  The following parameters select the datasets",
        "to be displayed, and you must specify at least one of them:",
        { require => 'filters',
          errmsg => "you must specify at least one of the following: 'lat' and 'lng', 'id', 'name'" },
        "The following optional parameters control how the data is returned:",
        { allow => 'display' });
    
    # Validate the parameters found in %ARGS against the ruleset
    # 'dataset_query'.  This is just one example, and in general the parameters
    # may be found in various places depending upon which module (CGI,
    # Dancer, Mojolicious, etc.)  you are using to accept and process HTTP
    # requests.
    
    my $result = check_params('dataset_query', undef, \%ARGS);
    
    if ( my @error_list = $result->errors )
    {
        # if an error message was generated, do whatever is necessary to abort the
        # request and report the error back to the end user
    }
    
    # Otherwise, $result->values will return the cleaned parameter
    # values for use in processing the request.

THE VALIDATION PROCESS

The validation process starts with the definition of one or more sets of rules. This is done via the "define_ruleset" keyword. For example:

    define_ruleset 'some_params' =>
        { param => 'id', valid => POS_VALUE },
        { param => 'short', valid => FLAG_VALUE },
        { param => 'full', valid => FLAG_VALUE },
        { at_most_one => ['short', 'full'],
          errmsg => "the parameters 'short' and 'full' cannot be used together" };

This statement defines a ruleset named 'some_params' that enforces the following rules:

  • The value of parameter 'id' must be a positive integer.

  • The parameter 'short' is considered to have a true value if it appears in a request, and false otherwise. The value, if any, is ignored.

  • The parameter 'full' is treated likewise.

  • The parameters 'short' and 'full' must not be specified together in the same request.

You can define as many rulesets as you wish. For each URL path recognized by your code, you can use the "check_params" function to validate the request parameters against the appropriate ruleset for that path. If the given parameter values are not valid, one or more error messages will be returned. These messages should be sent back to the HTTP client, in order to instruct the user or programmer who originally generated the request how to amend the parameters so that the request will succeed.

During the validation process, a set of parameter values is considered to "pass" against a given ruleset if it is consistent with all of that ruleset's rules. Rulesets may be included inside other rulesets by means of "allow" and "require" rules. This allows you to define common rulesets to validate various groups of parameters, and then combine them together into specific rulesets for use with different URL paths.

A ruleset is considered to be "fulfilled" by a request if at least one parameter mentioned in a "param" or "mandatory" rule is included in that request, or trivially if the ruleset does not contain any rules of those types. When you use "check_params" to validate a request against a particular ruleset, the request will be rejected unless the following are both true:

  • The request passes against the specified ruleset and all those that it includes.

  • The specified ruleset is fulfilled, along with any other rulesets included by "require" rules. Rulesets included by "allow" rules do not have to be fulfilled.

This provides you with a lot of flexibility as to requiring or not requiring various parameters. Note that a ruleset without any "param" or "mandatory" rules is automatically fulfilled, which allows you to make all of the parameters optional if you wish. You can augment this mechanism by using "together" and "at_most_one" rules to specify which parameters must or must not be used together.
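
The difference between passing and being fulfilled can be seen in a small sketch. The ruleset names 'selectors', 'options' and 'lookup' and the sample requests are made up for this illustration. The argument order for check_params (ruleset name, context, parameters) follows the description given under INTERFACE.

    use strict;
    use warnings;
    use HTTP::Validate qw(:keywords :validators);

    # 'selectors' contains a "param" rule, so it is fulfilled only
    # when 'id' is present with a valid value.
    define_ruleset( 'selectors' =>
        { param => 'id', valid => POS_VALUE } );

    # 'options' contains only "optional" rules, so it is always fulfilled.
    define_ruleset( 'options' =>
        { optional => 'limit', valid => POS_ZERO_VALUE } );

    # 'selectors' must pass and be fulfilled; 'options' only has to pass.
    define_ruleset( 'lookup' =>
        { require => 'selectors' },
        { allow => 'options' } );

    my %requests = (
        complete  => { id => 12, limit => 5 },   # passes, 'selectors' fulfilled
        no_id     => { limit => 5 },             # passes, but 'selectors' not fulfilled
        malformed => { id => 'abc' },            # 'id' fails its validator
    );

    for my $label ( sort keys %requests ) {
        my $result = check_params('lookup', undef, $requests{$label});
        printf "%-10s %s\n", $label, $result->passed ? 'accepted' : 'rejected';
    }

Only the request labelled 'complete' should be accepted. The other two are rejected for different reasons: one because a required ruleset is not fulfilled, the other because a value does not pass.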

Ruleset names

Each ruleset must have a unique name, which can be any non-empty string. You may name them after paths, parameters, functionality ("display", "filter") or whatever else makes sense to you.

Ordering of rules

The rules in a given ruleset are always checked in the order they were defined. Rulesets that are included via "allow" and "require" rules are checked immediately when the including rule is evaluated. Each ruleset is checked at most once per validation, even if it is included multiple times.

You should be cautious about including multiple parameter rules that correspond to the same parameter name, as this can lead to situations where no possible value is correct.

Unrecognized parameters

By default, a request will be rejected with an appropriate error message if it contains any parameters not mentioned in any of the checked rulesets. This can be overridden (see below) to generate warnings instead. However, please think carefully before choosing this option. Allowing unrecognized parameters opens up the possibility that optional parameters will be accidentally misspelled and thus ignored, so that the results are mysteriously different from what was expected. If you override this behavior, you should make sure that any resulting warnings are explicitly displayed in the response that you generate.

Rule syntax

Every rule is represented by a hashref that contains a key indicating the rule type. For clarity, you should always write this key first. It is an error to include more than one of these keys in a single rule. You may optionally include additional keys to specify the acceptable values for the parameter, the error message to be returned if the parameter value is not acceptable, and many other options.

parameter rules

The following three types of rules define the recognized parameter names.

param

    { param => <parameter_name>, valid => <validator> ... }

If the specified parameter is present with a non-empty value, then its value must pass one of the specified validators. If it passes any of them, the rest are ignored. If it does not pass any of them, then an appropriate error message will be generated. If no validators are specified, then the value will be accepted no matter what it is.

If the specified parameter is present and its value is valid, then the containing ruleset will be marked as "fulfilled". You could use this, for example, with a query URL in order to require that the query not be empty but instead contain at least one significant criterion. The parameters that count as "significant" would be declared by param rules, the others by optional rules.

optional

    { optional => <parameter_name>, valid => <validator> ... }

An optional rule is identical to a param rule, except that the presence or absence of the parameter will have no effect on whether or not the containing ruleset is fulfilled. A ruleset in which all of the parameter rules are optional will always be fulfilled. This kind of rule is useful in validating URL parameters, especially for GET requests.

mandatory

    { mandatory => <parameter_name>, valid => <validator> ... }

A mandatory rule is identical to a param rule, except that this parameter is required to be present with a non-empty value regardless of the presence or absence of other parameters. If it is not, then an error message will be generated. This kind of rule can be useful when validating HTML form submissions, for use with fields such as "name" that must always be filled in.

parameter constraint rules

The following rule types can be used to specify additional constraints on the presence or absence of parameter names.

together

    { together => [ <parameter_name> ... ] }

If one of the listed parameters is present, then all of them must be. This can be used with parameters such as 'longitude' and 'latitude', where neither one makes sense without the other.

at_most_one

    { at_most_one => [ <parameter_name> ... ] }

At most one of the listed parameters may be present. This can be used along with a series of param rules to require that exactly one of a particular set of parameters is provided.

ignore

    { ignore => [ <parameter_name> ... ] }

The specified parameter or parameters will be ignored if present, and will not be included in the set of reported parameter values. This rule can be used to prevent requests from being rejected with "unrecognized parameter" errors in cases where spurious parameters may be present. If you are specifying only one parameter name, it need not be in a listref.

inclusion rules

The following rule types can be used to include one ruleset inside of another. This allows you, for example, to define rulesets for validating different groups of parameters and then combine them into specific rulesets for use with different URL paths.

It is okay for an included ruleset to itself include other rulesets. A given ruleset is checked at most once per validation no matter how many times it is included.

allow

    { allow => <ruleset_name> }

A rule of this type is essentially an 'include' statement. If this rule is encountered during a validation, it causes the named ruleset to be checked immediately. The parameters must pass against this ruleset, but it does not have to be fulfilled.

require

    { require => <ruleset_name> }

This is a variant of allow, with an additional constraint. The validation will fail unless the named ruleset not only passes but is also fulfilled by the parameters. You could use this, for example, with a query URL in order to require that the query not be empty but instead contain at least one significant criterion. The parameters that count as "significant" would be declared by "param" rules, the others by "optional" rules.

inclusion constraint rules

The following rule types can be used to specify additional constraints on the inclusion of rulesets.

require_one

    { require_one => [ <ruleset_name> ... ] }

You can use a rule of this type to place an additional constraint on a list of rulesets already included with inclusion rules. Exactly one of the named rulesets must be fulfilled, or else the request is rejected. You can use this, for example, to ensure that a request includes either a parameter from group A or one from group B, but not both.

require_any

    { require_any => [ <ruleset_name> ... ] }

This is a variant of require_one. At least one of the named rulesets must be fulfilled, or else the request will be rejected.

allow_one

    { allow_one => [ <ruleset_name> ... ] }

Another variant of require_one. The request will be rejected if more than one of the listed rulesets is fulfilled, but will pass if either none of them or just one of them is fulfilled. This can be used to allow optional parameters from either group A or group B, but not from both groups.

other rules

content_type

    { content_type => <parameter_name>, valid => [ <value> ... ] }

You can use a rule of this type, if you wish, to direct that the value of the specified parameter be used to indicate the content type of the response. Only one of these rules should occur in any given validation. The key valid gives a list of acceptable values and the content types they should map to. For example, if you are using this module with Dancer then you could do something like the following:

    define_ruleset '/some/path' =>
        { require => 'some_params' },
        { allow => 'other_params' },
        { content_type => 'ct', valid => ['html', 'json', 'frob=application/frobnicate'] };
    
    get '/some/path.:ct' => sub {
    
        # The second argument is the (unused) context; the parameters follow it.
        my $valid_request = check_params('/some/path', undef, params);
        content_type $valid_request->content_type;
        ...
    };

This code specifies that the content type of the response will be set by the URL path suffix, which may be either .html, .json or .frob.

If the value given in a request does not occur in the list, or if no value is found, then an error message will be generated that lists the accepted types.

To match an empty parameter value, include a string that looks like '=some/type'. You need not specify the actual content type string for the well-known types 'html', 'json', 'xml', 'txt' or 'csv', unless you wish to override the default given by this module.

Rule attributes

Any rule definition may also include one or more of the following attributes, specified as key/value pairs in the rule hash:

errmsg

This attribute specifies the error message to be returned if the rule fails, overriding the default message. For example:

    define_ruleset( 'specifier' => 
        { param => 'name', valid => ANY_VALUE },
        { param => 'id', valid => POS_VALUE });
    
    define_ruleset( 'my_route' =>
        { require => 'specifier', 
          errmsg => "you must specify either of the parameters 'name' or 'id'" });

Error messages may include any of the following placeholders: {param}, {value}. These are replaced respectively by the relevant parameter name(s) and original parameter value(s), single-quoted. This feature allows you to define messages that quote the actual parameter values presented in the request, as well as to define common messages and use them with multiple rules.
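
As a brief sketch of the placeholders in use, the following defines a single message template and shares it between two rules. The ruleset name 'by_number', the parameter names and the message wording are invented for this example, and the exact text printed depends on how the module quotes the substituted values.

    use strict;
    use warnings;
    use HTTP::Validate qw(:keywords :validators);

    # One message template, reused by several rules.
    my $NOT_POSITIVE = "the value of {param} must be a positive integer (was {value})";

    define_ruleset( 'by_number' =>
        { param => 'id',   valid => POS_VALUE, errmsg => $NOT_POSITIVE },
        { param => 'page', valid => POS_VALUE, errmsg => $NOT_POSITIVE } );

    my $result = check_params('by_number', undef, { id => 'abc', page => 2 });

    # Only 'id' is invalid, so a single message is expected, naming
    # that parameter and quoting the rejected value.
    print "$_\n" for $result->errors;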

warn

This attribute causes a warning to be generated rather than an error if the rule fails. Unlike errors, warnings do not cause a request to be rejected. At the end of the validation process, the list of generated warnings can be retrieved by using the "warnings" method of the result object.

If the value of this key is 1, then what would otherwise be the error message will be used as the warning message. Otherwise, the specified string will be used as the warning message.

For parameter rules, this attribute affects only errors resulting from validation of the parameter values. Other error conditions (e.g. multiple parameter values without the "multiple" attribute) continue to be reported as errors.

key

The attribute 'key' specifies the name under which any information generated by the rule will be saved. For a parameter rule, the cleaned value will be saved under this name. For all rules, any generated warnings or errors will be stored under the specified name instead of the parameter name or rule number. This allows you to easily determine after a validation which warnings or errors were generated.

The following keys can be used only with rules of type "param", "optional" or "mandatory":

valid

This attribute specifies the domain of acceptable values for the parameter. The value must be either a single code reference or a list of them. You can either select from the list of built-in validator functions included with this module, or provide your own.

If the parameter named by this rule is present, its value must pass at least one of the specified validators or else an error message will be generated. If multiple validators are given, then the error message returned will be the one generated by the last validator in the list. This can be overridden by using the "errmsg" key.

multiple

This attribute specifies that the parameter may appear multiple times in the request. Without this directive, multiple values for the same parameter will generate an error. For example:

    define_ruleset( 'identifiers' => 
        { param => 'id', valid => POS_VALUE, multiple => 1 });

If this attribute is present with a true value, then the cleaned value of the parameter will be an array ref if at least one valid value was found and undef otherwise. If you wish a request to be considered valid even if some of the values fail the validator, then either use the "list" attribute instead or include a "warn" key as well.

split

This attribute has the same effect as "multiple", and in addition causes each parameter value string to be split ("split" in perlfunc) as indicated by the value of the directive. If this value is a string, then it will be compiled into a regexp preceded and followed by \s*. So in the following example:

    define_ruleset( 'identifiers' =>
        { param => 'id', valid => POS_VALUE, split => ',' });

The value string will be considered to be valid if it contains one or more positive integers separated by commas and optional whitespace. Empty strings between separators are ignored.

    123,456             # returns [123, 456]
    123 , ,456          # returns [123, 456]
    , 456               # returns [456]
    123 456             # not valid
    123:456             # not valid

If you wish more precise control over the separator expression, you can pass a regexp quoted with qr instead.
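
The splitting behavior described above can be approximated in plain Perl. The helper split_values below is hypothetical and is not part of this module. It only mimics the documented steps (wrap a string separator in \s*, split, drop empty strings) so that you can try out a separator before using it in a rule.

    use strict;
    use warnings;

    # Hypothetical stand-in for the documented behavior of "split".
    # A plain string is wrapped in \s* on both sides; a qr// object is
    # used exactly as given.
    sub split_values {
        my ($raw, $separator) = @_;
        my $regexp = ref $separator eq 'Regexp'
            ? $separator
            : qr{\s*$separator\s*};
        # Empty strings between separators are discarded.
        return [ grep { $_ ne '' } split $regexp, $raw ];
    }

    for my $raw ( '123,456', '123 , ,456', ', 456', '123 456' ) {
        my $values = split_values($raw, ',');
        printf "%-14s => %s\n", "'$raw'", join(' | ', @$values);
    }

    # A qr// separator gives finer control, here commas or semicolons.
    my $values = split_values('1;2, 3', qr{\s*[,;]\s*});
    print join(' | ', @$values), "\n";

Note that '123 456' comes back as the single candidate value '123 456'. It is the validator (POS_VALUE in the example above), not the splitting step, that then rejects it.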

list

This attribute has the same effect as "split", but generates warnings instead of error messages when invalid values are encountered (as if warn => 1 was also specified). The resulting cleaned value will be a listref containing any values which pass the validator, or undef if no valid values were found. See also "warn" and "bad_value".
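
A minimal sketch of "list" in use follows. The ruleset name 'id_list' and the sample value are made up for illustration, and the behavior noted in the comments is what the description above leads one to expect rather than a guaranteed result.

    use strict;
    use warnings;
    use HTTP::Validate qw(:keywords :validators);

    define_ruleset( 'id_list' =>
        { param => 'id', valid => POS_VALUE, list => ',' } );

    my $result = check_params('id_list', undef, { id => '12,abc,7' });

    # The request is expected to pass, because the invalid item 'abc'
    # produces a warning rather than an error.
    print "passed\n" if $result->passed;

    # The cleaned value should be a listref holding only the valid items.
    my $ids = $result->value('id');
    print "ids: @$ids\n" if ref $ids eq 'ARRAY';

    # Warnings should be shown to the client so that the dropped
    # value does not go unnoticed.
    print "warning: $_\n" for $result->warnings;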

bad_value

This attribute can be useful in conjunction with "list". If one or more values are given for the parameter but none of them are valid, this attribute comes into effect. If the value of this attribute is ERROR, then the validation will fail with an appropriate error message. Otherwise, this will be used as the value of the parameter. It is recommended that you set the value to something outside of the valid range, e.g. -1 for a POS_VALUE parameter.

Using this attribute allows you to easily distinguish between the case when the parameter appears with an empty value (or not at all, which is considered equivalent) vs. when the parameter appears with one or more invalid values and no good ones.

alias

This attribute specifies one or more aliases for the parameter name (use a listref for multiple aliases). These names may be used interchangeably in requests, but any request that contains more than one of them will be rejected with an appropriate error message unless "multiple" is also specified. The parameter value and any error or warning messages will be reported under the main parameter name for this rule, no matter which alias is used in the request.

clean

This attribute specifies a subroutine which will be used to modify the parameter values. This routine will be called with the raw value of the parameter as its only argument, once for each value if multiple values are allowed. The resulting values will be stored as the "cleaned" values. The value of this directive may be either a code ref or one of the strings 'uc', 'lc' or 'fc'. These direct that the parameter values be converted to uppercase, lowercase, or fold case respectively.

default

This attribute specifies a default value for the parameter, which will be reported if the parameter is not present in the request or if it is present with an empty value. If the rule also includes a validator and/or a cleaner, the specified default value will be passed to it when the ruleset is defined. An exception will be thrown if the default value does not pass the validator.

undocumented

If this attribute is given with a true value, then this rule will be ignored by any calls to "document_params". This feature allows you to include parameters that are recognized as valid but that are not included in any generated documentation. Such parameters will be invisible to users, but will be visible and clearly marked to anybody browsing your source code.

Documentation

A ruleset definition may include strings interspersed with the rule definitions (see the example at the top of this page) which can be turned into documentation in Pod format by means of the "document_params" keyword. It is recommended that you use this function to auto-generate the PARAMETERS section of the documentation pages for the various URL paths accepted by your web application, translating the output from Pod to whatever format is appropriate. This will help you to keep the documentation and the actual rules in synchrony with one another.

The generated documentation will consist of one or more item lists, separated by ordinary paragraphs. Each parameter rule will generate one item, whose body consists of the documentation strings immediately following the rule definition. Ordinary paragraphs (see below) can be used to separate the parameters into groups for documentation purposes, or at the start or end of a list as introductory or concluding material. Each "require" or "allow" rule causes the documentation for the indicated ruleset(s) to be interpolated, except as noted below. Note that this subsidiary documentation will not be nested. All of the parameters will be documented at the same list indentation level, whether or not they are defined in subsidiary rulesets.

Documentation strings may start with one of the following special characters:

>>

The remainder of this string, plus any strings immediately following, will appear as an ordinary paragraph. You can use this feature to provide commentary paragraphs separating the documented parameters into groups. Any documentation strings occurring before the first parameter rule definition, or following an allow or require rule, will always generate ordinary paragraphs regardless of whether they start with this special character.

>

The remainder of this string, plus any strings immediately following, will appear as a new paragraph of the same type as the preceding paragraph (item body or ordinary paragraph).

!

The preceding rule definition will be ignored by any calls to "document_params", and all documentation for this rule will be suppressed. This is equivalent to specifying the rule attribute "undocumented".

^

Any documentation generated for the preceding rule definition will be suppressed. The remainder of this string plus any strings immediately following will appear as an ordinary paragraph in its place. You can use this, for example, to document a subsidiary ruleset with an explanatory note (e.g. a link to another documentation section or page) instead of explicitly listing all of the included parameters.

?

This character is ignored at the beginning of a documentation string, and the next character loses any special meaning it might have had. You can use this in the unlikely event that you want a documentation paragraph to actually start with one of these special characters.

Note that modifier rules such as at_most_one, require_one, etc. are ignored when generating documentation. Any documentation strings following them will be treated as if they apply to the most recently preceding parameter rule or inclusion rule.

INTERFACE

This module can be used in either an object-oriented or a procedural manner. To use the object-oriented interface, generate a new instance of HTTP::Validate and use any of the routines listed below as methods:

    use HTTP::Validate qw(:validators);
    
    my $validator = HTTP::Validate->new();
    
    $validator->define_ruleset('my_params' =>
        { param => 'foo', valid => INT_VALUE, default => '0' });
    
    my $result = $validator->check_params('my_params', undef, \%ARGS);

Otherwise, you can export these routines to your module and call them directly. In this case, a global ruleset namespace will be assumed:

    use HTTP::Validate qw(:keywords :validators);
    
    define_ruleset('my_params' =>
        { param => 'foo', valid => INT_VALUE, default => '0' });
    
    my $validated = check_params('my_params', undef, \%ARGS);

Using :keywords will import all of the keywords listed below, except 'new'. Using :validators will import all of the validators listed below.

The following can be called either as subroutines or as method names, depending upon which paradigm you prefer:

new

This can be called as a class method to generate a new validation instance (see example above) with its own ruleset namespace. Any of the arguments that can be passed to "validation_settings" can also be passed to this routine.

define_ruleset

This keyword defines a set of rules to be used for validating parameters. The first argument is the ruleset's name, which must be unique within its namespace. The remaining arguments must be a list of rules (hashrefs) interspersed with documentation strings. For examples, see above.

check_params

    my $result = check_params('my_ruleset', undef, params('query'));
    
    if ( $result->passed )
    {
        # process the request using the keys and values returned by 
        # $result->values
    }
    
    else
    {
        # redisplay the form, send an error response, or otherwise handle the
        # error condition using the error messages returned by $result->errors
    }

This function validates a set of parameters and values (which may be provided either as one or more hashrefs or as a flattened list of keys and values or a combination of the two) against the named ruleset with the specified context. It returns a response object from which you can get the cleaned parameter values along with any errors or warnings that may have been generated.

The second parameter must be either a hashref or undefined. If it is defined, it is passed to each of the validator functions as "context". This allows you to provide attributes such as a database handle to the validator functions. The third parameter must be either a hashref or a listref containing parameter names and values. If it is a listref, any items at the beginning of the list which are themselves hashrefs will be expanded before the list is processed (this allows you, for example, to pass in a hashref plus some additional names and values without having to modify the hashref in place).

You can use the "passed" method on the returned object to determine if the validation passed or failed. In the latter case, you can return an HTTP error response to the user, or perhaps redisplay a submitted form.

Note that you can validate against multiple rulesets at once by defining a new ruleset with inclusion rules referring to all of the rulesets you wish to validate against.

validation_settings

This function allows you to change the settings on the validation routine. For example:

    validation_settings( allow_unrecognized => 1 );

If you are using this module in an object-oriented way, then you can also pass any of these settings as parameters to the constructor method. Available settings include:

allow_unrecognized

If specified, then unrecognized parameters will generate warnings instead of errors.

ignore_unrecognized

If specified, then unrecognized parameters will be ignored entirely.

You may also specify one or more of the following keys, each followed by a string. These allow you to redefine the messages that are generated when parameter errors are detected:

ERR_INVALID, ERR_BAD_VALUES, ERR_MULT_NAMES, ERR_MULT_VALUES, ERR_MANDATORY, ERR_TOGETHER, ERR_AT_MOST, ERR_REQ_SINGLE, ERR_REQ_MULT, ERR_REQ_ONE, ERR_MEDIA_TYPE, ERR_DEFAULT

For example:

    validation_settings( ERR_MANDATORY => 'Missing mandatory parameter {param}',
                         ERR_MULT_VALUES => 'Found {value} for {param}: only one value is allowed' );
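
The effect of allow_unrecognized can be seen by giving the same request to two validator instances. This is only a sketch: the ruleset name 'search', the parameter names and the deliberately misspelled 'limt' are invented for illustration.

    use strict;
    use warnings;
    use HTTP::Validate qw(:validators);

    # Each instance has its own ruleset namespace and its own settings.
    my $strict  = HTTP::Validate->new();
    my $lenient = HTTP::Validate->new( allow_unrecognized => 1 );

    for my $validator ( $strict, $lenient ) {
        $validator->define_ruleset( 'search' =>
            { param => 'q' },
            { optional => 'limit', valid => POS_VALUE } );
    }

    # 'limt' is a misspelling of 'limit' and matches no rule.
    my %request = ( q => 'trilobite', limt => 10 );

    my $strict_result  = $strict->check_params('search', undef, \%request);
    my $lenient_result = $lenient->check_params('search', undef, \%request);

    print "strict:  ", ( $strict_result->passed  ? 'accepted' : 'rejected' ), "\n";
    print "lenient: ", ( $lenient_result->passed ? 'accepted' : 'rejected' ), "\n";

    # With allow_unrecognized the misspelling is reported as a warning,
    # which should be passed on to the client.
    print "warning: $_\n" for $lenient_result->warnings;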

ruleset_defined

    if ( ruleset_defined($ruleset_name) ) {
        # then do something
    }

This function returns true if a ruleset has been defined with the given name, false otherwise.

document_params

This function generates documentation for the given ruleset, in Pod format. This only works if you have included documentation strings in your calls to "define_ruleset". The function returns undef if the specified ruleset is not found.

    $my_doc = document_params($ruleset_name);

This capability has been included in order to simplify the process of documenting web services implemented using this module. The author has noticed that documentation is much easier to maintain and more likely to be kept up-to-date if the documentation strings are located right next to the relevant definitions.

Any parameter rules that you wish to leave undocumented should either be given the attribute 'undocumented' or be immediately followed by a string starting with "!". All others will automatically generate list items in the resulting documentation, even if no documentation string is provided (in this case, the item body will be empty).
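
As one possible way of turning the generated Pod into something readable, the sketch below feeds it to Pod::Simple::Text, which ships with Perl. The ruleset name 'doc_demo' and its parameters are invented. Prepending "=pod" is a precaution that assumes the returned fragment may begin with an ordinary paragraph rather than a Pod command.

    use strict;
    use warnings;
    use HTTP::Validate qw(:keywords :validators);
    use Pod::Simple::Text;

    define_ruleset( 'doc_demo' =>
        "The following parameters select the records to return:",
        { param => 'id', valid => POS_VALUE },
            "Return the record with the given identifier.",
        { optional => 'debug', undocumented => 1 },
        { optional => 'limit', valid => POS_ZERO_VALUE },
            "Limits the number of results returned." );

    my $pod = document_params('doc_demo');

    # Render the Pod fragment as plain text into $text.
    my $parser = Pod::Simple::Text->new;
    $parser->output_string(\my $text);
    $parser->parse_string_document("=pod\n\n$pod");

    print $text;

The same Pod string could instead be handed to any other Pod formatter, for example one that produces HTML.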

list_params

This function returns a list of the names of all parameters accepted by the specified ruleset, including those accepted by included rulesets.

    my @parameter_names = list_params($ruleset_name);

This may be useful if your validations allow unrecognized parameters, as it enables you to determine which of the parameters in a given request are significant to that request.

OTHER METHODS

The result object returned by "check_params" provides the following methods:

passed

Returns true if the validation passed, false otherwise.

errors

In a scalar context, this returns the number of errors generated by this validation. In a list context, it returns a list of error messages. If an argument is given, only messages whose key equals the argument are returned.

error_keys

Returns the list of keys for which error messages were generated.

warnings

In a scalar context, this returns the number of warnings generated by the validation. In a list context, it returns a list of warning messages. If an argument is given, only messages whose key equals the argument are returned.

warning_keys

Returns the list of keys for which warning messages were generated.

keys

In a scalar context, this returns the number of parameters that had valid values. In a list context, it returns a list of parameter names in the order they were recognized. Individual parameter values can be retrieved using either "values" or "value".

values

Returns the hash of clean parameter values. This is not a copy, so any modifications you make to it will be reflected in subsequent calls to "value".

value

Returns the value of the specified parameter, or undef if that parameter was not specified in the request or if its value was invalid.

specified

Returns true if the named parameter was supplied in the request with at least one value, whether or not that value was valid. Returns false otherwise.

raw

Returns a hash of the raw parameter values as originally provided to "check_params". Multiple values are represented by array refs. The result of this method can be used, for example, to redisplay a web form if the submission resulted in errors.

content_type

This returns the content type specified by the request parameters. If none was specified, or if no content_type rule was included in the validation, it returns undef.

VALIDATORS

Parameter rules can each include one or more validator functions under the key valid. The job of these functions is two-fold: first to check for good parameter values, and second to generate cleaned values.

This module provides a number of validators, and you can also specify a reference to a function of your own.

Predefined validators

INT_VALUE

This validator accepts any integer, and rejects all other values. It returns a numeric value, generated by adding 0 to the raw parameter value.

INT_VALUE(min,max)

This validator accepts any integer between min and max (inclusive). If either min or max is undefined, that bound will not be tested.

POS_VALUE

This is an alias for INT_VALUE(1).

POS_ZERO_VALUE

This is an alias for INT_VALUE(0).

DECI_VALUE

This validator accepts any decimal number, including exponential notation, and rejects all other values. It returns a numeric value, generated by adding 0 to the parameter value.

DECI_VALUE(min,max)

This validator accepts any real number between min and max (inclusive). Specify these bounds in quotes (i.e. as string arguments) if non-zero so that they will appear properly in error messages. If either min or max is undefined, that bound will not be tested.
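
The reason for quoting the bounds is ordinary Perl stringification rather than anything specific to this module: a numeric literal loses its trailing zeros when it is converted to a string, so a bound written as 90.0 would read as 90 in a message. A small demonstration, independent of HTTP::Validate (the message text is invented for the example):

```perl
use strict;
use warnings;

my $numeric_bound = -90.0;     # stored as a number
my $string_bound  = '-90.0';   # stored as a string

# Interpolation shows how each bound would read inside a message.
my $from_number = "must be at least $numeric_bound";
my $from_string = "must be at least $string_bound";

print "$from_number\n";   # the trailing ".0" is gone
print "$from_string\n";   # the text is preserved exactly

# Both forms still compare equal as numbers.
print "numerically equal\n" if $numeric_bound == $string_bound;
```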

MATCH_VALUE

This validator accepts any string that matches the specified pattern, and rejects any that does not. If you specify the pattern as a string, it will be converted into a regexp with ^ prepended, $ appended, and the modifier "i" added. If you specify the pattern using qr, then it is used unchanged. Any rule that uses this validator should be given an explicit error message (for example with errmsg), since the default error message is necessarily not very informative. The value is not cleaned in any way.
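
To make the difference between the two forms concrete, the following sketch imitates the conversion described above in plain Perl. The helper name pattern_to_regexp is hypothetical, and the code is an approximation of the described behavior, not the module's actual implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: mimic how a pattern argument is described as being treated.
sub pattern_to_regexp {
    my ($pattern) = @_;
    return $pattern if ref $pattern eq 'Regexp';   # qr// is used unchanged
    return qr/^$pattern$/i;                        # string: anchored, case-insensitive
}

my $from_string = pattern_to_regexp('ab\d+');
my $from_qr     = pattern_to_regexp(qr/ab\d+/);

# Compare how each form treats a differently-cased value and an embedded match.
for my $value ('AB12', 'xab12y') {
    printf "%-7s string form: %-3s qr form: %s\n", $value,
        ($value =~ $from_string ? 'yes' : 'no'),
        ($value =~ $from_qr     ? 'yes' : 'no');
}
```

Under this reading, a qr pattern that is meant to match the whole value needs its own anchors and, if wanted, its own "i" modifier.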

ENUM_VALUE(string,...)

This validator accepts any of the specified string values, and rejects all others. Comparisons are case insensitive. If the version of Perl is 5.016 or greater, or if the module Unicode::Casefold is available and has been required, then the fc function will be used instead of the usual lc when comparing values. The cleaned value will be the matching string value from this call.

If any of the strings is '#', then subsequent values will be accepted but not reported in the standard error message as allowable values. This allows for undocumented values to be accepted.

BOOLEAN_VALUE

This validator is used for parameters that take a true/false value. It accepts any of the following values: "yes", "no", "true", "false", "on", "off", "1", "0", compared case insensitively. It returns an error if any other value is specified. The cleaned value will be 1 or 0.

FLAG_VALUE

This validator should be used for parameters that are considered to be "true" if present with an empty value. The validator returns a value of 1 in this case, and behaves like 'BOOLEAN_VALUE' otherwise.

ANY_VALUE

This validator accepts any non-empty value. Using this validator is equivalent to not specifying any validator at all.

Reusing validators

Every time you use a parametrized validator such as INT_VALUE(0,10), a new closure is generated. If you are repeating a particular set of parameters many times, to save space you may want to instantiate the validator just once:

    my $zero_to_ten = INT_VALUE(0,10);
    
    define_ruleset( 'foo' =>
        { param => 'bar', valid => $zero_to_ten },
        { param => 'baz', valid => $zero_to_ten });

Writing your own validator functions

If you wish to validate parameters which do not match any of the validators described above, you can write your own validator function. Validator functions are called with two arguments:

    ($value, $context)

Here, $value is the raw parameter value and $context is a hash ref provided when the validation process is initiated (or an empty hash ref if none is provided). This makes it possible to pass information such as database handles to the validator functions.
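
As an illustration of this calling convention, the sketch below shows a validator that reads a lookup table from $context rather than from a global variable. The function name known_region and the context key regions are invented for this example; the return values follow the conventions described in the rest of this section.

```perl
use strict;
use warnings;

# Hypothetical validator: accept only region codes listed in the context.
sub known_region {
    my ($value, $context) = @_;

    # The context key 'regions' is an assumption made for this example.
    my $regions = $context->{regions} || {};

    return if exists $regions->{$value};    # valid, no cleaning needed
    return { error => "the value of {param} must be a known region code (was {value})" };
}

# Call the validator directly, with the same two arguments it would receive.
my $context = { regions => { NA => 1, EU => 1 } };

my $ok  = known_region('EU', $context);
my $bad = known_region('ZZ', $context);

print defined $ok ? "unexpected result\n" : "EU accepted\n";
print "ZZ rejected: $bad->{error}\n";
```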

If your function decides that the parameter value is valid and does not need to be cleaned, it can indicate this by returning an empty result.

Otherwise, it must return a hash reference with one or more of the following keys:

error

If the parameter value is not valid, the value of this key should be an error message that states what a good value should look like. This message should contain the placeholder {param}, which will be substituted with the parameter name. Use this placeholder, and do not hard-code the parameter name.

Here is an example of a good message:

    "the value of {param} must be a positive integer (was {value})".

Here is an example of a bad message:

    "bad value for 'foo'".

warn

If the parameter value is acceptable but questionable in some way, the value of this key should be a message that states what a good value should look like. All such messages will be made available through the result object that is returned by the validation routine. The code that handles the request may then choose to display these messages as part of the response. Your code may also make use of this information during the process of responding to the request.

value

If the parameter value represents anything other than a simple string (i.e. a number, list, or more complicated data structure), then the value of this key should be the converted or "cleaned" form of the parameter value. For example, a numeric parameter might be converted into an actual number by adding zero to it, or a pair of values might be split apart and converted into an array ref. The value of this key will be returned as the "cleaned" value of the parameter, in place of the raw parameter value provided in the request.
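
The three keys can be seen together in the following sketch of a validator for a "lat,lng" pair. The function name coord_pair, the accepted syntax, and the message wording are all assumptions made for illustration; only the shape of the returned hash ref follows the description above.

```perl
use strict;
use warnings;

# Hypothetical validator for a "lat,lng" pair such as "43.07,-89.40".
sub coord_pair {
    my ($value, $context) = @_;

    # Malformed input: report what a good value looks like.
    my ($lat, $lng) = $value =~ /^\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*$/
        or return { error => "the value of {param} must be two decimal numbers "
                           . "separated by a comma (was {value})" };

    return { error => "the value of {param} must lie within -90..90,-180..180 (was {value})" }
        if abs($lat) > 90 or abs($lng) > 180;

    # Cleaned value: an array ref of two numbers instead of the raw string.
    my $result = { value => [ $lat + 0, $lng + 0 ] };

    # Acceptable but questionable: 0,0 is often a placeholder.
    $result->{warn} = "the value of {param} is 0,0, which is often a placeholder; "
                    . "check that real coordinates were intended"
        if $lat == 0 and $lng == 0;

    return $result;
}

my $good = coord_pair('43.07, -89.40');
my $odd  = coord_pair('0,0');
my $bad  = coord_pair('north');

print "cleaned: @{ $good->{value} }\n";
print "warning: $odd->{warn}\n";
print "error: $bad->{error}\n";
```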

Parametrized validators

If you want to write your own parametrized validator, write a function that generates and returns a closure. For example:

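    use Carp qw(croak);
    
    # The validator proper: called with the value, the context, and the base.
    sub integer_multiple {

        my ($value, $context, $base) = @_;
        
        # Accept only integers that are exact multiples of $base.
        return { value => $value + 0 }
            if $value =~ /^[+-]?\d+$/ and $value % $base == 0;
        return { error => "the value of {param} must be a multiple of $base (was {value})" };
    }
    
    # The generator: checks its parameter once and returns a closure over it.
    sub INTEGER_MULTIPLE {

        my ($base) = @_;
        
        croak "INTEGER_MULTIPLE requires an integer parameter greater than zero"
            unless defined $base and $base =~ /^\d+$/ and $base > 0;
        
        return sub { return integer_multiple(shift, shift, $base) };
    }
    
    define_ruleset( 'foo' =>
        { param => 'foo', valid => INTEGER_MULTIPLE(3) });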
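
Because a parametrized validator is just a closure, it can be exercised directly, without defining a ruleset. The sketch below follows the same pattern with a hypothetical validator, MAX_LENGTH (not part of this module), and calls the generated closures by hand, which can be a convenient way to unit-test a validator.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical parametrized validator: accept strings of at most $max characters.
sub MAX_LENGTH {
    my ($max) = @_;

    croak "MAX_LENGTH requires a positive integer parameter"
        unless defined $max and $max =~ /^\d+$/ and $max > 0;

    # The closure captures $max; it receives ($value, $context) when called.
    return sub {
        my ($value, $context) = @_;
        return if length($value) <= $max;    # valid, no cleaning needed
        return { error => "the value of {param} must be at most $max characters long" };
    };
}

my $short = MAX_LENGTH(5);
my $long  = MAX_LENGTH(50);

# Each closure remembers its own limit.
my $r1 = $short->('abcdefgh', {});
my $r2 = $long->('abcdefgh', {});

print "limit 5:  ", ($r1 ? $r1->{error} : 'ok'), "\n";
print "limit 50: ", ($r2 ? $r2->{error} : 'ok'), "\n";
```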

AUTHOR

Michael McClennen, <mmcclenn at geology.wisc.edu>

SUPPORT

Please report any bugs or feature requests to bug-http-validate at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=HTTP-Validate. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

LICENSE AND COPYRIGHT

Copyright 2014 Michael McClennen.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.