NAME

Config::Validator - schema based configuration validation

SYNOPSIS

  use Config::Validator;

  # simple usage
  $validator = Config::Validator->new({ type => "list(integer)" });
  $validator->validate([ 1, 2 ]);   # OK
  $validator->validate([ 1, 2.3 ]); # FAIL
  $validator->validate({ 1, 2 });   # FAIL

  # advanced usage
  $validator = Config::Validator->new(
      octet => {
          type => "integer",
          min  => 0,
          max  => 255,
      },
      color => {
          type   => "struct",
          fields => {
              red   => { type => "valid(octet)" },
              green => { type => "valid(octet)" },
              blue  => { type => "valid(octet)" },
          },
      },
  );
  $validator->validate(
      { red => 23, green => 47,  blue => 6 }, "color"); # OK
  $validator->validate(
      { red => 23, green => 470, blue => 6 }, "color"); # FAIL
  $validator->validate(
      { red => 23, green => 47,  lbue => 6 }, "color"); # FAIL

DESCRIPTION

This module performs schema-based configuration validation.

The idea is to define in a schema what valid data is. This schema can be used to create a validator object that can in turn be used to make sure that some data indeed conforms to the schema.

Although the primary focus is on "configuration" (for instance as provided by modules like Config::General) and, to a lesser extent, "options" (for instance as provided by modules like Getopt::Long), this module can in fact validate any data structure.

METHODS

The following methods are available:

new([OPTIONS])

return a new Config::Validator object (class method)

options([NAME])

convert the named schema (or the default schema if the name is not given) to a list of Getopt::Long compatible options

validate(DATA[, NAME])

validate the given data using the named schema (or the default schema if the name is not given)

traverse(CALLBACK, DATA[, NAME])

traverse the given data using the named schema (or the default schema if the name is not given) and call the given CALLBACK on each node
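
As a minimal sketch of how validate() might be called from a program, the code below wraps the call in eval. It assumes that validate() reports invalid data by dying with an error message; the validation_error() helper is hypothetical and not part of the module.

```perl
use strict;
use warnings;
use Config::Validator;

my $validator = Config::Validator->new({ type => "list(integer)" });

# Hypothetical helper: returns an empty string when the data is valid,
# otherwise the error message (assumes validate() dies on invalid data).
sub validation_error {
    my ($data) = @_;
    local $@;
    return "" if eval { $validator->validate($data); 1 };
    my $error = $@ || "unknown error";
    chomp($error);
    return $error;
}

foreach my $data ([ 1, 2 ], [ 1, 2.3 ]) {
    my $error = validation_error($data);
    print(($error eq "" ? "valid" : "invalid: $error"), "\n");
}
```

Returning the message instead of letting the exception propagate makes it easy to collect several errors before reporting them.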

FUNCTIONS

The following convenience functions are available:

is_true(SCALAR)

check if the given scalar is the boolean true

is_false(SCALAR)

check if the given scalar is the boolean false

is_regexp(SCALAR)

check if the given scalar is a compiled regular expression

expand_duration(STRING)

convert a string representing a duration (such as "1h10m12s") into the corresponding number of seconds (such as "4212")

expand_size(STRING)

convert a string representing a size (such as "1.5kB") into the corresponding integer (such as "1536")

listof(SCALAR)

return the given scalar as a list, dereferencing it if it is a list reference (this is very useful with the list?(X) type)

string2hash(STRING)

convert a string of space separated key=value pairs into a hash or hash reference

hash2string(HASH)

convert a hash or hash reference into a string of space separated key=value pairs

treeify(HASH)

modify (in place) a hash reference to turn it into a tree, using the dash character to split keys

treeval(HASH, NAME)

return the value of the given option (e.g. foo-bar) in a treeified hash

mutex(HASH, NAME...)

treat the given options as mutually exclusive

reqall(HASH, NAME1, NAME...)

if the first option is set, all the others are required

reqany(HASH, NAME1, NAME...)

if the first option is set, at least one of the others is required
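
The following sketch exercises a few of these functions together. It assumes that they can be imported by name and that mutex() reports a conflict by dying; the option names quiet and verbose are made up for illustration.

```perl
use strict;
use warnings;
use Config::Validator
    qw(expand_duration expand_size listof string2hash hash2string mutex);

# conversions, using the values quoted in the descriptions above
my $seconds = expand_duration("1h10m12s");
my $bytes   = expand_size("1.5kB");

# listof() gives a list whether it receives a scalar or a list reference
my @single  = listof("a");
my @several = listof([ "a", "b" ]);

# key=value strings and hashes convert back and forth
my %pairs = string2hash("proto=http port=80");
my %again = string2hash(hash2string(\%pairs));

# hypothetical options that should not be given together
# (assumes mutex() dies when both are set)
my %options = (quiet => 1, verbose => 1);
my $compatible = eval { mutex(\%options, "quiet", "verbose"); 1 } ? 1 : 0;

print("seconds=$seconds bytes=$bytes\n");
print("single=@single several=@several\n");
print("proto=$again{proto} port=$again{port}\n");
print("quiet and verbose compatible: $compatible\n");
```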

SCHEMAS

A schema is simply a structure (i.e. a hash reference) with the following fields (all of them being optional except the first one):

type

the type of the thing to validate (see the "TYPES" section for the complete list); this can also be a list of possible types (e.g. integer or undef)

subtype

for a homogeneous list or table, the schema of its elements

fields

for a structure, a table of the allowed fields, in the form: field name => corresponding schema

optional

for a structure field, it indicates that the field is optional

min

the minimum length/size, only for some types (integer, number, string, list and table)

max

the maximum length/size, only for some types (integer, number, string, list and table)

match

a regular expression used to validate a string or table keys

check

a code reference that runs user-supplied code to further validate the data

As an example, the following schema describes what a valid schema is:

  {
    type   => "struct",
    fields => {
      type     => { type => "list?(valid(type))" },
      subtype  => { type => "valid(schema)",        optional => "true" },
      fields   => { type => "table(valid(schema))", optional => "true" },
      optional => { type => "boolean",              optional => "true" },
      min      => { type => "number",               optional => "true" },
      max      => { type => "number",               optional => "true" },
      match    => { type => "regexp",               optional => "true" },
      check    => { type => "code",                 optional => "true" },
    },
  }
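
To make the fields above more concrete, here is a small sketch of a schema that combines a list of types, min, max, match and optional. The field names (name, host, retries) and the accepts() helper are invented for this example, and the code assumes that validate() dies on invalid data.

```perl
use strict;
use warnings;
use Config::Validator;

my $validator = Config::Validator->new({
    type   => "struct",
    fields => {
        # 1 to 16 characters, starting with a lowercase letter
        name    => { type => "string", min => 1, max => 16,
                     match => qr/^[a-z]\w*$/ },
        # either an IPv4 address or a host name
        host    => { type => [ "ipv4", "hostname" ] },
        # may be omitted, but must be between 0 and 10 when present
        retries => { type => "integer", min => 0, max => 10,
                     optional => "true" },
    },
});

# Hypothetical helper: true if the data conforms to the schema
# (assumes validate() dies on invalid data).
sub accepts {
    my ($data) = @_;
    local $@;
    return eval { $validator->validate($data); 1 } ? 1 : 0;
}

print(accepts({ name => "web1", host => "192.168.0.1" }), "\n");
print(accepts({ name => "web1", host => "example.org", retries => 3 }), "\n");
print(accepts({ name => "web1", host => "example.org", retries => 11 }), "\n");
print(accepts({ name => "9lives", host => "example.org" }), "\n");
```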

NAMED SCHEMAS

For convenience and self-reference, schemas can be named.

To use named schemas, give them along with their names to the new() method:

  $validator = Config::Validator->new(
      name1 => { ... schema1 ... },
      name2 => { ... schema2 ... },
  );

You can then refer to them in the validate() method:

  $validator->validate($data, "name1");

If you don't need named schemas, you can use the simpler form:

  $validator = Config::Validator->new({ ... schema ... });
  $validator->validate($data);
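
Self-reference can be illustrated with a recursive structure. The sketch below assumes a hypothetical schema named node whose optional children field is a list of further nodes, and it assumes that validate() dies on invalid data.

```perl
use strict;
use warnings;
use Config::Validator;

# "node" is a made-up schema name that refers to itself via valid(node)
my $validator = Config::Validator->new(
    node => {
        type   => "struct",
        fields => {
            name     => { type => "string" },
            children => { type => "list(valid(node))", optional => "true" },
        },
    },
);

my $good = {
    name     => "root",
    children => [
        { name => "leaf1" },
        { name => "branch", children => [ { name => "leaf2" } ] },
    ],
};
# same shape, but with a misspelled field deep in the tree
my $bad = {
    name     => "root",
    children => [ { name => "branch", children => [ { nmae => "leaf2" } ] } ],
};

my $good_ok = eval { $validator->validate($good, "node"); 1 } ? 1 : 0;
my $bad_ok  = eval { $validator->validate($bad,  "node"); 1 } ? 1 : 0;
print("good tree: $good_ok, bad tree: $bad_ok\n");
```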

TYPES

Here are the different types that can be used:

anything

really anything, including undef

undef

the undefined value

undefined

synonym for undef

defined

anything but undef

string

any string (in fact, anything that is defined and not a reference)

boolean

either true or false

number

any number (this is tested using a regular expression)

integer

any integer (this is tested using a regular expression)

duration

any duration (integers with optional time suffixes)

size

any size (integer with optional fractional part and optional byte-suffix)

hostname

any host name (as per RFC 1123)

ipv4

any IPv4 address (this is tested using a regular expression)

ipv6

any IPv6 address (this is tested using a regular expression)

reference

any reference, blessed or not

ref(*)

synonym for reference

blessed

any blessed reference

object

synonym for blessed

isa(*)

synonym for blessed

unblessed

any reference which is not blessed

code

a code reference

regexp

a compiled regular expression

list

a homogeneous list

list(X)

same as list, but with the given subtype

list?(X)

shortcut for either X or list(X)

table

a homogeneous table

table(X)

same as table, but with the given subtype

struct

a structure, i.e. a table with known keys

ref(X)

a reference of the given kind

isa(X)

an object of the given kind

valid(X)

something valid according to the given named schema
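
A quick way to get a feel for these types is to try a few values against each of them. The sketch below uses a hypothetical accepts() helper that builds a one-off validator for a given type; it assumes that validate() dies on invalid data.

```perl
use strict;
use warnings;
use Config::Validator;

# Hypothetical helper: true if the data is valid for the given type
# (assumes validate() dies on invalid data).
sub accepts {
    my ($type, $data) = @_;
    my $validator = Config::Validator->new({ type => $type });
    local $@;
    return eval { $validator->validate($data); 1 } ? 1 : 0;
}

my @cases = (
    [ "list?(integer)", 5 ],
    [ "list?(integer)", [ 5, 6 ] ],
    [ "list?(integer)", "five" ],
    [ "table(integer)", { a => 1 } ],
    [ "table(integer)", { a => "x" } ],
    [ "duration",       "1h10m12s" ],
    [ "size",           "1.5kB" ],
    [ "ipv4",           "192.168.0.1" ],
    [ "ipv4",           "not an address" ],
    [ "code",           sub { 1 } ],
    [ "regexp",         qr/x/ ],
);
foreach my $case (@cases) {
    my ($type, $data) = @{ $case };
    printf("%-16s %s\n", $type, accepts($type, $data) ? "OK" : "FAIL");
}
```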

EXAMPLES

CONFIGURATION VALIDATION

This module works well with Config::General. In particular, the list?(X) type matches the way Config::General merges blocks.

For instance, one could use the following code:

  use Config::General qw(ParseConfig);
  use Config::Validator;
  $validator = Config::Validator->new(
    service => {
      type   => "struct",
      fields => {
        port  => { type => "integer", min => 0, max => 65535 },
        proto => { type => "string" },
      },
    },
    host => {
      type   => "struct",
      fields => {
        name    => { type => "string", match => qr/^\w+$/ },
        service => { type => "list?(valid(service))" },
      },
    },
  );
  %cfg = ParseConfig(-ConfigFile => $path, -CComments => 0);
  $validator->validate($cfg{host}, "host");

This would work with:

  <host>
    name = foo
    <service>
      port = 80
      proto = http
    </service>
  </host>

where $cfg{host}{service} is the service hash, but also with:

  <host>
    name = foo
    <service>
      port = 80
      proto = http
    </service>
    <service>
      port = 443
      proto = https
    </service>
  </host>

where $cfg{host}{service} is the list of service hashes.
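
Code that consumes such data can use listof() to handle both shapes uniformly. In the sketch below, the two hashes are written by hand to mimic what the two configuration files above would produce, and ports() is a hypothetical helper.

```perl
use strict;
use warnings;
use Config::Validator qw(listof);

# one service block: the value is a single hash
my %one = (
    name    => "foo",
    service => { port => 80, proto => "http" },
);
# two service blocks: the value is a list of hashes
my %two = (
    name    => "foo",
    service => [
        { port => 80,  proto => "http"  },
        { port => 443, proto => "https" },
    ],
);

# Hypothetical helper: the ports of a host, whatever the shape of "service"
sub ports {
    my ($host) = @_;
    return map { $_->{port} } listof($host->{service});
}

print(join(",", ports(\%one)), "\n");
print(join(",", ports(\%two)), "\n");
```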

OPTIONS VALIDATION

This module interacts nicely with Getopt::Long: the options() method can be used to convert a schema into a list of Getopt::Long options.

Here is a simple example:

  use Config::Validator;
  use Getopt::Long qw(GetOptions);
  use Pod::Usage qw(pod2usage);
  $validator = Config::Validator->new({
    type   => "struct",
    fields => {
      debug => {
        type     => "boolean",
        optional => "true",
      },
      proto => {
        type  => "string",
        match => qr/^\w+$/,
      },
      port => {
        type => "integer",
        min  => 0,
        max  => 65535,
      },
    },
  });
  @options = $validator->options();
  GetOptions(\%cfg, @options) or pod2usage(2);
  $validator->validate(\%cfg);

ADVANCED VALIDATION

This module can also be used to combine configuration and options validation using the same schema. The idea is to:

  • define a unique schema validating both configuration and options

  • parse the command line options using Getopt::Long (first pass, to detect a --config option)

  • read the configuration file using Config::General

  • parse again the command line options, using the configuration data as default values

  • validate the merged configuration/options data
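
The steps above can be sketched as follows. This is only an illustration under several assumptions: the command line is a fixed array rather than @ARGV, the configuration file path is made up, and a hand-written hash stands in for the data that Config::General would return.

```perl
use strict;
use warnings;
use Config::Validator;
use Getopt::Long qw(GetOptionsFromArray);

# step 1: one schema for both the configuration file and the options
my $validator = Config::Validator->new({
    type   => "struct",
    fields => {
        proto => { type => "string", match => qr/^\w+$/ },
        port  => { type => "integer", min => 0, max => 65535 },
    },
});

# hypothetical command line, used instead of @ARGV
my @argv = ("--config", "/etc/example.conf", "--port", "8080");

# step 2: first pass, only --config is recognized, the rest stays in @argv
my $path;
Getopt::Long::Configure("pass_through");
GetOptionsFromArray(\@argv, "config=s" => \$path)
    or die("cannot parse options\n");
Getopt::Long::Configure("no_pass_through");

# step 3: read the configuration file; this hash is a stand-in for
# what ParseConfig(-ConfigFile => $path) would return
my %cfg = (proto => "http", port => 80);

# step 4: second pass, the configuration data acts as default values
GetOptionsFromArray(\@argv, \%cfg, $validator->options())
    or die("cannot parse options\n");

# step 5: validate the merged data
$validator->validate(\%cfg);
print("$cfg{proto} on port $cfg{port} (configuration from $path)\n");
```

In this sketch the port given on the command line overrides the one from the configuration data, while proto keeps its configured value.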

In some situations, it may make sense to consider the configuration data as a tree and prefer:

  <incoming>
    uri = foo://host1:1234
  </incoming>
  <outgoing>
    uri = foo://host2:2345
  </outgoing>

to:

  incoming-uri = foo://host1:1234
  outgoing-uri = foo://host2:2345

The options() method flattens the schema to get a list of command line options and the treeify() function transforms flat options (as returned by Getopt::Long) into a deep tree so that it matches the schema. Then the treeval() function can conveniently access the value of an option.
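
As a rough sketch of this round trip, the code below builds a nested schema for the incoming and outgoing blocks shown above, parses flat options from a fixed array (used here instead of @ARGV) and turns them back into a tree before validating.

```perl
use strict;
use warnings;
use Config::Validator qw(treeify treeval);
use Getopt::Long qw(GetOptionsFromArray);

my $validator = Config::Validator->new({
    type   => "struct",
    fields => {
        incoming => {
            type   => "struct",
            fields => { uri => { type => "string" } },
        },
        outgoing => {
            type   => "struct",
            fields => { uri => { type => "string" } },
        },
    },
});

# hypothetical command line, used instead of @ARGV
my @argv = (
    "--incoming-uri", "foo://host1:1234",
    "--outgoing-uri", "foo://host2:2345",
);

# flat option names derived from the nested schema
my @options = $validator->options();
my %cfg;
GetOptionsFromArray(\@argv, \%cfg, @options)
    or die("cannot parse options\n");

# turn the flat keys into a tree so that the data matches the schema
treeify(\%cfg);
$validator->validate(\%cfg);

print("incoming: ", treeval(\%cfg, "incoming-uri"), "\n");
print("outgoing: ", treeval(\%cfg, "outgoing-uri"), "\n");
```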

See the bundled examples for complete working programs illustrating some of the possibilities of this module.

AUTHOR

Lionel Cons http://cern.ch/lionel.cons

Copyright (C) CERN 2012-2014