NAME

Sah::Type - Standard types

VERSION

version 0.9.1

DESCRIPTION

This document specifies Sah standard types.

ROLE: BaseType

This is the base type role. All Sah types must implement this role.

Clauses

The list below is ordered by priority, from highest to lowest.

v => INT (default: 1)

Specify schema version. Should be 1 at the moment. See also: DefHash.

Priority: 0 (checked first before everything else).

default => ANY

Supply a default value.

Priority: 1 (very high). This is processed before all other clauses except v.

Example: Given schema [int => {req=>1}], undef data is invalid, but given schema [int => {req=>1, default=>3}], undef data is valid because it is given the default value first.
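
The ordering described above can be sketched in plain Perl. This is only an illustration of the priority rule, not how any Sah compiler is implemented; the helper name check_defined is hypothetical, and it handles only the default and req clauses.

```perl
use strict;
use warnings;

# Hypothetical helper: apply 'default' (priority 1) before 'req' (priority 3).
# Returns a list: (validity flag, data after defaulting).
sub check_defined {
    my ($clauses, $data) = @_;
    $data = $clauses->{default}
        if !defined($data) && exists $clauses->{default};
    return (0, $data) if $clauses->{req} && !defined($data);
    return (1, $data);
}

my ($ok_plain)            = check_defined({req => 1}, undef);
my ($ok_dflt, $data_dflt) = check_defined({req => 1, default => 3}, undef);
print "req only:        ", ($ok_plain ? "valid" : "invalid"), "\n";
print "req and default: ", ($ok_dflt ? "valid, data is $data_dflt" : "invalid"), "\n";
```

Because the default is filled in first, the req check in the second call sees the value 3 rather than undef.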

default_lang => LOCALE_CODE (default: en_US)

Set default language for this schema. Language-dependent attribute values (e.g. summary, description) will be assumed to be in the default language. See also: DefHash.

Priority: 2 (very high), after default.

name => STR

A short text (usually a single word, without any formatting) that names the schema, useful for identifying the schema when used as a type for the human compiler. See also: DefHash.

To store translations, you can use the alt.lang.* clause attributes.

Example:

 [int => {
     'name:alt.lang.en_US' => 'pos_int',
     'name:alt.lang.id_ID' => 'bil_pos',
     min=>0,
 }]

Priority: 2 (very high), after default.

See also: summary, description, tags.

summary => STR

A one-line text (about 72 characters maximum, without any formatting) to describe the schema. This is useful, e.g. for manually describing a schema instead of using the human compiler. It can also be used in form field labels. See also: DefHash.

To store translations, you can use the alt.lang.* clause attributes.

Example:

 # definition for 'single_dice_throw' schema/type
 [int => {
     req => 1,
     'summary:alt.lang.en_US' =>
         'A number representing result of single dice throw (1-6)',
     'summary:alt.lang.id_ID' =>
         'Bilangan yang menyatakan hasil lempar sebuah dadu (1-6)',
     between => [1, 6],
 }]

Using the human compiler, the above schema will be output as the standard, more boring 'Integer, value between 1 and 6.'

Priority: 2 (very high), after default.

See also: name, description, tags.

description => STR

A longer text (a paragraph or more) to describe the schema, useful e.g. for help/usage text. Text should be in Markdown format. See also: DefHash.

To store translations, you can use the alt.lang.* clause attributes.

Example:

 [array => {
     name        => 'http_headers',
     description => <<EOT,
 HTTP headers should be specified as an array of 2-element arrays (pairs). Each
 pair should contain header name in the first element (all lowercase, *-*
 written as *_*) and header value in the second element.

 Example:

 : [[content_type => 'text/html'], [accept => 'text/html'], [accept => '*/*']]

 EOT
     req => 1,
     of  => 'http_header',
 },
 {
     def => {
         http_header => ['array*', len=>2],
     },
 }]

Priority: 2 (very high), after default.

See also: name, summary, tags.

tags => ARRAY OF STR

A list of tags, can be used to categorize schemas. See also: DefHash.

Priority: 2 (very high), after default.

See also: name, summary, description.

req => BOOL

If set to 1, require that data be defined. Otherwise, allow data to be undef (the default behaviour).

Priority: 3 (very high), executed after default.

By default, undef will pass even an elaborate schema, e.g. [int => {min=>0, max=>10, div_by=>3}] will still pass undef. However, undef will not pass [int=>{req=>1}].

This behaviour is much like NULLs in SQL: we *can't* (in)validate something that is unknown/unset.

See also: forbidden

forbidden => BOOL

This is the opposite of req, requiring that data be not defined (i.e. undef).

Priority: 3 (very high), executed after default.

Given schema [int=>{forbidden=>1}], a non-undef value will fail. Another example: the schema [int=>{req=>1, forbidden=>1}] will always fail due to conflicting clauses.
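
The interaction of req, forbidden, and undef can be sketched as follows. This is a minimal illustration under stated assumptions, not an actual Sah implementation: the helper passes is hypothetical and knows only the req, forbidden, and min clauses.

```perl
use strict;
use warnings;

# Hypothetical helper handling only req, forbidden, and min.
sub passes {
    my ($clauses, $data) = @_;
    return 0 if $clauses->{req}       && !defined($data);
    return 0 if $clauses->{forbidden} && defined($data);
    return 1 if !defined($data);    # undef is "unknown": skip other clauses
    return 0 if exists $clauses->{min} && $data < $clauses->{min};
    return 1;
}

print "min=>0 on undef:          ", passes({min => 0}, undef), "\n";
print "req=>1 on undef:          ", passes({req => 1}, undef), "\n";
print "forbidden=>1 on 5:        ", passes({forbidden => 1}, 5), "\n";
print "req + forbidden on undef: ", passes({req => 1, forbidden => 1}, undef), "\n";
print "req + forbidden on 5:     ", passes({req => 1, forbidden => 1}, 5), "\n";
```

The last two calls show why combining req and forbidden can never succeed: one of the two clauses rejects every possible input.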

See also: req

prefilters => [EXPR, ...]

Run expression(s), usually to preprocess data before further checking. Data is referred to in expression by variable $_. Prefiltered value should persist until the end of all other clauses (until the end of clause set), after which the old value can be restored.

Priority: 10 (high). Run after default and req/forbidden.

Specific attributes: perm. If set to true, then prefiltered value will persist.

noop => ANY

Will do nothing. This clause is just a convenience if you want to do nothing (or perhaps just use the attributes of this clause to do things).

Priority: 50 (normal)

fail => BOOL

If set to 1, validation of this clause always fails. This is just a convenience to force failure.

Priority: 50 (normal)

cset => HASH

Evaluate a clause set.

Priority: 50 (normal)

if => [CLAUSE1=>VAL, CLAUSE2=>VAL]

If CLAUSE1 succeeds, then CLAUSE2 must also succeed. Otherwise, nothing is done. To make the condition or the consequence a whole clause set instead of a single clause, see if_cset.

Example:

 # leap year
 [int => {div_by=>4, if => [div_by => 100, div_by => 400]}]

The if clause states that if the input number is divisible by 100, it must also be divisible by 400; if it is divisible by 100 but not by 400, the clause fails.
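
The logic of the leap year schema can be written out by hand in Perl. This is a sketch of the implication semantics only; the helper names implies and is_leap_year are hypothetical and not part of Sah.

```perl
use strict;
use warnings;

# "if A then B" holds when A fails, or when both A and B succeed.
sub implies {
    my ($a_ok, $b_ok) = @_;
    return (!$a_ok || $b_ok) ? 1 : 0;
}

sub is_leap_year {
    my ($year) = @_;
    return 0 unless $year % 4 == 0;                       # div_by => 4
    return implies($year % 100 == 0, $year % 400 == 0);   # if => [...]
}

print "$_: ", (is_leap_year($_) ? "leap" : "not leap"), "\n"
    for 1900, 2000, 2023, 2024;
```

Note that 2024 passes because the condition (divisible by 100) fails, so the if clause imposes nothing.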

if_cset => [CSET1, CSET2]

If clause set CSET1 succeeds, then CSET2 must also succeed. Otherwise, nothing is done.

Examples:

 [str => {min_len=>1, max_len=>10,
          if_cset => [ {min_len=>4, max_len=>6}, {is_palindrome=>1} ]}]

The above says that if a string has length between 4 and 6 then it must be a palindrome. Otherwise it doesn't have to be one. But nevertheless, all input must be between 1 and 10 characters long.

 [str => {if_cset => [ {'cset&' => [{match=>'a'}, {match=>'b'}]},
                       {'cset&' => [{match=>'c'}, {match=>'d'}]}, ]}]

The above says that if a string matches 'a' and 'b', it must also match 'c' and 'd'. As a side note, the above schema can also be written as:

 [str => {if => [ 'match&'=>['a', 'b'], 'match&'=>['c', 'd'] ]}]
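
The first if_cset example above (the palindrome rule) can be sketched in Perl as follows. The function check_str is a hypothetical, hand-written equivalent of that one schema, not something generated by Sah.

```perl
use strict;
use warnings;

# Hand-written equivalent of:
#   [str => {min_len=>1, max_len=>10,
#            if_cset => [ {min_len=>4, max_len=>6}, {is_palindrome=>1} ]}]
sub check_str {
    my ($s) = @_;
    my $len = length $s;
    return 0 unless $len >= 1 && $len <= 10;     # outer min_len / max_len
    my $cset1 = ($len >= 4 && $len <= 6);        # condition clause set
    my $cset2 = ($s eq scalar reverse $s);       # consequence clause set
    return (!$cset1 || $cset2) ? 1 : 0;
}

printf "%-12s %s\n", "'$_'", (check_str($_) ? "pass" : "fail")
    for "ab", "abba", "abcd", "abcdefghijk";
```

Strings whose length falls outside 4 to 6 pass without being palindromes, while the outer length limits still apply to every input.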

check => EXPR

Evaluate expression, which must evaluate to a true value for this clause to succeed.

Priority: 50 (normal)

postfilters => [EXPR, ...]

Run expression(s), usually to postprocess data. Data is referred to in expression by variable $_. From here on, the data will be permanently set to the postfiltered value.

Priority: 90 (very low). Run after all other clauses.

ROLE: Comparable

This is the comparable type role. All types which have comparable values must implement this role. Most types implement this role, including str, all number types, etc.

Clauses

in => [ANY, ...]

Require that the data be one of the specified choices.

See also: match (for type 'str'), has (for 'HasElems' types)

Examples:

 [int => {in => [1, 2, 3, 4, 5, 6]}] # single dice throw value
 [str => {'!in' => ['root', 'admin', 'administrator']}] # forbidden usernames

is => ANY

Require that the data is the same as VALUE. Will perform a numeric comparison for numeric types, or stringwise for string types, or deep comparison for deep structures.

Examples:

 [int => {is => 3}]
 [int => {'is|' => [1, 2, 3, 4, 5, 6]}] # effectively the same as 'in'

ROLE: HasElems

This is the role for types that have the notion of elements/length. It provides clauses like max_len, len, len_between, all_elems, elems, etc. It is used by 'array', 'hash', and also 'str'.

Clauses

max_len => NUM

Requires that the data have at most NUM elements.

Example:

 [str, {req=>1, max_len=>10}] # define a string with at most 10 characters

min_len => NUM

Requires that the data have at least NUM elements.

Example:

 [array, {min_len=>1}] # define an array with at least one element

len_between => [NUM_MIN, NUM_MAX]

A convenience clause that combines min_len and max_len.

Example, the two schemas below are equivalent:

 [str, {len_between=>[1, 10]}]
 [str, {min_len=>1, max_len=>10}]

len => NUM

Requires that the data have exactly NUM elements.
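
The length clauses can be sketched in Perl as below. This is an illustration only: elem_count and len_ok are hypothetical helpers, and counting string elements with length() assumes that elements of a string are its characters.

```perl
use strict;
use warnings;

# Number of elements: array items, hash pairs, or string characters.
sub elem_count {
    my ($data) = @_;
    return scalar(@$data)      if ref($data) eq 'ARRAY';
    return scalar(keys %$data) if ref($data) eq 'HASH';
    return length $data;
}

# Hypothetical checker for min_len, max_len, len, and len_between.
sub len_ok {
    my ($data, %c) = @_;
    my $n = elem_count($data);
    return 0 if defined $c{min_len} && $n < $c{min_len};
    return 0 if defined $c{max_len} && $n > $c{max_len};
    return 0 if defined $c{len}     && $n != $c{len};
    if (my $range = $c{len_between}) {
        return 0 if $n < $range->[0] || $n > $range->[1];
    }
    return 1;
}

print "len_between:       ", len_ok("hello", len_between => [1, 10]), "\n";
print "min_len + max_len: ", len_ok("hello", min_len => 1, max_len => 10), "\n";
print "array with len=>3: ", len_ok([1, 2, 3], len => 3), "\n";
```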

has => ANY

Requires that the data contain the specified element.

Examples:

 # requires that array has element x
 [array => {has => 'x'}]

 # requires that array has elements 'x', 'y', and 'z'
 [array => {'has&' => ['x', 'y', 'z']}]

 # requires that array does not have element 'x'
 [array => {'!has' => 'x'}]
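
The three has examples above can be sketched in Perl for arrays. The helper names are hypothetical, and comparing elements stringwise with eq is an assumption made for this sketch.

```perl
use strict;
use warnings;

# has => ELEM: at least one array element equals ELEM (stringwise here).
sub has_elem {
    my ($array, $elem) = @_;
    return (grep { defined($_) && $_ eq $elem } @$array) ? 1 : 0;
}

# 'has&' => [...]: every listed element must be present.
sub has_all {
    my ($array, @elems) = @_;
    return (grep { !has_elem($array, $_) } @elems) ? 0 : 1;
}

my $data = ['x', 'y', 'q'];
print "has x:          ", has_elem($data, 'x'), "\n";
print "has& x, y, z:   ", has_all($data, 'x', 'y', 'z'), "\n";
print "!has x:         ", (has_elem($data, 'x') ? 0 : 1), "\n";
```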

all_elems => SCHEMA

Requires that every element of the data validate against the specified schema.

Note: filters applied by SCHEMA to elements will be preserved.

Examples:

 [array => {all_elems => 'int'}]

The above specifies an array of ints.

 [hash => {all_elems => [str => { match => '^[A-Za-z0-9]+$' }]}]

The above specifies hash with alphanumeric-only values.
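
Both examples can be sketched in Perl by passing a per-element check as a code reference. All names here are hypothetical stand-ins for the schemas. Note that the checks accept undef, because neither element schema sets req.

```perl
use strict;
use warnings;

# all_elems: every array element or hash value must pass the given check.
sub all_elems {
    my ($data, $check) = @_;
    my @values = ref($data) eq 'HASH' ? values %$data : @$data;
    return (grep { !$check->($_) } @values) ? 0 : 1;
}

# Stand-in for schema 'int' (undef passes because req is not set).
sub int_ok      { !defined($_[0]) || $_[0] =~ /\A-?[0-9]+\z/ }
# Stand-in for [str => { match => '^[A-Za-z0-9]+$' }].
sub alphanum_ok { !defined($_[0]) || $_[0] =~ /^[A-Za-z0-9]+$/ }

print "array of ints:     ", all_elems([1, 2, 3], \&int_ok), "\n";
print "array with 'x':    ", all_elems([1, 'x'], \&int_ok), "\n";
print "alphanumeric hash: ", all_elems({a => 'abc1', b => 'Z9'}, \&alphanum_ok), "\n";
```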

if_elem => [INDEX1 => SCHEMA1, INDEX2 => SCHEMA2]

State that if the element with the index of INDEX1 passes SCHEMA1, then the element with the index of INDEX2 must also pass SCHEMA2. Otherwise, nothing is done.

Examples:

 [hash => {if_elem => [
     password => 'str*', password_confirmation => 'str*',
 ]}]

The above says: key 'password_confirmation' is required if 'password' is set.

 [hash => {'if_elem&' => [
   [ province => ['str*', {is => 'Outside US'}],
     zipcode => [str => {forbidden=>1}] ],
   [ province => ['str*', {'!is' => 'Outside US'}],
     zipcode => [str => {req=>1}] ]
 ]}]

The above says: if province is set to 'Outside US', then zipcode must not be specified. Otherwise, if province is set to anything else (i.e. a US state), zipcode is required.
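
The password example can be sketched in Perl as follows. The helpers if_elem and str_req are hypothetical; str_req stands in for the 'str*' schema (a defined, non-reference value).

```perl
use strict;
use warnings;

# if_elem: when element $key1 passes $check1, element $key2 must pass $check2.
sub if_elem {
    my ($hash, $key1, $check1, $key2, $check2) = @_;
    return 1 unless $check1->($hash->{$key1});    # condition not met
    return $check2->($hash->{$key2}) ? 1 : 0;
}

# Stand-in for schema 'str*'.
sub str_req { defined($_[0]) && !ref($_[0]) }

my @rule = (password => \&str_req, password_confirmation => \&str_req);

print "both set:          ",
    if_elem({password => 's3cret', password_confirmation => 's3cret'}, @rule), "\n";
print "confirmation gone: ", if_elem({password => 's3cret'}, @rule), "\n";
print "neither set:       ", if_elem({}, @rule), "\n";
```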

if_elem_re => [REGEX1 => SCHEMA1, REGEX2 => SCHEMA2]

State that if all elements with index matching REGEX1 pass SCHEMA1, then all elements with index matching REGEX2 must also pass SCHEMA2. Otherwise, nothing is done.

Example:

 [array => {'if_elem_re&' => [
     [ '^0$',   ['str*'  => {is => 'int'}],
       '[1-9]', ['hash*' => {keys_in => [qw/is min max/]}] ],
     [ '^0$',   ['str*'  => {is => 'str'}],
       '[1-9]', ['hash*' => {keys_in => [qw/is min max min_len max_len/]}] ],
     [ '^0$',   ['str*'  => {is => 'bool'}],
       '[1-9]', ['hash*' => {keys_in => [qw/is/]}] ],
 ]}]

The above says: if the first element of the array is 'int', then the following elements must be hashes with only the specified keys. Similar rules apply when the first element is 'str' or 'bool'.

Example valid array:

 ['str', {min_len=>0, max_len=>1}, {is=>'a'}]

Example invalid array (key min_len is not allowed):

 ['int', {min_len=>0, max_len=>1}, {is=>'a'}]

Note: You need to be careful with undef, because it passes any schema unless req=>1 (or the shortcut 'foo*') is specified.
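
The rule set in the if_elem_re example can be written out by hand in Perl. This is a sketch of that one schema only, with hypothetical names, and it assumes that every index from 1 upward matches the '[1-9]' pattern (any positive index contains a nonzero digit).

```perl
use strict;
use warnings;

# Allowed hash keys, selected by the first array element.
my %allowed = (
    int  => [qw/is min max/],
    str  => [qw/is min max min_len max_len/],
    bool => [qw/is/],
);

sub check_array {
    my ($ary) = @_;
    my $type = $ary->[0];
    # No rule applies unless element 0 is one of the known type names.
    return 1 unless defined($type) && !ref($type) && $allowed{$type};
    my %ok = map { $_ => 1 } @{ $allowed{$type} };
    for my $i (1 .. $#$ary) {
        my $h = $ary->[$i];
        return 0 unless ref($h) eq 'HASH';         # 'hash*'
        return 0 if grep { !$ok{$_} } keys %$h;    # keys_in
    }
    return 1;
}

print "str array: ", check_array(['str', {min_len=>0, max_len=>1}, {is=>'a'}]), "\n";
print "int array: ", check_array(['int', {min_len=>0, max_len=>1}, {is=>'a'}]), "\n";
```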

ROLE: Sortable

This is the type role for sortable types. It provides clauses like min, max, and between. It is used by many types, for example str, all numeric types, etc.

Clauses

min => ANY

Require that the value is not less than some specified minimum (equivalent in intention to the Perl string ge operator, or the numeric >= operator).

Example:

 [int => {min => 0}] # specify non-negative numbers

xmin => ANY

Require that the value is greater than some specified minimum (equivalent in intention to the Perl string gt operator, or the numeric > operator). The x prefix is for "exclusive".

max => ANY

Require that the value is less than or equal to some specified maximum (equivalent in intention to the Perl string le operator, or the numeric <= operator).

xmax => ANY

Require that the value is less than some specified maximum (equivalent in intention to the Perl string lt operator, or the numeric < operator). The x prefix is for "exclusive".

between => [ANY_MIN, ANY_MAX]

A convenience clause that combines min and max.

Example, the following schemas are equivalent:

 [float => {between => [0.0, 1.5]}]
 [float => {min => 0.0, max => 1.5}]

xbetween => [ANY_MIN, ANY_MAX]

A convenience clause that combines xmin and xmax.
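
The difference between the inclusive and exclusive clauses can be sketched for numeric data as follows. The function names are hypothetical; a string type would use ge, gt, le, and lt instead of the numeric operators.

```perl
use strict;
use warnings;

sub min_ok  { my ($v, $min) = @_; return $v >= $min ? 1 : 0 }
sub xmin_ok { my ($v, $min) = @_; return $v >  $min ? 1 : 0 }
sub max_ok  { my ($v, $max) = @_; return $v <= $max ? 1 : 0 }
sub xmax_ok { my ($v, $max) = @_; return $v <  $max ? 1 : 0 }

# between combines min and max; xbetween combines xmin and xmax.
sub between_ok {
    my ($v, $lo, $hi) = @_;
    return (min_ok($v, $lo) && max_ok($v, $hi)) ? 1 : 0;
}
sub xbetween_ok {
    my ($v, $lo, $hi) = @_;
    return (xmin_ok($v, $lo) && xmax_ok($v, $hi)) ? 1 : 0;
}

printf "%-4s between=%d xbetween=%d\n",
    $_, between_ok($_, 0, 1.5), xbetween_ok($_, 0, 1.5)
    for 0, 0.7, 1.5;
```

The two boundary values 0 and 1.5 pass between but fail xbetween, which is the only difference between the two clauses.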

TYPE: buf

buf stores binary data. Elements of buf data are bytes. It is derived from str.

TYPE: num

num stores numbers. This type assumes the Comparable and Sortable roles.

TYPE: float

float stores real (floating-point) numbers. This type is derived from num.

TYPE: int

int stores integers. This type is derived from num.

Clauses

mod => [INT1, INT2]

Require that (data mod INT1) equals INT2. For example, mod => [2, 1] effectively specifies odd numbers.

div_by => INT

Require that data is divisible by a number. This is effectively just a shortcut for mod => [INT, 0].

Example: Given schema [int=>{div_by=>2}], undef, 0, 2, 4, and 6 are valid but 1, 3, 5 are not.
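
The relationship between mod and div_by can be sketched in Perl. The helper names are hypothetical, and undef is accepted because req is not set.

```perl
use strict;
use warnings;

# mod => [INT1, INT2]: (data mod INT1) must equal INT2; undef passes.
sub mod_ok {
    my ($data, $n, $r) = @_;
    return 1 unless defined $data;
    return ($data % $n == $r) ? 1 : 0;
}

# div_by => INT is a shortcut for mod => [INT, 0].
sub div_by_ok {
    my ($data, $n) = @_;
    return mod_ok($data, $n, 0);
}

for my $d (undef, 0 .. 6) {
    printf "%-5s div_by 2: %s\n",
        (defined $d ? $d : 'undef'), (div_by_ok($d, 2) ? "valid" : "invalid");
}
```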

TYPE: str

str stores strings (text). This type assumes the Comparable, Sortable, and HasElems roles (the elements are individual characters). Default encoding is utf8.

Clauses

match => REGEX|{COMPILER=>REGEX, ...}

Require that string match the specified regular expression.

Since regular expressions might not be 100% compatible from language to language, instead of avoiding the use of regex entirely, you can specify a different regex for each target language, e.g.:

 [str => {match => {
   js     => '...',
   perl   => '...',
   python => '...',
 }}]

To match against multiple regexes:

 # string must match a, b, and c
 [str => {"match&"=>['a', 'b', 'c']}]

 # string must match either a or b or c
 [str => {"match|"=>['a', 'b', 'c']}]

 # string must NOT match a
 [str => {"!match"=>'a'}]

 # string must NOT match a nor b nor c (i.e. must match none of those)
 [str => {"match.vals"=>['a', 'b', 'c'], "match.max_ok"=>0}]

 # at least one of a, b, c must NOT match (i.e. if all match, schema fails;
 # if at least one does not match, schema succeeds)
 [str => {"match.vals"=>['a', 'b', 'c'], "match.max_ok"=>2}]
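
The multiple-regex forms above all come down to counting how many of the regexes match. A minimal Perl sketch of that idea, with hypothetical helper names, might look like this:

```perl
use strict;
use warnings;

# Count how many of the given regexes match the string.
sub count_matches {
    my ($str, @regexes) = @_;
    return scalar(grep { $str =~ /$_/ } @regexes);
}

# "match&": all must match.  "match|": at least one must match.
sub match_all { my ($s, @re) = @_; return count_matches($s, @re) == @re ? 1 : 0 }
sub match_any { my ($s, @re) = @_; return count_matches($s, @re) >= 1   ? 1 : 0 }

# "match.vals" with "match.max_ok": at most $max regexes may match.
sub match_max_ok {
    my ($s, $max, @re) = @_;
    return count_matches($s, @re) <= $max ? 1 : 0;
}

for my $s (qw/abc ab xyz/) {
    printf "%-4s all=%d any=%d max_ok0=%d max_ok2=%d\n", $s,
        match_all($s, qw/a b c/), match_any($s, qw/a b c/),
        match_max_ok($s, 0, qw/a b c/), match_max_ok($s, 2, qw/a b c/);
}
```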

is_re => BOOL

If value is true, require that the string be a valid regular expression string. If value is false, require that the string not be a valid regular expression string.
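
For a Perl target, one way to test whether a string is a valid regular expression is to try compiling it inside eval. This is a sketch under that assumption; the name is_valid_regex is hypothetical, and other target languages would need their own regex compiler.

```perl
use strict;
use warnings;

# Returns 1 if $str compiles as a Perl regex, 0 if compilation dies.
sub is_valid_regex {
    my ($str) = @_;
    my $ok = eval { my $re = qr/$str/; 1 };
    return $ok ? 1 : 0;
}

printf "%-10s %s\n", $_, (is_valid_regex($_) ? "valid" : "invalid")
    for '^[a-z]+$', 'a(', '[a-';
```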

TYPE: bool

Boolean type. This type assumes the Comparable and Sortable roles.

TYPE: array

Array type. This type assumes the Comparable, Sortable, and HasElems roles (the elements are indexed by integers starting from 0).

Clauses

TBD

TYPE: hash

Hash (a.k.a. dictionary) type. This type assumes the Comparable, Sortable, and HasElems roles (the elements are indexed by strings).

Clauses

TBD

SEE ALSO

Sah

AUTHOR

Steven Haryanto <stevenharyanto@gmail.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2012 by Steven Haryanto.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.