NAME

Sah::Type - Standard types

VERSION

version 0.9.5

DESCRIPTION

This document specifies Sah standard types.

TYPE: undef

This type does not have any clauses. The only value it knows is the undefined value (like undef in Perl, or null in PHP).

ROLE: BaseType

This is the base type role. All Sah types (except undef) must implement this role.

Clauses

The list below is ordered by priority, from highest to lowest.

defhash_v : FLOAT

Priority: 0 (checked first before everything else).

Category: metadata.

From DefHash. Normally there is no need to set this.

v : FLOAT (default: 1)

Priority: 0 (checked first before everything else).

Category: metadata.

From DefHash. Specify Sah version. Should be 1 at the moment.

schema_v : FLOAT (default: 1)

Priority: 0 (checked first before everything else).

Category: metadata.

Specify schema version. By default assumed to be 1 if not set.

base_v : FLOAT (default: 1)

Priority: 0 (checked first before everything else).

Category: metadata.

Specify base schema version. By default assumed to be 1 if not set. Using a base schema whose schema_v differs from this value will fail. This can be used to force child schemas to update whenever the base schema changes. For example:

 // schema: vocal
 ["str", {"in": ["a", "e", "i", "o", "u"]}]

 // schema: consonant, defined in terms of "vocal"
 ["vocal", {"match": "\\A[a-z]\\z", "in.max_ok": 0}]

However, if vocal changes its implementation or structure to:

 // the new vocal
 ["str", {"match": "\\A[aeiou]\\z"}]

then consonant will break. To force consonant to fail (so its author can update it), the new vocal declares a higher schema_v:

 // the new vocal
 ["str", {"schema_v": 2, "match": "\\A[aeiou]\\z"}]

Since vocal's schema_v is now 2, it is not the same as 1 (which is implied by consonant, having the default value of base_v). consonant's author might then update its own implementation to match vocal:

 // the adjusted consonant
 ["str", {"base_v":2, "cset":{"match":"\\A[a-z]\\z"}, "match.max_ok":0}]

Notice the matching of consonant's base_v against vocal's schema_v. consonant might also add its own "schema_v":2 so other schemas depending on it are forced to adjust, if needed.
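The version check described above can be sketched in a few lines of Python. This is only an illustration, not part of any Sah implementation: the function name `base_version_matches` is hypothetical, and clause sets are modelled as plain dicts.

```python
def base_version_matches(child_clauses, base_clauses):
    """Return True if the child's base_v equals the base schema's schema_v."""
    # Both clauses default to 1 when not set, as described above.
    base_v = child_clauses.get("base_v", 1)
    schema_v = base_clauses.get("schema_v", 1)
    return base_v == schema_v


old_vocal = {"in": ["a", "e", "i", "o", "u"]}
new_vocal = {"schema_v": 2, "match": r"\A[aeiou]\z"}
consonant = {"match": r"\A[a-z]\z", "in.max_ok": 0}

print(base_version_matches(consonant, old_vocal))  # True
print(base_version_matches(consonant, new_vocal))  # False
```

A child that sets "base_v": 2 would match the new vocal again.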

default : ANY

Priority: 1 (very high). This is processed before all other clauses.

Category: default.

Supply a default value.

Example: Given schema ["int", {"req": 1}], undef data is invalid, but given schema ["int", {"req": 1, "default": 3}], undef data is valid because it is given the default value first.
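The ordering matters here: the default is applied before req is checked. A minimal Python sketch of that ordering, assuming None stands in for undef and using a hypothetical helper name:

```python
def apply_default_and_req(data, clauses):
    """Return (is_valid, data) after applying "default" and then "req"."""
    # "default" has priority 1, so it runs before "req" (priority 3).
    if data is None and "default" in clauses:
        data = clauses["default"]
    if clauses.get("req") and data is None:
        return False, data
    return True, data


print(apply_default_and_req(None, {"req": 1}))                # (False, None)
print(apply_default_and_req(None, {"req": 1, "default": 3}))  # (True, 3)
```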

default_lang : LOCALE_CODE (default: en_US)

Priority: 2 (very high), after default.

Category: metadata.

From DefHash. Set default language for this schema. Language-dependent attribute values (e.g. summary, description) will be assumed to be in the default language.

name : STR

Priority: 2 (very high), after default.

Category: metadata.

From DefHash. A short string (usually a single word, without any formatting) that names the schema, useful for identifying the schema when it is used as a type by the human compiler.

To store translations, you can use the alt.lang.* clause attributes.

Example:

 ["int", {
     "name.alt.lang.en_US": "pos_int",
     "name.alt.lang.id_ID": "bil_pos",
     "min": 0
 }]

See also: summary, description, tags.
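A consumer of the schema might look up a translated attribute roughly as follows. This is a Python sketch with a hypothetical helper name. Falling back to the plain attribute when no translation exists is an assumption, not something this document specifies.

```python
def localized_attr(clauses, attr, lang):
    """Look up ATTR.alt.lang.LANG in a clause set (modelled as a dict)."""
    key = f"{attr}.alt.lang.{lang}"
    if key in clauses:
        return clauses[key]
    # Assumed fallback: the plain attribute, written in the default language.
    return clauses.get(attr)


schema = ["int", {
    "name.alt.lang.en_US": "pos_int",
    "name.alt.lang.id_ID": "bil_pos",
    "min": 0,
}]
print(localized_attr(schema[1], "name", "id_ID"))  # bil_pos
```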

summary : STR

Priority: 2 (very high), after default.

Category: metadata.

From DefHash. A one-line text (about 72 characters maximum, without any formatting) to describe the schema. This is useful, e.g. for manually describing a schema instead of using the human compiler. It can also be used in form field labels.

To store translations, you can use the alt.lang.* clause attributes.

Example:

 // definition for 'single_dice_throw' schema/type
 ["int", {
     "req": 1,
     "summary.alt.lang.en_US":
         "A number representing result of single dice throw (1-6)",
     "summary.alt.lang.id_ID":
         "Bilangan yang menyatakan hasil lempar sebuah dadu (1-6)",
     "between": [1, 6]
 }]

Without the summary, compiling the above schema to human text might produce the standard, more boring "Integer, value between 1 and 6".

See also: name, description, tags.

description : STR

Priority: 2 (very high), after default.

Category: metadata.

From DefHash. A longer text (a paragraph or more) to describe the schema, useful e.g. for help/usage text. Text should be in Markdown format.

To store translations, you can use the alt.lang.* clause attributes.

Example (using Perl syntax because it supports heredoc):

 [array => {
     name        => 'http_headers',
     description => <<EOT,
 HTTP headers should be specified as an array of 2-element arrays (pairs). Each
 pair should contain header name in the first element (all lowercase, *-*
 written as *_*) and header value in the second element.

 Example:

 : [[content_type => 'text/html'], [accept => 'text/html'], [accept => '*/*']]

 EOT
     req => 1,
     of  => 'http_header',
 },
 {
     def => {
         http_header => ['array*', len=>2],
     },
 }]

See also: name, summary, tags.

tags : ARRAY OF STR

Priority: 2 (very high), after default.

Category: metadata.

From DefHash. A list of tags that can be used to categorize schemas.

See also: name, summary, description.

req : BOOL

Priority: 3 (very high), executed after default.

Category: constraint.

If set to 1, require that data be defined. Otherwise, allow data to be undef (the default behaviour).

By default, undef will pass even an elaborate schema, e.g. ["int", {"min": 0, "max": 10, "div_by": 3}] will still pass an undef. However, undef will not pass ["int", {"req": 1}].

This behaviour is much like NULLs in SQL: we *can't* (in)validate something that is unknown/unset.

See also: forbidden

forbidden : BOOL

Priority: 3 (very high), executed after default.

Category: constraint.

This is the opposite of req, requiring that data be not defined (i.e. undef).

Given schema ["int", {"forbidden": 1}], a non-undef value will fail. Another example: the schema ["int", {"req": 1, "forbidden": 1}] will always fail due to conflicting clauses.

See also: req
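The interaction between req and forbidden can be illustrated with a small Python sketch. None stands in for undef, and the function name is hypothetical.

```python
def check_definedness(data, clauses):
    """Apply the "req" and "forbidden" clauses to data."""
    if clauses.get("req") and data is None:
        return False  # data must be defined
    if clauses.get("forbidden") and data is not None:
        return False  # data must be undefined
    return True


print(check_definedness(5, {"forbidden": 1}))               # False
print(check_definedness(None, {"forbidden": 1}))            # True
print(check_definedness(None, {"req": 1, "forbidden": 1}))  # False
print(check_definedness(5, {"req": 1, "forbidden": 1}))     # False
```

With both clauses set, no value can pass, which matches the "will always fail" case above.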

prefilters : [EXPR, ...]

Priority: 10 (high). Run after default and req/forbidden.

Category: filter.

Run expression(s), usually to preprocess data before further checking. Data is referred to in expression by variable $_. Prefiltered value should persist until the end of all other clauses (until the end of clause set), after which the old value can be restored.

Specific attributes: perm. If set to true, then prefiltered value will persist.

ok : ANY -> true

Priority: 50 (normal)

Return value: true (always succeeds).

Category: constraint.

Will do nothing. This clause is just a convenience if you want to do nothing (or perhaps just use the attributes of this clause to do things). It is the default in the else section of the if_clause clause.

To force failure, you can use "!ok": 1.

cset : HASH -> INT

Priority: 50 (normal)

Return value: (number of successful clauses + 1) on success, false on failure.

Category: constraint.

Evaluate a clause set. Note that the return value adds 1 to the number of successful clauses to avoid returning 0 (which evaluates to false). This value is only returned if the clause set succeeds; otherwise false (0) is returned.
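A short Python sketch shows why the 1 is added. Here each clause is modelled as a plain predicate, and the function name is hypothetical.

```python
def eval_cset(data, checks):
    """Return (number of successful clauses + 1) on success, 0 on failure."""
    passed = 0
    for check in checks:
        if not check(data):
            return 0
        passed += 1
    # Adding 1 keeps the result true even when the clause set is empty.
    return passed + 1


print(eval_cset(6, []))                                      # 1
print(eval_cset(6, [lambda n: n > 0, lambda n: n % 3 == 0])) # 3
print(eval_cset(7, [lambda n: n > 0, lambda n: n % 3 == 0])) # 0
```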

check : EXPR -> ANY

Priority: 50 (normal)

Return value: result of evaluated expression

Category: constraint.

Evaluate expression, which must evaluate to a true value for this clause to succeed. Example:

 // require that string is a palindrome, using a Sah function
 ["str", "check", "is_palindrome($_)"]

 // require that the *length of* string is a prime number
 ["str", "check", "is_prime(len($_))"]

 // same thing, using input.prop attribute
 ["str", "check.input.prop", "len", "check", "is_prime($_)"]

 // check that the email's Subject header is a palindrome
 ["email", "check.input.prop", ["headers", "subject"],
           "check", "is_palindrome($_)"]

 // check that

Evaluate expression against property. Property will be available in expression as $_. Example:

 // require length of string to be divisible by 3
 ["str", {"check_prop": ["len", "$_ % 3 == 0"]}]

if : HASH -> ANY

Priority: 50 (normal)

Return value: if condition is true, then the then_* result, otherwise the else_* result.

Category: constraint.

A generic condition clause.

To use this clause, first specify one of the condition keys in the argument: expr (evaluate expression, value should be the expression), clause (evaluate data against clause, value should be an array [CLAUSE_NAME, CLAUSE_ARG]), schema (validate data against schema, value should be the schema), prop (evaluate property against a schema, value is [PROP, SCHEMA] where PROP is the property name, or if property has arguments, [PROP_NAME, PROP_ARGS]).

Then specify one of the then keys and one of the else keys: {then,else}{_prop,_expr,_schema,_clause}. The default then key is "then": 1. The default else is "else": 1, unless "then": some-true-value is specified, in which case the default else is "else": 0. Examples:

 // forbid the string to be lowercase
 "if": {"clause": ["match", "^[a-z]$"], "then": 0}

 // if string is lowercase, it must be a palindrome
 "if": {"clause": ["match", "^[a-z]$"], "then_expr": "is_palindrome($_)"}

 // if string is lowercase, it must be a palindrome, otherwise it must be longer
 // than 3 characters.
 "if": {"clause": ["match", "^[a-z]$"],
        "then_expr": "is_palindrome($_)",
        "else_expr": "len($_) > 3"}

 // require the length of the string to be an even number
 "if": {"prop": ["len", ["int", "div_by", 2]], "then": 1}

 // if string is a palindrome, then require it to have length > 5 (two
 // equivalent forms)
 "if": {"expr": "is_palindrome($_)", "then_prop": ["len", ["int", "xmin", 5]]}
 "if": {"expr": "is_palindrome($_)", "then_expr": "len($_) > 5"}

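The default rules for the then and else keys of the if clause can be sketched in Python. This is a simplified model, not the real clause: the condition is a predicate, and then/else are either constants (like "then": 1) or callables standing in for then_expr/else_expr. All names are hypothetical.

```python
def eval_if(data, cond, then=None, else_=None):
    """Evaluate a simplified "if" clause and return the chosen branch result."""
    # An explicit true constant for "then" flips the default else to 0.
    explicit_true_then = then is not None and not callable(then) and bool(then)
    if then is None:
        then = 1
    if else_ is None:
        else_ = 0 if explicit_true_then else 1
    branch = then if cond(data) else else_
    return branch(data) if callable(branch) else branch


def is_palindrome(s):
    return s == s[::-1]


# forbid the string to be lowercase
print(eval_if("abc", str.islower, then=0))                # 0
# require the length of the string to be an even number
print(eval_if("abc", lambda s: len(s) % 2 == 0, then=1))  # 0
```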
each : SCHEMA

Priority: 50 (normal)

Category: constraint, looping

Requires that every element of data validate to the specified schema. The first element that fails the schema will terminate the loop.

Examples:

 ["array", {"each": "int"}]
 ["array", {"of": "int"}] // same thing, "of" is the same as "each"

The above specifies an array of integers.

 ["hash", {"each": ["str", {"match": "^[A-Za-z0-9]+$" }]}]

The above specifies hash with alphanumeric-only values.
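The looping behaviour can be sketched in Python as follows. The element schema is modelled as a predicate, dict stands in for hash and list for array, and the helper names are hypothetical.

```python
import re


def check_each(data, element_ok):
    """Return True only if every element of data passes element_ok."""
    # For a hash (dict) the elements are its values; for an array, its items.
    elements = data.values() if isinstance(data, dict) else data
    for element in elements:
        if not element_ok(element):
            return False  # the first failing element terminates the loop
    return True


def is_alnum_str(value):
    return isinstance(value, str) and re.fullmatch(r"[A-Za-z0-9]+", value) is not None


print(check_each([1, 2, 3], lambda v: isinstance(v, int)))  # True
print(check_each({"a": "x1", "b": "no-dash"}, is_alnum_str))  # False
```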

Using the .input attribute, you can change the input.

check_each : EXPR

Priority: 50 (normal)

Category: constraint, looping

Just like each but instead of using schema, each element is tested using expression.

Using the .input attribute, you can change the input.

exists : SCHEMA

Priority: 50 (normal)

Category: constraint, looping

Test that there is at least one element of data that validates to the schema. That element is returned. Be careful: if the matching element has a value that evaluates to false, the return value will also evaluate to false.
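The caveat about false values is easy to demonstrate with a Python sketch. The schema is modelled as a predicate, the function name is hypothetical, and None is the assumed return value when nothing matches.

```python
def check_exists(data, element_ok):
    """Return the first element that passes element_ok, or None if none does."""
    for element in data:
        if element_ok(element):
            return element
    return None


found = check_exists([3, 0, 5], lambda v: v % 5 == 0)
print(found)        # 0
print(bool(found))  # False, although a matching element exists
```

A caller that needs a reliable yes/no answer should therefore compare the result with the "nothing found" value rather than test its truthiness.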

Using the .input attribute, you can change the input.

check_exists : EXPR

Priority: 50 (normal)

Category: constraint, looping

Just like exists but instead of using schema, each element is tested using expression.

Using the .input attribute, you can change the input.

postfilters : [EXPR, ...]

Priority: 90 (very low). Run after all other clauses.

Category: filter.

Run expression(s), usually to postprocess data. Data is referred to in expression by variable $_. From here on, the data will be permanently set to the postfiltered value.

ROLE: Comparable

This is the comparable type role. All types which have comparable values must implement this role. Most types implement this role, including str, all number types, etc.

Clauses

in : [ANY, ...]

Priority: 50 (normal)

Category: constraint

Require that the data be one of the specified choices.

See also: match (for type 'str'), has (for 'HasElems' types)

Examples:

 ["int", {"in": [1, 2, 3, 4, 5, 6]}] // single dice throw value
 ["str", {"!in": ["root", "admin", "administrator"]}] // forbidden usernames

is : ANY

Priority: 50 (normal)

Category: constraint

Require that the data is the same as VALUE. Will perform a numeric comparison for numeric types, or stringwise for string types, or deep comparison for deep structures.

Examples:

 ["int", {"is": 3}]
 ["int", {"is|": [1, 2, 3, 4, 5, 6]}] // effectively the same as 'in'

ROLE: HasElems

This is the role for types that have the notion of elements/length. It provides clauses like max_len, len, len_between, each, etc. It is used by array, hash, and also str.

Properties

len -> INT

Clauses

max_len : NUM

Priority: 50 (normal)

Category: constraint

Requires that the data have at most NUM elements.

Example:

 ["str", {"req": 1, "max_len": 10}] // string with at most 10 characters

min_len : NUM

Priority: 50 (normal)

Category: constraint

Requires that the data have at least NUM elements.

Example:

 ["array", {"min_len": 1}] // define an array with at least one element

len_between : [NUM_MIN, NUM_MAX]

Priority: 50 (normal)

Category: constraint

A convenience clause that combines min_len and max_len.

Example, the two schemas below are equivalent:

 ["str", {"len_between": [1, 10]}]
 ["str", {"min_len": 1, "max_len": 10}]

len : NUM

Priority: 50 (normal)

Category: constraint

Requires that the data have exactly NUM elements.

has : ANY

Priority: 50 (normal)

Category: constraint

Requires that the data contains the element.

Examples:

 // requires that array has element x
 ["array", {"has": "x"}]

 // requires that array has elements 'x', 'y', and 'z'
 ["array", {"has&": ["x", "y", "z"]}]

 // requires that array does not have element 'x'
 ["array", {"!has": "x"}]
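The three forms in these examples (has, has&, !has) can be modelled with a small Python sketch. The function name is hypothetical and only the forms shown above are handled.

```python
def check_has(data, clause_name, arg):
    """Evaluate "has", "has&", and their "!"-negated forms against data."""
    negate = clause_name.startswith("!")
    name = clause_name[1:] if negate else clause_name
    if name == "has&":
        ok = all(item in data for item in arg)  # every listed element is required
    elif name == "has":
        ok = arg in data
    else:
        raise ValueError(f"unsupported clause form: {clause_name}")
    return not ok if negate else ok


print(check_has(["x", "y"], "has", "x"))               # True
print(check_has(["x", "y"], "has&", ["x", "y", "z"]))  # False
print(check_has(["x", "y"], "!has", "x"))              # False
```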

uniq : BOOL

If set to 1, require that the array values be unique (like in a set). If set to 0, require that there are duplicates in the array.
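Both settings of uniq can be sketched in Python. The function name is hypothetical, and the sketch assumes the elements are hashable so that a set can be used.

```python
def check_uniq(values, uniq):
    """uniq true: no duplicates allowed; uniq false: duplicates required."""
    has_duplicates = len(set(values)) != len(values)
    return not has_duplicates if uniq else has_duplicates


print(check_uniq([1, 2, 3], 1))  # True
print(check_uniq([1, 2, 2], 1))  # False
print(check_uniq([1, 2, 2], 0))  # True
```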

each_index : SCHEMA

Priority: 50 (normal)

Category: constraint, looping

Like each but iterates over the indices. For a type like array, these are the integer indices 0, 1, and so on. For hash, these are the hash keys.

Using the .input attribute, you can change the input.

check_each_index : EXPR

Priority: 50 (normal)

Category: constraint, looping

Like each_index but instead of using schema, each index is tested using expression.

Using the .input attribute, you can change the input.

ROLE: Sortable

This is the type role for sortable types. It provides clauses like min, max, and between. It is used by many types, for example str, all numeric types, etc.

Clauses

min => ANY

Require that the value is not less than some specified minimum (equivalent in intention to the Perl string ge operator, or the numeric >= operator).

Example:

 [int => {min => 0}] # specify non-negative numbers

xmin => ANY

Require that the value is strictly greater than some specified minimum (equivalent in intention to the Perl string gt operator, or the numeric > operator). The x prefix is for "exclusive".

max => ANY

Require that the value is less than or equal to some specified maximum (equivalent in intention to the Perl string le operator, or the numeric <= operator).

xmax => ANY

Require that the value is less than some specified maximum (equivalent in intention to the Perl string lt operator, or the numeric < operator). The x prefix is for "exclusive".

between => [ANY_MIN, ANY_MAX]

A convenience clause that combines min and max.

Example, the following schemas are equivalent:

 [float => {between => [0.0, 1.5]}]
 [float => {min => 0.0, max => 1.5}]

xbetween => [ANY_MIN, ANY_MAX]

A convenience clause that combines xmin and xmax.
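The six Sortable clauses map directly onto comparison operators. A Python sketch follows, with a hypothetical function name and clause sets modelled as dicts. Python's comparison operators work for both numbers and strings, so one table covers both cases.

```python
import operator

# Clause name -> comparison applied as op(value, argument).
COMPARE = {
    "min": operator.ge,
    "xmin": operator.gt,
    "max": operator.le,
    "xmax": operator.lt,
}


def check_sortable(value, clauses):
    """Return True if value satisfies every Sortable clause in clauses."""
    for name, arg in clauses.items():
        if name == "between":
            ok = arg[0] <= value <= arg[1]  # inclusive on both ends
        elif name == "xbetween":
            ok = arg[0] < value < arg[1]    # exclusive on both ends
        else:
            ok = COMPARE[name](value, arg)
        if not ok:
            return False
    return True


print(check_sortable(1.5, {"between": [0.0, 1.5]}))   # True
print(check_sortable(1.5, {"xbetween": [0.0, 1.5]}))  # False
```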

TYPE: buf

buf stores binary data. Elements of buf data are bytes. It is derived from str.

TYPE: num

num stores numbers. This type assumes the Comparable and Sortable roles.

TYPE: float

float stores real (floating-point) numbers. This type is derived from num.

Clauses

is_nan => BOOL

Require that number is a "NaN" (or "-NaN").

is_inf => BOOL

Require that number is an "Inf" or "Infinity" (or "-Inf" or "-Infinity").
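In Python these two clauses could be sketched with the standard math module. The function names are hypothetical. Treating a false clause value as "must not be NaN/Inf" is an assumption made by analogy with other BOOL clauses, since this document only describes the true case.

```python
import math


def check_is_nan(number, want):
    """True clause value: number must be NaN; false value: must not be (assumed)."""
    return math.isnan(number) == bool(want)


def check_is_inf(number, want):
    """True clause value: number must be +/-Inf; false value: must not be (assumed)."""
    return math.isinf(number) == bool(want)


print(check_is_nan(float("nan"), 1))   # True
print(check_is_inf(float("-inf"), 1))  # True
print(check_is_inf(1.5, 1))            # False
```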

TYPE: int

int stores integers. This type is derived from num.

Clauses

mod => [INT1, INT2]

Require that (data mod INT1) equals INT2. For example, mod => [2, 1] effectively specifies odd numbers.

div_by => INT

Require that data is divisible by a number. This is effectively just a shortcut for mod => [INT, 0].

Example: Given schema [int=>{div_by=>2}], undef, 0, 2, 4, and 6 are valid but 1, 3, 5 are not.
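The relationship between mod and div_by can be sketched in Python. The function names are hypothetical, None stands in for undef (which passes because req is not set), and the result for negative numbers follows Python's % operator, which may differ in other target languages.

```python
def check_mod(data, divisor, remainder):
    """mod => [divisor, remainder]: require data mod divisor == remainder."""
    if data is None:
        return True  # undef passes unless "req" is set
    return data % divisor == remainder


def check_div_by(data, divisor):
    """div_by => divisor is a shortcut for mod => [divisor, 0]."""
    return check_mod(data, divisor, 0)


print([n for n in [None, 0, 1, 2, 3, 4, 5, 6] if check_div_by(n, 2)])
# [None, 0, 2, 4, 6]
```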

TYPE: str

str stores strings (text). This type assumes the Comparable, Sortable, and HasElems roles (the elements are individual characters). Default encoding is utf8.

Clauses

match => REGEX|{COMPILER=>REGEX, ...}

Require that string match the specified regular expression.

Regular expressions might not be 100% compatible from language to language. Instead of avoiding regexes entirely, you can specify a different regex for each target language, e.g.:

 [str => {match => {
   js     => '...',
   perl   => '...',
   python => '...',
 }}]

To match against multiple regexes:

 # string must match a, b, and c
 [str => {"match&"=>['a', 'b', 'c']}]

 # string must match either a or b or c
 [str => {"match|"=>['a', 'b', 'c']}]

 # string must NOT match a
 [str => {"!match"=>'a'}]

 # string must NOT match a nor b nor c (i.e. must match none of those)
 [str => {"match"=>['a', 'b', 'c'], "match.is_multi"=>1, "match.max_ok"=>0}]

 # string must at least not match a or b or c (i.e. if all match, schema fails;
 # if at least one does not match, schema succeeds)
 [str => {"match"=>['a', 'b', 'c'], "match.max_ok"=>2}]
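The effect of max_ok on a multi-valued match can be sketched in Python with the standard re module. The function names are hypothetical, and the sketch assumes max_ok simply caps the number of regexes that may match.

```python
import re


def count_matches(string, regexes):
    """Count how many of the regexes match somewhere in string."""
    return sum(1 for regex in regexes if re.search(regex, string))


def check_match_max_ok(string, regexes, max_ok):
    """Succeed if at most max_ok of the regexes match."""
    return count_matches(string, regexes) <= max_ok


print(check_match_max_ok("xyz", ["a", "b", "c"], 0))  # True: none match
print(check_match_max_ok("abc", ["a", "b", "c"], 2))  # False: all three match
print(check_match_max_ok("ab", ["a", "b", "c"], 2))   # True: one does not match
```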

is_re => BOOL

If value is true, require that the string be a valid regular expression string. If value is false, require that the string not be a valid regular expression string.

TYPE: cistr

TYPE: bool

Boolean type. This type assumes the Comparable and Sortable roles.

TYPE: array

Array type. This type assumes the Comparable, Sortable, and HasElems roles (the elements are indexed by integers starting from 0).

Clauses

elems => ARRAY_OF_SCHEMA

Specify that corresponding array element must validate to the schema. Example:

 // Rinci function result envelope
 [array => {
     elems => [
         [int => {between => [100, 999]}],
         'str',
         'any',
         'defhash',
     ]
 }]

The above schema validates a result envelope, which is an array (see Rinci::function). The first element (the status) must be a 3-digit integer. The second element (the error message) is a string. The third element (the actual result) can be anything. The fourth element (result metadata) is a hash containing extra data. Examples of valid data include:

 [404, "Not found"]
 [200, "OK", [1, 2, 3, 4], {'cmdline.pager'=>'less'}]
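Positional validation can be sketched in Python by pairing each element with the check at the same position. The element schemas are modelled as predicates and all names are hypothetical. Missing trailing elements are simply skipped, which is an assumption consistent with [404, "Not found"] being listed as valid.

```python
def check_elems(data, element_checks):
    """Validate each array element against the check at the same position."""
    # zip stops at the shorter sequence, so missing elements are not checked.
    return all(check(value) for value, check in zip(data, element_checks))


envelope_checks = [
    lambda v: isinstance(v, int) and 100 <= v <= 999,  # status
    lambda v: isinstance(v, str),                      # message
    lambda v: True,                                    # result: anything
    lambda v: isinstance(v, dict),                     # metadata
]

print(check_elems([404, "Not found"], envelope_checks))  # True
print(check_elems([200, "OK", [1, 2, 3, 4], {"cmdline.pager": "less"}],
                  envelope_checks))                      # True
print(check_elems([42, "too short a status"], envelope_checks))  # False
```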

of => SCHEMA

This is just an alias to each.

TYPE: hash

Hash (a.k.a. dictionary) type. This type assumes the Comparable, Sortable, and HasElems roles (the elements are indexed by strings).

Clauses

keys => HASH

Specify a schema for the value of each listed key. Also, restrict keys of hash to the list specified in this clause, except if allow_extra_keys clause is set to true. Example:

 ['hash*' => {
     keys => {
         name => 'str',
         address => [any => {of=>['str', [array => {of=>'str'}]]}],
         email => 'email_address',
     },
 }]

The above schema requires data to be a hash with keys name, address, email. None of the keys are required to be present (use req_keys for that), but other keys are not allowed.

allow_extra_keys => BOOL (default: 0)

If set to true, allow hash to have keys other than specified in the keys clause. See also: keys.
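How keys and allow_extra_keys work together can be sketched in Python. The per-key schemas are modelled as predicates, dict stands in for hash, and the function name is hypothetical.

```python
def check_keys(data, key_checks, allow_extra_keys=False):
    """Validate known keys and reject unknown ones unless extras are allowed."""
    for key, value in data.items():
        if key in key_checks:
            if not key_checks[key](value):
                return False
        elif not allow_extra_keys:
            return False  # key is not listed in the keys clause
    return True


key_checks = {
    "name": lambda v: isinstance(v, str),
    "email": lambda v: isinstance(v, str) and "@" in v,  # toy stand-in check
}

print(check_keys({"name": "Ann"}, key_checks))                   # True
print(check_keys({"name": "Ann", "age": 30}, key_checks))        # False
print(check_keys({"name": "Ann", "age": 30}, key_checks, True))  # True
```

Note that a listed key may be absent, as in the first call: keys does not make any key required.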

each_key => SCHEMA

Alias to each_index.

each_value => SCHEMA

Alias to each.

TYPE: any

A type to specify alternate schemas.

Clauses

of => [SCHEMA, ...]

Specify the schema(s). The value must be valid against at least one of them.

TYPE: all

A type to specify co-schemas (the value must be valid against all of the schemas).

Clauses

of => [SCHEMA, ...]

Specify the schema(s). The value must be valid against all of them.
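The difference between the any and all types comes down to how the results of the sub-schemas are combined. A Python sketch, with sub-schemas modelled as predicates and hypothetical function names:

```python
def check_any(value, checks):
    """Type "any": the value must pass at least one of the checks."""
    return any(check(value) for check in checks)


def check_all(value, checks):
    """Type "all": the value must pass every one of the checks."""
    return all(check(value) for check in checks)


def is_str(v):
    return isinstance(v, str)


def is_array_of_str(v):
    return isinstance(v, list) and all(isinstance(item, str) for item in v)


# an address given either as one string or as an array of strings
print(check_any("12 Main St", [is_str, is_array_of_str]))       # True
print(check_any(["12 Main St", "Apt 4"], [is_str, is_array_of_str]))  # True
print(check_all("12 Main St", [is_str, is_array_of_str]))       # False
```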

TYPE: obj

Object.

Properties

methods

attributes

Clauses

can

isa

TYPE: date

SEE ALSO

Sah

AUTHOR

Steven Haryanto <stevenharyanto@gmail.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2012 by Steven Haryanto.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.