NAME

Data::Sah - Schema for data structures (Perl implementation)

VERSION

version 0.05

SYNOPSIS

First, familiarize yourself with the schema syntax. Refer to Sah and Sah::Examples. Some example schemas:

 'int'                       # an optional integer
 'int*'                      # a required integer
 [int => {min=>1, max=>10}]  # an integer with some constraints

To use this module:

 use Data::Sah;
 my $sah = Data::Sah->new;

 # get compiler, e.g. 'perl'. 'js' and 'human' are also available.
 my $plc = $sah->get_compiler('perl');

 # use the compiler to generate code
 my $res = $plc->compile(
     data_name             => 'data',
     data_term             => '\%data',
     data_term_is_lvalue   => 0, # default: 1
     err_term              => '$err', # must be an lvalue term
     schema                => [hash => {req=>1, len_between => [1, 10]}],
     schema_is_normalized  => 1, # don't normalize schema because already so

     validator_return_type => 'str',
 );

See also Data::Sah::Simple.

DESCRIPTION

This module, Data::Sah, implements compilers for producing Perl and JavaScript validators, as well as human description text (English and Indonesian included) from Sah schemas. A compiler approach is used instead of an interpreter so that validation runs faster.

The generated validator code can run without this module.
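
To illustrate the compiler approach, here is a minimal sketch in plain Perl that does not use Data::Sah at all. The helper name gen_int_validator and the shape of the generated code are assumptions made for this illustration; the code that Data::Sah actually generates will differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Sah): turn the clauses of an
# 'int' schema into Perl source code for a standalone validator.
sub gen_int_validator {
    my ($clauses) = @_;
    my @checks = ('defined($data)', '$data =~ /\A[+-]?\d+\z/');
    push @checks, "\$data >= $clauses->{min}" if defined $clauses->{min};
    push @checks, "\$data <= $clauses->{max}" if defined $clauses->{max};
    return 'sub { my ($data) = @_; my $ok = '
         . join(' && ', @checks)
         . '; return $ok ? 1 : 0 }';
}

# "Compile" the schema once ...
my $src = gen_int_validator({min => 1, max => 10});

# ... then the generated source can be eval'ed (or written to a file) and
# run many times, without consulting the schema or the generator again.
my $validate = eval $src or die "cannot compile generated code: $@";

print $validate->(5)  ? "5 is valid\n"  : "5 is invalid\n";
print $validate->(11) ? "11 is valid\n" : "11 is invalid\n";
```

An interpreter would instead walk the schema on every validation call. Here the schema is walked once, and $src is ordinary Perl that carries no dependency on the generator.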

STATUS

This is an early implementation: only the Perl compiler is implemented, and only a handful of types and attributes are supported.

ATTRIBUTES

compilers => HASH

A mapping of compiler names to compiler (Data::Sah::Compiler::*) objects.

METHODS

new() => OBJ

Create a new Data::Sah instance.

$sah->get_compiler($name) => OBJ

Get a compiler object. "Data::Sah::Compiler::$name" is loaded and instantiated on first use. After that, the cached compiler object is returned.

Example:

 my $plc = $sah->get_compiler("perl"); # loads Data::Sah::Compiler::perl
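
The load-once-then-cache behavior can be sketched as follows. This is only an illustration of the pattern, not the Data::Sah source: My::Registry and My::Compiler::demo are hypothetical stand-ins, defined inline so that the sketch runs without Data::Sah installed.

```perl
use strict;
use warnings;

# Stand-in for a real compiler class such as Data::Sah::Compiler::perl.
package My::Compiler::demo;
our $instances = 0;                       # counts how often new() is called
sub new { my ($class) = @_; $instances++; return bless {}, $class }

# Stand-in for the object that owns the 'compilers' mapping.
package My::Registry;
sub new { my ($class) = @_; return bless { compilers => {} }, $class }

sub get_compiler {
    my ($self, $name) = @_;

    # Return the cached object if this compiler was requested before.
    return $self->{compilers}{$name} if $self->{compilers}{$name};

    my $module = "My::Compiler::$name";

    # Load the module from disk unless the package already exists
    # (it does in this sketch, because it is defined inline above).
    unless ($module->can('new')) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        require $file;
    }

    # Instantiate once and remember the object for later calls.
    return $self->{compilers}{$name} = $module->new;
}

package main;

my $reg   = My::Registry->new;
my $first = $reg->get_compiler('demo');
my $again = $reg->get_compiler('demo');
print "same object returned\n" if $first == $again;
```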

$sah->normalize_schema($schema) => HASH

Normalize a schema, e.g. change int* into [int => {req=>1}], and do some sanity checks on it. Returns the normalized schema on success, or dies on error.

Can also be used as a function.
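
As a rough illustration of what normalization involves, the sketch below handles only the shorthand forms shown in this document. sketch_normalize is a hypothetical helper, not the actual normalize_schema() implementation, and the array it returns follows the [int => {req=>1}] example above; the real method may perform more checks and return a different structure.

```perl
use strict;
use warnings;

# Hypothetical, much-simplified normalizer (not the Data::Sah code).
# Accepts 'type', 'type*', or [type => {clauses}].
sub sketch_normalize {
    my ($schema) = @_;
    my ($type, $clauses);

    if (!ref $schema) {
        ($type, $clauses) = ($schema, {});
    }
    elsif (ref $schema eq 'ARRAY') {
        ($type, $clauses) = @$schema;
        die "clauses must be a hash\n"
            if defined $clauses && ref $clauses ne 'HASH';
        $clauses = { %{ $clauses || {} } };   # shallow copy; input is untouched
    }
    else {
        die "schema must be a string or an array\n";
    }

    die "type must be a non-empty string\n"
        unless defined $type && !ref $type && length $type;

    # A trailing '*' is shorthand for the clause req => 1.
    $clauses->{req} = 1 if $type =~ s/\*\z//;

    # Sanity check on what is left of the type name.
    die "invalid type name '$type'\n" unless $type =~ /\A\w+\z/;

    return [$type, $clauses];
}

my $n1 = sketch_normalize('int*');
my $n2 = sketch_normalize([int => {min => 1, max => 10}]);
print "$n1->[0] req=$n1->[1]{req}\n";
```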

$sah->normalize_var($var) => STR

Normalize a variable name in expression into its fully qualified/absolute form.

Not yet implemented (pending specification).

For example:

 [int => {min => 10, 'max=' => '2*$min'}]

$min in the above expression will be normalized as schema:clauses.min.

$sah->compile($compiler_name, %compiler_args) => STR

Basically just a shortcut for calling get_compiler() and passing %compiler_args to that compiler. Returns the generated code.

$sah->perl(%args) => STR

Shortcut for $sah->compile('perl', %args).

$sah->human(%args) => STR

Shortcut for $sah->compile('human', %args).

$sah->js(%args) => STR

Shortcut for $sah->compile('js', %args).

MODULE ORGANIZATION

Data::Sah::Type::* roles specify Sah types, e.g. Data::Sah::Type::bool specifies the bool type.

Data::Sah::FuncSet::* roles specify bundles of functions, e.g. Data::Sah::FuncSet::Core specifies the core/standard functions.

The Data::Sah::Compiler::$LANG:: namespace is for compilers. Each compiler (if derived from BaseCompiler) might further contain ::TH::* and ::FSH::* to implement appropriate functionalities, e.g. Data::Sah::Compiler::perl::TH::bool is the 'bool' type handler for the Perl compiler and Data::Sah::Compiler::perl::FSH::Core is the funcset 'Core' handler for the Perl compiler.
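
The naming scheme for handlers can be expressed as two small helpers. Both function names are hypothetical and exist only for this illustration; they build package names according to the layout described above and do not load or verify any module.

```perl
use strict;
use warnings;

# Hypothetical helper: package name of a type handler (TH).
sub type_handler_package {
    my ($compiler, $type) = @_;
    return "Data::Sah::Compiler::${compiler}::TH::${type}";
}

# Hypothetical helper: package name of a function set handler (FSH).
sub funcset_handler_package {
    my ($compiler, $funcset) = @_;
    return "Data::Sah::Compiler::${compiler}::FSH::${funcset}";
}

print type_handler_package('perl', 'bool'), "\n";
print funcset_handler_package('perl', 'Core'), "\n";
```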

The Data::Sah::Lang::$LANGCODE::* namespace is reserved for modules that contain translations. Language submodules follow the organization of other modules, e.g. Data::Sah::Lang::en_US::Type::int, Data::Sah::Lang::id_ID::FuncSet::Core, etc.

Data::Sah::Schema:: namespace is reserved for modules that contain bundles of schemas. For example, Data::Sah::Schema::CPANMeta contains the schema to validate CPAN META.yml. Data::Sah::Schema::Sah contains the schema for Sah schema itself.

The Data::Sah::TypeX::$TYPENAME::$CLAUSENAME namespace can be used to name distributions that extend an existing Sah type by introducing a new clause for it. Such a distribution must contain, at a minimum, perl, js, and human compiler implementations for the clause, as well as English translations. For example, Data::Sah::TypeX::int::is_prime is a distribution that adds the is_prime clause to the int type. It will contain the following packages inside: Data::Sah::Type::int, Data::Sah::Compiler::{perl,human,js}::TH::int. Implementations for other compilers can be packaged under Data::Sah::Compiler::$COMPILERNAME::TypeX::$TYPENAME::$CLAUSENAME, e.g. the Data::Sah::Compiler::python::TypeX::int::is_prime distribution. Translations can be put in Data::Sah::Lang::$LANGCODE::TypeX::int::is_prime.

FAQ

Relation to Data::Schema?

Data::Schema is the old incarnation of this module, deprecated since 2011.

There are enough incompatibilities between the two (some different syntax, renamed clauses) to warrant a new name. Some terminology has also changed, e.g. "attribute" became "clause" and "suffix" became "attribute".

Compared to Data::Schema, Sah always compiles schemas, and there is much greater flexibility in code generation (can generate data term, can generate code to validate multiple schemas, etc). There is no longer a hash form; a schema is either a string or an array. Some clauses have been renamed (mostly, commonly used clauses are abbreviated, in the spirit of Huffman coding), some removed (usually because they are replaced by a more general solution), and new ones have been added.

If you use Data::Schema, I recommend you migrate to Data::Sah as I will not be developing Data::Schema anymore. Sorry, there's currently no tool to convert your Data::Schema schemas to Sah, but it should be relatively straightforward. I recommend that you look into Data::Sah::Simple.

Comparison to {JSON::Schema, Data::Rx, Data::FormValidator, ...}?

See Sah::FAQ.

Can I generate another schema dynamically from within the schema?

For example:

 // if first element is an integer, require the array to contain only integers,
 // otherwise require the array to contain only strings.
 ["array", {"min_len": 1, "of=": "[is_int($_[0]) ? 'int':'str']"}]

Currently no. Data::Sah does not support expressions in clauses that contain other schemas; in other words, dynamically generated schemas are not supported. To support this, if the generated code needs to run independently of Data::Sah, it would need to contain the compiler code itself (or an interpreter) to compile or evaluate the generated schema.

However, an eval_schema() Sah function that uses Data::Sah and targets the Perl compiler can be trivially declared.

SEE ALSO

Alternatives to Sah

Moose has a type system. MooseX::Params::Validate, among others, can validate method parameters based on this.

Some other data validation and data schema modules on CPAN: Data::FormValidator, Params::Validate, Data::Rx, Kwalify, Data::Verifier, Data::Validator, JSON::Schema, Validation::Class.

AUTHOR

Steven Haryanto <stevenharyanto@gmail.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2012 by Steven Haryanto.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.