The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

JsonSQL::Validator - JSON schema validation module. Returns a JsonSQL::Validator object for validating a JSON string against a pre-defined schema.

VERSION

version 0.4

SYNOPSIS

This is a supporting module used by JsonSQL::Query modules.

To use this:

    my $validator = JsonSQL::Validator->new(<json_schema_name>, <whitelisting_rule_set>);
    my $perldata = $validator->validate_schema(<json_string>);
    if ( eval { $perldata->is_error } ) {
        return "$perldata->{message}";
    } else {
        ...
    }

To use the whitelisting module:

    my $table_rules = $validator->check_table_allowed({ schema => <schemaname>, table => <tablename> });
    if ( eval { $table_rules->is_error } ) {
        return "Use of table failed access check: $table_rules->{message}";
    } else {
        my $allowedField = $validator->check_field_allowed($table_rules, <fieldname>);
        if ( eval { $allowedField->is_error } ) {
            return "Use of field in table failed access check: $allowedField->{message}";
        } else {
            ...
        }
    }

For more information on the whitelisting module, and how to construct rule sets, see documentation below.

METHODS

Constructor new($jsonSchema, $ruleSets) -> JsonSQL::Validator

Loads the specified $jsonSchema and creates a JSON::Validator instance with it. A reference to the validator and to the provided whitelisting rule sets is saved in the object before it is returned. If an error occurs during schema loading, a JsonSQL::Error object is returned.

    $jsonSchema     => The JSON schema to load. Must be present in JsonSQL::Schemas as a subclass of JsonSQL::Schemas::Schema.
    $ruleSets       => An array of whitelisting rules to be applied when a JsonSQL query object is being constructed (see below).

ObjectMethod validate_schema($json) -> \%hash

Parses the provided JSON string into a Perl data structure, and then uses the stored JSON::Validator to validate it against the specified schema (a JsonSQL::Schemas::<schema>). If the process fails at any step, it will return a JsonSQL::Error object with an appropriate error message. If successful, a Perl data structure (depends on the schema, but usually a hashref) representing the SQL query is returned.

    $json        => The JSON string to validate. Must be valid JSON.

PrivateMethod _getRuleSets($schemaString) -> \@array

Searches the @ruleSets array for the specified $schemaString and returns all matching rule sets.

    $schemaString       => Name of schema to match for identifying rulesets.

    Matches a rule set if ( $schemaString eq $ruleSet->{schema} || $ruleSet->{schema} eq '#anySchema' ).

PrivateMethod _getTableRules($tableString, $ruleSet) -> \@array

Looks at each table rule in the \@ruleSet array and returns it if it matches the specified $tableString.

    $tableString       => Name of table to match for identifying table rules.
    $ruleSet           => The @\ruleSet array to search for table matches.

    Matches table rules of the form ( $ruleSet->{$tableString} || $ruleSet->{'#anyTable'} ).

ObjectMethod check_table_allowed($tableObj) -> \@array

Determines whether access to a table is allowed by the current stored rule set. If yes, a set of table rules applicable to the table is returned to use for column verification. If no, a JsonSQL::Error object is returned.

    $tableObj           => Name of table to match for identifying table rules.
    Takes the form { schema => <schemaname>, table => <tablename> }

ObjectMethod check_field_allowed($table_rules, $field) -> 1 || JsonSQL::Error

Determines whether access to a column is allowed by the supplied table rules. If yes, a true value is returned. If no, a JsonSQL::Error object is returned.

    $table_rules           => Array of table rules as returned by check_table_allowed.
    $field                 => The name of the field to check.

Whitelisting Module

To provide some basic whitelisting support for table and column identifiers, a set of whitelisting rules is saved in the JsonSQL::Validator object when it is being created. The rules take the form of an \@arrayref as follows:

        [
            {
                schema => 'schemaName' || '#anySchema',
                <'#anyTable' || allowedTableName1 => [ '#anyColumn' || allowedFieldName1, allowedFieldName2, ... ]>,
                <... additional table rules ...>
            },
            < ... additional rule sets ... >
        ]

Rule sets are generally grouped by schema. If you are not using schemas (or you are using a DB that doesn't support them), you will have to provide a rule set with the schema property set to '#anySchema'. Whitelisting security is enabled and restrictive by default, so at least one rule set will have to be defined in order to create JsonSQL query objects. If you want to disable whitelisting security (not recommended), use this rule set,

    [ { schema = '#anySchema', '#anyTable' } ]

The above allows access to all tables in any schema. Column restrictions are not meaningful without table restrictions, so table rules have to be defined if you want column restrictions. You can have more than one rule set per schema, but in this case the most restrictive rule set will be the one that takes precedent. This behavior can be used as an effective way to disable access to specific schemas. For example,

    [
        { schema => '#anySchema', '#anyTable' },
        { schema => 'forbiddenSchema' }
    ]

will first allow access to all tables in any schema, and then restrict access to any table in 'forbiddenSchema'. Table verification and column verification take place in separate steps. During table verification, rule sets are selected based on the schema property. The remaining keys in each rule set %hash correspond to tables that the query object is allowed access to.

If there is a key in the rule set with the special name '#anyTable', access to any table in that rule set (schema) will be allowed. For other keys (table names), the value needs to be set to an array of column names. During column verification, this list will be used to determine whether the query object has access to particular columns in the table.

As with schemas, access to a table can be governed by more than one 'table rule'. In this case, the most restrictive rule is the one that takes precedent. For example,

    [
        { schema => 'allowedSchema', '#anyTable', 'allowedTable' => [ 'allowedColumnA', 'allowedColumnB' ] }
    ]

will allow access to all columns of all tables in the schema 'allowedSchema', but for the table 'allowedTable', only access to columns 'allowedColumnA' and 'allowedColumnB' is allowed. Similarly,

    [
        { schema => 'allowedSchema', '#anyTable', 'forbiddenTable' => [] }
    ]

will allow access to all columns of all tables in the schema 'allowedSchema', but block access to the table 'forbiddenTable'. (Technically, it is only blocking access to the columns in that table, but this is effectively the same thing for most SQL operations).

If the column list contains the special string '#anyColumn' access to all columns in the table will be allowed. So,

    [
        { schema => 'allowedSchema', 'allowedTable1' => [ '#anyColumn' ], 'allowedTable2' => [ 'allowedColumn1' ] }
    ]

will allow access to any column in 'allowedTable1' and only column 'allowedColumn1' of 'allowedTable2'. Access to all other tables in 'allowedSchema' will be blocked.

This module is designed to err on the side of caution, and in so doing will always take the more restrictive course of action in the case of ambiguity. As such, many SQL queries will probably fail validation if you don't use fully-qualified table and column identifiers, which is generally recommended as good practice when writing SQL queries anyway. However, if you are writing simple queries and find this to be annoying, you can turn off whitelisting and rely only on database-level security.

It is important to note that while this module aims to reduce the attack surface, it is NOT a replacement for database-level security. But when combined with good database-level security (ex: per-user schemas and Kerberos), it provides for reasonably safe SQL query generation using data from untrusted sources (ex: web browsers).

AUTHOR

Chris Hoefler <bhoefler@draper.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2017 by Chris Hoefler.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.