NAME

Data::Sah::Filter - Filtering for Data::Sah

VERSION

This document describes version 0.025 of Data::Sah::Filter (from Perl distribution Data-Sah-Filter), released on 2024-07-17.

SYNOPSIS

use Data::Sah::Filter qw(gen_filter);

# a utility routine: gen_filter
my $c = gen_filter(
    filter_names       => ['Str::ltrim', 'Str::rtrim'],
);

my $val = $c->("foo");        # unchanged, "foo"
$val    = $c->(" foo ");      # "foo"

Another example:

my $c = gen_filter(
    filter_names       => [ ['Str::remove_comment' => {style=>'shell'}] ],
    #filter_names      => ['Str::remove_comment=style,shell'], # same as above
);
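
The commented-out line above shows a string shorthand for a filter that takes arguments. Below is a minimal sketch of how such a string could be expanded into the array form, assuming the shorthand is NAME=key1,value1,key2,value2. The helper parse_filter_spec is hypothetical and not part of Data::Sah::Filter; the module's actual parsing may differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Sah::Filter): expand the string
# shorthand "NAME=k1,v1,k2,v2" into the [NAME, {k1=>v1, k2=>v2}] form.
sub parse_filter_spec {
    my ($spec) = @_;
    return $spec if ref $spec;                 # already in array form
    my ($name, $rest) = split /=/, $spec, 2;
    return [$name, {}] unless defined $rest;   # no arguments given
    my %args = split /,/, $rest;               # assumes an even number of items
    return [$name, \%args];
}

my $parsed = parse_filter_spec('Str::remove_comment=style,shell');
printf "%s style=%s\n", $parsed->[0], $parsed->[1]{style};
```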

DESCRIPTION

This distribution contains a standard set of filter rules for Data::Sah (to be used in prefilters and postfilters clauses). It is separated from the Data-Sah distribution and can be used independently.

A filter rule is put in a Data::Sah::Filter::$COMPILER::$CATEGORY::$DESCRIPTION module, for example Data::Sah::Filter::perl::Str::trim, which trims whitespace at the beginning and end of a string.
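
As an illustration of the naming scheme, the sketch below maps a compiler name and a short filter name (CATEGORY::DESCRIPTION) to the corresponding rule module name. The helper filter_module_name is hypothetical and exists only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: build the rule module name from the compiler name
# and the short filter name ("CATEGORY::DESCRIPTION").
sub filter_module_name {
    my ($compiler, $filter_name) = @_;
    return "Data::Sah::Filter::${compiler}::${filter_name}";
}

print filter_module_name('perl', 'Str::trim'), "\n";
print filter_module_name('js',   'Str::trim'), "\n";
```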

Basically, a filter rule provides an expression (in expr_filter) in the target language (e.g. Perl, JavaScript, or others) that converts one data value to another. Multiple filter rules can be combined to form the final filtering code. This code can be used by Data::Sah when generating validator code from a Sah schema, or it can be used directly. One project that uses filter rules directly is App::orgadb, which lets users specify filters on the command line.

meta()

The filter rule module must contain a meta subroutine that returns a hashref (DefHash) with the following keys (* marks a required key):

  • v* => int (default: 1)

    Metadata specification version. From DefHash. Currently at 1.

  • summary => str

    From DefHash.

  • might_fail => bool

    Whether filtering might fail, e.g. because of invalid input. If set to 1, the expr_filter key that the filter() routine returns must be an expression that returns an array (envelope) of (error_msg, data) instead of just the filtered data. The error message should be a string that explains why filtering failed. If filtering succeeds, the error message should be undefined.

    This is used for filter rules that act as data checkers.

  • args => hash

    The arguments that this filter accepts, in the form of a hash whose keys are argument names and whose values are argument specifications. An argument specification is a DefHash similar to the argument specification for functions in the Rinci::function specification.
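
A minimal sketch of a meta subroutine for a checker-style rule follows. The module name example_check_max_len and its max_len argument are hypothetical and not part of this distribution. The sketch also assumes that the (error_msg, data) envelope is represented as an array reference; the $check coderef only illustrates the envelope's shape and is not generated filter code.

```perl
use strict;
use warnings;

# Hypothetical checker rule, for illustration only.
package Data::Sah::Filter::perl::Str::example_check_max_len;

sub meta {
    return +{
        v          => 1,
        summary    => 'Fail if the string is longer than max_len',
        might_fail => 1,   # expr_filter must return an (error_msg, data) envelope
        args       => {
            max_len => { schema => 'int*', req => 1 },
        },
    };
}

package main;

my $meta = Data::Sah::Filter::perl::Str::example_check_max_len::meta();

# Assumption: the envelope is an array reference [error_msg, data].
my $check = sub {
    my ($data, $max_len) = @_;
    length($data) > $max_len ? ["String too long", $data] : [undef, $data];
};

for my $input ('abc', 'abcdefgh') {
    my ($err, $out) = @{ $check->($input, 5) };
    print defined $err ? "$input: $err\n" : "$input: ok\n";
}
```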

filter()

The filter rule module must also contain a filter subroutine that generates the code for filtering. The subroutine must accept a hash of arguments and will be passed these:

  • data_term => str

  • args => hash

    The arguments for the filter. Hash keys are the argument names and hash values are the arguments' values.

The filter subroutine must return a hashref with the following keys (* indicates required keys):

  • expr_filter* => str

    Expression in the target language to actually convert data.

  • modules => hash

    A list of modules required by the expression, where hash keys are module names and hash values are modules' minimum versions.

Basically, the filter subroutine must generate code that accepts non-undef data and converts it to the desired value.
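
A minimal sketch of a filter subroutine, under stated assumptions: the module name example_add_prefix and its prefix argument are hypothetical, and data_term is taken to be a string naming the variable that holds the data (for example '$data'). The last lines compile the returned expression by hand only to show that it is valid Perl.

```perl
use strict;
use warnings;

# Hypothetical rule module, for illustration only.
package Data::Sah::Filter::perl::Str::example_add_prefix;

sub filter {
    my %fargs  = @_;
    my $dt     = $fargs{data_term};           # e.g. '$data'
    my $prefix = $fargs{args}{prefix} // '';
    # Escape backslashes and single quotes so the prefix stays a valid literal.
    my $literal = "'" . ($prefix =~ s/([\\'])/\\$1/gr) . "'";
    return {
        expr_filter => "$literal . $dt",
        modules     => {},                    # no extra modules needed
    };
}

package main;

my $res = Data::Sah::Filter::perl::Str::example_add_prefix::filter(
    data_term => '$data',
    args      => { prefix => 'x-' },
);

# Wrap the expression in a sub to try it out.
my $code = eval "sub { my \$data = shift; $res->{expr_filter} }" or die $@;
print $code->('foo'), "\n";
```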

A program or library that uses Data::Sah::Filter can collect rules from the rule modules and then compose them into the final code, something like (in pseudo-Perl code):

if (!defined $data) {
  return undef;
} else {
  $data = expr-filter-from-rule1($data);
  $data = expr-filter-from-rule2($data);
  ...
  return $data;
}
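
The pseudo-code above can be turned into a small runnable sketch. The compose_filters function and the two inline rules are hypothetical stand-ins: each rule is a coderef that takes a data term and returns an expression string, mimicking the expr_filter that a rule's filter subroutine would return. This is not how Data::Sah::Filter itself is implemented, and rules with might_fail set are not handled here.

```perl
use strict;
use warnings;

# Hypothetical composer: chain the expressions from several rules into one
# subroutine that passes undef through and filters everything else.
sub compose_filters {
    my @rules = @_;
    my $body = join '', map { '    $data = ' . $_->('$data') . ";\n" } @rules;
    my $src  = "sub {\n    my \$data = shift;\n"
             . "    return undef unless defined \$data;\n"
             . $body
             . "    return \$data;\n}";
    my $code = eval $src or die "Cannot compile filter: $@";
    return $code;
}

# Two stand-in rules: strip leading whitespace, then uppercase.
my $ltrim = sub { my $dt = shift; "do { (my \$t = $dt) =~ s/\\A\\s+//; \$t }" };
my $uc    = sub { my $dt = shift; "uc($dt)" };

my $filter = compose_filters($ltrim, $uc);
print $filter->("  foo"), "\n";
```

Generating source text and compiling it once keeps the per-call cost low, at the price of having to escape anything interpolated into the generated code.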

Filter modules included in this distribution

1. Data::Sah::Filter::js::Str::downcase
2. Data::Sah::Filter::js::Str::lc
3. Data::Sah::Filter::js::Str::lcfirst
4. Data::Sah::Filter::js::Str::lowercase
5. Data::Sah::Filter::js::Str::ltrim
6. Data::Sah::Filter::js::Str::rtrim
7. Data::Sah::Filter::js::Str::trim
8. Data::Sah::Filter::js::Str::uc
9. Data::Sah::Filter::js::Str::ucfirst
10. Data::Sah::Filter::js::Str::upcase
11. Data::Sah::Filter::js::Str::uppercase
12. Data::Sah::Filter::perl::Array::check_uniq
13. Data::Sah::Filter::perl::Array::check_uniqnum
14. Data::Sah::Filter::perl::Array::check_uniqstr
15. Data::Sah::Filter::perl::Array::remove_undef
16. Data::Sah::Filter::perl::Array::uniq
17. Data::Sah::Filter::perl::Array::uniqnum
18. Data::Sah::Filter::perl::Array::uniqstr
19. Data::Sah::Filter::perl::Float::ceil
20. Data::Sah::Filter::perl::Float::check_has_fraction
21. Data::Sah::Filter::perl::Float::check_int
22. Data::Sah::Filter::perl::Float::floor
23. Data::Sah::Filter::perl::Float::round
24. Data::Sah::Filter::perl::Str::check
25. Data::Sah::Filter::perl::Str::check_lowercase
26. Data::Sah::Filter::perl::Str::check_oneline
27. Data::Sah::Filter::perl::Str::check_uppercase
28. Data::Sah::Filter::perl::Str::downcase
29. Data::Sah::Filter::perl::Str::ensure_trailing_newline
30. Data::Sah::Filter::perl::Str::lc
31. Data::Sah::Filter::perl::Str::lcfirst
32. Data::Sah::Filter::perl::Str::lowercase
33. Data::Sah::Filter::perl::Str::ltrim
34. Data::Sah::Filter::perl::Str::oneline
35. Data::Sah::Filter::perl::Str::remove_comment
36. Data::Sah::Filter::perl::Str::remove_non_latin_alphanum
37. Data::Sah::Filter::perl::Str::remove_nondigit
38. Data::Sah::Filter::perl::Str::remove_whitespace
39. Data::Sah::Filter::perl::Str::replace_map
40. Data::Sah::Filter::perl::Str::rtrim
41. Data::Sah::Filter::perl::Str::trim
42. Data::Sah::Filter::perl::Str::try_center
43. Data::Sah::Filter::perl::Str::uc
44. Data::Sah::Filter::perl::Str::ucfirst
45. Data::Sah::Filter::perl::Str::underscore_non_latin_alphanum
46. Data::Sah::Filter::perl::Str::underscore_non_latin_alphanums
47. Data::Sah::Filter::perl::Str::upcase
48. Data::Sah::Filter::perl::Str::uppercase
49. Data::Sah::Filter::perl::Str::wrap

VARIABLES

$Log_Filter_Code => bool (default: from ENV or 0)

If set to true, the generated filter code is logged (currently using Log::ger at trace level). To see the log messages, e.g. on the screen, you can use something like:

% TRACE=1 perl -MLog::ger::LevelFromEnv -MLog::ger::Output=Screen \
    -MData::Sah::Filter=gen_filter -E'my $c = gen_filter(...)'

FUNCTIONS

gen_filter

Usage:

gen_filter(%args) -> any

Generate filter code.

This is mostly for testing. Normally the filter rules will be used from Data::Sah.

This function is not exported by default, but is exportable.

Arguments ('*' denotes required arguments):

  • filter_names* => array[str]

    (No description)

  • return_type => str (default: "val")

    (No description)

Return value: (any)

ENVIRONMENT

LOG_SAH_FILTER_CODE => bool

Set default for $Log_Filter_Code.

HOMEPAGE

Please visit the project's homepage at https://metacpan.org/release/Data-Sah-Filter.

SOURCE

Source repository is at https://github.com/perlancar/perl-Data-Sah-Filter.

SEE ALSO

Data::Sah

Data::Sah::FilterJS

App::SahUtils, including filter-with-sah to conveniently test filters from the command line.

Data::Sah::Coerce. Filtering works very similarly to coercion in the Data::Sah framework (see Data::Sah::Coerce) but is simpler and is composed differently to form the final filtering code. Mainly, input data will be passed to all filtering expressions.

AUTHOR

perlancar <perlancar@cpan.org>

CONTRIBUTING

To contribute, you can send patches by email/via RT, or send pull requests on GitHub.

Most of the time, you don't need to build the distribution yourself. You can simply modify the code, then test via:

% prove -l

If you want to build the distribution (e.g. to try to install it locally on your system), you can install Dist::Zilla, Dist::Zilla::PluginBundle::Author::PERLANCAR, Pod::Weaver::PluginBundle::Author::PERLANCAR, and sometimes one or two other Dist::Zilla- and/or Pod::Weaver plugins. Any additional steps required beyond that are considered a bug and can be reported to me.

COPYRIGHT AND LICENSE

This software is copyright (c) 2024, 2023, 2022, 2020 by perlancar <perlancar@cpan.org>.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.

BUGS

Please report any bugs or feature requests on the bugtracker website https://rt.cpan.org/Public/Dist/Display.html?Name=Data-Sah-Filter

When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.