NAME

Data::Filter - filter data structures with structured filters.

SYNOPSIS

  use Data::Filter;

  my %dataSet = (
    0 => {
      name   => 'Data::Filter',
      author => 'Matt Wilson',
    },
    1 => {
      name   => 'Pod::XML',
      author => 'Matt Wilson',
    },
    # ... etc.
  );

  my @filter = (
    Data::Filter::OP_AND,
    [
      're',
      'name',
      '^Pod',
    ],
    [
      're',
      'name',
      'XML$',
    ],
  );

  my %result = %{ filterData ( \%dataSet, \@filter ) };

DESCRIPTION

Real-world data is rarely structured as a hash keyed by record number, as in the synopsis above. However, I decided that this was the easiest way to guarantee that recursive filters never confuse one record with another, as each record has its own unique key. If, as is more likely, your data set is in an array format, like so:

  my @dataSet = (
    {
      name   => 'Data::Filter',
      author => 'Matt Wilson',
    },
    {
      name   => 'Pod::XML',
      author => 'Matt Wilson',
    },
    # ... etc.
  );

then a helper function is provided to convert your array into the required hash reference form:

  my %dataSet = %{ arrayToHash ( \@dataSet ) };

Here arrayToHash returns a hash reference, which the example dereferences into %dataSet.

Similarly, the filterData subroutine returns a hash reference in the same form as the provided data set (a hash reference, rather than an array). A utility subroutine, hashToArray, is therefore provided to convert the result back into an array.
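To make the two representations concrete, here is a minimal sketch of the conversion in both directions. It does not use the module's own code: array_to_hash_sketch and hash_to_array_sketch are hypothetical stand-ins, and the sketch assumes that records are keyed by their array index, as in the examples above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for arrayToHash: key each record by its array index.
sub array_to_hash_sketch {
    my ($records) = @_;
    my %keyed;
    @keyed{ 0 .. $#{$records} } = @{$records};
    return \%keyed;
}

# Hypothetical stand-in for hashToArray: return the records ordered by
# their numeric key.
sub hash_to_array_sketch {
    my ($keyed) = @_;
    return [ map { $keyed->{$_} } sort { $a <=> $b } keys %{$keyed} ];
}

my @dataSet = (
    { name => 'Data::Filter', author => 'Matt Wilson' },
    { name => 'Pod::XML',     author => 'Matt Wilson' },
);

my $asHash  = array_to_hash_sketch( \@dataSet );
my $asArray = hash_to_array_sketch($asHash);

# Show which key each record ended up under.
for my $key ( sort { $a <=> $b } keys %{$asHash} ) {
    print "$key => $asHash->{$key}{name}\n";
}
```

Sorting the keys numerically on the way back matters once a data set has ten or more records, since a plain string sort would place key 10 before key 2.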

Next, let's take a look at the format of the filtering array, as that's fairly important if you'd like to create any meaningful results!

A filter is of the form:

  [ op, column, value, ( value2, value3, ... ), ]

or, more complex:

  [ OP_AND, [ (see above), ], [ ], ],

or, possibly:

  [ OP_AND, [ OP_NOT, [ OP_EQ, column, value, ], ], [ # ... ], ]
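The nesting rules above can be illustrated with a small recursive evaluator. This is only a sketch of the idea, not the module's implementation: the operator names (SK_AND, SK_OR, SK_NOT, 'eq', 're'), the %leafOps table and the record_passes subroutine are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical operator names; the real module exposes constants such as
# Data::Filter::OP_AND, whose values are not shown in this document.
use constant { SK_AND => 'and', SK_OR => 'or', SK_NOT => 'not' };

# Leaf operators: each receives the record and [ column, value ].
my %leafOps = (
    eq => sub {
        my ( $rec, $args ) = @_;
        my ( $column, $value ) = @{$args};
        return $rec->{$column} eq $value;
    },
    re => sub {
        my ( $rec, $args ) = @_;
        my ( $column, $pattern ) = @{$args};
        return $rec->{$column} =~ /$pattern/;
    },
);

# Return 1 if a single record passes the (possibly nested) filter, else 0.
sub record_passes {
    my ( $rec, $filter ) = @_;
    my ( $op, @rest ) = @{$filter};

    if ( $op eq SK_AND ) {
        for my $sub (@rest) { return 0 unless record_passes( $rec, $sub ) }
        return 1;
    }
    if ( $op eq SK_OR ) {
        for my $sub (@rest) { return 1 if record_passes( $rec, $sub ) }
        return 0;
    }
    if ( $op eq SK_NOT ) {
        return record_passes( $rec, $rest[0] ) ? 0 : 1;
    }

    my $code = $leafOps{$op} or die "Unknown operator '$op'";
    return $code->( $rec, \@rest ) ? 1 : 0;
}

my %dataSet = (
    0 => { name => 'Data::Filter', author => 'Matt Wilson' },
    1 => { name => 'Pod::XML',     author => 'Matt Wilson' },
);

# name matches /^Pod/ AND NOT ( name eq 'Data::Filter' )
my $filter = [
    SK_AND,
    [ 're', 'name', '^Pod' ],
    [ SK_NOT, [ 'eq', 'name', 'Data::Filter' ] ],
];

# Keep the original keys so that surviving records remain identifiable.
my %result = map { ( $_ => $dataSet{$_} ) }
    grep { record_passes( $dataSet{$_}, $filter ) } keys %dataSet;

print join( ', ', sort keys %result ), "\n";
```

Evaluating one record at a time and keeping its original key is what lets the result stay in the same hash form as the input, which is the property the description above relies on.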

CREATING OPERATORS

It's possible to create your own operator functions (such as the "equals" operator). To do this, add a new entry to the %Data::Filter::Filters hash, where the key is the name of the operator and the value is a code reference to the function to call. For instance, the "equals" operator is registered like so:

  $Data::Filter::Filters { 'eq' } = \&_filterEqual;

The subroutine takes two parameters: a hash reference representing the entry being checked, and an array reference holding the arguments of the filter being executed. It returns true if the entry passes the filter and false otherwise. For example, the _filterEqual subroutine looks like so:

  sub _filterEqual
  {
    my ( $data, $filters ) = @_;

    return $data->{ $filters->[ 0 ] } eq $filters->[ 1 ];
  }

Where the $filters array reference contains the elements [ column, value ].
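As a further illustration, the following sketch registers a custom operator using the same calling convention. The operator name 'startsWith' and the subroutine _filterStartsWith are hypothetical and are not part of the module. In real code you would load Data::Filter first; the sketch only touches the package hash, so it can be tried on its own.

```perl
use strict;
use warnings;

# Hypothetical operator: true when the column's value starts with the
# given prefix. $filters holds [ column, value ], as described above.
sub _filterStartsWith
{
    my ( $data, $filters ) = @_;
    my ( $column, $prefix ) = @{ $filters };

    # Treat a missing column as a non-match rather than warning on undef.
    return 0 unless defined $data->{ $column };

    return index( $data->{ $column }, $prefix ) == 0 ? 1 : 0;
}

# Register it under a name of our choosing ('startsWith' is an assumption,
# not an operator shipped with the module).
$Data::Filter::Filters{ 'startsWith' } = \&_filterStartsWith;

# Look the operator up again and call it directly on one record.
my $op      = $Data::Filter::Filters{ 'startsWith' };
my $record  = { name => 'Pod::XML', author => 'Matt Wilson' };
my $matched = $op->( $record, [ 'name', 'Pod' ] );

my $verdict = $matched ? 'match' : 'no match';
print "$verdict\n";
```

Once registered, such an operator would presumably be used in a filter as [ 'startsWith', 'name', 'Pod' ], following the [ op, column, value ] form shown earlier.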

METHODS

\%filteredData = filterData(\%dataSet,\@filter)

Perform the actual filtering work using the filter described by @filter on the hash %dataSet. More information can be found in the description section of this POD.

\@data = hashToArray(\%data)

Convert an internal data representation along the lines of:

  %data = (
    0 => {
      # column => value pairs
    },
    1 => { 
      # column => value pairs
    },
  )

to an array equivalent:

  @data = (
    {
      # column => value pairs
    },
    {
      # column => value pairs
    },
  )

\%data = arrayToHash(\@data)

This subroutine does the reverse of the hashToArray subroutine described above: it converts an array of records into the internal hash representation.

AUTHOR

Matt Wilson <matt AT mattsscripts DOT co DOT uk>

LICENSE

This is free software; you may use it and distribute it under the same terms as Perl itself.