NAME

Parse::BooleanLogic - parser of boolean expressions

SYNOPSIS

use Parse::BooleanLogic;
use Data::Dumper;

my $parser = Parse::BooleanLogic->new( operators => ['', 'OR'] );
my $tree = $parser->as_array( 'label:parser subject:"boolean logic"' );
print Dumper($tree);

$parser = new Parse::BooleanLogic;
$tree = $parser->as_array( 'x = 10' );
print Dumper($tree);

$tree = $parser->as_array( 'x = 10 OR (x > 20 AND x < 30)' );
print Dumper($tree);

# custom parsing using callbacks
$parser->parse(
    string   => 'x = 10 OR (x > 20 AND x < 30)',
    callback => {
        open_paren   => sub { ... },
        operator     => sub { ... },
        operand      => sub { ... },
        close_paren  => sub { ... },
        error        => sub { ... },
    },
);

DESCRIPTION

This module is quite fast parser for boolean expressions. Originally it's been writen for Request Tracker to parse SQL like expressions and it's still capable, but it can be used to parse other boolean logic sentences with OPERANDs joined using binary OPERATORs and grouped and nested using parentheses (OPEN_PAREN and CLOSE_PAREN).

Operand is not qualified strictly what makes parser flexible enough to parse different things, for example:

# SQL like expressions
(task.status = "new" OR task.status = "open") AND task.owner_id = 123

# Google like search syntax used in Gmail and other service
subject:"some text" (from:me OR to:me) label:todo !label:done

# Binary boolean logic expressions
(a | b) & (c | d)

You can change literals used for boolean operators and parens. Read more about this in description of constructor's arguments.

As you can see quoted strings are supported. Read about that below in "Quoting and dequoting".

METHODS

Building parser

new

A constuctor, takes the following named arguments:

operators, default is ['AND' 'OR']

Pair of literal strings representing boolean operators AND and OR, pass it as array reference. For example:

# from t/custom_ops.t
my $parser = Parse::BooleanLogic->new( operators => [qw(& |)] );

# from t/custom_googlish.t
my $parser = Parse::BooleanLogic->new( operators => ['', 'OR'] );

It's ok to have any operators and even empty.

parens, default is ['(', ')']

Pair of literal strings representing parentheses, for example it's possible to use curly braces:

# from t/custom_parens.t
my $parser = Parse::BooleanLogic->new( parens => [qw({ })] );

No matter which pair is used parens must be balanced in expression.

This constructor compiles several heavy weight regular expressions so it's better avoid building object each time right before parsing, but instead use global or cached one.

init

An initializer, called from the constructor. Compiles regular expressions and do other things with constructor's arguments. Returns this object back.

Parsing expressions

as_array $string [ %options ]

Takes a string and parses it into perl structure, where parentheses represented using array references, operands are hash references with one key/value pair: operand, when binary operators are simple scalars. So string x = 10 OR (x 20 AND x < 30)> is parsed into the following structure:

[
    { operand => 'x = 10' },
    'OR',
    [
        { operand => 'x > 20' },
        'AND',
        { operand => 'x < 30' },
    ]
]

Aditional options:

operand_cb - custom operands handler, for example:
my $tree = $parser->as_array(
    "some string",
    operand_cb => sub {
        my $op = shift;
        if ( $op =~ m/^(!?)(label|subject|from|to):(.*)/ ) {
            ...
        } else {
            die "You have an error in your query, in '$op'";
        }
    },
);
error_cb - custom errors handler
my $tree = $parser->as_array(
    "some string",
    error_cb => sub {
        my $msg = shift;
        MyParseException->throw($msg);
    },
);

parse

Takes named arguments: string and callback. Where the first one is scalar with expression, the latter is a reference to hash with callbacks: open_paren, operator operand, close_paren and error. Callback for errors is optional and parser dies if it's omitted. Each callback is called when parser finds corresponding element in the string. In all cases the current match is passed as argument into the callback.

Here is simple example based on "as_array" method:

# result tree and the current group
my ($tree, $node);
$tree = $node = [];

# stack with nested groups, outer most in the bottom, inner on the top
my @pnodes = ();

my %callback;
# on open_paren put the current group on top of the stack,
# create new empty group and at the same time put it into
# the end of previous one
$callback{'open_paren'} = sub {
    push @pnodes, $node;
    push @{ $pnodes[-1] }, $node = []
};

# on close_paren just switch to previous group by taking it
# from the top of the stack
$callback{'close_paren'} = sub { $node = pop @pnodes };

# push binary operators as is and operands as hash references
$callback{'operator'} = sub { push @$node, $_[0] };
$callback{'operand'}  = sub { push @$node, { operand => $_[0] } };

# run parser
$parser->parse( string => $string, callback => \%callback );

return $tree;

Using this method you can build other representations of an expression.

Quoting and dequoting

This module supports quoting with single quote ' and double ", literal quotes escaped with \.

from Regexp::Common::delimited with ' and " as delimiters.

q, qq, fq and dq

Four methods to work with quotes:

q - quote a string with single quote character.
qq - quote a string with double quote character.
fq - quote with single if string has no single quote character, otherwisee use double quotes.
dq - delete either single or double quotes from a string if it's quoted.

All four works either in place or return result, for example:

$parser->q($str); # inplace

my $q = $parser->q($s); # $s is untouched

Tree evaluation and modification

Several functions taking a tree of boolean expressions as returned by as_array method and evaluating or changing it using a callback.

walk $tree $callbacks @rest

A simple method for walking a $tree using four callbacks: open_paren, close_paren, operand and operator. All callbacks are optional.

Example:

$parser->walk(
    $tree,
    {
        open_paren => sub { ... },
        close_paren => sub { ... },
        ...
    },
    $some_context_argument, $another, ...
);

Any additional arguments (@rest) are passed all the time into callbacks.

filter $tree $callback @rest

Filters a $tree using provided $callback. The callback is called for each operand in the tree and operand is left when it returns true value.

Any additional arguments (@rest) are passed all the time into the callback. See example below.

Boolean operators (AND/OR) are skipped according to parens and left first rule, for example:

X OR Y AND Z -> X AND Z
X OR (Y AND Z) -> X OR Z
X OR Y AND Z -> Y AND Z
X OR (Y AND Z) -> Y AND Z
X OR Y AND Z -> X OR Y
X OR (Y AND Z) -> X OR Y

Returns new sub-tree. Original tree is not changed, but operands in new tree still refer to the same hashes in the original.

Example:

my $filter = sub {
    my ($condition, $some) = @_;
    return 1 if $condition->{'operand'} eq $some;
    return 0;
};
my $new_tree = $parser->filter( $tree, $filter, $some );

See also solve

solve $tree $callback @rest

Solves a boolean expression represented by a $tree using provided $callback. The callback is called for operands and should return a boolean value (0 or 1 will work).

Any additional arguments (@rest) are passed all the time into the callback. See example below.

Functions matrixes:

A B AND OR
0 0 0   0
0 1 0   1
1 0 0   1
1 1 1   1

Whole branches of the tree can be skipped when result is obvious, for example:

1 OR  (...)
0 AND (...)

Returns result of the expression.

Example:

my $solver = sub {
    my ($condition, $some) = @_;
    return 1 if $condition->{'operand'} eq $some;
    return 0;
};
my $result = $parser->solve( $tree, $filter, $some );

See also filter.

fsolve $tree $callback @rest

Does in filter+solve in one go. Callback can return undef to filter out an operand, and a defined boolean value to be used in solve.

Any additional arguments (@rest) are passed all the time into the callback.

Returns boolean result of the equation or undef if all operands have been filtered.

See also filter and solve.

partial_solve $tree $callback @rest

Partially solve a $tree. Callback can return undef or a new expression and a defined boolean value to be used in solve.

Returns either result or array reference with expression.

Any additional arguments (@rest) are passed all the time into the callback.

ALTERNATIVES

There are some alternative implementations available on the CPAN.

Search::QueryParser - similar purpose with several differences.
Another?

AUTHORS

Ruslan Zakirov <ruz@cpan.org>, Robert Spier <rspier@pobox.com>

COPYRIGHT

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.