NAME

Math::Expression - Safely evaluate arithmetic/string expressions

DESCRIPTION

Evaluating an expression from an untrusted source can expose a program to security or denial of service attacks, yet sometimes it must be done to give users what they want.

This module solves the problem of evaluating expressions read from sources such as config/... files and user web forms without the use of eval. String and arithmetic operators are supported (as in C/Perl), as are: variables, loops, conditions, arrays and functions (inbuilt & user defined).

The program may set initial values for variables and obtain their values once the expression has been evaluated.

The name-space is managed (for security); user provided functions may be specified to set/get variable values. Error messages may be routed via a user provided function. This module is not designed for heavy computation.

EXAMPLE

Shipping cost depends on item price by some arbitrary formula. The VAT amount can also vary depending on political edict. Rather than nail these formulae into the application code, they are obtained at run time from some configuration source. The formulae are entered by a non technical manager and are thus not to be trusted.

    use feature 'say';
    use Math::Expression;
    my $ArithEnv = new Math::Expression;

    # Obtain from a configuration source:
    my $ShippingFormula = 'Price >= 100 ? Price * 0.1 : (Price >= 50 ? Price * 0.15 : Price * 0.2)';
    my $VatFormula = 'VatTax := Price * 0.2';

    # Price of what you are selling, set the price variable:
    my $price = 100;
    $ArithEnv->VarSetScalar('Price', $price);

    # Obtain VAT & Shipping using the configured formula:
    my $VatTax = $ArithEnv->ParseToScalar($VatFormula);
    my $Shipping = $ArithEnv->ParseToScalar($ShippingFormula);

    say "Price=$price VatTax=$VatTax Shipping=$Shipping";

    # If these will be run many times, parse the formula once:
    my $VatExpr = $ArithEnv->Parse($VatFormula);
    my $ShipExpr = $ArithEnv->Parse($ShippingFormula);

    # Evaluate it with the current price many times:
    $ArithEnv->VarSetScalar('Price', $price);
    $VatTax = $ArithEnv->EvalToScalar($VatExpr);
    $Shipping = $ArithEnv->EvalToScalar($ShipExpr);

HOW TO USE

An expression must first be compiled (parsed); the resulting tree may then be run (evaluated) many times. The result of an evaluation is an array. Variables are preserved between evaluations, so you may also take computation results from stored variables. Method ParseToScalar does it all in one: parse, check & evaluate.
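
A minimal sketch of the parse once, evaluate many times pattern. The variable names x and y and the sample values are made up for illustration:

```perl
use strict;
use warnings;
use Math::Expression;

my $env = Math::Expression->new;

# Parse and check once; Parse returns undef on error.
my $tree = $env->Parse('y := x * 2 + 1');
die "parse failed\n" unless defined $tree;

# Evaluate the same tree repeatedly with different inputs.
my @results;
for my $x (1, 2, 3) {
    $env->VarSetScalar('x', $x);
    push @results, $env->EvalToScalar($tree);
}
print "@results\n";    # 3 5 7

# Variables persist between evaluations, so y can be read back later.
# Eval returns an array; a single value is its last element.
my @y = $env->Eval($env->Parse('y'));
print "y = $y[-1]\n";
```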

See examples later in this document.

For further examples of use please see the test program for the module.

Package methods

new

This must be used before anything else to obtain a handle that can be used in calling other functions.

SetOpt

The following options may be set. In many cases you will want to set a function to extend what the standard one does.

These options may also be given to the new function.

PermitLoops

This must be set true otherwise loops (while) will not be allowed. This is to prevent a denial of service attack when the expression is from an untrusted source.

Default: false

MaxLoopCount

This is the maximum number of times that loops will be allowed to iterate. Where there is more than one loop, all loops count towards this limit. Think carefully before making this too high.

If set to zero, there is no iteration limit. This is probably unwise.

The count restarts when an Eval function is used to evaluate a tree.

Default: 50.
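
As a sketch of how the two loop options might be combined: the limit of 20 is an arbitrary choice for this example, and the loop is one of those shown later in this document.

```perl
use strict;
use warnings;
use Math::Expression;

# Loops are refused unless PermitLoops is true;
# MaxLoopCount caps the total number of iterations per evaluation.
my $env = Math::Expression->new(PermitLoops => 1, MaxLoopCount => 20);

# This loop iterates four times, well inside the cap.
my $count = $env->ParseToScalar('i := 0; while(i < 4) {i := i + 1;}; i');
print "i = $count\n";
```

A small limit like this is usually enough for configuration style formulae; raise it only when a trusted expression really needs more iterations.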

ArrayMaxIndex

The largest number that can be used as an index when assigning to an array.

Default: 100.

EnablePrintf

This must be set true for the printf function to be allowed. Beware this could take a long time to fail: printf('%1000000s', 'foo')

Default: 0.
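
A short sketch of enabling printf for an expression that comes from a trusted source. The format string is only an example:

```perl
use strict;
use warnings;
use Math::Expression;

# printf is disabled by default, so it has to be switched on explicitly.
my $env = Math::Expression->new(EnablePrintf => 1);

# Inside an expression printf returns the formatted text rather than printing it.
my $text = $env->ParseToScalar(q{printf('%03d items', 7)});
print "$text\n";
```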

StringMaxLength

The longest that a string may be.

Default: 1000.

PrintErrFunc

This is a printf style function that will be called in the event of an error; the error text will not have a trailing newline. If this is not set, the default is to printf to STDERR.
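
One possible use is to collect error messages instead of letting them go to STDERR. In this sketch the @errors array and the malformed expression are illustrative, and the exact wording of the message is whatever the module produces:

```perl
use strict;
use warnings;
use Math::Expression;

my @errors;    # collected messages, for illustration only

my $env = Math::Expression->new(
    # Called printf style: a format string followed by its arguments.
    PrintErrFunc => sub {
        my ($format, @args) = @_;
        push @errors, sprintf($format, @args);
    },
);

# A deliberately malformed expression: ParseToScalar returns undef on error.
my $value = $env->ParseToScalar('2 * (3 + ');

if (!defined $value) {
    print "rejected: $_\n" for @errors;
}
```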

VarHash

The argument is a hash that will be used to store variables. Changing the hash between runs makes it possible to manage distinct name spaces, ie different computations use different sets of variables.

The name EmptyList should, by convention, exist and be an empty array; this may be used to assign an empty value to a variable.

All variables are arrays, ie a single value is in an array with one element. For speed you can access them directly, eg:

    $ArithEnv->{VarHash}->{i} = [10];    # Set i to 10
    $i = $ArithEnv->{VarHash}->{i}->[0]; # Get the value of i

If you specify one of the functions VarGetFun, VarSetFun or VarSetScalar you are on your own.
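
The following sketch switches VarHash between evaluations so that one parsed tree works on two separate sets of variables. The hash names, variable names and values are invented for the example, and it assumes the default variable functions are in use:

```perl
use strict;
use warnings;
use Math::Expression;

my $env = Math::Expression->new;

# Two independent name spaces. Every value is an array reference,
# and EmptyList exists by convention.
my %order_a = (EmptyList => [], price => [10], qty => [3]);
my %order_b = (EmptyList => [], price => [4],  qty => [5]);

my $tree = $env->Parse('total := price * qty');

for my $vars (\%order_a, \%order_b) {
    $env->SetOpt(VarHash => $vars);    # switch name space
    $env->EvalToScalar($tree);
}

# Results are read straight from each hash.
print "a: $order_a{total}[0]  b: $order_b{total}[0]\n";
```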

VarGetFun

This specifies the function that returns the value of a variable as an array. The arguments are: 0 - the value returned by new; 1 - the name of the variable wanted. If no value is available you may return the empty array.

VarIsDefFun

This should return 1 if the variable is defined, 0 if it is not defined. The arguments are the same as for VarGetFun.

VarSetFun

This sets the value of a variable as an array. The arguments are: 0 - the value returned by new; 1 - the name of the variable to be set; 2 - the value to set as an array. The return value should be the variable value.

VarSetScalar

This sets the value of a variable as a simple scalar (ie one value). The arguments are: 0 - the value returned by new; 1 - the name of the variable to be set; 2 - the value to set as a scalar. The return value should be the variable value.

FuncEval

This will evaluate functions. The arguments are: 0 - the value returned by new; 1 - the tree as returned by ParseString; 2 - the name of the function to be evaluated; 3... - an array of function arguments. This should return the value of the function: scalar or array.

The purpose is to permit functions other than those provided (eg abs()) to be made available. This option replaces the built in function evaluator FuncValue, which may be used as a model for your own evaluator.

ExtraFuncEval

If defined this will be called when evaluating functions. If a defined value is returned that value is used in the expression; it should be numeric or string. If this returns undef the name of the function will be matched against the built in functions. This is called before the standard functions are tested and thus can redefine the built in functions. The arguments are as FuncEval.

New function names must be added to property Functions:

  $ArithEnv->{Functions}->{someFunc} = 1;

RoundNegatives

See the description of the round function.

AutoInit

If true, automatically initialise undefined values to the empty string or '0' depending on use. The default is that undefined values cause an error, except that concatenation (.) always results in the empty string being assumed.

Example:

  my $ArithEnv = new Math::Expression(RoundNegatives => 1);

  my %Vars = (
        EmptyList       =>      [()],
  );

  $ArithEnv->SetOpt(
        VarHash => \%Vars,
        VarGetFun => \&VarValue,
        VarIsDefFun => \&VarIsDef,
        PrintErrFunc => \&MyPrintError,
        AutoInit => 1,
        );

ParseString

This parses an expression string and returns a tree that may be evaluated later. The arguments are: 0 - the value returned by new; 1 - the string to parse. If there is an error a complaint will be made via PrintErrFunc and the undefined value returned.

CheckTree

This checks a parsed tree. The arguments are: 0 - the value returned by new; 1 - the tree to check. The input tree is returned. If there is an error a complaint will be made via PrintErrFunc and the undefined value returned.

Parse

This combines ParseString and CheckTree.
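
A sketch of the two equivalent routes to a checked tree, using only the methods described above:

```perl
use strict;
use warnings;
use Math::Expression;

my $env = Math::Expression->new;

# The two-step form: both calls return undef after reporting an error.
my $tree = $env->ParseString('6 * 7');
$tree = $env->CheckTree($tree) if defined $tree;

# The one-step form.
my $same = $env->Parse('6 * 7');

print $env->EvalToScalar($tree), " ", $env->EvalToScalar($same), "\n";    # 42 42
```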

VarSetFun

This sets a variable, see the description in SetOpt.

VarSetScalar

This sets a variable, see the description in SetOpt.

FuncValue

This evaluates a function, see the description in SetOpt.

EvalTree

Evaluate a tree or subtree. The result is an array; if you are expecting a single value it is the last (probably $#'th) element. The arguments are: 0 - the value returned by new; 1 - tree to evaluate; 2 - true if a variable name is to be returned rather than its value (don't set this). You should not use this; use methods Eval or EvalToScalar instead. This does not reset the used loop count property LoopCount.

Eval

Evaluate a tree. The result is an array; if you are expecting a single value it is the last (probably $#'th) element. The arguments are: 0 - the value returned by new; 1 - tree to evaluate.

EvalToScalar

Evaluate a tree. The result is a scalar (simple variable). The arguments are: 0 - the value returned by new; 1 - tree to evaluate.

ParseToScalar

Parse a string, check and evaluate its tree. The result is a scalar (simple variable). Undefined is returned on error. The arguments are: 0 - the value returned by new; 1 - the string to parse.

Functions that may be used in expressions

The following functions may be used in expressions. If you want more than this, write your own function evaluator and set 'ExtraFuncEval' with method SetOpt. The POSIX package is used to provide some of the functions.

int

Returns the integer part of an expression.

abs

Returns the absolute value of an expression.

round

Adds 0.5 to input and returns the integer part. If the option RoundNegatives is set round() is sign sensitive, so for negative input subtracts 0.5 from input and returns the integer part.
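
A sketch comparing the two rounding modes on a negative input. The value -2.6 and the variable name x are arbitrary:

```perl
use strict;
use warnings;
use Math::Expression;

my $plain  = Math::Expression->new;
my $signed = Math::Expression->new(RoundNegatives => 1);

$_->VarSetScalar('x', -2.6) for $plain, $signed;

my $r_plain  = $plain->ParseToScalar('round(x)');     # integer part of -2.6 + 0.5
my $r_signed = $signed->ParseToScalar('round(x)');    # integer part of -2.6 - 0.5

print "default: $r_plain  RoundNegatives: $r_signed\n";
```

Going by the description above, the default mode should give -2 and the sign sensitive mode -3.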

split

Perl split: the 0th argument is the RE to split on, the last argument is what will be split.

join

Joins arguments 1..$#, separating elements with the 0th argument.
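
A brief sketch of split and join used from Perl. The strings are sample data; Eval is used for split because its result is an array:

```perl
use strict;
use warnings;
use Math::Expression;

my $env = Math::Expression->new;

# join: the 0th argument is the separator, the rest are joined.
my $joined = $env->ParseToScalar(q{join('-', 2015, 3, 22)});

# split: the 0th argument is the RE, the last is the string to split.
my @parts = $env->Eval($env->Parse(q{split(',', 'red,green,blue')}));

print "$joined\n";
print scalar(@parts), " parts, last is $parts[-1]\n";
```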

printf

The standard perl printf, returns the formatted result. To use this the option EnablePrintf must be set true.

mktime

Passes all the arguments to mktime, returns the result.

strftime

Passes all the arguments to strftime, returns the result.

localtime

Returns the result of applying localtime to the last argument. The variable _TIME is initialised to the current time.

defined

Applies the VarIsDefFun to the last argument. Ie returns 1 if the variable is defined (has been assigned a value), 0 if it has not.

push, pop, shift, unshift

Add/remove elements from an array - as in perl.

strlen

The length of a string.

count

The number of elements in an array.

aindex

Searches the arguments for the last argument and returns the index. Returns -1 if it is not found. Eg the following will return 1:

  months := 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec';
  aindex(months, 'Feb')

Example of user defined functions:

    # Function that provides extra functions - ie user functions
    # sumArgs   numeric sum of arguments
    # A function handled here must return a scalar or list, never undef:
    # undef means "not handled" and passes the name on to the built in functions.
    sub moreFunctions {
        my ($self, $tree, $fname, @arglist) = @_;
    
        if($fname eq 'sumArgs') {
                my $sum = 0;
                $sum += $_ for @arglist;
                return $sum;
        }
    
        # Return undef so that in built functions are scanned
        return undef;
    }
    
    # MUST put user defined functions here so that it is known as a function - while parsing:
    $ArithEnv->{Functions}->{sumArgs} = 1;
    
    $ArithEnv->SetOpt(ExtraFuncEval => \&moreFunctions);

Used in an expression thus:

    sum := sumArgs(2, 4, 6, 8)
    list := (12, 13, 21, 9, -3)
    sum := sumArgs(list)

Variables

Variables can take two forms; there is no difference in usage between them. A variable name is either alphanumeric (starting with alpha, underscore is deemed an alpha), or the same name with a leading $. Both refer to the same variable.

        Variable
        _foo123
        $Variable
        $_foo123

A previous version of this module allowed more syntaxes.

Literals

Literals may be integers or floating point numbers in the form nn.nn, optionally with an exponent (eg 123.4, 1.234e2, 1.234e+2). Strings are bounded by matching single (') or double (") quotes. In strings surrounded by double quotes the following escapes will be recognised:

        \n      newline
        \r      carriage return
        \t      tab
        \\      \
        \xXX    character with hex value XX. Eg: \x0A \x3b
        \u{XXX} unicode character. Eg: \u{20AC} is the Euro

A backslash followed by anything else is left as is - ie the backslash will remain.
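
A sketch of the double quote escapes as seen from Perl. The expressions are written in Perl single quotes so that the backslashes reach the module untouched:

```perl
use strict;
use warnings;
use Math::Expression;

my $env = Math::Expression->new;

# \t inside a double quoted literal becomes a real tab.
my $tabbed = $env->ParseToScalar('"a\tb"');

# \xXX gives the character with that hex value.
my $hex = $env->ParseToScalar('"\x41\x42"');

printf "length=%d hex=%s\n", length($tabbed), $hex;
```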

Operators and precedence

The operators should not surprise any Perl/C programmer, with the exception that assignment is :=. Operators associate left to right except for := which associates right to left. Precedence may be overridden with parentheses ( ). <> is the same as !=.

        ++ --   Pre increment/decrement only
        + - ~ ! (Monadic)
        **
        * / %
        + -
        .       String concatenation
        > < >= <= == != <>
        lt gt le ge eq ne
        &&
        ||
        ? :
        ,
        :=

A semicolon (;) may be used to separate statements; the value is that of the last expression.

Statements may be grouped with braces: { }

        { a := 10; b := a * 4 }

Order of evaluation

The order of evaluation of an expression is not defined except at sequence points. The sequence points are: while if ; ?: && ||. In particular && and || only evaluate their right hand sides if they need to.

Thus which element of a gets updated by the code below may change in a future release:

        a := (5, 6, 7); i := 0; a[++i] := ++i

Multiple assignment works:

        a := b := 3; if(1) a:= b := 4

Sets a and b to 3 and then sets them to 4.

Arrays

Variables are implemented as arrays. If a simple scalar value is wanted (eg as an operand of +) the last element of the array is used. Arrays may be built and joined using the comma operator, eg:

        a1 := (1, 2, 3, 4)
        a2 := (9, 8, 7, 6)
        a1 , a2

yields:

        1, 2, 3, 4, 9, 8, 7, 6

And:

        a2 + 10

yields:

        16

Arrays may be used to assign multiple values, eg:

        (v1, v2, v3) := (42, 44, 48)

If there are too many values the last variable receives the remainder. If there are not enough values the last ones are unchanged.
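
A sketch of multiple assignment driven from Perl. The names p and q are illustrative, and the second case relies on the rule just stated that the last variable receives the remainder:

```perl
use strict;
use warnings;
use Math::Expression;

my $env = Math::Expression->new;

# One value per variable.
$env->ParseToScalar('(v1, v2, v3) := (42, 44, 48)');
my $v2 = $env->ParseToScalar('v2');

# Too many values: the last variable takes what is left over.
$env->ParseToScalar('(p, q) := (1, 2, 3, 4)');
my @q = $env->Eval($env->Parse('q'));

print "v2=$v2 q=(@q)\n";
```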

You may use [] to numerically index into arrays to obtain and set scalar values. Arrays cannot contain other arrays. Array indexes start with 0. Negative indices index from the end of the array, thus -1 is the last element.

        a := (20,21,22); a[1] + a[2]
        a := (20,21,22); a[1] := 9; ++a[k + j]
        i := -1; j := 2; a := (20,21,22); a[i + j] := 3

When setting values you can extend an array one element at a time. You can create a variable by setting index 0.

An index greater than ArrayMaxIndex cannot be used to assign to an array. See method SetOpt.
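
A sketch that exercises indexing from Perl: reading elements, a negative index, and extending the array by one element. The values are sample data:

```perl
use strict;
use warnings;
use Math::Expression;

my $env = Math::Expression->new;

# Read two elements of a three element array.
my $sum = $env->ParseToScalar('a := (20,21,22); a[1] + a[2]');

# A negative index counts from the end.
my $last = $env->ParseToScalar('i := -1; a[i]');

# Assigning to the next free index extends the array by one element.
$env->ParseToScalar('a[3] := 23');
my $n = $env->ParseToScalar('count(a)');

print "sum=$sum last=$last count=$n\n";
```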

Assigning () is the same as assigning EmptyList, eg:

        em := ()

Conditions and loops

Conditional assignment used to be done by use of the ternary operator, but no longer:

        a > b ? ( c := 3 ) : 0

Variables may be the result of a conditional, so below one of aa or bb is assigned a value:

        a > b ? aa : bb := 124

if and while may be used to perform conditionals and loops:

        if(i < 3) { i := i + j; j := 0}
        if(i < 3) i := 10;
        i := 0; a := 0; if(i < 4) {i := i + 1; a := 9 }; a+i
        i := 0; b := 1; while(++i < 4) b := b * 2;  b
        i := 0; while(i < 4) {i := i + 1;}; i

Note how the braces may be omitted if there is one statement after the if. You can nest if and while within each other.

If the expression is from an untrusted source, loops may cause a denial of service attack. The options PermitLoops and MaxLoopCount are therefore available to use with SetOpt; see the above description for details.

Miscellaneous and examples

Each string may be parsed and evaluated separately; variables are preserved between evaluations:

        my $tree1 = $ArithEnv->Parse('a := 10');
        my $tree2 = $ArithEnv->Parse('b := 3');
        my $tree3 = $ArithEnv->Parse('a + b');

        $ArithEnv->EvalToScalar($tree1);
        $ArithEnv->EvalToScalar($tree2);
        print "Result: ", $ArithEnv->EvalToScalar($tree3), "\n";
        say "a * b = " . $ArithEnv->ParseToScalar('a * b');
        say $ArithEnv->ParseToScalar('"a != b is " . (a != b)');
        say $ArithEnv->ParseToScalar('2 * 3 / 4');
        say $ArithEnv->ParseToScalar('1.2e2 + 0');

prints:

        Result: 13
        a * b = 30
        a != b is 1
        1.5
        120

        $ArithEnv->ParseToScalar('FirstName := "George"');
        $ArithEnv->ParseToScalar('SurName := "Williams"');
        say "Son's name is " . $ArithEnv->ParseToScalar('FirstName . " " . SurName');
        say "Name is George = " . $ArithEnv->ParseToScalar('FirstName eq "George"');

prints:

        Son's name is George Williams
        Name is George = 1

        $ArithEnv->VarSetScalar('_TimeYesterday', time - 86400);
        say $ArithEnv->ParseToScalar('strftime("Yesterday date=%Y/%m/%d", localtime(_TimeYesterday))');
        say $ArithEnv->ParseToScalar('strftime("Today date=%Y/%m/%d", localtime(_TIME))');

prints:

        Yesterday date=2015/03/21
        Today date=2015/03/22

        say $ArithEnv->ParseToScalar('10 + (44, 66, 22 + 1)');
        say $ArithEnv->ParseToScalar('c := 12; d := 3; c * d');

prints:

        33
        36

AUTHOR

Alain D D Williams <addw@phcomp.co.uk>

Version "1.48", this is available as: $Math::Expression::Version.

Copyright (c) 2003, 2016 Parliament Hill Computers Ltd/Alain D D Williams. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. Please see the module source for the full copyright.