Regexp::Common::Apache2 - Apache2 Expressions
use Regexp::Common qw( Apache2 );
use Regexp::Common::Apache2 qw( $ap_true $ap_false );

while( <> )
{
    # pos() is undefined before the first match, so default to 0
    my $pos = pos( $_ ) // 0;
    /\G$RE{Apache2}{Word}/gmc and print "Found a word expression at pos $pos\n";
    /\G$RE{Apache2}{Variable}/gmc and print "Found a variable $+{varname} at pos $pos\n";
}

# Override Apache2 expressions by the legacy ones
$RE{Apache2}{-legacy => 1}
# or use it with the Legacy prefix:
if( $str =~ /^$RE{Apache2}{LegacyVariable}$/ )
{
    print( "Found variable $+{variable} with name $+{varname}\n" );
}
v0.1.1
This is the Perl port of Apache2 expressions.
The regular expressions are based on the Apache2 Backus-Naur Form (BNF) definition, as described below in "APACHE2 EXPRESSION".
You can also use the extended pattern by calling Regexp::Common::Apache2 like:
$RE{Apache2}{-legacy => 1}
All of the regular expressions use named capture. See "%+" in perlvar for more information on named capture.
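As a minimal sketch of how the named captures can be inspected, the snippet below matches a variable and copies %+ into a regular hash. It assumes the module is installed and relies only on the Variable pattern and the varname capture shown in the synopsis; the sample string and the %captures hash are illustrative choices, not part of the module.

```perl
use strict;
use warnings;
use Regexp::Common qw( Apache2 );

# Sample input, chosen for illustration only.
my $str = '%{REQUEST_URI}';
my %captures;

if( $str =~ /^$RE{Apache2}{Variable}$/ )
{
    # %+ only reflects the last successful match, so copy it right away.
    %captures = %+;
    # List every named capture that was set, sorted by name.
    for my $name ( sort( keys( %captures ) ) )
    {
        print( "$name => $captures{$name}\n" );
    }
}
else
{
    print( "No variable found in: $str\n" );
}
```

Dumping the copied hash this way is a convenient way to discover which capture names a given pattern sets for a given input.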
BNF: stringcomp | integercomp | unaryop word | word binaryop word | word "in" listfunc | word "=~" regex | word "!~" regex | word "in" "{" words "}"
$RE{Apache2}{Comp}
For example:
"Jack" != "John"
123 -ne 456
# etc
This uses other expressions namely "stringcomp", "integercomp", "word", "listfunc", "regex", "words"
The capture names are:
Contains the entire capture block
Matches the expression that uses a binary operator, such as:
==, =, !=, <, <=, >, >=, -ipmatch, -strmatch, -strcmatch, -fnmatch
The binary op used if the expression is a binary comparison. Binary operator is:
When the comparison is for an integer comparison as opposed to a string comparison.
Contains the list used to check a word against, such as:
"Jack" in {"John", "Peter", "Jack"}
This contains the listfunc when the expressions contains a word checked against a list function, such as:
"Jack" in listMe("some arguments")
The regular expression used when a word is compared to a regular expression, such as:
"Jack" =~ /\w+/
Here, comp_regexp would contain /\w+/
The regular expression operator used when a word is compared to a regular expression, such as:
Here, comp_regexp_op would contain =~
When the comparison is for a string comparison as opposed to an integer comparison.
Matches the expression that uses unary operator, such as:
-d, -e, -f, -s, -L, -h, -F, -U, -A, -n, -z, -T, -R
Contains the word that is the object of the comparison.
Contains the expression of a word checked against a list, such as:
Contains the word when it is being compared to a listfunc, such as:
Contains the expression of a word checked against a regular expression, such as:
Here the word Jack (without the parenthesis) would be captured in comp_word
Contains the first word in comparison expression
Contains the second word in comparison expression
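The following sketch shows how a regular expression comparison might be taken apart with the Comp pattern. It assumes the module is installed and uses only the three capture names mentioned in the descriptions above (comp_word, comp_regexp_op and comp_regexp); the sample expression is the one used in this section, and the exact value of each capture should be checked against your installed version.

```perl
use strict;
use warnings;
use Regexp::Common qw( Apache2 );

# Sample comparison taken from the documentation above.
my $expr = q{"Jack" =~ /\w+/};
my %captures;

if( $expr =~ /^$RE{Apache2}{Comp}$/ )
{
    # Copy the named captures before any other match resets them.
    %captures = %+;
    for my $name ( qw( comp_word comp_regexp_op comp_regexp ) )
    {
        printf( "%-15s %s\n", $name, $captures{ $name } // '(not set)' );
    }
}
else
{
    print( "No comparison found in: $expr\n" );
}
```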
BNF: "true" | "false" | "!" cond | cond "&&" cond | cond "||" cond | comp | "(" cond ")"
$RE{Apache2}{Cond}
use Regexp::Common::Apache2 qw( $ap_true $ap_false );
($ap_false && $ap_true)
Contains the expression like:
($ap_true && $ap_true)
Contains the false expression like:
($ap_false)
Contains the expression if it is preceded by an exclamation mark, such as:
!$ap_true
($ap_true || $ap_true)
Contains the true expression like:
($ap_true)
BNF: cond | string
$RE{Apache2}{Expr}
Contains the expression of the condition
Contains the expression of a string
BNF: funcname "(" words ")"
$RE{Apache2}{Function}
base64("Some string")
Contains the list of arguments. In the example above, this would be Some string
The name of the function. In the example above, this would be base64
BNF: word "-eq" word | word "eq" word | word "-ne" word | word "ne" word | word "-lt" word | word "lt" word | word "-le" word | word "le" word | word "-gt" word | word "gt" word | word "-ge" word | word "ge" word
$RE{Apache2}{IntegerComp}
123 -ne 456
789 gt 234
# etc
The hyphen before the operator is optional, so you can say eq instead of -eq.
Contains the comparison operator
Contains the first word in the string comparison
Contains the second word in the string comparison
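To illustrate the optional hyphen in the integer comparison operators, here is a simplified stand-in written in plain Perl. It is not the module's own pattern: words are reduced to plain digits, and the capture names left, op and right are hypothetical names chosen for this sketch.

```perl
use strict;
use warnings;

# Hypothetical simplification: words are reduced to plain digits here,
# whereas the module's pattern accepts any "word" expression.
my $int_comp_re = qr/
    ^
    (?<left>\d+)
    \s+
    (?<op>-?(?:eq|ne|lt|le|gt|ge))   # the leading hyphen is optional
    \s+
    (?<right>\d+)
    $
/x;

my @ops;
for my $expr ( '123 -ne 456', '789 gt 234' )
{
    if( $expr =~ $int_comp_re )
    {
        # Remember the operator exactly as written, with or without hyphen.
        push( @ops, $+{op} );
        print( "$expr: operator '$+{op}' compares $+{left} with $+{right}\n" );
    }
}
```

Both sample expressions match, one with the hyphenated form and one without.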
BNF: listfuncname "(" words ")"
This is quite similar to the "function" regular expression
BNF: "/" regpattern "/" [regflags] | "m" regsep regpattern regsep [regflags]
$RE{Apache2}{Regex}
/\w+/i
# or
m,\w+,i
The regular expression modifiers. See perlre.
This can be any combination of:
i, s, m, g
Contains the regular expression. See perlre for examples and an explanation of how to use regular expressions. Apache2 uses PCRE, i.e. Perl Compatible Regular Expressions.
Contains the regular expression separator, which can be any of:
/, #, $, %, ^, |, ?, !, ', ", ",", ;, :, ".", _, -
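The two alternatives of the BNF and the separator list above can be approximated in plain Perl as follows. This is a hypothetical sketch, not the module's own pattern: the capture names pattern, sep and flags are chosen for illustration, and a backreference is used so that the closing separator has to be the same character as the opening one.

```perl
use strict;
use warnings;

# Separator characters, copied from the list in the documentation above.
my $sep = qr{[/#\$%^|?!'",;:._-]};

# Hypothetical approximation of the two BNF alternatives.
my $regex_re = qr{
    ^
    (?:
        /  (?<pattern> (?: \\. | [^/\\] )* )  /
      |
        m  (?<sep> $sep )
           (?<pattern> (?: \\. | (?!\k<sep>) [^\\] )* )
           \k<sep>
    )
    (?<flags> [ismg]* )
    $
}x;

my @parsed;
for my $candidate ( '/\w+/i', 'm,\w+,i' )
{
    next unless( $candidate =~ $regex_re );
    # The plain "/.../" form has no explicit separator capture, so default to "/".
    push( @parsed, { pattern => $+{pattern}, flags => $+{flags}, sep => $+{sep} // '/' } );
    print( "$candidate: pattern '$parsed[-1]{pattern}', flags '$parsed[-1]{flags}'\n" );
}
```

Reusing the name pattern in both alternatives keeps the lookup simple, since %+ returns the leftmost defined group with that name.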
BNF: substring | string substring
$RE{Apache2}{String}
URI accessed is: %{REQUEST_URI}
BNF: word "==" word | word "!=" word | word "<" word | word "<=" word | word ">" word | word ">=" word
$RE{Apache2}{StringComp}
"John" == "Jack"
sub(s/\w+/Jack/i, "John") != "Jack"
# etc
BNF: cstring | variable
$RE{Apache2}{Substring}
Jack
# or
%{REQUEST_URI}
See "variable" and "word" regular expression for more on those.
BNF: "%{" varname "}" | "%{" funcname ":" funcargs "}"
$RE{Apache2}{Variable}
# or
$RE{Apache2}{LegacyVariable}
%{REQUEST_URI}
# or
%{md5:"some string"}
See "word" and "cond" regular expression for more on those.
If this is a condition inside a variable, such as:
%{:$ap_true == $ap_false}
Contains the function arguments.
Contains the function name.
A variable containing a word. See "word" for more information about word expressions.
Contains the variable name without the percent sign or dollar sign (if the legacy regular expression is enabled) and without the surrounding curly braces, if any.
BNF: digits | "'" string "'" | '"' string '"' | word "." word | variable | function | "(" word ")" | rebackref
$RE{Apache2}{Word}
This is the most complex regular expression used, since it uses all the others and can recurse deeply
12
# or
"John"
# or
'Jack'
# or
%{REQUEST_URI}
# or
%{HTTP_HOST}.%{HTTP_PORT}
# or
md5("some string")
# or any word surrounded by parentheses, such as:
("John")
See "string", "word", "variable", "sub", "join", "function" regular expression for more on those.
If the word is actually digits, this contains those digits.
This contains the text when two words are separated by a dot.
Contains the value of the word enclosed by single or double quotes or by surrounding parenthesis.
Contains the word containing a "function"
If the word is enclosed by single or double quote, this contains the single or double quote character
Contains the word containing a "variable"
BNF: word | word "," word
$RE{Apache2}{Words}
"Jack"
# or
"John", "Peter", "Paul"
See "word" and "list" regular expression for more on those.
Contains the word
Contains the list
BNF: stringcomp | integercomp | unaryop word | word binaryop word | word "in" listfunc | word "=~" regex | word "!~" regex | word "in" "{" list "}"
$RE{Apache2}{TrunkComp}
This uses other expressions namely "stringcomp", "integercomp", "word", "listfunc", "regex", "list"
This is similar to the regular "comp" in "APACHE2 EXPRESSION", except it uses "list" instead of "words"
BNF: "true" | "false" | "!" cond | cond "&&" cond | cond "||" cond | comp | "(" cond ")"
$RE{Apache2}{TrunkCond}
Same as "cond" in "APACHE2 EXPRESSION"
$RE{Apache2}{TrunkExpr}
$RE{Apache2}{TrunkFunction}
BNF: word "-eq" word | word "eq" word | word "-ne" word | word "ne" word | word "-lt" word | word "lt" word | word "-le" word | word "le" word | word "-gt" word | word "gt" word | word "-ge" word | word "ge" word
$RE{Apache2}{TrunkIntegerComp}
BNF: "join" ["("] list [")"] | "join" ["("] list "," word [")"]
$RE{Apache2}{TrunkJoin}
join({"word1" "word2"})
# or
join({"word1" "word2"}, ', ')
This uses "list" and "word"
Contains the value of the list
Contains the value for word used to join the list
BNF: split | listfunc | "{" words "}" | "(" list ")"
$RE{Apache2}{TrunkList}
split( /\w+/, "Some string" )
# or
{"some", "words"}
# or
(split( /\w+/, "Some string" ))
# or
( {"some", "words"} )
This uses "split", "listfunc", "words" and "list"
Contains the value if a "listfunc" is used
Contains the value if this is a list embedded within parenthesis
Contains the value if the list is based on a split
Contains the value for a list of words.
BNF: regex | regsub
$RE{Apache2}{TrunkRegany}
This regular expression includes "regex" and "regsub"
Contains the regular expression. See "regex"
Contains the substitution regular expression. See "regsub"
BNF: "/" regpattern "/" [regflags] | "m" regsep regpattern regsep [regflags]
$RE{Apache2}{TrunkRegex}
BNF: "s" regsep regpattern regsep string regsep [regflags]
$RE{Apache2}{TrunkRegsub}
s/\w+/John/gi
The modifiers used which can be any combination of:
See perlre for an explanation of their usage and meaning
The string replacing the text found by the regular expression
Contains the regular expression, which is Perl-compatible since Apache2 uses PCRE.
BNF: "split" ["("] regany "," list [")"] | "split" ["("] regany "," word [")"]
$RE{Apache2}{TrunkSplit}
split( /\w+/, "Some string" )
This uses "regany", "list" and "word"
Contains the regular expression used for the split
The list being split. It can also be a word. See below
The word being split. It can also be a list. See above
$RE{Apache2}{TrunkString}
BNF: word "==" word | word "!=" word | word "<" word | word "<=" word | word ">" word | word ">=" word
$RE{Apache2}{TrunkStringComp}
BNF: "sub" ["("] regsub "," word [")"]
$RE{Apache2}{TrunkSub}
sub(s/\w/John/gi,"Peter")
Contains the substitution expression, i.e. in the example above, this would be:
s/\w/John/gi
The target for the substitution. In the example above, this would be "Peter"
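As a rough sketch, assuming the module is installed, the sub() example above could be checked against the TrunkSub pattern as follows. The capture names are not assumed here: the snippet simply lists whatever named captures the pattern sets, so the output depends on the installed version of the module.

```perl
use strict;
use warnings;
use Regexp::Common qw( Apache2 );

# Sample expression taken from the documentation above.
my $expr = 'sub(s/\w/John/gi,"Peter")';
my $matched = 0;
my %captures;

if( $expr =~ /^$RE{Apache2}{TrunkSub}$/ )
{
    $matched = 1;
    # Copy the named captures before any other match resets them.
    %captures = %+;
    print( "$_ => $captures{$_}\n" ) for( sort( keys( %captures ) ) );
}
else
{
    print( "Not a sub() expression: $expr\n" );
}
```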
$RE{Apache2}{TrunkSubstring}
Jack
# or
%{REQUEST_URI}
# or
%{:sub(s/\b\w+\b/Peter/, "John"):}
This is different from "substring" in "APACHE2 EXPRESSION" in that it does not include regular expression back reference, i.e. $1, $2, etc.
BNF: "%{" varname "}" | "%{" funcname ":" funcargs "}" | "%{:" word ":}" | "%{:" cond ":}" | rebackref
$RE{Apache2}{TrunkVariable}
%{REQUEST_URI}
# or
%{md5:"some string"}
# or
%{:sub(s/\b\w+\b/Peter/, "John"):}
# or a reference to previous regular expression capture groups
$1, $2, etc.
BNF: digits | "'" string "'" | '"' string '"' | word "." word | variable | sub | join | function | "(" word ")"
$RE{Apache2}{TrunkWord}
12
# or
"John"
# or
'Jack'
# or
%{REQUEST_URI}
# or
%{HTTP_HOST}.%{HTTP_PORT}
# or
%{:sub(s/\b\w+\b/Peter/, "John"):}
# or
sub(s,\w+,Paul,gi, "John")
# or
join({"Paul", "Peter"}, ', ')
# or
md5("some string")
# or any word surrounded by parentheses, such as:
("John")
Contains the word containing a "join"
If the word is a substitution, this contains the substitution
BNF: word | word "," list
$RE{Apache2}{TrunkWords}
"Jack"
# or
"John", {"Peter", "Paul"}
# or
sub(s/\b\w+\b/Peter/, "John"), {"Peter", "Paul"}
It is different from "words" in "APACHE2 EXPRESSION" in that it uses "list" instead of "word"
There are 2 expressions that can be used as legacy:
See "comp"
See "variable"
Feel free to reach out to the author for possible corrections, improvements, or suggestions.
Jacques Deguest <jack@deguest.jp>
https://httpd.apache.org/docs/current/expr.html and https://httpd.apache.org/docs/trunk/en/expr.html
Copyright (c) 2020 DEGUEST Pte. Ltd.
You can use, copy, modify and redistribute this package and associated files under the same terms as Perl itself.