NAME
tstregex - A hybrid regex diagnostic tool (single-file library module and command-line tool) that shows the longest regular expression match and highlights the rejected part
SYNOPSIS
# Example of command and its terminal output:
tstregex.pm '/^[a-z]*\d{3}$/' 'abc123' 'abc12a'
abc123
abc-12a (^[a-z]*-\d{3}$)
# In the terminal, the rejected part of the string and the failing regex token are shown in bold.
OPTIONS (CLI)
-h --help
Shows this help.
-v --verbose
Shows key information about what matched and what did not.
-d --diag
Triggers the Enriched Diagnostic View. It displays:
- The string with the failing part highlighted.
- The exact token in the regex that caused the break.
- A visual pointer (^--- HERE) aligned with the regex syntax.
- Execution time (useful for spotting ReDoS/exponential backtracking).
-a --assert
Runs the built-in test suite: a large collection of regexp tests exercised through tstregex.
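The pointer line and timing shown by -d can be illustrated with a minimal sketch in plain Perl. This is not the tool's own code: pointer_line is a hypothetical helper, and the offset of the failing token is found here with a simple index() call rather than by the tool's diagnostic.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Hypothetical helper: pad with spaces so the caret sits under a given
# character offset of the regex text printed on the line above.
sub pointer_line {
    my ($offset) = @_;
    return (' ' x $offset) . '^--- HERE';
}

my $re_text = '^[a-z]*\d{3}$';
my $offset  = index($re_text, '\d{3}');   # position of the token of interest

# Time a single match attempt; a slow attempt on a short input is a hint
# of heavy backtracking.
my $t0      = Time::HiRes::time();
my $matched = ('abc12a' =~ /^[a-z]*\d{3}$/) ? 1 : 0;
my $elapsed = Time::HiRes::time() - $t0;

print "$re_text\n", pointer_line($offset), "\n";
printf "matched=%d elapsed=%.6fs\n", $matched, $elapsed;
```

The exact layout of the real Enriched Diagnostic View may differ; the sketch only shows how a pointer can be aligned with a regex offset and how a match attempt can be timed.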
Perl Module SYNOPSIS
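use tstregex;

# Parse the regex once, then run the diagnostic against a string.
my $ctx = tstregex_init_desc('/^\d{3}/');
tstregex($ctx, '12a');

# tstregex() updates $ctx (see API), so the accessors below read from it.
if (!tstregex_is_full_match($ctx))
{
    my $token = tstregex_get_fail_token($ctx);
    my $pos   = tstregex_get_match_len($ctx);
    print "Failure on token '$token' at column $pos\n";
}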
API
tstregex_init_desc($raw_re)
Pre-parses the regex, handles delimiters (m!!, //, etc.), extracts modifiers (i, s, m, x), and prepares the nibbling steps. Returns a context hash.
tstregex($ctx, $string)
Executes the diagnostic. Updates the context.
tstregex_is_full_match
Returns the match status of the input string (boolean: 0 or 1)
tstregex_get_match_portion
Returns the matching portion in case of a full match (may be shorter than the input string, depending on anchors)
tstregex_get_match_len
Returns the matching substring length
tstregex_get_fail_token
Returns the failing token in the regexp
tstregex_get_re_clean
Returns the matching regexp subpart
tstregex_get_re_raw
Returns the internal representation of the regexp
tstregex_get_prefix_offset
Returns the offset of the original regexp in the raw regexp
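To make the pre-parsing done by tstregex_init_desc more concrete, here is a minimal sketch of how a raw regex literal might be split into its body and modifiers. split_regex_literal and the keys of the hash it returns are hypothetical names for illustration only; the module's actual parser and context layout are not shown in this document and may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's parser: split "/body/mods" or
# "m<delim>body<delim>mods" into body and modifiers.
sub split_regex_literal {
    my ($raw) = @_;
    my %closing = ('(' => ')', '[' => ']', '{' => '}', '<' => '>');

    # Optional leading "m", then a single non-word delimiter character.
    my ($open, $rest) = $raw =~ /\A m? ([^\w\s]) (.*) \z/xs
        or return;
    my $close = $closing{$open} // $open;

    # The last closing delimiter separates the body from the modifiers.
    my $end = rindex($rest, $close);
    return if $end < 0;

    my $body = substr($rest, 0, $end);
    my $mods = substr($rest, $end + 1);
    return unless $mods =~ /\A [a-z]* \z/x;

    return { body => $body, modifiers => $mods };
}

for my $raw ('/^\d{3}/', 'm!^a+b!ix', 'm{a}i') {
    my $parsed = split_regex_literal($raw);
    print "$raw -> body='$parsed->{body}' modifiers='$parsed->{modifiers}'\n";
}
```

Using the last occurrence of the closing delimiter keeps the sketch short, at the cost of not validating escaped delimiters inside the body.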
DESCRIPTION
tstregex is designed to solve the "Black Box" problem of Regular Expressions. When a complex regex fails, Perl only reports that there was no match, with no indication of where the attempt broke down. This tool identifies exactly where and why it failed by finding the longest possible partial match.
The "Nibbling" Engine
The diagnostic logic uses a "Nibbling" (grignotage) strategy:
1. Decomposition
The engine breaks down your regex into a hierarchy of valid sub-patterns (lexical groups, atoms, and quantifiers), ordered from longest to shortest.
2. Longest Match Search
It iteratively tests these sub-patterns against the input string. The question is not merely whether the start of the string matches, but how far through the regex's sequence of instructions the engine could get before hitting a wall.
3. Failure Point Identification
Once the longest matching sub-pattern is found, the tool identifies the very next token in your regex syntax. This is your "Point of Failure".
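The three steps above can be sketched in a few lines of plain Perl. This is a simplified illustration under stated assumptions, not the module's implementation: tokenize and longest_partial_match are hypothetical names, and the tokenizer only understands escapes, character classes, and single characters with an optional quantifier (no groups or alternation).

```perl
use strict;
use warnings;

# Step 1 (simplified): split the regex text into tokens, each an atom
# with an optional quantifier.
sub tokenize {
    my ($re) = @_;
    my $atom  = qr/ \\. | \[ (?: \\. | [^\]\\] )* \] | . /xs;
    my $quant = qr/ (?: [*+?] | \{ \d+ (?: , \d* )? \} ) [?+]? /x;
    return $re =~ / ( $atom $quant? ) /gx;
}

# Steps 2 and 3: try token prefixes from longest to shortest; the token
# right after the longest matching prefix is the point of failure.
sub longest_partial_match {
    my ($re, $string) = @_;
    my @tokens = tokenize($re);

    for (my $n = @tokens; $n > 0; $n--) {
        my $prefix   = join '', @tokens[0 .. $n - 1];
        my $compiled = eval { qr/$prefix/ } or next;   # skip invalid prefixes
        next unless $string =~ $compiled;
        return {
            matched_len => $+[0] - $-[0],
            fail_token  => $tokens[$n],   # undef when every token matched
        };
    }
    return { matched_len => 0, fail_token => $tokens[0] };
}

for my $input ('abc123', 'abc12a') {
    my $r = longest_partial_match('^[a-z]*\d{3}$', $input);
    printf "%s: matched %d chars, failing token: %s\n",
        $input, $r->{matched_len}, $r->{fail_token} // '(none)';
}
```

For 'abc12a' the sketch stops at the prefix ^[a-z]* (three characters matched) and reports \d{3} as the failing token, which agrees with the SYNOPSIS example. The real engine works on a richer hierarchy of sub-patterns, so its results can differ on more complex expressions.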
AUTHOR
Olivier Delouya - 2026
LICENSE
Artistic License 2.0