NAME

tstregex - A hybrid regex diagnostic tool (single-file library module and command-line tool) that shows the longest regular expression match and highlights the rejected part

SYNOPSIS

# Example of command and its terminal output:

tstregex.pm '/^[a-z]*\d{3}$/' 'abc123' 'abc12a'

abc123

abc-12a (^[a-z]*-\d{3}$)

# In the terminal, the rejected part of the string and the failing regex token are shown in bold.

OPTIONS (CLI)

-h --help

Shows this help.

-v --verbose

Shows key information about what matched and what did not.

-d --diag

Triggers the Enriched Diagnostic View. It displays:

- The string with the failing part highlighted.
- The exact token in the regex that caused the break.
- A visual pointer (^--- HERE) aligned with the regex syntax.
- Execution time (useful for spotting ReDoS/exponential backtracking).
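The execution time mentioned above can be approximated outside the tool with the core Time::HiRes module. The following is a minimal sketch, not the tool's own implementation; the helper name timed_match is hypothetical.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: run one match attempt and return
# (matched flag, elapsed wall-clock seconds).
sub timed_match {
    my ($re, $str) = @_;
    my $t0      = time();
    my $matched = ($str =~ $re) ? 1 : 0;
    return ($matched, time() - $t0);
}

# A nested quantifier is the classic shape of a backtracking-prone pattern.
my ($ok, $secs) = timed_match(qr/^(a+)+$/, ('a' x 20) . 'b');
printf "matched=%d elapsed=%.6fs\n", $ok, $secs;
```

An elapsed time that grows sharply as the input gets a few characters longer is a hint that the pattern backtracks excessively.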

-a --assert

Misc: runs a large test suite (a collection of regexp tests) with tstregex.

Perl Module SYNOPSIS

use tstregex;

my $ctx = tstregex_init_desc('/^\d{3}/');   # parse the regex once
tstregex($ctx, '12a');                      # run the diagnostic; results are stored in $ctx
if (!tstregex_is_full_match($ctx))
    {
    my $token = tstregex_get_fail_token($ctx);
    my $pos   = tstregex_get_match_len($ctx);
    print "Failure on token '$token' at column $pos\n";
    }

API

tstregex_init_desc($raw_re)

Pre-parses the regex, handles delimiters (m!!, //, etc.), extracts modifiers (i, s, m, x), and prepares the nibbling steps. Returns a context hash.
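As a rough illustration of the delimiter and modifier handling described above, the sketch below splits a raw regex string into its pattern body and modifiers. The helper name split_raw_regex and the keys of the returned hash are hypothetical; the module's actual parser may work differently and may cover more cases (escaped delimiters, for instance).

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's real parser: split a raw regex
# such as '/^\d{3}/i' or 'm!foo!x' into pattern body and modifiers.
sub split_raw_regex {
    my ($raw) = @_;
    my %closer = ('(' => ')', '[' => ']', '{' => '}', '<' => '>');

    # Optional leading 'm', then one delimiter (not a word char or space).
    return unless $raw =~ /\A(m?)([^\w\s])/;
    my ($prefix, $open) = ($1, $2);
    my $close = $closer{$open} // $open;

    my $start = length($prefix) + 1;      # offset of the pattern body
    my $end   = rindex($raw, $close);     # position of the closing delimiter
    return if $end < $start;

    my $mods = substr($raw, $end + 1);
    return unless $mods =~ /\A[ismx]*\z/; # only the modifiers listed above

    return {
        pattern   => substr($raw, $start, $end - $start),
        modifiers => $mods,
        offset    => $start,
    };
}

my $parts = split_raw_regex('/^\d{3}/i');
print "pattern=$parts->{pattern} modifiers=$parts->{modifiers}\n";
# prints: pattern=^\d{3} modifiers=i
```

Taking the last occurrence of the closing delimiter keeps the sketch short, at the cost of not validating what lies between the delimiters.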

tstregex($ctx, $string)

Executes the diagnostic. Updates the context.

tstregex_is_full_match

Returns the match status of the input string (boolean: 0 or 1).

tstregex_get_match_portion

Returns the matching portion in case of a full match (it may be shorter than the input string, depending on anchors).
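The effect of anchors on the matched portion can be seen with plain Perl, independently of this module. In the sketch below the helper name matched_portion is hypothetical; it relies only on the built-in @- and @+ offset arrays.

```perl
use strict;
use warnings;

# Hypothetical helper: return the substring actually covered by the match,
# or undef when the regex does not match at all.
sub matched_portion {
    my ($re, $str) = @_;
    return undef unless $str =~ $re;
    return substr($str, $-[0], $+[0] - $-[0]);
}

# Unanchored: the regex matches, but only a part of the input.
print matched_portion(qr/\d{3}/, 'id=123;'), "\n";      # prints: 123

# Anchored at both ends: the portion is the whole input.
print matched_portion(qr/^id=\d{3};$/, 'id=123;'), "\n"; # prints: id=123;
```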

tstregex_get_match_len

Returns the length of the matching substring.

tstregex_get_fail_token

Returns the failing token in the regexp.

tstregex_get_re_clean

Returns the matching regexp subpart.

tstregex_get_re_raw

Returns the internal representation of the regexp.

tstregex_get_prefix_offset

Returns the offset of the original regexp in the raw regexp.

DESCRIPTION

tstregex is designed to solve the "Black Box" problem of Regular Expressions. When a complex regex fails, Perl usually just says "No Match". This tool identifies exactly where and why it failed by finding the longest possible partial match.

The "Nibbling" Engine

The diagnostic logic uses a "Nibbling" (grignotage) strategy:

1. Decomposition

The engine breaks down your regex into a hierarchy of valid sub-patterns (lexical groups, atoms, and quantifiers) from longest to shortest.

2. Iterative Testing

The engine iteratively tests these sub-patterns against the input string. It does not just check whether the start matches; it determines the maximum sequence of instructions the engine could follow before hitting a wall.

3. Failure Point Identification

Once the longest matching sub-pattern is found, the tool identifies the very next token in your regex syntax. This is your "Point of Failure".
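The three steps can be sketched in a few lines of Perl. This is a simplified illustration under stated assumptions, not the module's actual engine: the tokenizer only knows escapes, character classes, single characters and their quantifiers (no groups or alternation), and the names tokenize and nibble, as well as the keys of the returned hash, are hypothetical.

```perl
use strict;
use warnings;

# Step 1 (simplified): split the pattern into tokens, each being one atom
# (escape, character class, or single character) plus an optional quantifier.
sub tokenize {
    my ($pattern) = @_;
    my @tokens = $pattern =~ /
        (
          (?: \\. | \[ (?: \\. | [^\]\\] )* \] | . )
          (?: (?: [*+?] | \{ \d+ (?: , \d* )? \} ) [?+]? )?
        )
    /gxs;
    return @tokens;
}

# Steps 2 and 3: test token prefixes from longest to shortest; the token
# right after the longest matching prefix is reported as the failure point.
sub nibble {
    my ($pattern, $string) = @_;
    my @tokens = tokenize($pattern);
    for (my $n = @tokens; $n > 0; $n--) {
        my $sub = join '', @tokens[0 .. $n - 1];
        my $re  = eval { qr/$sub/ } or next;   # skip prefixes that do not compile
        next unless $string =~ $re;
        return {
            re_clean   => $sub,
            match      => substr($string, $-[0], $+[0] - $-[0]),
            fail_token => $n < @tokens ? $tokens[$n] : undef,
        };
    }
    return { re_clean => '', match => '', fail_token => $tokens[0] };
}

my $r = nibble('^[a-z]*\d{3}$', 'abc12a');
printf "matched '%s', failed on token '%s'\n",
    $r->{match}, $r->{fail_token} // '(none)';
# prints: matched 'abc', failed on token '\d{3}'
```

Working on whole tokens rather than single characters keeps a quantified atom such as \d{3} intact, so the reported failure point is a meaningful piece of the regex.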

AUTHOR

Olivier Delouya - 2026

LICENSE

Artistic License 2.0