NAME

CharClass::Matcher -- Generate C macros that match character classes efficiently

SYNOPSIS

    perl Porting/regcharclass.pl

DESCRIPTION

Dynamically generates macros for detecting special charclasses in latin-1, utf8, and codepoint forms. Macros can be set to return the length (in bytes) of the matched codepoint, or the codepoint itself.

To regenerate regcharclass.h, run this script from perl-root. No arguments are necessary.

Using WHATEVER as an example, the following macros will be produced:

is_WHATEVER(s,is_utf8)

is_WHATEVER_safe(s,e,is_utf8)

Do a lookup as appropriate based on the is_utf8 flag. When possible, comparisons involving octets < 128 are done before checking the is_utf8 flag, hopefully saving time.

is_WHATEVER_utf8(s)

is_WHATEVER_utf8_safe(s,e)

Do a lookup assuming the string is encoded in (normalized) UTF8.

is_WHATEVER_latin1(s)

is_WHATEVER_latin1_safe(s,e)

Do a lookup assuming the string is encoded in latin-1 (aka plain octets).

is_WHATEVER_cp(cp)

Check to see if the string matches a given codepoint (hypothetically a U32). The condition is constructed so as to "break out" as early as possible if the codepoint is out of range of the condition.

IOW:

  (cp==X || (cp>X && (cp==Y || (cp>Y && ...))))

Thus if the codepoint is less than X (X-1, say), only two comparisons will be done. This makes matching lookups slower, but non-matching ones faster.
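As an illustration of the nested condition, here is a minimal C sketch. The class EXAMPLE and its members (0x0A, 0x0D, 0x85) are made up for this example, and the macro is hand-written in the shape described above; what regcharclass.pl actually emits may differ in detail.

```c
#include <assert.h>

/* Hypothetical class EXAMPLE = { 0x0A, 0x0D, 0x85 }, in ascending order.
 * Hand-written to mirror the (cp==X || (cp>X && (cp==Y || ...))) shape;
 * not copied from regcharclass.h. */
#define is_EXAMPLE_cp(cp)                                  \
    ( (cp) == 0x0A || ( (cp) > 0x0A &&                     \
    ( (cp) == 0x0D || ( (cp) > 0x0D &&                     \
      (cp) == 0x85 ) ) ) )

/* Small self-check for the sketch above. */
static void check_EXAMPLE_cp(void)
{
    /* 0x09 is below the first member: cp==0x0A and cp>0x0A both fail,
     * so evaluation stops there. */
    assert(!is_EXAMPLE_cp(0x09));
    assert( is_EXAMPLE_cp(0x0A));   /* first member                     */
    assert(!is_EXAMPLE_cp(0x0B));   /* falls between two members        */
    assert( is_EXAMPLE_cp(0x85));   /* last member, deepest in the nest */
}
```

Because the members are tested in ascending order, a codepoint below the smallest member is rejected without touching the deeper comparisons, while a match on the largest member has to walk the whole chain.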

Additionally, it is possible to generate what_ variants that return the codepoint read instead of the number of octets read. This can be done by suffixing '-cp' to the type description.
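The difference between the is_ and what_ flavors, and the early octet test in the is_utf8 dispatch, can be sketched in C as follows. Everything here is an assumption made for illustration: the class EXAMPLE (U+000A, U+0085, U+2028), the U8 typedef standing in for perl's type, the reading of e as a pointer one past the last readable byte, and the choice to return 0 when nothing matches. The generated macros are not necessarily written this way.

```c
#include <assert.h>

typedef unsigned char U8;   /* stand-in for perl's U8; s and e are U8 pointers */

/* Hypothetical class EXAMPLE = { U+000A, U+0085, U+2028 }.
 * UTF-8 forms: 0A / C2 85 / E2 80 A8.  Returns octets matched, or 0. */
#define is_EXAMPLE_utf8_safe(s, e)                                        \
    ( (e) - (s) >= 1 && (s)[0] == 0x0A ? 1                                \
    : (e) - (s) >= 2 && (s)[0] == 0xC2 && (s)[1] == 0x85 ? 2              \
    : (e) - (s) >= 3 && (s)[0] == 0xE2 && (s)[1] == 0x80                  \
                     && (s)[2] == 0xA8 ? 3                                \
    : 0 )

/* Same tests, but yielding the codepoint read instead of the length. */
#define what_EXAMPLE_utf8_safe(s, e)                                      \
    ( (e) - (s) >= 1 && (s)[0] == 0x0A ? 0x0A                             \
    : (e) - (s) >= 2 && (s)[0] == 0xC2 && (s)[1] == 0x85 ? 0x85           \
    : (e) - (s) >= 3 && (s)[0] == 0xE2 && (s)[1] == 0x80                  \
                     && (s)[2] == 0xA8 ? 0x2028                           \
    : 0 )

/* Latin-1: one octet per character, so U+2028 cannot occur. */
#define is_EXAMPLE_latin1_safe(s, e)                                      \
    ( (e) - (s) >= 1 && ((s)[0] == 0x0A || (s)[0] == 0x85) ? 1 : 0 )

/* Dispatch: the octet < 128 case is identical in both encodings,
 * so it is tested before the is_utf8 flag is consulted. */
#define is_EXAMPLE_safe(s, e, is_utf8)                                    \
    ( (e) - (s) >= 1 && (s)[0] == 0x0A ? 1                                \
    : (is_utf8) ? is_EXAMPLE_utf8_safe(s, e)                              \
                : is_EXAMPLE_latin1_safe(s, e) )

/* Small self-check for the sketch above. */
static void check_EXAMPLE_utf8(void)
{
    static const U8 nel[] = { 0xC2, 0x85 };        /* U+0085 as UTF-8   */
    static const U8 ls[]  = { 0xE2, 0x80, 0xA8 };  /* U+2028 as UTF-8   */
    static const U8 l1[]  = { 0x85 };              /* U+0085 as Latin-1 */

    assert(is_EXAMPLE_utf8_safe(nel, nel + 2) == 2);
    assert(what_EXAMPLE_utf8_safe(nel, nel + 2) == 0x85);
    assert(what_EXAMPLE_utf8_safe(ls, ls + 3) == 0x2028);
    assert(is_EXAMPLE_utf8_safe(ls, ls + 2) == 0);  /* truncated input  */
    assert(is_EXAMPLE_safe(l1, l1 + 1, 0) == 1);    /* Latin-1 octet    */
    assert(is_EXAMPLE_safe(l1, l1 + 1, 1) == 0);    /* lone 0x85 is not UTF-8 for U+0085 */
}
```

The two UTF-8 macros share every comparison and differ only in the value each branch yields, which is why a single '-cp' suffix on the type description is enough to select between them.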

CODE FORMAT

    perltidy -st -bt=1 -bbt=0 -pt=0 -sbt=1 -ce -nwls== "%f"

AUTHOR

Author: Yves Orton (demerphq) 2007

BUGS

No tests directly here (although the regex engine will fail tests if this code is broken). Insufficient documentation and no Getopts handler for using the module as a script.

LICENSE

You may distribute under the terms of either the GNU General Public License or the Artistic License, as specified in the README file.
