
NAME

Regexp::Common::ANSIescape -- regexps for ANSI terminal escapes

SYNOPSIS

 use Regexp::Common 'ANSIescape';

 if ($str =~ /$RE{ANSIescape}/) {
    ...
 }

 my $re1 = $RE{ANSIescape}{-only7bit};
 my $re2 = $RE{ANSIescape}{-sepstring};

DESCRIPTION

An ANSIescape pattern matches an ANSI terminal escape sequence like

    Esc[30;48m             # CSI sequence
    Esc[?1h                # CSI with private params
    EscU                   # C1 control
    Esc_ APPSTRING Esc\    # C1 with string param

    \x9B 30m               # ditto in 8-bit forms
    \x9B ?1h
    \x85
    \x9F APPSTRING \x9C

The 7-bit patterns are simply Esc followed by various combinations of printable ASCII "\x20" through "\x7E".

The 8-bit forms use bytes "\x80" through "\x9F". Use the -only7bit option below to omit the 8-bit patterns if those bytes might have another meaning in your data.

  • ISO-8859 character sets such as Latin-1 don't use \x80 through \x9F for printable characters, so those bytes are free to act as ANSI escapes.

  • Unicode code points \x80 through \x9F have the ANSI meaning, so Perl wide-char strings are fine (except on an EBCDIC system).

  • UTF-8 encoding uses bytes \x80 through \x9F as intermediate parts of normal characters, so you must either decode to code points first, or use -only7bit.

  • Other encodings may use \x80 through \x9F as normal characters, for example Windows code page 1252. Generally -only7bit should be used in that case.
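The UTF-8 point above can be seen in a small sketch. The `$csi` pattern here is a hand-written approximation of a CSI escape, assumed for illustration only; it is not the regexp this module builds. It shows how raw UTF-8 bytes can produce a false match on "\x9B" and how decoding to code points first avoids it.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hand-written approximation of a CSI escape, NOT the module's own regexp:
# introducer (7-bit Esc[ or 8-bit \x9B), parameter bytes, intermediate
# bytes, then one final byte.
my $csi = qr/(?:\e\[|\x9B)[\x30-\x3F]*[\x20-\x2F]*[\x40-\x7E]/;

# UTF-8 bytes for the text "Û5m": U+00DB encodes as \xC3 \x9B, so the raw
# bytes contain \x9B followed by "5m", which looks like an 8-bit CSI.
my $bytes = "\x{C3}\x{9B}5m";

my $raw_hit     = ($bytes =~ $csi) ? 1 : 0;   # false positive on raw bytes
my $chars       = decode('UTF-8', $bytes);    # now "\x{DB}5m" as code points
my $decoded_hit = ($chars =~ $csi) ? 1 : 0;   # no escape in the decoded text

print "raw bytes matched: $raw_hit, decoded matched: $decoded_hit\n";
```

When decoding is not practical, restricting the match to 7-bit forms with -only7bit sidesteps the same problem.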

The parameter part like "0" in "Esc[0m" can be any bytes 0x30 through 0x3F, so "private parameter" values like the VT100 "DECSET" extensions are matched.

OPTIONS

{-only7bit}
{-only8bit}

-only7bit matches only the 7-bit forms like "\eE", and -only8bit matches only the 8-bit forms like "\x{85}". The default is to match both. The 7-bit forms are the most common.

{-sepstring}

By default the string parameter to APC, DCS, OSC, PM and SOS is included in the match, for example an APC like

    \x{9F}Stringarg\x{9C}

is matched in its entirety. With -sepstring the pattern instead matches the start "\x{9F}" and the terminator "\x{9C}" individually, with the Stringarg part unmatched.
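As a rough illustration of the difference, the following sketch uses two hand-written stand-in patterns for APC only (`$apc_whole` and `$apc_sep` are hypothetical names, and neither is the module's actual regexp). One consumes the string argument, the other matches just the introducer and the terminator.

```perl
use strict;
use warnings;

# Illustrative stand-ins for APC only, NOT the module's own regexps.
# Whole-sequence style: introducer, string body and terminator in one match.
my $apc_whole = qr/(?:\e_|\x9F)[\x08-\x0D\x20-\x7E]*(?:\e\\|\x9C)/;
# Separate style: the introducer and the terminator each match on their own.
my $apc_sep = qr/\e_|\x9F|\e\\|\x9C/;

my $str = "ab\x{9F}Stringarg\x{9C}cd";

my @whole = $str =~ /$apc_whole/g;   # one match covering the string argument
my @sep   = $str =~ /$apc_sep/g;     # two matches; "Stringarg" is not consumed

printf "whole: %d match(es), separate: %d match(es)\n",
    scalar(@whole), scalar(@sep);
```

The separate style may suit a stream that arrives in pieces, where the terminator can turn up in a later chunk than the introducer.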

{-keep}

With the standard -keep option, parens are included to set the following capture variables:

$1

The entire escape sequence.

$2

The parameters to a CSI sequence. For example

    \e[30;49m    ->   30;49      (SGR)
    \e[?5h       ->   ?5         (DECSCNM extension)

$3

The intermediate characters (if any) and final character of a CSI escape. For example

    \e[30m       ->   m
    \e[30+P      ->   +P
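The capture layout can be tried out with a small sketch. The `$csi_keep` pattern below is a hand-written approximation covering CSI sequences only, assumed for illustration; the module's real -keep pattern handles more escape types, but the grouping mirrors the $1, $2, $3 description above.

```perl
use strict;
use warnings;

# Hand-written approximation of the -keep layout for CSI sequences only,
# NOT the module's own regexp.
#   $1 whole sequence, $2 parameter bytes, $3 intermediates plus final byte
my $csi_keep = qr/((?:\e\[|\x9B)([\x30-\x3F]*)([\x20-\x2F]*[\x40-\x7E]))/;

my %parts;
for my $seq ("\e[30;49m", "\e[?5h", "\e[30+P") {
    next unless $seq =~ $csi_keep;
    # Copy the captures before running another regexp, which would reset them.
    my ($whole, $params, $tail) = ($1, $2, $3);
    (my $shown = $whole) =~ s/\e/\\e/;        # printable form of the escape
    $parts{$shown} = [ $params, $tail ];
    print "$shown  params=$params  tail=$tail\n";
}
```

Copying the captures into lexical variables straight away is a general Perl habit worth keeping here, since any later successful match overwrites $1 through $3.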

SEE ALSO

Regexp::Common

The ANSI standard can be obtained as ECMA-48 at http://www.ecma-international.org/publications/standards/Ecma-048.htm

HOME PAGE

http://www.geocities.com/user42_kevin/perlio-via-escstatus/index.html

LICENSE

Copyright 2008, 2009 Kevin Ryde

PerlIO-via-EscStatus is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3, or (at your option) any later version.

PerlIO-via-EscStatus is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with PerlIO-via-EscStatus. If not, see http://www.gnu.org/licenses/.