NAME
Ecma48::Util - A selection of subroutines supporting ANSI escape sequence handling
SYNOPSIS
use Ecma48::Util qw(remove_seqs move_seqs_before_lastnl ... quotectrl);
my $nude=quotectrl remove_bs_bolding remove_seqs remove_fillchars $decorated;
DESCRIPTION
Ecma48::Util contains a selection of subroutines which allow the handling of Ecma-48 based markup sequences, better known as ANSI escape sequences.
It helps to separate string handling from decorating. If you can't change the order of processing and you are forced to do your string handling after the decoration is already in effect, then you can find some adequate utility functions here.
USE CASES
Do you like colors in your terminal? Suppose someone has written a plugin that brings in the color, perhaps with the help of Term::ANSIColor. Unfortunately, things like chomp and testing whether a string is empty now start to fail. In that case this module is worth a look.
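The failure mode can be reproduced with core Perl alone. The following minimal sketch hard-codes a decorated line of the kind a color plugin might emit (the variable names are illustrative and not part of Ecma48::Util) and shows why chomp and an emptiness test misbehave:

```perl
use strict;
use warnings;

# A line as a hypothetical color plugin might emit it: the reset
# sequence "\e[m" comes after the newline.
my $line = "color\e[34;1mful\n\e[m";

my $copy    = $line;
my $removed = chomp $copy;    # chomp only strips a newline at the very end
print "chomp removed $removed character(s)\n";

# A string that shows nothing on screen is still not empty.
my $blank = "\e[34;1m\e[m";
print "visibly empty string has length ", length($blank), "\n";
```

Because the string ends in "m" rather than a newline, chomp removes nothing, and the string that looks empty still has a non-zero length.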
FUNCTIONS
By default Ecma48::Util does not export any subroutines. The subroutines defined are
- remove_seqs STRING
-
remove_seqs returns a string where well-formed Ecma-48 sequences from STRING are deleted.
$foo = remove_seqs "color\e[34;1mful\e[m example"; # colorful example
Keep in mind that this is not the right tool for secure disarmament. Not all terminal sequences are well-formed, and most terminals also accept sequences with some errors. See quotectrl.
- split_seqs STRING
-
split_seqs splits STRING and returns a list where escape sequences are marked by being scalar references.
@foo = split_seqs "color\e[34;1mful\e[m example"; # ( 'color', \"\e[34;1m", 'ful', \"\e[m", ' example' )
- ensure_terminating_nl STRING
-
Does a newline exist at the end of the visible part? If not, ensure_terminating_nl adds one.
$foo = ensure_terminating_nl "color\e[34;1mful\e[m"; # add \n
$foo = ensure_terminating_nl "color\e[34;1mful\n\e[m"; # as is
$foo = ensure_terminating_nl "color\e[34;1mful\e[m\n"; # as is
- remove_terminating_nl STRING
-
Similar to ensure_terminating_nl, but instead of making the string terminate with a newline, it leaves the string open-ended, without a newline at the end.
$foo = remove_terminating_nl "color\e[34;1mful\e[m"; # as is
$foo = remove_terminating_nl "color\e[34;1mful\n\e[m"; # as in previous example
$foo = remove_terminating_nl "color\e[34;1mful\e[m\n"; # ditto
- move_seqs_before_lastnl STRING
-
Makes your STRING chomp-friendly.
$foo = move_seqs_before_lastnl "color\e[34;1mful\n\e[m"; # "color\e[34;1mful\e[m\n"
- quote_ctrl STRING
-
Replaces control characters with a visible representation. Traditional linebreaks (\n, \r\n) are reasonable exceptions. quotectrl is an alias of quote_ctrl. When local $Ecma48::Util::PREFER_UNICODE_SYMBOLS=1 is set, control chars from C0 (\00..\x1F) and DEL (\x7F) are displayed with their Unicode symbol, e.g. \x{241B} = ␛.
$foo = quotectrl "color\e[34;1mful\n\e[m"; # "color\\e[34;1mful\n\\e[m"
local $Ecma48::Util::PREFER_UNICODE_SYMBOLS=1;
$foo = quotectrl "color\e[34;1mful\n\e[m"; # "color\x{241B}[34;1mful\n\x{241B}[m"
- quote_nongraph STRING
-
Like quote_ctrl, but applied to all non-printable characters. The decision is based on the [[:graph:]] regex class, and so depends on the settings of the locale pragma and the unicode_strings feature.
- ctrl_chars LIST
-
ctrl_chars returns the requested control characters or introducers. LIST can consist of names, character codes, or the actual control characters. Besides the coded character, its 7-bit equivalent is also returned if one exists. In scalar context it returns a regex matching all requested sequence introducers, including their alternatives.
@foo = ctrl_chars 'CSI'; # "\x9b", "\e\["
$foo = ctrl_chars 'CSI'; # as qr/\x9b|\e\[/
Multiple control characters can be given to ctrl_chars as separate parameters.
- seq_regex
-
seq_regex returns a regex which matches Ecma-48 sequences.
- remove_bs_bolding STRING
-
In the old days you could simulate bold printing by using BackSpace (\cH) and overstriking with the same character. Some terminals of the 7-bit era simulate the behavior of that kind of printer.
$foo = remove_bs_bolding "A\cHA\cHAB\cHB\cHCD\cHD"; # "AB\cHCD"
$foo = remove_bs_bolding "This was b\cHbo\cHol\cHld\cHd."; # "This was bold."
BS as combiner is defined in Ecma-6, and Ecma-43 mentions that it should not be used in 8-bit environments. It is not part of Ecma-48. However, if you have to deal with terminal sequences, you may also have to handle such issues.
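The overstrike pattern itself can be sketched with a single substitution. The following is not the module's implementation, only an illustration under the assumption that bolding always appears as a character followed by one or more repetitions of BS plus the same character. The helper name strip_overstrike is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Ecma48::Util: collapse a character
# that is overstruck with itself ("X\cHX", "X\cHX\cHX", ...) into "X".
sub strip_overstrike {
    my ($text) = @_;
    $text =~ s/([^\cH])(?:\cH\1)+/$1/sg;
    return $text;
}

print strip_overstrike("This was b\cHbo\cHol\cHld\cHd."), "\n";
```

A backspace between two different characters (as in "B\cHC") does not match the pattern and is left untouched, which agrees with the first remove_bs_bolding example above.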
- replace_bs_bolding STRING, [PRE, [POST], [INTER]]
-
Like remove_bs_bolding, but allows you to mark the bold substrings in other ways. The default is bright/bold mode.
$foo = replace_bs_bolding "This is b\cHbo\cHol\cHld\cHd."; # "This is \e[1mbold\e[22m."
$foo = replace_bs_bolding "This is b\cHbo\cHol\cHld\cHd.",'*'; # "This is *bold*."
$foo = replace_bs_bolding "This is b\cHbo\cHol\cHld\cHd.",1,0; # "This is \e[1mbold\e[0m."
$foo = replace_bs_bolding "This is b\cHbo\cHol\cHld\cHd.",'','','_'; # "This is b_o_l_d."
If you specify PRE but not POST, this function tries to guess the closing sequence.
- closing_seq STRING
-
Tries to find the sequence which resets what STRING has changed.
$foo = closing_seq "\e[2m"; # "\e[22m"
$foo = closing_seq "\e[3h"; # "\e[3l"
Of course this is only an approximation, because no strict 1:1 mapping exists. This function is also used internally by replace_bs_bolding. As a bonus, it finds counterparts for braces and so on.
$foo = closing_seq '{[('; # ')]}'
$foo = closing_seq '.oO '; # ' Oo.'
$foo = closing_seq '==>>'; # '<<=='
$foo = closing_seq '_*/'; # '/*_'
$foo = closing_seq "\x{25C4}"; # "\x{25BA}"
$foo = closing_seq "\x{2767}"; # "\x{2619}"
\x{25C4} = ◄, \x{25BA} = ►, \x{2767} = ❧, \x{2619} = ☙
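The bracket-mirroring part of this behavior can be approximated in a few lines of core Perl. This is a sketch of the idea, not the module's code, and it covers only ASCII brackets; the Unicode pairs shown above would need a larger mapping table. The helper name mirror_brackets is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Ecma48::Util: reverse the string and
# swap each ASCII bracket for its counterpart.
sub mirror_brackets {
    my ($opening) = @_;
    my $closing = reverse $opening;        # scalar context: reverses the string
    $closing =~ tr/([{<>}])/)]}><{[(/;     # ( <-> ), [ <-> ], { <-> }, < <-> >
    return $closing;
}

print mirror_brackets('{[('), "\n";    # prints ")]}"
print mirror_brackets('==>>'), "\n";   # prints "<<=="
```

Characters without a counterpart, such as "=" or ".", are simply carried over in reversed order.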
- remove_fillchars STRING
-
remove_fillchars removes NUL (\00) and DEL (\x7F) characters. It also removes CRs (\r) which are placed directly before other CRs, because CR is idempotent.
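A rough equivalent of this clean-up can be written with tr and one substitution. The sketch below assumes that the CR rule means "collapse a run of consecutive CRs into a single CR", which is one reading of the description and not a statement about the module's internals. The helper name strip_fill is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Ecma48::Util.
sub strip_fill {
    my ($text) = @_;
    $text =~ tr/\0\x7F//d;      # delete NUL and DEL
    $text =~ s/\r+(?=\r)//g;    # drop every CR that is followed by another CR
    return $text;
}

print length(strip_fill("a\0b\x7Fc")), "\n";   # prints 3
```

Deleting the fill characters first means that CRs separated only by NUL or DEL become adjacent and are then collapsed as well.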
IMPORT TAGS
:all exports all functions, and :var exports $PREFER_UNICODE_SYMBOLS.
CAVEATS
- Mixed 7-bit/8-bit work-flow
-
This module does not entirely honor the extension for handling Ecma-35 artefacts in 7-bit/8-bit transformation processes. If you have to work under such unusual circumstances, try to use this module before such transformations take effect.
- Escape sequences outside the Ecma48 universe
-
Some terminal commands violate the schema and are not matched by these routines.
- Different handling compared to terminal (emulators)
-
Most terminals execute ill-formed codes after applying some error correction. But these sequences are ignored by this module and are returned as-is.
- Fill-chars inside escape sequences
-
The standard is unclear in this respect. In any case, it shouldn't be an issue nowadays. However, a separate function, remove_fillchars, exists for preprocessing.
KNOWN BUGS
Returns wrong results under character sets such as EBCDIC.
SEE ALSO
Ecma-48, ISO 6429, ANSI X3.64, A List of many Escape Sequences
LOOSELY RELATED
Term::ANSIColor, Win32::Console::ANSI
COPYRIGHT
(c) 2012 Josef. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.