Char::Eutf2 - Run-time routines for Char/UTF2.pm
    use Char::Eutf2;

    Char::Eutf2::split(...);
    Char::Eutf2::tr(...);
    Char::Eutf2::chop(...);
    Char::Eutf2::index(...);
    Char::Eutf2::rindex(...);
    Char::Eutf2::capture(...);
    Char::Eutf2::chr(...);
    Char::Eutf2::chr_;
    Char::Eutf2::glob(...);
    Char::Eutf2::glob_;

    # "no Char::Eutf2;" not supported

This module provides the run-time routines of the Char::UTF2 module. Because the Char::UTF2 module uses this module automatically, you need not use it directly.
Patches and problem reports to the author are welcome.
This Char::Eutf2 module first appeared in ActivePerl Build 522 Built under MSWin32 Compiled at Nov 2 1999 09:52:28
INABA Hitoshi <email@example.com>
This project was originated by INABA Hitoshi. For any questions, use <firstname.lastname@example.org> so we can share this file.
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
- Split string
    @split = Char::Eutf2::split(/pattern/,$string,$limit);
    @split = Char::Eutf2::split(/pattern/,$string);
    @split = Char::Eutf2::split(/pattern/);
    @split = Char::Eutf2::split('',$string,$limit);
    @split = Char::Eutf2::split('',$string);
    @split = Char::Eutf2::split('');
    @split = Char::Eutf2::split();
    @split = Char::Eutf2::split;

Scans a UTF-2 $string for delimiters that match pattern and splits the UTF-2 $string into a list of substrings, returning the resulting list value in list context, or the count of substrings in scalar context. The delimiters are determined by repeated pattern matching, using the regular expression given in pattern, so the delimiters may be of any size and need not be the same UTF-2 string on every match. If the pattern doesn't match at all, Char::Eutf2::split returns the original UTF-2 $string as a single substring. If it matches once, you get two substrings, and so on.

If $limit is specified and is not negative, the function splits into no more than that many fields. If $limit is negative, it is treated as if an arbitrarily large $limit had been specified. If $limit is omitted, trailing null fields are stripped from the result (which potential users of pop would do well to remember).

If the UTF-2 $string is omitted, the function splits the UTF-2 string in $_. If the pattern is also omitted, the function splits on whitespace, /\s+/, after skipping any leading whitespace.

If the pattern contains parentheses, then the substring matched by each pair of parentheses is included in the resulting list, interspersed with the fields that are ordinarily returned.
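The $limit and parentheses rules above are easier to see in a small example. The sketch below uses Perl's built-in split on ASCII data as a stand-in, on the assumption that Char::Eutf2::split follows the same rules for UTF-2 strings as described; the variable names are illustrative only.

```perl
use strict;
use warnings;

# ASCII stand-in data; the built-in split is used here as an assumption
# about how the rules described above behave.
my $record = "a,b,c,,,";

# $limit omitted: trailing null fields are stripped.
my @default = split /,/, $record;

# Non-negative $limit: no more than that many fields; the last field
# holds the unsplit remainder.
my @limited = split /,/, $record, 2;

# Negative $limit: treated as arbitrarily large, so trailing null
# fields are kept.
my @all = split /,/, $record, -1;

# Parentheses in the pattern: the matched delimiters are interspersed
# with the ordinary fields.
my @with_delims = split /(,)/, "a,b";

print scalar(@default), " fields without a limit\n";
print scalar(@limited), " fields with a limit of 2\n";
print scalar(@all), " fields with a limit of -1\n";
print join(" ", map { "[$_]" } @with_delims), "\n";
```

Comparing the three field counts shows why code that pops the last field should pass an explicit $limit.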
- Transliteration
    $tr = Char::Eutf2::tr($variable,$bind_operator,$searchlist,$replacementlist,$modifier);
    $tr = Char::Eutf2::tr($variable,$bind_operator,$searchlist,$replacementlist);

This function scans a UTF-2 string character by character and replaces all occurrences of the characters found in $searchlist with the corresponding character in $replacementlist. It returns the number of characters replaced or deleted. If no UTF-2 string is specified via the =~ operator, the $_ variable is translated.

The available $modifier values are:

    Modifier   Meaning
    ------------------------------------------------------
    c          Complement $searchlist
    d          Delete found but unreplaced characters
    s          Squash duplicate replaced characters
    ------------------------------------------------------
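To make the three modifiers concrete, here is a minimal sketch using Perl's built-in tr/// on an ASCII string. It assumes that Char::Eutf2::tr applies the same modifier semantics to UTF-2 characters, as the table above describes; the sample string and variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "hello world";

# No modifier: each character in the search list maps to the character
# at the same position in the replacement list.
(my $plain = $text) =~ tr/lo/01/;

# The return value is the number of characters replaced.
my $counted = $text;
my $count = ($counted =~ tr/lo/01/);

# d: characters found in the search list with no counterpart in the
# replacement list ("o" here) are deleted.
(my $deleted = $text) =~ tr/lo/0/d;

# s: runs of the same replaced character are squashed to one.
(my $squashed = $text) =~ tr/l/L/s;

# c: the search list is complemented, so everything except a-z matches.
(my $complemented = $text) =~ tr/a-z/_/c;

print "$plain ($count replaced)\n";
print "$deleted\n$squashed\n$complemented\n";
```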
- Chop string
    $chop = Char::Eutf2::chop(@list);
    $chop = Char::Eutf2::chop();
    $chop = Char::Eutf2::chop;

Chops off the last character of a UTF-2 string contained in the variable (or of the UTF-2 string in each element of @list) and returns the character chopped. The Char::Eutf2::chop operator is used primarily to remove the newline from the end of an input record, and is more efficient than s/\n$//. If no argument is given, the function chops the $_ variable.
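The difference between chopping a character and chopping a byte matters for multi-byte data. The following sketch is not the module's implementation: chop_utf8_char is a hypothetical helper, written only to illustrate the idea of removing one whole UTF-8 encoded character from a byte string, next to what the built-in chop does to the same bytes.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Char::Eutf2): removes the last UTF-8
# encoded character from the byte string referenced by $ref and returns it.
sub chop_utf8_char {
    my ($ref) = @_;
    # Either one ASCII byte, or a lead byte followed by its
    # continuation bytes, anchored at the end of the string.
    return $1 if $$ref =~ s/([\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]*)\z//;
    return '';
}

# "ab" followed by the three UTF-8 bytes of HIRAGANA LETTER A (U+3042).
my $bytes = "ab\xE3\x81\x82";
my $last  = chop_utf8_char(\$bytes);

# The built-in chop on the same byte string removes a single byte and
# leaves a truncated character behind.
my $copy = "ab\xE3\x81\x82";
chop $copy;

printf "character-aware: %d bytes left, %d bytes removed\n",
    length($bytes), length($last);
printf "built-in chop:   %d bytes left\n", length($copy);
```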
- Index string
    $pos = Char::Eutf2::index($string,$substr,$position);
    $pos = Char::Eutf2::index($string,$substr);

Returns the position of the first occurrence of $substr in the UTF-2 $string. $position, if specified, gives the position at which to start looking in the UTF-2 $string. Positions are integers counted from 0. If the substring is not found, the Char::Eutf2::index function returns -1.
- Reverse index string
    $pos = Char::Eutf2::rindex($string,$substr,$position);
    $pos = Char::Eutf2::rindex($string,$substr);

Works just like Char::Eutf2::index except that it returns the position of the last occurrence of $substr in the UTF-2 $string (a reverse index). The function returns -1 if $substr is not found. $position, if specified, is the rightmost position that may be returned, i.e., how far into the UTF-2 string the function may search.
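The role of $position differs between the two functions: for index it is where the search starts, for rindex it is the rightmost position that may be returned. The sketch below shows this with Perl's built-in index and rindex on an ASCII string, assuming the Char::Eutf2 versions follow the same rules as described above.

```perl
use strict;
use warnings;

my $string = "hello world, hello perl";

# First occurrence, searching from the start.
my $first = index($string, "hello");

# Start looking at position 1, which skips the match at position 0.
my $second = index($string, "hello", 1);

# Last occurrence in the whole string.
my $last = rindex($string, "hello");

# Rightmost position allowed is 12, so the later match is excluded.
my $capped = rindex($string, "hello", 12);

# A substring that does not occur yields -1.
my $missing = index($string, "python");

print "$first $second $last $capped $missing\n";
```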
- Make capture number
    $capturenumber = Char::Eutf2::capture($string);

This function is for internal use by m/ /i, s/ / /i, split and qr/ /i.
- Make character
    $chr = Char::Eutf2::chr($code);
    $chr = Char::Eutf2::chr_;

This function returns the character represented by $code in the character set. For example, Char::Eutf2::chr(65) is "A" in either ASCII or UTF-2, and Char::Eutf2::chr(0xE38182) is the UTF-2 HIRAGANA LETTER A. For the reverse of Char::Eutf2::chr, use Char::UTF2::ord.
- Filename expansion (globbing)
    @glob = Char::Eutf2::glob($string);
    @glob = Char::Eutf2::glob_;

Performs filename expansion (DOS-like globbing) on $string, returning the next successive name on each call. If $string is omitted, $_ is globbed instead. This function works when the pathname ends with chr(0x5C) on MSWin32.

For example, '..\\l*b\\file/*glob.p?' on MSWin32 or UNIX will work as expected (in that it will find something like '..\lib\File/DosGlob.pm' alright). Note that all path components are case-insensitive, and that backslashes and forward slashes are both accepted, and preserved. You may have to double the backslashes if you are putting them in literally, due to double-quotish parsing of the pattern by perl. A tilde ("~") expands to the current user's home directory.

Spaces in the argument delimit distinct patterns, so glob('*.exe *.dll') globs all filenames that end in .exe or .dll. If you want to put literal spaces in the glob pattern, you can escape them with double quotes, e.g. glob('c:/"Program Files"/*/*.dll').