Char::Eutf2 - Run-time routines for Char/


  use Char::Eutf2;


  # "no Char::Eutf2;" not supported


This module provides the run-time routines of the Char::UTF2 module. Because the Char::UTF2 module loads this module automatically, you do not need to use it directly.


Patches and problem reports to the author are welcome.


This Char::Eutf2 module first appeared in ActivePerl Build 522 Built under MSWin32 Compiled at Nov 2 1999 09:52:28


INABA Hitoshi <>

This project was originated by INABA Hitoshi. For any questions, use <> so we can share this file.


This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


Split string
  @split = Char::Eutf2::split(/pattern/,$string,$limit);
  @split = Char::Eutf2::split(/pattern/,$string);
  @split = Char::Eutf2::split(/pattern/);
  @split = Char::Eutf2::split('',$string,$limit);
  @split = Char::Eutf2::split('',$string);
  @split = Char::Eutf2::split('');
  @split = Char::Eutf2::split();
  @split = Char::Eutf2::split;

  Scans a UTF-2 $string for delimiters that match pattern and splits the UTF-2
  $string into a list of substrings, returning the resulting list value in list
  context, or the count of substrings in scalar context. The delimiters are
  determined by repeated pattern matching, using the regular expression given in
  pattern, so the delimiters may be of any size and need not be the same UTF-2
  $string on every match. If the pattern doesn't match at all, Char::Eutf2::split returns
  the original UTF-2 $string as a single substring. If it matches once, you get
  two substrings, and so on.
  If $limit is specified and is not negative, the function splits into no more than
  that many fields. If $limit is negative, it is treated as if an arbitrarily large
  $limit has been specified. If $limit is omitted, trailing null fields are stripped
  from the result (which potential users of pop would do well to remember).
  If UTF-2 $string is omitted, the function splits the $_ UTF-2 string.
  If the pattern is also omitted, the function splits on whitespace, /\s+/, after
  skipping any leading whitespace.
  If the pattern contains parentheses, then the substring matched by each pair of
  parentheses is included in the resulting list, interspersed with the fields that
  are ordinarily returned.
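
  The effect of $limit and of capturing parentheses can be sketched with Perl's
  built-in split on plain ASCII data. This assumes that Char::Eutf2::split gives
  the same results for ASCII input, which the description above suggests but this
  example does not verify. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Built-in split on ASCII data; Char::Eutf2::split is ASSUMED to agree here.
my $record = 'a,b,c,,,';

my @default   = split(/,/, $record);      # trailing empty fields are stripped
my @limited   = split(/,/, $record, 2);   # no more than two fields
my @unlimited = split(/,/, $record, -1);  # negative limit keeps trailing empty fields
my @captured  = split(/(,)/, 'a,b');      # parenthesized delimiter appears in the list

print scalar(@default), "\n";       # 3
print "$limited[1]\n";              # b,c,,,
print scalar(@unlimited), "\n";     # 6
print join('|', @captured), "\n";   # a|,|b
```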
Transliteration
  $tr = Char::Eutf2::tr($variable,$bind_operator,$searchlist,$replacementlist,$modifier);
  $tr = Char::Eutf2::tr($variable,$bind_operator,$searchlist,$replacementlist);

  This function scans a UTF-2 string character by character and replaces all
  occurrences of the characters found in $searchlist with the corresponding character
  in $replacementlist. It returns the number of characters replaced or deleted.
  If no UTF-2 string is specified via =~ operator, the $_ variable is translated.
  The supported $modifier values are:

  Modifier   Meaning
  c          Complement $searchlist
  d          Delete found but unreplaced characters
  s          Squash duplicate replaced characters
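
  The three modifiers can be illustrated with Perl's built-in tr operator on ASCII
  strings. This is only a sketch of the modifier semantics listed above; it does
  not call Char::Eutf2::tr, and it assumes that function follows the same rules
  for single-byte characters.

```perl
use strict;
use warnings;

# Built-in tr on ASCII data; Char::Eutf2::tr is ASSUMED to follow the same rules.
my $text = 'hello world';

# Return value: the number of characters matched by the search list.
(my $copy = $text) =~ tr/l/L/;
my $replaced = ($text =~ tr/l//);              # counts without modifying

# c: complement the search list (count everything that is NOT a-z).
my $non_lower = ($text =~ tr/a-z//c);

# d: delete characters that are found but have no replacement.
(my $no_digits = 'a1b2c3') =~ tr/0-9//d;

# s: squash runs of the same replaced character into one.
(my $squashed = 'bookkeeper') =~ tr/a-z//s;

print "$copy\n";        # heLLo worLd
print "$replaced\n";    # 3
print "$non_lower\n";   # 1
print "$no_digits\n";   # abc
print "$squashed\n";    # bokeper
```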
Chop string
  $chop = Char::Eutf2::chop(@list);
  $chop = Char::Eutf2::chop();
  $chop = Char::Eutf2::chop;

  Chops off the last character of a UTF-2 string contained in the variable (or
  UTF-2 strings in each element of a @list) and returns the character chopped.
  The Char::Eutf2::chop operator is used primarily to remove the newline from the end of
  an input record, and it is more efficient than s/\n$//. If no argument is given, the
  function chops the $_ variable.
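
  The difference between chopping a byte and chopping a character can be sketched
  as follows. The helper chop_utf8_char is hypothetical and is not the module's
  implementation; it assumes the string holds well-formed UTF-8 encoded bytes.

```perl
use strict;
use warnings;

# HYPOTHETICAL helper (not part of Char::Eutf2): remove and return the last
# UTF-8 encoded character of the byte string referenced by $ref.
sub chop_utf8_char {
    my ($ref) = @_;
    # Either one ASCII byte, or a lead byte plus its continuation bytes,
    # anchored at the very end of the string.
    if ($$ref =~ s/([\x00-\x7F]|[\xC2-\xF4][\x80-\xBF]*)\z//) {
        return $1;
    }
    return '';    # empty or malformed input: nothing chopped
}

my $by_char = "ab\xE3\x81\x82";    # "ab" followed by one three-byte character
my $char    = chop_utf8_char(\$by_char);

my $by_byte = "ab\xE3\x81\x82";
my $byte    = chop $by_byte;       # built-in chop removes only the final byte

print length($char), "\n";      # 3
print length($by_char), "\n";   # 2
print length($byte), "\n";      # 1
print length($by_byte), "\n";   # 4
```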
Index string
  $pos = Char::Eutf2::index($string,$substr,$position);
  $pos = Char::Eutf2::index($string,$substr);

  Returns the position of the first occurrence of $substr in UTF-2 $string.
  $position, if specified, is the position at which to start looking in the UTF-2
  $string. Positions are integers counted from 0. If the substring is not found,
  the Char::Eutf2::index function returns -1.
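
  The start position and the -1 return value can be sketched with Perl's built-in
  index on an ASCII string. This assumes Char::Eutf2::index reports the same
  positions for ASCII input; how positions are counted for multibyte characters is
  not shown here.

```perl
use strict;
use warnings;

# Built-in index on ASCII data; Char::Eutf2::index is ASSUMED to agree here.
my $string = 'hello world';

my $first   = index($string, 'o');       # search from the beginning
my $second  = index($string, 'o', 5);    # search from position 5 onward
my $missing = index($string, 'z');       # substring not present

# Collect every occurrence by restarting just past the previous hit.
my @all;
my $pos = 0;
while ((my $hit = index($string, 'o', $pos)) != -1) {
    push @all, $hit;
    $pos = $hit + 1;
}

print "$first $second $missing\n";   # 4 7 -1
print "@all\n";                      # 4 7
```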
Reverse index string
  $pos = Char::Eutf2::rindex($string,$substr,$position);
  $pos = Char::Eutf2::rindex($string,$substr);

  Works just like Char::Eutf2::index except that it returns the position of the last
  occurrence of $substr in UTF-2 $string (a reverse index). The function returns
  -1 if not found. $position, if specified, is the rightmost position that may be
  returned, i.e., how far in the UTF-2 string the function can search.
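
  The role of $position as the rightmost position that may be returned can be
  sketched with Perl's built-in rindex on an ASCII string, assuming
  Char::Eutf2::rindex behaves the same way for ASCII input.

```perl
use strict;
use warnings;

# Built-in rindex on ASCII data; Char::Eutf2::rindex is ASSUMED to agree here.
my $string = 'hello world';

my $last    = rindex($string, 'o');      # last occurrence anywhere
my $bounded = rindex($string, 'o', 6);   # last occurrence at or before position 6
my $missing = rindex($string, 'z');      # substring not present

print "$last $bounded $missing\n";   # 7 4 -1
```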
Make capture number
  $capturenumber = Char::Eutf2::capture($string);

  This function is for internal use by m/ /i, s/ / /i, split and qr/ /i.
Make character
  $chr = Char::Eutf2::chr($code);
  $chr = Char::Eutf2::chr_;

  This function returns the character represented by $code in the character
  set. For example, Char::Eutf2::chr(65) is "A" in either ASCII or UTF-2, and
  Char::Eutf2::chr(0x82a0) is a UTF-2 HIRAGANA LETTER A. For the reverse of Char::Eutf2::chr,
  use Char::UTF2::ord.
Filename expansion (globbing)
  @glob = Char::Eutf2::glob($string);
  @glob = Char::Eutf2::glob_;

  Performs filename expansion (DOS-like globbing) on $string, returning the next
  successive name on each call. If $string is omitted, $_ is globbed instead.
  This function works even when the pathname ends with chr(0x5C) on MSWin32.

  For example, C<<..\\l*b\\file/*glob.p?>> on MSWin32 or UNIX will work as
  expected (in that it will find something like '..\lib\File/').
  Note that all path components are
  case-insensitive, and that backslashes and forward slashes are both accepted,
  and preserved. You may have to double the backslashes if you are putting them in
  literally, due to double-quotish parsing of the pattern by perl.
  A tilde ("~") expands to the current user's home directory.

  Spaces in the argument delimit distinct patterns, so C<glob('*.exe *.dll')> globs
  all filenames that end in C<.exe> or C<.dll>. If you want to put in literal spaces
  in the glob pattern, you can escape them with double quotes,
  e.g. C<glob('c:/"Program Files"/*/*.dll')>.
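
  The space-delimited pattern list described above can be sketched with Perl's
  built-in glob in a scratch directory. This does not call Char::Eutf2::glob; it
  assumes that function accepts the same pattern list. The file names are
  made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Cwd qw(getcwd);

# Create a scratch directory with three empty files (names are illustrative).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.exe b.dll c.txt)) {
    open my $fh, '>', "$dir/$name" or die "cannot create $name: $!";
    close $fh;
}

# Built-in glob; Char::Eutf2::glob is ASSUMED to treat the space the same way.
my $old = getcwd();
chdir $dir or die "cannot chdir to $dir: $!";
my @found = sort glob('*.exe *.dll');    # two patterns separated by a space
chdir $old or die "cannot chdir back: $!";

print "@found\n";   # a.exe b.dll
```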