mb - run Perl script in MBCS encoding (not only CJK ;-)


  $ perl    
  $ perl -e big5
  $ perl -e big5hkscs
  $ perl -e eucjp
  $ perl -e gb18030
  $ perl -e gbk
  $ perl -e sjis
  $ perl -e uhc
  $ perl -e utf8

  MBCS subroutines:
    mb::do 'file';
    mb::eval 'string';
    mb::require 'file';
    use mb::PERL Module;

  MBCS special variables:

  supported encodings:
    Big5, Big5-HKSCS, EUC-JP, GB18030, GBK, Sjis, UHC, UTF-8

  supported operating systems:
    Apple Inc. OS X,
    Hewlett-Packard Development Company, L.P. HP-UX,
    International Business Machines Corporation AIX,
    Microsoft Corporation Windows,
    Oracle Corporation Solaris,
    and Other Systems

  supported perl versions:
    perl version 5.005_03 to newest perl


To install this software by make, type the following:

   perl Makefile.PL
   make test
   make install


To install this software without make, type the following:

   pmake.bat test
   pmake.bat install


  This software is a source code filter, a transpiler-modulino.

  Perl is said to have been able to handle Unicode since version 5.8. However,
  unlike in JPerl, "easy jobs easy" was lost (but now we have it back :-D).

  Shift_JIS and similar encodings (Big5, Big5-HKSCS, GB18030, GBK, Sjis, UHC)
  contain DAMEMOJI: characters whose second octet is a metacharacter. Which
  characters are DAMEMOJI depends on whether the enclosing delimiter is a
  single quote or a double quote.

  This software escapes DAMEMOJI in your script, generates a new script, and
  runs it.

  Several MBCS encodings are in use around the world:
  in Japan since 1978, JIS C 6226-1978,
  in China since 1980, GB 2312-80,
  in Taiwan since 1984, Big5,
  in South Korea since 1991, KS X 1002:1991, and more.
  Even if you are an avid Unicode proponent, you cannot change this fact. These
  encodings are still used today in most areas except the world wide web.

  This software ...
  * supports MBCS literals in Perl scripts
  * supports Big5, Big5-HKSCS, EUC-JP, GB18030, GBK, Sjis, UHC, and UTF-8
  * does not use the UTF8 flag to avoid MOJIBAKE
  * escapes DAMEMOJI in scripts
  * handles raw encoding to support GAIJI
  * adds multibyte anchoring to regular expressions
  * rewrites character classes in regular expressions to work on MBCS codepoints
  * supports special variables $`, $&, and $'
  * does not change features of octet-oriented built-in functions
  * lc(), lcfirst(), uc(), and ucfirst() convert US-ASCII only
  * hyphenated character ranges in regular expressions support US-ASCII only
  * tr/// and y/// do not support hyphenated ranges
  * you have to call mb::* subroutines if you want codepoint semantics
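
  As a rough illustration of what "multibyte anchoring" means, the sketch
  below uses plain Perl only and does not load this software. The names $char
  and $anchor are hypothetical, and the Sjis octet ranges are the ones listed
  later in this document. It shows an unanchored match landing on the second
  octet of a double-byte character, and an anchored match that skips whole
  characters first.

```perl
use strict;
use warnings;

# One Sjis character, matched atomically so the regex engine cannot
# backtrack into the middle of a double-byte character.
my $char   = qr/(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x00-\xFF])/;

# Hypothetical anchor: skip zero or more whole characters before the match.
my $anchor = qr/(?:$char)*?/;

# KATAKANA A (octets 83 41) followed by US-ASCII "A" (octet 41)
my $string = "\x83\x41" . "A";

# Unanchored: matches the second octet of KATAKANA A (a false hit).
$string =~ /A/ or die "no match";
my $naive = $-[0];

# Anchored: only whole characters are skipped, so the real "A" is found.
$string =~ /\A($anchor)A/ or die "no match";
my $anchored = length $1;

print "unanchored match at octet $naive\n";    # 1
print "anchored match at octet $anchored\n";   # 2
```

  The atomic group (?>...) is what keeps the engine from retrying the first
  octet of a double-byte character as a single-octet character.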

  Let's enjoy MBCS scripting in Perl, together!!


  To understand and use this software, you need to know some terminology.
  But I have no time to write it up now. Today is July 7th, so I have to go
  and meet Juliet.
  The necessary terms are listed below. Maybe the World Wide Web will help you.
  • byte

  • octet

  • encoding

  • decode

  • character

  • code point

  • grapheme

  • SBCS(Single Byte Character Set)

  • DBCS(Double Byte Character Set)

  • MBCS(Multibyte Character Set)

  • multibyte anchoring

  • character class





MBCS Encodings supported by this software

  The encodings supported by this software and their ranges of octets are as
  follows.

  big5 (Big5)
             1st       2nd
             81..FE    00..FF
  big5hkscs (Big5-HKSCS)
             1st       2nd
             81..FE    00..FF
  eucjp (EUC-JP)
             1st       2nd
             A1..FE    00..FF
  gb18030 (GB18030)
             1st       2nd       3rd       4th
             81..FE    30..39    81..FE    30..39
             81..FE    00..FF
  gbk (GBK)
             1st       2nd
             81..FE    00..FF
  sjis (Shift_JIS-like encodings)
             1st       2nd
             81..9F    00..FF
             E0..FC    00..FF
  uhc (UHC)
             1st       2nd
             81..FE    00..FF
  utf8 (UTF-8)
             1st       2nd       3rd       4th
             E1..EC    80..BF    80..BF
             C2..DF    80..BF
             EE..EF    80..BF    80..BF
             F0..F0    90..BF    80..BF    80..BF
             E0..E0    A0..BF    80..BF
             ED..ED    80..9F    80..BF
             F1..F3    80..BF    80..BF    80..BF
             F4..F4    80..8F    80..BF    80..BF
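
  The octet ranges above are enough to split an octet string into characters.
  The following is a minimal sketch for Sjis only, written in plain Perl
  without loading this software. The helper name sjis_chars is hypothetical
  and is not part of this software.

```perl
use strict;
use warnings;

# Sjis ranges from the table above: a double-byte character starts with
# 81..9F or E0..FC and is followed by any octet; everything else is one octet.
my $sjis_char = qr/[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x00-\xFF]/;

# Hypothetical helper: split an octet string into a list of Sjis characters.
sub sjis_chars {
    my ($octets) = @_;
    return $octets =~ /\G($sjis_char)/g;
}

# "A", KATAKANA SO (83 5C), KATAKANA A (83 41), "B"
my $string = "A" . "\x83\x5C" . "\x83\x41" . "B";
my @chars  = sjis_chars($string);

printf "%d octets, %d characters\n", length($string), scalar(@chars);
print join(' ', map { uc(unpack('H*', $_)) } @chars), "\n";
# prints:
#   6 octets, 4 characters
#   41 835C 8341 42
```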

MBCS subroutines provided by this software

  This software provides the traditional features "as was." The new MBCS
  features are provided by subroutines with new names. If you like the utf8
  pragma, the mb::* subroutines will help you. On the other hand, if you love
  JPerl, those subroutines will not help you very much. The traditional
  functions of Perl are still useful today with octet-oriented semantics.

  elder <--                            age                             --> younger
  bare Perl4         JPerl4                                                       
  bare Perl5         JPerl5             use utf8;                  
  bare Perl7                            pragma             modulino               
  chop               ---                ---                chop                   
  chr                chr                bytes::chr         chr                    
  getc               getc               ---                getc                   
  index              ---                bytes::index       index                  
  lc                 lc                 ---                lc                     
  lcfirst            lcfirst            ---                lcfirst                
  length             length             bytes::length      length                 
  ord                ord                bytes::ord         ord                    
  reverse            reverse            ---                reverse                
  rindex             ---                bytes::rindex      rindex                 
  substr             substr             bytes::substr      substr                 
  uc                 uc                 ---                uc                     
  ucfirst            ucfirst            ---                ucfirst                
  ---                chop               chop               mb::chop               
  ---                ---                chr                mb::chr                
  ---                ---                getc               mb::getc               
  ---                index              ---                mb::index_byte         
  ---                ---                index              mb::index              
  ---                ---                lc                 ---                    
  ---                ---                lcfirst            ---                    
  ---                ---                length             mb::length             
  ---                ---                ord                mb::ord                
  ---                ---                reverse            mb::reverse            
  ---                rindex             ---                mb::rindex_byte        
  ---                ---                rindex             mb::rindex             
  ---                ---                substr             mb::substr             
  ---                ---                uc                 ---                    
  ---                ---                ucfirst            ---                    
  do 'file'          ---                ---                do 'file'              
  eval 'string'      ---                ---                eval 'string'          
  require 'file'     ---                ---                require 'file'         
  use Module         ---                ---                use Module             
  ---                do 'file'          do 'file'          mb::do 'file'          
  ---                eval 'string'      eval 'string'      mb::eval 'string'      
  ---                require 'file'     require 'file'     mb::require 'file'     
  ---                use Module         use Module         use mb::PERL Module    
  $^X                ---                ---                $^X                    
  ---                $^X                $^X                $mb::PERL              
  $0                 $0                 $0                 $mb::ORIG_PROGRAM_NAME 
  ---                ---                ---                $0                     

  DOS-like glob() as MBCS subroutine
  MBCS semantics          broken function, not so useful
  mb::dosglob             glob, and <globbing*>

  index brothers
  functions or subs       works           returns         considered
  index                   as octet        as octet        useful, bare Perl like
  rindex                  as octet        as octet        useful, bare Perl like
  mb::index               as codepoint    as codepoint    not so useful, utf8 pragma like
  mb::rindex              as codepoint    as codepoint    not so useful, utf8 pragma like
  mb::index_byte          as codepoint    as octet        useful, JPerl like
  mb::rindex_byte         as codepoint    as octet        useful, JPerl like
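
  The three rows of behavior in the table can be imitated in plain Perl. The
  sketch below does not call this software; find_at_boundary is a
  hypothetical helper that searches only at Sjis character boundaries and
  reports both a codepoint offset and an octet offset, which is roughly the
  difference the table describes between the mb::index and mb::index_byte
  styles.

```perl
use strict;
use warnings;

my $sjis_char = qr/[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x00-\xFF]/;

# Hypothetical helper: find $target at a character boundary of $string.
# Returns (codepoint offset, octet offset), or (-1, -1) when not found.
sub find_at_boundary {
    my ($string, $target) = @_;
    my ($chars, $octets) = (0, 0);
    while ($octets <= length($string) - length($target)) {
        return ($chars, $octets)
            if substr($string, $octets, length $target) eq $target;
        # advance by one whole character
        substr($string, $octets) =~ /\A($sjis_char)/ or last;
        $octets += length $1;
        $chars++;
    }
    return (-1, -1);
}

# KATAKANA SO (83 5C), then "abc", then a real backslash (5C)
my $string = "\x83\x5C" . "abc" . "\\";

my $octet_index = index($string, "\\");                  # works and returns as octets
my ($cp_offset, $byte_offset) = find_at_boundary($string, "\\");

print "index (octet/octet):       $octet_index\n";   # 1, inside KATAKANA SO
print "codepoint search, as cp:   $cp_offset\n";     # 4
print "codepoint search, as byte: $byte_offset\n";   # 5
```

  The octet offset is the one you can pass straight to substr, which may be
  why the table marks the *_byte variants as useful.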

MBCS special variables provided by this software

  This software provides the following two special variables for convenience.
  • $mb::PERL

      system(qq{ $^X });                             # you used to write this...
                                                     # on modulino
      system(qq{ $^X });                             # for SBCS script
      system(qq{ $mb::PERL });                       # for MBCS script

  • $mb::ORIG_PROGRAM_NAME

      if ($0 =~ /-x64\.pl\z/) { ... }                # you used to write this...
                                                     # on modulino
      if ($0 =~ /-x64\.pl\z/) { ... }                # tests the program name translated by modulino (is that what you want?)
      if ($mb::ORIG_PROGRAM_NAME =~ /-x64\.pl\z/) { ... }  # tests the original program name, not translated by modulino

Porting from script in bare Perl4, bare Perl5, and bare Perl7

  original script in        script with
  Perl4, Perl5, Perl7       modulino
  chop                      chop
  chr                       chr
  do 'file'                 do 'file'
  eval 'string'             eval 'string'
  getc                      getc
  index                     index
  lc                        lc
  lcfirst                   lcfirst
  length                    length
  no Module                 no Module
  ord                       ord
  require 'file'            require 'file'
  reverse                   reverse
  rindex                    rindex
  substr                    substr
  uc                        uc
  ucfirst                   ucfirst
  use Module                use Module

Porting from script in JPerl4, and JPerl5

  original script in        script with
  JPerl4, JPerl5            modulino
  chop                      mb::chop
  do 'file'                 mb::do 'file'
  eval 'string'             mb::eval 'string'
  index                     mb::index_byte
  no Module                 no mb::PERL Module *1
  require 'file'            mb::require 'file'
  rindex                    mb::rindex_byte
  use Module                use mb::PERL Module *1
  *1 mb::PERL module comes later

Porting from script with utf8 pragma

  original script with      script with
  utf8 pragma               modulino
  chop                      mb::chop
  chr                       mb::chr
  do 'file'                 mb::do 'file'
  eval 'string'             mb::eval 'string'
  getc                      mb::getc
  index                     mb::index
  lc                        ---
  lcfirst                   ---
  length                    mb::length
  no Module                 no mb::PERL Module *2
  ord                       mb::ord
  require 'file'            mb::require 'file'
  reverse                   mb::reverse
  rindex                    mb::rindex
  substr                    mb::substr
  uc                        ---
  ucfirst                   ---
  use Module                use mb::PERL Module *2
  *2 mb::PERL module comes later, and module must be without utf8 pragma.


  In single-quote-like constructs ('', q{}, <<'END', qw{}, m'', s''',
  split(''), split(m''), and qr''), DAMEMOJI are double-byte characters whose
  second octet is the following metacharacter.
  hex   character as US-ASCII
  5C    [\]    backslashed escapes

  In double-quote-like constructs ("", qq{}, <<END, <<"END", ``, qx{},
  <<`END`, //, m//, ??, s///, split(//), split(m//), and qr//), DAMEMOJI are
  double-byte characters whose second octet is one of the following
  metacharacters.
  hex   character as US-ASCII
  21    [!]    
  22    ["]    
  23    [#]    regexp comment
  24    [$]    sigil of scalar variable
  25    [%]    
  26    [&]    
  27    [']    
  28    [(]    regexp group and capture
  29    [)]    regexp group and capture
  2A    [*]    regexp matches zero or more times
  2B    [+]    regexp matches one or more times
  2C    [,]    
  2D    [-]    
  2E    [.]    regexp matches any octet
  2F    [/]    
  3A    [:]    
  3B    [;]    
  3C    [<]    
  3D    [=]    
  3E    [>]    
  3F    [?]    regexp matches zero or one times
  40    [@]    sigil of array variable
  5B    [[]    regexp bracketed character class
  5C    [\]    backslashed escapes
  5D    []]    regexp bracketed character class
  5E    [^]    regexp true at beginning of string
  60    [`]    command execution
  7B    [{]    regexp quantifier
  7C    [|]    regexp alternation
  7D    [}]    regexp quantifier
  7E    [~]    

How to escape 2nd octet of DAMEMOJI

  ex. Japanese KATAKANA "SO", which looks like [ `/ ]; its code is "\x83\x5C"
  in Sjis

                  see     hex dump
  source script   "`/"    [83 5c]

                          hex dump
  escaped script  "`\/"   [83 [5c] 5c]
                    ^--- escaped by this software

  by the by       see     hex dump
  your eyes       "`/\"   [83 5c] [5c]
  perl's eyes     "`\/"   [83] \[5c]

                          hex dump
  in the perl     "`/"    [83] [5c]
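
  The diagram above can be reproduced with a few lines of plain Perl. This is
  only a sketch under assumptions: escape_damemoji_dq is a hypothetical
  helper, it handles Sjis only, and it is not the code this software actually
  uses. It puts a backslash before the second octet of every double-byte
  character whose second octet is in the double-quote metacharacter table,
  then lets perl parse the literal before and after escaping.

```perl
use strict;
use warnings;

# Metacharacters of double-quote-like constructs, from the table above.
my $meta = qr/[\x21-\x2F\x3A-\x40\x5B-\x5E\x60\x7B-\x7E]/;

# Hypothetical helper: escape the 2nd octet of Sjis DAMEMOJI in the body
# of a double-quoted literal.
sub escape_damemoji_dq {
    my ($body) = @_;
    my $out = '';
    for my $char ($body =~ /\G([\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x00-\xFF])/g) {
        if (length($char) == 2 and substr($char, 1, 1) =~ $meta) {
            # keep the first octet, put a backslash before the second
            $out .= substr($char, 0, 1) . "\\" . substr($char, 1, 1);
        }
        else {
            $out .= $char;
        }
    }
    return $out;
}

my $source  = "\x83\x5C";                    # KATAKANA SO as written in the script
my $escaped = escape_damemoji_dq($source);   # octets 83 5C 5C

# Unescaped: perl reads 5C as a backslash that escapes the closing quote,
# so the string never terminates and eval fails.
my $broken = eval qq{"$source"};

# Escaped: perl reads 5C 5C as one backslash octet, giving 83 5C again.
my $fixed  = eval qq{"$escaped"};

print "escaped script: ", unpack('H*', $escaped), "\n";   # 835c5c
print "unescaped literal parses: ", (defined $broken ? "yes" : "no"), "\n";   # no
print "in the perl:    ", unpack('H*', $fixed), "\n";     # 835c
```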

What transpiles to what by this software?

  This software automatically transpiles MBCS literal strings in scripts into
  octet-oriented strings (OO-quotee).

  in your script                             script transpiled by this software
  do 'file'                                  do 'file'
  do { block }                               do { block }
  mb::do 'file'                              mb::do 'file'
  mb::do { block }                           do { block }
  eval 'string'                              eval 'string'
  eval { block }                             eval { block }
  mb::eval 'string'                          mb::eval 'string'
  mb::eval { block }                         eval { block }
  require 123                                require 123
  require 'file'                             require 'file'
  mb::require 123                            mb::require 123
  mb::require 'file'                         mb::require 'file'
  chop                                       chop
  lc                                         mb::lc
  lcfirst                                    mb::lcfirst
  uc                                         mb::uc
  ucfirst                                    mb::ucfirst
  index                                      index
  rindex                                     rindex
  mb::getc()                                 mb::getc()
  mb::getc($fh)                              mb::getc($fh)
  mb::getc $fh                               mb::getc $fh
  mb::getc(FILE)                             mb::getc(\*FILE)
  mb::getc FILE                              mb::getc \*FILE
  mb::getc                                   mb::getc
  'MBCS-quotee'                              'OO-quotee'
  "MBCS-quotee"                              "OO-quotee"
  `MBCS-quotee`                              `OO-quotee`
  /MBCS-quotee/cgimosx                       m{\G${mb::_anchor}@{[mb::_ignorecase(qr/OO-quotee/mosx)]}@{[mb::_m_passed()]}}cg
  /MBCS-quotee/cgmosx                        m{\G${mb::_anchor}@{[qr/OO-quotee/mosx ]}@{[mb::_m_passed()]}}cg
  ?MBCS-quotee?cgimosx                       m{\G${mb::_anchor}@{[mb::_ignorecase(qr?OO-quotee?mosx)]}@{[mb::_m_passed()]}}cg
  ?MBCS-quotee?cgmosx                        m{\G${mb::_anchor}@{[qr?OO-quotee?mosx ]}@{[mb::_m_passed()]}}cg
  <MBCS-quotee>                              <OO-quotee>
  q/MBCS-quotee/                             q/OO-quotee/
  qx'MBCS-quotee'                            qx'OO-quotee'
  qw/MBCS-quotee/                            qw/OO-quotee/
  m'MBCS-quotee'cgimosx                      m{\G${mb::_anchor}@{[mb::_ignorecase(qr'OO-quotee'mosx)]}@{[mb::_m_passed()]}}cg
  m'MBCS-quotee'cgmosx                       m{\G${mb::_anchor}@{[qr'OO-quotee'mosx ]}@{[mb::_m_passed()]}}cg
  s'MBCS-regexp'MBCS-replacement'eegimosxr   s{(\G${mb::_anchor})@{[mb::_ignorecase(qr'OO-regexp'mosx)]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q'OO-replacement'}egr
  s'MBCS-regexp'MBCS-replacement'eegmosxr    s{(\G${mb::_anchor})@{[qr'OO-regexp'mosx ]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q'OO-replacement'}egr
  tr/MBCS-search/MBCS-replacement/cdsr       s{[\x00-\xFF]*}{mb::tr($&,q/OO-search/,q/OO-replacement/,'cdsr')}er
  tr/MBCS-search/MBCS-replacement/cds        s{[\x00-\xFF]*}{mb::tr($&,q/OO-search/,q/OO-replacement/,'cdsr')}e
  tr/MBCS-search/MBCS-replacement/ds         s{[\x00-\xFF]*}{mb::tr($&,q/OO-search/,q/OO-replacement/,'dsr')}e
  y/MBCS-search/MBCS-replacement/cdsr        s{[\x00-\xFF]*}{mb::tr($&,q/OO-search/,q/OO-replacement/,'cdsr')}er
  y/MBCS-search/MBCS-replacement/cds         s{[\x00-\xFF]*}{mb::tr($&,q/OO-search/,q/OO-replacement/,'cdsr')}e
  y/MBCS-search/MBCS-replacement/ds          s{[\x00-\xFF]*}{mb::tr($&,q/OO-search/,q/OO-replacement/,'dsr')}e
  qr'MBCS-quotee'cgimosx                     qr{\G${mb::_anchor}@{[mb::_ignorecase(qr'OO-quotee'mosx)]}@{[mb::_m_passed()]}}cg
  qr'MBCS-quotee'cgmosx                      qr{\G${mb::_anchor}@{[qr'OO-quotee'mosx ]}@{[mb::_m_passed()]}}cg
  split m'^'                                 mb::_split qr{@{[qr'^'m ]}}
  split m'MBCS-quotee'cgimosx                mb::_split qr{@{[mb::_ignorecase(qr'OO-quotee'mosx)]}}cg
  split m'MBCS-quotee'cgmosx                 mb::_split qr{@{[qr'OO-quotee'mosx ]}}cg
  split qr'^'                                mb::_split qr{@{[qr'^'m ]}}
  split qr'MBCS-quotee'cgimosx               mb::_split qr{@{[mb::_ignorecase(qr'OO-quotee'mosx)]}}cg
  split qr'MBCS-quotee'cgmosx                mb::_split qr{@{[qr'OO-quotee'mosx ]}}cg
  qq/MBCS-quotee/                            qq/OO-quotee/
  qq'MBCS-quotee'                            qq'OO-quotee'
  qx/MBCS-quotee/                            qx/OO-quotee/
  m/MBCS-quotee/cgimosx                      m{\G${mb::_anchor}@{[mb::_ignorecase(qr/OO-quotee/mosx)]}@{[mb::_m_passed()]}}cg
  m/MBCS-quotee/cgmosx                       m{\G${mb::_anchor}@{[qr/OO-quotee/mosx ]}@{[mb::_m_passed()]}}cg
  s/MBCS-regexp/MBCS-replacement/eegimosxr   s{(\G${mb::_anchor})@{[mb::_ignorecase(qr/OO-regexp/mosx)]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q/OO-replacement/}egr
  s/MBCS-regexp/MBCS-replacement/eegmosxr    s{(\G${mb::_anchor})@{[qr/OO-regexp/mosx ]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q/OO-replacement/}egr
  qr/MBCS-quotee/cgimosx                     qr{\G${mb::_anchor}@{[mb::_ignorecase(qr/OO-quotee/mosx)]}@{[mb::_m_passed()]}}cg
  qr/MBCS-quotee/cgmosx                      qr{\G${mb::_anchor}@{[qr/OO-quotee/mosx ]}@{[mb::_m_passed()]}}cg
  split /^/                                  mb::_split qr{@{[qr/^/m ]}}
  split /MBCS-quotee/cgimosx                 mb::_split qr{@{[mb::_ignorecase(qr/OO-quotee/mosx)]}}cg
  split /MBCS-quotee/cgmosx                  mb::_split qr{@{[qr/OO-quotee/mosx ]}}cg
  split m/^/                                 mb::_split qr{@{[qr/^/m ]}}
  split m/MBCS-quotee/cgimosx                mb::_split qr{@{[mb::_ignorecase(qr/OO-quotee/mosx)]}}cg
  split m/MBCS-quotee/cgmosx                 mb::_split qr{@{[qr/OO-quotee/mosx ]}}cg
  split qr/^/                                mb::_split qr{@{[qr/^/m ]}}
  split qr/MBCS-quotee/cgimosx               mb::_split qr{@{[mb::_ignorecase(qr/OO-quotee/mosx)]}}cg
  split qr/MBCS-quotee/cgmosx                mb::_split qr{@{[qr/OO-quotee/mosx ]}}cg
  m:MBCS-quotee:cgimosx                      m{\G${mb::_anchor}@{[mb::_ignorecase(qr`OO-quotee`mosx)]}@{[mb::_m_passed()]}}cg
  m:MBCS-quotee:cgmosx                       m{\G${mb::_anchor}@{[qr`OO-quotee`mosx ]}@{[mb::_m_passed()]}}cg
  s:MBCS-regexp:MBCS-replacement:eegimosxr   s{(\G${mb::_anchor})@{[mb::_ignorecase(qr`OO-regexp`mosx)]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q:OO-replacement:}egr
  s:MBCS-regexp:MBCS-replacement:eegmosxr    s{(\G${mb::_anchor})@{[qr`OO-regexp`mosx ]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q:OO-replacement:}egr
  qr:MBCS-quotee:cgimosx                     qr{\G${mb::_anchor}@{[mb::_ignorecase(qr`OO-quotee`mosx)]}@{[mb::_m_passed()]}}cg
  qr:MBCS-quotee:cgmosx                      qr{\G${mb::_anchor}@{[qr`OO-quotee`mosx ]}@{[mb::_m_passed()]}}cg
  split m:^:                                 mb::_split qr{@{[qr`^`m ]}}
  split m:MBCS-quotee:cgimosx                mb::_split qr{@{[mb::_ignorecase(qr`OO-quotee`mosx)]}}cg
  split m:MBCS-quotee:cgmosx                 mb::_split qr{@{[qr`OO-quotee`mosx ]}}cg
  split qr:^:                                mb::_split qr{@{[qr`^`m ]}}
  split qr:MBCS-quotee:cgimosx               mb::_split qr{@{[mb::_ignorecase(qr`OO-quotee`mosx)]}}cg
  split qr:MBCS-quotee:cgmosx                mb::_split qr{@{[qr`OO-quotee`mosx ]}}cg
  m@MBCS-quotee@cgimosx                      m{\G${mb::_anchor}@{[mb::_ignorecase(qr`OO-quotee`mosx)]}@{[mb::_m_passed()]}}cg
  m@MBCS-quotee@cgmosx                       m{\G${mb::_anchor}@{[qr`OO-quotee`mosx ]}@{[mb::_m_passed()]}}cg
  s@MBCS-regexp@MBCS-replacement@eegimosxr   s{(\G${mb::_anchor})@{[mb::_ignorecase(qr`OO-regexp`mosx)]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q@OO-replacement@}egr
  s@MBCS-regexp@MBCS-replacement@eegmosxr    s{(\G${mb::_anchor})@{[qr`OO-regexp`mosx ]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q@OO-replacement@}egr
  qr@MBCS-quotee@cgimosx                     qr{\G${mb::_anchor}@{[mb::_ignorecase(qr`OO-quotee`mosx)]}@{[mb::_m_passed()]}}cg
  qr@MBCS-quotee@cgmosx                      qr{\G${mb::_anchor}@{[qr`OO-quotee`mosx ]}@{[mb::_m_passed()]}}cg
  split m@^@                                 mb::_split qr{@{[qr`^`m ]}}
  split m@MBCS-quotee@cgimosx                mb::_split qr{@{[mb::_ignorecase(qr`OO-quotee`mosx)]}}cg
  split m@MBCS-quotee@cgmosx                 mb::_split qr{@{[qr`OO-quotee`mosx ]}}cg
  split qr@^@                                mb::_split qr{@{[qr`^`m ]}}
  split qr@MBCS-quotee@cgimosx               mb::_split qr{@{[mb::_ignorecase(qr`OO-quotee`mosx)]}}cg
  split qr@MBCS-quotee@cgmosx                mb::_split qr{@{[qr`OO-quotee`mosx ]}}cg
  m#MBCS-quotee#cgimosx                      m{\G${mb::_anchor}@{[mb::_ignorecase(qr#OO-quotee#mosx)]}@{[mb::_m_passed()]}}cg
  m#MBCS-quotee#cgmosx                       m{\G${mb::_anchor}@{[qr#OO-quotee#mosx ]}@{[mb::_m_passed()]}}cg
  s#MBCS-regexp#MBCS-replacement#eegimosxr   s{(\G${mb::_anchor})@{[mb::_ignorecase(qr#OO-regexp#mosx)]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q#OO-replacement#}egr
  s#MBCS-regexp#MBCS-replacement#eegmosxr    s{(\G${mb::_anchor})@{[qr#OO-regexp#mosx ]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q#OO-replacement#}egr
  qr#MBCS-quotee#cgimosx                     qr{\G${mb::_anchor}@{[mb::_ignorecase(qr#OO-quotee#mosx)]}@{[mb::_m_passed()]}}cg
  qr#MBCS-quotee#cgmosx                      qr{\G${mb::_anchor}@{[qr#OO-quotee#mosx ]}@{[mb::_m_passed()]}}cg
  split m#^#                                 mb::_split qr{@{[qr#^#m ]}}
  split m#MBCS-quotee#cgimosx                mb::_split qr{@{[mb::_ignorecase(qr#OO-quotee#mosx)]}}cg
  split m#MBCS-quotee#cgmosx                 mb::_split qr{@{[qr#OO-quotee#mosx ]}}cg
  split qr#^#                                mb::_split qr{@{[qr#^#m ]}}
  split qr#MBCS-quotee#cgimosx               mb::_split qr{@{[mb::_ignorecase(qr#OO-quotee#mosx)]}}cg
  split qr#MBCS-quotee#cgmosx                mb::_split qr{@{[qr#OO-quotee#mosx ]}}cg
  $`                                         mb::_PREMATCH()
  ${`}                                       mb::_PREMATCH()
  $PREMATCH                                  mb::_PREMATCH()
  ${PREMATCH}                                mb::_PREMATCH()
  ${^PREMATCH}                               mb::_PREMATCH()
  $&                                         mb::_MATCH()
  ${&}                                       mb::_MATCH()
  $MATCH                                     mb::_MATCH()
  ${MATCH}                                   mb::_MATCH()
  ${^MATCH}                                  mb::_MATCH()
  $1                                         mb::_CAPTURE(1)
  $2                                         mb::_CAPTURE(2)
  $3                                         mb::_CAPTURE(3)
  @{^CAPTURE}                                mb::_CAPTURE()
  ${^CAPTURE}[0]                             mb::_CAPTURE(0+1)
  ${^CAPTURE}[1]                             mb::_CAPTURE(1+1)
  ${^CAPTURE}[2]                             mb::_CAPTURE(2+1)
  @-                                         mb::_LAST_MATCH_START()
  @LAST_MATCH_START                          mb::_LAST_MATCH_START()
  @{LAST_MATCH_START}                        mb::_LAST_MATCH_START()
  @{^LAST_MATCH_START}                       mb::_LAST_MATCH_START()
  $-[1]                                      mb::_LAST_MATCH_START(1)
  $LAST_MATCH_START[1]                       mb::_LAST_MATCH_START(1)
  ${LAST_MATCH_START}[1]                     mb::_LAST_MATCH_START(1)
  ${^LAST_MATCH_START}[1]                    mb::_LAST_MATCH_START(1)
  @+                                         mb::_LAST_MATCH_END()
  @LAST_MATCH_END                            mb::_LAST_MATCH_END()
  @{LAST_MATCH_END}                          mb::_LAST_MATCH_END()
  @{^LAST_MATCH_END}                         mb::_LAST_MATCH_END()
  $+[1]                                      mb::_LAST_MATCH_END(1)
  $LAST_MATCH_END[1]                         mb::_LAST_MATCH_END(1)
  ${LAST_MATCH_END}[1]                       mb::_LAST_MATCH_END(1)
  ${^LAST_MATCH_END}[1]                      mb::_LAST_MATCH_END(1)
  "$`"                                       "@{[mb::_PREMATCH()]}"
  "${`}"                                     "@{[mb::_PREMATCH()]}"
  "$PREMATCH"                                "@{[mb::_PREMATCH()]}"
  "${PREMATCH}"                              "@{[mb::_PREMATCH()]}"
  "${^PREMATCH}"                             "@{[mb::_PREMATCH()]}"
  "$&"                                       "@{[mb::_MATCH()]}"
  "${&}"                                     "@{[mb::_MATCH()]}"
  "$MATCH"                                   "@{[mb::_MATCH()]}"
  "${MATCH}"                                 "@{[mb::_MATCH()]}"
  "${^MATCH}"                                "@{[mb::_MATCH()]}"
  "$1"                                       "@{[mb::_CAPTURE(1)]}"
  "$2"                                       "@{[mb::_CAPTURE(2)]}"
  "$3"                                       "@{[mb::_CAPTURE(3)]}"
  "@{^CAPTURE}"                              "@{[join $", mb::_CAPTURE()]}"
  "${^CAPTURE}[0]"                           "@{[mb::_CAPTURE(0)]}"
  "${^CAPTURE}[1]"                           "@{[mb::_CAPTURE(1)]}"
  "${^CAPTURE}[2]"                           "@{[mb::_CAPTURE(2)]}"
  "@-"                                       "@{[mb::_LAST_MATCH_START()]}"
  "@LAST_MATCH_START"                        "@{[mb::_LAST_MATCH_START()]}"
  "@{LAST_MATCH_START}"                      "@{[mb::_LAST_MATCH_START()]}"
  "@{^LAST_MATCH_START}"                     "@{[mb::_LAST_MATCH_START()]}"
  "$-[1]"                                    "@{[mb::_LAST_MATCH_START(1)]}"
  "$LAST_MATCH_START[1]"                     "@{[mb::_LAST_MATCH_START(1)]}"
  "${LAST_MATCH_START}[1]"                   "@{[mb::_LAST_MATCH_START(1)]}"
  "${^LAST_MATCH_START}[1]"                  "@{[mb::_LAST_MATCH_START(1)]}"
  "@+"                                       "@{[mb::_LAST_MATCH_END()]}"
  "@LAST_MATCH_END"                          "@{[mb::_LAST_MATCH_END()]}"
  "@{LAST_MATCH_END}"                        "@{[mb::_LAST_MATCH_END()]}"
  "@{^LAST_MATCH_END}"                       "@{[mb::_LAST_MATCH_END()]}"
  "$+[1]"                                    "@{[mb::_LAST_MATCH_END(1)]}"
  "$LAST_MATCH_END[1]"                       "@{[mb::_LAST_MATCH_END(1)]}"
  "${LAST_MATCH_END}[1]"                     "@{[mb::_LAST_MATCH_END(1)]}"
  "${^LAST_MATCH_END}[1]"                    "@{[mb::_LAST_MATCH_END(1)]}"

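  The table above maps lc, lcfirst, uc, and ucfirst to mb::* counterparts. The
  sketch below suggests why such a mapping is needed, using plain Perl and
  Sjis only. ascii_lc_sjis is a hypothetical stand-in, not the mb::lc of this
  software: it lowercases US-ASCII letters only when they are single-octet
  characters, so the second octet of a double-byte character is left alone.

```perl
use strict;
use warnings;

my $sjis_char = qr/[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x00-\xFF]/;

# Hypothetical stand-in for an MBCS-aware lc (US-ASCII letters only).
sub ascii_lc_sjis {
    my ($string) = @_;
    my $out = '';
    for my $char ($string =~ /\G($sjis_char)/g) {
        my $copy = $char;
        $copy =~ tr/A-Z/a-z/ if length($copy) == 1;   # skip double-byte characters
        $out .= $copy;
    }
    return $out;
}

# KATAKANA A (83 41) followed by "BC"
my $string = "\x83\x41" . "BC";

my $octet_lc = lc $string;               # also rewrites the 2nd octet 41 to 61
my $char_lc  = ascii_lc_sjis($string);   # keeps 83 41 intact

print "octet-oriented lc: ", unpack('H*', $octet_lc), "\n";   # 83616263
print "character-aware:   ", unpack('H*', $char_lc),  "\n";   # 83416263
```
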
  The transpile-list below is primarily for Microsoft Windows, but it also
  applies on other operating systems to keep behavior consistent. Even on
  Perl 5.00503 you can stack file test operators: -r -w -f $file works as
  -f $file && -w _ && -r _.
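
  The expanded form mentioned above can be tried directly in plain Perl. The
  sketch below assumes File::Temp is available, which is not the case on the
  oldest perls this software supports; it only demonstrates the "_" idiom and
  does not use this software.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);   # assumption: a perl that ships File::Temp

# Create a scratch file owned by the current user.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "example\n";
close $fh;

# Expanded form of the stacked test -r -w -f $file:
# test -f on the file name, then reuse the same stat buffer through "_".
my $expanded = (-f $file && -w _ && -r _) ? 1 : 0;

print "plain file, writable, readable: $expanded\n";
```

  Reusing "_" avoids a second and third stat call on the same file.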

  in your script                             script transpiled by this software
  chdir                                      mb::_chdir
  opendir(DIR,'dir')                         mb::_opendir(\*DIR,'dir')
  opendir DIR,'dir'                          mb::_opendir \*DIR,'dir'
  opendir($dh,'dir')                         mb::_opendir($dh,'dir')
  opendir $dh,'dir'                          mb::_opendir $dh,'dir'
  unlink                                     mb::_unlink
  lstat()                                    mb::_lstat()
  lstat('a')                                 mb::_lstat('a')
  lstat("a")                                 mb::_lstat("a")
  lstat(`a`)                                 mb::_lstat(`a`)
  lstat(m/a/)                                mb::_lstat(m{\G${mb::_anchor}@{[qr/a/ ]}@{[mb::_m_passed()]}})
  lstat(q/a/)                                mb::_lstat(q/a/)
  lstat(qq/a/)                               mb::_lstat(qq/a/)
  lstat(qr/a/)                               mb::_lstat(qr{\G${mb::_anchor}@{[qr/a/ ]}@{[mb::_m_passed()]}})
  lstat(qw/a/)                               mb::_lstat(qw/a/)
  lstat(qx/a/)                               mb::_lstat(qx/a/)
  lstat(s/a/b/)                              mb::_lstat(s{(\G${mb::_anchor})@{[qr/a/ ]}@{[mb::_s_passed()]}}{$1 . qq /b/}e)
  lstat(tr/a/b/)                             mb::_lstat(s{(\G${mb::_anchor})((?:(?=[a])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))}{$1.mb::tr($2,q/a/,q/b/,'r')}eg)
  lstat(y/a/b/)                              mb::_lstat(s{(\G${mb::_anchor})((?:(?=[a])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))}{$1.mb::tr($2,q/a/,q/b/,'r')}eg)
  lstat($fh)                                 mb::_lstat($fh)
  lstat(FILE)                                mb::_lstat(\*FILE)
  lstat(_)                                   mb::_lstat(\*_)
  lstat $fh                                  mb::_lstat $fh
  lstat FILE                                 mb::_lstat \*FILE
  lstat _                                    mb::_lstat \*_
  lstat                                      mb::_lstat
  stat()                                     mb::_stat()
  stat('a')                                  mb::_stat('a')
  stat("a")                                  mb::_stat("a")
  stat(`a`)                                  mb::_stat(`a`)
  stat(m/a/)                                 mb::_stat(m{\G${mb::_anchor}@{[qr/a/ ]}@{[mb::_m_passed()]}})
  stat(q/a/)                                 mb::_stat(q/a/)
  stat(qq/a/)                                mb::_stat(qq/a/)
  stat(qr/a/)                                mb::_stat(qr{\G${mb::_anchor}@{[qr/a/ ]}@{[mb::_m_passed()]}})
  stat(qw/a/)                                mb::_stat(qw/a/)
  stat(qx/a/)                                mb::_stat(qx/a/)
  stat(s/a/b/)                               mb::_stat(s{(\G${mb::_anchor})@{[qr/a/ ]}@{[mb::_s_passed()]}}{$1 . qq /b/}e)
  stat(tr/a/b/)                              mb::_stat(s{(\G${mb::_anchor})((?:(?=[a])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))}{$1.mb::tr($2,q/a/,q/b/,'r')}eg)
  stat(y/a/b/)                               mb::_stat(s{(\G${mb::_anchor})((?:(?=[a])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))}{$1.mb::tr($2,q/a/,q/b/,'r')}eg)
  stat($fh)                                  mb::_stat($fh)
  stat(FILE)                                 mb::_stat(\*FILE)
  stat(_)                                    mb::_stat(\*_)
  stat $fh                                   mb::_stat $fh
  stat FILE                                  mb::_stat \*FILE
  stat _                                     mb::_stat \*_
  stat                                       mb::_stat
  -A $fh                                     mb::_filetest [qw( -A )], $fh
  -A 'file'                                  mb::_filetest [qw( -A )], 'file'
  -A FILE                                    mb::_filetest [qw( -A )], \*FILE
  -A _                                       mb::_filetest [qw( -A )], \*_
  -A qq{file}                                mb::_filetest [qw( -A )], qq{file}
  -B $fh                                     mb::_filetest [qw( -B )], $fh
  -B 'file'                                  mb::_filetest [qw( -B )], 'file'
  -B FILE                                    mb::_filetest [qw( -B )], \*FILE
  -B _                                       mb::_filetest [qw( -B )], \*_
  -B qq{file}                                mb::_filetest [qw( -B )], qq{file}
  -C $fh                                     mb::_filetest [qw( -C )], $fh
  -C 'file'                                  mb::_filetest [qw( -C )], 'file'
  -C FILE                                    mb::_filetest [qw( -C )], \*FILE
  -C _                                       mb::_filetest [qw( -C )], \*_
  -C qq{file}                                mb::_filetest [qw( -C )], qq{file}
  -M $fh                                     mb::_filetest [qw( -M )], $fh
  -M 'file'                                  mb::_filetest [qw( -M )], 'file'
  -M FILE                                    mb::_filetest [qw( -M )], \*FILE
  -M _                                       mb::_filetest [qw( -M )], \*_
  -M qq{file}                                mb::_filetest [qw( -M )], qq{file}
  -O $fh                                     mb::_filetest [qw( -O )], $fh
  -O 'file'                                  mb::_filetest [qw( -O )], 'file'
  -O FILE                                    mb::_filetest [qw( -O )], \*FILE
  -O _                                       mb::_filetest [qw( -O )], \*_
  -O qq{file}                                mb::_filetest [qw( -O )], qq{file}
  -R $fh                                     mb::_filetest [qw( -R )], $fh
  -R 'file'                                  mb::_filetest [qw( -R )], 'file'
  -R FILE                                    mb::_filetest [qw( -R )], \*FILE
  -R _                                       mb::_filetest [qw( -R )], \*_
  -R qq{file}                                mb::_filetest [qw( -R )], qq{file}
  -S $fh                                     mb::_filetest [qw( -S )], $fh
  -S 'file'                                  mb::_filetest [qw( -S )], 'file'
  -S FILE                                    mb::_filetest [qw( -S )], \*FILE
  -S _                                       mb::_filetest [qw( -S )], \*_
  -S qq{file}                                mb::_filetest [qw( -S )], qq{file}
  -T $fh                                     mb::_filetest [qw( -T )], $fh
  -T 'file'                                  mb::_filetest [qw( -T )], 'file'
  -T FILE                                    mb::_filetest [qw( -T )], \*FILE
  -T _                                       mb::_filetest [qw( -T )], \*_
  -T qq{file}                                mb::_filetest [qw( -T )], qq{file}
  -W $fh                                     mb::_filetest [qw( -W )], $fh
  -W 'file'                                  mb::_filetest [qw( -W )], 'file'
  -W FILE                                    mb::_filetest [qw( -W )], \*FILE
  -W _                                       mb::_filetest [qw( -W )], \*_
  -W qq{file}                                mb::_filetest [qw( -W )], qq{file}
  -X $fh                                     mb::_filetest [qw( -X )], $fh
  -X 'file'                                  mb::_filetest [qw( -X )], 'file'
  -X FILE                                    mb::_filetest [qw( -X )], \*FILE
  -X _                                       mb::_filetest [qw( -X )], \*_
  -X qq{file}                                mb::_filetest [qw( -X )], qq{file}
  -b $fh                                     mb::_filetest [qw( -b )], $fh
  -b 'file'                                  mb::_filetest [qw( -b )], 'file'
  -b FILE                                    mb::_filetest [qw( -b )], \*FILE
  -b _                                       mb::_filetest [qw( -b )], \*_
  -b qq{file}                                mb::_filetest [qw( -b )], qq{file}
  -c $fh                                     mb::_filetest [qw( -c )], $fh
  -c 'file'                                  mb::_filetest [qw( -c )], 'file'
  -c FILE                                    mb::_filetest [qw( -c )], \*FILE
  -c _                                       mb::_filetest [qw( -c )], \*_
  -c qq{file}                                mb::_filetest [qw( -c )], qq{file}
  -d $fh                                     mb::_filetest [qw( -d )], $fh
  -d 'file'                                  mb::_filetest [qw( -d )], 'file'
  -d FILE                                    mb::_filetest [qw( -d )], \*FILE
  -d _                                       mb::_filetest [qw( -d )], \*_
  -d qq{file}                                mb::_filetest [qw( -d )], qq{file}
  -e $fh                                     mb::_filetest [qw( -e )], $fh
  -e 'file'                                  mb::_filetest [qw( -e )], 'file'
  -e FILE                                    mb::_filetest [qw( -e )], \*FILE
  -e _                                       mb::_filetest [qw( -e )], \*_
  -e qq{file}                                mb::_filetest [qw( -e )], qq{file}
  -f $fh                                     mb::_filetest [qw( -f )], $fh
  -f 'file'                                  mb::_filetest [qw( -f )], 'file'
  -f FILE                                    mb::_filetest [qw( -f )], \*FILE
  -f _                                       mb::_filetest [qw( -f )], \*_
  -f qq{file}                                mb::_filetest [qw( -f )], qq{file}
  -g $fh                                     mb::_filetest [qw( -g )], $fh
  -g 'file'                                  mb::_filetest [qw( -g )], 'file'
  -g FILE                                    mb::_filetest [qw( -g )], \*FILE
  -g _                                       mb::_filetest [qw( -g )], \*_
  -g qq{file}                                mb::_filetest [qw( -g )], qq{file}
  -k $fh                                     mb::_filetest [qw( -k )], $fh
  -k 'file'                                  mb::_filetest [qw( -k )], 'file'
  -k FILE                                    mb::_filetest [qw( -k )], \*FILE
  -k _                                       mb::_filetest [qw( -k )], \*_
  -k qq{file}                                mb::_filetest [qw( -k )], qq{file}
  -l $fh                                     mb::_filetest [qw( -l )], $fh
  -l 'file'                                  mb::_filetest [qw( -l )], 'file'
  -l FILE                                    mb::_filetest [qw( -l )], \*FILE
  -l _                                       mb::_filetest [qw( -l )], \*_
  -l qq{file}                                mb::_filetest [qw( -l )], qq{file}
  -o $fh                                     mb::_filetest [qw( -o )], $fh
  -o 'file'                                  mb::_filetest [qw( -o )], 'file'
  -o FILE                                    mb::_filetest [qw( -o )], \*FILE
  -o _                                       mb::_filetest [qw( -o )], \*_
  -o qq{file}                                mb::_filetest [qw( -o )], qq{file}
  -p $fh                                     mb::_filetest [qw( -p )], $fh
  -p 'file'                                  mb::_filetest [qw( -p )], 'file'
  -p FILE                                    mb::_filetest [qw( -p )], \*FILE
  -p _                                       mb::_filetest [qw( -p )], \*_
  -p qq{file}                                mb::_filetest [qw( -p )], qq{file}
  -r $fh                                     mb::_filetest [qw( -r )], $fh
  -r 'file'                                  mb::_filetest [qw( -r )], 'file'
  -r -w -f $fh                               mb::_filetest [qw( -r -w -f )], $fh
  -r -w -f 'file'                            mb::_filetest [qw( -r -w -f )], 'file'
  -r -w -f FILE                              mb::_filetest [qw( -r -w -f )], \*FILE
  -r -w -f _                                 mb::_filetest [qw( -r -w -f )], \*_
  -r -w -f qq{file}                          mb::_filetest [qw( -r -w -f )], qq{file}
  -r FILE                                    mb::_filetest [qw( -r )], \*FILE
  -r _                                       mb::_filetest [qw( -r )], \*_
  -r qq{file}                                mb::_filetest [qw( -r )], qq{file}
  -s $fh                                     mb::_filetest [qw( -s )], $fh
  -s 'file'                                  mb::_filetest [qw( -s )], 'file'
  -s FILE                                    mb::_filetest [qw( -s )], \*FILE
  -s _                                       mb::_filetest [qw( -s )], \*_
  -s qq{file}                                mb::_filetest [qw( -s )], qq{file}
  -t $fh                                     mb::_filetest [qw( -t )], $fh
  -t 'file'                                  mb::_filetest [qw( -t )], 'file'
  -t FILE                                    mb::_filetest [qw( -t )], \*FILE
  -t _                                       mb::_filetest [qw( -t )], \*_
  -t qq{file}                                mb::_filetest [qw( -t )], qq{file}
  -u $fh                                     mb::_filetest [qw( -u )], $fh
  -u 'file'                                  mb::_filetest [qw( -u )], 'file'
  -u FILE                                    mb::_filetest [qw( -u )], \*FILE
  -u _                                       mb::_filetest [qw( -u )], \*_
  -u qq{file}                                mb::_filetest [qw( -u )], qq{file}
  -w $fh                                     mb::_filetest [qw( -w )], $fh
  -w 'file'                                  mb::_filetest [qw( -w )], 'file'
  -w FILE                                    mb::_filetest [qw( -w )], \*FILE
  -w _                                       mb::_filetest [qw( -w )], \*_
  -w qq{file}                                mb::_filetest [qw( -w )], qq{file}
  -x $fh                                     mb::_filetest [qw( -x )], $fh
  -x 'file'                                  mb::_filetest [qw( -x )], 'file'
  -x FILE                                    mb::_filetest [qw( -x )], \*FILE
  -x _                                       mb::_filetest [qw( -x )], \*_
  -x qq{file}                                mb::_filetest [qw( -x )], qq{file}
  -z $fh                                     mb::_filetest [qw( -z )], $fh
  -z 'file'                                  mb::_filetest [qw( -z )], 'file'
  -z FILE                                    mb::_filetest [qw( -z )], \*FILE
  -z _                                       mb::_filetest [qw( -z )], \*_
  -z qq{file}                                mb::_filetest [qw( -z )], qq{file}
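
  The stacked rows above (for example -r -w -f) hand the operators to the
  wrapper as a list. As an illustration only, the sketch below shows how such a
  list could be applied with the same meaning as Perl's own stacked file tests,
  where -r -w -f $file is equivalent to -f $file && -w $file && -r $file. The
  name filetest_sketch and its dispatch table are hypothetical; this is not the
  implementation of mb::_filetest.

```perl
use strict;
use warnings;

# Hypothetical dispatch table covering a few of the file test operators.
my %test = (
    '-e' => sub { -e $_[0] },
    '-f' => sub { -f $_[0] },
    '-d' => sub { -d $_[0] },
    '-r' => sub { -r $_[0] },
    '-w' => sub { -w $_[0] },
);

# Hypothetical wrapper: apply the operators from right to left and stop at
# the first false result, as Perl does for stacked file tests.
sub filetest_sketch {
    my ($ops, $file) = @_;
    my $result;
    for my $op (reverse @$ops) {
        my $code = $test{$op} or die "unsupported file test: $op";
        $result = $code->($file);
        return $result unless $result;
    }
    return $result;
}

# The current directory is a directory, not a plain file.
print "readable directory\n" if     filetest_sketch([qw( -r -d )], '.');
print "not a plain file\n"   unless filetest_sketch([qw( -r -w -f )], '.');
```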

  Each element of a double-quote-like string or regular expression is
  transpiled as follows.

  in your script                             script transpiled by this software
  \L\u MBCS-quotee \E\E                      \L\u OO-quotee \E\E
  \U\l MBCS-quotee \E\E                      \U\l OO-quotee \E\E
  \L MBCS-quotee \E                          \L OO-quotee \E
  \U MBCS-quotee \E                          \U OO-quotee \E
  \l MBCS-quotee \E                          \l OO-quotee \E
  \u MBCS-quotee \E                          \u OO-quotee \E
  \Q MBCS-quotee \E                          \Q OO-quotee \E

  Each element of a regular expression is transpiled as follows.

  in your script                             script transpiled by this software (on sjis encoding)
  qr'.'                                      qr{\G${mb::_anchor}@{[qr'(?:(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|.)' ]}@{[mb::_m_passed()]}}
  qr'\B'                                     qr{\G${mb::_anchor}@{[qr'(?:(?<![ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_])(?![ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_])|(?<=[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_])(?=[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_]))' ]}@{[mb::_m_passed()]}}
  qr'\D'                                     qr{\G${mb::_anchor}@{[qr'(?:(?![0123456789])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'\H'                                     qr{\G${mb::_anchor}@{[qr'(?:(?![\x09\x20])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'\N'                                     qr{\G${mb::_anchor}@{[qr'(?:(?!\n)(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'\R'                                     qr{\G${mb::_anchor}@{[qr'(?>\r\n|[\x0A\x0B\x0C\x0D])' ]}@{[mb::_m_passed()]}}
  qr'\S'                                     qr{\G${mb::_anchor}@{[qr'(?:(?![\t\n\f\r\x20])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'\V'                                     qr{\G${mb::_anchor}@{[qr'(?:(?![\x0A\x0B\x0C\x0D])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'\W'                                     qr{\G${mb::_anchor}@{[qr'(?:(?![ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'\b'                                     qr{\G${mb::_anchor}@{[qr'(?:(?<![ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_])(?=[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_])|(?<=[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_])(?![ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_]))' ]}@{[mb::_m_passed()]}}
  qr'\d'                                     qr{\G${mb::_anchor}@{[qr'[0123456789]' ]}@{[mb::_m_passed()]}}
  qr'\h'                                     qr{\G${mb::_anchor}@{[qr'[\x09\x20]' ]}@{[mb::_m_passed()]}}
  qr'\s'                                     qr{\G${mb::_anchor}@{[qr'[\t\n\f\r\x20]' ]}@{[mb::_m_passed()]}}
  qr'\v'                                     qr{\G${mb::_anchor}@{[qr'[\x0A\x0B\x0C\x0D]' ]}@{[mb::_m_passed()]}}
  qr'\w'                                     qr{\G${mb::_anchor}@{[qr'[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_]' ]}@{[mb::_m_passed()]}}
  qr'[\b]'                                   qr{\G${mb::_anchor}@{[qr'(?:(?=[\x08])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:alnum:]]'                            qr{\G${mb::_anchor}@{[qr'(?:(?=[\x30-\x39\x41-\x5A\x61-\x7A])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:alpha:]]'                            qr{\G${mb::_anchor}@{[qr'(?:(?=[\x41-\x5A\x61-\x7A])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:ascii:]]'                            qr{\G${mb::_anchor}@{[qr'(?:(?=[\x00-\x7F])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:blank:]]'                            qr{\G${mb::_anchor}@{[qr'(?:(?=[\x09\x20])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:cntrl:]]'                            qr{\G${mb::_anchor}@{[qr'(?:(?=[\x00-\x1F\x7F])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:digit:]]'                            qr{\G${mb::_anchor}@{[qr'(?:(?=[\x30-\x39])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:graph:]]'                            qr{\G${mb::_anchor}@{[qr'(?:(?=[\x21-\x7F])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:lower:]]'                            qr{\G${mb::_anchor}@{[qr'(?:(?=[abcdefghijklmnopqrstuvwxyz])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:print:]]'                            qr{\G${mb::_anchor}@{[qr'(?:(?=[\x20-\x7F])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:punct:]]'                            qr{\G${mb::_anchor}@{[qr'(?:(?=[\x21-\x2F\x3A-\x3F\x40\x5B-\x5F\x60\x7B-\x7E])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:space:]]'                            qr{\G${mb::_anchor}@{[qr'(?:(?=[\s\x0B])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:upper:]]'                            qr{\G${mb::_anchor}@{[qr'(?:(?=[ABCDEFGHIJKLMNOPQRSTUVWXYZ])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:word:]]'                             qr{\G${mb::_anchor}@{[qr'(?:(?=[\x30-\x39\x41-\x5A\x5F\x61-\x7A])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:xdigit:]]'                           qr{\G${mb::_anchor}@{[qr'(?:(?=[\x30-\x39\x41-\x46\x61-\x66])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:^alnum:]]'                           qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x30-\x39\x41-\x5A\x61-\x7A])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:^alpha:]]'                           qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x41-\x5A\x61-\x7A])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:^ascii:]]'                           qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x00-\x7F])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:^blank:]]'                           qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x09\x20])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:^cntrl:]]'                           qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x00-\x1F\x7F])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:^digit:]]'                           qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x30-\x39])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:^graph:]]'                           qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x21-\x7F])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:^lower:]]'                           qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![abcdefghijklmnopqrstuvwxyz])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:^print:]]'                           qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x20-\x7F])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:^punct:]]'                           qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x21-\x2F\x3A-\x3F\x40\x5B-\x5F\x60\x7B-\x7E])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:^space:]]'                           qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\s\x0B])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:^upper:]]'                           qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![ABCDEFGHIJKLMNOPQRSTUVWXYZ])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:^word:]]'                            qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x30-\x39\x41-\x5A\x5F\x61-\x7A])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr'[[:^xdigit:]]'                          qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x30-\x39\x41-\x46\x61-\x66])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}}
  qr/./                                      qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_dot]})/ ]}@{[mb::_m_passed()]}}
  qr/\B/                                     qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_B]})/ ]}@{[mb::_m_passed()]}}
  qr/\D/                                     qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_D]})/ ]}@{[mb::_m_passed()]}}
  qr/\H/                                     qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_H]})/ ]}@{[mb::_m_passed()]}}
  qr/\N/                                     qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_N]})/ ]}@{[mb::_m_passed()]}}
  qr/\R/                                     qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_R]})/ ]}@{[mb::_m_passed()]}}
  qr/\S/                                     qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_S]})/ ]}@{[mb::_m_passed()]}}
  qr/\V/                                     qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_V]})/ ]}@{[mb::_m_passed()]}}
  qr/\W/                                     qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_W]})/ ]}@{[mb::_m_passed()]}}
  qr/\b/                                     qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_b]})/ ]}@{[mb::_m_passed()]}}
  qr/\d/                                     qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_d]})/ ]}@{[mb::_m_passed()]}}
  qr/\h/                                     qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_h]})/ ]}@{[mb::_m_passed()]}}
  qr/\s/                                     qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_s]})/ ]}@{[mb::_m_passed()]}}
  qr/\v/                                     qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_v]})/ ]}@{[mb::_m_passed()]}}
  qr/\w/                                     qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_w]})/ ]}@{[mb::_m_passed()]}}
  qr/[\b]/                                   qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[\\b])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:alnum:]]/                            qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:alnum:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:alpha:]]/                            qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:alpha:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:ascii:]]/                            qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:ascii:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:blank:]]/                            qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:blank:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:cntrl:]]/                            qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:cntrl:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:digit:]]/                            qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:digit:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:graph:]]/                            qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:graph:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:lower:]]/                            qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:lower:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:print:]]/                            qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:print:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:punct:]]/                            qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:punct:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:space:]]/                            qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:space:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:upper:]]/                            qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:upper:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:word:]]/                             qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:word:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:xdigit:]]/                           qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:xdigit:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:^alnum:]]/                           qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^alnum:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:^alpha:]]/                           qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^alpha:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:^ascii:]]/                           qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^ascii:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:^blank:]]/                           qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^blank:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:^cntrl:]]/                           qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^cntrl:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:^digit:]]/                           qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^digit:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:^graph:]]/                           qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^graph:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:^lower:]]/                           qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^lower:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:^print:]]/                           qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^print:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:^punct:]]/                           qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^punct:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:^space:]]/                           qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^space:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:^upper:]]/                           qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^upper:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:^word:]]/                            qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^word:]])]})/ ]}@{[mb::_m_passed()]}}
  qr/[[:^xdigit:]]/                          qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^xdigit:]])]})/ ]}@{[mb::_m_passed()]}}


This software requires perl5.00503 or later.


I have tested and verified this software to the best of my ability. However, software that contains this many regular expressions is bound to contain some bugs. If you find a bug that is in this software and not in your own program, please reduce it to a minimal test case and report it to the following author's address. If you have an idea that could make this a more useful tool, please share it with everyone.

  • Special Variables $` and $& need /( Capture All )/

      This is because $` and $& are emulated by using $1.
      in your script      after m//, works as                         after s///, works as
      $`                  CORE::substr($&, 0, -CORE::length($1))      $1
      ${`}                CORE::substr($&, 0, -CORE::length($1))      $1
      $PREMATCH           CORE::substr($&, 0, -CORE::length($1))      $1
      ${^PREMATCH}        CORE::substr($&, 0, -CORE::length($1))      $1
      $&                  $1                                          CORE::substr($&, CORE::length($1))
      ${&}                $1                                          CORE::substr($&, CORE::length($1))
      $MATCH              $1                                          CORE::substr($&, CORE::length($1))
      ${^MATCH}           $1                                          CORE::substr($&, CORE::length($1))
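
      The m// column of the table above can be reproduced with core Perl. The
      sketch below is an illustration, not the code of this software: $char
      copies the sjis character pattern that appears in the transpiled output
      in this document, and $anchor is an assumed stand-in for ${mb::_anchor}.

```perl
use strict;
use warnings;

# Shift_JIS character pattern, copied from the transpiled output in this document.
my $char = qr/(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]/;

# Assumption: a lazy run of whole characters stands in for ${mb::_anchor}.
my $anchor = qr/(?:$char)*?/;

# One double-byte character (0x83 0x41) followed by ASCII "BC".
my $text = "\x83\x41BC";

# A plain byte-wise match finds "A" (0x41) inside the double-byte character.
my $naive_hit = ($text =~ /A/) ? 1 : 0;

# The anchored match only stops on character boundaries, so it finds nothing.
my $anchored_hit = ($text =~ /\G$anchor(A)/) ? 1 : 0;

# After m//, the user's pattern sits inside ( ), so $1 plays the role of $&
# and the prematch is the part of the real $& in front of $1.
my ($prematch, $match) = ('', '');
if ($text =~ /\G$anchor(B)/) {
    $match    = $1;
    $prematch = substr($&, 0, -length($1));
}

print "naive=$naive_hit anchored=$anchored_hit ",
      "prematch=", unpack('H*', $prematch), " match=$match\n";
# prints "naive=1 anchored=0 prematch=8341 match=B"
```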
  • return value from tr///s

    The tr/// (or y///) operator with the /s modifier always returns 1. If you need the correct count, use mb::tr() instead.

  • chdir

    chdir() does not work if the path ends with chr(0x5C).

      see also,
      Bug #81839
      chdir does not work with chr(0x5C) at end of path
  • mb::substr as Lvalue

    On Perl versions older than 5.14, mb::substr differs from CORE::substr and cannot be used as an lvalue. To change part of a string, use the optional fourth argument, which is the replacement string.

    mb::substr($string, 13, 4, "JPerl");

  • Limitation of Regular Expression

    Because this software uses \G for multibyte anchoring, it inherits a length limit from the regular expression engine. Only Perl 5.30.0 or later can process a codepoint string longer than 65534 octets with a regular expression, and only Perl 5.10.1 or later can process one longer than 32766 octets.

      see also,
      The upper limit "n" specifiable in a regular expression quantifier of the form "{m,n}" has been doubled to 65534
      In 5.10.0, the * quantifier in patterns was sometimes treated as {0,32767}
      [perl #116379] \G can't treat over 32767 octet
      perlre - Perl regular expressions
      perlre length limit
  • fc(), lc(), lcfirst(), uc(), and ucfirst()

    fc() is not supported. lc(), lcfirst(), uc(), and ucfirst() support US-ASCII only.
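
    To illustrate what "US-ASCII only" means for an MBCS string, the sketch
    below compares a byte-wise uc() with a character-wise conversion in core
    Perl. The helper ascii_uc_sjis is hypothetical and is not part of this
    software; $char copies the sjis character pattern shown in the transpiled
    output in this document.

```perl
use strict;
use warnings;

# Shift_JIS character pattern, copied from the transpiled output in this document.
my $char = qr/(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]/;

# Hypothetical helper: walk the string one character at a time and
# uppercase only single-octet US-ASCII letters.
sub ascii_uc_sjis {
    my ($string) = @_;
    $string =~ s{\G($char)}{
        my $c = $1;
        $c =~ tr/a-z/A-Z/ if length($c) == 1;
        $c;
    }ge;
    return $string;
}

# "a", one double-byte character whose trail octet is 0x61, then "b".
my $sjis = "a\x83\x61b";

my $bytewise = uc($sjis);             # also changes the trail octet 0x61
my $charwise = ascii_uc_sjis($sjis);  # leaves the double-byte character alone

print unpack('H*', $bytewise), "\n";  # prints "41834142"
print unpack('H*', $charwise), "\n";  # prints "41836142"
```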

  • character ranges by hyphen

    Hyphenated character ranges in regular expressions support US-ASCII only. tr/// and y/// do not support hyphenated ranges.

  • cloister of regular expression

    The cloisters (?s) and (?i) of a regular expression will not be implemented for the time being. Cloister (?s) can be replaced by using . (dot) and \N with the /s modifier.

  • Empty Variable in Regular Expression

    Unlike a literal null string, an interpolated variable that evaluates to the empty string cannot reuse the most recent pattern from a previous successful regular expression.

  • Limitation of ?? and m??

    In ?? or m??, a multibyte character must be enclosed in ( ) when it precedes {n,m}, {n,}, {n}, *, or +. As a result, you need to renumber $1, $2, $3, ... in your script. You cannot use (?: ), ?, {n,m}?, {n,}?, or {n}? in ?? and m??, because the delimiter of m?? is '?'.

  • Look-behind Assertion

    A look-behind assertion such as (?<=[A-Z]) is not prevented from matching the trail octet of the previous MBCS codepoint.
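
    The effect can be seen at the octet level with core Perl alone. The sample
    string below is an assumption chosen for illustration: one Shift_JIS
    double-byte character whose trail octet is 0x41 (the same octet as ASCII
    "A"), followed by ASCII "B".

```perl
use strict;
use warnings;

# One double-byte character (0x83 0x41) followed by ASCII "B".
my $text = "\x83\x41B";

# The look-behind inspects the single octet in front of "B". That octet is
# 0x41, the trail octet of the double-byte character, and it satisfies [A-Z].
my $matched = ($text =~ /(?<=[A-Z])B/) ? 1 : 0;

print "look-behind matched: $matched\n";   # prints "look-behind matched: 1"
```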

  • Modifier /a /d /l and /u of Regular Expression

    This software is designed on the premise that one Perl script does not mix two or more encodings in its string literals and regexp literals. Therefore, the modifiers /a, /d, /l, and /u are not supported. \d always means [0-9].

  • Named Codepoint

    A named codepoint, such as \N{GREEK SMALL LETTER EPSILON}, \N{greek:epsilon}, or \N{epsilon}, is not supported.

  • Unicode Properties (aka Codepoint Properties) of Regular Expression

    Unicode properties (aka codepoint properties) in regexps are not available. Also, (?[]) in regexps of Perl 5.18 is not available. There are currently no plans to support these.

  • Delimiter of String and Regexp

    qq//, q//, qw//, qx//, qr//, m//, s///, tr///, and y/// can't use a wide codepoint as the delimiter.

  • \b{...} Boundaries in Regular Expressions

    The following \b{...} boundaries, available since Perl v5.22, are not supported.

      \b{gcb} or \b{g}   Unicode "Grapheme Cluster Boundary"
      \b{sb}             Unicode "Sentence Boundary"
      \b{wb}             Unicode "Word Boundary"
      \B{gcb} or \B{g}   Unicode "Grapheme Cluster Boundary" doesn't match
      \B{sb}             Unicode "Sentence Boundary" doesn't match
      \B{wb}             Unicode "Word Boundary" doesn't match

  • format

    Unlike JPerl, the "format" function can't handle MBCS codepoints.
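
The look-behind limitation listed above can be seen at the octet level with plain Perl. The following is a minimal sketch, not output of this software: it assumes a Shift_JIS byte string in which KATAKANA LETTER A is stored as the octets 0x83 0x41, so that its trail octet has the same value as US-ASCII "A". The variable names are made up for illustration.

```perl
use strict;
use warnings;

# Shift_JIS octets of KATAKANA LETTER A: lead octet 0x83, trail octet 0x41.
# The trail octet 0x41 has the same value as US-ASCII "A".
my $text = "\x83\x41" . "1";

# $text contains no US-ASCII upper-case letter, yet the look-behind
# succeeds, because it only inspects the single octet before "1".
my $false_hit = ($text =~ /(?<=[A-Z])1/) ? 1 : 0;

# A genuine US-ASCII "A" followed by "1" matches in the same way, so the
# look-behind alone cannot tell the two cases apart.
my $true_hit = ("A1" =~ /(?<=[A-Z])1/) ? 1 : 0;

print "false hit: $false_hit, true hit: $true_hit\n";
```

If a script depends on such a look-behind, it may be safer to capture the preceding codepoint with ( ) and test it separately.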

UTF8 Flag Considered Harmful, and Our Goal

See P.401, chapter 15: Unicode, of Programming Perl, Third Edition (ISBN 0-596-00027-8).

Before the introduction of Unicode support in perl, the eq operator just compared the byte-strings represented by two scalars. Beginning with perl 5.8, eq compares two byte-strings with simultaneous consideration of the UTF8 flag.

-- we have been taught so for a long time.

Perl is a powerful language for everyone, but the UTF8 flag is a barrier for ordinary beginners, because a person can only do one task at a time. So calling Encode::encode() and Encode::decode() in an application program is not the better way. Writing two separate scripts, one for information processing and one for encoding conversion, may be better. Please trust me.
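
The behaviour of eq described above can be observed with the core Encode module on perl 5.8 or later. This is only an illustrative sketch in plain Perl, not part of this software; the variable names are made up, and the sample data is the three UTF-8 octets of HIRAGANA LETTER A (U+3042).

```perl
use strict;
use warnings;
use Encode ();

# The three UTF-8 octets of HIRAGANA LETTER A (U+3042), UTF8 flag off.
my $octets = "\xE3\x81\x82";

# Decoding yields a one-character string with the UTF8 flag on.
my $chars = Encode::decode('UTF-8', $octets);

# The same text compares as different until it is encoded back to octets.
my $same_before = ($octets eq $chars) ? 1 : 0;
my $same_after  = ($octets eq Encode::encode('UTF-8', $chars)) ? 1 : 0;

printf "length: %d vs %d\n", length($octets), length($chars);
printf "flag:   %d vs %d\n",
    (utf8::is_utf8($octets) ? 1 : 0), (utf8::is_utf8($chars) ? 1 : 0);
printf "equal:  %d before encode, %d after encode\n", $same_before, $same_after;
```

Every place where flagged and unflagged strings meet needs this kind of care, which is the burden the paragraph above argues against.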

  * You are not expected to understand this.
  Information processing model beginning with perl 5.8
    |     Text strings     |                     |
    +----------+-----------|    Binary strings   |
    |  UTF-8   |  Latin-1  |                     |
    | UTF8     |            Not UTF8             |
    | Flagged  |            Flagged              |

  The confusion in Perl's string model comes from the double meaning of
  "Binary string."
  The two meanings of "Binary string" are
  1. Non-Text string
  2. Digital octet string

  Let's draw it again using those terms.
    |     Text strings     |                     |
    +----------+-----------|   Non-Text strings  |
    |  UTF-8   |  Latin-1  |                     |
    | UTF8     |            Not UTF8             |
    | Flagged  |            Flagged              |
    |            Digital octet string            |

There are people who don't agree with the change in the character string processing model in Perl 5.8. It is impossible to get agreement on it from the majority of Perl programmers, who are not heavy users. To see how to solve this by returning to the original Perl, let's read page 402 of Programming Perl, 3rd edition, again.

  Information processing model of UNIX/C-ism, beginning with perl3
  or this software.

    |    Text string as Digital octet string     |
    |    Digital octet string as Text string     |
    |       Not UTF8 Flagged, No MOJIBAKE        |

  In UNIX Everything is a File
  - In UNIX everything is a stream of bytes
  - In UNIX the filesystem is used as a universal name space

  Native Encoding Scripting
  - native encoding of file contents
  - native encoding of file name on filesystem
  - native encoding of command line
  - native encoding of environment variable
  - native encoding of API
  - native encoding of network packet
  - native encoding of database
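
The octet-string model above can be sketched in plain Perl as follows. This is an illustration under stated assumptions, not code from this software: the data is the EUC-JP octet sequence C6 FC CB DC, the in-memory filehandle stands in for a real file and needs perl 5.8 or later, and the variable names are made up.

```perl
use strict;
use warnings;

# Four octets of EUC-JP text, kept exactly as they arrive.
my $native = "\xC6\xFC\xCB\xDC";

# Write the octets with no encoding layer involved.
my $storage = '';
open(my $out, '>', \$storage) or die "cannot open in-memory file: $!";
binmode($out);
print {$out} $native, "\n";
close($out);

# Read them back the same way.
open(my $in, '<', \$storage) or die "cannot open in-memory file: $!";
binmode($in);
my $line = <$in>;
close($in);
chomp($line);

# The text is still the same digital octet string: nothing was converted.
my $unchanged = ($line eq $native) ? 1 : 0;
print "octets: ", length($line), ", unchanged: $unchanged\n";
```

Because nothing is decoded or encoded on the way, the script cannot introduce MOJIBAKE by itself.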

Ideally, we'd like to achieve these five goals:

  • Goal #1:

    Old byte-oriented programs should not spontaneously break on the old byte-oriented data they used to work on.

    This software attempts to achieve this goal by keeping the built-in functions working stably, just as they traditionally have.

  • Goal #2:

    Old byte-oriented programs should magically start working on the new character-oriented data when appropriate.

    This software is not a magician, so it cannot read your mind and act on it.

    You must decide, case by case, whether to write octet semantics or codepoint semantics yourself.

    Figure of Goal #1 and Goal #2:

                                   Goal #1 Goal #2
                            (a)     (b)     (c)     (d)     (e)
          | data         |  Old  |  Old  |  New  |  Old  |  New  |
          | script       |  Old  |      Old      |      New      |
          | interpreter  |  Old  |              New              |
          Old --- Old byte-oriented
          New --- New character-oriented

    There are combinations (a) through (e) of old and new data, script, and interpreter. Let's add JPerl, utf8 pragma, and this software.

                            (a)     (b)     (c)     (d)     (e)
                                          JPerl,mb        utf8
          | data         |  Old  |  Old  |  New  |  Old  |  New  |
          | script       |  Old  |      Old      |      New      |
          | interpreter  |  Old  |              New              |
          Old --- Old byte-oriented
          New --- New character-oriented

    JPerl is excellent because it sits at position (c). That is, it is almost never necessary to write special code to process a new codepoint-oriented script.

  • Goal #3:

    Programs should run just as fast in the new character-oriented mode as in the old byte-oriented mode.

    This is impossible, because the following time is required:

    (1) Time to escape the script for the old byte-oriented perl.

    (2) Time for the escaped script to process regular expressions with multibyte anchoring.

  • Goal #4:

    Perl should remain one language, rather than forking into a byte-oriented Perl and a character-oriented Perl.

    JPerl kept Perl one "language" by forking into two "interpreters." However, the Perl core team did not want to fork the "interpreter." As a result, the Perl "language" forked, contrary to Goal #4.

    There is no need to build a special codepoint-oriented perl, because a byte-oriented perl can already handle binary data. This software is only an application program for byte-oriented Perl, a filter program.

    And you will get support from the Perl community when you solve the problem with a Perl script. The modulino keeps one "language" and one "interpreter."

  • Goal #5: users will be able to maintain by Perl.

    May the be with you, always.
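
The "multibyte anchoring" named under Goal #3 can be sketched in plain Perl as follows. This shows the general technique only and is not the code this software generates; it assumes Shift_JIS lead octets 0x81-0x9F and 0xE0-0xFC, and anchored_match() is a hypothetical helper written for this example.

```perl
use strict;
use warnings;

# One Shift_JIS codepoint: a lead octet followed by any octet,
# or a single octet that is not a lead octet.
my $sjis_char = qr/[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[^\x81-\x9F\xE0-\xFC]/;

# Hypothetical helper: match $pattern only at a codepoint boundary by
# walking whole codepoints from the start of the string.
sub anchored_match {
    my ($string, $pattern) = @_;
    return $string =~ /\A(?:$sjis_char)*?(?:$pattern)/ ? 1 : 0;
}

my $katakana_a = "\x83\x41";    # one codepoint whose trail octet equals "A"

# Octet-level search: finds "A" inside the codepoint (a false hit).
my $naive_hit = ($katakana_a =~ /A/) ? 1 : 0;

# Anchored search: no "A" at a codepoint boundary, so no match.
my $anchored_miss = anchored_match($katakana_a, 'A');

# Anchored search on data that really ends with "A".
my $anchored_hit = anchored_match($katakana_a . 'A', 'A');

print "naive: $naive_hit, anchored: $anchored_miss, real: $anchored_hit\n";
```

The anchored form has to step through every codepoint before the match position, which is one source of the extra time that Goal #3 mentions.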

Back when Programming Perl, 3rd ed. was written, the UTF8 flag had not yet been born and Perl was designed to make the easy jobs easy. This software provides a programming environment like the one at that time.

Perl's motto

   Some computer scientists (the reductionists, in particular) would
  like to deny it, but people have funny-shaped minds. Mental geography
  is not linear, and cannot be mapped onto a flat surface without
  severe distortion. But for the last score years or so, computer
  reductionists have been first bowing down at the Temple of Orthogonality,
  then rising up to preach their ideas of ascetic rectitude to any who
  would listen.
   Their fervent but misguided desire was simply to squash your mind to
  fit their mindset, to smush your patterns of thought into some sort of
  Hyperdimensional Flatland. It's a joyless existence, being smushed.
  --- Learning Perl on Win32 Systems

  If you think this is a big headache, you're right. No one likes
  this situation, but Perl does the best it can with the input and
  encodings it has to deal with. If only we could reset history and
  not make so many mistakes next time.
  --- Learning Perl 6th Edition

   The most important thing for most people to know about handling
  Unicode data in Perl, however, is that if you don't ever use any
  Unicode data -- if none of your files are marked as UTF-8 and you don't
  use UTF-8 locales -- then you can happily pretend that you're back in
  Perl 5.005_03 land; the Unicode features will in no way interfere with
  your code unless you're explicitly using them. Sometimes the twin
  goals of embracing Unicode but not disturbing old-style byte-oriented
  scripts has led to compromise and confusion, but it's the Perl way to
  silently do the right thing, which is what Perl ends up doing.
  --- Advanced Perl Programming, 2nd Edition


INABA Hitoshi <>

This project was originated by INABA Hitoshi.


This software is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.

This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


This software was made with reference to the software and documents that the following hackers and other persons had made. I am thankful to all of them.

 Rick Yamashita, Shift_JIS

 Larry Wall, Perl

 Kazumasa Utashiro,

 Jeffrey E. F. Friedl, Mastering Regular Expressions

 SADAHIRO Tomoyuki, The right way of using Shift_JIS

 Yukihiro "Matz" Matsumoto, YAPC::Asia2006 Ruby on Perl(s)

 jscripter, For jperl users

 Bruce., Unicode in Perl

 Hiroaki Izumi, Shouldn't use Perl5.8 / Perl5.10 on the Windows

 Yuki Kimoto, Is it true that you shouldn't use Perl on Windows?

 chaichanPaPa, Matching Shift_JIS file name

 SUZUKI Norio, Jperl

 WATANABE Hirofumi, Jperl

 Chuck Houpt, Michiko Nozu, MacJPerl

 Kenichi Ishigaki, Pod-PerldocJp, Welcome to modern Perl world

 Fuji, Goro (gfx), Perl Hackers Hub No.16

 Dan Kogai, Encode module

 Takahashi Masatuyo, JPerl Wiki

 Juerd, Perl Unicode Advice

 daily dayflower, 2008-06-25 perluniadvice

 Unicode issues in Perl

 Jesse Vincent, Compatibility is a virtue


 perlunicode, Encode, open, utf8, bytes, Arabic, Big5, Big5HKSCS, CP932::R2,
 CP932IBM::R2, CP932NEC::R2, CP932X::R2, Char::Arabic, Char::Big5HKSCS,
 Char::Big5Plus, Char::Cyrillic, Char::EUCJP, Char::EUCTW, Char::GB18030,
 Char::GBK, Char::Greek, Char::HP15, Char::Hebrew, Char::INFORMIXV6ALS,
 Char::JIS8, Char::KOI8R, Char::KOI8U, Char::KPS9566, Char::Latin1,
 Char::Latin10, Char::Latin2, Char::Latin3, Char::Latin4, Char::Latin5,
 Char::Latin6, Char::Latin7, Char::Latin8, Char::Latin9, Char::OldUTF8,
 Char::Sjis, Char::TIS620, Char::UHC, Char::USASCII, Char::UTF2,
 Char::Windows1252, Char::Windows1258, Cyrillic, GBK, Greek, IOas::CP932,
 IOas::CP932IBM, IOas::CP932NEC, IOas::CP932X, IOas::SJIS2004, Jacode,
 Jacode4e, Jacode4e::RoundTrip, KOI8R, KOI8U, KPS9566, KSC5601, Latin1,
 Latin10, Latin2, Latin3, Latin4, Latin5, Latin6, Latin7, Latin8, Latin9,
 Modern::Open, SJIS2004::R2, Sjis, UTF2, UTF8::R2, Windows1250, Windows1252,
 Windows1254, Windows1257, Windows1258.

 Announcing Perl 7
 Jun 24, 2020 by brian d foy

 Larry Wall, Randal L.Schwartz, Yoshiyuki Kondo
 December 1997
 ISBN 4-89052-384-7

 Programming Perl, Second Edition
 By Larry Wall, Tom Christiansen, Randal L. Schwartz
 October 1996
 Pages: 670
 ISBN 10: 1-56592-149-6 | ISBN 13: 9781565921498

 Programming Perl, Third Edition
 By Larry Wall, Tom Christiansen, Jon Orwant
 Third Edition  July 2000
 Pages: 1104
 ISBN 10: 0-596-00027-8 | ISBN 13: 9780596000271

 The Perl Language Reference Manual (for Perl version 5.12.1)
 by Larry Wall and others
 Paperback (6"x9"), 724 pages
 Retail Price: $39.95 (pound 29.95 in UK)
 ISBN-13: 978-1-906966-02-7

 Perl Pocket Reference, 5th Edition
 By Johan Vromans
 Publisher: O'Reilly Media
 Released: July 2011
 Pages: 102

 Programming Perl, 4th Edition
 By: Tom Christiansen, brian d foy, Larry Wall, Jon Orwant
 Publisher: O'Reilly Media
 Formats: Print, Ebook, Safari Books Online
 Released: March 2012
 Pages: 1130
 Print ISBN: 978-0-596-00492-7 | ISBN 10: 0-596-00492-3
 Ebook ISBN: 978-1-4493-9890-3 | ISBN 10: 1-4493-9890-1

 Perl Cookbook
 By Tom Christiansen, Nathan Torkington
 August 1998
 Pages: 800
 ISBN 10: 1-56592-243-3 | ISBN 13: 978-1-56592-243-3

 Perl Cookbook, Second Edition
 By Tom Christiansen, Nathan Torkington
 Second Edition  August 2003
 Pages: 964
 ISBN 10: 0-596-00313-7 | ISBN 13: 9780596003135

 Perl in a Nutshell, Second Edition
 By Stephen Spainhour, Ellen Siever, Nathan Patwardhan
 Second Edition  June 2002
 Pages: 760
 Series: In a Nutshell
 ISBN 10: 0-596-00241-6 | ISBN 13: 9780596002411

 Learning Perl on Win32 Systems
 By Randal L. Schwartz, Erik Olson, Tom Christiansen
 August 1997
 Pages: 306
 ISBN 10: 1-56592-324-3 | ISBN 13: 9781565923249

 Learning Perl, Fifth Edition
 By Randal L. Schwartz, Tom Phoenix, brian d foy
 June 2008
 Pages: 352
 Print ISBN: 978-0-596-52010-6 | ISBN 10: 0-596-52010-7
 Ebook ISBN: 978-0-596-10316-3 | ISBN 10: 0-596-10316-6

 Learning Perl, 6th Edition
 By Randal L. Schwartz, brian d foy, Tom Phoenix
 June 2011
 Pages: 390
 ISBN-10: 1449303587 | ISBN-13: 978-1449303587

 Advanced Perl Programming, 2nd Edition
 By Simon Cozens
 June 2005
 Pages: 300
 ISBN-10: 0-596-00456-7 | ISBN-13: 978-0-596-00456-9

 Futato, Irving, Jepson, Patwardhan, Siever
 ISBN 10: 1-56592-370-7

 Perl Resource Kit -- Win32 Edition
 Erik Olson, Brian Jepson, David Futato, Dick Hardt
 ISBN 10: 1-56592-409-6

 By Daisuke Maki
 Pages: 344
 ISBN 10: 4798119172 | ISBN 13: 978-4798119175

 Understanding Japanese Information Processing
 By Ken Lunde
 September 1993
 Pages: 470
 ISBN 10: 1-56592-043-0 | ISBN 13: 9781565920439

 CJKV Information Processing
 Chinese, Japanese, Korean & Vietnamese Computing
 By Ken Lunde
 First Edition  January 1999
 Pages: 1128
 ISBN 10: 1-56592-224-7 | ISBN 13: 9781565922242

 By IBM Japan Systems Engineering Co.,Ltd. and IBM Japan, Ltd.
 Pages: 887
 ISBN-10: 4756144659 | ISBN-13: 978-4756144652

 Mastering Regular Expressions, Second Edition
 By Jeffrey E. F. Friedl
 Second Edition  July 2002
 Pages: 484
 ISBN 10: 0-596-00289-0 | ISBN 13: 9780596002893

 Mastering Regular Expressions, Third Edition
 By Jeffrey E. F. Friedl
 Third Edition  August 2006
 Pages: 542
 ISBN 10: 0-596-52812-4 | ISBN 13: 9780596528126

 Regular Expressions Cookbook
 By Jan Goyvaerts, Steven Levithan
 May 2009
 Pages: 512
 ISBN 10: 0-596-52068-9 | ISBN 13: 978-0-596-52068-7

 Regular Expressions Cookbook, 2nd Edition
 By Steven Levithan, Jan Goyvaerts
 Released August 2012
 Pages: 612
 ISBN: 9781449327453

 By Kouji Shibano
 Pages: 1456
 ISBN 4-542-20129-5

 1993 Aug
 Pages: 172
 T1008901080816 ZASSHI 08901-8

 By YAMAGATA Hiroo, Stephen J. Turnbull, Craig Oda, Robert J. Bickel
 June, 2000
 Pages: 376
 ISBN 4-87311-016-5

 Windows NT Shell Scripting
 By Timothy Hill
 April 27, 1998
 Pages: 400
 ISBN 10: 1578700477 | ISBN 13: 9781578700479

 Windows(R) Command-Line Administrators Pocket Consultant, 2nd Edition
 By William R. Stanek
 February 2009
 Pages: 594
 ISBN 10: 0-7356-2262-0 | ISBN 13: 978-0-7356-2262-3

 Kaoru Maeda, Perl's history Perl 1,2,3,4

 nurse, What is "string"

 NISHIO Hirokazu, What's meant "string as a sequence of characters"?

 nurse, History of Japanese EUC 22:00

 Mike Whitaker, Perl And Unicode

 About Windows and Japanese text

 About Windows diagnostic data

 Ricardo Signes, Perl 5.14 for Pragmatists

 Ricardo Signes, What's New in Perl? v5.10 - v5.16

 YAP(achimon)C::Asia Hachioji 2016 mid in Shinagawa
 Kenichi Ishigaki (@charsbar) July 3, 2016 YAP(achimon)C::Asia Hachioji 2016mid

 Causes and countermeasures for garbled Japanese characters in perl

 Perl regular expression bug?

 CPAN Directory INABA Hitoshi

 Recent Perl packages by "INABA Hitoshi"

 Tokyo-pm archive

 Error: Runtime exception on jperl 5.005_03


 TANABATA - The Star Festival - common legend of east asia