mb - run Perl script in MBCS encoding (not only CJK ;-)
  $ perl mb.pm              MBCS_Perl_script.pl (auto detect encoding of script)
  $ perl mb.pm -e big5      MBCS_Perl_script.pl
  $ perl mb.pm -e big5hkscs MBCS_Perl_script.pl
  $ perl mb.pm -e eucjp     MBCS_Perl_script.pl
  $ perl mb.pm -e gb18030   MBCS_Perl_script.pl
  $ perl mb.pm -e gbk       MBCS_Perl_script.pl
  $ perl mb.pm -e sjis      MBCS_Perl_script.pl
  $ perl mb.pm -e uhc       MBCS_Perl_script.pl
  $ perl mb.pm -e utf8      MBCS_Perl_script.pl
  $ perl mb.pm -e wtf8      MBCS_Perl_script.pl

  MBCS subroutines:
    mb::chop(...);
    mb::chr(...);
    mb::do 'file';
    mb::dosglob(...);
    mb::eval 'string';
    mb::getc(...);
    mb::index(...);
    mb::index_byte(...);
    mb::length(...);
    mb::ord(...);
    mb::require 'file';
    mb::reverse(...);
    mb::rindex(...);
    mb::rindex_byte(...);
    mb::substr(...);
    mb::use Module;
    mb::no Module;

  MBCS special variables:
    $mb::PERL
    $mb::ORIG_PROGRAM_NAME

  supported encodings:
    Big5, Big5-HKSCS, EUC-JP, GB18030, GBK, Sjis, UHC, UTF-8, WTF-8

  supported operating systems:
    Apple Inc. OS X,
    Hewlett-Packard Development Company, L.P. HP-UX,
    International Business Machines Corporation AIX,
    Microsoft Corporation Windows,
    Oracle Corporation Solaris,
    and Other Systems

  supported perl versions:
    perl version 5.005_03 to newest perl
To install this software by make, type the following:
  perl Makefile.PL
  make
  make test
  make install
To install this software without make, type the following:
  pmake.bat test
  pmake.bat install
This software is a source code filter, a transpiler-modulino.

Perl is said to have been able to handle Unicode since version 5.8. However, unlike JPerl, it lost "Easy jobs easy." (But we have got it back again :-D)

Shift_JIS and similar encodings (Big5, Big5-HKSCS, GB18030, GBK, Sjis, UHC) contain DAMEMOJI: characters whose second octet is a metacharacter. Which characters are DAMEMOJI depends on whether the enclosing delimiter is a single quote or a double quote.

This software escapes the DAMEMOJI in your script, generates a new script, and runs it.

There are several MBCS encodings in the world: in Japan since 1978, JIS C 6226-1978; in China since 1980, GB 2312-80; in Taiwan since 1984, Big5; in South Korea since 1991, KS X 1002:1991; and more. Even if you are an avid Unicode proponent, you cannot change this fact. These encodings are still used today in most areas except the World Wide Web.

This software ...

  * supports MBCS literals in Perl scripts
  * supports Big5, Big5-HKSCS, EUC-JP, GB18030, GBK, Sjis, UHC, UTF-8, and WTF-8
  * does not use the UTF8 flag, to avoid MOJIBAKE
  * escapes DAMEMOJI in scripts
  * handles raw encoding to support GAIJI
  * adds multibyte anchoring to regular expressions
  * rewrites character classes in regular expressions to work on MBCS codepoints
  * supports special variables $`, $&, and $'
  * does not change features of octet-oriented built-in functions
  * lc(), lcfirst(), uc(), and ucfirst() convert US-ASCII only
  * codepoint ranges written with a hyphen in tr/// and y/// support US-ASCII only
  * you have to call mb::* subroutines if you want codepoint semantics

Let's enjoy MBCS scripting in Perl!!
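The point that lc(), lcfirst(), uc(), and ucfirst() convert US-ASCII only matters because, in bare Perl, an octet-wise case conversion can corrupt a double-byte character whose second octet happens to fall in A..Z. The following is a minimal sketch of that effect, assuming Sjis lead octets 81..9F and E0..FC. The subroutine name sketch_lc is hypothetical, and this is not the code mb.pm uses internally.

```perl
use strict;

# Hypothetical helper, for illustration only: lowercase US-ASCII letters
# while copying every Sjis double-byte character through unchanged.
sub sketch_lc {
    my ($octets) = @_;
    $octets =~ s{([\x81-\x9F\xE0-\xFC][\x00-\xFF])|([A-Z])}{
        defined $1 ? $1 : lc $2
    }ge;
    return $octets;
}

my $text    = "\x83\x41" . "ABC";   # one double-byte character, then "ABC"
my $naive   = lc $text;             # octet-wise: also rewrites the 2nd octet 0x41
my $careful = sketch_lc($text);     # only the single-byte "ABC" is lowercased

printf "naive:   %s\n", unpack('H*', $naive);
printf "careful: %s\n", unpack('H*', $careful);
```

The naive result changes the second octet of the double-byte character, so it now encodes a different character; the sketch leaves that character alone.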
To understand and use this software, you need to know some terminology. I have no time to write the definitions now: today is July 7th, and I have to go to meet Juliet. The necessary terms are listed below. Maybe the World Wide Web will help you.
byte
octet
encoding
decode
character
codepoint
grapheme
SBCS(Single Byte Character Set)
DBCS(Double Byte Character Set)
MBCS(Multibyte Character Set)
multibyte anchoring
character class
MOJIBAKE
DAMEMOJI
GAIJI
GETA, GETA-MOJI, GETA-MARK
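Of the terms above, multibyte anchoring is probably the least familiar. A rough sketch of the idea, assuming Sjis octet ranges: a search for "A" must only be tried at character boundaries, otherwise it can hit the second octet of a double-byte character. The pattern below is an illustration only and is not the anchor that mb.pm generates.

```perl
use strict;

# One Sjis character: a lead octet plus any octet, or else a single octet.
# The atomic group (?>...) stops the regex engine from splitting a
# double-byte character when it backtracks.
my $char = qr/(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x00-\xFF])/;

my $text = "\x83\x41";   # a single double-byte character; its 2nd octet is 0x41 ("A")

my $naive_hit    = ($text =~ /A/)              ? 1 : 0;  # matches inside the character
my $anchored_hit = ($text =~ /\A(?:$char)*?A/) ? 1 : 0;  # tried at character boundaries only

print "naive=$naive_hit anchored=$anchored_hit\n";
```

The unanchored pattern reports a match although the text contains no letter "A"; the anchored pattern does not.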
The encodings supported by this software and their range of octets are as follows.

  ------------------------------------------------------------------------------
  big5 (Big5)
             1st       2nd
             81..FE    00..FF
             00..7F
             https://en.wikipedia.org/wiki/Big5
             * needs multibyte anchoring
             * needs escaping meta char of 2nd octet
             * unsafe US-ASCII casefolding of 2nd octet
  ------------------------------------------------------------------------------
  big5hkscs (Big5-HKSCS)
             1st       2nd
             81..FE    00..FF
             00..7F
             https://en.wikipedia.org/wiki/Hong_Kong_Supplementary_Character_Set
             * needs multibyte anchoring
             * needs escaping meta char of 2nd octet
             * unsafe US-ASCII casefolding of 2nd octet
  ------------------------------------------------------------------------------
  eucjp (EUC-JP)
             1st       2nd
             A1..FE    00..FF
             00..7F
             https://en.wikipedia.org/wiki/Extended_Unix_Code#EUC-JP
             * needs multibyte anchoring
             * needs no escaping meta char of 2nd octet
             * safe US-ASCII casefolding of 2nd octet
  ------------------------------------------------------------------------------
  gb18030 (GB18030)
             1st       2nd       3rd       4th
             81..FE    30..39    81..FE    30..39
             81..FE    00..FF
             00..7F
             https://en.wikipedia.org/wiki/GB_18030
             * needs multibyte anchoring
             * needs escaping meta char of 2nd octet
             * unsafe US-ASCII casefolding of 2nd-4th octet
  ------------------------------------------------------------------------------
  gbk (GBK)
             1st       2nd
             81..FE    00..FF
             00..7F
             https://en.wikipedia.org/wiki/GBK_(character_encoding)
             * needs multibyte anchoring
             * needs escaping meta char of 2nd octet
             * unsafe US-ASCII casefolding of 2nd octet
  ------------------------------------------------------------------------------
  sjis (Shift_JIS-like encodings)
             1st       2nd
             81..9F    00..FF
             E0..FC    00..FF
             80..FF
             00..7F
             https://en.wikipedia.org/wiki/Shift_JIS
             * needs multibyte anchoring
             * needs escaping meta char of 2nd octet
             * unsafe US-ASCII casefolding of 2nd octet
  ------------------------------------------------------------------------------
  uhc (UHC)
             1st       2nd
             81..FE    00..FF
             00..7F
             https://en.wikipedia.org/wiki/Unified_Hangul_Code
             * needs multibyte anchoring
             * needs no escaping meta char of 2nd octet
             * unsafe US-ASCII casefolding of 2nd octet
  ------------------------------------------------------------------------------
  utf8 (UTF-8)
             1st       2nd       3rd       4th
             E1..EC    80..BF    80..BF
             C2..DF    80..BF
             EE..EF    80..BF    80..BF
             F0..F0    90..BF    80..BF    80..BF
             E0..E0    A0..BF    80..BF
             ED..ED    80..9F    80..BF
             F1..F3    80..BF    80..BF    80..BF
             F4..F4    80..8F    80..BF    80..BF
             00..7F
             https://en.wikipedia.org/wiki/UTF-8
             * needs no multibyte anchoring
             * needs no escaping meta char of 2nd-4th octets
             * safe US-ASCII casefolding of 2nd-4th octet
             * enforces surrogate codepoints must be paired
  ------------------------------------------------------------------------------
  wtf8 (WTF-8)
             1st       2nd       3rd       4th
             E1..EF    80..BF    80..BF
             C2..DF    80..BF
             E0..E0    A0..BF    80..BF
             F0..F0    90..BF    80..BF    80..BF
             F1..F3    80..BF    80..BF    80..BF
             F4..F4    80..8F    80..BF    80..BF
             00..7F
             http://simonsapin.github.io/wtf-8/
             * superset of UTF-8 that encodes surrogate codepoints if they are not in a pair
             * needs no multibyte anchoring
             * needs no escaping meta char of 2nd-4th octets
             * safe US-ASCII casefolding of 2nd-4th octet
  ------------------------------------------------------------------------------
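The octet ranges above translate fairly directly into regular expressions that recognize one character at a time. The sketch below transcribes the sjis, utf8, and wtf8 rows and uses them to split an octet string into characters. The names %char_re and split_chars are hypothetical; this only illustrates how the tables can be read and does not reproduce mb.pm internals.

```perl
use strict;

# Character patterns transcribed from the octet-range tables (multibyte rows first).
my %char_re = (
    sjis => qr/[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF]|[\x00-\x7F]/,
    utf8 => qr/ [\xE1-\xEC][\x80-\xBF][\x80-\xBF]
              | [\xC2-\xDF][\x80-\xBF]
              | [\xEE-\xEF][\x80-\xBF][\x80-\xBF]
              | \xF0[\x90-\xBF][\x80-\xBF][\x80-\xBF]
              | \xE0[\xA0-\xBF][\x80-\xBF]
              | \xED[\x80-\x9F][\x80-\xBF]
              | [\xF1-\xF3][\x80-\xBF][\x80-\xBF][\x80-\xBF]
              | \xF4[\x80-\x8F][\x80-\xBF][\x80-\xBF]
              | [\x00-\x7F] /x,
    wtf8 => qr/ [\xE1-\xEF][\x80-\xBF][\x80-\xBF]
              | [\xC2-\xDF][\x80-\xBF]
              | \xE0[\xA0-\xBF][\x80-\xBF]
              | \xF0[\x90-\xBF][\x80-\xBF][\x80-\xBF]
              | [\xF1-\xF3][\x80-\xBF][\x80-\xBF][\x80-\xBF]
              | \xF4[\x80-\x8F][\x80-\xBF][\x80-\xBF]
              | [\x00-\x7F] /x,
);

# Split octets into characters; stops at the first octet sequence that
# does not fit the table for the given encoding.
sub split_chars {
    my ($octets, $encoding) = @_;
    my $re = $char_re{$encoding} or die "unknown encoding: $encoding";
    my @chars = $octets =~ /\G($re)/g;
    return @chars;
}

my @sjis      = split_chars("\x83\x5C" . "A", 'sjis');      # double-byte + ASCII
my @utf8      = split_chars("\xE3\x81\x82" . "A", 'utf8');  # three-octet + ASCII
my @lone_utf8 = split_chars("\xED\xA0\x80", 'utf8');        # unpaired surrogate
my @lone_wtf8 = split_chars("\xED\xA0\x80", 'wtf8');

print scalar(@sjis), " ", scalar(@utf8), " ",
      scalar(@lone_utf8), " ", scalar(@lone_wtf8), "\n";
```

The last two calls show the one difference between the utf8 and wtf8 tables: the ED row restricts the second octet to 80..9F in utf8, so an encoded surrogate is accepted only by wtf8.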
This software provides the traditional features "as was." The new MBCS features are provided by subroutines with new names. If you like the utf8 pragma, the mb::* subroutines will help you. On the other hand, if you love JPerl, those subroutines will not help you very much. The traditional functions of Perl are still useful with octet-oriented semantics.

  elder <--                            age                              --> younger
  ---------------------------------------------------------------------------------
  bare Perl4         JPerl4
  bare Perl5         JPerl5             use utf8;          mb.pm
  bare Perl7                            pragma             modulino
  ---------------------------------------------------------------------------------
  chop               ---                ---                chop
  chr                chr                bytes::chr         chr
  getc               getc               ---                getc
  index              ---                bytes::index       index
  lc                 lc                 ---                lc (by internal mb::lc)
  lcfirst            lcfirst            ---                lcfirst (by internal mb::lcfirst)
  length             length             bytes::length      length
  ord                ord                bytes::ord         ord
  reverse            reverse            ---                reverse
  rindex             ---                bytes::rindex      rindex
  substr             substr             bytes::substr      substr
  uc                 uc                 ---                uc (by internal mb::uc)
  ucfirst            ucfirst            ---                ucfirst (by internal mb::ucfirst)
  ---                chop               chop               mb::chop
  ---                ---                chr                mb::chr
  ---                ---                getc               mb::getc
  ---                index              ---                mb::index_byte
  ---                ---                index              mb::index
  ---                ---                lc                 ---
  ---                ---                lcfirst            ---
  ---                ---                length             mb::length
  ---                ---                ord                mb::ord
  ---                ---                reverse            mb::reverse
  ---                rindex             ---                mb::rindex_byte
  ---                ---                rindex             mb::rindex
  ---                ---                substr             mb::substr
  ---                ---                uc                 ---
  ---                ---                ucfirst            ---
  ---------------------------------------------------------------------------------
  do 'file'          ---                ---                do 'file'
  eval 'string'      ---                ---                eval 'string'
  require 'file'     ---                ---                require 'file'
  use Module         ---                ---                use Module
  no Module          ---                ---                no Module
  ---                do 'file'          do 'file'          mb::do 'file'
  ---                eval 'string'      eval 'string'      mb::eval 'string'
  ---                require 'file'     require 'file'     mb::require 'file'
  ---                use Module         use Module         mb::use Module
  ---                no Module          no Module          mb::no Module
  $^X                ---                ---                $^X
  ---                $^X                $^X                $mb::PERL
  $0                 $0                 $0                 $mb::ORIG_PROGRAM_NAME
  ---                ---                ---                $0
  ---------------------------------------------------------------------------------
DOS-like glob() as MBCS subroutine

  -----------------------------------------------------------------
  MBCS semantics         broken function, not so useful
  -----------------------------------------------------------------
  mb::dosglob            glob, and <globbing*>
  -----------------------------------------------------------------
  but everybody loves split(/\n/,`dir /b *.* 2>NUL`) since Perl4

index brothers

  ------------------------------------------------------------------------------------------
  functions or subs   works as    returns as   considered
  ------------------------------------------------------------------------------------------
  index               octet       octet        useful, bare Perl like
  rindex              octet       octet        useful, bare Perl like
  mb::index           codepoint   codepoint    not so useful, utf8 pragma like
  mb::rindex          codepoint   codepoint    not so useful, utf8 pragma like
  mb::index_byte      codepoint   octet        useful, JPerl like
  mb::rindex_byte     codepoint   octet        useful, JPerl like
  ------------------------------------------------------------------------------------------

Sometimes "compatibility" means "compromise." In that case, "best compatibility" means "most useful compromise." That's what mb::index_byte() and mb::rindex_byte() are. But sorry for the long name.
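The three rows of the "index brothers" table can be illustrated in plain Perl. The sketch below assumes Sjis octet ranges; sketch_index_byte and sketch_index are hypothetical stand-ins that mimic the "works as / returns as" columns for mb::index_byte and mb::index, and they are not the module's own code.

```perl
use strict;

# One Sjis character: a lead octet plus any octet, or else a single octet.
my $CHAR = qr/[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x00-\xFF]/;

# Works as codepoint, returns as octet: search only at character
# boundaries, but report the octet offset.
sub sketch_index_byte {
    my ($str, $substr) = @_;
    my $pos = 0;
    while ($pos + length($substr) <= length($str)) {
        return $pos if substr($str, $pos, length($substr)) eq $substr;
        # Step over one whole character so a 2nd octet is never a start point.
        $pos += (substr($str, $pos) =~ /\A($CHAR)/) ? length($1) : 1;
    }
    return -1;
}

# Works as codepoint, returns as codepoint: convert the octet offset
# into a count of characters that precede the match.
sub sketch_index {
    my ($str, $substr) = @_;
    my $octet_pos = sketch_index_byte($str, $substr);
    return -1 if $octet_pos < 0;
    my @chars = substr($str, 0, $octet_pos) =~ /\G($CHAR)/g;
    return scalar @chars;
}

my $str = "\x83\x41" . "A";   # one double-byte character (2nd octet 0x41), then "A"

my $by_octet = index($str, "A");              # octet search, octet offset
my $by_byte  = sketch_index_byte($str, "A");  # character search, octet offset
my $by_char  = sketch_index($str, "A");       # character search, character offset

print "index=$by_octet index_byte=$by_byte index(codepoint)=$by_char\n";
```

The octet offset from the second form can be passed straight to the built-in substr(), which is why the table rates the JPerl-like variant as useful.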
This software provides the following two special variables for convenience.
$mb::PERL
  system(qq{ $^X perl_script.pl });            # you used to write this...

  # on mb.pm modulino
  system(qq{ $^X       SBCS_perl_script.pl }); # for SBCS script
  system(qq{ $mb::PERL MBCS_perl_script.pl }); # for MBCS script
$mb::ORIG_PROGRAM_NAME
  if ($0 =~ /-x64\.pl\z/) { ... }                     # you used to write this...

  # on mb.pm modulino
  if ($0 =~ /-x64\.pl\z/) { ... }                     # means the program name translated by mb.pm modulino (is that what you want?)
  if ($mb::ORIG_PROGRAM_NAME =~ /-x64\.pl\z/) { ... } # means the original program name, not translated by mb.pm modulino
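A script that must run both with and without the mb.pm modulino could fall back to the standard variables when the special ones are not set. This is only a sketch of one possible idiom; it assumes that $mb::PERL and $mb::ORIG_PROGRAM_NAME are undefined when the script is started by plain perl, and the lexical names $perl and $name are illustrative.

```perl
use strict;

# Prefer the mb.pm special variables; fall back to the built-in ones
# when the script is not running under the modulino (assumption: they
# are simply undefined in that case).
my $perl = defined $mb::PERL              ? $mb::PERL              : $^X;
my $name = defined $mb::ORIG_PROGRAM_NAME ? $mb::ORIG_PROGRAM_NAME : $0;

print "interpreter command: $perl\n";
print "program name:        $name\n";
```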
  -----------------------------------------------------------------
  original script in        script with
  Perl4, Perl5, Perl7       mb.pm modulino
  -----------------------------------------------------------------
  chop                      chop
  chr                       chr
  do 'file'                 do 'file'
  eval 'string'             eval 'string'
  getc                      getc
  index                     index
  lc                        lc
  lcfirst                   lcfirst
  length                    length
  no Module                 no Module
  no Module qw(ARGUMENTS)   no Module qw(ARGUMENTS)
  ord                       ord
  require 'file'            require 'file'
  reverse                   reverse
  rindex                    rindex
  substr                    substr
  uc                        uc
  ucfirst                   ucfirst
  use Module                use Module
  use Module qw(ARGUMENTS)  use Module qw(ARGUMENTS)
  use Module ()             use Module ()
  -----------------------------------------------------------------
  -----------------------------------------------------------------
  original script in        script with
  JPerl4, JPerl5            mb.pm modulino
  -----------------------------------------------------------------
  chop                      mb::chop
  do 'file'                 mb::do 'file'
  eval 'string'             mb::eval 'string'
  index                     mb::index_byte
  no Module                 mb::no Module
  no Module qw(ARGUMENTS)   mb::no Module qw(ARGUMENTS)
  require 'file'            mb::require 'file'
  rindex                    mb::rindex_byte
  use Module                mb::use Module
  use Module qw(ARGUMENTS)  mb::use Module qw(ARGUMENTS)
  use Module ()             mb::use Module ()
  -----------------------------------------------------------------
  -----------------------------------------------------------------
  original script with      script with
  utf8 pragma               mb.pm modulino
  -----------------------------------------------------------------
  chop                      mb::chop
  chr                       mb::chr
  do 'file'                 mb::do 'file'
  eval 'string'             mb::eval 'string'
  getc                      mb::getc
  index                     mb::index
  lc                        ---
  lcfirst                   ---
  length                    mb::length
  no Module                 mb::no Module
  no Module qw(ARGUMENTS)   mb::no Module qw(ARGUMENTS)
  ord                       mb::ord
  require 'file'            mb::require 'file'
  reverse                   mb::reverse
  rindex                    mb::rindex
  substr                    mb::substr
  uc                        ---
  ucfirst                   ---
  use Module                mb::use Module
  use Module qw(ARGUMENTS)  mb::use Module qw(ARGUMENTS)
  use Module ()             mb::use Module ()
  -----------------------------------------------------------------
In single quote, DAMEMOJI are double-byte characters that include the following metacharacters ('', q{}, <<'END', qw{}, m'', s''', split(''), split(m''), and qr'')

  ------------------------------------------------------------------
  hex    character as US-ASCII
  ------------------------------------------------------------------
  5C     [\]    backslashed escapes
  ------------------------------------------------------------------

In double quote, DAMEMOJI are double-byte characters that include the following metacharacters ("", qq{}, <<END, <<"END", ``, qx{}, <<`END`, //, m//, ??, s///, split(//), split(m//), and qr//)

  ------------------------------------------------------------------
  hex    character as US-ASCII
  ------------------------------------------------------------------
  21     [!]
  22     ["]
  23     [#]    regexp comment
  24     [$]    sigil of scalar variable
  25     [%]
  26     [&]
  27     [']
  28     [(]    regexp group and capture
  29     [)]    regexp group and capture
  2A     [*]    regexp matches zero or more times
  2B     [+]    regexp matches one or more times
  2C     [,]
  2D     [-]
  2E     [.]    regexp matches any octet
  2F     [/]
  3A     [:]
  3B     [;]
  3C     [<]
  3D     [=]
  3E     [>]
  3F     [?]    regexp matches zero or one times
  40     [@]    sigil of array variable
  5B     [[]    regexp bracketed character class
  5C     [\]    backslashed escapes
  5D     []]    regexp bracketed character class
  5E     [^]    regexp true at beginning of string
  60     [`]    command execution
  7B     [{]    regexp quantifier
  7C     [|]    regexp alternation
  7D     [}]    regexp quantifier
  7E     [~]
  ------------------------------------------------------------------
  ex. Japanese KATAKANA "SO" like [ `/ ] code is "\x83\x5C" in Sjis

                  see     hex dump
  -----------------------------------------
  source script   "`/"    [83 5c]
  -----------------------------------------

  using mb.pm,
                          hex dump
  -----------------------------------------
  escaped script  "`\/"   [83 [5c] 5c]
  -----------------------------------------
                    ^--- escape by mb.pm

  by the by       see     hex dump
  -----------------------------------------
  your eye's      "`/\"   [83 5c] [5c]
  -----------------------------------------
  perl eye's      "`\/"   [83] \[5c]
  -----------------------------------------

                          hex dump
  -----------------------------------------
  in the perl     "`/"    [83] [5c]
  -----------------------------------------
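The KATAKANA "SO" example above can be reproduced with a small escaping routine. The sketch below assumes Sjis lead octets 81..9F and E0..FC and the double-quote metacharacter table; the name escape_damemoji_dq is hypothetical, and the real transpiler certainly does more than this (it also has to find the quoted strings in the script in the first place).

```perl
use strict;

my $LEAD = qr/[\x81-\x9F\xE0-\xFC]/;                         # Sjis lead octets
my $META = qr/[\x21-\x2F\x3A-\x40\x5B-\x5E\x60\x7B-\x7E]/;   # double-quote metacharacters

# Insert a backslash before the 2nd octet of every double-byte character
# whose 2nd octet is a metacharacter; single octets are copied as they are.
sub escape_damemoji_dq {
    my ($octets) = @_;
    $octets =~ s{($LEAD)([\x00-\xFF])|([\x00-\xFF])}{
        my ($lead, $trail, $single) = ($1, $2, $3);
        defined $single         ? $single
      : $trail =~ /\A$META\z/   ? $lead . "\\" . $trail
      :                           $lead . $trail;
    }ge;
    return $octets;
}

my $so      = "\x83\x5C";                # KATAKANA "SO" in Sjis
my $escaped = escape_damemoji_dq($so);   # 83 5C 5C, as in the hex dump above

# Compile both forms as double-quoted literals, the way perl would see them.
my $broken = eval qq{"$so"};        # the 5C swallows the closing quote: syntax error
my $fixed  = eval qq{"$escaped"};   # perl reads 83, then an escaped backslash

printf "escaped: %s\n", unpack('H*', $escaped);
printf "broken:  %s\n", defined $broken ? unpack('H*', $broken) : 'undef';
printf "fixed:   %s\n", unpack('H*', $fixed);
```

For single-quoted strings only 5C needs this treatment, so a corresponding sketch would use a $META that matches just that one octet.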
This software automatically transpiles MBCS literal strings in scripts to octet-oriented strings (OO-quotee).

  -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  in your script                            script transpiled by this software
  -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  do 'file'                                 do 'file'
  do { block }                              do { block }
  mb::do 'file'                             mb::do 'file'
  mb::do { block }                          do { block }
  eval 'string'                             eval 'string'
  eval { block }                            eval { block }
  mb::eval 'string'                         mb::eval 'string'
  mb::eval { block }                        eval { block }
  require 123                               require 123
  require 'file'                            require 'file'
  mb::require 123                           mb::require 123
  mb::require 'file'                        mb::require 'file'
  use Module 5.005;                         use Module 5.005;
  use Module 5.005 qw(A B C);               use Module 5.005 qw(A B C);
  use Module 5.005 ();                      use Module 5.005 ();
  use Module;                               use Module;
  use Module qw(A B C);                     use Module qw(A B C);
  use Module ();                            use Module ();
  mb::use Module 5.005;                     BEGIN { mb::require 'Module'; Module->VERSION(5.005); Module->import; };
  mb::use Module 5.005 qw(A B C);           BEGIN { mb::require 'Module'; Module->VERSION(5.005); Module->import(qw(A B C)); };
  mb::use Module 5.005 ();                  BEGIN { mb::require 'Module'; Module->VERSION(5.005); };
  mb::use Module;                           BEGIN { mb::require 'Module'; Module->import; };
  mb::use Module qw(A B C);                 BEGIN { mb::require 'Module'; Module->import(qw(A B C)); };
  mb::use Module ();                        BEGIN { mb::require 'Module'; };
  no Module 5.005;                          no Module 5.005;
  no Module 5.005 qw(A B C);                no Module 5.005 qw(A B C);
  no Module 5.005 ();                       no Module 5.005 ();
  no Module;                                no Module;
  no Module qw(A B C);                      no Module qw(A B C);
  no Module ();                             no Module ();
  mb::no Module 5.005;                      BEGIN { mb::require 'Module'; Module->VERSION(5.005); Module->unimport; };
  mb::no Module 5.005 qw(A B C);            BEGIN { mb::require 'Module'; Module->VERSION(5.005); Module->unimport(qw(A B C)); };
  mb::no Module 5.005 ();                   BEGIN { mb::require 'Module'; Module->VERSION(5.005); };
  mb::no Module;                            BEGIN { mb::require 'Module'; Module->unimport; };
  mb::no Module qw(A B C);                  BEGIN { mb::require 'Module'; Module->unimport(qw(A B C)); };
  mb::no Module ();                         BEGIN { mb::require 'Module'; };
  chop                                      chop
  lc                                        mb::lc
  lcfirst                                   mb::lcfirst
  uc                                        mb::uc
  ucfirst                                   mb::ucfirst
  index                                     index
  rindex                                    rindex
  mb::getc()                                mb::getc()
  mb::getc($fh)                             mb::getc($fh)
  mb::getc $fh                              mb::getc $fh
  mb::getc(FILE)                            mb::getc(\*FILE)
  mb::getc FILE                             mb::getc \*FILE
  mb::getc                                  mb::getc
  'MBCS-quotee'                             'OO-quotee'
  "MBCS-quotee"                             "OO-quotee"
  `MBCS-quotee`                             `OO-quotee`
  /MBCS-quotee/cgimosx                      m{\G${mb::_anchor}@{[mb::_ignorecase(qr/OO-quotee/mosx)]}@{[mb::_m_passed()]}}cg
  /MBCS-quotee/cgmosx                       m{\G${mb::_anchor}@{[qr/OO-quotee/mosx ]}@{[mb::_m_passed()]}}cg
  ?MBCS-quotee?cgimosx                      m{\G${mb::_anchor}@{[mb::_ignorecase(qr?OO-quotee?mosx)]}@{[mb::_m_passed()]}}cg
  ?MBCS-quotee?cgmosx                       m{\G${mb::_anchor}@{[qr?OO-quotee?mosx ]}@{[mb::_m_passed()]}}cg
  <MBCS-quotee>                             <OO-quotee>
  q/MBCS-quotee/                            q/OO-quotee/
  qx'MBCS-quotee'                           qx'OO-quotee'
  qw/MBCS-quotee/                           qw/OO-quotee/
  m'MBCS-quotee'cgimosx                     m{\G${mb::_anchor}@{[mb::_ignorecase(qr'OO-quotee'mosx)]}@{[mb::_m_passed()]}}cg
  m'MBCS-quotee'cgmosx                      m{\G${mb::_anchor}@{[qr'OO-quotee'mosx ]}@{[mb::_m_passed()]}}cg
  s'MBCS-regexp'MBCS-replacement'eegimosxr  s{(\G${mb::_anchor})@{[mb::_ignorecase(qr'OO-regexp'mosx)]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q'OO-replacement'}egr
  s'MBCS-regexp'MBCS-replacement'eegmosxr   s{(\G${mb::_anchor})@{[qr'OO-regexp'mosx ]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q'OO-replacement'}egr
  tr/MBCS-search/MBCS-replacement/cdsr      s{[\x00-\xFF]*}{mb::tr($&,q/OO-search/,q/OO-replacement/,'cdsr')}ser
  tr/MBCS-search/MBCS-replacement/cds       s{[\x00-\xFF]+}{mb::tr($&,q/OO-search/,q/OO-replacement/,'cdsr')}se
  tr/MBCS-search/MBCS-replacement/ds        s{[\x00-\xFF]+}{mb::tr($&,q/OO-search/,q/OO-replacement/,'dsr')}se
  y/MBCS-search/MBCS-replacement/cdsr       s{[\x00-\xFF]*}{mb::tr($&,q/OO-search/,q/OO-replacement/,'cdsr')}ser
  y/MBCS-search/MBCS-replacement/cds        s{[\x00-\xFF]+}{mb::tr($&,q/OO-search/,q/OO-replacement/,'cdsr')}se
  y/MBCS-search/MBCS-replacement/ds         s{[\x00-\xFF]+}{mb::tr($&,q/OO-search/,q/OO-replacement/,'dsr')}se
  qr'MBCS-quotee'cgimosx                    qr{\G${mb::_anchor}@{[mb::_ignorecase(qr'OO-quotee'mosx)]}@{[mb::_m_passed()]}}cg
  qr'MBCS-quotee'cgmosx                     qr{\G${mb::_anchor}@{[qr'OO-quotee'mosx ]}@{[mb::_m_passed()]}}cg
  split m'^'                                mb::_split qr{@{[qr'^'m ]}}
  split m'MBCS-quotee'cgimosx               mb::_split qr{@{[mb::_ignorecase(qr'OO-quotee'mosx)]}}cg
  split m'MBCS-quotee'cgmosx                mb::_split qr{@{[qr'OO-quotee'mosx ]}}cg
  split qr'^'                               mb::_split qr{@{[qr'^'m ]}}
  split qr'MBCS-quotee'cgimosx              mb::_split qr{@{[mb::_ignorecase(qr'OO-quotee'mosx)]}}cg
  split qr'MBCS-quotee'cgmosx               mb::_split qr{@{[qr'OO-quotee'mosx ]}}cg
  qq/MBCS-quotee/                           qq/OO-quotee/
  qq'MBCS-quotee'                           qq'OO-quotee'
  qx/MBCS-quotee/                           qx/OO-quotee/
  m/MBCS-quotee/cgimosx                     m{\G${mb::_anchor}@{[mb::_ignorecase(qr/OO-quotee/mosx)]}@{[mb::_m_passed()]}}cg
  m/MBCS-quotee/cgmosx                      m{\G${mb::_anchor}@{[qr/OO-quotee/mosx ]}@{[mb::_m_passed()]}}cg
  s/MBCS-regexp/MBCS-replacement/eegimosxr  s{(\G${mb::_anchor})@{[mb::_ignorecase(qr/OO-regexp/mosx)]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q/OO-replacement/}egr
  s/MBCS-regexp/MBCS-replacement/eegmosxr   s{(\G${mb::_anchor})@{[qr/OO-regexp/mosx ]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q/OO-replacement/}egr
  qr/MBCS-quotee/cgimosx                    qr{\G${mb::_anchor}@{[mb::_ignorecase(qr/OO-quotee/mosx)]}@{[mb::_m_passed()]}}cg
  qr/MBCS-quotee/cgmosx                     qr{\G${mb::_anchor}@{[qr/OO-quotee/mosx ]}@{[mb::_m_passed()]}}cg
  split /^/                                 mb::_split qr{@{[qr/^/m ]}}
  split /MBCS-quotee/cgimosx                mb::_split qr{@{[mb::_ignorecase(qr/OO-quotee/mosx)]}}cg
  split /MBCS-quotee/cgmosx                 mb::_split qr{@{[qr/OO-quotee/mosx ]}}cg
  split m/^/                                mb::_split qr{@{[qr/^/m ]}}
  split m/MBCS-quotee/cgimosx               mb::_split qr{@{[mb::_ignorecase(qr/OO-quotee/mosx)]}}cg
  split m/MBCS-quotee/cgmosx                mb::_split qr{@{[qr/OO-quotee/mosx ]}}cg
  split qr/^/                               mb::_split qr{@{[qr/^/m ]}}
  split qr/MBCS-quotee/cgimosx              mb::_split qr{@{[mb::_ignorecase(qr/OO-quotee/mosx)]}}cg
  split qr/MBCS-quotee/cgmosx               mb::_split qr{@{[qr/OO-quotee/mosx ]}}cg
  m:MBCS-quotee:cgimosx                     m{\G${mb::_anchor}@{[mb::_ignorecase(qr`OO-quotee`mosx)]}@{[mb::_m_passed()]}}cg
  m:MBCS-quotee:cgmosx                      m{\G${mb::_anchor}@{[qr`OO-quotee`mosx ]}@{[mb::_m_passed()]}}cg
  s:MBCS-regexp:MBCS-replacement:eegimosxr  s{(\G${mb::_anchor})@{[mb::_ignorecase(qr`OO-regexp`mosx)]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q:OO-replacement:}egr
  s:MBCS-regexp:MBCS-replacement:eegmosxr   s{(\G${mb::_anchor})@{[qr`OO-regexp`mosx ]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q:OO-replacement:}egr
  qr:MBCS-quotee:cgimosx                    qr{\G${mb::_anchor}@{[mb::_ignorecase(qr`OO-quotee`mosx)]}@{[mb::_m_passed()]}}cg
  qr:MBCS-quotee:cgmosx                     qr{\G${mb::_anchor}@{[qr`OO-quotee`mosx ]}@{[mb::_m_passed()]}}cg
  split m:^:                                mb::_split qr{@{[qr`^`m ]}}
  split m:MBCS-quotee:cgimosx               mb::_split qr{@{[mb::_ignorecase(qr`OO-quotee`mosx)]}}cg
  split m:MBCS-quotee:cgmosx                mb::_split qr{@{[qr`OO-quotee`mosx ]}}cg
  split qr:^:                               mb::_split qr{@{[qr`^`m ]}}
  split qr:MBCS-quotee:cgimosx              mb::_split qr{@{[mb::_ignorecase(qr`OO-quotee`mosx)]}}cg
  split qr:MBCS-quotee:cgmosx               mb::_split qr{@{[qr`OO-quotee`mosx ]}}cg
  m@MBCS-quotee@cgimosx                     m{\G${mb::_anchor}@{[mb::_ignorecase(qr`OO-quotee`mosx)]}@{[mb::_m_passed()]}}cg
  m@MBCS-quotee@cgmosx                      m{\G${mb::_anchor}@{[qr`OO-quotee`mosx ]}@{[mb::_m_passed()]}}cg
  s@MBCS-regexp@MBCS-replacement@eegimosxr  s{(\G${mb::_anchor})@{[mb::_ignorecase(qr`OO-regexp`mosx)]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q@OO-replacement@}egr
  s@MBCS-regexp@MBCS-replacement@eegmosxr   s{(\G${mb::_anchor})@{[qr`OO-regexp`mosx ]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q@OO-replacement@}egr
  qr@MBCS-quotee@cgimosx                    qr{\G${mb::_anchor}@{[mb::_ignorecase(qr`OO-quotee`mosx)]}@{[mb::_m_passed()]}}cg
  qr@MBCS-quotee@cgmosx                     qr{\G${mb::_anchor}@{[qr`OO-quotee`mosx ]}@{[mb::_m_passed()]}}cg
  split m@^@                                mb::_split qr{@{[qr`^`m ]}}
  split m@MBCS-quotee@cgimosx               mb::_split qr{@{[mb::_ignorecase(qr`OO-quotee`mosx)]}}cg
  split m@MBCS-quotee@cgmosx                mb::_split qr{@{[qr`OO-quotee`mosx ]}}cg
  split qr@^@                               mb::_split qr{@{[qr`^`m ]}}
  split qr@MBCS-quotee@cgimosx              mb::_split qr{@{[mb::_ignorecase(qr`OO-quotee`mosx)]}}cg
  split qr@MBCS-quotee@cgmosx               mb::_split qr{@{[qr`OO-quotee`mosx ]}}cg
  m#MBCS-quotee#cgimosx                     m{\G${mb::_anchor}@{[mb::_ignorecase(qr#OO-quotee#mosx)]}@{[mb::_m_passed()]}}cg
  m#MBCS-quotee#cgmosx                      m{\G${mb::_anchor}@{[qr#OO-quotee#mosx ]}@{[mb::_m_passed()]}}cg
  s#MBCS-regexp#MBCS-replacement#eegimosxr  s{(\G${mb::_anchor})@{[mb::_ignorecase(qr#OO-regexp#mosx)]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q#OO-replacement#}egr
  s#MBCS-regexp#MBCS-replacement#eegmosxr   s{(\G${mb::_anchor})@{[qr#OO-regexp#mosx ]}@{[mb::_s_passed()]}}{$1 . mb::eval mb::eval q#OO-replacement#}egr
  qr#MBCS-quotee#cgimosx                    qr{\G${mb::_anchor}@{[mb::_ignorecase(qr#OO-quotee#mosx)]}@{[mb::_m_passed()]}}cg
  qr#MBCS-quotee#cgmosx                     qr{\G${mb::_anchor}@{[qr#OO-quotee#mosx ]}@{[mb::_m_passed()]}}cg
  split m#^#                                mb::_split qr{@{[qr#^#m ]}}
  split m#MBCS-quotee#cgimosx               mb::_split qr{@{[mb::_ignorecase(qr#OO-quotee#mosx)]}}cg
  split m#MBCS-quotee#cgmosx                mb::_split qr{@{[qr#OO-quotee#mosx ]}}cg
  split qr#^#                               mb::_split qr{@{[qr#^#m ]}}
  split qr#MBCS-quotee#cgimosx              mb::_split qr{@{[mb::_ignorecase(qr#OO-quotee#mosx)]}}cg
  split qr#MBCS-quotee#cgmosx               mb::_split qr{@{[qr#OO-quotee#mosx ]}}cg
  $`                                        mb::_PREMATCH()
  ${`}                                      mb::_PREMATCH()
  $PREMATCH                                 mb::_PREMATCH()
  ${PREMATCH}                               mb::_PREMATCH()
  ${^PREMATCH}                              mb::_PREMATCH()
  $&                                        mb::_MATCH()
  ${&}                                      mb::_MATCH()
  $MATCH                                    mb::_MATCH()
  ${MATCH}                                  mb::_MATCH()
  ${^MATCH}                                 mb::_MATCH()
  $1                                        mb::_CAPTURE(1)
  $2                                        mb::_CAPTURE(2)
  $3                                        mb::_CAPTURE(3)
  @{^CAPTURE}                               mb::_CAPTURE()
  ${^CAPTURE}[0]                            mb::_CAPTURE(0+1)
  ${^CAPTURE}[1]                            mb::_CAPTURE(1+1)
  ${^CAPTURE}[2]                            mb::_CAPTURE(2+1)
  @-                                        mb::_LAST_MATCH_START()
  @LAST_MATCH_START                         mb::_LAST_MATCH_START()
  @{LAST_MATCH_START}                       mb::_LAST_MATCH_START()
  @{^LAST_MATCH_START}                      mb::_LAST_MATCH_START()
  $-[1]                                     mb::_LAST_MATCH_START(1)
  $LAST_MATCH_START[1]                      mb::_LAST_MATCH_START(1)
  ${LAST_MATCH_START}[1]                    mb::_LAST_MATCH_START(1)
  ${^LAST_MATCH_START}[1]                   mb::_LAST_MATCH_START(1)
  @+                                        mb::_LAST_MATCH_END()
  @LAST_MATCH_END                           mb::_LAST_MATCH_END()
  @{LAST_MATCH_END}                         mb::_LAST_MATCH_END()
  @{^LAST_MATCH_END}                        mb::_LAST_MATCH_END()
  $+[1]                                     mb::_LAST_MATCH_END(1)
  $LAST_MATCH_END[1]                        mb::_LAST_MATCH_END(1)
  ${LAST_MATCH_END}[1]                      mb::_LAST_MATCH_END(1)
  ${^LAST_MATCH_END}[1]                     mb::_LAST_MATCH_END(1)
  "$`"                                      "@{[mb::_PREMATCH()]}"
  "${`}"                                    "@{[mb::_PREMATCH()]}"
  "$PREMATCH"                               "@{[mb::_PREMATCH()]}"
  "${PREMATCH}"                             "@{[mb::_PREMATCH()]}"
  "${^PREMATCH}"                            "@{[mb::_PREMATCH()]}"
  "$&"                                      "@{[mb::_MATCH()]}"
  "${&}"                                    "@{[mb::_MATCH()]}"
  "$MATCH"                                  "@{[mb::_MATCH()]}"
  "${MATCH}"                                "@{[mb::_MATCH()]}"
  "${^MATCH}"                               "@{[mb::_MATCH()]}"
  "$1"                                      "@{[mb::_CAPTURE(1)]}"
  "$2"                                      "@{[mb::_CAPTURE(2)]}"
  "$3"                                      "@{[mb::_CAPTURE(3)]}"
  "@{^CAPTURE}"                             "@{[join $", mb::_CAPTURE()]}"
  "${^CAPTURE}[0]"                          "@{[mb::_CAPTURE(0)]}"
  "${^CAPTURE}[1]"                          "@{[mb::_CAPTURE(1)]}"
  "${^CAPTURE}[2]"                          "@{[mb::_CAPTURE(2)]}"
  "@-"                                      "@{[mb::_LAST_MATCH_START()]}"
  "@LAST_MATCH_START"                       "@{[mb::_LAST_MATCH_START()]}"
  "@{LAST_MATCH_START}"                     "@{[mb::_LAST_MATCH_START()]}"
  "@{^LAST_MATCH_START}"                    "@{[mb::_LAST_MATCH_START()]}"
  "$-[1]"                                   "@{[mb::_LAST_MATCH_START(1)]}"
  "$LAST_MATCH_START[1]"                    "@{[mb::_LAST_MATCH_START(1)]}"
  "${LAST_MATCH_START}[1]"                  "@{[mb::_LAST_MATCH_START(1)]}"
  "${^LAST_MATCH_START}[1]"                 "@{[mb::_LAST_MATCH_START(1)]}"
  "@+"                                      "@{[mb::_LAST_MATCH_END()]}"
  "@LAST_MATCH_END"                         "@{[mb::_LAST_MATCH_END()]}"
  "@{LAST_MATCH_END}"                       "@{[mb::_LAST_MATCH_END()]}"
  "@{^LAST_MATCH_END}"                      "@{[mb::_LAST_MATCH_END()]}"
  "$+[1]"                                   "@{[mb::_LAST_MATCH_END(1)]}"
  "$LAST_MATCH_END[1]"                      "@{[mb::_LAST_MATCH_END(1)]}"
  "${LAST_MATCH_END}[1]"                    "@{[mb::_LAST_MATCH_END(1)]}"
  "${^LAST_MATCH_END}[1]"                   "@{[mb::_LAST_MATCH_END(1)]}"
  v1.20.300.4000                            mb::chr(1).mb::chr(20).mb::chr(300).mb::chr(4000)
  1.20.300.4000                             mb::chr(1).mb::chr(20).mb::chr(300).mb::chr(4000)
  v1234=>''                                 v1234=>''
  v1234                                     mb::chr(1234)
  -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

The transpile-list below is primarily for Microsoft Windows, but it also applies when run on other operating systems to ensure commonality. Even on Perl 5.00503, you can stack file test operators: -r -w -f $file works as -f $file && -w _ && -r _.
  -----------------------------------------------------------------------------
  in your script        script transpiled by this software
  -----------------------------------------------------------------------------
  chdir                 mb::_chdir
  opendir(DIR,'dir')    mb::_opendir(\*DIR,'dir')
  opendir DIR,'dir'     mb::_opendir \*DIR,'dir'
  opendir($dh,'dir')    mb::_opendir($dh,'dir')
  opendir $dh,'dir'     mb::_opendir $dh,'dir'
  unlink                mb::_unlink
  lstat()               mb::_lstat()
  lstat('a')            mb::_lstat('a')
  lstat("a")            mb::_lstat("a")
  lstat(`a`)            mb::_lstat(`a`)
  lstat(m/a/)           mb::_lstat(m{\G${mb::_anchor}@{[qr/a/ ]}@{[mb::_m_passed()]}})
  lstat(q/a/)           mb::_lstat(q/a/)
  lstat(qq/a/)          mb::_lstat(qq/a/)
  lstat(qr/a/)          mb::_lstat(qr{\G${mb::_anchor}@{[qr/a/ ]}@{[mb::_m_passed()]}})
  lstat(qw/a/)          mb::_lstat(qw/a/)
  lstat(qx/a/)          mb::_lstat(qx/a/)
  lstat(s/a/b/)         mb::_lstat(s{(\G${mb::_anchor})@{[qr/a/ ]}@{[mb::_s_passed()]}}{$1 . qq /b/}e)
  lstat(tr/a/b/)        mb::_lstat(s{(\G${mb::_anchor})((?=[a])@{mb::_dot})}{$1.mb::tr($2,q/a/,q/b/,'r')}sge)
  lstat(y/a/b/)         mb::_lstat(s{(\G${mb::_anchor})((?=[a])@{mb::_dot})}{$1.mb::tr($2,q/a/,q/b/,'r')}sge)
  lstat($fh)            mb::_lstat($fh)
  lstat(FILE)           mb::_lstat(\*FILE)
  lstat(_)              mb::_lstat(\*_)
  lstat $fh             mb::_lstat $fh
  lstat FILE            mb::_lstat \*FILE
  lstat _               mb::_lstat \*_
  lstat                 mb::_lstat
  stat()                mb::_stat()
  stat('a')             mb::_stat('a')
  stat("a")             mb::_stat("a")
  stat(`a`)             mb::_stat(`a`)
  stat(m/a/)            mb::_stat(m{\G${mb::_anchor}@{[qr/a/ ]}@{[mb::_m_passed()]}})
  stat(q/a/)            mb::_stat(q/a/)
  stat(qq/a/)           mb::_stat(qq/a/)
  stat(qr/a/)           mb::_stat(qr{\G${mb::_anchor}@{[qr/a/ ]}@{[mb::_m_passed()]}})
  stat(qw/a/)           mb::_stat(qw/a/)
  stat(qx/a/)           mb::_stat(qx/a/)
  stat(s/a/b/)          mb::_stat(s{(\G${mb::_anchor})@{[qr/a/ ]}@{[mb::_s_passed()]}}{$1 . qq /b/}e)
  stat(tr/a/b/)         mb::_stat(s{(\G${mb::_anchor})((?=[a])@{mb::_dot})}{$1.mb::tr($2,q/a/,q/b/,'r')}sge)
  stat(y/a/b/)          mb::_stat(s{(\G${mb::_anchor})((?=[a])@{mb::_dot})}{$1.mb::tr($2,q/a/,q/b/,'r')}sge)
  stat($fh)             mb::_stat($fh)
  stat(FILE)            mb::_stat(\*FILE)
  stat(_)               mb::_stat(\*_)
  stat $fh              mb::_stat $fh
  stat FILE             mb::_stat \*FILE
  stat _                mb::_stat \*_
  stat                  mb::_stat
  -A $fh                mb::_filetest [qw( -A)], $fh
  -A 'file'             mb::_filetest [qw( -A)], 'file'
  -A FILE               mb::_filetest [qw( -A )], \*FILE
  -A _                  mb::_filetest [qw( -A )], \*_
  -A qq{file}           mb::_filetest [qw( -A )], qq{file}
  -B $fh                mb::_filetest [qw( -B)], $fh
  -B 'file'             mb::_filetest [qw( -B)], 'file'
  -B FILE               mb::_filetest [qw( -B )], \*FILE
  -B _                  mb::_filetest [qw( -B )], \*_
  -B qq{file}           mb::_filetest [qw( -B )], qq{file}
  -C $fh                mb::_filetest [qw( -C)], $fh
  -C 'file'             mb::_filetest [qw( -C)], 'file'
  -C FILE               mb::_filetest [qw( -C )], \*FILE
  -C _                  mb::_filetest [qw( -C )], \*_
  -C qq{file}           mb::_filetest [qw( -C )], qq{file}
  -M $fh                mb::_filetest [qw( -M)], $fh
  -M 'file'             mb::_filetest [qw( -M)], 'file'
  -M FILE               mb::_filetest [qw( -M )], \*FILE
  -M _                  mb::_filetest [qw( -M )], \*_
  -M qq{file}           mb::_filetest [qw( -M )], qq{file}
  -O $fh                mb::_filetest [qw( -O)], $fh
  -O 'file'             mb::_filetest [qw( -O)], 'file'
  -O FILE               mb::_filetest [qw( -O )], \*FILE
  -O _                  mb::_filetest [qw( -O )], \*_
  -O qq{file}           mb::_filetest [qw( -O )], qq{file}
  -R $fh                mb::_filetest [qw( -R)], $fh
  -R 'file'             mb::_filetest [qw( -R)], 'file'
  -R FILE               mb::_filetest [qw( -R )], \*FILE
  -R _                  mb::_filetest [qw( -R )], \*_
  -R qq{file}           mb::_filetest [qw( -R )], qq{file}
  -S $fh                mb::_filetest [qw( -S)], $fh
  -S 'file'             mb::_filetest [qw( -S)], 'file'
  -S FILE               mb::_filetest [qw( -S )], \*FILE
  -S _                  mb::_filetest [qw( -S )], \*_
  -S qq{file}           mb::_filetest [qw( -S )], qq{file}
  -T $fh                mb::_filetest [qw( -T)], $fh
  -T 'file'             mb::_filetest [qw( -T)], 'file'
  -T FILE               mb::_filetest [qw( -T )], \*FILE
  -T _                  mb::_filetest [qw( -T )], \*_
  -T qq{file}           mb::_filetest [qw( -T )], qq{file}
  -W $fh                mb::_filetest [qw( -W)], $fh
  -W 'file'             mb::_filetest [qw( -W)], 'file'
  -W FILE               mb::_filetest [qw( -W )], \*FILE
  -W _                  mb::_filetest [qw( -W )], \*_
  -W qq{file}           mb::_filetest [qw( -W )], qq{file}
  -X $fh                mb::_filetest [qw( -X)], $fh
  -X 'file'             mb::_filetest [qw( -X)], 'file'
  -X FILE               mb::_filetest [qw( -X )], \*FILE
  -X _                  mb::_filetest [qw( -X )], \*_
  -X qq{file}           mb::_filetest [qw( -X )], qq{file}
  -b $fh                mb::_filetest [qw( -b)], $fh
  -b 'file'             mb::_filetest [qw( -b)], 'file'
  -b FILE               mb::_filetest [qw( -b )], \*FILE
  -b _                  mb::_filetest [qw( -b )], \*_
  -b qq{file}           mb::_filetest [qw( -b )], qq{file}
  -c $fh                mb::_filetest [qw( -c)], $fh
  -c 'file'             mb::_filetest [qw( -c)], 'file'
  -c FILE               mb::_filetest [qw( -c )], \*FILE
  -c _                  mb::_filetest [qw( -c )], \*_
  -c qq{file}           mb::_filetest [qw( -c )], qq{file}
  -d $fh                mb::_filetest [qw( -d)], $fh
  -d 'file'             mb::_filetest [qw( -d)], 'file'
  -d FILE               mb::_filetest [qw( -d )], \*FILE
  -d _                  mb::_filetest [qw( -d )], \*_
  -d qq{file}           mb::_filetest [qw( -d )], qq{file}
  -e $fh                mb::_filetest [qw( -e)], $fh
  -e 'file'             mb::_filetest [qw( -e)], 'file'
  -e FILE               mb::_filetest [qw( -e )], \*FILE
  -e _                  mb::_filetest [qw( -e )], \*_
  -e qq{file}           mb::_filetest [qw( -e )], qq{file}
  -f $fh                mb::_filetest [qw( -f)], $fh
  -f 'file'             mb::_filetest [qw( -f)], 'file'
  -f FILE               mb::_filetest [qw( -f )], \*FILE
  -f _                  mb::_filetest [qw( -f )], \*_
  -f qq{file}           mb::_filetest [qw( -f )], qq{file}
  -g $fh                mb::_filetest [qw( -g)], $fh
  -g 'file'             mb::_filetest [qw( -g)], 'file'
  -g FILE               mb::_filetest [qw( -g )], \*FILE
  -g _                  mb::_filetest [qw( -g )], \*_
  -g qq{file}           mb::_filetest [qw( -g )], qq{file}
  -k $fh                mb::_filetest [qw( -k)], $fh
  -k 'file'             mb::_filetest [qw( -k)], 'file'
  -k FILE               mb::_filetest [qw( -k )], \*FILE
  -k _                  mb::_filetest [qw( -k )], \*_
  -k qq{file}           mb::_filetest [qw( -k )], qq{file}
  -l $fh                mb::_filetest [qw( -l)], $fh
  -l 'file'             mb::_filetest [qw( -l)], 'file'
  -l FILE               mb::_filetest [qw( -l )], \*FILE
  -l _                  mb::_filetest [qw( -l )], \*_
  -l qq{file}           mb::_filetest [qw( -l )], qq{file}
  -o $fh                mb::_filetest [qw( -o)], $fh
  -o 'file'             mb::_filetest [qw( -o)], 'file'
  -o FILE               mb::_filetest [qw( -o )], \*FILE
  -o _                  mb::_filetest [qw( -o )], \*_
  -o qq{file}           mb::_filetest [qw( -o )], qq{file}
  -p $fh                mb::_filetest [qw( -p)], $fh
  -p 'file'             mb::_filetest [qw( -p)], 'file'
  -p FILE               mb::_filetest [qw( -p )], \*FILE
  -p _                  mb::_filetest [qw( -p )], \*_
  -p qq{file}           mb::_filetest [qw( -p )], qq{file}
  -r $fh                mb::_filetest [qw( -r)], $fh
  -r 'file'             mb::_filetest [qw( -r)], 'file'
  -r -w -f $fh          mb::_filetest [qw( -r -w -f)], $fh
  -r -w -f 'file'       mb::_filetest [qw( -r -w -f)], 'file'
  -r -w -f FILE         mb::_filetest [qw( -r -w -f )], \*FILE
  -r -w -f _            mb::_filetest [qw( -r -w -f )], \*_
  -r -w -f qq{file}     mb::_filetest [qw( -r -w -f )], qq{file}
  -r FILE               mb::_filetest [qw( -r )], \*FILE
  -r _                  mb::_filetest [qw( -r )], \*_
  -r qq{file}           mb::_filetest [qw( -r )], qq{file}
  -s $fh                mb::_filetest [qw( -s)], $fh
  -s 'file'             mb::_filetest [qw( -s)], 'file'
  -s FILE               mb::_filetest [qw( -s )], \*FILE
  -s _                  mb::_filetest [qw( -s )], \*_
  -s qq{file}           mb::_filetest [qw( -s )], qq{file}
  -t $fh                mb::_filetest [qw( -t)], $fh
  -t 'file'             mb::_filetest [qw( -t)], 'file'
  -t FILE               mb::_filetest [qw( -t )], \*FILE
  -t _                  mb::_filetest [qw( -t )], \*_
  -t qq{file}           mb::_filetest [qw( -t )], qq{file}
  -u $fh                mb::_filetest [qw( -u)], $fh
  -u 'file'             mb::_filetest [qw( -u)], 'file'
  -u FILE               mb::_filetest [qw( -u )], \*FILE
  -u _                  mb::_filetest [qw( -u )], \*_
  -u qq{file}           mb::_filetest [qw( -u )], qq{file}
  -w $fh                mb::_filetest [qw( -w)], $fh
  -w 'file'             mb::_filetest [qw( -w)], 'file'
  -w FILE               mb::_filetest [qw( -w )], \*FILE
  -w _                  mb::_filetest [qw( -w )], \*_
  -w qq{file}           mb::_filetest [qw( -w )], qq{file}
  -x $fh                mb::_filetest [qw( -x)], $fh
  -x 'file'             mb::_filetest [qw( -x)], 'file'
  -x FILE               mb::_filetest [qw( -x )], \*FILE
  -x _                  mb::_filetest [qw( -x )], \*_
  -x qq{file}           mb::_filetest [qw( -x )], qq{file}
  -z $fh                mb::_filetest [qw( -z)], $fh
  -z 'file'             mb::_filetest [qw( -z)], 'file'
  -z FILE               mb::_filetest [qw( -z )], \*FILE
  -z _                  mb::_filetest
[qw( -z )], \*_ -z qq{file} mb::_filetest [qw( -z )], qq{file} ----------------------------------------------------------------------------- Each elements in strings or regular expressions that are double-quote like are transpiled as follows. ----------------------------------------------------------------------------------------------- in your script script transpiled by this software ----------------------------------------------------------------------------------------------- "\L\u MBCS-quotee \E\E" "@{[mb::ucfirst(qq<@{[mb::lc(qq< OO-quotee >)]}>)]}" "\U\l MBCS-quotee \E\E" "@{[mb::lcfirst(qq<@{[mb::uc(qq< OO-quotee >)]}>)]}" "\L MBCS-quotee \E" "@{[mb::lc(qq< OO-quotee >)]}" "\U MBCS-quotee \E" "@{[mb::uc(qq< OO-quotee >)]}" "\l MBCS-quotee \E" "@{[mb::lcfirst(qq< OO-quotee >)]}" "\u MBCS-quotee \E" "@{[mb::ucfirst(qq< OO-quotee >)]}" "\Q MBCS-quotee \E" "@{[quotemeta(qq< OO-quotee >)]}" ----------------------------------------------------------------------------------------------- Each elements in regular expressions are transpiled as follows. ---------------------------------------------------------------------------------------------------------------------- in your script script transpiled by this software (on sjis encoding) ---------------------------------------------------------------------------------------------------------------------- qr'.' 
qr{\G${mb::_anchor}@{[qr'(?:(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|.)' ]}@{[mb::_m_passed()]}} qr'\B' qr{\G${mb::_anchor}@{[qr'(?:(?<![ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_])(?![ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_])|(?<=[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_])(?=[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_]))' ]}@{[mb::_m_passed()]}} qr'\D' qr{\G${mb::_anchor}@{[qr'(?:(?![0123456789])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'\H' qr{\G${mb::_anchor}@{[qr'(?:(?![\x09\x20])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'\N' qr{\G${mb::_anchor}@{[qr'(?:(?!\n)(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'\R' qr{\G${mb::_anchor}@{[qr'(?>\r\n|[\x0A\x0B\x0C\x0D])' ]}@{[mb::_m_passed()]}} qr'\S' qr{\G${mb::_anchor}@{[qr'(?:(?![\t\n\f\r\x20])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'\V' qr{\G${mb::_anchor}@{[qr'(?:(?![\x0A\x0B\x0C\x0D])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'\W' qr{\G${mb::_anchor}@{[qr'(?:(?![ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'\b' qr{\G${mb::_anchor}@{[qr'(?:(?<![ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_])(?=[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_])|(?<=[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_])(?![ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_]))' ]}@{[mb::_m_passed()]}} qr'\d' qr{\G${mb::_anchor}@{[qr'[0123456789]' ]}@{[mb::_m_passed()]}} qr'\h' qr{\G${mb::_anchor}@{[qr'[\x09\x20]' ]}@{[mb::_m_passed()]}} qr'\s' qr{\G${mb::_anchor}@{[qr'[\t\n\f\r\x20]' 
]}@{[mb::_m_passed()]}} qr'\v' qr{\G${mb::_anchor}@{[qr'[\x0A\x0B\x0C\x0D]' ]}@{[mb::_m_passed()]}} qr'\w' qr{\G${mb::_anchor}@{[qr'[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_]' ]}@{[mb::_m_passed()]}} qr'[\b]' qr{\G${mb::_anchor}@{[qr'(?:(?=[\x08])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:alnum:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=[\x30-\x39\x41-\x5A\x61-\x7A])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:alpha:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=[\x41-\x5A\x61-\x7A])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:ascii:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=[\x00-\x7F])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:blank:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=[\x09\x20])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:cntrl:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=[\x00-\x1F\x7F])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:digit:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=[\x30-\x39])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:graph:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=[\x21-\x7F])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:lower:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=[abcdefghijklmnopqrstuvwxyz])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:print:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=[\x20-\x7F])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:punct:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=[\x21-\x2F\x3A-\x3F\x40\x5B-\x5F\x60\x7B-\x7E])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} 
qr'[[:space:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=[\s\x0B])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:upper:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=[ABCDEFGHIJKLMNOPQRSTUVWXYZ])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:word:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=[\x30-\x39\x41-\x5A\x5F\x61-\x7A])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:xdigit:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=[\x30-\x39\x41-\x46\x61-\x66])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:^alnum:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x30-\x39\x41-\x5A\x61-\x7A])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:^alpha:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x41-\x5A\x61-\x7A])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:^ascii:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x00-\x7F])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:^blank:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x09\x20])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:^cntrl:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x00-\x1F\x7F])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:^digit:]]' 
qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x30-\x39])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:^graph:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x21-\x7F])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:^lower:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![abcdefghijklmnopqrstuvwxyz])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:^print:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x20-\x7F])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:^punct:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x21-\x2F\x3A-\x3F\x40\x5B-\x5F\x60\x7B-\x7E])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:^space:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\s\x0B])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:^upper:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![ABCDEFGHIJKLMNOPQRSTUVWXYZ])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:^word:]]' qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x30-\x39\x41-\x5A\x5F\x61-\x7A])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr'[[:^xdigit:]]' 
qr{\G${mb::_anchor}@{[qr'(?:(?=(?:(?![\x30-\x39\x41-\x46\x61-\x66])(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F]))))(?^:(?>(?>[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x80-\xFF])|[\x00-\x7F])))' ]}@{[mb::_m_passed()]}} qr/./ qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_dot]})/ ]}@{[mb::_m_passed()]}} qr/\B/ qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_B]})/ ]}@{[mb::_m_passed()]}} qr/\D/ qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_D]})/ ]}@{[mb::_m_passed()]}} qr/\H/ qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_H]})/ ]}@{[mb::_m_passed()]}} qr/\N/ qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_N]})/ ]}@{[mb::_m_passed()]}} qr/\R/ qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_R]})/ ]}@{[mb::_m_passed()]}} qr/\S/ qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_S]})/ ]}@{[mb::_m_passed()]}} qr/\V/ qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_V]})/ ]}@{[mb::_m_passed()]}} qr/\W/ qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_W]})/ ]}@{[mb::_m_passed()]}} qr/\b/ qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_b]})/ ]}@{[mb::_m_passed()]}} qr/\d/ qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_d]})/ ]}@{[mb::_m_passed()]}} qr/\h/ qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_h]})/ ]}@{[mb::_m_passed()]}} qr/\s/ qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_s]})/ ]}@{[mb::_m_passed()]}} qr/\v/ qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_v]})/ ]}@{[mb::_m_passed()]}} qr/\w/ qr{\G${mb::_anchor}@{[qr/(?:@{[@mb::_w]})/ ]}@{[mb::_m_passed()]}} qr/[\b]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[\\b])]})/ ]}@{[mb::_m_passed()]}} qr/[[:alnum:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:alnum:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:alpha:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:alpha:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:ascii:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:ascii:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:blank:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:blank:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:cntrl:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:cntrl:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:digit:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:digit:]])]})/ 
]}@{[mb::_m_passed()]}} qr/[[:graph:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:graph:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:lower:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:lower:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:print:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:print:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:punct:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:punct:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:space:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:space:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:upper:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:upper:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:word:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:word:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:xdigit:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:xdigit:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:^alnum:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^alnum:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:^alpha:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^alpha:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:^ascii:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^ascii:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:^blank:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^blank:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:^cntrl:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^cntrl:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:^digit:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^digit:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:^graph:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^graph:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:^lower:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^lower:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:^print:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^print:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:^punct:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^punct:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:^space:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^space:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:^upper:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^upper:]])]})/ 
]}@{[mb::_m_passed()]}} qr/[[:^word:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^word:]])]})/ ]}@{[mb::_m_passed()]}} qr/[[:^xdigit:]]/ qr{\G${mb::_anchor}@{[qr/(?:@{[mb::_cc(qq[[:^xdigit:]])]})/ ]}@{[mb::_m_passed()]}} ----------------------------------------------------------------------------------------------------------------------
The mb.pm modulino requires perl 5.005_03 or later and the 'strict' module. On perl 5.6 or later, it also requires the 'warnings' module.
You can work around the following bugs with small hacks.
chdir() on Microsoft Windows
The chdir() function does not work if the path ends with chr(0x5C).
This problem is specific to Microsoft Windows. It is not caused by the mb.pm modulino or the perl interpreter.

  # chdir.pl
  mkdir((qw( `/ ))[0], 0777);
  print "got=", chdir((qw( `/ ))[0]), " cwd=", `cd`;

  C:\HOME>perl5.00503.exe chdir.pl
  GOOD ==> got=1 cwd=C:\HOME\`/

  C:\HOME>strawberry-perl-5.8.9.5.exe chdir.pl
  BAD ==> got=1 cwd=C:\HOME
This is a lost technology in this century.
  # suggested module name
  use mb::WinDir; # supports all MBCS on Microsoft Windows
  my $wd = mb::WinDir->new('`/');
  $wd->chdir('..');
  $wd->open(my $fh, ...);
Look-behind Assertion
Look-behind assertions such as (?<=[A-Z]) or (?<![A-Z]) are not prevented from matching the trail octet of the previous MBCS codepoint.
Please give us your good hack on this.
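The effect can be reproduced in plain byte-oriented Perl. The following is only a sketch, assuming Shift_JIS, where the katakana letter A is encoded as the two octets 0x83 0x41; the variable names are illustrative and not part of mb.pm.

```perl
use strict;

# Shift_JIS katakana "A" is the two octets 0x83 0x41. The trail octet
# 0x41 has the same value as US-ASCII "A".
my $sjis_text = "\x83\x41x";

# The look-behind inspects a single octet, so it accepts the trail octet
# of the katakana as if it were an uppercase US-ASCII letter.
my $false_hit = ($sjis_text =~ /(?<=[A-Z])x/) ? 1 : 0;

# A genuine US-ASCII "A" before "x" is matched as intended.
my $true_hit = ("Ax" =~ /(?<=[A-Z])x/) ? 1 : 0;

print "false_hit=$false_hit true_hit=$true_hit\n";
```

Both results are 1, which is why a look-behind on an US-ASCII class should be used with care when the preceding codepoint may be multibyte.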
Empty Variable in Regular Expression
An empty literal string used as a regexp means the empty string. Unlike original Perl, if the pattern is an empty string, the last successfully matched regexp is NOT used. Similarly, an empty string produced by an interpolated variable also means the empty string.
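For comparison, the sketch below shows the behavior of original Perl that the paragraph above contrasts with. It runs under plain perl without mb.pm; the variable names are illustrative only.

```perl
use strict;

my $text  = 'abc';
my $empty = '';

# Make /z/ the last successfully matched pattern.
'xyz' =~ /z/ or die "setup match failed";

# Original Perl: the pattern is empty after interpolation, so /z/ is
# reused, and /z/ does not match 'abc'.
my $reused = ($text =~ /$empty/) ? 1 : 0;

# A pattern that is not empty after interpolation, such as (?:), matches
# the empty string in original Perl as well.
my $literal_empty = ($text =~ /(?:$empty)/) ? 1 : 0;

print "reused=$reused literal_empty=$literal_empty\n";
```

A script that must behave the same way under original Perl may therefore prefer the (?:...) form when a variable can be empty. How mb.pm transpiles that form is not covered by this sketch.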
The following is a description of the minor incompatibilities. These are not likely to be programming constraints.
Hyphen of tr/// Supports US-ASCII Only
Hyphenated ranges in tr/// and y/// support US-ASCII only.
Special Variables $` and $& need m/( Capture All )/
If you use the special variables $` or $&, you must enclose the entire regular expression in parentheses, because $` and $& are implemented using $1.

  -----------------------------------------------------------------------------------------------------
  in your script   after m//, works as                      after s///, works as
  -----------------------------------------------------------------------------------------------------
  $`               CORE::substr($&, 0, -CORE::length($1))   $1
  ${`}             CORE::substr($&, 0, -CORE::length($1))   $1
  $PREMATCH        CORE::substr($&, 0, -CORE::length($1))   $1
  ${^PREMATCH}     CORE::substr($&, 0, -CORE::length($1))   $1
  $&               $1                                       CORE::substr($&, CORE::length($1))
  ${&}             $1                                       CORE::substr($&, CORE::length($1))
  $MATCH           $1                                       CORE::substr($&, CORE::length($1))
  ${^MATCH}        $1                                       CORE::substr($&, CORE::length($1))
  -----------------------------------------------------------------------------------------------------

In the past, Perl scripts that used the special variables $` and $& suffered from slow execution. Both in that era and today, capturing with parentheses works well.
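As a sketch of the required style, the pattern below is enclosed in one outer pair of parentheses. The example is plain Perl with illustrative variable names; under mb.pm the special variables are rewritten as the table in this section describes.

```perl
use strict;

my $line = 'key=value';
my ($before, $matched) = ('', '');

# The whole pattern sits inside one outer pair of parentheses, so the
# complete match is also available as $1.
if ($line =~ /(=\w+)/) {
    $before  = $`;   # text before the match
    $matched = $&;   # the matched text, same as $1 here
}

print "before=$before matched=$matched\n";
```

Note that the outer parentheses shift the numbering of any inner capture groups by one.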
Return Value from tr///s
The tr/// (or y///) operator with the /s modifier always returns 1. If you need the correct number, use mb::tr().

  $var1 = 'AAA';
  $got = $var1 =~ tr/A/1/s;
  # works as $got = $var1 =~ s{[\x00-\xFF]*}{mb::tr($&,q/A/,q/1/,'sr')}e;
  BAD ==> got 1

  $var2 = 'BBB';
  $got = $var2 =~ tr/A/1/s;
  # works as $got = $var2 =~ s{[\x00-\xFF]*}{mb::tr($&,q/A/,q/1/,'sr')}e;
  BAD ==> got 1

  $var3 = 'AAA';
  $got = mb::tr($var3,'A','1','s');
  # works as $got = mb::tr($var3,'A','1','s');
  GOOD ==> got 3

Transliteration routine

  $return = mb::tr($MBCS_string, $searchlist, $replacementlist, $modifier);
  $return = mb::tr($MBCS_string, $searchlist, $replacementlist);

This subroutine is a runtime routine that implements the tr/// operator for MBCS codepoints. It scans $MBCS_string by codepoint and replaces all occurrences of the codepoints found in $searchlist with the corresponding codepoints in $replacementlist. It returns the number of codepoints replaced or deleted, except when the /s modifier is used.

The values of $modifier are:

  ---------------------------------------------------------------------------
  Modifier   Meaning
  ---------------------------------------------------------------------------
  c          Complement $searchlist.
  d          Delete found but unreplaced characters.
  s          Squash duplicate replaced characters.
  r          Return transliteration and leave the original string untouched.
  ---------------------------------------------------------------------------

To use with a read-only value without raising an exception, use the /r modifier.

  print mb::tr('bookkeeper','boep','peob','r'); # prints 'peekkoobor'
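The idea of scanning by codepoint can be sketched as follows. This is not the mb::tr implementation: sketch_tr and $SJIS_CODEPOINT are hypothetical names, the codepoint pattern assumes Shift_JIS lead octets 0x81-0x9F and 0xE0-0xFC, and no modifiers are handled (the result is returned as with 'r').

```perl
use strict;

# Assumption: one Shift_JIS codepoint is a lead octet plus any octet,
# or otherwise a single octet.
my $SJIS_CODEPOINT = qr/[\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x00-\xFF]/;

# Hypothetical helper, not part of mb.pm: transliterate by codepoint and
# return the new string, leaving the original untouched.
sub sketch_tr {
    my ($string, $searchlist, $replacementlist) = @_;

    my @search  = $searchlist      =~ /\G($SJIS_CODEPOINT)/g;
    my @replace = $replacementlist =~ /\G($SJIS_CODEPOINT)/g;
    @replace = @search unless @replace;   # empty list: map each to itself

    my %map;
    for my $i (0 .. $#search) {
        next if exists $map{ $search[$i] };   # first occurrence wins
        # A short replacement list repeats its last codepoint.
        $map{ $search[$i] } = $i <= $#replace ? $replace[$i] : $replace[-1];
    }

    my $result = '';
    for my $codepoint ($string =~ /\G($SJIS_CODEPOINT)/g) {
        $result .= exists $map{$codepoint} ? $map{$codepoint} : $codepoint;
    }
    return $result;
}

print sketch_tr('bookkeeper', 'boep', 'peob'), "\n";   # prints 'peekkoobor'
```

Because the string is split into codepoints first, a trail octet such as 0x41 inside a two-octet codepoint is never mistaken for US-ASCII "A".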
mb::substr as Lvalue
If the perl version is older than 5.14, mb::substr differs from CORE::substr and cannot be used as an lvalue. To change part of a string, you need to use the optional fourth argument, which is the replacement string.
mb::substr($string, 13, 4, "JPerl");
If you use perl 5.14 or later, you can use the lvalue feature.
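What a replacement by codepoint offset means can be sketched as follows. sketch_substr_replace is a hypothetical helper, not mb::substr itself; it assumes Shift_JIS octets and non-negative offsets inside the string, and it returns a new string instead of modifying its argument.

```perl
use strict;

# Hypothetical helper, not part of mb.pm: replace $length codepoints
# starting at codepoint offset $offset, assuming Shift_JIS octets.
sub sketch_substr_replace {
    my ($string, $offset, $length, $replacement) = @_;

    # Split into codepoints: lead octet plus any octet, or a single octet.
    my @codepoints =
        $string =~ /\G([\x81-\x9F\xE0-\xFC][\x00-\xFF]|[\x00-\xFF])/g;

    splice(@codepoints, $offset, $length, $replacement);
    return join '', @codepoints;
}

my $sjis = "\x83\x41\x83\x43abc";   # two katakana, then "abc"
my $new  = sketch_substr_replace($sjis, 1, 2, 'XY');
```

Here offset 1 and length 2 cover the second katakana and "a", so five octets of input are addressed as two codepoints.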
The mb.pm modulino does not support the following features. In our experience with JPerl, these features are rarely needed. Moreover, implementing them would require a large amount of code that would need frequent updates. If they are ever implemented, it is better to implement them as separate modules.
Delimiter of String and Regexp
qq//, q//, qw//, qx//, qr//, m//, s///, tr///, and y/// can't use a wide codepoint as the delimiter. I didn't implement this feature because it's rarely needed.
fc(), lc(), lcfirst(), uc(), and ucfirst()
fc() is not supported. lc(), lcfirst(), uc(), and ucfirst() support US-ASCII only.

  # suggested module name
  use mb::Casing; # supports all MBCS, including UTF-8
  my $lc_string = mb::Casing::lc($string);
  my $lcfirst_string = mb::Casing::lcfirst($string);
  my $uc_string = mb::Casing::uc($string);
  my $ucfirst_string = mb::Casing::ucfirst($string);
  my $fc_string = mb::Casing::fc($string);
Cloister of Regular Expression
The cloisters (?i) and (?i:...) of regular expressions in the big5, big5hkscs, gb18030, gbk, sjis, and uhc encodings will not be implemented for the time being. I didn't implement this feature because it was difficult to implement and rarely necessary. If you're interested in this issue, feel free to take up the challenge.
Named Codepoint
A named codepoint, such as \N{GREEK SMALL LETTER EPSILON}, \N{greek:epsilon}, or \N{epsilon}, is not supported.

  # suggested module name
  use mb::Charnames qw( %N ); # supports all MBCS, including UTF-8
  print "$N{'GREEK SMALL LETTER EPSILON'}";

  # By the way, you know how great it is to be able to write MBCS literal strings in your Perl scripts, right?
Unicode Properties (aka Codepoint Properties) of Regular Expression
Unicode properties (aka codepoint properties) of regexp are not available. The (?[]) construct in regexp of perl 5.18 is not available either. There are currently no plans to support these.

  # suggested module name
  use mb::RegExp::Properties qw( %p %P ); # supports all MBCS, including UTF-8
  $string =~ /$p{Uppercase}/;
This feature (\p{prop} and \P{prop}) is not stable in the Perl specification. Thus, this feature is not available in scripts that require long-term maintenance.
For example, [:alpha:]

  at Perl 5.005    (not supported)
  at Perl 5.6      \p{IsAlpha}
  at Perl 5.12.1   \p{PosixAlpha}, and \p{Alpha}
  at Perl 5.14     \p{X_POSIX_Alpha}, \p{POSIX_Alpha}, \p{XPosixAlpha}, and \p{PosixAlpha}
\b{...} \B{...} Boundaries in Regular Expressions
The following \b{...} and \B{...} boundaries, available starting in Perl 5.22, are not supported.

  \b{gcb} or \b{g}   Unicode "Grapheme Cluster Boundary"
  \b{sb}             Unicode "Sentence Boundary"
  \b{wb}             Unicode "Word Boundary"
  \B{gcb} or \B{g}   Unicode "Grapheme Cluster Boundary" doesn't match
  \B{sb}             Unicode "Sentence Boundary" doesn't match
  \B{wb}             Unicode "Word Boundary" doesn't match

  # suggested module name
  use mb::RegExp::Boundaries qw( %b %B ); # supports all MBCS, including UTF-8
  $string =~ /$b{wb}(.+)$b{wb}/;

This feature (\b{...} and \B{...}) is considered not yet stable in the Perl specification.
Modifier /a /d /l and /u of Regular Expression
I have removed these modifiers to spare you a headache. The concept of this software is not to use two or more encoding methods for literal strings and regexp literals in one Perl script. Therefore, the modifiers /a, /d, /l, and /u are not supported. \d means [0-9] universally.
?? and m?? are Not Supported
A multibyte character needs enclosing ( ) before the quantifiers {n,m}, {n,}, {n}, *, and + in ?? or m??. As a result, you need to rewrite the parts of a script that refer to $1, $2, $3, and so on. You cannot use (?: ), ?, {n,m}?, {n,}?, and {n}? in ?? and m??, because the delimiter of m?? is '?'. Here is a quote from Dan Kogai-san: "I'm just a programmer, so I can't fix the bug of the spec."
format
Unlike JPerl, the mb.pm modulino does not support the format feature, because it is difficult to implement and you can write the same script in other ways.
Limitation of Regular Expression
This software has a limitation that comes from using \G for multibyte anchoring. Only perl 5.30.0 or later can apply a regular expression to a codepoint string that exceeds 65534 octets, and only perl 5.10.1 or later can do so for a string that exceeds 32766 octets.

see also,

  The upper limit "n" specifiable in a regular expression quantifier of the form "{m,n}" has been doubled to 65534
  https://metacpan.org/pod/release/XSAWYERX/perl-5.30.0/pod/perldelta.pod#The-upper-limit-%22n%22-specifiable-in-a-regular-expression-quantifier-of-the-form-%22%7Bm,n%7D%22-has-been-doubled-to-65534

  In 5.10.0, the * quantifier in patterns was sometimes treated as {0,32767}
  http://perldoc.perl.org/perl5101delta.html

  [perl #116379] \G can't treat over 32767 octet
  http://www.nntp.perl.org/group/perl.perl5.porters/2013/01/msg197320.html

  perlre - Perl regular expressions
  http://perldoc.perl.org/perlre.html

  perlre length limit
  http://stackoverflow.com/questions/4592467/perlre-length-limit
Everything in this world has limits. If you use perl 5.10 or later, or perl 5.30 or later, you can increase those limits. That's all.
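A script that may receive long strings could check the running perl before matching. The sketch below only encodes the two thresholds stated in this section; sketch_anchor_limit is a hypothetical helper, not part of mb.pm, and the reading of the thresholds is an assumption based on the text above.

```perl
use strict;

# Hypothetical helper, not part of mb.pm. For a perl version number in
# the style of $], return the octet length that this section says cannot
# be exceeded, or undef for perl 5.30.0 or later, for which this section
# gives no figure.
sub sketch_anchor_limit {
    my ($perl_version) = @_;
    return       if $perl_version >= 5.030000;
    return 65534 if $perl_version >= 5.010001;
    return 32766;
}

my $limit = sketch_anchor_limit($]);
print defined $limit
    ? "strings longer than $limit octets may not be matched\n"
    : "no limit is documented here for this perl\n";
```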
Larry Wall must think that "escaping" is the best solution in this case.
See p. 401, Chapter 15: Unicode, of Programming Perl, Third Edition (ISBN 0-596-00027-8).
Before the introduction of Unicode support in perl, the eq operator just compared the byte-strings represented by two scalars. Beginning with perl 5.8, eq compares two byte-strings with simultaneous consideration of the UTF8 flag.
-- we have been taught so for a long time.
Perl is a powerful language for everyone, but the UTF8 flag is a barrier for ordinary beginners, because a person can handle only one task at a time. So calling Encode::encode() and Encode::decode() in an application program is not the better way. Writing two separate scripts, one for information processing and one for encoding conversion, may be better. Please trust me.

  /*
   * You are not expected to understand this.
   */

  Information processing model beginning with perl 5.8

    +----------------------+---------------------+
    |     Text strings     |                     |
    +----------+-----------|    Binary strings   |
    |  UTF-8   |  Latin-1  |                     |
    +----------+-----------+---------------------+
    | UTF8     |            Not UTF8             |
    | Flagged  |            Flagged              |
    +--------------------------------------------+
    http://perl-users.jp/articles/advent-calendar/2010/casual/4

  The confusion in the Perl string model comes from the double meaning of "Binary string."
  The meanings of "Binary string" are
  1. Non-Text string
  2. Digital octet string

  Let's draw it again using those terms.

    +----------------------+---------------------+
    |     Text strings     |                     |
    +----------+-----------|   Non-Text strings  |
    |  UTF-8   |  Latin-1  |                     |
    +----------+-----------+---------------------+
    | UTF8     |            Not UTF8             |
    | Flagged  |            Flagged              |
    +--------------------------------------------+
    |            Digital octet string            |
    +--------------------------------------------+

Some people do not agree with the change to the character string processing model in Perl 5.8. It is impossible to get agreement on it from the majority of Perl programmers, who are not heavy users. To see how to solve this by returning to the original Perl, let's read page 402 of Programming Perl, 3rd edition, again.

  Information processing model beginning with perl3 or this software of UNIX/C-ism.

    +--------------------------------------------+
    |    Text string as Digital octet string     |
    |    Digital octet string as Text string     |
    +--------------------------------------------+
    |       Not UTF8 Flagged, No MOJIBAKE        |
    +--------------------------------------------+

  In UNIX Everything is a File
  - In UNIX everything is a stream of bytes
  - In UNIX the filesystem is used as a universal name space

  Native Encoding Scripting
  - native encoding of file contents
  - native encoding of file name on filesystem
  - native encoding of command line
  - native encoding of environment variable
  - native encoding of API
  - native encoding of network packet
  - native encoding of database
Ideally, we'd like to achieve these five goals:
Goal #1:
Old byte-oriented programs should not spontaneously break on the old byte-oriented data they used to work on.
This software attempts to achieve this goal by keeping the built-in functions working in the traditional, stable way.
Goal #2:
Old byte-oriented programs should magically start working on the new character-oriented data when appropriate.
This software is not a magician, so it cannot read your mind and act on it.
You must decide case by case, and write yourself, whether you want octet semantics or codepoint semantics.
Figure of Goal #1 and Goal #2.

                          Goal #1 Goal #2
                    (a)     (b)     (c)     (d)     (e)
  +--------------+-------+-------+-------+-------+-------+
  | data         |  Old  |  Old  |  New  |  Old  |  New  |
  +--------------+-------+-------+-------+-------+-------+
  | script       |  Old  |      Old      |      New      |
  +--------------+-------+---------------+---------------+
  | interpreter  |  Old  |              New              |
  +--------------+-------+-------------------------------+
  Old --- Old byte-oriented
  New --- New codepoint-oriented

Data, script, and interpreter can each be old or new, which gives the combinations (a) to (e). Let's add JPerl, the utf8 pragma, and this software.

                    (a)     (b)     (c)     (d)     (e)
                                 JPerl,mb          utf8
  +--------------+-------+-------+-------+-------+-------+
  | data         |  Old  |  Old  |  New  |  Old  |  New  |
  +--------------+-------+-------+-------+-------+-------+
  | script       |  Old  |      Old      |      New      |
  +--------------+-------+---------------+---------------+
  | interpreter  |  Old  |              New              |
  +--------------+-------+-------------------------------+
  Old --- Old byte-oriented
  New --- New codepoint-oriented

JPerl is excellent because it sits at position (c). That is, it is almost never necessary to write special code to process new codepoint-oriented data.
Goal #3:
Programs should run just as fast in the new character-oriented mode as in the old byte-oriented mode.
This is impossible, because the following extra time is required.
(1) Time to escape the script for the old byte-oriented perl.
(2) Time for the escaped script to process regular expressions with multibyte anchoring.
Goal #4:
Perl should remain one language, rather than forking into a byte-oriented Perl and a character-oriented Perl.
JPerl keeps Perl one "language" by forking into two "interpreters." However, the Perl core team did not want a fork of the "interpreter." As a result, the Perl "language" forked, contrary to goal #4.
There is no need to build a special codepoint-oriented perl, because a byte-oriented perl can already handle binary data. This software is just an application program for byte-oriented Perl, a filter program.
And you will get support from the Perl community when you solve the problem with a Perl script.
mb.pm modulino keeps one "language" and one "interpreter."
Goal #5:
mb.pm users will be able to maintain mb.pm by Perl.
May the mb.pm be with you, always.
Back when Programming Perl, 3rd ed. was written, the UTF8 flag had not yet been born and Perl was designed to make the easy jobs easy. This software provides a programming environment like the one of that time.

  Some computer scientists (the reductionists, in particular) would like to deny it, but people have funny-shaped minds. Mental geography is not linear, and cannot be mapped onto a flat surface without severe distortion. But for the last score years or so, computer reductionists have been first bowing down at the Temple of Orthogonality, then rising up to preach their ideas of ascetic rectitude to any who would listen. Their fervent but misguided desire was simply to squash your mind to fit their mindset, to smush your patterns of thought into some sort of Hyperdimensional Flatland. It's a joyless existence, being smushed.
  --- Learning Perl on Win32 Systems

  If you think this is a big headache, you're right. No one likes this situation, but Perl does the best it can with the input and encodings it has to deal with. If only we could reset history and not make so many mistakes next time.
  --- Learning Perl 6th Edition

  The most important thing for most people to know about handling Unicode data in Perl, however, is that if you don't ever use any Unicode data -- if none of your files are marked as UTF-8 and you don't use UTF-8 locales -- then you can happily pretend that you're back in Perl 5.005_03 land; the Unicode features will in no way interfere with your code unless you're explicitly using them. Sometimes the twin goals of embracing Unicode but not disturbing old-style byte-oriented scripts has led to compromise and confusion, but it's the Perl way to silently do the right thing, which is what Perl ends up doing.
  --- Advanced Perl Programming, 2nd Edition
INABA Hitoshi <ina@cpan.org>
This project was originated by INABA Hitoshi.
This software is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See the LICENSE file for details.
This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
perlunicode, Encode, open, utf8, bytes, Arabic, Big5, Big5HKSCS, CP932::R2, CP932IBM::R2, CP932NEC::R2, CP932X::R2, Char::Arabic, Char::Big5HKSCS, Char::Big5Plus, Char::Cyrillic, Char::EUCJP, Char::EUCTW, Char::GB18030, Char::GBK, Char::Greek, Char::HP15, Char::Hebrew, Char::INFORMIXV6ALS, Char::JIS8, Char::KOI8R, Char::KOI8U, Char::KPS9566, Char::Latin1, Char::Latin10, Char::Latin2, Char::Latin3, Char::Latin4, Char::Latin5, Char::Latin6, Char::Latin7, Char::Latin8, Char::Latin9, Char::OldUTF8, Char::Sjis, Char::TIS620, Char::UHC, Char::USASCII, Char::UTF2, Char::Windows1252, Char::Windows1258, Cyrillic, GBK, Greek, IOas::CP932, IOas::CP932IBM, IOas::CP932NEC, IOas::CP932X, IOas::SJIS2004, Jacode, Jacode4e, Jacode4e::RoundTrip, KOI8R, KOI8U, KPS9566, KSC5601, Latin1, Latin10, Latin2, Latin3, Latin4, Latin5, Latin6, Latin7, Latin8, Latin9, Modern::Open, SJIS2004::R2, Sjis, UTF2, UTF8::R2, Windows1250, Windows1252, Windows1254, Windows1257, Windows1258.

PERL PUROGURAMINGU
Larry Wall, Randal L.Schwartz, Yoshiyuki Kondo
December 1997
ISBN 4-89052-384-7
http://www.context.co.jp/~cond/books/old-books.html

Programming Perl, Second Edition
By Larry Wall, Tom Christiansen, Randal L. Schwartz
October 1996
Pages: 670
ISBN 10: 1-56592-149-6 | ISBN 13: 9781565921498
http://shop.oreilly.com/product/9781565921498.do

Programming Perl, Third Edition
By Larry Wall, Tom Christiansen, Jon Orwant
Third Edition July 2000
Pages: 1104
ISBN 10: 0-596-00027-8 | ISBN 13: 9780596000271
http://shop.oreilly.com/product/9780596000271.do

The Perl Language Reference Manual (for Perl version 5.12.1)
by Larry Wall and others
Paperback (6"x9"), 724 pages
Retail Price: $39.95 (pound 29.95 in UK)
ISBN-13: 978-1-906966-02-7
https://dl.acm.org/doi/book/10.5555/1893028

Perl Pocket Reference, 5th Edition
By Johan Vromans
Publisher: O'Reilly Media
Released: July 2011
Pages: 102
http://shop.oreilly.com/product/0636920018476.do

Programming Perl, 4th Edition
By: Tom Christiansen, brian d foy, Larry Wall, Jon Orwant
Publisher: O'Reilly Media
Formats: Print, Ebook, Safari Books Online
Released: March 2012
Pages: 1130
Print ISBN: 978-0-596-00492-7 | ISBN 10: 0-596-00492-3
Ebook ISBN: 978-1-4493-9890-3 | ISBN 10: 1-4493-9890-1
http://shop.oreilly.com/product/9780596004927.do

Perl Cookbook
By Tom Christiansen, Nathan Torkington
August 1998
Pages: 800
ISBN 10: 1-56592-243-3 | ISBN 13: 978-1-56592-243-3
http://shop.oreilly.com/product/9781565922433.do

Perl Cookbook, Second Edition
By Tom Christiansen, Nathan Torkington
Second Edition August 2003
Pages: 964
ISBN 10: 0-596-00313-7 | ISBN 13: 9780596003135
http://shop.oreilly.com/product/9780596003135.do

Perl in a Nutshell, Second Edition
By Stephen Spainhour, Ellen Siever, Nathan Patwardhan
Second Edition June 2002
Pages: 760
Series: In a Nutshell
ISBN 10: 0-596-00241-6 | ISBN 13: 9780596002411
http://shop.oreilly.com/product/9780596002411.do

Learning Perl on Win32 Systems
By Randal L. Schwartz, Erik Olson, Tom Christiansen
August 1997
Pages: 306
ISBN 10: 1-56592-324-3 | ISBN 13: 9781565923249
http://shop.oreilly.com/product/9781565923249.do

Learning Perl, Fifth Edition
By Randal L. Schwartz, Tom Phoenix, brian d foy
June 2008
Pages: 352
Print ISBN:978-0-596-52010-6 | ISBN 10: 0-596-52010-7
Ebook ISBN:978-0-596-10316-3 | ISBN 10: 0-596-10316-6
http://shop.oreilly.com/product/9780596520113.do

Learning Perl, 6th Edition
By Randal L. Schwartz, brian d foy, Tom Phoenix
June 2011
Pages: 390
ISBN-10: 1449303587 | ISBN-13: 978-1449303587
http://shop.oreilly.com/product/0636920018452.do

Advanced Perl Programming, 2nd Edition
By Simon Cozens
June 2005
Pages: 300
ISBN-10: 0-596-00456-7 | ISBN-13: 978-0-596-00456-9
http://shop.oreilly.com/product/9780596004569.do

Perl RESOURCE KIT UNIX EDITION
Futato, Irving, Jepson, Patwardhan, Siever
ISBN 10: 1-56592-370-7
http://shop.oreilly.com/product/9781565923706.do

Perl Resource Kit -- Win32 Edition
Erik Olson, Brian Jepson, David Futato, Dick Hardt
ISBN 10:1-56592-409-6
http://shop.oreilly.com/product/9781565924093.do

Announcing Perl 7
Jun 24, 2020 by brian d foy
https://www.perl.com/article/announcing-perl-7/

MODAN Perl NYUMON
By Daisuke Maki
2009/2/10
Pages: 344
ISBN 10: 4798119172 | ISBN 13: 978-4798119175
https://www.seshop.com/product/detail/10250

Understanding Japanese Information Processing
By Ken Lunde
January 1900
Pages: 470
ISBN 10: 1-56592-043-0 | ISBN 13: 9781565920439
http://shop.oreilly.com/product/9781565920439.do

CJKV Information Processing
Chinese, Japanese, Korean & Vietnamese Computing
By Ken Lunde
O'Reilly Media
Print: January 1999
Ebook: June 2009
Pages: 1128
Print ISBN:978-1-56592-224-2 | ISBN 10:1-56592-224-7
Ebook ISBN:978-0-596-55969-4 | ISBN 10:0-596-55969-0
http://shop.oreilly.com/product/9781565922242.do

CJKV Information Processing, 2nd Edition
By Ken Lunde
O'Reilly Media
Print: December 2008
Ebook: June 2009
Pages: 912
Print ISBN: 978-0-596-51447-1 | ISBN 10:0-596-51447-6
Ebook ISBN: 978-0-596-15782-1 | ISBN 10:0-596-15782-7
http://shop.oreilly.com/product/9780596514471.do

DB2 GIJUTSU ZENSHO
By BM Japan Systems Engineering Co.,Ltd. and IBM Japan, Ltd.
2004/05
Pages: 887
ISBN-10: 4756144659 | ISBN-13: 978-4756144652
https://iss.ndl.go.jp/books/R100000002-I000007400836-00

Mastering Regular Expressions, Second Edition
By Jeffrey E. F. Friedl
Second Edition July 2002
Pages: 484
ISBN 10: 0-596-00289-0 | ISBN 13: 9780596002893
http://shop.oreilly.com/product/9780596002893.do

Mastering Regular Expressions, Third Edition
By Jeffrey E. F. Friedl
Third Edition August 2006
Pages: 542
ISBN 10: 0-596-52812-4 | ISBN 13:9780596528126
http://shop.oreilly.com/product/9780596528126.do

Regular Expressions Cookbook
By Jan Goyvaerts, Steven Levithan
May 2009
Pages: 512
ISBN 10:0-596-52068-9 | ISBN 13: 978-0-596-52068-7
http://shop.oreilly.com/product/9780596520694.do

Regular Expressions Cookbook, 2nd Edition
By Steven Levithan, Jan Goyvaerts
Released August 2012
Pages: 612
ISBN: 9781449327453
https://www.oreilly.com/library/view/regular-expressions-cookbook/9781449327453/

JIS KANJI JITEN
By Kouji Shibano
Pages: 1456
ISBN 4-542-20129-5
https://www.e-hon.ne.jp/bec/SA/Detail?refISBN=4542201295

UNIX MAGAZINE
1993 Aug
Pages: 172
T1008901080816 ZASSHI 08901-8

Shell Script Magazine vol.41
2016 September
Pages: 64
https://shell-mag.com/

LINUX NIHONGO KANKYO
By YAMAGATA Hiroo, Stephen J. Turnbull, Craig Oda, Robert J. Bickel
June, 2000
Pages: 376
ISBN 4-87311-016-5
https://www.oreilly.co.jp/books/4873110165/

Windows NT Shell Scripting
By Timothy Hill
April 27, 1998
Pages: 400
ISBN 10: 1578700477 | ISBN 13: 9781578700479
https://www.abebooks.com/9781578700479/Windows-NT-Scripting-Circle-Hill-1578700477/plp

Windows(R) Command-Line Administrators Pocket Consultant, 2nd Edition
By William R. Stanek
February 2009
Pages: 594
ISBN 10: 0-7356-2262-0 | ISBN 13: 978-0-7356-2262-3
https://www.abebooks.com/9780735622623/Windows-Command-Line-Administrators-Pocket-Consultant-0735622620/plp

CPAN Directory INABA Hitoshi
https://metacpan.org/author/INA
http://backpan.cpantesters.org/authors/id/I/IN/INA/
https://metacpan.org/release/Jacode4e-RoundTrip
https://metacpan.org/release/Jacode4e
https://metacpan.org/release/Jacode

Recent Perl packages by "INABA Hitoshi"
http://code.activestate.com/ppm/author:INABA-Hitoshi/

Tokyo-pm archive
https://mail.pm.org/pipermail/tokyo-pm/
https://mail.pm.org/pipermail/tokyo-pm/1999-September/001844.html
https://mail.pm.org/pipermail/tokyo-pm/1999-September/001854.html

Error: Runtime exception on jperl 5.005_03
http://www.rakunet.org/tsnet/TSperl/12/374.html
http://www.rakunet.org/tsnet/TSperl/12/375.html
http://www.rakunet.org/tsnet/TSperl/12/376.html
http://www.rakunet.org/tsnet/TSperl/12/377.html
http://www.rakunet.org/tsnet/TSperl/12/378.html
http://www.rakunet.org/tsnet/TSperl/12/379.html
http://www.rakunet.org/tsnet/TSperl/12/380.html
http://www.rakunet.org/tsnet/TSperl/12/382.html

ruby-list
http://blade.nagaokaut.ac.jp/ruby/ruby-list/index.shtml
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/2440
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/2446
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/2569
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/9427
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/9431
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/10500
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/10501
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/10502
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/12385
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/12392
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/12393
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/19156

TANABATA - The Star Festival - common legend of east asia
https://ja.wikipedia.org/wiki/%E4%B8%83%E5%A4%95
https://ko.wikipedia.org/wiki/%EC%B9%A0%EC%84%9D
https://zh-classical.wikipedia.org/wiki/%E4%B8%83%E5%A4%95
https://zh-yue.wikipedia.org/wiki/%E4%B8%83%E5%A7%90%E8%AA%95
https://zh.wikipedia.org/wiki/%E4%B8%83%E5%A4%95
This software was made with reference to the software and documents written by the following hackers and other people. I am thankful to all of them.
Larry Wall, Perl
http://www.perl.org/

Jesse Vincent, Compatibility is a virtue
https://www.nntp.perl.org/group/perl.perl5.porters/2010/05/msg159825.html

Kazumasa Utashiro, jcode.pl: Perl library for Japanese character code conversion, Kazumasa Utashiro
https://metacpan.org/author/UTASHIRO
ftp://ftp.iij.ad.jp/pub/IIJ/dist/utashiro/perl/
http://web.archive.org/web/20090608090304/http://srekcah.org/jcode/
ftp://ftp.oreilly.co.jp/pcjp98/utashiro/
http://mail.pm.org/pipermail/tokyo-pm/2002-March/001319.html
https://twitter.com/uta46/status/11578906320

Jeffrey E. F. Friedl, Mastering Regular Expressions
http://regex.info/

SADAHIRO Tomoyuki, Handling of Shift-JIS text correctly using bare Perl
http://nomenclator.la.coocan.jp/perl/shiftjis.htm
https://metacpan.org/author/SADAHIRO

Yukihiro "Matz" Matsumoto, YAPC::Asia2006 Ruby on Perl(s)
https://archive.org/details/YAPCAsia2006TokyoRubyonPerls

jscripter, For jperl users
http://text.world.coocan.jp/jperl.html

Bruce., Unicode in Perl
http://www.rakunet.org/tsnet/TSabc/18/546.html

Hiroaki Izumi, Cannot use Perl5.8/5.10 on Windows ?
https://sites.google.com/site/hiroa63iz/perlwin

Yuki Kimoto, Is it true that cannot use Perl5.8/5.10 on Windows ?
https://philosophy.perlzemi.com/blog/20200122080040.html

chaichanPaPa, Matching Shift_JIS file name
http://chaipa.hateblo.jp/entry/20080802/1217660826

SUZUKI Norio, Jperl
http://www.dennougedougakkai-ndd.org/alte/3tte/jperl-5.005_03@ap522/homepage2.nifty.com..kipp..perl..jperl..index.html

WATANABE Hirofumi, Jperl
https://www.cpan.org/src/5.0/jperl/
https://metacpan.org/author/WATANABE
ftp://ftp.oreilly.co.jp/pcjp98/watanabe/jperlconf.ppt

Chuck Houpt, Michiko Nozu, MacJPerl
https://habilis.net/macjperl/index.j.html

Kenichi Ishigaki, 31st about encoding; To JPerl users as old men
https://gihyo.jp/dev/serial/01/modern-perl/0031

Fuji, Goro (gfx), Perl Hackers Hub No.16
http://gihyo.jp/dev/serial/01/perl-hackers-hub/001602

Dan Kogai, Encode module
https://metacpan.org/release/Encode
https://archive.org/details/YAPCAsia2006TokyoPerl58andUnicodeMythsFactsandChanges
http://yapc.g.hatena.ne.jp/jkondo/

Takahashi Masatuyo, JPerl Wiki
https://jperl.fandom.com/ja/wiki/JPerl_Wiki

Juerd, Perl Unicode Advice
https://juerd.nl/site.plp/perluniadvice

daily dayflower, 2008-06-25 perluniadvice
https://dayflower.hatenablog.com/entry/20080625/1214374293

Unicode issues in Perl
https://www.i-programmer.info/programming/other-languages/1973-unicode-issues-in-perl.html

numa's Diary: CSI and UCS Normalization
https://srad.jp/~numa/journal/580177/

Unicode Processing on Windows with Perl
http://blog.livedoor.jp/numa2666/archives/52344850.html
http://blog.livedoor.jp/numa2666/archives/52344851.html
http://blog.livedoor.jp/numa2666/archives/52344852.html
http://blog.livedoor.jp/numa2666/archives/52344853.html
http://blog.livedoor.jp/numa2666/archives/52344854.html
http://blog.livedoor.jp/numa2666/archives/52344855.html
http://blog.livedoor.jp/numa2666/archives/52344856.html

Kaoru Maeda, Perl's history Perl 1,2,3,4
https://www.slideshare.net/KaoruMaeda/perl-perl-1234

nurse, What is "string"
https://naruse.hateblo.jp/entries/2014/11/07#1415355181

NISHIO Hirokazu, What's meant "string as a sequence of characters"?
https://nishiohirokazu.hatenadiary.org/entry/20141107/1415286729

Rick Yamashita, Shift_JIS
https://shino.tumblr.com/post/116166805/%E5%B1%B1%E4%B8%8B%E8%89%AF%E8%94%B5%E3%81%A8%E7%94%B3%E3%81%97%E3%81%BE%E3%81%99-%E7%A7%81%E3%81%AF1981%E5%B9%B4%E5%BD%93%E6%99%82us%E3%81%AE%E3%83%9E%E3%82%A4%E3%82%AF%E3%83%AD%E3%82%BD%E3%83%95%E3%83%88%E3%81%A7%E3%82%B7%E3%83%95%E3%83%88jis%E3%81%AE%E3%83%87%E3%82%B6%E3%82%A4%E3%83%B3%E3%82%92%E6%8B%85%E5%BD%93
http://www.wdic.org/w/WDIC/%E3%82%B7%E3%83%95%E3%83%88JIS

nurse, History of Japanese EUC 22:00
https://naruse.hateblo.jp/entries/2009/03/08

Mike Whitaker, Perl And Unicode
https://www.slideshare.net/Penfold/perl-and-unicode

Ricardo Signes, Perl 5.14 for Pragmatists
https://www.slideshare.net/rjbs/perl-514-8809465

Ricardo Signes, What's New in Perl? v5.10 - v5.16
https://www.slideshare.net/rjbs/whats-new-in-perl-v510-v516

YAP(achimon)C::Asia Hachioji 2016 mid in Shinagawa
Kenichi Ishigaki (@charsbar) July 3, 2016 YAP(achimon)C::Asia Hachioji 2016mid
https://www.slideshare.net/charsbar/cpan-63708689

Causes and countermeasures for garbled Japanese characters in perl
https://prozorec.hatenablog.com/entry/2018/03/19/080000

Perl regular expression bug?
http://moriyoshi.hatenablog.com/entry/20090315/1237103809
http://moriyoshi.hatenablog.com/entry/20090320/1237562075

Impressions of talking of Larry Wall at LL Future
https://hnw.hatenablog.com/entry/20080903

About Windows and Japanese text
https://blogs.windows.com/japan/2020/02/20/about-windows-and-japanese-text/

About Windows diagnostic data
https://blogs.windows.com/japan/2019/12/05/about-windows-diagnostic-data/