Encode::Escape::Unicode - Perl extension for encoding Unicode escape sequences


  use Encode::Escape::Unicode;

  $escaped = "What is \\x{D384}? It's Perl!";
  $string = decode 'unicode-escape', $escaped;

  # Now, $string is equivalent to "What is \x{D384}? It's Perl!"


  $python_unicode_escape = "And \\u041f\\u0435\\u0440\\u043b? It's Perl, too.";
  $string = decode 'unicode-escape', $python_unicode_escape;

  # Now, $string eq "And \x{041F}\x{0435}\x{0440}\x{043B}? It's Perl, too."

Suppose you have a text data file 'unicode-escape.txt' that contains these lines:

  What is \x{D384}? It's Perl!\n
  And \x{041F}\x{0435}\x{0440}\x{043B}? It's Perl, too.\n

And you want to use it as if it were an ordinary double-quoted string in source code. Try this:

  use Encode::Escape::Unicode;

  open(my $fh, '<', 'unicode-escape.txt')
    or die "Cannot open unicode-escape.txt: $!";

  while (<$fh>) {
    print encode 'utf8', decode 'unicode-escape', $_;
  }

  close($fh);


The Encode::Escape::Unicode module implements encodings of escape sequences.

Put simply, it decodes escape sequences into Perl's internal string representation (characters \x{0000} through \x{ffff}) and encodes Perl strings back into escape sequences.


default or perl mode

 Escape Sequences     Description
 ---------------      --------------------------
 \a                   Alarm (beep)
 \b                   Backspace
 \e                   Escape
 \f                   Formfeed
 \n                   Newline
 \r                   Carriage return
 \t                   Tab
 \000     - \377      octal character value. \0, \00, and \000 are equivalent.
 \x00     - \xff      hexadecimal character value. \x0 and \x00 are equivalent.
 \x{0000} - \x{ffff}  hexadecimal Unicode code point. \x{0}, \x{00}, \x{000},
                      and \x{0000} are equivalent.

 \\                   Backslash
 \$                   Dollar Sign
 \@                   At sign
 \"                   Double quote
 \                    Escape the next character if known; otherwise print it as is

This is the default mode. You don't need to select it explicitly unless you have previously selected another mode.
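
As a rough illustration of what decoding in this mode amounts to, here is a minimal sketch in core Perl. It is not the module's actual implementation: the helper name decode_perl_escapes is hypothetical, and it covers only the named, octal, and hexadecimal escapes from the table, treating any other backslashed character as that character itself.

```perl
use strict;
use warnings;

# Named escapes from the table above, mapped to the characters they stand for.
my %named = (
    a => "\a", b => "\b", e => "\e", f => "\f",
    n => "\n", r => "\r", t => "\t",
);

# Hypothetical helper, not part of Encode::Escape::Unicode.
# Alternatives are tried in order: \x{HHHH}, \xHH, octal, named, anything else.
sub decode_perl_escapes {
    my ($text) = @_;
    $text =~ s/\\(?:x\{([0-9A-Fa-f]+)\}|x([0-9A-Fa-f]{1,2})|([0-7]{1,3})|([abefnrt])|(.))/
          defined($1) ? chr(hex($1))
        : defined($2) ? chr(hex($2))
        : defined($3) ? chr(oct($3))
        : defined($4) ? $named{$4}
        :               $5
    /ge;
    return $text;
}

# The input is single-quoted, so it holds a literal backslash followed by x{D384}.
my $decoded = decode_perl_escapes('What is \x{D384}? It\'s Perl!');
printf "%d characters\n", length $decoded;
```

The single substitution keeps the sketch short; a real decoder would also need to decide how to treat malformed escapes such as \x with no hex digits.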

python or java mode

Python, Java, and C# use the \uxxxx escape sequence for Unicode characters.

 Escape Sequences     Description
 ---------------      --------------------------
 \a                   Alarm (beep)
 \b                   Backspace
 \e                   Escape
 \f                   Formfeed
 \n                   Newline
 \r                   Carriage return
 \t                   Tab
 \000   - \377        octal character value. \0, \00, and \000 are equivalent.
 \x00   - \xff        hexadecimal character value. \x0 and \x00 are equivalent.
 \u0000 - \uffff      hexadecimal Unicode code point.

 \\                   Backslash
 \$                   Dollar Sign
 \@                   At sign
 \"                   Double quote
 \                    Escape the next character if known; otherwise print it as is

If your data contains \uxxxx escape sequences, this will translate them to UTF-8-encoded characters:

 use Encode::Escape;

 Encode::Escape::demode 'unicode-escape', 'python';

 while(<>) {
        print encode 'utf8', decode 'unicode-escape', $_;
 }
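
For comparison, the same \uxxxx-to-UTF-8 step can be sketched with only the core Encode module. This is an illustration under stated assumptions, not how Encode::Escape works internally: decode_u_escapes is a hypothetical helper that recognises only \u followed by exactly four hex digits and makes no attempt to combine surrogate pairs.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper, not part of Encode::Escape.
# Replaces each \uXXXX with the character at that code point.
sub decode_u_escapes {
    my ($text) = @_;
    $text =~ s/\\u([0-9A-Fa-f]{4})/chr(hex($1))/ge;
    return $text;
}

# Single-quoted, so the backslashes reach the helper unchanged.
my $line  = 'And \u041f\u0435\u0440\u043b? It\'s Perl, too.';

# Turn the decoded character string into UTF-8 bytes before printing.
my $bytes = encode('UTF-8', decode_u_escapes($line));
print $bytes, "\n";
```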

The following translates \uxxxx to \x{xxxx}:

 use Encode::Escape;

 Encode::Escape::enmode 'unicode-escape', 'perl';
 Encode::Escape::demode 'unicode-escape', 'python';

 while(<>) {
        print encode 'unicode-escape', decode 'unicode-escape', $_;
 }
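
The decode-then-encode round trip can be sketched in plain Perl as two substitutions. The helper name u_to_x_escapes is hypothetical, and the output format (lowercase hex, at least four digits, ASCII left untouched) is an assumption made for the sketch; the module's own output may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Encode::Escape.
sub u_to_x_escapes {
    my ($text) = @_;

    # Step 1 (decode, python mode): \uXXXX becomes the character itself.
    $text =~ s/\\u([0-9A-Fa-f]{4})/chr(hex($1))/ge;

    # Step 2 (encode, perl mode): every non-ASCII character becomes \x{XXXX}.
    $text =~ s/([^\x00-\x7F])/sprintf('\\x{%04x}', ord($1))/ge;

    return $text;
}

print u_to_x_escapes('And \u041f\u0435\u0440\u043b?'), "\n";
```

Going through real characters in the middle, rather than rewriting \u to \x textually, mirrors the decode and encode calls in the loop above.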


See Encode::Escape.


