The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Encode::DoubleEncodedUTF8 - Fix double encoded UTF-8 bytes to the correct one

SYNOPSIS

  use Encode;
  use Encode::DoubleEncodedUTF8;

  my $string = "\x{5bae}";
  my $bytes  = encode_utf8("\x{5bae}");
  my $dodgy_utf8 = $string . $bytes; # $bytes is now double encoded

  my $fixed = decode("utf-8-de", $dodgy_utf8); # "\x{5bae}\x{5bae}"

DESCRIPTION

Encode::DoubleEncodedUTF8 adds a new encoding utf-8-de and fixes double encoded utf-8 bytes found in the original bytes to the correct Unicode entity.

The double encoded utf-8 frequently happens when strings with UTF-8 flag and without are concatenated. See encoding::warnings for details.

AUTHOR

Tatsuhiko Miyagawa <miyagawa@bulknews.net>

LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

encoding::warnings, Test::utf8