NAME

MHonArc::UTF8 - UTF-8 routines for MHonArc

SYNOPSIS

  <CharsetConverters override>
  plain;    mhonarc::htmlize;
  default;  MHonArc::UTF8::str2sgml; MHonArc/UTF8.pm
  </CharsetConverters>

  <TextClipFunc>
  MHonArc::UTF8::clip; MHonArc/UTF8.pm
  </TextClipFunc>

DESCRIPTION

MHonArc::UTF8 provides UTF-8 related routines for use in MHonArc. The main use of the routines provided is to generate mail archives encoded in Unicode UTF-8.

FUNCTIONS

MHonArc::UTF8::to_utf8($data, $from_charset, $to_charset)

Converts $data encoded in $from_charset into UTF-8. $to_charset is ignored since it is assumed to be utf-8.
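As a rough illustration of this kind of conversion, the sketch below re-encodes ISO-8859-1 bytes as UTF-8 with the core Encode module. The helper name sketch_to_utf8 is hypothetical and is not part of MHonArc; the module's actual implementation may differ.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not MHonArc code): re-encode bytes in
# $from_charset as UTF-8 bytes using the core Encode module.
sub sketch_to_utf8 {
    my ($data, $from_charset) = @_;
    my $chars = decode($from_charset, $data);   # bytes -> Perl characters
    return encode('UTF-8', $chars);             # characters -> UTF-8 bytes
}

my $latin1 = "caf\xE9";                         # "cafe" with e-acute, ISO-8859-1
my $utf8   = sketch_to_utf8($latin1, 'iso-8859-1');

# Show the resulting bytes in hex: 63 61 66 C3 A9
print join(' ', map { sprintf '%02X', ord($_) } split //, $utf8), "\n";
```

The single byte E9 becomes the two-byte UTF-8 sequence C3 A9, while the ASCII bytes are unchanged.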

This function is designed to be registered to the TEXTENCODE resource:

  <TextEncode>
  utf-8; MHonArc::UTF8::to_utf8; MHonArc/UTF8.pm
  </TextEncode>

MHonArc::UTF8::str2sgml($data, $charset)

This function is designed to be registered to the CHARSETCONVERTERS resource:

  <CharsetConverters override>
  plain;    mhonarc::htmlize;
  us-ascii; mhonarc::htmlize;
  default;  MHonArc::UTF8::str2sgml; MHonArc/UTF8.pm
  </CharsetConverters>

All data passed in is converted to utf-8 with HTML specials converted into entity references.
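The behaviour described above can be sketched with the core Encode module as follows. This is only an approximation under stated assumptions: the helper name sketch_str2sgml is hypothetical, and the set of characters treated as HTML specials here (&, <, >, and the double quote) is an assumption rather than something taken from the MHonArc source.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumed set of HTML special characters and their entity references.
my %ENTITY = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
);

# Hypothetical helper (not MHonArc code): decode $data from $charset,
# escape HTML specials, and return the result as UTF-8 bytes.
sub sketch_str2sgml {
    my ($data, $charset) = @_;
    my $chars = decode($charset, $data);        # bytes -> characters
    $chars =~ s/([&<>"])/$ENTITY{$1}/g;         # escape HTML specials
    return encode('UTF-8', $chars);             # characters -> UTF-8 bytes
}

print sketch_str2sgml("a<b & c", 'us-ascii'), "\n";   # a&lt;b &amp; c
```

Escaping after decoding means the substitution works on characters rather than raw bytes, so it cannot match a byte that is part of a multi-byte sequence in the source charset.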

MHonArc::UTF8::clip($text, $clip_len, $is_html, $has_tags)

This function is designed to be registered to the TEXTCLIPFUNC resource to have utf-8 strings safely clipped in resource variable expansion:

  <TextClipFunc>
  MHonArc::UTF8::clip; MHonArc/UTF8.pm
  </TextClipFunc>
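To show why clipping needs to be UTF-8 aware, the sketch below compares a byte-based substr with a character-based one. The helper sketch_clip is hypothetical: it ignores the $is_html and $has_tags arguments and is not the module's implementation.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not MHonArc code): clip UTF-8 encoded bytes to
# at most $clip_len characters without splitting a multi-byte sequence.
sub sketch_clip {
    my ($text, $clip_len) = @_;
    my $chars = decode('UTF-8', $text);                     # bytes -> characters
    return encode('UTF-8', substr($chars, 0, $clip_len));   # clip, then re-encode
}

my $bytes = "caf\xC3\xA9s";             # "cafes" with e-acute: 6 bytes, 5 characters

my $naive = substr($bytes, 0, 4);       # byte clip: ends in a lone C3 byte
my $safe  = sketch_clip($bytes, 4);     # character clip: keeps C3 A9 together

printf "naive: %d bytes, safe: %d bytes\n", length($naive), length($safe);
```

The byte-based clip returns 4 bytes ending in an incomplete sequence, while the character-based clip returns 5 bytes that form 4 complete characters.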

NOTES

  • MHonArc::UTF8 tries to leverage existing Perl modules for handling conversion to utf-8. The following lists the modules checked for, in order of preference:

    1. Encode. The Encode module is standard with Perl v5.8 or later.

    2. Unicode::MapUTF8. Unicode::MapUTF8 is an optional module available via CPAN, and will work with Perl v5.6 or later.

      Note: Since the future of Unicode::MapUTF8 is unclear, support for it may be dropped in the future. It appears not to have been updated in a while, since Perl's Encode module will probably become the standard module for handling text encodings.

    3. Fallback implementation. The fallback implementation is designed to work with older versions of Perl 5 if the above modules are not available.
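The preference order above can be sketched as a simple probe that tries to load each module in turn. The helper pick_backend and the 'fallback' label are hypothetical names used for illustration; this is not how MHonArc::UTF8 is necessarily written.

```perl
use strict;
use warnings;

# Hypothetical helper (not MHonArc code): return the name of the first
# module in @candidates that can be loaded, or 'fallback' if none can.
sub pick_backend {
    my @candidates = @_;
    for my $module (@candidates) {
        # Turn Foo::Bar into Foo/Bar.pm so require can take it as a path.
        (my $file = "$module.pm") =~ s{::}{/}g;
        return $module if eval { require $file; 1 };
    }
    return 'fallback';
}

my $backend = pick_backend('Encode', 'Unicode::MapUTF8');
print "Using: $backend\n";
```

Wrapping require in eval keeps a missing module from aborting the program, so the probe can move on to the next candidate.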

SEE ALSO

The CHARSETCONVERTERS, TEXTCLIPFUNC, and TEXTENCODE resources in the MHonArc documentation.

VERSION

$Id: UTF8.pm,v 1.6 2003/03/05 22:17:15 ehood Exp $

AUTHOR

Earl Hood, earl@earlhood.com

MHonArc comes with ABSOLUTELY NO WARRANTY and MHonArc may be copied only under the terms of the GNU General Public License, which may be found in the MHonArc distribution.