NAME

Encode::ISO2022 - ISO/IEC 2022 character encoding scheme

SYNOPSIS

  package FooEncoding;
  use base qw(Encode::ISO2022);
  
  __PACKAGE__->Define(
    Name => 'foo-encoding',
    CCS => [ {...CCS one...}, {...CCS two...}, ....]
  );

DESCRIPTION

This module provides a character encoding scheme (CES) that switches among multiple coded character sets (CCSs).

The class method Define() takes the following arguments.

Alias => REGEX

A regular expression matching the aliases of this encoding, if any.

Name => STRING

The name of this encoding as an Encode::Encoding object. Mandatory.

CCS => [ FEATURE, FEATURE, ...]

A list of features defining the CCSs used by this encoding. Mandatory. Each item is a hash reference containing the following items.

bytes => NUMBER

Number of bytes to represent each character. Default is 1.

cl => BOOLEAN

If set to a true value, this CCS includes mappings to and from the code points between 0/0 and 1/15 (bytes 0x00 to 0x1F). There should be one CCS with this flag so that a broken designation can be reset.

dec_only => BOOLEAN

If set to a true value, this CCS is used only for decoding.

encoding => STRING | ENCODING

An Encode::Encoding object to be used as the CCS, or its name. Mandatory.

Encodings used as CCSs must provide "raw" conversion: they must be stateless, fixed-length conversions over 94^n or 96^n code tables. Encode::ISO2022::CCS lists the available CCSs.
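As an illustration of what a raw conversion looks like, the sketch below uses jis0208-raw from the core Encode::JP module. That this particular encoding is acceptable as a CCS here is an assumption; consult Encode::ISO2022::CCS for the supported list. Each character maps to exactly two bytes of the 94x94 table, with no escape sequences and no state carried between characters.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use Encode::JP;    # provides jis0208-raw

# U+3042 HIRAGANA LETTER A sits at row 4, cell 2 of JIS X 0208,
# that is, bytes 0x24 0x22 in the 94x94 table.
my $bytes = encode('jis0208-raw', "\x{3042}");
my $hex   = join ' ', map { sprintf '%02X', ord } split //, $bytes;
print "$hex\n";

# The reverse mapping needs no surrounding context.
my $raw  = "\x24\x22";
my $char = decode('jis0208-raw', $raw);
printf "U+%04X\n", ord $char;
```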

g => STRING
g_init => STRING

The working set this CCS may be designated to: 'g0', 'g1', 'g2' or 'g3'.

If g_init is set, this CCS is designated implicitly at the beginning of conversion and explicitly at the end of conversion.

If g or g_init is set and neither ls nor ss is set, this CCS is invoked when it is designated.

If none of g, g_init, ls and ss is set, this CCS is always invoked.

g_seq => STRING

The escape sequence that designates this CCS, if it can be designated explicitly.
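To make designation sequences concrete, the sketch below uses the iso-2022-jp encoding bundled with core Encode, not this module. In ISO-2022-JP, ESC $ B designates JIS X 0208 to G0 and ESC ( B designates ASCII to G0. In a CCS definition these would presumably be the g_seq values, written "\e\$B" and "\e(B" in Perl.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+3042 HIRAGANA LETTER A is 2/4 2/2 (0x24 0x22) in JIS X 0208.
my $bytes = encode('iso-2022-jp', "\x{3042}");

# Make the escape characters visible before printing.
(my $shown = $bytes) =~ s/\e/ESC /g;
print "$shown\n";
```

The output starts with the designation of JIS X 0208 and ends with an explicit return to ASCII, which is the behaviour described for g_init: the initial set is assumed at the start and restored explicitly at the end.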

gr => BOOLEAN

If set to a true value, this CCS is invoked to GR using a 7-bit conversion table.

ls => STRING
ss => STRING

The escape sequence or control character that invokes this CCS, if it must be invoked explicitly.

If ls is set, this CCS is invoked by a locking shift. If ss is set, it is invoked by a single shift.
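The difference between the two kinds of shift can be sketched with a small state tracker. This is a hypothetical helper, not code from Encode::ISO2022, and it assumes one-byte characters and the control codes SO (0x0E), SI (0x0F), SS2 (0x8E) and SS3 (0x8F). A locking shift stays in effect until the next locking shift, while a single shift affects only the character that follows it.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Encode::ISO2022: pair each non-shift byte
# with the working set that is invoked when the byte is read.
sub label_bytes {
    my ($bytes) = @_;
    my $locked = 'g0';    # set invoked by the most recent locking shift
    my $single;           # set invoked for the next character only
    my @labels;
    for my $byte (split //, $bytes) {
        if    ($byte eq "\x0E") { $locked = 'g1' }    # SO: locking shift to G1
        elsif ($byte eq "\x0F") { $locked = 'g0' }    # SI: locking shift to G0
        elsif ($byte eq "\x8E") { $single = 'g2' }    # SS2: single shift to G2
        elsif ($byte eq "\x8F") { $single = 'g3' }    # SS3: single shift to G3
        else {
            push @labels, $byte . ':' . ($single // $locked);
            undef $single;    # a single shift covers one character only
        }
    }
    return join ' ', @labels;
}

my $labels = label_bytes("a\x0Eb\x8Ecd\x0Fe");
print "$labels\n";    # prints "a:g0 b:g1 c:g2 d:g1 e:g0"
```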

range => STRING

The possible range of encoded bytes. Typical values are '\x21-\x7E', '\x20-\x7F', '\xA1-\xFE' or '\xA0-\xFF'. This is required for multibyte CCSs so that broken multibyte sequences can be detected.
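The notation of the range value matches the inside of a regular expression character class. How such a range can expose a broken sequence is sketched below; this is a hypothetical check written for illustration, not the module's own code, and the names $well_formed and $seq are made up here.

```perl
use strict;
use warnings;

# Hypothetical check, not the module's own code: a character of a
# 2-byte CCS is well formed only if both bytes fall inside the range.
my $range = '\x21-\x7E';    # same notation as the range item
my $bytes = 2;
my $well_formed = qr/\A[${range}]{${bytes}}\z/;

for my $seq ("\x24\x22", "\x24\x0A", "\x24") {
    my $hex = join ' ', map { sprintf '%02X', ord } split //, $seq;
    printf "%-5s %s\n", $hex, $seq =~ $well_formed ? 'ok' : 'broken';
}
```

The second sequence is cut short by a line feed and the third is truncated, so only the first one passes the check.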

LineInit => BOOLEAN

If true, the designation and invocation states are initialized at the beginning of each line.

SubChar => STRING

Unicode string to be used as the substitution character.

For a fuller example of how to use this module, see the source of Encode::ISO2022JP2.
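Putting the arguments together, a definition for a small ISO-2022-JP-like encoding might look like the sketch below. The package name, encoding name and alias are hypothetical, the use of 'ascii' and 'jis0208-raw' as CCS encodings is an assumption to be checked against Encode::ISO2022::CCS, and the substitution character is an arbitrary choice. It is not a copy of any shipped module.

```perl
package My::ISO2022JPLike;    # hypothetical package name

use strict;
use warnings;
use base qw(Encode::ISO2022);

use Encode::JP;               # assumed provider of 'jis0208-raw'

__PACKAGE__->Define(
    Name  => 'x-my-iso-2022-jp-like',    # hypothetical encoding name
    Alias => qr/\Ax-my-2022-jp\z/i,      # hypothetical alias
    CCS   => [
        {   # ASCII: covers 0/0 to 1/15 and is designated to G0 initially
            cl       => 1,
            encoding => 'ascii',
            g_init   => 'g0',
            g_seq    => "\e(B",          # ESC 2/8 4/2
        },
        {   # JIS X 0208: two bytes per character, designated to G0 on demand
            bytes    => 2,
            encoding => 'jis0208-raw',
            g        => 'g0',
            g_seq    => "\e\$B",         # ESC 2/4 4/2
            range    => '\x21-\x7E',
        },
    ],
    LineInit => 1,                       # reset states at each line start
    SubChar  => "\x{3013}",              # GETA MARK, an arbitrary choice
);

1;
```

Neither CCS sets ls or ss, so each one is invoked as soon as it is designated. If Define() makes the name known to Encode, as the description of Name suggests, the encoding could then be used through Encode::encode() and Encode::decode() under that name.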

CAVEATS

This module implements only a small subset of the features defined by ISO/IEC 2022. Each encoding recognizes only a few predefined designation and invocation functions, and it can handle only a limited number of coded character sets. Variable-length multibyte coded character sets are not supported. This list of limitations is not exhaustive.

SEE ALSO

ISO/IEC 2022 Information technology - Character code structure and extension techniques.

Encode, Encode::ISO2022::CCS.

AUTHOR

Hatuka*nezumi - IKEDA Soji, <nezumi@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2013 by Hatuka*nezumi - IKEDA Soji

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.