NAME

Unicode::Precis - RFC 7564 PRECIS Framework - Enforcement and Comparison

SYNOPSIS

use Unicode::Precis;
$precis = Unicode::Precis->new(options...);
$string = $precis->enforce($input);
$equals = $precis->compare($inputA, $inputB);

DESCRIPTION

Unicode::Precis performs enforcement and comparison of UTF-8 bytestrings or Unicode strings according to the PRECIS Framework.

Note that a bytestring is not upgraded; this module treats it as a UTF-8 sequence.
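
The distinction between the two kinds of input can be illustrated with core Perl alone. The following is a minimal sketch, not part of this module; the variable names are illustrative. It builds the same character once as a UTF-8 bytestring and once as a Unicode string.

```perl
use strict;
use warnings;

# U+00E9 (LATIN SMALL LETTER E WITH ACUTE) as a UTF-8 bytestring:
# two octets, UTF8 flag off.
my $bytes = "\xC3\xA9";

# The same character as a Unicode string: decode a copy in place.
my $chars = $bytes;
utf8::decode($chars);

printf "bytes: length=%d, flagged=%s\n",
    length($bytes), utf8::is_utf8($bytes) ? 'yes' : 'no';
printf "chars: length=%d, flagged=%s\n",
    length($chars), utf8::is_utf8($chars) ? 'yes' : 'no';
```

Either form may be passed to the methods described below; the bytestring form is read as UTF-8 rather than being upgraded.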

Methods

new ( options ... )

Constructor. Creates a new instance of the Unicode::Precis class. The following options may be specified.

WidthMappingRule => 'Decomposition'

If specified, maps fullwidth and halfwidth characters to their decomposition mappings using decomposeWidth().

AdditionalMappingRule => 'options...'

If specified, maps spaces. options... may include any of the following words:

MapSpace

Maps non-ASCII space characters to ASCII space using mapSpace().

StripSpace

Removes ASCII space character(s) at the beginning and/or end of the string.

UnifySpace

Maps sequences of more than one ASCII space character to a single ASCII space character.

CaseMappingRule => 'Fold'

If specified, maps uppercase and titlecase characters to lowercase using foldCase().

NormalizationRule => 'NFC' | 'NFKC' | 'NFD' | 'NFKD'

If specified, normalizes the string using the given normalization form.

DirectionalityRule => 'BiDi'

If specified and the string contains a right-to-left character, checks the string against the BiDi Rule.

StringClass => 'FreeFormClass' | 'IdentifierClass'

If specified, checks the string according to the given string class.

OtherRule => $subref

If specified, replaces and/or checks the string with the result of the subroutine referenced by $subref.
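
As a rough sketch of how these options combine, the constructor call below sets up a nickname-like configuration. The particular combination is only illustrative, and passing the AdditionalMappingRule words as one space-separated string is an assumption; the description above does not spell out the separator.

```perl
use strict;
use warnings;
use Unicode::Precis;

# Illustrative combination of the options listed above.
my $precis = Unicode::Precis->new(
    # Assumption: the words are given as a space-separated string.
    AdditionalMappingRule => 'MapSpace StripSpace UnifySpace',
    CaseMappingRule       => 'Fold',
    NormalizationRule     => 'NFKC',
    StringClass           => 'FreeFormClass',
);

print ref($precis), "\n";
```

Options that are left out are simply not applied, so a configuration can be as small as a single StringClass check.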

compare ( $stringA, $stringB )

Instance method. Compares strings. If enforcement on both strings succeeds, compares them using compareExactly() and returns 1 or 0. Otherwise returns undef.

Arguments $stringA and $stringB are not modified.
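
Because compare() has three possible results, callers generally need to test for undef before testing for truth. A minimal sketch, assuming a case-folding identifier configuration chosen only for illustration:

```perl
use strict;
use warnings;
use Unicode::Precis;

my $precis = Unicode::Precis->new(
    CaseMappingRule   => 'Fold',
    NormalizationRule => 'NFC',
    StringClass       => 'IdentifierClass',
);

my ($stringA, $stringB) = ('Alice', 'alice');
my $equals = $precis->compare($stringA, $stringB);

if (!defined $equals) {
    # Enforcement failed on at least one of the strings.
    print "cannot compare: invalid input\n";
} elsif ($equals) {
    print "strings are equivalent\n";
} else {
    print "strings differ\n";
}
```

Checking definedness first keeps a failed enforcement from being mistaken for a plain mismatch, since both undef and 0 are false.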

enforce ( $string )

Instance method. Performs enforcement on the string. If processing succeeds, modifies the argument $string and returns it. Otherwise returns undef.
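
Since enforce() modifies its argument, it can help to work on a copy when the original input is still needed. A minimal sketch, with an illustrative configuration and variable names:

```perl
use strict;
use warnings;
use Unicode::Precis;

my $precis = Unicode::Precis->new(
    CaseMappingRule   => 'Fold',
    NormalizationRule => 'NFC',
    StringClass       => 'IdentifierClass',
);

my $input = 'Alice';
my $copy  = $input;    # enforce() changes the scalar it is given

my $result = $precis->enforce($copy);
if (defined $result) {
    print "original: $input, enforced: $result\n";
} else {
    print "input was rejected\n";
}
```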

Exports

None are exported.

CAVEATS

The repertoire this module can handle is restricted by the Unicode database of the Perl core: characters beyond it are considered "unassigned" and are disallowed, even if they are defined in a more recent version of Unicode. The table below lists the Unicode version implemented by each Perl version.

Perl's version     Implemented Unicode version
------------------ ---------------------------
5.8.7, 5.8.8       4.1.0
5.10.0             5.0.0
5.8.9, 5.10.1      5.1.0
5.12.x             5.2.0
5.14.x             6.0.0
5.16.x             6.1.0
5.18.x             6.2.0
5.20.x             6.3.0
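
To find out which row applies to a given installation, the running Perl can be asked directly. This sketch uses the core Unicode::UCD module, which is separate from Unicode::Precis:

```perl
use strict;
use warnings;
use Unicode::UCD;

# Version of the Unicode database bundled with this Perl.
my $unicode_version = Unicode::UCD::UnicodeVersion();

printf "Perl %vd implements Unicode %s\n", $^V, $unicode_version;
```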

RESTRICTIONS

This module does not support EBCDIC platforms.

SEE ALSO

RFC 7564 PRECIS Framework: Preparation, Enforcement, and Comparison of Internationalized Strings in Application Protocols. https://tools.ietf.org/html/rfc7564.

Unicode::BiDiRule, Unicode::Normalize, Unicode::Precis::Preparation, Unicode::Precis::Utils.

AUTHOR

Hatuka*nezumi - IKEDA Soji, <hatuka@nezumi.nu>

COPYRIGHT AND LICENSE

Copyright (C) 2015 by Hatuka*nezumi - IKEDA Soji

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. For more details, see the full text of the licenses at <http://dev.perl.org/licenses/>.

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.