NAME

Unicode::Collate::Locale - Linguistic tailoring for DUCET via Unicode::Collate

SYNOPSIS

  use Unicode::Collate::Locale;

  $Collator = Unicode::Collate::Locale->
      new(locale => $locale_name, %tailoring);

  @sorted = $Collator->sort(@not_sorted);

DESCRIPTION

This module provides linguistic tailoring for the DUCET by taking advantage of Unicode::Collate.

Constructor

The new method returns a collator object.

The constructor takes its parameters as a hash. The hash may include the special key 'locale', whose value (case-insensitive) is a two-letter language code (ISO 639) such as 'en' for English. For example, Unicode::Collate::Locale->new(locale => 'FR') returns a collator tailored for French.

$locale_name may be suffixed with a territory (country) code or a variant code, each separated by '_'. E.g. en_US for English in the USA, es_ES_traditional for Spanish in Spain (Traditional).

If no tailoring is defined for $locale_name, a fallback is selected in the following order:

    1. language_territory_variant
    2. language_territory
    3. language__variant
    4. language
    5. default
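The fallback order above can be sketched in plain Perl. This is only an illustration: fallback_candidates is a hypothetical helper, not part of Unicode::Collate::Locale, and it assumes the name is already written as language_territory_variant with '_' separators (no case folding is done here).

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Unicode::Collate::Locale): list the
# names that would be tried, in order, for a given locale name.
sub fallback_candidates {
    my ($locale_name) = @_;

    # 'es_ES_traditional' -> ('es', 'ES', 'traditional')
    # 'es__traditional'   -> ('es', '',   'traditional')
    my ($language, $territory, $variant) = split /_/, $locale_name, 3;
    $territory = '' unless defined $territory;
    $variant   = '' unless defined $variant;

    my @candidates;
    push @candidates, "${language}_${territory}_${variant}"
        if length $territory && length $variant;      # 1. language_territory_variant
    push @candidates, "${language}_${territory}"
        if length $territory;                         # 2. language_territory
    push @candidates, "${language}__${variant}"
        if length $variant;                           # 3. language__variant
    push @candidates, $language, 'default';           # 4. language, 5. default
    return @candidates;
}

print join(', ', fallback_candidates('es_ES_traditional')), "\n";
# prints: es_ES_traditional, es_ES, es__traditional, es, default
```

The first candidate for which a tailoring exists would be the one used; how the module itself performs this lookup is not shown here.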

Tailoring tags provided by Unicode::Collate are allowed as long as they are not used for 'locale' support. In particular, the table tag can never be tailored, since it is reserved for DUCET.

E.g. the following creates a collator for French that ignores diacritics and case differences (i.e. level 1), with reversed case ordering and no normalization:

    Unicode::Collate::Locale->new(
        level => 1,
        locale => 'fr',
        upper_before_lower => 1,
        normalization => undef
    )
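As a rough illustration of what level 1 means in practice, the sketch below builds the same collator and compares two strings that differ only in case and accents. It assumes Unicode::Collate::Locale is installed; eq and sort are methods inherited from Unicode::Collate. The strings are written with \x{...} escapes so the example does not depend on the source file encoding.

```perl
use strict;
use warnings;
use Unicode::Collate::Locale;

my $collator = Unicode::Collate::Locale->new(
    level              => 1,      # compare primary (base letter) weights only
    locale             => 'fr',
    upper_before_lower => 1,
    normalization      => undef,  # input is assumed to be already normalized
);

# "Côté" written with escapes; it differs from "cote" only in case and accents.
my $accented = "C\x{F4}t\x{E9}";

print $collator->eq('cote', $accented) ? "equal\n" : "different\n";

# At level 1 only the base letters decide the order.
my @sorted = $collator->sort("z\x{E8}bre", 'Abricot', "\x{E9}cole");
print scalar(@sorted), " words sorted\n";
```

Because normalization is disabled, strings in different normalization forms may compare unexpectedly; leave the normalization tag unset if the input is not known to be normalized.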

Methods

Unicode::Collate::Locale is a subclass of Unicode::Collate and methods other than new are inherited from Unicode::Collate.

Here is a list of additional methods:

$Collator->getlocale

Returns the language code that was accepted and is actually used for collation. If linguistic tailoring is not provided for the language code you passed (intentionally for some languages, or because the implementation is incomplete), this method returns the string 'default', meaning no special tailoring.
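A minimal sketch of checking which tailoring was actually selected, assuming Unicode::Collate::Locale is installed. The locale names are only examples; the values printed depend on which tailorings the installed version ships. Going by the fallback order and the list of tailorable locales, one would expect sv_SE to resolve to sv and an untailored language such as en to resolve to default.

```perl
use strict;
use warnings;
use Unicode::Collate::Locale;

# Example names only: a tailored language, the same language with a
# territory suffix, a name with a variant, and an untailored language.
my @names = qw(sv sv_SE es_ES_traditional en);

my %used;
for my $name (@names) {
    my $collator = Unicode::Collate::Locale->new(locale => $name);
    $used{$name} = $collator->getlocale;   # name actually used for collation
    printf "%-20s => %s\n", $name, $used{$name};
}
```

Comparing the requested name with the value from getlocale is a simple way to detect that a request silently fell back to a more general tailoring or to 'default'.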

A list of tailorable locales

      locale name       description
    ----------------------------------------------------------
      ca                Catalan
      cs                Czech
      eo                Esperanto
      es                Spanish
      es__traditional   Spanish ('ch' and 'll' as a grapheme)
      et                Estonian
      fi                Finnish
      fr                French
      lv                Latvian
      nb                Norwegian Bokmål
      nn                Norwegian Nynorsk
      pl                Polish
      ro                Romanian
      sk                Slovak
      sl                Slovenian
      sv                Swedish

AUTHOR

The Unicode::Collate::Locale module for perl was written by SADAHIRO Tomoyuki, <SADAHIRO@cpan.org>. This module is Copyright(C) 2004-2010, SADAHIRO Tomoyuki. Japan. All rights reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

Unicode Collation Algorithm - UTS #10

http://www.unicode.org/reports/tr10/

The Default Unicode Collation Element Table (DUCET)

http://www.unicode.org/Public/UCA/latest/allkeys.txt

CLDR - Unicode Common Locale Data Repository

http://cldr.unicode.org/

Unicode::Collate
Unicode::Normalize