NAME

Locale::Messages - Gettext Like Message Retrieval

SYNOPSIS

use Locale::Messages qw(:locale_h :libintl_h);

gettext $msgid;
dgettext $textdomain, $msgid;
dcgettext $textdomain, $msgid, LC_MESSAGES;
ngettext $msgid, $msgid_plural, $count;
dngettext $textdomain, $msgid, $msgid_plural, $count;
dcngettext $textdomain, $msgid, $msgid_plural, $count, LC_MESSAGES;
pgettext $msgctxt, $msgid;
dpgettext $textdomain, $msgctxt, $msgid;
dcpgettext $textdomain, $msgctxt, $msgid, LC_MESSAGES;
npgettext $msgctxt, $msgid, $msgid_plural, $count;
dnpgettext $textdomain, $msgctxt, $msgid, $msgid_plural, $count;
dcnpgettext $textdomain, $msgctxt, $msgid, $msgid_plural, $count, LC_MESSAGES;
textdomain $textdomain;
bindtextdomain $textdomain, $directory;
bind_textdomain_codeset $textdomain, $encoding;
bind_textdomain_filter $textdomain, \&filter, $data;
turn_utf_8_on ($variable);
turn_utf_8_off ($variable);
nl_putenv ('OUTPUT_CHARSET=koi8-r');
my $category = LC_CTYPE;
my $category = LC_NUMERIC;
my $category = LC_TIME;
my $category = LC_COLLATE;
my $category = LC_MONETARY;
my $category = LC_MESSAGES;
my $category = LC_ALL;

DESCRIPTION

The module Locale::Messages is a wrapper around the interface to message translation according to the Uniforum approach, which is used, for example, in GNU gettext and Sun's Solaris. It is intended to allow Locale::Messages(3) to switch between different implementations of the lower-level libraries, but this is not yet implemented.

Normally you should not use this module directly but rather the high-level module Locale::TextDomain(3), which provides a much simpler interface. This description is therefore deliberately kept brief. Please refer to the GNU gettext documentation available at http://www.gnu.org/manual/gettext/ for in-depth and background information on the topic.

The lower level module Locale::gettext_pp(3) provides the Perl implementation of gettext() and related functions.

FUNCTIONS

The module exports nothing by default. Every function has to be imported explicitly or via an export tag (see "EXPORT TAGS").

gettext MSGID

Returns the translation for MSGID. Example:

print gettext "Hello World!\n";

If no translation can be found, the unmodified MSGID is returned, i.e. the function can never fail and will never mess up your original message.

Note for Perl 5.6 and later: The returned string will always have the UTF-8 flag off by default. See the documentation for function bind_textdomain_filter() for a way to change this behavior.

One common mistake is this:

print gettext "Hello $name!";

Perl will interpolate the variable $name before the function sees the string. Unless the corresponding message catalog contains a message "Hello Tom!", "Hello Dick!" or "Hello Harry!", no translation will be found.
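
The effect can be reproduced without a real catalog. The following minimal sketch uses a plain hash (%catalog) and a helper (lookup), both hypothetical stand-ins for the message catalog and for gettext(), to show why the interpolated string is never found while a constant message id is:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a message catalog (not part of Locale::Messages).
my %catalog = ('Hello %s!' => 'Hallo %s!');

# Hypothetical stand-in for gettext(): returns the translation or,
# if there is none, the unmodified message id.
sub lookup {
    my ($msgid) = @_;
    return exists $catalog{$msgid} ? $catalog{$msgid} : $msgid;
}

my $name = 'Tom';

# Interpolated first: the lookup key is "Hello Tom!", which is not in the catalog.
my $untranslated = lookup("Hello $name!");

# Constant message id: the lookup key is "Hello %s!"; the name is filled in afterwards.
my $translated = sprintf(lookup('Hello %s!'), $name);

print "$untranslated\n";
print "$translated\n";
```

The design point is only that the string handed to the lookup must be a constant that can also appear verbatim in the catalog.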

Using printf() and friends has its own problems:

print sprintf (gettext ("This is the %s %s."), $color, $thing);

(The example is stupid because neither color nor thing will get translated here ...).

In English the adjective (the color) precedes the noun; many other languages (for example French or Italian) differ here. The translator of the message may therefore have a hard time finding a translation that still works and does not sound stupid in the target language. Many C implementations of printf() allow changing the order of the arguments, and a French translator could then say:

"C'est le %2$s %1$s."

Perl's printf() implements this feature as of version 5.8. Consequently, you can only use it if you are sure that your software will run with Perl 5.8 or a later version.
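
A minimal sketch of explicit argument indexes in Perl (5.8 or later), using the French format string from above. The values of $color and $thing are made up for illustration, and the format is held in a plain variable as a stand-in for what gettext() would return from a French catalog. Note the single-quote style quoting, which keeps Perl from interpolating $s:

```perl
use strict;
use warnings;

my ($color, $thing) = ('rouge', 'ballon');   # illustrative values only

# Stand-in for the translated format string; q{} does not interpolate.
my $format = q{C'est le %2$s %1$s.};

# %2$s picks the second argument ($thing), %1$s the first ($color).
my $sentence = sprintf($format, $color, $thing);
print "$sentence\n";
```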

Another disadvantage of using printf() is its cryptic syntax (maybe not for you but translators of your software may have their own opinion).

See the description of the function __x() in Locale::TextDomain(3) for a much better way to get around this problem.

Non-ASCII message ids ...

You should note that the function (and all other similar functions in this module) does a bytewise comparison of the MSGID for the lookup in the translation catalog, no matter whether obscure utf-8 flags are set on it, whether the string looks like utf-8, whether the utf8(3pm) pragma is used, or whatever other weird method past or future perl(1) versions invent for guessing character sets of strings.

Using characters other than us-ascii in Perl source code is asking for trouble, a compatibility nightmare. Furthermore, GNU gettext only recently introduced support for non-ascii character sets in sources, and support for this feature may not be available everywhere. If you absolutely want to use MSGIDs in non-ascii character sets, it is wise to choose utf-8. This will minimize the risk that perl(1) itself will mess with the strings, and it will also guarantee that you can later translate your project into arbitrary target languages.

Other character sets can theoretically work. Yet using a character set in the Perl source code that differs from the one used in your message catalogs will never work, since the lookup is done bytewise and no string with non-ascii characters will be found.
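
The bytewise nature of the lookup can be illustrated with the core Encode module. In this sketch the sample word and the variable names are illustrative only. The same text encoded in two character sets yields two different byte strings, so a catalog key stored in one encoding cannot match a message id written in the other:

```perl
use strict;
use warnings;
use Encode qw(encode);

# A word with u-umlaut and sharp s, written with escapes so that
# this source file itself stays pure ASCII.
my $text = "Gr\x{fc}\x{df}e";

my $latin1 = encode('ISO-8859-1', $text);   # one byte per character
my $utf8   = encode('UTF-8',      $text);   # two bytes for each non-ASCII character

printf "ISO-8859-1: %d bytes, UTF-8: %d bytes\n", length($latin1), length($utf8);
print  "The byte strings ", ($latin1 eq $utf8 ? "match" : "differ"), ".\n";
```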

Even if you have solved all these problems, there is still one show stopper left: The gettext runtime API lacks a possibility to specify the character set of the source code (including the original strings). Consequently, in the absence of a hint for the input encoding, strings without a translation are not subject to output character set conversion. In other words: If the (non-determinable) output character set differs from the character set used in the source code, output can be a mixture of two character sets. There is no point in trying to address this problem in the pure Perl version of the gettext functions, because breaking compatibility between the Perl and the C version is a price too high to pay.

This all boils down to: Only use ASCII characters in your translatable strings!

dgettext TEXTDOMAIN, MSGID

Like gettext(), but retrieves the message for the specified TEXTDOMAIN instead of the default domain. In case you wonder what a textdomain is, you should really read on with Locale::TextDomain(3).

dcgettext TEXTDOMAIN, MSGID, CATEGORY

Like dgettext() but retrieves the message from the specified CATEGORY instead of the default category LC_MESSAGES.

ngettext MSGID, MSGID_PLURAL, COUNT

Retrieves the correct translation for COUNT items. In legacy software you will often find something like:

print "$count file(s) deleted.\n";

printf "$count file%s deleted.\n", $count == 1 ? '' : 's';

The first example looks awkward; the second will only work in English and languages with similar plural rules. Before ngettext() was introduced, the best practice for internationalized programs was:

if ($count == 1) {
    print gettext "One file deleted.\n";
} else {
    printf (gettext ("%d files deleted.\n"), $count);
}

This is a nuisance for the programmer and often still not sufficient for an adequate translation. Many languages have completely different ideas on numerals. Some (French, Italian, ...) treat 0 and 1 alike, others make no distinction at all (Japanese, Korean, Chinese, ...), others have two or more plural forms (Russian, Latvian, Czech, Polish, ...). The solution is:

printf (ngettext ("One file deleted.\n",
                 "%d files deleted.\n",
                 $count), # argument to ngettext!
        $count);          # argument to printf!

In English, or if no translation can be found, the first argument (MSGID) is picked if $count is one, the second one otherwise. For other languages, the correct plural form (of 1, 2, 3, 4, ...) is automatically picked, too. You don't have to know anything about the plural rules in the target language; ngettext() will take care of that.
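
To give an idea of what ngettext() evaluates behind the scenes, here is a small sketch of plural-index rules in the style of the Plural-Forms formulas commonly quoted for GNU gettext catalogs. The hash %plural_index and its keys are hypothetical and not part of Locale::Messages; a real catalog carries its own rule in its header:

```perl
use strict;
use warnings;

# Each rule maps a count to the index of the plural form to use.
my %plural_index = (
    en => sub { my $n = shift; $n != 1 ? 1 : 0 },   # singular only for 1
    fr => sub { my $n = shift; $n > 1  ? 1 : 0 },   # 0 and 1 treated alike
    ja => sub { 0 },                                # no distinction at all
    ru => sub {                                     # three forms
        my $n = shift;
        return 0 if $n % 10 == 1 && $n % 100 != 11;
        return 1 if $n % 10 >= 2 && $n % 10 <= 4
                 && ($n % 100 < 10 || $n % 100 >= 20);
        return 2;
    },
);

# Print the selected form index for a few sample counts.
for my $n (0, 1, 2, 5, 11, 21, 22) {
    printf "%2d: en=%d fr=%d ja=%d ru=%d\n", $n,
        map { $plural_index{$_}->($n) } qw(en fr ja ru);
}
```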

Most of the time this is sufficient, but you will have to prove your creativity in cases like:

printf "%d file(s) deleted, and %d file(s) created.\n";
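
One possible way out, sketched here under the assumption that two separate sentences are acceptable in your user interface, is to give each count its own ngettext() call. The counts and the wording are made up for illustration; without a matching message catalog the calls simply fall back to the English strings:

```perl
use strict;
use warnings;
use Locale::Messages qw(ngettext);

my ($deleted, $created) = (1, 3);   # illustrative values only

# Each sentence contains exactly one count, so each gets its own plural lookup.
my $deleted_msg = sprintf(
    ngettext("%d file deleted.", "%d files deleted.", $deleted),
    $deleted);
my $created_msg = sprintf(
    ngettext("%d file created.", "%d files created.", $created),
    $created);

print "$deleted_msg $created_msg\n";
```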

dngettext TEXTDOMAIN, MSGID, MSGID_PLURAL, COUNT

Like ngettext() but retrieves the translation from the specified textdomain instead of the default domain.

dcngettext TEXTDOMAIN, MSGID, MSGID_PLURAL, COUNT, CATEGORY

Like dngettext() but retrieves the translation from the specified category, instead of the default category LC_MESSAGES.

pgettext MSGCTXT, MSGID

Returns the translation of MSGID, given the context of MSGCTXT.

Both items are used as a unique key into the message catalog.

This allows the translator to have two entries for words that may translate to different foreign words based on their context. For example, the word "View" may be a noun or a verb, which may be used in a menu as File->View or View->Source.

pgettext "Verb: To View", "View\n";
pgettext "Noun: A View", "View\n";

The two calls above look up different entries in the message catalog.

A typical use case is GUI programs. Imagine a program with a main menu and the notorious "Open" entry in the "File" menu. Now imagine there is another menu entry Preferences->Advanced->Policy where you have a choice between the alternatives "Open" and "Closed". In English, "Open" is the adequate text in both places. In other languages, it is very likely that you need two different translations. Therefore, you would now write:

pgettext "File|", "Open";
pgettext "Preferences|Advanced|Policy", "Open";

In English, or if no translation can be found, the second argument (MSGID) is returned.
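
How context and message id together form one key can be sketched with a plain hash. GNU gettext joins the two with an EOT character ("\004") in compiled catalogs; the hash %catalog, the helper lookup_ctxt and the Spanish strings are hypothetical stand-ins for illustration and not part of Locale::Messages:

```perl
use strict;
use warnings;

# Hypothetical catalog: keys are context and message id joined by EOT.
my %catalog = (
    "File|\004Open"                       => 'Abrir',
    "Preferences|Advanced|Policy\004Open" => 'Abierto',
);

# Hypothetical stand-in for pgettext(): falls back to the message id.
sub lookup_ctxt {
    my ($msgctxt, $msgid) = @_;
    my $key = join("\004", $msgctxt, $msgid);
    return exists $catalog{$key} ? $catalog{$key} : $msgid;
}

print lookup_ctxt('File|', 'Open'), "\n";
print lookup_ctxt('Preferences|Advanced|Policy', 'Open'), "\n";
print lookup_ctxt('Toolbar|', 'Open'), "\n";   # context not in the catalog
```

The same MSGID yields a different result per context, and a context without an entry returns the MSGID unchanged.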