NAME

Lingua::LO::Romanize - Romanization of Lao language

VERSION

Version 0.10

SYNOPSIS

This module romanizes Lao text using the BGN/PCGN standard from 1966 (with some modifications, see below).

    use Lingua::LO::Romanize;

    my $foo = Lingua::LO::Romanize->new(text => 'ພາສາລາວ');
    
    my $bar = $foo->romanize;           # $bar will hold the string 'phasalao'
    $bar = $foo->romanize(hyphen => 1); # $bar will hold the string 'pha-sa-lao'

DESCRIPTION

Lingua::LO::Romanize romanizes Lao text using the BGN/PCGN standard from 1966 (also known as the 'French style') with some modifications for post-revolutionary spellings (spellings introduced from 1975). One such modification is that Lao words have to be spelled out in full. For example, 'ສະຫວັນນະເຂດ' will be romanized correctly into 'savannakhét', while the older spelling 'ສວັນນະເຂດ' will not be romanized correctly because it omits characters.

Furthermore, 'ຯ' will be romanized to '...', Lao numbers will be 'romanized' to Arabic numbers (0, 1, 2, 3 etc.), and 'ໆ' will repeat the previous syllable. See below for more romanization rules.

Note that all characters are treated as UTF-8.

Romanization Rules

Consonants and vowels are generally romanized according to the following rules:

Consonants

ກ

initial and final position 'k'

ຂ

initial position 'kh'

ຄ

initial position 'kh'

ງ

initial and final position 'ng'

ຈ

initial position 'ch'

ສ

initial position 's'

ຊ

initial position 'x'

ຍ,ຽ

initial position 'gn', final position 'y'. Can also be a vowel. ຽ is not used in initial position.

ດ

initial position 'd', final position 't'

ຕ

initial position 't'

ຖ

initial position 'th'

ທ

initial position 'th'

ນ

initial and final position 'n'

ບ

initial position 'b', final position 'p'

ປ

initial position 'p'

ຜ

initial position 'ph'

ຝ

initial position 'f'

ພ

initial position 'ph'

ຟ

initial position 'f'

ມ

initial and final position 'm'

ຢ

initial position 'y'

ຣ,ຣ໌

initial and final position 'r'. ຣ໌ is rarely used, and only in the final position of words, for example 'ເບີຣ໌'.

ລ,◌ຼ

initial position 'l'

ວ

initial position 'v' or 'o', final position 'o', 'iou', or 'oua'. ວ can also be a vowel depending on its position. The character ວ at the beginning of a syllable should be romanized v. As the second character of a combination in initial position, ວ should be romanized o. The character ວ at the end of a syllable should be romanized in the following manner. The syllables ◌ິວ and ◌ີວ should be romanized iou. The syllable ◌ົວ (treated as a vowel) should be romanized oua. Otherwise, at the end of a syllable, ວ should be romanized o.

ຫ

initial position 'h'. At the beginning of a syllable, the character ຫ unaccompanied by a vowel or tone mark and occurring immediately before ຍ gn, ນ n, ມ m, ຣ r, ລ l, or ວ v should generally not be romanized. Note that the character combinations ຫນ, ຫມ and ຫລ are often written in abbreviated form: ໜ n, ໝ m, and ຫຼ l, respectively. For example, ແຫນ is romanized hèn, while ແໜ is romanized nè.

ອ

initial position '-'. ອ can also be a vowel. At the beginning of a word, ອ should not be romanized. At the beginning of a syllable within a word, ອ should be romanized as a hyphen.

ຮ

initial position 'h'
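The position-dependent values above lend themselves to a small lookup table. The following is a minimal sketch using core Perl only; it is not the module's implementation. The names %CONSONANT and romanize_consonant are hypothetical, and only a handful of consonants are included for illustration.

```perl
use strict;
use warnings;
use utf8;    # this source contains Lao literals

# Hypothetical lookup table covering a few of the consonants listed above.
# Entries without a 'final' key are consonants listed for initial position only.
my %CONSONANT = (
    'ກ' => { initial => 'k',  final => 'k'  },
    'ງ' => { initial => 'ng', final => 'ng' },
    'ດ' => { initial => 'd',  final => 't'  },
    'ຕ' => { initial => 't' },
    'ນ' => { initial => 'n',  final => 'n'  },
    'ບ' => { initial => 'b',  final => 'p'  },
    'ມ' => { initial => 'm',  final => 'm'  },
);

# Returns the romanization of $char in $position ('initial' or 'final'),
# or undef when the table has no entry for that combination.
sub romanize_consonant {
    my ($char, $position) = @_;
    my $entry = $CONSONANT{$char};
    return undef unless defined $entry;
    return $entry->{$position};
}

print romanize_consonant('ດ', 'initial'), "\n";    # d
print romanize_consonant('ດ', 'final'),   "\n";    # t
```

Keeping the initial and final values side by side makes it easy to see which consonants change in final position (ດ and ບ here) and which do not.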

Vowels

'◌' represents any consonant character.

◌ະ,◌ັ,◌າ,◌າ

a

◌ິ,◌ິ,◌ີ,◌ີ

i

◌ຶ,◌ຶ,◌ື,◌ື

u

◌ຸ,◌ຸ,◌ູ,◌ູ

ou

ເ◌ະ,ເ◌ັ,ເ◌,ເ◌

é

ແ◌ະ,ແ◌ັ,ແ◌,ແ◌

è

ໂ◌ະ,◌ົ,ໂ◌,ໂ◌

ô

ເ◌າະ,◌ັອ,◌ໍ,◌ອ

o

◌ົວະ,◌ັວ,◌ົວ,◌ວ

oua

ເ◌ັຽະ,◌ັຽ,ເ◌ັຽ,◌ຽ

ia

ເ◌ຶອະ,ເ◌ຶອ,ເ◌ືອ,ເ◌ືອ

ua

ເ◌ິະ,ເ◌ິ,ເ◌ີ,ເ◌ື

eu

ໄ◌,ໃ◌

ai

ເ◌ົາ

ao

◌ຳ

am

Tones

Tonal marks (່້໊໋) are not romanized.
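As an illustration of this rule, the four tone marks occupy the code points U+0EC8 to U+0ECB and can be dropped from a decoded string with a single transliteration. This is a sketch in core Perl, not the module's own code; strip_tone_marks is a hypothetical helper name.

```perl
use strict;
use warnings;
use utf8;    # this source contains Lao literals

# Delete the Lao tone marks (U+0EC8 to U+0ECB) from a decoded string.
sub strip_tone_marks {
    my ($text) = @_;
    $text =~ tr/\x{0EC8}-\x{0ECB}//d;
    return $text;
}

binmode STDOUT, ':encoding(UTF-8)';
# Prints the same word with its tone mark removed.
print strip_tone_marks('ເຂົ້າ'), "\n";
```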

Numbers

The Lao numbers ໐, ໑, ໒, ໓, ໔, ໕, ໖, ໗, ໘, and ໙ are romanized to the Arabic numbers 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9.
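The digit mapping can be sketched with a transliteration over the Lao digit range (U+0ED0 to U+0ED9). This is an illustrative core-Perl sketch, not the module's implementation; lao_digits_to_arabic is a hypothetical name.

```perl
use strict;
use warnings;
use utf8;    # this source contains Lao literals

# Map each Lao digit (U+0ED0 to U+0ED9) to the corresponding ASCII digit.
# All other characters are left unchanged.
sub lao_digits_to_arabic {
    my ($text) = @_;
    $text =~ tr/\x{0ED0}-\x{0ED9}/0-9/;
    return $text;
}

print lao_digits_to_arabic('໒໐໐໙'), "\n";    # 2009
```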

Special characters

ໆ is romanized to repeat the previous syllable, for example ແຊວໆ → xèoxèo.

ຯ (the Lao ellipsis) is 'romanized' to '...'
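The two special characters can be illustrated on a list of tokens that has already been split into romanized syllables and special marks. The splitting itself is assumed to have happened elsewhere, and expand_special is a hypothetical helper, not part of the module.

```perl
use strict;
use warnings;
use utf8;    # this source contains Lao literals

# Expand the special characters in a list of tokens. Each token is either an
# already romanized syllable, the repetition mark, or the Lao ellipsis.
sub expand_special {
    my @tokens = @_;
    my @out;
    for my $token (@tokens) {
        if ($token eq 'ໆ') {
            # Repeat the previous syllable, if there is one.
            push @out, $out[-1] if @out;
        }
        elsif ($token eq 'ຯ') {
            push @out, '...';
        }
        else {
            push @out, $token;
        }
    }
    return join '', @out;
}

binmode STDOUT, ':encoding(UTF-8)';
print expand_special('xèo', 'ໆ'), "\n";    # xèoxèo
```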

METHODS

new

Creates a new object. A Lao text string is required.

    my $foo = Lingua::LO::Romanize->new(text => 'ພາສາລາວ');

text

If a string is passed as an argument, it replaces the current text to be romanized.

    $foo->text('ເບຍ');

If no arguments are passed, an array reference of Lingua::LO::Romanize::Word objects for the current text is returned.

all_words

Returns an array reference of Lingua::LO::Romanize::Word objects for the current text.

romanize

Returns the current text as a romanized string. If hyphen is true, the syllables will be hyphenated.

    my $string = $foo->romanize;
    
    my $string_with_hyphen = $foo->romanize(hyphen => 1);

syllable_array

Returns the current text as an array of hash references. The key 'lao' represents the original syllable and 'romanized' the romanized syllable.

    foreach my $syllable ($foo->syllable_array) {
        my $lao_syllable = $syllable->{lao};
        my $romanized_syllable = $syllable->{romanized};
        ...
    }
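To show how this structure can be consumed, the sketch below joins the 'romanized' values of such hash references with an optional separator. The @syllables list is hand-written stand-in data in the documented shape (the syllable split shown is an assumption, not captured module output), and join_romanized is a hypothetical helper.

```perl
use strict;
use warnings;
use utf8;    # this source contains Lao literals

# Stand-in data shaped like the documented return value of syllable_array.
my @syllables = (
    { lao => 'ພາ',  romanized => 'pha' },
    { lao => 'ສາ',  romanized => 'sa'  },
    { lao => 'ລາວ', romanized => 'lao' },
);

# Join the romanized syllables with the given separator.
sub join_romanized {
    my ($separator, @items) = @_;
    return join $separator, map { $_->{romanized} } @items;
}

print join_romanized('',  @syllables), "\n";    # phasalao
print join_romanized('-', @syllables), "\n";    # pha-sa-lao
```

Working from the per-syllable hashes rather than the finished string keeps the original Lao syllable available next to its romanization, which helps when checking individual results.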

AUTHOR

Joakim Lagerqvist, <jokke at cpan.org>

BUGS

Please report any bugs or feature requests to bug-lingua-lo-romanize at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Lingua-LO-Romanize. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Lingua::LO::Romanize

COPYRIGHT & LICENSE

Copyright 2009 Joakim Lagerqvist, all rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.