Lingua::LO::Romanize - Romanization of Lao language
Version 0.10
This module romanizes Lao text using the BGN/PCGN standard from 1966 (with some modifications, see below).
    use Lingua::LO::Romanize;

    my $foo = Lingua::LO::Romanize->new(text => 'ພາສາລາວ');

    my $bar = $foo->romanize;             # $bar will hold the string 'phasalao'
    $bar = $foo->romanize(hyphen => 1);   # $bar will hold the string 'pha-sa-lao'
Lingua::LO::Romanize romanizes Lao text using the BGN/PCGN standard from 1966 (also known as the 'French style'), with some modifications for post-revolutionary spellings (spellings introduced from 1975). One such modification is that Lao words have to be spelled out in full. For example, 'ສະຫວັນນະເຂດ' will be romanized correctly as 'savannakhét', while the older spelling 'ສວັນນະເຂດ' will not be romanized correctly because it lacks the necessary characters.
Furthermore, 'ຯ' will be romanized to '...', Lao numbers will be 'romanized' to Arabic numbers (0, 1, 2, 3 etc.), and 'ໆ' will repeat the previous syllable. See below for more romanization rules.
Note that all characters are treated as UTF-8.
Consonants and vowels are generally romanized according to the following rules:
initial and final position 'k'
initial position 'kh'
initial and final position 'ng'
initial position 'ch'
initial position 's'
initial position 'x'
initial position 'gn', final position 'y'. Could also be a vowel. ຽ is not used in initial position
initial position 'd', final position 't'
initial position 't'
initial position 'th'
initial position 'th'
initial and final position 'n'
initial position 'b', final position 'p'
initial position 'p'
initial position 'ph'
initial position 'f'
initial position 'f'
initial and final position 'm'
initial position 'y'
initial and final position 'r'. ຣ໌ is rarely used, and only in the final position of words, for example 'ເບີຣ໌'
initial position 'l'
initial position 'v' or 'o', final position 'o', 'iou', or 'oua'. ວ can also be a vowel depending on its position. The character ວ at the beginning of a syllable should be romanized v. As the second character of a combination in initial position, ວ should be romanized o. The character ວ at the end of a syllable should be romanized in the following manner. The syllables ◌ິ ວ and ◌ີ ວ should be romanized iou. The syllable ◌ົ ວ (treated as a vowel) should be romanized oua. Otherwise, at the end of a syllable, ວ should be romanized o.
initial position 'h'. At the beginning of a syllable, the character ຫ unaccompanied by a vowel or tone mark and occurring immediately before ຍ gn, ນ n, ມ m, ຣ r, ລ l, or ວ v should generally not be romanized. Note that the character combinations ຫນ, ຫມ and ຫລ are often written in abbreviated form: ໜ n, ໝ m, and ຫຼ l, respectively. ແຫນ is romanized to hèn and ແໜ is romanized to nè.
initial position '-'. ອ can also be a vowel. At the beginning of a word, ອ should not be romanized. At the beginning of a syllable within a word, ອ should be romanized by a hyphen.
initial position 'h'
'◌' represents any consonant character.
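The rule for ວ at the end of a syllable can be sketched as a small lookup on the character that precedes it. This is only an illustration of the rule as stated above, not the module's implementation: the function name `romanize_final_vo` and the table `%VO_RHYME` are hypothetical, and the sketch assumes tone marks have already been removed from the syllable.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical table (not part of Lingua::LO::Romanize): vowel signs that
# change how a syllable-final VO (U+0EA7) is romanized.
my %VO_RHYME = (
    "\x{0EB4}" => 'iou',    # vowel sign I  + VO
    "\x{0EB5}" => 'iou',    # vowel sign II + VO
    "\x{0EBB}" => 'oua',    # mai kon       + VO
);

# Returns the romanization contributed by a syllable-final VO:
# 'iou' or 'oua' for the special rhymes, 'o' otherwise.
# Returns undef if the syllable does not end in VO.
sub romanize_final_vo {
    my ($syllable) = @_;
    return unless $syllable =~ /(.?)\x{0EA7}\z/s;
    my $preceding = $1;
    return $VO_RHYME{$preceding} // 'o';
}

binmode STDOUT, ':encoding(UTF-8)';
# U+0EA5 U+0EB2 U+0EA7 is the syllable romanized 'lao' in the synopsis.
print romanize_final_vo("\x{0EA5}\x{0EB2}\x{0EA7}"), "\n";
```

In the 'iou' and 'oua' cases the returned string covers the vowel sign and ວ together; in the default case it covers only ວ, and the preceding vowel is romanized separately.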
a
i
u
ou
é
è
ô
o
oua
ia
ua
eu
ai
ao
am
Tonal marks (່້໊໋) are not romanized.
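As a minimal sketch of what "not romanized" means here, the four tone marks (U+0EC8 to U+0ECB) can simply be deleted before any other rule is applied. The helper name `strip_tone_marks` is hypothetical and is not part of the module's API; how the module handles tone marks internally is not described above.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper (not part of Lingua::LO::Romanize):
# deletes the Lao tone marks U+0EC8..U+0ECB from a decoded string.
sub strip_tone_marks {
    my ($text) = @_;
    $text =~ tr/\x{0EC8}-\x{0ECB}//d;
    return $text;
}

binmode STDOUT, ':encoding(UTF-8)';
# A syllable carrying the tone mark U+0EC9 after its first consonant.
print strip_tone_marks("\x{0E82}\x{0EC9}\x{0EAD}\x{0E8D}"), "\n";
```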
The Lao numbers ໐, ໑, ໒, ໓, ໔, ໕, ໖, ໗, ໘, and ໙ are romanized to the Arabic numbers 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9.
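Because the ten Lao digits occupy the contiguous range U+0ED0 to U+0ED9, the mapping above can be illustrated with a single transliteration. The helper name `lao_digits_to_arabic` is hypothetical; this is a sketch of the rule, not necessarily how the module implements it.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper (not part of Lingua::LO::Romanize):
# maps Lao digits U+0ED0..U+0ED9 to ASCII 0..9 and leaves
# every other character unchanged.
sub lao_digits_to_arabic {
    my ($text) = @_;
    $text =~ tr/\x{0ED0}-\x{0ED9}/0-9/;
    return $text;
}

binmode STDOUT, ':encoding(UTF-8)';
print lao_digits_to_arabic("\x{0ED2}\x{0ED0}\x{0ED0}\x{0ED9}"), "\n";
```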
ໆ is romanized to repeat the previous syllable, for example ແຊວໆ → xèoxèo.
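The repetition rule can be sketched on a list of already-romanized syllables in which ໆ (U+0EC6) still appears as its own token. The function name `expand_repetition` is hypothetical, and dropping a repetition sign that has no preceding syllable is an assumption made for this sketch, not documented behaviour of the module.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper (not part of Lingua::LO::Romanize).
# Takes a list of tokens and replaces each repetition sign (U+0EC6)
# with a copy of the token emitted just before it.
sub expand_repetition {
    my @tokens = @_;
    my @out;
    for my $token (@tokens) {
        if ($token eq "\x{0EC6}") {
            # Assumption: a repetition sign with nothing before it is dropped.
            push @out, $out[-1] if @out;
        }
        else {
            push @out, $token;
        }
    }
    return @out;
}

binmode STDOUT, ':encoding(UTF-8)';
print join('', expand_repetition('xèo', "\x{0EC6}")), "\n";
```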
ຯ (the Lao ellipsis) is 'romanized' to '...'
Creates a new object. A Lao text string is required.
my $foo = Lingua::LO::Romanize->new(text => 'ພາສາລາວ');
If a string is passed as an argument, it becomes the text to romanize.
$foo->text('ເບຍ');
If no arguments are passed, an array reference of Lingua::LO::Romanize::Word from the current text will be returned.
Will return an array reference of Lingua::LO::Romanize::Word from the current text.
Returns the current text as a romanized string. If hyphen is true, the syllables will be hyphenated.
    my $string = $foo->romanize;
    my $string_with_hyphen = $foo->romanize(hyphen => 1);
Returns the current text as an array of hash references. The key 'lao' represents the original syllable and 'romanized' the romanized syllable.
    foreach my $syllable ($foo->syllable_array) {
        my $lao_syllable       = $syllable->{lao};
        my $romanized_syllable = $syllable->{romanized};
        ...
    }
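To show how the 'lao' and 'romanized' keys might be used together, the sketch below joins the romanized parts with hyphens. So that it runs without the module installed, `@syllables` is hand-written sample data in the shape described above; it is assumed, not verified, to match what `syllable_array` would return for the text 'ພາສາລາວ'.

```perl
use strict;
use warnings;
use utf8;

# Hand-written stand-in for the list returned by $foo->syllable_array
# (assumed shape: one hash reference per syllable).
my @syllables = (
    { lao => 'ພາ',  romanized => 'pha' },
    { lao => 'ສາ',  romanized => 'sa'  },
    { lao => 'ລາວ', romanized => 'lao' },
);

# Rebuild a hyphenated romanization from the per-syllable data.
my $hyphenated = join '-', map { $_->{romanized} } @syllables;

binmode STDOUT, ':encoding(UTF-8)';
print "$hyphenated\n";
```

With real data, `@syllables` would be filled from `$foo->syllable_array` instead of the literal list.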
Joakim Lagerqvist, <jokke at cpan.org>
Please report any bugs or feature requests to bug-lingua-lo-romanize at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Lingua-LO-Romanize. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
You can find documentation for this module with the perldoc command.
perldoc Lingua::LO::Romanize
You can also look for information at:
RT: CPAN's request tracker
http://rt.cpan.org/NoAuth/Bugs.html?Dist=Lingua-LO-Romanize
AnnoCPAN: Annotated CPAN documentation
http://annocpan.org/dist/Lingua-LO-Romanize
CPAN Ratings
http://cpanratings.perl.org/d/Lingua-LO-Romanize
Search CPAN
http://search.cpan.org/dist/Lingua-LO-Romanize/
Copyright 2009 Joakim Lagerqvist, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.