NAME

Lingua::LO::NLP::Analyze - Analyze a Lao syllable and provide accessors to its constituents

FUNCTION

Objects of this class represent a Lao syllable with an analysis of its constituents. After passing a valid syllable to the constructor, the parts are available via accessor methods as outlined below.

METHODS

new

new( $syllable, %options )

The constructor takes a syllable and any number of options as hash-style arguments. The only option specified so far is normalize, a boolean value indicating whether to run the syllable through Unicode::Normalize::NFC and tone mark normalization (see "normalize_tone_marks" in Lingua::LO::NLP::Data). Set this if you are unsure that your text is well-formed according to Unicode rules.
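A minimal usage sketch of the constructor and the accessors documented below follows. describe_syllable is a hypothetical helper introduced only for illustration, the sample syllable is arbitrary, and the demo is guarded so that it does nothing harmful when the module is not installed.

```perl
use strict;
use warnings;
use utf8;                                  # this source contains a Lao literal
binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper: return the documented accessors of one syllable
# as a list of name/value pairs.
sub describe_syllable {
    my ($syllable) = @_;
    require Lingua::LO::NLP::Analyze;
    # normalize => 1 requests NFC and tone mark normalization, as described above
    my $analysis = Lingua::LO::NLP::Analyze->new($syllable, normalize => 1);
    my @names = qw(syllable consonant vowel end_consonant tone_mark
                   semivowel h vowel_length live tone);
    return map { my $name = $_; ($name => scalar $analysis->$name) } @names;
}

# Only run the demo when the module can be loaded.
if (eval { require Lingua::LO::NLP::Analyze; 1 }) {
    my %parts = describe_syllable('ເຂົ້າ');
    for my $name (sort keys %parts) {
        printf "%-14s %s\n", $name, $parts{$name} // '(undef)';
    }
}
else {
    print "Lingua::LO::NLP::Analyze is not installed\n";
}
```

Forcing scalar context on each accessor call keeps the name/value pairs aligned even when an accessor has nothing to return.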

ACCESSORS

syllable

The original syllable as used by the parser. This may be subtly different from the one passed to the constructor:

  • If the normalize option was set, tone marks and vowels may have been reordered

  • If the decomposed form of LAO VOWEL SIGN AM (◌ຳ) is used, it will have been converted to the composed form

  • Combinations of ຫ with ລ, ມ or ນ will have been converted to the combined characters
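The last two rewrites in the list above can be approximated with plain substitutions. This is only a sketch under stated assumptions, not the module's actual code: compose_syllable is a hypothetical helper, the decomposed AM is assumed to be NIGGAHITA directly followed by VOWEL SIGN AA (an intervening tone mark is not handled), and the "combined" form for ລ is assumed to be ຫ followed by LAO SEMIVOWEL SIGN LO.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: mimics the documented rewrites
# for a single syllable.
sub compose_syllable {
    my ($s) = @_;
    # NIGGAHITA (U+0ECD) + VOWEL SIGN AA (U+0EB2) -> VOWEL SIGN AM (U+0EB3)
    $s =~ s/\x{0ECD}\x{0EB2}/\x{0EB3}/g;
    # HO SUNG (U+0EAB) + NO (U+0E99) -> HO NO (U+0EDC)
    $s =~ s/\x{0EAB}\x{0E99}/\x{0EDC}/g;
    # HO SUNG (U+0EAB) + MO (U+0EA1) -> HO MO (U+0EDD)
    $s =~ s/\x{0EAB}\x{0EA1}/\x{0EDD}/g;
    # Assumption: HO SUNG + LO (U+0EA5) -> HO SUNG + SEMIVOWEL SIGN LO (U+0EBC)
    $s =~ s/\x{0EAB}\x{0EA5}/\x{0EAB}\x{0EBC}/g;
    return $s;
}

# ຫ + ນ + tone mark MAI THO + AA, written with two separate consonants
my $composed = compose_syllable("\x{0EAB}\x{0E99}\x{0EC9}\x{0EB2}");
```

Comparing the return value with the input is a cheap way to see whether a syllable was written in one of the non-canonical forms.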

parse

A hash of raw constituents as returned by the parsing regexp. The other accessors present the constituents in a more accessible way and take care of morphological special cases like the treatment of ຫ, but this hash may come in handy for quick checks, e.g. whether there was a vowel component before the core consonant.

vowel

The syllable's vowel or diphthong. Since most vowels consist of more than one code point, the consonant position is represented by the Unicode character designated for this purpose, DOTTED CIRCLE (U+25CC).
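Because the placeholder marks the consonant position, a vowel string can be recombined with a consonant by a single substitution. The following is a minimal sketch: fill_vowel is a hypothetical helper, not part of the module, and the sample vowel string is an assumption about the kind of value this accessor returns.

```perl
use strict;
use warnings;

my $DOTTED_CIRCLE = "\x{25CC}";

# Hypothetical helper: put a consonant where the placeholder sits.
sub fill_vowel {
    my ($vowel, $consonant) = @_;
    (my $filled = $vowel) =~ s/\x{25CC}/$consonant/;
    return $filled;
}

# Vowel written around the placeholder: U+0EC0, DOTTED CIRCLE, U+0EBB, U+0EB2
my $vowel    = "\x{0EC0}${DOTTED_CIRCLE}\x{0EBB}\x{0EB2}";
# Substitute the consonant KHO SUNG (U+0E82) for the placeholder
my $syllable = fill_vowel($vowel, "\x{0E82}");
```

The same idea works in reverse: splitting the vowel string on U+25CC yields the parts written before and after the consonant.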

consonant

The syllable's core consonant.

end_consonant

The end consonant if present, undef otherwise.

tone_mark

The tone mark if present, undef otherwise.

semivowel

The semivowel following the core consonant if present, undef otherwise.

h

"ຫ" if the syllable contained a combining ຫ, i.e. one that isn't the core consonant.

vowel_length

The string 'long' or 'short'.

live

Boolean indicating whether this is a "live" or a "dead" syllable. Dead syllables end in a short vowel or stopped consonant (ກ, ດ or ບ); live ones end in a long vowel, diphthong, semivowel or nasal consonant. This is used for tone determination but is also available as an attribute, just in case it might be useful. A true value indicates a live syllable.
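The rule stated above can be sketched as a small stand-alone function. This is an illustration under assumptions, not the module's implementation: is_live and its diphthong flag are hypothetical, and a final semivowel is assumed to be passed as end_consonant.

```perl
use strict;
use warnings;

# Stopped end consonants: KO (U+0E81), DO (U+0E94), BO (U+0E9A)
my %STOPPED = map { $_ => 1 } ("\x{0E81}", "\x{0E94}", "\x{0E9A}");

# Hypothetical helper: returns 1 for a live syllable, 0 for a dead one.
# Named arguments: vowel_length ('long' or 'short'), end_consonant (or undef),
# diphthong (true if the vowel is a diphthong).
sub is_live {
    my (%p) = @_;
    if (defined $p{end_consonant}) {
        # Any end consonant other than a stop is taken to be a nasal or semivowel.
        return $STOPPED{ $p{end_consonant} } ? 0 : 1;
    }
    return 1 if $p{diphthong};
    return $p{vowel_length} eq 'long' ? 1 : 0;
}

my $dead = is_live(vowel_length => 'long',  end_consonant => "\x{0E81}");  # ends in a stop
my $live = is_live(vowel_length => 'short', end_consonant => "\x{0E99}");  # ends in a nasal
```

Checking the end consonant first reflects that a final consonant, when present, decides liveness regardless of vowel length.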

tone

One of the following strings, depending on core consonant class, vowel length and tone mark:

LOW_RISING
LOW
MID
HIGH
MID_FALLING
HIGH_FALLING

The last two occur with short vowels, the others with long vowels.