26 Jun 2002 15:14:39 UTC
- Distribution: Lingua-Zompist-Barakhinei
- Module version: 0.02
Lingua::Zompist::Barakhinei - Inflect Barakhinei nouns, verbs, and adjectives
This document refers to version 0.02 of Lingua::Zompist::Barakhinei, released on 2002-06-26.
  use Lingua::Zompist::Barakhinei;
  $i_am = Lingua::Zompist::Barakhinei::demeric('eza')->[0];

  use Lingua::Zompist::Barakhinei ':all';
  $i_am = demeric('eza')->[0];

  use Lingua::Zompist::Barakhinei qw( demeric scrifel );
  $you_know = demeric("shkriv\xea", 1)->[1];
  $they_had = scrifel("ten\xea", 1)->[5];
  # note "\xea" = e with circumflex

  # nouns and pronouns
  $word = noun('belu', 'masc', 'beluri');  # nouns
  $word = noun("s\xfb");  # pronouns ("\xfb" is u with circumflex: su^)
  $word = noun('mukh');
  # in general
  $word = noun( NOUN [, GENDER [, PLURAL ] ] );

  # adjectives
  $word = adj("kh\xf4t\xea");  # adjectives (kho^te^)

  # verbs
  # note: "ibr\xea" is ibre^
  $word = demeric("ibr\xea", 1);    # present
  $word = scrifel("ibr\xea", 1);    # past
  $word = izhcrifel("ibr\xea", 1);  # past anterior
  $word = budemeric("ibr\xea", 1);  # present subjunctive
  $word = buscrifel("ibr\xea", 1);  # past subjunctive
  $word = befel("ibr\xea", 1);      # imperative
  $word = part("ibr\xea", 1);       # participles
  # in general
  $word = FUNC( VERB [, CLASS ] );

  # Setting inflection tables
  # nouns
  $Lingua::Zompist::Barakhinei::gendertab = \%mygendertab;
  $Lingua::Zompist::Barakhinei::pluraltab = \%mypluraltab;
  # verbs
  $Lingua::Zompist::Barakhinei::classtab = \%myclasstab;
  # ones that you will probably not need as often
  $Lingua::Zompist::Barakhinei::rootconstab = \%myrootconstab;
  $Lingua::Zompist::Barakhinei::subjtab = \%mysubjtab;
  $Lingua::Zompist::Barakhinei::cadhctab = \%mycadhctab;
  $Lingua::Zompist::Barakhinei::cadhgtab = \%mycadhgtab;
  $Lingua::Zompist::Barakhinei::cadhutab = \%mycadhutab;
Lingua::Zompist::Barakhinei is a module which allows you to inflect Barakhinei words. You can conjugate verbs and decline nouns, pronouns, and adjectives.
There is one function to inflect nouns and pronouns, and another to inflect adjectives. Verbs are covered by several functions: one for each tense or mood and another for the participles.
Lingua::Zompist::Barakhinei exports no functions by default, in order to avoid namespace pollution. This enables, for example, Lingua::Zompist::Barakhinei and Lingua::Zompist::Cadhinor to be used in the same program, since otherwise many of the function names would clash. However, all functions listed here can be imported explicitly by naming them, or they can be imported all together by using the tag ':all'.
This module expects input to be in iso-8859-1 (Latin-1) and will return output in that character set as well. For example, lelcê (meaning to see) should have a byte with the value 234 as the last character, and its accusative, lelcâ, will have a byte with the value 226 as its last character.
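As a minimal sketch of what this means in practice, the following uses the core Encode module to turn a Perl character string into the Latin-1 bytes described above, and to re-encode UTF-8 input before it is handed to the module. Nothing here calls Lingua::Zompist::Barakhinei itself, and the variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# "lelc\x{ea}" is the character string lelce^ (e with circumflex is U+00EA).
my $word = "lelc\x{ea}";

# Encode to iso-8859-1: one byte per character.
my $latin1    = encode('iso-8859-1', $word);
my $last_byte = ord(substr($latin1, -1));
print "last byte: $last_byte\n";    # prints "last byte: 234"

# If the data arrives as UTF-8 bytes, decode first and then re-encode.
my $utf8_bytes = encode('UTF-8', $word);
my $for_module = encode('iso-8859-1', decode('UTF-8', $utf8_bytes));
print length($utf8_bytes), " UTF-8 bytes, ",
      length($for_module), " Latin-1 bytes\n";
# prints "6 UTF-8 bytes, 5 Latin-1 bytes"
```

The same two-step conversion, run in reverse, would turn the module's Latin-1 output back into UTF-8 for display.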
In the future, this module may expect and produce the charset used by the Maraille font. At that point, the module Lingua::Zompist::Convert is expected to be available, which should be able to convert between that charset and standard charsets such as iso-8859-1 and utf-8.
The noun function allows you to inflect nouns and pronouns. It takes up to three arguments:
The noun or pronoun to inflect.
(optional) The gender of the noun (one of 'masc', 'neut', or 'fem'), or undef for the function to guess. (This can remain
(optional) The (nominative) plural of the noun, or undef for the function to guess.
In Barakhinei, it is necessary to know the singular, the gender, and the plural of a noun in order to inflect it correctly. However, if you do not know the gender or the plural form, you can pass undef to this function in its place, and the function will attempt to guess based on a built-in list of nouns.
noun returns an arrayref on success, or undef or the empty list on failure (for example, because it could not determine which declension or gender a noun belonged to).
In general, the arrayref will have seven elements, in the following order: nominative singular, accusative singular, dative singular, genitive singular, nominative plural, accusative/dative plural, genitive plural. In some cases, some of those elements may be undef; the most common case is when you ask for the declension of a plural personal pronoun such as ta or kêt.
If you use a singular personal pronoun as input to this function, you will get back an arrayref with seven elements, corresponding to both singular and plural forms of the pronoun. Note that this will cause the accusative/dative distinction to be thrown away in the plural forms, since nouns make no such distinction! So it is better to input the plural form separately to get the full set of forms.
(This behaviour may change in the future. I'm not sure whether dropping one form is the right thing to do... singular pronouns may end up returning only the first four elements filled.)
If you use a plural personal pronoun as input to this function, only the first four elements will be filled (with the plural forms) and the last three elements will be undef. This appears to be more DWIMmish (at least, it is for me -- I've used ta, for example, as input and wondered why it was being treated as a noun rather than as a personal pronoun).
The genitive form of sû, lê, ta, mukh, and kâ will be returned in parentheses to show that it is a regular adjective and not an undeclined genitive form.
The reflexive pronouns are listed under the pseudo-nominative forms zê and za; in the return list, the nominative forms will be the empty string.
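To make the slot order concrete, here is a small sketch that attaches case labels to an arrayref shaped like the one noun returns for an ordinary noun. The helper name label_noun_forms and the placeholder strings are assumptions for illustration (they are not real Barakhinei forms); in real use the arrayref would come from a call such as noun('belu', 'masc', 'beluri').

```perl
use strict;
use warnings;

# Slot order as documented for the arrayref returned by noun().
my @SLOTS = (
    'nom sg', 'acc sg', 'dat sg', 'gen sg',
    'nom pl', 'acc/dat pl', 'gen pl',
);

# Hypothetical helper: pair each defined slot with its label.
sub label_noun_forms {
    my ($forms) = @_;
    return unless $forms;                   # noun() reported failure
    my @pairs;
    for my $i ( 0 .. $#SLOTS ) {
        next unless defined $forms->[$i];   # some slots may be undef
        push @pairs, [ $SLOTS[$i], $forms->[$i] ];
    }
    return \@pairs;
}

# Placeholder data with the documented seven-element shape.
my $forms = [ map { "FORM$_" } 1 .. 7 ];

my $labelled = label_noun_forms($forms);
printf "%-10s %s\n", @$_ for @$labelled;
```

These labels only fit ordinary nouns and singular pronouns. For a plural personal pronoun the first four slots hold plural forms, so a different label set would be needed.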
The adj function inflects adjectives. It expects two arguments:
The adjective to be inflected
(optional) The root consonant in the oblique forms (for example, for na "north", which has the root nan- in the oblique forms, pass in 'n'). If you pass in undef for this argument or simply leave it out, the function will attempt to guess whether the adjective has a different oblique stem (using "$rootconstab").
adj returns an arrayref on success and undef or the empty list on failure.
The arrayref will itself contain three arrayrefs, each with seven elements. The first arrayref will contain the masculine forms, the second arrayref the neuter forms, and the third arrayref the feminine forms. The forms are in the same order as in the arrayref returned by the noun function. Briefly, this order is nominative - accusative - dative - genitive in the singular and nominative - accusative/dative - genitive in the plural.
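The nested layout can be sketched as follows. The helper adj_form, the key names in %GENDER and %SLOT, and the placeholder table are all assumptions made for illustration; only the index order (gender first, then case and number) comes from the description above.

```perl
use strict;
use warnings;

# Outer index: gender. Inner index: same order as the noun() arrayref.
my %GENDER = ( masc => 0, neut => 1, fem => 2 );
my %SLOT   = (
    nom_sg => 0, acc_sg    => 1, dat_sg => 2, gen_sg => 3,
    nom_pl => 4, accdat_pl => 5, gen_pl => 6,
);

# Hypothetical helper: pick one form out of an adj()-shaped table.
sub adj_form {
    my ( $table, $gender, $slot ) = @_;
    return unless $table && exists $GENDER{$gender} && exists $SLOT{$slot};
    return $table->[ $GENDER{$gender} ][ $SLOT{$slot} ];
}

# Placeholder table in which every cell records its own position.
my $table = [];
for my $g ( 0 .. 2 ) {
    for my $s ( 0 .. 6 ) {
        $table->[$g][$s] = "g$g/s$s";
    }
}

print adj_form( $table, 'fem', 'gen_sg' ), "\n";    # prints "g2/s3"
```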
This function should determine the declension of an adjective automatically.
There is currently no function which returns the declension of an adjective (partly because the matter is so simple -- declension I adjectives end in -C or have an extra oblique stem consonant, declension II adjectives end in -ê, and declension III adjectives end in -i); however, if there is popular demand for such a function it could be quickly added.
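Until such a function exists, the rule in the parenthesis can be approximated in a few lines. This is only a sketch of that rule, not code from the module: the name adj_declension is hypothetical, and the set of vowel letters (a e i o u and their circumflexed Latin-1 forms) is an assumption about the orthography.

```perl
use strict;
use warnings;

# Assumption: anything other than these letters counts as a consonant.
my $ENDS_IN_CONSONANT = qr/[^aeiou\xe2\xea\xee\xf4\xfb]\z/;

# Hypothetical helper: returns 1, 2, or 3, or nothing if the rule
# described above does not decide the matter.
sub adj_declension {
    my ( $adj, $rootcons ) = @_;
    return 1 if defined $rootcons && length $rootcons;  # extra oblique consonant
    return 2 if $adj =~ /\xea\z/;                       # ends in e-circumflex
    return 3 if $adj =~ /i\z/;                          # ends in -i
    return 1 if $adj =~ $ENDS_IN_CONSONANT;             # ends in a consonant
    return;
}

print adj_declension("kh\xf4t\xea"), "\n";    # prints 2
print adj_declension( 'na', 'n' ), "\n";      # prints 1
```

Without the 'n' argument the sketch cannot classify na, which is the situation "$rootconstab" exists to handle.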
The demeric function conjugates a verb in the present tense. It takes two arguments:
The verb to be conjugated
(optional) The declension of the verb as an integer (only strictly necessary for verbs in -ê, which can be first, third, or fifth declension, corresponding to Cadhinor verbs in -EC, -EN, and -ER)
demeric returns an arrayref on success and undef or the empty list on failure.
The arrayref will contain six elements, in the following order: first person singular ("I"), second person singular ("thou"), third person singular ("he/she/it"), first person plural ("we"), second person plural ("[all of] you"), third person plural ("they").
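The six-element layout lends itself to a simple paradigm printer. The sketch below assumes a helper named paradigm_lines and uses placeholder strings instead of real verb forms; with the module installed, the arrayref would come from a call such as demeric("ibr\xea", 1).

```perl
use strict;
use warnings;

# Person labels in the documented order.
my @PERSONS = (
    '1sg (I)',  '2sg (thou)',         '3sg (he/she/it)',
    '1pl (we)', '2pl ([all of] you)', '3pl (they)',
);

# Hypothetical helper: one "label form" line per person.
sub paradigm_lines {
    my ($forms) = @_;
    return unless $forms && @$forms == @PERSONS;
    my @lines;
    for my $i ( 0 .. $#PERSONS ) {
        push @lines, sprintf '%-20s %s', $PERSONS[$i], $forms->[$i];
    }
    return @lines;
}

# Placeholder data standing in for the arrayref a verb function returns.
my $forms = [qw( FORM1 FORM2 FORM3 FORM4 FORM5 FORM6 )];

print "$_\n" for paradigm_lines($forms);
```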
The scrifel function conjugates a verb in the past tense. It is otherwise similar to the function demeric.
The izhcrifel function conjugates a verb in the past anterior tense. It is otherwise similar to the function demeric.
The budemeric function conjugates a verb in the present subjunctive. It is otherwise similar to the function demeric.
The name derives from the Cadhinor grammar terms buprilise "remote" and demeric "present", since the Barakhinei subjunctive mood is derived from the Cadhinor remote forms of a verb.
The buscrifel function conjugates a verb in the past subjunctive. It is otherwise similar to the function demeric.
The name derives from the Cadhinor grammar terms buprilise "remote" and scrifel "past", since the Barakhinei subjunctive mood is derived from the Cadhinor remote forms of a verb.
The befel function conjugates a verb in the imperative. It is otherwise similar to the function demeric.
The first and fourth elements of the arrayref will be empty, since Barakhinei has no first person imperative, neither singular nor plural.
The part function returns the two participles of a verb. It takes the verb and declension number (compare "demeric") as arguments and returns an arrayref (in scalar context) or a list (in list context) of two elements: the present participle and the past participle. On failure, this function returns undef or the empty list.
Specifically, the form returned for each participle is the masculine nominative singular form of the participle (which is the citation form). Since participles decline like regular adjectives (with an oblique stem consonant of 'l' in the case of participles in -u), the other forms of the participles may be obtained by calling the adj function, if desired.
Since inflection in Barakhinei usually cannot be determined by the ending alone, this module makes use of lookup tables to provide additional information. For example, nouns ending in a consonant can be masculine, feminine, or neuter; if the gender is not passed explicitly to the "noun" function, that function first attempts to look up the gender in a table, and if that fails, it attempts to guess the gender from the ending. The same applies to verb inflections and to the plurals of nouns.
This section describes the various lookup tables which the module uses to perform its inflection tasks. All the tables described here can be overridden from the outside; this is most useful for $gendertab, $pluraltab, and $classtab, which do not come pre-filled since they would be fairly large.
It is up to you how you fill those tables -- you can leave them empty, the way they come, and explicitly pass the necessary information to each function; you can fill the tables from a hash which you initialise statically in your code; you can read in the data from a file each time; or you could use a tied hash (say, a DBM file). The last can be useful if you only want to make a couple of requests and don't want to load the entire database into memory; simply tie the data to a hash in your program and assign a reference to that hash to the appropriate variable.
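The tied-hash option might look like the sketch below, which uses the core SDBM_File module. The file location and variable names are assumptions; the single entry is the $classtab example given elsewhere in this document. A real program would load Lingua::Zompist::Barakhinei first; that line is left out here only so that the sketch runs without the module installed.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use SDBM_File;
use File::Temp qw(tempdir);

# A throwaway directory keeps the sketch self-contained; a real program
# would point at a persistent DBM file instead.
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/classtab";

my %classes;
tie %classes, 'SDBM_File', $path, O_RDWR | O_CREAT, 0666
    or die "Cannot tie $path: $!";

# Populate the DBM file once ("hab\xea" is habe^ in Latin-1).
$classes{"hab\xea"} = 5;

# Hand a reference to the tied hash to the module's package variable.
{
    no warnings 'once';
    $Lingua::Zompist::Barakhinei::classtab = \%classes;
}

print $Lingua::Zompist::Barakhinei::classtab->{"hab\xea"}, "\n";    # prints 5
```

Lookups then go to the DBM file on demand, so the whole table never has to be held in memory.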
Sample tables, generated programmatically from baralex.htm as of 2002-05-29 and hand-massaged slightly afterwards, are included as tab-separated value files: class.tsv, gender.tsv, and plural.tsv. It will be trivial to convert those to any representation you desire. There may also be other tab-separated value files in the distribution; have a look. Their purpose should be obvious from the filename.
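One possible conversion is sketched below. It assumes that each line of a .tsv file holds a key and a value separated by a single tab, which is a guess about the layout, so check the files themselves. The function name load_tsv is hypothetical, and an in-memory string stands in for the real file.

```perl
use strict;
use warnings;

# Hypothetical loader: assumes each line is "key<TAB>value".
sub load_tsv {
    my ($fh) = @_;
    my %table;
    while ( my $line = <$fh> ) {
        $line =~ s/\r?\n\z//;          # strip the line ending
        next unless length $line;      # skip blank lines
        my ( $key, $value ) = split /\t/, $line, 2;
        next unless defined $value;    # skip lines without a tab
        $table{$key} = $value;
    }
    return \%table;
}

# In-memory stand-in for a file such as gender.tsv.
my $sample = "san\tneut\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $gendertab = load_tsv($fh);
close $fh;

print "san is $gendertab->{san}\n";    # prints "san is neut"
```

With a real file you would open it in raw mode (for example with the '<:raw' layer) so that the Latin-1 bytes reach the module unchanged, and then assign the resulting hashref to the matching table variable such as $Lingua::Zompist::Barakhinei::gendertab.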
These are the lookup tables which are used by the program and which can be influenced from outside:
$gendertab is a hashref whose keys are nouns and whose values are one of 'masc', 'fem', or 'neut'. This is used to determine the gender of nouns. For example:
san => 'neut',
indicating that the noun san is neuter.
$pluraltab is a hashref whose keys are nouns and whose values are the plural form of the noun. For example:
ibor => 'ibro',
indicating that the (nominative) plural of the noun ibor is ibro.
$classtab is a hashref whose keys are verbs and whose values are the declension number. First declension verbs end in -ê and derive from Cadhinor verbs in -EC; second declension verbs end in -a and derive from Cadhinor verbs in -AN; third declension verbs end in -ê and derive from Cadhinor verbs in -EN; fourth declension verbs end in -i and derive from Cadhinor verbs in -IR; fifth declension verbs end in -ê and derive from Cadhinor verbs in -ER.
Strictly speaking, entries in this hashref are necessary only for first and fifth declension verbs; second and fourth declension verbs can be identified by their endings alone, and verbs ending in -ê are taken to be third declension if no other declension is specified.
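Read as a procedure, the two paragraphs above amount to roughly the following. This is a sketch of the documented rules under the assumption that a table entry takes precedence over the ending; it is not the module's own code, and guess_class is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the rules described above.
sub guess_class {
    my ( $verb, $classtab ) = @_;
    return $classtab->{$verb}
        if $classtab && defined $classtab->{$verb};    # explicit entry wins
    return 2 if $verb =~ /a\z/;       # -a: second declension
    return 4 if $verb =~ /i\z/;       # -i: fourth declension
    return 3 if $verb =~ /\xea\z/;    # -e^ with no entry: assume third
    return;
}

my %classtab = ( "hab\xea" => 5 );

print guess_class( "hab\xea", \%classtab ), "\n";    # prints 5
print guess_class( 'laoda',   \%classtab ), "\n";    # prints 2
# Not in the table, so this falls back to 3, although the synopsis
# passes class 1 for this verb explicitly:
print guess_class( "ibr\xea", \%classtab ), "\n";    # prints 3
```

The last call shows why the table (or the explicit class argument) matters for verbs in -ê: the ending alone cannot tell first, third, and fifth declension apart.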
An example entry is
"hab\xea" => 5,
indicating that the verb habê is a fifth declension verb. (In your source code, you'd probably write
$rootconstab is a hashref whose keys are adjectives and whose values are the extra consonant which is added to the end in the oblique forms, for first declension adjectives such as na, nan-. This would be listed as
na => 'n',
You may not need to add to this table, as there aren't that many of these adjectives, and the ones listed in baralex.htm as of 2002-05-29 should already be in the module.
$subjtab is a hashref whose keys are verbs and whose values are the subjunctive forms of those verbs. This is used for verbs which use a different subjunctive stem (derived from Cadhinor verbs with a separate remote stem), for example
laoda => 'loda',
which indicates that the subjunctive stem of laoda is lod-. As indicated in the example, the final letter of the subjunctive stem should be the same as that of the normal infinitive; effectively, it is as if the subjunctive of those verbs is the indicative of another verb.
You may not need to add to this table, as there aren't that many of these verbs, and the ones listed in baralex.htm as of 2002-05-29 should already be in the module.
$cadhctab is a hashref whose keys are verbs which derive from a Cadhinor verb with a -C- stem consonant. The value is not used (but it is a good idea to have the value be true; for example, you could use the Cadhinor infinitive). This is used because verbs deriving from Cadhinor verbs in -C- suffer consonant changes in some forms. Compare "$cadhgtab".
You will probably not need to add to or replace this table.
$cadhgtab is a hashref whose keys are verbs which derive from a Cadhinor verb with a -G- stem consonant. The value is not used (but it is a good idea to have the value be true; for example, you could use the Cadhinor infinitive). This is used because verbs deriving from Cadhinor verbs in -G- suffer consonant changes in some forms. Compare "$cadhctab".
You will probably not need to add to or replace this table.
$cadhutab is a hashref whose keys are verbs which derive from a Cadhinor verb with a -U- in the last syllable of the verb stem. The value is not used (but it is a good idea to have the value be true; for example, you could use the Cadhinor infinitive). This is used because verbs deriving from Cadhinor verbs with -U- suffer vowel changes in some forms. Compare "$cadhctab" and "$cadhgtab".
This module should handle irregular words correctly. However, if a word is inflected incorrectly, please notify me by email. (Since Barakhinei has all sorts of funky sound changes, I wouldn't be surprised if this module makes mistakes! However, I think it handles correctly all the examples on the web page as of 2002-05-29.)
However, please make sure that you have checked against a current version of http://www.zompist.com/bara.htm or that you asked Mark Rosenfelder himself; the grammar occasionally changes as small errors are found or words change.
Flesh out the dictionary from baralex.htm.
document masculines & feminines in -u (decline like adjectives)
test masculines & feminines in -u (e.g. rizundu = m/f, klâtandu = m, redêlu = f)
test adjectives in -â: mudrâ, shkrâ
test verbs with different subjunctive stems
If you use this module, I'd appreciate it if you drop me a line at the email address in "AUTHOR", just so that I have an idea of how many people use this module at all. Also, if you have any comments, feel free to email me.
Philip Newton, <firstname.lastname@example.org>
(This is basically the BSD licence.)
Copyright (C) 2002 by Philip Newton. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.