Thomas Thurman

NAME

bernard - alphabet remix

AUTHOR

Thomas Thurman <thomas@thurman.org.uk>

SYNOPSIS

  bernard <source> -o <target>

DESCRIPTION

bernard takes files written in the conventional alphabet and returns them written in some other alphabet.

At present, only the Shavian alphabet is supported.

SWITCHES

-o <filename>, --output <filename>

Select output file. If this is not specified, the output is written to the standard output.

-s <alphabet>, --script <alphabet>

Select the target alphabet, using its ISO 15924 code. The code is not case-sensitive. The only arguments currently accepted are "Shaw", which represents the Shavian alphabet, and "Latn", which leaves the input text unchanged.

-S <alphabet>, --source <alphabet>

Specifies the alphabet of the source document. The default is "Latn". The source alphabet is not detected automatically, because the use-cases are so different. The value is not case-sensitive, and only "Latn" and "Shaw" are allowed. Selecting "Shaw" will allow you to transliterate a document in Shavian into another alphabet, for example Deseret, once such an alphabet is supported.

If "Shaw" is selected, this has the additional effect of causing every stanza in a .po file to be transliterated, not only the fuzzy and empty ones. It also disables the --in-place switch.

Selecting the same source and target alphabet is a valid choice, but means that there will be no change between input and output.

It is currently an error to select "Shaw" as the source alphabet and "Latn" as the target alphabet. In other words, you can't yet undo a transliteration into Shavian. This may be added one day.

This entire option is not yet implemented.
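The rules for -s and -S given above can be summarised in a short Python sketch. This is only an illustration and not bernard's own code; the names ACCEPTED, normalise_script and check_direction are hypothetical, and the target alphabet is required here because this page does not state a default for it.

```python
ACCEPTED = ("Latn", "Shaw")  # the only codes this page says are accepted


def normalise_script(code):
    """Fold case so that 'shaw', 'SHAW' and 'Shaw' all mean 'Shaw'."""
    canonical = code.capitalize()
    if canonical not in ACCEPTED:
        raise ValueError(f"unsupported alphabet: {code!r}")
    return canonical


def check_direction(target, source="Latn"):
    """Validate a source/target pair; the source defaults to 'Latn'."""
    source = normalise_script(source)
    target = normalise_script(target)
    # Shavian back to Latin is documented as an error for now.
    if source == "Shaw" and target == "Latn":
        raise ValueError("cannot yet undo a transliteration into Shavian")
    return source, target
```

Selecting the same alphabet on both sides passes the check, matching the note above that it is valid but changes nothing.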

-n <file>, --names <file>

This switch only makes sense with gettext .po files. It means that the msgids in the file are not English strings, but identifiers, and that the English strings are in the .po file whose name is supplied. This is often found in Nokia catalogues.

This is not yet implemented.

-c, --check

Runs the resulting file through "msgfmt -c" to check its validity.

-i, --in-place

This writes the output file over the top of the input file.

This switch is only useful with gettext .po files. It is disabled for other filetypes because it would be dangerous: you would lose the original text.

-a, --armour

This replaces Shavian letters in the output with their traditional ASCII equivalents. It is disabled for other alphabets. Latin-alphabet letters found in the text are retained as they are, so the result will be ambiguous if the output would ordinarily contain Latin-alphabet letters as well.

This is not currently implemented.

The inverse operation is obtained by using -m unarmour.

-D, --shift-down

This is a nasty hack. It shifts the letters of the output alphabet down so that they begin at codepoint 128. It is needed because of shortcomings in the UTF-8 decoding of some programs, in cases where you cannot use -a because you need to include characters from both alphabets. You will, of course, need a special font with the relevant glyphs at these non-standard positions.

This is not currently implemented.
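The remapping described above can be sketched in Python. This is only an illustration, since bernard does not implement the switch yet; the function name shift_down is hypothetical, and the sketch assumes that the output alphabet is Shavian, whose Unicode block runs from U+10450 to U+1047F.

```python
SHAVIAN_START = 0x10450  # first codepoint of the Unicode Shavian block
SHAVIAN_END = 0x1047F    # last codepoint of the Unicode Shavian block
SHIFT_BASE = 128         # target position named in the description above


def shift_down(text):
    """Move Shavian letters to codepoints starting at 128; keep the rest."""
    shifted = []
    for ch in text:
        cp = ord(ch)
        if SHAVIAN_START <= cp <= SHAVIAN_END:
            shifted.append(chr(cp - SHAVIAN_START + SHIFT_BASE))
        else:
            shifted.append(ch)
    return "".join(shifted)
```

Latin-alphabet letters pass through untouched, which is what allows both alphabets to appear in the same output.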

-e <text>, --expression <text>

Transliterates the given expression. The result is output before the transliteration of any file.

-U, --update

Checks to see whether there's an updated version of the Shavian set used for transliteration, and downloads it if there is.

This is not currently implemented.

-m <magic>, --magic <magic>

Selects an alternative mode of operation. The default is single, which behaves as described above. Other values have other effects, described in "MAGIC MODES", below.

-p, --apostrophe

George Bernard Shaw believed that apostrophes, which he called "uncouth bacilli", were redundant. In honour of this opinion, the -p option strips apostrophes from the transliterated output where they occur within words. The rare apostrophes at the beginnings or endings of words (as in 'tis) will not be stripped, in case you use them for quotation marks.

This is not currently implemented.
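The rule for which apostrophes are stripped can be sketched with a regular expression in Python. This is an illustration only, not bernard's code; the names INNER_APOSTROPHE and strip_inner_apostrophes are hypothetical, and the sketch assumes that only the ASCII apostrophe is in question.

```python
import re

# An apostrophe is "within a word" when a word character sits on both sides.
# Python's \w also matches Shavian letters, since they are Unicode letters.
INNER_APOSTROPHE = re.compile(r"(?<=\w)'(?=\w)")


def strip_inner_apostrophes(text):
    """Remove apostrophes inside words; keep leading and trailing ones."""
    return INNER_APOSTROPHE.sub("", text)
```

An apostrophe at the start or end of a word has a word character on one side only, so it is left alone and can still serve as a quotation mark.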

-D, --define

This allows you to define the Shavian spelling of a word temporarily. Its argument is the Latin-alphabet spelling, followed by an equals sign, followed by the Shavian spelling. In case you cannot type Shavian letters, you may use the standard ASCII-armouring. For example, to cause the word "of" to be written out in full, rather than as a single-letter abbreviation, use -Dof=ov.

This is not currently implemented.
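The argument format described above can be parsed as in the following Python sketch. It is an illustration only, as the switch is not implemented; the function name parse_definition is hypothetical, and splitting at the first equals sign is an assumption.

```python
def parse_definition(argument):
    """Split 'latin=shavian' at the first equals sign."""
    latin, separator, shavian = argument.partition("=")
    if not separator or not latin or not shavian:
        raise ValueError(f"expected LATIN=SHAVIAN, got {argument!r}")
    return latin, shavian
```

For the example above, parse_definition("of=ov") gives the pair ("of", "ov"), where "ov" is the ASCII-armoured Shavian spelling.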

MAGIC MODES

These are selected using the -m or --magic switch.

single

This is the default, and behaves as described above.

gnome

In this mode, the sole non-option argument should be the name of a Shavian .po file. The master template for that package will be downloaded and merged with the .po file, the transliterations will be updated, and the result will then be run through msgfmt -c to check it.

Alternatively, the non-option argument may be the name of a directory. Each subdirectory of this directory should contain a GNOME package, which contains a file po/en@shaw.po. Each of these files will be acted on as described in the previous paragraph.
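The directory layout expected in this mode can be illustrated with a short Python sketch that collects the files to act on. It is not bernard's own code; the function name find_shavian_catalogues is hypothetical, and silently skipping subdirectories that lack the file is an assumption.

```python
from pathlib import Path


def find_shavian_catalogues(top):
    """Return po/en@shaw.po from each immediate subdirectory that has one."""
    found = []
    for package in sorted(Path(top).iterdir()):
        catalogue = package / "po" / "en@shaw.po"
        if package.is_dir() and catalogue.is_file():
            found.append(catalogue)
    return found
```

Each path returned would then be handled exactly as a single .po file is handled in this mode.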

unarmour

This undoes the effect of the -a or --armour switch. The single non-option argument is a file, which is output verbatim except that characters from the Latin alphabet are replaced with the Shavian letters they stand for in the traditional ASCII mapping.

This is not currently implemented.

BUGS

Probably many.

Code to update the Shavian transliteration of Firefox exists, but has not yet been merged into bernard. It will be merged at some point.

It will also be possible later to translate Qt's .ts files.

Code to handle .srt subtitle files exists, but has not yet been merged.

bernard does not handle any alphabets other than Shavian and the conventional alphabet. At least Deseret will be added.

There are several other planned features which are as yet unimplemented.

COPYRIGHT

This Perl module is copyright (C) Thomas Thurman, 2010. This is free software, and can be used/modified under the same terms as Perl itself.