Lingua::EN::ABC - American, British, and Canadian English
use Lingua::EN::ABC ':all';
my $colour = a2b ('color');
print "$colour\n";
produces output
colour
(This example is included as synopsis.pl in the distribution.)
This documents Lingua::EN::ABC version 0.12 corresponding to git commit 20d6292bafb5dd6d7e04bd1b24ffc119d01ab761 released on Tue Nov 16 17:19:21 2021 +0900.
This module offers functions to convert between the spellings and vocabulary of American, British, and Canadian versions of English.
The naming convention for the functions is "a" for American, "b" for British, "c" for Canadian, so "a2b" converts "American to British".
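Because the function names follow a regular pattern, a caller can choose a conversion at run time from two variety letters. The following is a minimal sketch, assuming the module is installed; convert_text and %convert are hypothetical names invented for this illustration and are not part of the module.

```perl
use strict;
use warnings;
use Lingua::EN::ABC ':all';

# Map each documented function name to its code reference.
my %convert = (
    a2b => \&a2b, b2a => \&b2a,
    a2c => \&a2c, c2a => \&c2a,
    b2c => \&b2c, c2b => \&c2b,
);

# Hypothetical helper (not part of Lingua::EN::ABC): build the
# function name from the two variety letters and call it.
# Any options, such as s => 1, are passed through unchanged.
sub convert_text
{
    my ($from, $to, $text, %options) = @_;
    # Nothing to do if source and target varieties are the same.
    return $text if $from eq $to;
    my $name = "${from}2$to";
    my $function = $convert{$name}
        or die "No conversion function called '$name'";
    return $function->($text, %options);
}

print convert_text ('a', 'b', 'color'), "\n";
```

The final line calls "a2b" on 'color', the same call as in the synopsis.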
my $british = a2b ('color'); # $british = 'colour'.
Convert American into British spellings. An option oxford controls whether to use Oxford spelling (realize rather than realise):
my $oxford_british = a2b ('realize', oxford => 1);
This does not convert words with different pronunciations or words which are completely different between American and British uses.
This cannot correctly convert ambiguous spellings like "program", which may be either "program" or "programme" in British English. See "BUGS". It tries to convert American formations like "gotten" into "got".
An option s, if true, results in a spelling-only conversion:
use utf8;
use Lingua::EN::ABC ':all';
print a2b ('aluminum airplane labor center pajamas'), "\n";
print a2b ('aluminum airplane labor center pajamas', s => 1), "\n";
produces output
aluminium aeroplane labour centre pyjamas
aluminum airplane labour centre pyjamas
(This example is included as alairlab.pl in the distribution.)
With the s option set, word pairs with differing pronunciations, like "burnt" and "burned", are not interchanged, and ambiguous word pairs, like "check" and "cheque", are also not interchanged.
my $american = b2a ('the colour of my pyjamas'); # $american = 'the color of my pajamas'
Convert British spellings into American spellings. This cannot convert British formations like "got" into "gotten" due to the grammatical ambiguity ("I've got a car" versus "I've gotten into an accident", or "I got into an accident").
An option s, if true, results in a spelling-only conversion. See "a2b".
my $canadian = a2c ('the color'); # $canadian = 'the colour'
Convert American to Canadian spelling. An option s, if true, results in a spelling-only conversion. See "a2b".
my $american = c2a ('the colour'); # $american = 'the color'
Convert Canadian to American spelling. An option s, if true, results in a spelling-only conversion. See "a2b".
my $canadian = b2c ('the programme'); # $canadian = 'the program'
Convert British to Canadian spelling. An option s, if true, results in a spelling-only conversion. See "a2b".
my $british = c2b ($canadian);
Convert Canadian to British spelling. An option oxford controls whether to use Oxford spelling (realize rather than realise):
my $oxford_british = c2b ($canadian, oxford => 1);
Carp is used to print errors.
JSON::Parse is used to read in the file of spelling data.
This is used to make a regular expression which converts the words from one form to another.
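The general idea of converting words with a regular expression built from a table of pairs can be sketched in plain Perl as follows. This is an illustration of the technique only, not the module's actual implementation: the three word pairs are a small sample taken from the examples in this document, and respell_sketch and %american2british are hypothetical names.

```perl
use strict;
use warnings;

# Sample word pairs for illustration only. The module itself reads
# its pairs from the data file in the distribution.
my %american2british = (
    color  => 'colour',
    center => 'centre',
    labor  => 'labour',
);

# Join the escaped keys into one alternation, longest first, so that
# a longer word is always tried before any shorter word it contains.
my $alternation = join '|',
    map { quotemeta ($_) }
    sort { length ($b) <=> length ($a) || $a cmp $b }
    keys %american2british;

# \b restricts matches to whole words.
my $re = qr/\b($alternation)\b/;

# Hypothetical helper: replace every matched word with its partner.
sub respell_sketch
{
    my ($text) = @_;
    $text =~ s/$re/$american2british{$1}/g;
    return $text;
}

print respell_sketch ('the color of the labor center'), "\n";
# prints "the colour of the labour centre"
```

The match in this sketch is case-sensitive, so a capitalized word such as "Color" in a title is left alone.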
This is the underlying data for this module, put into POD format so that it's easy to search and check.
respell is a tool to convert English text from one spelling system to another. It used to be at http://membled.com/work/apps/respell, but that web site had disappeared as of Tue Nov 16 17:19:21 2021 +0900.
There is a script called econv in the distribution which runs these functions on its command line. Please use econv --help for detailed usage instructions.
The data file provided with the distribution isn't intended to be human-edited. The master file containing the spelling variations is abc.txt in the top directory of the distribution. The comment at the top of that file describes its format. To add to this module's list of words, edit abc.txt and send a pull request on GitHub.
"Program" is used in British English for computer programs, whereas a theatre programme uses the -mme spelling.
For example, "a2c" will not convert "The Color Purple" or "The World Trade Center" into "The Colour Purple" or "The World Trade Centre". This is a feature as well as a bug, since proper names like movie titles or place names should not be respelt.
Please feel free to contribute. See "DATA FILE" for an easy way to contribute new items.
Up to version 0.05 of the module, the data about which words are ambiguous (vice/vise etc.) was not being put into the JSON data file, and yet the module was passing all its tests, so there cannot be any tests of this.
Additional word pairs: coloured, colouration, mouldy, vapourise, vapourisation.
Ambiguous spellings (check/cheque, meter/metre) no longer converted when using the s option.
Some pairs incorrectly marked as spelling-only (towards, mum) restored.
Plurals ending in s were added.
A list of words by Wikipedia user Ohconfucius was used in the preparation of the data. Nigel Horne (NJH) and Ed Avis (EDAVIS) contributed some word additions and other suggestions.
Ben Bullock, <bkb@cpan.org>
This package and associated files are copyright (C) 2013-2021 Ben Bullock.
You can use, copy, modify and redistribute this package and associated files under the Perl Artistic Licence or the GNU General Public Licence.
To install Lingua::EN::ABC, copy and paste the appropriate command into your terminal.
cpanm
cpanm Lingua::EN::ABC
CPAN shell
perl -MCPAN -e shell
install Lingua::EN::ABC
For more information on module installation, please visit the detailed CPAN module installation guide.