NAME
OpenOffice::Wordlist - Read/write OpenOffice.org wordlists
SYNOPSIS
This module allows reading and writing of OpenOffice.org wordlists (dictionaries).
For example:
use OpenOffice::Wordlist;
my $dict = OpenOffice::Wordlist->new;
$dict->read(".openoffice.org/3/user/wordlist/standard.dic");
# Print all words.
foreach my $word ( @{ $dict->words } ) {
    print $word, "\n";
}
# Add some words.
$dict->append( "openoffice", "great" );
# Write a new dictionary.
$dict->write("new.dic");
When used as a program, this module reads all dictionaries given on the command line and writes the resulting list of words to standard output. For example:
$ perl OpenOffice/Wordlist.pm standard.dic
METHODS
$dict = new( [ type => 'WBSWG6', language => 2057, neg => 0 ] )
Creates a new dict object.
Optional arguments:
type => 'WBSWG6' or 'WBSWG2' or 'WBSWG5'.
'WBSWG6' (default) indicates a UTF-8 encoded dictionary; the others indicate an ISO-8859-1 encoded dictionary.
language => code
The code for the language. I assume there's an extensive list of these codes somewhere. Some values determined experimentally:
255 All
1031 German (Germany)
1033 English USA
1036 French (France)
1043 Dutch (Netherlands)
2057 English UK
neg => 0 or 1
Whether the dictionary contains exceptions (neg = 1) or regular words (neg = 0).
If language and neg are not specified they are taken from the first file read, if any.
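The options above can also be assembled programmatically. The following is a minimal sketch, not part of the module: the %LANGUAGE_CODE hash, its locale-style keys, and the dict_options helper are hypothetical names chosen for illustration. The numeric codes are copied from the list above.

```perl
use strict;
use warnings;

# Language codes copied from the list above. The locale-style keys are
# an illustrative convention, not something the module knows about.
my %LANGUAGE_CODE = (
    'all'   => 255,
    'de-DE' => 1031,
    'fr-FR' => 1036,
    'nl-NL' => 1043,
);

# Hypothetical helper: build the argument list for new().
sub dict_options {
    my ( $lang, %opt ) = @_;
    die "Unknown language '$lang'\n" unless exists $LANGUAGE_CODE{$lang};
    return (
        type     => $opt{type} // 'WBSWG6',    # UTF-8 dictionary by default
        language => $LANGUAGE_CODE{$lang},
        neg      => $opt{neg} ? 1 : 0,         # 1 = exceptions dictionary
    );
}

my %args = dict_options( 'nl-NL', neg => 1 );
print join( ", ", map { "$_ => $args{$_}" } sort keys %args ), "\n";
# prints: language => 1043, neg => 1, type => WBSWG6

# With the module installed, the options are passed straight through:
#   my $dict = OpenOffice::Wordlist->new(%args);
```

Keeping the codes in one table avoids scattering magic numbers through calling code. Extend the table only with codes you have verified.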
$dict->read( $file )
Reads the contents of the indicated file.
$dict->append( @words )
Appends a list of words to the dictionary. To avoid unpleasant surprises, the words must be encoded in Perl's internal encoding.
The arguments may be constant strings or references to lists of strings.
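Words that arrive as raw bytes (from a file or the command line, say) must be decoded before they are appended. The sketch below uses the core Encode module. The decode_words helper is a hypothetical name, and the source encoding (UTF-8 here) is an assumption that the caller has to supply.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: decode raw octets into Perl character strings.
# The source encoding is an assumption the caller must supply.
sub decode_words {
    my ( $encoding, @octets ) = @_;

    # Die on malformed input instead of silently substituting characters.
    my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();
    return map { Encode::decode( $encoding, $_, $check ) } @octets;
}

# "caf\xC3\xA9" is the UTF-8 byte sequence for "café" (5 octets).
my @words = decode_words( 'UTF-8', "caf\xC3\xA9", "na\xC3\xAFve" );
printf "%d characters\n", length $words[0];    # prints: 4 characters

# With the module installed, either call form described above applies:
#   $dict->append(@words);
#   $dict->append( \@words );
```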
$dict->words
Returns a reference to the list of words in the dictionary.
The words are encoded in Perl's internal encoding.
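Because the return value is an array reference, it has to be dereferenced before it can be counted or iterated. The sketch below uses a hand-written stand-in list instead of a real dictionary, and sets an encoding layer on standard output so that non-ASCII words are written as UTF-8. The choice of UTF-8 for output is an assumption about the terminal.

```perl
use strict;
use warnings;

# Stand-in for the array reference returned by $dict->words.
my $words = [ "openoffice", "great", "caf\x{e9}", "na\x{ef}ve" ];

# The words are character strings, so encode them on output.
binmode STDOUT, ':encoding(UTF-8)';

# Dereference with @{ ... } to count or filter the list.
my $count = scalar @{$words};
my @short = grep { length($_) <= 5 } @{$words};

print "$count words, ", scalar(@short), " of them short:\n";
print "  $_\n" for sort @short;
```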
$dict->write( $file [ , $type ] )
Writes the contents of the object to a new dictionary.
Arguments: the name of the file to be written, and (optionally) the type of the file to be written (one of 'WBSWG6', 'WBSWG5', 'WBSWG2'), overriding the type of the dictionary as established at create time.
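Since the module does not check type arguments (see BUGS), a misspelled type name is not reported. A caller can guard against this before calling new or write. The following is a minimal sketch; check_type is a hypothetical helper, not part of the module.

```perl
use strict;
use warnings;

# The three type names listed in this document.
my %VALID_TYPE = map { $_ => 1 } qw( WBSWG2 WBSWG5 WBSWG6 );

# Hypothetical helper: return the type unchanged, or die if it is not
# one of the documented names.
sub check_type {
    my ($type) = @_;
    die "Unknown dictionary type\n" unless defined $type;
    die "Unknown dictionary type '$type'\n" unless $VALID_TYPE{$type};
    return $type;
}

print check_type('WBSWG5'), "\n";    # prints: WBSWG5

# A misspelling is caught before it reaches the module.
eval { check_type('WDSWG6'); 1 } or print "Rejected: $@";

# With the module installed:
#   $dict->write( "legacy.dic", check_type('WBSWG5') );
```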
EXAMPLE
This example reads all dictionaries that are supplied on the command line, merges them, and writes a new dictionary.
my $dict = OpenOffice::Wordlist->new( type => 'WBSWG6' );
$dict->read( shift );
foreach ( @ARGV ) {
    my $extra = OpenOffice::Wordlist->new->read($_);
    $dict->append( $extra->words );
}
$dict->write("new.dic");
Settings like the language and exceptions are copied from the file that is initially read.
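The example appends every word from every file. This document does not say whether duplicates are removed, so a word that occurs in two dictionaries may end up in the merged list twice. If that matters, the lists can be merged by hand first. The sketch below uses stand-in lists; merge_unique is a hypothetical helper, not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: merge several word lists, keeping the first
# occurrence of each word and preserving the original order.
sub merge_unique {
    my (@lists) = @_;
    my %seen;
    return [ grep { !$seen{$_}++ } map { @{$_} } @lists ];
}

# Stand-ins for the array references returned by $dict->words.
my $base  = [ "openoffice", "great" ];
my $extra = [ "great", "wordlist" ];

my $merged = merge_unique( $base, $extra );
print join( " ", @{$merged} ), "\n";    # prints: openoffice great wordlist

# With the module installed, a fresh dictionary could take the result:
#   my $out = OpenOffice::Wordlist->new( type => 'WBSWG6' );
#   $out->append($merged);
#   $out->write("new.dic");
```

Note that a dictionary created this way does not inherit the language and exception settings from a file, so they would have to be passed to new explicitly.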
AUTHOR
Johan Vromans, <jv at cpan.org>
BUGS
There is currently no checking of dictionary type arguments.
Please report any bugs or feature requests to bug-openoffice-wordlist at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=OpenOffice-Wordlist. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc OpenOffice::Wordlist
You can also look for information at:
RT: CPAN's request tracker
http://rt.cpan.org/NoAuth/Bugs.html?Dist=OpenOffice-Wordlist
CPAN Ratings
Search CPAN
COPYRIGHT & LICENSE
Copyright 2010 Johan Vromans, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.