WordList - Word lists
This document describes version 0.1.4 of WordList (from Perl distribution WordList), released on 2018-02-20.
Use one of the WordList::* modules.
WordList::* modules are modules that contain, well, lists of words. This module, WordList, serves as a base class and establishes conventions for such modules.
WordList is an alternative interface for Games::Word::Wordlist and Games::Word::Wordlist::*. The main difference is that WordList::* wordlists are read-only/immutable and the modules are designed to have low startup overhead. This makes them more suitable for use in CLI scripts, which often only want to pick a word from one or several lists.
Words (or phrases) must be put in the __DATA__ section, *sorted* ASCIIbetically (or by Unicode code point), one per line. Putting the wordlist in the __DATA__ section relieves perl from having to parse the list while loading the module. To search for words or pick some random words from the list, the module also need not slurp the whole list into memory (and will not do so unless explicitly instructed). Sorting makes it more convenient to diff different versions of the module and makes binary search possible.
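As a rough illustration of reading a list without slurping it, the sketch below streams words one line at a time and rewinds the handle between scans. It is not the module's own code: the helper name scan_words is hypothetical, and an in-memory filehandle stands in for the DATA handle of a real WordList::* module so that the snippet runs as a single script.

```perl
use strict;
use warnings;

# ASSUMPTION: in a real WordList::* module the words would follow a __DATA__
# marker and be read through the package's DATA filehandle. An in-memory
# filehandle stands in for it here.
my $data = "apple\nbanana\ncherry\n";
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";

# Remember where the list starts so the handle can be rewound for later scans.
my $start = tell $fh;

# Hypothetical helper: call $code for each word, holding only the current
# line in memory.
sub scan_words {
    my ($code) = @_;
    seek $fh, $start, 0 or die "cannot rewind: $!";
    while (defined(my $word = <$fh>)) {
        chomp $word;
        $code->($word);
    }
    return;
}

my $count = 0;
scan_words(sub { $count++ });

my $longest = '';
scan_words(sub { $longest = $_[0] if length($_[0]) > length($longest) });

print "$count words, longest: $longest\n";
```

Recording the start position with tell and returning to it with seek is what lets the same handle serve several scans.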
Since this is a new interface that is not backward-compatible with Games::Word::Wordlist, I also make some other changes:
Namespace is put outside Games::
Because obviously word lists are not only useful for games.
Interface is simpler
This is partly due to the list being read-only. The methods provided are just:
- pick (pick one or several random entries)
- word_exists (check whether a word is in the list)
- each_word (run code for each entry)
- all_words (return all the words in a list)
A couple of other functions might be added, with careful consideration.
Namespace is more language-neutral and not English-centric
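The four methods listed above (pick, word_exists, each_word, all_words) might be exercised as in the following sketch. The class Demo::WordList is a hypothetical, array-backed stand-in written only so that the snippet runs without installing anything. It is not a real WordList::* module and does not reflect how the real modules store or read their words.

```perl
use strict;
use warnings;

# HYPOTHETICAL stand-in for a WordList::* module (array-backed for brevity).
package Demo::WordList;

my @WORDS = qw(apple banana cherry date elderberry);   # already sorted

sub new       { my ($class) = @_; return bless {}, $class }
sub all_words { return @WORDS }

sub each_word {
    my ($self, $code) = @_;
    $code->($_) for @WORDS;      # the word is passed as the first argument
    return;
}

sub word_exists {
    my ($self, $word) = @_;
    return scalar grep { $_ eq $word } @WORDS;   # simple linear search
}

sub pick {
    my ($self, $n) = @_;
    $n = 1 unless defined $n;
    my @pool = @WORDS;
    my @picked;
    # Draw without replacement until $n words are picked or the pool is empty.
    while (@pool && @picked < $n) {
        push @picked, splice(@pool, int(rand @pool), 1);
    }
    return @picked;
}

package main;

my $wl = Demo::WordList->new;

my @two        = $wl->pick(2);
my $has_cherry = $wl->word_exists('cherry') ? 1 : 0;
my $count      = 0;
$wl->each_word(sub { $count++ });
my @all        = $wl->all_words;

print "picked: @two\n";
print "cherry exists: $has_cherry, total words: $count\n";
```

With an installed WordList::* module, only the class name in the call to new would be expected to change.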
TODOS:
Interface for random pick from a subset
Pick $n words of length $L.
Pick $n words matching regex $re.
Interface to enable faster lookup/caching
Constructor.
Call $code for each word in the list. The code will receive the word as its first argument.
Pick $n (default: 1) random words from the list. If there are fewer than $n words in the list, only that many will be returned.
The algorithm used is from perlfaq (see perldoc -q "random line"), which scans the whole list once. The algorithm as given there returns a single entry; it is modified here to support returning multiple entries.
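One common way to extend the single-line technique to several entries is reservoir sampling, sketched below. This is an assumption about the general approach and may differ in detail from the modification the module actually uses. The helper name pick_from_handle is hypothetical, and an in-memory filehandle stands in for the word list.

```perl
use strict;
use warnings;

# Hypothetical helper: pick up to $n random lines from $fh in a single pass
# (reservoir sampling). Not the module's own code.
sub pick_from_handle {
    my ($fh, $n) = @_;
    $n = 1 unless defined $n;
    my @picked;
    my $seen = 0;
    while (defined(my $line = <$fh>)) {
        chomp $line;
        $seen++;
        if (@picked < $n) {
            push @picked, $line;              # fill the reservoir first
        } else {
            my $j = int(rand($seen));         # uniform in 0 .. $seen - 1
            $picked[$j] = $line if $j < $n;   # replace with probability $n / $seen
        }
    }
    return @picked;
}

# ASSUMPTION: an in-memory filehandle stands in for a module's word list.
my $data = join '', map { "$_\n" } qw(apple banana cherry date elderberry);

open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
my @three = pick_from_handle($fh, 3);
close $fh;

# Asking for more words than the list holds returns only what is there.
open $fh, '<', \$data or die "cannot open in-memory handle: $!";
my @many = pick_from_handle($fh, 10);
close $fh;

print "three: @three\n";
print "asked for 10, got ", scalar(@many), "\n";
```

Because each line is examined once and only the reservoir is kept, memory use depends on $n rather than on the size of the list.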
Check whether $word is in the list.
The intended algorithm is binary search (NOTE: not yet implemented; a linear search is currently used).
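For reference, a binary search over a sorted list could look like the sketch below. It is only an illustration of the technique: the helper name word_exists_sorted is hypothetical, and it assumes the words are already held in an array sorted with Perl's default string comparison, whereas the module reads its words from the __DATA__ section.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 if $word is in the sorted array referenced by
# $words, else 0. ASSUMPTION: the array is sorted with cmp (no locale in effect).
sub word_exists_sorted {
    my ($words, $word) = @_;
    my ($lo, $hi) = (0, $#$words);
    while ($lo <= $hi) {
        my $mid = int(($lo + $hi) / 2);
        my $cmp = $word cmp $words->[$mid];
        return 1 if $cmp == 0;
        if ($cmp < 0) { $hi = $mid - 1 }   # target sorts before the midpoint
        else          { $lo = $mid + 1 }   # target sorts after the midpoint
    }
    return 0;
}

my @words   = sort qw(date apple elderberry cherry banana);
my $found   = word_exists_sorted(\@words, 'cherry');
my $missing = word_exists_sorted(\@words, 'coconut');

print "cherry: $found, coconut: $missing\n";
```

Each comparison halves the remaining range, which is why the sorted-order requirement matters for this kind of lookup.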
Return all the words in a list.
Please visit the project's homepage at https://metacpan.org/release/WordList.
Source repository is at https://github.com/perlancar/perl-WordList.
Please report any bugs or feature requests on the bugtracker website https://rt.cpan.org/Public/Dist/Display.html?Name=WordList
When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.
WordListC is just like WordList except that it does not impose the ASCIIbetical sort order requirement. You can sort the wordlist in whatever order you need.
Bencher::Scenario::GamesWordlistModules
Bencher::Scenario::WordListModules
perlancar <perlancar@cpan.org>
This software is copyright (c) 2018, 2017, 2016 by perlancar@cpan.org.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.