NAME

word - display words starting or matching a string or pattern

SYNOPSIS

word [options] [string | pattern]

Given a string, show all words starting with that string (look mode). Given a pattern, show all lines matching that pattern (grep mode).

An argument with non-alphabetic characters is always a pattern. To force grep mode, use --grep=pattern or start the pattern with a slash; the leading slash is dropped before matching.
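The mode-selection rule can be sketched in Python. This is only an illustration of the rule as described, not the program's actual code: choose_mode is a hypothetical helper, and "alphabetic" is assumed here to mean ASCII letters only.

```python
import re

def choose_mode(arg, force_grep=False):
    """Return (mode, text) for a command-line argument.

    Hypothetical helper; assumes "alphabetic" means ASCII letters only.
    """
    if force_grep:
        # Corresponds to --grep=pattern: always a pattern.
        return ("grep", arg)
    if arg.startswith("/"):
        # A leading slash forces grep mode and is not part of the pattern.
        return ("grep", arg[1:])
    if not re.fullmatch(r"[A-Za-z]+", arg):
        # Any non-alphabetic character makes the argument a pattern.
        return ("grep", arg)
    return ("look", arg)

for arg in ["cat", "/cat", r"\bcats?\b"]:
    print(arg, "->", choose_mode(arg))
```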

Use --man to get the full manpage.

DESCRIPTION

Search a large list of words in one of two modes. In look mode, only words starting with the given string are displayed. This mode runs very quickly. Only purely alphabetic strings are allowed. The system look(1) program is co-opted into helping.
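A prefix search over a sorted word list can use a binary search instead of a full scan, which is the technique look(1) relies on. A minimal Python sketch of the idea follows; the function name and the small in-memory list are illustrative stand-ins for the real word file, and plain code-point ordering is assumed.

```python
from bisect import bisect_left

def look(prefix, sorted_words):
    """Return all words in sorted_words that start with prefix."""
    # Binary search for the first position where prefix could appear.
    start = bisect_left(sorted_words, prefix)
    matches = []
    # Words sharing the prefix are contiguous in a sorted list.
    for word in sorted_words[start:]:
        if not word.startswith(prefix):
            break
        matches.append(word)
    return matches

words = sorted(["car", "cat", "catalog", "cats", "cow", "dog"])
print(look("cat", words))   # ['cat', 'catalog', 'cats']
```

Only the matching run of entries is ever visited, so the cost depends on the number of matches rather than on the size of the list.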

In grep mode, any entries matching the pattern are shown. This takes much longer to run, because the entire 26-megabyte file must be scanned. The pattern is a Perl pattern (see perlre(1)), not a grep(1) pattern. You may use Unicode named characters, plus several custom aliases, in your pattern.
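By contrast with a prefix lookup, a pattern search has to test every entry. The sketch below shows that linear scan in Python. It is a stand-in only: grep_entries and the sample entries are hypothetical, and Python's re module does not accept Perl-only syntax such as \p{Dash} or the custom \N{...} aliases.

```python
import re

def grep_entries(pattern, lines):
    """Yield every line in which pattern matches anywhere."""
    regex = re.compile(pattern)
    for line in lines:          # every line is examined, hence the cost
        if regex.search(line):
            yield line

entries = ["cat", "concatenate", "dog", "wildcat"]
print(list(grep_entries(r"cat", entries)))        # ['cat', 'concatenate', 'wildcat']
print(list(grep_entries(r"\bcats?\b", entries)))  # ['cat']
```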

EXAMPLES

Look up terms starting with "cat":

    % word cat

The same, but bump the verbose display level to see parts of speech:

    % word -v cat

Look at only verbs starting with "cat":

    % word -pv cat

Look at all "cat" entries, with verbose set high:

    % word -A cat

Look for all (irregular) plurals that start with "ex":

    % word -ppl ex

Look for obsolete prefixes that start with "s":

    % word -o -ppref s

Grep terms with "cat" anywhere at all:

    % word --grep cat
    % word /cat

Grep terms containing "cat" or "cats" surrounded by word boundaries:

    % word '\bcats?\b'

Grep terms with the Unicode "Mark" property:

    % word '\pM'

Grep all plurals ending in "-ata":

    % word -A -ppl 'ata\b'

Grep terms with the Unicode "Dash" property:

    % word '\p{Dash}'

Grep for an "e" with an acute accent:

    % word '\N{eacute}'

Grep for any acute accents no matter the letter:

    % word '\N{acute}'

Grep for terms containing an "a", "o", or "u" in any case, followed by a diaeresis:

    % word '(?i)[oau]\N{dier}'

OPTIONS

Display options are:

    --verbose / -v      use up to three times for more verbosity

        level 0 is just the word, like look
        level 1 includes parts of speech
        level 2 also includes assorted markings
        level 3 is the entire original entry 

    --nopager           never call the pager

Part of speech filtering options are:

    --pos   / -p POS    show only entries matching all listed POS
    --nopos / -P POS    show no entries matching any listed POS

    POS is a comma-separated list of parts of speech like
    n/noun, v/verb, a/adjective, adv/adverb, pro/pronoun, 
    and pl/plural.
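The two filters combine as "all of" and "none of". A small Python sketch of that logic, assuming each entry carries a set of part-of-speech tags (pos_filter and its arguments are hypothetical names, not part of the program):

```python
def pos_filter(entry_pos, require=(), exclude=()):
    """Return True if an entry passes the part-of-speech filters.

    require: tags the entry must ALL have (like --pos)
    exclude: tags the entry must have NONE of (like --nopos)
    """
    tags = set(entry_pos)
    has_all_required = set(require) <= tags
    has_no_excluded = tags.isdisjoint(exclude)
    return has_all_required and has_no_excluded

print(pos_filter({"n", "pl"}, require=["n", "pl"]))  # True
print(pos_filter({"n"}, require=["n", "pl"]))        # False
print(pos_filter({"n", "v"}, exclude=["v"]))         # False
```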

Type of entry filtering options are:

    --headwords      -h  show headwords only
    --everything     -a  include all types of entry
    --all-verbose    -A  all entries, plus sets verbose to 2

Some entries contain markings telling what kind it is. Include or exclude such entries using:

    --normal         -n  normal entries (on by default)
    --foreign        -f  unassimilated entries (on by default)

    --obsolete       -o  obsolete entries (off by default)
    --catachrestic   -e  catachrestic entries (off by default)
    --illustrations  -i  illustrative examples (off by default)
    --crossref       -x  crossrefs w/old spellings (off by default)

The previous six entry types can be excluded using the corresponding --noXXX long option or the capitalized short option; e.g., --noforeign is equivalent to -F.

Other options:

    --version           print version info and exit
    --help              this help page
    --man               the full manpage
    --debug             internal debugging

    --fuzzy          -z use agrep(1) fuzzy matching in "best mode"
    --all-fuzzy      -Z like -zavv

PATTERN SHORTCUTS

Besides all normal Perl pattern syntax, an extensive set of named characters is provided for mnemonic convenience, so you don't have to write numeric code points like \x{3b2} for non-ASCII characters.

  • The full Unicode name, like \N{EN DASH} or \N{LATIN SMALL LETTER THORN}, or Latin or Greek letter names, like \N{thorn} or \N{alpha}.

  • HTML abbreviations like \N{eacute}, \N{ccedil}, \N{iuml}.

  • Diacritic abbreviations: \N{macron}, \N{acute}, \N{grave}, \N{diaeresis}, \N{dier}, \N{circumflex}, \N{circ}, and \N{tilde}; \N{stress1} and \N{stress2}.

  • Abbreviations for the type of entry:

    \N{ali} (unassimilated), \N{obs} (obsolete), \N{xref} (cross-reference), \N{ill} (illustrative), \N{spu} (catachrestic), and \N{err} (erroneous).
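To make the first two kinds of name concrete, the following Python sketch resolves full Unicode names and HTML entity names to characters using Python's standard tables. It does not show how the program itself resolves names: resolve_name is a hypothetical helper, and the diacritic and entry-type aliases are specific to the program and are not covered.

```python
import unicodedata
from html.entities import name2codepoint

def resolve_name(name):
    """Map a character name to the character, or return None if unknown."""
    try:
        return unicodedata.lookup(name)       # full Unicode name
    except KeyError:
        pass
    if name in name2codepoint:                # HTML entity name
        return chr(name2codepoint[name])
    return None

print(resolve_name("EN DASH") == "\u2013")                   # True
print(resolve_name("LATIN SMALL LETTER THORN") == "\u00fe")  # True
print(resolve_name("eacute") == "\u00e9")                    # True
```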

ERRORS

TO BE WRITTEN: ERRORS

ENVIRONMENT

PAGER

FILES

words.utf8

PROGRAMS

look, agrep

BUGS

TO BE WRITTEN: BUGS

SEE ALSO

perlre(1), perlunicode(1)

AUTHOR

TO BE WRITTEN: AUTHOR

COPYRIGHT AND LICENCE

TO BE WRITTEN: COPYRIGHT AND LICENCE