NAME

Tree::Lexicon - Object class for storing and retrieving a lexicon in a tree of affixes

VERSION

Version 0.01

SYNOPSIS

    use Tree::Lexicon;

    my $lexicon = Tree::Lexicon->new();

    $lexicon->insert( 'apply', '', 'Apple', 'Windows', 'Linux', 'app', 'all day' );
    # Warns of strings not matching /^\w+/ without inserting

    if ($lexicon->contains( 'WiNdOwS' )) {
        $lexicon->remove( 'wInDoWs' );
        $lexicon->insert( 'Vista' );
    }
    
    my @words = $lexicon->vocabulary;
    # Same as:
    @words = ( 'Apple', 'Linux', 'Windows', 'app', 'apply' );

    @words = $lexicon->auto_complete( 'ap' );
    # Same as:
    @words = ( 'app', 'apply' );
    
    my $regexp = $lexicon->as_regexp();
    # Same as:
    $regexp = qr/\b(?:Apple|Linux|Windows|app(?:ly)?)\b/;

    my $caseless = Tree::Lexicon->new( 0 )->insert( 'apply', '', 'Apple', 'Windows', 'Linux', 'app', 'all day' );
    # Warns of strings not matching /^\w+/ without inserting

    if ($caseless->contains( 'WiNdOwS' )) {
        $caseless->remove( 'wInDoWs' );
        $caseless->insert( 'Vista' );
    }
    
    @words = $caseless->vocabulary;
    # Same as:
    @words = ( 'APP', 'APPLE', 'APPLY', 'LINUX', 'VISTA' );
    
    @words = $caseless->auto_complete( 'ap' );
    # Same as:
    @words = ( 'APP', 'APPLE', 'APPLY' );
    
    $regexp = $caseless->as_regexp();
    # Same as:
    $regexp = qr/\b(?:[Aa][Pp][Pp](?:[Ll](?:[Ee]|[Yy]))?|[Ll][Ii][Nn][Uu][Xx]|[Vv][Ii][Ss][Tt][Aa])\b/;
    
    use Tree::Lexicon qw( cs_regexp ci_regexp );
    
    my $cs_regexp = cs_regexp( @words );
    # Same as:
    $cs_regexp = Tree::Lexicon->new()->insert( @words )->as_regexp();
    
    my $ci_regexp = ci_regexp( @words );
    # Same as:
    $ci_regexp = Tree::Lexicon->new( 0 )->insert( @words )->as_regexp();

DESCRIPTION

The purpose of this module is to provide a simple and effective means to store a lexicon. It is intended to aid parsers in identifying keywords and interactive applications in identifying user-provided words.
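
As a rough illustration of the keyword-identification use case, the sketch below scans a line of text with a regular expression of the form returned by as_regexp. The pattern is copied as a literal from the SYNOPSIS so that the snippet runs without the module installed; the sample text and the variable names are made up for this example.

```perl
use strict;
use warnings;

# Literal copied from the SYNOPSIS; in real use it would come from
# $lexicon->as_regexp().
my $regexp = qr/\b(?:Apple|Linux|Windows|app(?:ly)?)\b/;

# Hypothetical input text.
my $text = 'Linux users apply patches; applesauce and Apple differ.';

# Collect every whole-word match in order of appearance.
my @found = $text =~ /($regexp)/g;

print join( ', ', @found ), "\n";    # Linux, apply, Apple
```

The \b anchors keep "applesauce" from matching, since neither "app" nor "apply" ends at a word boundary there.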

EXPORT

cs_regexp

Convenience function for generating a case-sensitive regular expression from a list of words.

    my $cs_regexp = cs_regexp( @words );
    # Same as:
    $cs_regexp = Tree::Lexicon->new( 1 )->insert( @words )->as_regexp();

ci_regexp

Convenience function for generating a case-insensitive regular expression from a list of words.

    my $ci_regexp = ci_regexp( @words );
    # Same as:
    $ci_regexp = Tree::Lexicon->new( 0 )->insert( @words )->as_regexp();
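
The case-insensitive form shown in the SYNOPSIS spells out each letter as a two-character class rather than relying on the /i modifier. The sketch below, which assumes that output shape and uses a hand-written literal in place of a call to ci_regexp, checks a few made-up candidate strings against it.

```perl
use strict;
use warnings;

# Literal in the per-letter character-class style shown in the SYNOPSIS;
# in real use it would come from ci_regexp() or as_regexp().
my $ci_regexp = qr/\b(?:[Aa][Pp][Pp](?:[Ll](?:[Ee]|[Yy]))?|[Ll][Ii][Nn][Uu][Xx]|[Vv][Ii][Ss][Tt][Aa])\b/;

# Hypothetical candidates in assorted letter case.
my @candidates = ( 'ApPlE', 'linux', 'VISTA', 'apples', 'Windows' );

# Keep only the candidates that match the pattern as a whole string.
my @matched = grep { /^$ci_regexp$/ } @candidates;

print "@matched\n";    # ApPlE linux VISTA
```

Because the case handling is built into the pattern itself, the expression should behave the same way when it is interpolated into a larger, case-sensitive pattern.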

METHODS

Passing a string not matching /^\w+/ as an argument to insert, remove, contains or auto_complete emits a warning on STDERR and has no other effect.

new

Returns a new empty Tree::Lexicon object. By default, the tree's contents are case-sensitive. Passing a single false argument to the constructor makes its contents case-insensitive.

    $lexicon = Tree::Lexicon->new();
    # Same as:
    $lexicon = Tree::Lexicon->new( 1 );
    
    # or #
    
    $lexicon = Tree::Lexicon->new( 0 );

insert

Inserts zero or more words into the lexicon tree and returns the object.

    $lexicon->insert( 'list', 'of', 'words' );

If you already have an initial list of words, you can chain this method onto the constructor.

    my $lexicon = Tree::Lexicon->new()->insert( @words );

remove

Removes zero or more words from the lexicon tree and returns them (or undef if not found).

    @removed = $lexicon->remove( 'these', 'words' );

contains

Returns 1 or '' for each word, indicating its presence or absence, respectively.

    @verify = $lexicon->contains( 'these', 'words' );

auto_complete

Returns all words beginning with the string passed.

    @words = $lexicon->auto_complete( 'a' );

vocabulary

Returns all words in the lexicon.

    @words = $lexicon->vocabulary();

as_regexp

Returns a regular expression equivalent to the lexicon tree. The regular expression has the form qr/\b(?: ... )\b/.

    $regexp = $lexicon->as_regexp();
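
The following is a minimal sketch of how a tree of characters can be flattened into a pattern of this shape. It is not the module's actual implementation: build_trie and trie_to_pattern are hypothetical helpers written for this illustration, and only the case-sensitive case with a non-empty word list is handled.

```perl
use strict;
use warnings;

# Hypothetical helper: store words in a nested hash, one level per
# character; the empty-string key marks the end of a word.
sub build_trie {
    my @words = @_;
    my %trie;
    for my $word (@words) {
        my $node = \%trie;
        for my $char ( split //, $word ) {
            $node->{$char} ||= {};
            $node = $node->{$char};
        }
        $node->{''} = 1;
    }
    return \%trie;
}

# Hypothetical helper: turn a trie node into a pattern fragment that
# matches every suffix stored below that node.
sub trie_to_pattern {
    my ($node) = @_;
    my @chars = grep { $_ ne '' } keys %$node;
    my @branches;
    for my $char ( sort @chars ) {
        push @branches, quotemeta($char) . trie_to_pattern( $node->{$char} );
    }
    return '' unless @branches;

    # A word ending at this node makes everything below it optional.
    my $optional = exists $node->{''};
    return $branches[0] if @branches == 1 && !$optional;

    my $group = '(?:' . join( '|', @branches ) . ')';
    return $optional ? "$group?" : $group;
}

my @words   = ( 'Apple', 'Linux', 'Windows', 'app', 'apply' );
my $pattern = trie_to_pattern( build_trie(@words) );
my $regexp  = qr/\b$pattern\b/;

print "$pattern\n";    # (?:Apple|Linux|Windows|app(?:ly)?)
```

Sharing common prefixes this way means "app" and "apply" collapse into the single branch app(?:ly)?, which matches the example output in the SYNOPSIS.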

AUTHOR

S. Randall Sawyer, <srandalls at cpan.org>

BUGS

Please report any bugs or feature requests to bug-tree-lexicon at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Tree-Lexicon. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Tree::Lexicon

You can also look for information at:

ACKNOWLEDGMENTS

This module's framework was generated with module-starter.

LICENSE AND COPYRIGHT

Copyright 2013 S. Randall Sawyer.

This program is free software; you can redistribute it and/or modify it under the terms of the Artistic License (2.0). You may obtain a copy of the full license at:

http://www.perlfoundation.org/artistic_license_2_0