The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Tree::BK - Structure for efficient fuzzy matching

VERSION

version 0.02

SYNOPSIS

        use Tree::BK;
        my $tree = Tree::BK->new();
        $tree->insert(qw(cuba cubic cube cubby thing foo bar));
        $tree->find('cube', 1); # cuba, cube
        $tree->find('cube', 2); # cuba, cubic, cube, cubby

DESCRIPTION

The Burkhard-Keller, or BK tree, is a structure for efficiently performing fuzzy matching. By default, this module assumes string input and uses "distance" in Text::Levenshtein::XS to compare items and build the tree. However, a subroutine giving the distance between two tree members may be provided, making this structure more generally usable.

METHODS

new

 Tree::BK->new(\&metric);

Creates a new instance of Tree::BK. A metric may be provided as an argument. It should be a subroutine which takes two tree members as arguments and returns a positive integer indicating the distance between them. If no metric is provided, then the tree members are assumed to be strings, and "distance" in Text::Levenshtein::XS is used as the metric.

insert

Inserts an object into the tree. Returns nothing if the object was already in the tree, or the object if it was added to the tree.

insert_all

Inserts all of the input objects into the tree. Returns the number of objects that were added to the tree (i.e. the number of objects that weren't already present in the tree).

find

 $tree->find($target, $distance)

Returns an array ref containing all of the objects in the tree which are at most $distance distance away from $target.

size

Returns the number of objects currently stored in the tree.

SEE ALSO

These sites explain the concept of a BK tree pretty well:

AUTHOR

Nathan Glenn <nglenn@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2014 by Nathan Glenn.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.