Search::WuManber -- A fast algorithm for multi-pattern searching
use Search::WuManber; my $search = Search::WuManber->new([qw(tribute reserved serve distribute)]); my @matches = $search->all( lc "All rights reserved. Distribute freely."); my $match = $search->first(lc "All rights reserved. Distribute freely."); $match = $search->next();
This module implements the Wu-Manber multi-pattern parallel search algorithm.
The search strings passed to new() are prepared for parallel lookup. A perl reference pointing to all internal data is returned. Treat this reference as opaque. first() and next() iterate over all text positions where matches occur. Each match is returned as a reference to a two-element array representing text offset and list index of a search string. Options for new(): * <return_string = 1>> make the return value of the iterator a three-element array, containing also the search string itself. * <case_sensitive = 0>> run the search case insensitive (slightly slower).
new()
first()
next()
<return_string =
<case_sensitive =
The matches are returned roughly sorted. Offset usually increments, but may jump backwards by the length difference of neighbouring search strings.
New() allocates a constant amount of memory (between 130k and 2MB). This memory can be returned by undef $search;
New()
undef $search;
The iterator is inefficient. first() just calls all() ...,
all()
This implementation uses internal hash-functions, which may not be optimal.
Efficiency of WuManber depends on the minimum length of search strings. Suggested minimum length is 5. new() switches to a slower algorithm if one of the strings has less than 3 characters.
Changes in the list of search patterns are not seen by the search algorithm after new() was called. Changes in the text are undefined. Call first() to restart with a new text.
Sun Wu and Udi Manber (1994) A fast algorithm for multi-pattern searching. Technical Report TR-94-17, University of Arizona. http://webglimpse.net/pubs/TR94-17.pdf
ftp://ftp.cs.arizona.edu/agrep/agrep-2.04.tar.Z
www.snort.org
Rolf Stiebe, Textalgorithmen, WS 2005/6
Text::Scan, Algorithm::AhoCorasick, Algorithm::RabinKarp
Text::Scan
Algorithm::AhoCorasick
Algorithm::RabinKarp
Juergen Weigert <jw@cs.fau.de>
Copyright (c) 2007, Juergen Weigert, Novell Inc. This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself.
To install Search::WuManber, copy and paste the appropriate command in to your terminal.
cpanm
cpanm Search::WuManber
CPAN shell
perl -MCPAN -e shell install Search::WuManber
For more information on module installation, please visit the detailed CPAN module installation guide.