The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.

NAME

SuffixTree - Efficient string manipulation data structure interface for Perl.

SYNOPSIS

        use SuffixTree;

        my $str = "mississippi";
        my $tree=ST_CreateTree($str, length($str));
        ST_PrintTree($tree);
        my $position = ST_FindSubstring($tree, "ssis", 4);

        printf("\nPosition of ssis in mississippi is %ld.\n\n", $position);

        ST_DeleteTree($tree);

DESCRIPTION

The intention of this project is to provide an open-source implementation for an efficient data structure for strings manipulation - the Suffix Tree.

The code was written with as much consistency with the theoretical algorithm as possible (see references). It provides a set of interface functions for creating and searching the tree.

The suffix tree is implemented in ANSI-C. The code is written based on the original algorithm by E.Ukkonen. This is the Perl interface for the underlying ANSI-C implementation.

FUNCTIONS

All functions are exported by default. Please note that all these interface functions were automatically extracted from the ANSI-C header file, so they might not behave as Perlish as you'd expect them to. This is something we will definitly address in the future.

$tree = ST_CreateTree($string, length($string))

Allocates memory for the tree and starts Ukkonen's construction algorithm. Parameters: A string, length of the string. Returns a reference to the tree.

$position = ST_FindSubstring($tree, $substring, length($substring))

Searches for a string in the tree. It traverses the tree down starting its root like in a regular trie. Parameters: the tree to search in, a substring to look for, length of substring. Returns the position it was found in the source string or ST_ERROR if string is not in the tree. NOTE: We did not make sure how ST_ERROR 'looks like' in Perl. This should be further explained in future releases.

ST_PrintTree($tree)

Prints the tree. Parameters: the tree to print.

ST_DeleteTree($tree)

Deletes a suffix tree. Parameters: the tree to delete. Returns : void.

ST_SelfTest($tree)

Tests a suffix tree - search for all substrings of the main string (see Testing Guide in the README file). Parameters: the tree to test. Returns : 1 for success, 0 for failure.

BUGS

This Perl interface was mostly build automatically (using SWIG). Little to no attention was given to testing. In future relases of this Perl Module (along with its underlying ANSI-C implementation) we hope to fix all problems that might currenly interfere with successful usage of this module. Please send bug reports to the author(s) of this module.

FUTURE WORK

[1] A better Perl-ish interface

[2] Building tests for this module (for the `make test` part of the installation)

[3] Object Oriented like usage

SEE ALSO

        http://cs.haifa.ac.il/~shlomo/suffix_tree/

AUTHOR

Shlomo Yona <shlomo@cs.haifa.ac.il>

THANKS TO

Offer Kaye (http://okaye.esmartweb.com/) for useful ideas and support.

COPYRIGHT

Copyright (c) 2002, 2003 Shlomo Yona. All rights reserved.

This library is free software. You can redistribute it and/or modify it under the same terms as Perl itself.

1 POD Error

The following errors were encountered while parsing the POD:

Around line 84:

You forgot a '=back' before '=head1'