Mark Rogaski

NAME

Tree::Ternary - Perl implementation of ternary search trees.

SYNOPSIS

  use Tree::Ternary;

  $obj = new Tree::Ternary;

  $ref = $obj->insert($str);
  $ref = $obj->rinsert($str);

  $ref = $obj->search($str);
  $ref = $obj->rsearch($str);

  $cnt = $obj->nodes();
  $cnt = $obj->terminals();

  $cnt = $obj->pmsearch($char, $str);
  @list = $obj->pmsearch($char, $str);

  $cnt = $obj->nearsearch($dist, $str);
  @list = $obj->nearsearch($dist, $str);

  @list = $obj->traverse();

DESCRIPTION

Tree::Ternary is a pure Perl implementation of ternary search trees as described by Jon Bentley and Robert Sedgewick.

Ternary search trees are interesting data structures that provide a means of storing and accessing strings. They combine the time efficiency of digital tries with the space efficiency of binary search trees. Unlike a hash, they also maintain information about relative order.

For more information on ternary search trees, visit:

http://www.cs.princeton.edu/~rs/strings/

This module is a translation (albeit not a direct one) from the C implementation published in Bentley and Sedgewick's article in the April 1998 issue of Dr. Dobb's Journal (see SEE ALSO).

METHODS

new()

Creates a new Tree::Ternary object.

insert( STRING )

Inserts STRING into the tree. When a string is inserted, a scalar variable is created to hold whatever data you may wish to associate with the string. A reference to this scalar is returned on a successful insert. If the string is already in the tree, undef is returned.

rinsert( STRING )

This is a recursive implementation of the insert function. It behaves the same as insert(), except it is slower and will carp about deep recursion for strings near 100 characters in length.

This is included for reference purposes only and may eventually deprecated as an alias for insert().

search( STRING )

Searches for the presence of STRING in the tree. If the string is found, a reference to the associated scalar is returned, otherwise undef is returned.

rsearch( STRING )

A recursive implementation of search(), suffers the same drawbacks as rinsert().

This is included for reference purposes only and may eventually deprecated as an alias for search().

nodes()

Returns the total number of nodes in the tree. This count does not include terminal nodes.

terminals()

Returns the total number of terminal nodes in the tree.

pmsearch( CHAR, STRING )

Performs a pattern match for STRING against the tree, using CHAR as a wildcard character. The wildcard will match any characters. For example, if '.' was specified as the wildcard, and STRING was the pattern ".a.a.a." would match "bananas" and "pajamas" (if they were both stored in the tree). In a scalar context, returns the count of matches found. In an array context, returns a list of the matched strings.

nearsearch( DISTANCE, STRING )

Searches for all strings in a tree that differ from STRING by DISTANCE or fewer characters. In a scalar context, returns the count of matches found. In an array context, returns a list of the matched strings.

traverse()

Simply returns a sorted list of the strings stored in the tree. This method will do more tricks in the future.

NOTES

Character Set

Tree::Ternary currently only has support for strings of 8-bit characters. Since it uses a 2 character string to represent termination of the input strings, it will handle any 8-bit character properly.

In the future, I plan to expand the scope of its character handling, and even include Unicode support.

Attributes

Specifying the :attrib tag as an argument to the use statement will export the following internal constants for debugging purposes. Tree::Ternary was built using Greg Bacon's array-based object design, and these constants are used as attribute indices.

  SPLIT_CHAR
  LO_KID
  EQ_KID
  HI_KID
  PAYLOAD
  NODE_COUNT
  TERMINAL_COUNT

AUTHOR

Mark Rogaski <mrogaski@cpan.org>

CREDITS

Many thanks to Tom Phoenix for his invaluable advice and critique.

COPYRIGHT

Copyright (C) 1999, Mark Rogaski; all rights reserved.

This package is free software and is provided "as is" without express or implied warranty. It may be used, redistributed and/or modified under the same terms as Perl itself.

SEE ALSO

Bentley, Jon and Sedgewick, Robert. "Ternary Search Trees". Dr. Dobbs Journal, April 1998. http://www.drdobbs.com/database/ternary-search-trees/184410528

Bentley, Jon and Sedgewick, Robert. "Fast Algorithms for Sorting and Searching Strings". Eighth Annual ACM-SIAM Symposium on Discrete Algorithms New Orleans, January, 1997. http://www.cs.princeton.edu/~rs/strings/




Hosting generously
sponsored by Bytemark