The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Tree::Node - Memory-efficient tree nodes in Perl

SYNOPSIS

  use Tree::Node;

  $node = Tree::Node->new(2);

  $node->set_child(0, $left);
  $node->set_child(1, $right);

  while ($node->key_cmp($key) < 0) {
    $node = $node->get_child(0);
  }    

DESCRIPTION

This module implements a memory-efficient node type (for trees, skip lists and similar data structures) for Perl.

You may ask "Why bother implementing an ordered structure such as a tree when Perl has hashes built-in?" Since Perl is optimized for speed over memory usage, hashes (and lists) use a lot of memory.

Using Devel::Size for a reference, a list with four elements (corresponding to a key, value, and two child node pointers) will use at least 120 bytes. A hash with four key/value pairs will use at least 228 bytes. But an equivalent Tree::Node object will use at least 68 bytes. (However, see the "KNOWN ISSUES" section below for caveats regarding memory usage.)

So the purpose of this package is to provide a simple low-level Node class which can be used as a base class to implement various kinds of tree structures. Each node has a key/value pair and a variable number of "children" pointers.

How nodes are organized or the algorithm used to organize them is for you to implement.

Object Oritented Interface

new
  $node = Tree::Node->new( $child_count );

Creates a new node with $child_count children. Only as much space as is needed is allocated, and the number of children cannot be expanded later on.

$child_count cannot exceed "MAX_LEVEL".

child_count
  $child_count = $node->child_count;

Returns the number of childen allocated.

set_key
  $node->set_key($key);

Sets the key. Once it is set, it cannot be changed.

force_set_key
  $node->force_set_key($key);

Sets the key, irregardless of whether it is already set. (Note that many data structures assume that the key is never changed, so you should only use this for cases where it is safe to do so.)

key
  $key = $node->key;

Returns the node key.

key_cmp
  if ($node->key_cmp($key) < 0) { ... }

Compares the node key with a key using the Perl string comparison function cmp. Returns -1 if the node key is less than the argument, 0 if it is equal, and 1 if it is greater.

If the key is undefined, then it will always return -1 (even if the argument is undefined).

This method can be overriden if you need a different comparison routine. To use numeric keys, for example:

  package Tree::Node::Numeric;

  use base 'Tree::Node';

  sub key_cmp {
    my $self = shift;
    my $key  = shift;
    return ($self->key <=> $key);
  }

Warning: if you are also using the "Procedural Interface", then you should be aware that "p_key_cmp" will not be inherited. Instead, you should use something like the following:

  {
    no warnings 'redefine';

    sub p_key_cmp {
      my $ptr  = shift;
      my $key  = shift;
      return (p_key_key($ptr) <=> $key);
    }

    sub key_cmp {
      my $self = shift;
      my $key  = shift;
      return (p_key_cmp($self->to_p_node), $key);
    }
  }
set_value
  $node->set_value($value);

Sets the value of the node. The value can be changed.

value
  $value = $node->value;

Returns the value of the node.

set_child
  $node->set_child($index, $child);

Sets the child node. $index must be between 0 and "child_count"-1. Dues when the $index is out of bounds.

get_child
  $child = $node->get_child($index);

Returns the child node. Dies when the $index is out of bounds.

get_child_or_undef
  $child = $node->get_child_or_undef($index);

Like "get_child", but returns undef rather than dying when the $index is out of bounds.

get_children
  @children = $node->get_children;
add_children
  $node->add_children(@children)

Increases the "child_count" and allocates space for the child nodes specified. (The child nodes can be undef.)

add_children_left

Same as "add_children", except that the new nodes are added to the beginning rather than end of the node list.

MAX_LEVEL
  use Tree::Node ':utility';

  ...

  $max = MAX_LEVEL;

Returns the maximum number of children. Defaults to the C constant UCHAR_MAX, which is usually 255.

_allocated
  $size = $node->_allocated;

This is a utility routine which says how much space is allocated for a node. It does not include the Perl overhead (see "KNOWN ISSUES" below).

_allocated_by_child_count
  use Tree::Node ':utility';

  ...

  $size = _allocated_by_child_count( $child_count );

This is a utility routine which returns the amount of space that would be allocated for a node with $child_count children.

to_p_node
  $ptr = $node->to_p_node;

This returns the pointer to the raw node data, which can be used in the "Procedural Interface".

Warning: do not mix and match object-oriented and procedural interface calls when reading child nodes! Child node pointers are stored in an incompatible format.

Procedural Inferface

The experimental procedural interface was added in version 0.06. The advantage of this interface is that there is much less overhead than the object-oriented interface (16 bytes instead of 45 bytes). A disadvantage is that the node cannot be simply subclassed to change the "p_key_cmp" function.

To use the procedural interface, you must import the procedure names:

  use Tree::Node ':p_node';

Aside from working with pointers rather than blessed objects, the procedures listed below are analagous to their object-oriented counterparts.

However, you must manually call "p_destroy" when you are done with the node, since Perl will not automatically destroy it when done.

p_new
  $ptr = p_new( $child_count );
p_child_count
  $child_count = p_child_count( $ptr );
p_set_child
  p_set_child( $mother_ptr, $index, $daughter_ptr );
p_get_child
  $daughter_ptr = p_get_child( $mother_ptr, $index );
p_get_child_or_null
  $daughter_ptr = p_get_child_or_null( $mother_ptr, $index );
p_set_key
  p_set_key( $ptr, $key );

See "to_p_node" for caveats about mixing interfaces.

p_force_set_key
  p_force_set_key( $ptr, $key );

See "to_p_node" for caveats about mixing interfaces.

p_get_key
  $key = p_get_key( $ptr );

See "to_p_node" for caveats about mixing interfaces.

p_key_cmp
  if (p_key_cmp( $ptr, $key ) < 0) { ... }

See "key_cmp" for caveats about mixing interfaces.

p_set_value
  p_set_value( $ptr, $value );
p_get_value
  $value = p_get_value( $ptr );
p_allocated
  $size = p_allocated($ptr);
p_destroy
  p_destroy($ptr);

This unallocates the memory. Perl will not call this automatically, so you must remember to manually destroy each pointer!

KNOWN ISSUES

This module implements a Perl wrapper around a C struct, which for the object-oriented inferface involves a blessed reference to a pointer to the struct. This overhead of about 45 bytes may make up for any memory savings that the C-based struct provided!

So if you what you are doing is implementing a simple key/value lookup, then you may be better off sticking with hashes. If what you are doing requires a special structure that cannot be satisfied with hashes (even sorted hashes), or requires a very large number of nodes, then this module may be useful to you.

Another alternative is to use the "Procedural Interface".

Packages such as Clone and Storable cannot properly handle Tree::Node objects.

Devel::Size may not properly determine the size of a node. Use the "_allocated" method to determine how much space is allocated for the node in C. This does not include the overhead for Perl to maintain a reference to the C struct.

SEE ALSO

Tree::DAG_Node is written in pure Perl, but it offers a more flexible interface.

AUTHOR

Robert Rothenberg <rrwo at cpan.org>

Suggestions and Bug Reporting

Feedback is always welcome. Please use the CPAN Request Tracker at http://rt.cpan.org to submit bug reports.

LICENSE

Copyright (c) 2005,2007 Robert Rothenberg. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.