NAME
Tree::Parser - Module to parse formatted files into tree structures
SYNOPSIS
use Tree::Parser;
# create a new parser object with some input
my $tp = Tree::Parser->new($input);
# set a parse filter
$tp->setParseFilter(sub {
    my ($line_iterator) = @_;
    my $line = $line_iterator->next();
    my ($tabs, $node) = $line =~ /(\t*)(.*)/;
    my $depth = length $tabs;
    return ($depth, $node);
});
# parse our input and get back a tree
my $tree = $tp->parse();
# set our deparse filter
$tp->setDeparseFilter(sub {
    my ($tree) = @_;
    return ("\t" x $tree->getDepth()) . $tree->getNodeValue();
});
# deparse our tree and get back a string
my $tree_string = $tp->deparse();
DESCRIPTION
This module can parse various types of input (formatted and containing hierarchical information) into a tree structure. It can also deparse the same tree structure back into a string. It accepts various types of input, such as strings, filenames, and array references. The tree structure is a hierarchy of Tree::Simple objects.
The parsing is controlled through a parse filter, which is used to process each "line" in the input (see setParseFilter below for more information about parse filters).
The deparsing is likewise controlled by a deparse filter, which is used to convert each tree node into a string representation.
This module can be viewed (somewhat simplistically) as a serialization tool for Tree::Simple objects. Properly written parse and deparse filters can be used to do "round-trip" tree handling.
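To make the round-trip idea concrete, the sketch below applies the tab-counting logic from the SYNOPSIS to plain strings, without using Tree::Parser itself. The helper names line_to_pair and pair_to_line are hypothetical and exist only for this illustration; the point is that the deparse step has to invert the parse step exactly.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line into (depth, value), as a parse filter would.
sub line_to_pair {
    my ($line) = @_;
    my ($tabs, $value) = $line =~ /^(\t*)(.*)$/;
    return (length($tabs), $value);
}

# Hypothetical helper: rebuild the line from (depth, value), as a deparse filter would.
sub pair_to_line {
    my ($depth, $value) = @_;
    return ("\t" x $depth) . $value;
}

my @lines     = ("root", "\tchild", "\t\tgrandchild", "\tsibling");
my @rebuilt   = map { pair_to_line(line_to_pair($_)) } @lines;
my $roundtrip = join("\n", @rebuilt) eq join("\n", @lines);
print $roundtrip ? "round-trip ok\n" : "round-trip failed\n";
```

If the deparse side used, say, spaces while the parse side counted tabs, the comparison would fail, which is the kind of mismatch to watch for when writing a filter pair.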
METHODS
Constructor
- new ($tree | $input)
-
The constructor is used primarily for creating an object instance. Initializing the object is done by the _init method (see below).
Input Processing
- setInput ($input)
-
This method takes various types of input and pre-processes them through the prepareInput method below.
- prepareInput ($input)
-
The prepareInput method is used to pre-process certain types of $input. It accepts any of the following types of arguments:
- an Array::Iterator object
-
This just gets passed on through.
- an array reference containing the lines to be parsed
-
This type of argument is used to construct an Array::Iterator instance.
- a filename which ends in
.tree
-
The file is opened, its contents slurped into an array, which is then used to construct an Array::Iterator instance.
- a string
-
The string is expected to have embedded newlines, and in fact must contain more than one line, as a single-node tree does not make much sense.
It then returns an Array::Iterator object ready for the parser.
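The dispatch described above can be pictured with the following sketch. It is not the module's code: lines_from_input is a hypothetical function that returns a plain array reference of lines instead of an Array::Iterator object, and details such as chomping file lines and the exact error message are assumptions.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical stand-in for the dispatch described above. It returns an
# array reference of lines, or the object itself when handed an object.
sub lines_from_input {
    my ($input) = @_;
    return $input if blessed($input);              # iterator object: pass through
    return [ @$input ] if ref($input) eq 'ARRAY';  # array ref: copy the lines
    if ($input =~ /\.tree$/) {                     # filename ending in .tree
        open my $fh, '<', $input or die "Cannot open $input: $!";
        chomp(my @lines = <$fh>);                  # assumption: newlines stripped
        close $fh;
        return \@lines;
    }
    die "String input needs more than one line\n" unless $input =~ /\n/;
    return [ split /\n/, $input ];                 # multi-line string
}

my $from_array  = lines_from_input([ "root", "\tchild" ]);
my $from_string = lines_from_input("root\n\tchild\n\t\tgrandchild");
print scalar(@$from_array), " lines from the array, ",
      scalar(@$from_string), " lines from the string\n";
```

Checking for the .tree suffix before checking for newlines mirrors the order of the list above; a single-line string that is not a .tree filename is rejected.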
Filter Methods
- setParseFilter ($filter)
-
A parse filter is a subroutine reference which is used to process each element in the input. As the main parse loop runs, it calls this filter routine and passes it the Array::Iterator instance which represents the input. To get the next element/line/token in the iterator, the filter must call next; the element should then be processed by the filter. A filter can, if it wants, advance the iterator further by calling next more than once if necessary; there are no restrictions on what it can do. However, the filter must return these two values in order to correctly construct the tree:
- the depth of the node
- the value of the node (which can be anything: string, array ref, object instance, you name it)
The following is an example of a very basic filter which simply counts the number of tab characters to determine the node depth and then captures any remaining characters on the line.
$tree_parser->setParseFilter(sub {
    my ($iterator) = @_;
    my $line = $iterator->next();
    # match the tabs and all that follows them
    my ($tabs, $node) = ($line =~ /(\t*)(.*)/);
    # calculate the depth by seeing how long
    # the tab string is.
    my $depth = length $tabs;
    # return the depth and the node value
    return ($depth, $node);
});
- setDeparseFilter ($filter)
-
The deparse filter is the opposite of the parse filter: it takes each element of the tree and returns a string representation of it. The filter routine gets passed a Tree::Simple instance and is expected to return a single string. However, this is not enforced; the parser will actually gobble up everything the filter returns. Keep in mind that each element returned is considered to be a single line in the output, so multiple elements will be treated as multiple lines.
Here is an example of a deparse filter. This can be viewed as the inverse of the parse filter example above.
$tp->setDeparseFilter(sub {
    my ($tree) = @_;
    return ("\t" x $tree->getDepth()) . $tree->getNodeValue();
});
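As noted for setParseFilter, a filter may call next more than once. The sketch below shows one possible use of that: skipping comment lines. So that it runs without any installed modules, it uses LineIter, a hypothetical stand-in class that imitates only the next and hasNext methods of the iterator; the "#" comment convention is likewise an assumption made for the example.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in for the iterator handed to a parse filter;
    # only the two methods used below are imitated.
    package LineIter;
    sub new     { my ($class, @lines) = @_; return bless { lines => [@lines], pos => 0 }, $class }
    sub hasNext { my ($self) = @_; return $self->{pos} < @{ $self->{lines} } }
    sub next    { my ($self) = @_; return $self->{lines}[ $self->{pos}++ ] }
}

# A parse filter that calls next() repeatedly to skip comment lines.
# Assumes the input never ends with a comment line.
my $filter = sub {
    my ($iterator) = @_;
    my $line = $iterator->next();
    $line = $iterator->next() while $line =~ /^\s*#/ && $iterator->hasNext();
    my ($tabs, $node) = $line =~ /^(\t*)(.*)$/;
    return (length($tabs), $node);
};

# Drive the filter by hand, the way a parse loop would.
my $iter = LineIter->new("root", "# ignored", "\tchild");
my @pairs;
push @pairs, [ $filter->($iter) ] while $iter->hasNext();
print join(", ", map { "$_->[0]:$_->[1]" } @pairs), "\n";
```

The same subroutine reference could be handed to setParseFilter, since it only relies on the iterator methods described above.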
Accessors
- getTree
-
This method returns the tree held by the parser or set through the constructor.
Parse/Deparse
- parse
-
Parsing is pretty automatic once everything is set up. This routine will check to be sure you have all you need to proceed, and throw an exception if not. Once the parsing is complete, the tree will be stored internally as well as returned from this method.
- deparse
-
This method too is pretty automatic: it verifies that it has all it needs, throwing an exception if it does not. It will return an array of lines in list context, or in scalar context it will join the array into a single string separated by newlines.
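The context-dependent return value of deparse follows a common Perl idiom based on wantarray. The following sketch illustrates the idiom only; lines_or_string is a hypothetical function, not part of Tree::Parser.

```perl
use strict;
use warnings;

# Hypothetical function imitating the documented return behaviour of
# deparse: a list of lines in list context, one joined string otherwise.
sub lines_or_string {
    my @lines = @_;
    return wantarray ? @lines : join("\n", @lines);
}

my @as_list   = lines_or_string("root", "\tchild");   # list context
my $as_string = lines_or_string("root", "\tchild");   # scalar context
print scalar(@as_list), " lines in list context\n";
print "scalar context gives one string of length ", length($as_string), "\n";
```

In practice this means that my @lines = $tp->deparse() and my $string = $tp->deparse() yield different shapes of result from the same tree.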
Private Methods
- _init ($tree | $input)
-
This will initialize the slots of the object. If given a $tree object, it will store it. If given some other kind of input, it will process this through the prepareInput method.
- _parse
-
This is where all the parsing work is done. If you are truly interested in the inner workings of this method, I suggest you refer to the source. It is a very simple algorithm and should be easy to understand.
- _deparse
-
This is where all the deparsing work is done. As with the _parse method, if you are interested in the inner workings, I suggest you refer to the source.
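For readers who want a feel for depth-based parsing before reading the source, here is one common way to turn (depth, value) pairs into a tree. It is a sketch under stated assumptions and not necessarily how _parse is implemented: build_tree is a hypothetical function, plain hashes stand in for Tree::Simple objects, and depths are assumed to start at 0 and never skip a level.

```perl
use strict;
use warnings;

# Build a nested structure from (depth, value) pairs by keeping a stack
# of the most recent node seen at each level.
sub build_tree {
    my (@pairs) = @_;
    my $root  = { value => 'ROOT', children => [] };
    my @stack = ($root);                 # $stack[$d] is the parent for depth $d
    for my $pair (@pairs) {
        my ($depth, $value) = @$pair;
        die "Depth $depth skips a level\n" if $depth > $#stack;
        my $node = { value => $value, children => [] };
        push @{ $stack[$depth]{children} }, $node;
        splice @stack, $depth + 1;       # forget deeper, now-closed levels
        push @stack, $node;              # this node is the parent for depth + 1
    }
    return $root;
}

my $tree = build_tree([0, 'a'], [1, 'b'], [2, 'c'], [1, 'd'], [0, 'e']);
print scalar(@{ $tree->{children} }), " top-level nodes\n";
print scalar(@{ $tree->{children}[0]{children} }), " children under 'a'\n";
```

Each pair corresponds to what a parse filter returns for one line, so the loop body is roughly the work done per call to the filter.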
TO DO
Make some default filters and turn them into constants which can be used.
Make a way to define the "indent" instead of defining a filter.
BUGS
None that I am aware of. Of course, if you find a bug, let me know, and I will be sure to fix it. This module, in an earlier/simpler form, has been and is being used in production for approx. 1 year now without incident. This version has been improved and the test suite added.
CODE COVERAGE
I use Devel::Cover to test the code coverage of my tests. Below is the Devel::Cover report on this module's test suite.
------------------------------ ------ ------ ------ ------ ------ ------ ------
File                             stmt branch   cond    sub    pod   time  total
------------------------------ ------ ------ ------ ------ ------ ------ ------
/Tree/Parser.pm                 100.0   82.6   73.3  100.0  100.0   25.6   93.2
t/10_Tree_Parser_test.t         100.0    n/a    n/a  100.0    n/a   19.5  100.0
t/20_Tree_Parser_inputs_test.t   98.9   50.0    n/a  100.0    n/a   35.8   98.1
t/30_Tree_Parser_errors_test.t   95.5    n/a    n/a   90.0    n/a   19.1   93.8
------------------------------ ------ ------ ------ ------ ------ ------ ------
Total                            98.9   81.2   73.3   96.4  100.0  100.0   95.4
------------------------------ ------ ------ ------ ------ ------ ------ ------
SEE ALSO
DEPENDENCIES
This module uses two other modules which I have written, Tree::Simple and Array::Iterator; you will need to install both.
AUTHOR
stevan little, <stevan@iinteractive.com>
COPYRIGHT AND LICENSE
Copyright 2004 by Infinity Interactive, Inc.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.