RandomJungle::Tree - A Random Jungle classification tree
Version 0.04
RandomJungle::Tree represents a classification tree from Random Jungle. This class uses RandomJungle::Tree::Node to represent the nodes in the tree.
    use RandomJungle::Tree;

    my $tree = RandomJungle::Tree->new( %params ) || die $RandomJungle::Tree::ERROR;
    my $tree_id = $tree->id;

    # Returns the variables used in the tree
    my $aref = $tree->get_variables; # aref of indices
    my $href = $tree->get_variables( variable_labels => 1 ); # label => index

    # Classifies $data using this tree and returns either the predicted phenotype
    # or RandomJungle::Tree::Node object for the terminal node
    my $predicted_pheno = $tree->classify_data( $data );
    my $node_obj = $tree->classify_data( $data, as_node => 1 );
    my $predicted_pheno = $tree->classify_data( $data, skip_validation => 1 );

    my $node_obj = $tree->get_node_by_vector_index( $vi ) || warn $tree->err_str;

    my $vi = $tree->get_root_node;
    my $node_obj = $tree->get_root_node( as_node => 1 );

    my $aref = $tree->get_all_nodes; # aref of vector indices
    my $aref = $tree->get_all_nodes( as_node => 1 ); # aref of node objects

    my $aref = $tree->get_terminal_nodes; # aref of vector indices
    my $aref = $tree->get_terminal_nodes( as_node => 1 ); # aref of node objects

    # Carps and returns undef on error (invalid index) or if called with index 0 (no parent)
    my $vi_of_parent = $tree->get_parent_of_vector_index( $vi );
    my $node_obj = $tree->get_parent_of_vector_index( $vi, as_node => 1 );

    # Returns an aref containing vector indices of all nodes in the path to the specified
    # vector index, beginning at the root of the tree and ending at the specified vector index.
    my $aref = $tree->get_path_to_vector_index( $vi ) || warn $tree->err_str;

    my $depth = $tree->get_depth_of_vector_index( $vi );

    # $href contains the max depth of the tree and a list of all vector indices at that depth
    my $href = $tree->max_node_depth;

    # Error handling
    $tree->set_err( 'Something went boom' );
    my $msg = $tree->err_str;
    my $trace = $tree->err_trace;
Creates and returns a new RandomJungle::Tree object:
my $tree = RandomJungle::Tree->new( %params ) || die $RandomJungle::Tree::ERROR;
Required keys in %params:

    id           => $tree_id (from the XML file)
    var_id_str   => $str (from the XML file)
    values_str   => $str (from the XML file)
    branches_str => $str (from the XML file)
Optional keys in %params: variable_labels => $aref (variables from the RAW file, excluding headers)
The required components of %params are returned from RandomJungle::File::XML->get_tree_data(). The aref for variable_labels can be obtained from RandomJungle::Jungle->get_variable_labels().
Sets $ERROR and returns undef on failure.
Returns the tree ID:
my $tree_id = $tree->id;
Returns the variables used in the tree. By default, returns an aref of indices (see RAW file). If 'variable_labels => 1' is specified in %params, returns a href { $label => $index } if variable_labels was specified in new(), or sets err_str and returns undef otherwise.
    my $aref = $tree->get_variables; # variable indices
    my $href = $tree->get_variables( variable_labels => 1 ); # $href->{$label} = $index
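The relationship between the two return forms can be pictured with a small plain-Perl sketch. It does not call RandomJungle::Tree; the sub name `labels_for_indices`, the sample labels, and the assumption that a variable index points directly into the label array are all hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a list of variable indices into { label => index }.
# Assumes $labels->[$index] is the label of variable $index.
sub labels_for_indices {
    my ( $indices, $labels ) = @_;
    return unless defined $labels;    # no labels available: nothing to map
    my %index_for;
    for my $i ( @$indices ) {
        return if $i < 0 || $i > $#$labels;    # index outside the label list
        $index_for{ $labels->[$i] } = $i;
    }
    return \%index_for;
}

my $label_map = labels_for_indices( [ 2, 0 ], [ 'rs1', 'rs2', 'rs3' ] );
print "$_ => $label_map->{$_}\n" for sort keys %$label_map;
```

The sketch returns an undefined value when no labels are supplied, which is the same calling convention the method documents for that case.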
Classifies $data using this tree. Returns the terminal value (predicted phenotype) by default. If as_node => 1 is specified, returns a RandomJungle::Tree::Node object that represents the terminal node after classification. If skip_validation => 1 is specified, the data validation step is skipped. This improves performance, but if invalid data is present the classification will fail and undef will be returned. Use skip_validation with caution.
    my $predicted_pheno = $tree->classify_data( $data );
    my $node_obj = $tree->classify_data( $data, as_node => 1 );
    my $predicted_pheno = $tree->classify_data( $data, skip_validation => 1 );
$data must be an arrayref containing the data values to be classified. The order of the columns must be the same as that which was used to construct the tree (see RAW file). Note: $data must not include header values (for FID, IID, PAT, and MAT).
$data can be obtained from RandomJungle::Jungle->get_sample_data_by_label().
Sets err_str and returns undef if an error occurs (e.g., $data contains a value that is not 0, 1, or 2).
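To make the traversal concrete, here is a minimal plain-Perl sketch of how a tree stored as three parallel arrays might classify a sample. It does not use RandomJungle::Tree. The sub name `classify_sketch`, the `as_index` option, the toy arrays, the rule "go left when the value is at or below the threshold", and the convention that a terminal node has branches 0,0 are all assumptions made for illustration, not confirmed details of the module.

```perl
use strict;
use warnings;

# Toy tree as parallel arrays indexed by vector index (all values hypothetical).
my @var_ids  = ( 1, 0, 0 );                         # column of $data tested at each node
my @values   = ( 0, 1, 2 );                         # threshold, or phenotype at a terminal node
my @branches = ( [ 1, 2 ], [ 0, 0 ], [ 0, 0 ] );    # [left, right]; [0, 0] assumed terminal

sub classify_sketch {
    my ( $data, %opts ) = @_;
    unless ( $opts{skip_validation} ) {
        # Genotype values are expected to be 0, 1, or 2
        for my $v ( @$data ) {
            return unless defined $v && $v =~ /\A[012]\z/;
        }
    }
    my $vi = 0;    # start at the root
    while ( $branches[$vi][0] != 0 || $branches[$vi][1] != 0 ) {
        my ( $left, $right ) = @{ $branches[$vi] };
        # Assumed rule: values at or below the threshold go left
        $vi = $data->[ $var_ids[$vi] ] <= $values[$vi] ? $left : $right;
    }
    return $opts{as_index} ? $vi : $values[$vi];
}

print classify_sketch( [ 2, 0 ] ), "\n";    # column 1 is 0, so the left leaf is reached
print classify_sketch( [ 0, 2 ] ), "\n";    # column 1 is 2, so the right leaf is reached
```

In this sketch, validation is a single pass over the sample before the walk, which is the cost that an option like skip_validation avoids.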
Returns a RandomJungle::Tree::Node object for a given vector index (from the varID/values/branches arrays in the XML file).
my $node_obj = $tree->get_node_by_vector_index( $vi );
Sets err_str and returns undef on error (invalid index).
Returns the root node in the tree (vector index 0). The vector index is returned by default. If called with 'as_node => 1' a RandomJungle::Tree::Node object is returned.
    my $vi = $tree->get_root_node;
    my $node_obj = $tree->get_root_node( as_node => 1 );
Returns an aref of all nodes in the tree. Vector indices are returned by default. If called with 'as_node => 1' RandomJungle::Tree::Node objects are returned.
    my $aref = $tree->get_all_nodes;
    my $aref = $tree->get_all_nodes( as_node => 1 );
Returns an aref of all terminal nodes in the tree. Vector indices are returned by default. If called with 'as_node => 1' RandomJungle::Tree::Node objects are returned.
    my $aref = $tree->get_terminal_nodes;
    my $aref = $tree->get_terminal_nodes( as_node => 1 );
Returns the parent of the node with the specified vector index. The vector index of the parent node is returned by default. If called with 'as_node => 1' a RandomJungle::Tree::Node object is returned.
    my $vi_of_parent = $tree->get_parent_of_vector_index( $vi );
    my $node_obj = $tree->get_parent_of_vector_index( $vi, as_node => 1 );
Sets err_str and returns undef on error (invalid index) or if called with index 0 (index 0 is the root node, which has no parent).
Returns an aref containing the vector indices of all nodes in the path to the specified vector index, beginning at the root of the tree and ending at the specified vector index.
my $aref = $tree->get_path_to_vector_index( $vi );
Sets err_str and returns undef on error (invalid vector index).
Returns the depth of the node with the specified vector index, where the root node has a depth of 1, the child nodes of the root have depth = 2, etc.
my $depth = $tree->get_depth_of_vector_index( $vi );
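The parent, path, and depth lookups are closely related, and a short plain-Perl sketch may help show how. It is independent of RandomJungle::Tree: the toy branches array, the sub names `path_to` and `depth_of`, and the convention that 0,0 marks a terminal node are assumptions for illustration only.

```perl
use strict;
use warnings;

# Toy branches array: element $vi holds [left, right] child vector indices.
# [0, 0] is assumed here to mark a terminal node (hypothetical layout).
my @toy_branches = ( [ 1, 2 ], [ 3, 4 ], [ 0, 0 ], [ 0, 0 ], [ 0, 0 ] );

# Build a child => parent lookup once
my %parent_of;
for my $vi ( 0 .. $#toy_branches ) {
    for my $child ( @{ $toy_branches[$vi] } ) {
        $parent_of{$child} = $vi if $child != 0;
    }
}

# Path from the root (vector index 0) down to $vi, inclusive
sub path_to {
    my ( $vi ) = @_;
    return if $vi < 0 || $vi > $#toy_branches;    # invalid index
    my @path = ( $vi );
    while ( $vi != 0 ) {
        $vi = $parent_of{$vi};
        unshift @path, $vi;
    }
    return \@path;
}

# Depth is the number of nodes on the path, so the root has depth 1
sub depth_of {
    my ( $vi ) = @_;
    my $path = path_to( $vi ) or return;
    return scalar @$path;
}

print join( ' -> ', @{ path_to( 4 ) } ), "\n";    # 0 -> 1 -> 4
print depth_of( 4 ), "\n";                         # 3
```

Defining depth as the length of the path keeps the two results consistent, including the root's depth of 1.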
Returns a hash reference that contains the max depth of the tree and a list of all vector indices at that depth.
my $href = $tree->max_node_depth;
$href has the following structure:

    depth          => $max_depth,
    vector_indices => $aref_of_vi,
where $aref_of_vi is an array reference that contains all vector indices at the max depth.
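As an illustration of how such a summary could be computed, the following plain-Perl sketch walks a toy tree breadth-first and records the deepest level. It does not use RandomJungle::Tree; the sub name `max_depth_sketch`, the toy array, and the 0,0 terminal-node convention are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical toy tree: $children[$vi] = [left, right]; [0, 0] assumed terminal.
my @children = ( [ 1, 2 ], [ 3, 4 ], [ 0, 0 ], [ 0, 0 ], [ 0, 0 ] );

sub max_depth_sketch {
    my %result = ( depth => 0, vector_indices => [] );
    my @queue  = ( [ 0, 1 ] );    # [vector index, depth]; the root has depth 1
    while ( my $item = shift @queue ) {
        my ( $vi, $depth ) = @$item;
        if ( $depth > $result{depth} ) {
            # A deeper level was found: start a fresh list of indices
            %result = ( depth => $depth, vector_indices => [] );
        }
        push @{ $result{vector_indices} }, $vi if $depth == $result{depth};
        push @queue, map { [ $_, $depth + 1 ] } grep { $_ != 0 } @{ $children[$vi] };
    }
    return \%result;
}

my $summary = max_depth_sketch();
print "depth=$summary->{depth} indices=@{ $summary->{vector_indices} }\n";
```

The returned hash reference uses the same two keys, depth and vector_indices, as the structure described above.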
Sets the error message (provided as a parameter) and creates a stack trace:
$tree->set_err( 'Something went boom' );
Returns the last error message that was set:
my $msg = $tree->err_str;
Returns a backtrace for the last error that was encountered:
my $trace = $tree->err_trace;
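The three error methods follow a common Perl pattern: store a message, capture a stack trace when the message is set, and expose both through accessors. The sketch below shows one way to do this with the core Carp module. The package name `ErrSketch` is hypothetical, and whether RandomJungle::Tree builds its trace with Carp::longmess is an assumption, not something the documentation states.

```perl
use strict;
use warnings;

package ErrSketch;    # hypothetical class, not part of RandomJungle
use Carp ();

sub new { return bless { err_str => '', err_trace => '' }, shift }

sub set_err {
    my ( $self, $msg ) = @_;
    $self->{err_str}   = defined $msg ? $msg : '';
    $self->{err_trace} = Carp::longmess( $self->{err_str} );    # message plus call stack
    return;
}

sub err_str   { return $_[0]->{err_str} }
sub err_trace { return $_[0]->{err_trace} }

package main;

my $obj = ErrSketch->new;
$obj->set_err( 'Something went boom' );
print $obj->err_str, "\n";
print $obj->err_trace;
```

Because the trace is captured when set_err is called, it reflects where the error was recorded rather than where it is later read.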
Parses the 'varID' string from the XML file and returns an array of variable indices.
my @var_ids = $tree->_parse_var_id_string();
Note: @var_ids are indices (column numbers) of the variables within the RAW file, not variable labels.
The varID string is a required parameter of new().
Parses the 'branches' string from the XML file and returns an array of branch elements. Each element is a string of the format 'left,right', which are the vector indices of the child nodes of the current node.
my @branches = $tree->_parse_branches_string();
The branches string is a required parameter of new().
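For readers who want a feel for this kind of parsing, here is a minimal plain-Perl sketch that extracts 'left,right' elements from a parenthesized string. The input layout shown, the sub name `parse_branches_sketch`, and the regular expression are assumptions for illustration; the actual layout of the string in the XML file may differ.

```perl
use strict;
use warnings;

# Hypothetical input; the real layout of the XML string may differ.
my $branches_str = '((1,2),(3,4),(0,0),(0,0),(0,0))';

sub parse_branches_sketch {
    my ( $str ) = @_;
    # Capture every innermost "digits,digits" pair between parentheses
    my @branches = $str =~ /\((\d+,\d+)\)/g;
    return @branches;
}

my @pairs = parse_branches_sketch( $branches_str );
for my $vi ( 0 .. $#pairs ) {
    # Each element is 'left,right': the vector indices of the node's children
    my ( $left, $right ) = split /,/, $pairs[$vi];
    print "node $vi: left=$left right=$right\n";
}
```

The position of each element in the returned array is the vector index of the node it describes, so element 0 belongs to the root.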
Parses the 'values' string from the XML file and returns an array of values which are used as thresholds for classifying genotype data.
my @values = $tree->_parse_values_string();
The values string is a required parameter of new().
$retval = create cytoscape file ( $out_filename )
Add caching for node depth and path to node (if used a lot)
RandomJungle::Jungle, RandomJungle::Tree, RandomJungle::Tree::Node, RandomJungle::XML, RandomJungle::OOB, RandomJungle::RAW, RandomJungle::DB, RandomJungle::Classification_DB
Robert R. Freimuth
Copyright (c) 2011 Mayo Foundation for Medical Education and Research. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
The full text of the license can be found in the LICENSE file included with this module.