NAME
RandomJungle::Jungle - Consolidated interface for Random Jungle input and output data
VERSION
Version 0.05
SYNOPSIS
RandomJungle::Jungle provides a simplified interface to access Random Jungle input and output data. See RandomJungle::Tree for methods relating to the classification trees produced by Random Jungle, and RandomJungle::File::DB for lower-level methods that are wrapped by this module.
use RandomJungle::Jungle;
my $rj = RandomJungle::Jungle->new( db_file => $file ) || die $RandomJungle::Jungle::ERROR;
$rj->store( xml_file => $file, oob_file => $file, raw_file => $file ) || die $rj->err_str;
my $href = $rj->summary_data(); # for loaded data
my $href = $rj->get_filenames; # filenames specified in store()
my $href = $rj->get_rj_input_params; # input params that were used when RJ was run
my $aref = $rj->get_variable_labels; # (expected: SEX PHENOTYPE var1 ...)
my $aref = $rj->get_sample_labels; # from the IID column of the RAW file
# Returns data for the specified sample, where $label is the IID from the RAW file
my $href = $rj->get_sample_data_by_label( label => $label ) || warn $rj->err_str;
my $aref = $rj->get_tree_ids;
my $tree = $rj->get_tree_by_id( $id ) || warn $rj->err_str; # RJ::Tree object
# Returns hash of arefs containing lists of tree IDs, by OOB state for the sample
my $href = $rj->get_oob_for_sample( $label ) || warn $rj->err_str;
# Returns the OOB state for a given sample and tree ID
my $val = $rj->get_oob_state( sample => $label, tree_id => $id );
warn $rj->err_str if ! defined $val; # 0 is a valid state, so test definedness
# Returns a hash of arefs containing lists of sample labels, by OOB for the tree
my $href = $rj->get_oob_for_tree( $tree_id ) || warn $rj->err_str;
# Error handling
$rj->set_err( 'Something went boom' );
my $msg = $rj->err_str;
my $trace = $rj->err_trace;
METHODS
new()
Creates and returns a new RandomJungle::Jungle object:
my $rj = RandomJungle::Jungle->new( db_file => $file ) || die $RandomJungle::Jungle::ERROR;
The 'db_file' parameter is required. Returns undef and sets $RandomJungle::Jungle::ERROR on failure.
store()
This method loads data into the RandomJungle::File::DB database. All parameters are optional, so files can be loaded in a single call or across multiple calls. Only one file of each type is held at a time: calling this method again for a file type that has already been loaded overwrites the previously loaded data.
$rj->store( xml_file => $file, oob_file => $file, raw_file => $file ) || die $rj->err_str;
Returns true on success. Sets err_str and returns false if an error occurred.
get_filenames()
Returns a hash reference containing the names of the files specified in store():
my $href = $rj->get_filenames;
Keys in the href are db, xml, oob, and raw.
get_rj_input_params()
Returns an href of the input parameters used when Random Jungle was run:
my $href = $rj->get_rj_input_params; # $href->{$param_name} = $param_value;
get_variable_labels()
Returns a reference to an array that contains the variable labels from the RAW file:
my $aref = $rj->get_variable_labels; # (expected: SEX PHENOTYPE var1 ...)
get_sample_labels()
Returns a reference to an array that contains the sample labels from the IID column of the RAW file:
my $aref = $rj->get_sample_labels;
get_sample_data_by_label()
Returns a hash ref containing data for the sample specified by label => $label, where label is the IID from the RAW file. Sets err_str and returns undef if label is not specified or is invalid.
my $href = $rj->get_sample_data_by_label( label => $label ) || warn $rj->err_str;
$href has the following structure:
SEX => $val,
PHENOTYPE => $val,
orig_data => $line, (unsplit, unspliced)
index => $i, (index in aref from get_sample_labels(), can be used to index into OOB matrix)
classification_data => $aref, (can be passed to RandomJungle::Tree->classify_data)
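As a rough sketch of how that structure might be consumed, the snippet below builds a stand-in hash by hand and reads the documented keys. All values, and the helper describe_sample(), are invented for illustration; in real use the href would come from get_sample_data_by_label().

```perl
use strict;
use warnings;

# Hypothetical stand-in for the href returned by get_sample_data_by_label();
# every value here is made up and does not come from a real RAW file.
my $href = {
    SEX                 => 1,
    PHENOTYPE           => 2,
    orig_data           => 'FAM1 IID1 0 0 1 2 0 1 2',
    index               => 0,
    classification_data => [ 1, 2, 0, 1, 2 ],
};

# Hypothetical helper: formats a one-line description of a sample href.
sub describe_sample
{
    my ( $sample ) = @_;
    my $n = scalar @{ $sample->{classification_data} };
    return sprintf( "index=%d sex=%s phenotype=%s values=%d",
                    $sample->{index}, $sample->{SEX}, $sample->{PHENOTYPE}, $n );
}

print describe_sample( $href ), "\n";
```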
get_tree_ids()
Returns an array ref of tree IDs (sorted numerically):
my $aref = $rj->get_tree_ids;
get_tree_by_id()
Returns a RandomJungle::Tree object for the specified tree.
my $tree = $rj->get_tree_by_id( $id ) || warn $rj->err_str;
Sets err_str and returns undef if tree ID is undef or invalid, or if an internal error occurred.
get_oob_for_sample()
Returns lists of tree IDs, by OOB state, for the specified sample label.
my $href = $rj->get_oob_for_sample( $label ) || warn $rj->err_str;
The href contains the following keys, each of which points to an array reference containing tree IDs:
sample_used_to_construct_trees => [],
sample_not_used_to_construct_trees => [],
Sets err_str and returns undef if the specified sample cannot be found (invalid label) or on error.
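One possible use of this href is to work out what fraction of trees a sample was left out of. The sketch below assumes a hand-built href with invented tree IDs; oob_fraction() is a hypothetical helper and not part of this module.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the href returned by get_oob_for_sample().
my $oob = {
    sample_used_to_construct_trees     => [ 0, 2, 3 ],
    sample_not_used_to_construct_trees => [ 1, 4 ],
};

# Hypothetical helper: fraction of trees for which the sample was not used.
# Returns undef when both lists are empty.
sub oob_fraction
{
    my ( $href ) = @_;
    my $used     = scalar @{ $href->{sample_used_to_construct_trees} };
    my $not_used = scalar @{ $href->{sample_not_used_to_construct_trees} };
    my $total    = $used + $not_used;
    return if $total == 0;
    return $not_used / $total;
}

printf "%.2f\n", oob_fraction( $oob );
```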
get_oob_state()
Returns the OOB state for a given sample label and tree ID:
my $val = $rj->get_oob_state( sample => $label, tree_id => $id );
warn $rj->err_str if ! defined $val;
Expected values are 0 (the sample is "in bag" for the tree) or 1 (the sample is "out of bag" for the tree). Because 0 is a valid return value, test the result with defined() rather than for truth.
Sets err_str and returns undef if sample or tree_id is not defined, or if the sample label is invalid.
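Since 0 is a documented return value, a truth test cannot tell "in bag" apart from an error. The sketch below illustrates the distinction with a small mock lookup; mock_oob_state(), state_label(), and the matrix contents are all hypothetical and only imitate the documented 0 / 1 / undef behavior.

```perl
use strict;
use warnings;

# Hypothetical OOB matrix: one row per sample label, one column per tree ID.
my %matrix = ( IID1 => [ 0, 1, 0 ], IID2 => [ 1, 1, 0 ] );

# Hypothetical stand-in for get_oob_state(): returns 0 or 1 for a known
# sample/tree pair, and undef (or an empty list) otherwise.
sub mock_oob_state
{
    my ( %args ) = @_;
    return unless defined $args{sample} && defined $args{tree_id};
    my $row = $matrix{ $args{sample} } or return;
    return $row->[ $args{tree_id} ];
}

# Distinguishes the three outcomes using defined() first, then truth.
sub state_label
{
    my ( $val ) = @_;
    return 'error' unless defined $val;
    return $val ? 'out of bag' : 'in bag';
}

print state_label( mock_oob_state( sample => 'IID1', tree_id => 0 ) ), "\n";
```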
get_oob_for_tree()
Returns lists of sample labels, by OOB state, for the specified tree ID.
my $href = $rj->get_oob_for_tree( $tree_id ) || warn $rj->err_str;
The href contains the following keys, each of which points to an array reference containing sample labels:
in_bag_samples => [],
oob_samples => [],
Sets err_str and returns undef if the specified tree ID cannot be found (invalid) or on error.
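For illustration, the following sketch looks up whether a given sample label appears in either list. The href is built by hand with invented labels, and bag_status() is a hypothetical helper, not a method of this module.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the href returned by get_oob_for_tree().
my $tree_oob = {
    in_bag_samples => [ qw( IID1 IID3 ) ],
    oob_samples    => [ qw( IID2 IID4 ) ],
};

# Hypothetical helper: reports which list (if any) contains the label.
sub bag_status
{
    my ( $href, $label ) = @_;
    return 'oob'    if grep { $_ eq $label } @{ $href->{oob_samples} };
    return 'in_bag' if grep { $_ eq $label } @{ $href->{in_bag_samples} };
    return 'unknown';
}

print "$_: ", bag_status( $tree_oob, $_ ), "\n" for qw( IID1 IID2 IID9 );
```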
summary_data()
Returns an href containing a summary of the data that is loaded into the db:
my $href = $rj->summary_data();
$href contains the output of other methods in this class, and it has the following structure:
filenames => get_filenames(),
rj_params => get_rj_input_params(),
variable_labels => $href, (built from get_variable_labels(); see below)
sample_labels => $href, (built from get_sample_labels(); see below)
tree_ids => $href, (built from get_tree_ids(); see below)
The keys variable_labels, sample_labels, and tree_ids all point to hrefs. Each href has the following structure:
all_labels => $aref, (for variable_labels and sample_labels)
all_ids => $aref, (for tree_ids only)
first => $val, (the first element of the all* aref)
last => $val, (the last element of the all* aref)
count => $val, (the size of the all* aref)
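The per-key structure above could be assembled from a plain array reference roughly as follows. This is only a sketch under the assumption that first, last, and count are derived directly from the list; summarize_list() is a hypothetical helper and may not match how the module builds its summary.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the documented first/last/count summary for
# an aref. $all_key is 'all_labels' or 'all_ids', as described above.
sub summarize_list
{
    my ( $aref, $all_key ) = @_;
    my %summary = (
        $all_key => $aref,
        first    => $aref->[0],
        last     => $aref->[-1],
        count    => scalar @$aref,
    );
    return \%summary;
}

# Invented tree IDs, for illustration only.
my $tree_ids = summarize_list( [ 0, 1, 2, 3 ], 'all_ids' );
print "first=$tree_ids->{first} last=$tree_ids->{last} count=$tree_ids->{count}\n";
```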
set_err()
Sets the error message (provided as a parameter) and creates a stack trace:
$rj->set_err( 'Something went boom' );
err_str()
Returns the last error message that was set:
my $msg = $rj->err_str;
err_trace()
Returns a backtrace for the last error that was encountered:
my $trace = $rj->err_trace;
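The three error methods follow a common pattern: a setter records the message together with a stack trace, and two accessors return them. The class below is a minimal sketch of that pattern using Carp::longmess; My::ErrHolder is a hypothetical name, and the actual implementation in this module may differ.

```perl
use strict;
use warnings;

package My::ErrHolder;    # hypothetical class, not part of RandomJungle

use Carp ();

sub new
{
    my ( $class ) = @_;
    return bless { err_str => '', err_trace => '' }, $class;
}

# Records the message and a backtrace captured at the point of the call.
sub set_err
{
    my ( $self, $msg ) = @_;
    $msg = '' if ! defined $msg;
    $self->{err_str}   = $msg;
    $self->{err_trace} = Carp::longmess( $msg );
    return;
}

sub err_str   { my ( $self ) = @_; return $self->{err_str}; }
sub err_trace { my ( $self ) = @_; return $self->{err_trace}; }

package main;

my $obj = My::ErrHolder->new;
$obj->set_err( 'Something went boom' );
print $obj->err_str, "\n";
print $obj->err_trace;
```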
SEE ALSO
RandomJungle::Jungle, RandomJungle::Tree, RandomJungle::Tree::Node, RandomJungle::XML, RandomJungle::OOB, RandomJungle::RAW, RandomJungle::DB, RandomJungle::Classification_DB
AUTHOR
Robert R. Freimuth
COPYRIGHT
Copyright (c) 2011 Mayo Foundation for Medical Education and Research. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
The full text of the license can be found in the LICENSE file included with this module.