NAME

Bio::ToolBox::db_helper::bigwig

DESCRIPTION

This module provides support for binary BigWig files to the Bio::ToolBox package. It also supports a directory of one or more bigWig files as a combined database, known as a BigWigSet.

USAGE

The module requires Bio::DB::BigWig to be installed, which in turn requires the UCSC Kent C library to be installed.

In general, this module should not be used directly. Use the methods available in Bio::ToolBox::db_helper or Bio::ToolBox::Data.

All subroutines are exported by default.

Available subroutines

open_bigwig_db

This subroutine will open a BigWig database connection. Pass either the local path to a bigWig file (.bw or .bigwig extension) or the URL of a remote bigWig file. It will return the opened database object.
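As a minimal sketch of opening a file, the snippet below calls the subroutine by its fully qualified name. The path is a hypothetical placeholder, and treating a false return value as failure is an assumption rather than documented behavior. The module is loaded only if it is installed, so the snippet runs either way.

```perl
use strict;
use warnings;

# Hypothetical location; the URL of a remote bigWig file may be passed instead.
my $path = '/path/to/example.bw';

my $db;
if ( eval { require Bio::ToolBox::db_helper::bigwig; 1 } ) {
    # The inner eval guards against the call failing on the placeholder path.
    $db = eval { Bio::ToolBox::db_helper::bigwig::open_bigwig_db($path) };
}

# Assumption: a false value means the file could not be opened.
print $db ? "opened $path\n" : "could not open $path\n";
```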

open_bigwigset_db

This subroutine will open a BigWigSet database connection using a directory of BigWig files and one metadata index file, as described in Bio::DB::BigWigSet. Essentially, this treats a directory of BigWig files as a single database with each BigWig file representing a different feature with unique attributes (type, source, strand, etc).

Pass the subroutine a scalar value representing the local path to the directory. It presumes a feature_type of 'region', as expected by the other Bio::ToolBox db_helper subroutines and modules. It will return the opened database object.

collect_bigwig_score

This subroutine will collect a single value from a binary bigWig file. It uses the low-level summary method to collect the statistical information and is therefore significantly faster than the other methods, which rely upon parsing individual data points across the region.

The subroutine is passed a parameter array reference. See "Data Collection Parameters Reference" below for details.

The subroutine will return either a valid score or a null value.

collect_bigwigset_score

Similar to "collect_bigwig_score" but using a BigWigSet database of BigWig files. Unlike individual BigWig files, BigWigSet features support stranded data collection if a strand attribute is defined in the metadata file.

The subroutine is passed a parameter array reference. See "Data Collection Parameters Reference" below for details.

collect_bigwig_scores

This subroutine will collect only the score values from a binary BigWig file for the specified database region. The positional information of the scores is not retained.

The subroutine is passed a parameter array reference. See "Data Collection Parameters Reference" below for details.

The subroutine returns an array or array reference of the requested dataset values found within the region of interest.

collect_bigwigset_scores

Similar to "collect_bigwig_scores" but using a BigWigSet database of BigWig files. Unlike individual BigWig files, BigWigSet features support stranded data collection if a strand attribute is defined in the metadata file.

The subroutine is passed a parameter array reference. See "Data Collection Parameters Reference" below for details.

collect_bigwig_position_scores

This subroutine will collect the score values from a binary BigWig file for the specified database region keyed by position.

The subroutine is passed a parameter array reference. See "Data Collection Parameters Reference" below for details.

The subroutine returns a hash of the defined dataset values found within the region of interest keyed by position. Note that only one value is returned per position, regardless of the number of dataset features passed. Usually this isn't a problem as only one dataset is examined at a time.
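Because the result is a hash, the positions come back in no particular order. The sketch below uses a hypothetical stand-in hash (not real output from the subroutine) to show one way of walking the scores in coordinate order.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the hash returned by collect_bigwig_position_scores:
# keys are 1-based positions, values are scores.
my %pos2score = (
    1050 => 2.5,
    1010 => 0.8,
    1030 => 1.7,
);

# Hash keys are unordered, so sort numerically to walk the region in order.
my @positions = sort { $a <=> $b } keys %pos2score;
for my $pos (@positions) {
    printf "%d\t%.2f\n", $pos, $pos2score{$pos};
}
```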

collect_bigwigset_position_scores

Similar to "collect_bigwig_position_scores" but using a BigWigSet database of BigWig files. Unlike individual BigWig files, BigWigSet features support stranded data collection if a strand attribute is defined in the metadata file.

The subroutine is passed a parameter array reference. See "Data Collection Parameters Reference" below for details.

Data Collection Parameters Reference

The data collection subroutines are passed an array reference of parameters. The recommended method for data collection is the "get_segment_score" method in Bio::ToolBox::db_helper.

The parameters array reference includes these items:

1. chromosome
2. start coordinate
3. stop coordinate

Coordinates use the BioPerl-style 1-based system.

4. strand

Should be standard BioPerl representation: -1, 0, or 1.

5. strandedness

A scalar value representing the desired strandedness of the data to be collected. Acceptable values include "sense", "antisense", or "all". Only those scores which match the indicated strandedness are collected.

6. score method

Acceptable values include mean, min, max, stddev, sum, and count. Used when collecting a single value over a genomic segment.

Note: methods of pcount and ncount are technically supported, but are treated the same as count.

7. database object

Pass the opened Bio::DB::BigWigSet database object when working with BigWigSets. Otherwise, pass undef for BigWig files.

8. dataset name

For BigWig files, pass the path of a local bigWig file or the URL of a remote one. Opened BigWig objects are cached.

For BigWigSet databases, pass the name of the dataset within the BigWigSet database to use. Either the name or type may be used.

Additional dataset items may be added to the list when merging data.
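The following is a minimal sketch of assembling the parameter array for a single BigWig file, following the order of the numbered list above. Every value is a hypothetical placeholder, and the positions should be checked against the installed version of the module. The direct call is for illustration only, since "get_segment_score" remains the recommended route. It is attempted only when the module is installed.

```perl
use strict;
use warnings;

# Parameter order follows the numbered list above; all values are
# hypothetical placeholders.
my $params = [
    'chr1',                   # 1. chromosome
    1001,                     # 2. start coordinate (1-based)
    2000,                     # 3. stop coordinate
    1,                        # 4. strand: -1, 0, or 1
    'all',                    # 5. strandedness: sense, antisense, or all
    'mean',                   # 6. score method
    undef,                    # 7. database object; undef for a plain BigWig file
    '/path/to/example.bw',    # 8. dataset name: local path or remote URL
];

# Attempt the call only when the helper module is installed; the inner
# eval guards against failure on the placeholder path.
my $score;
if ( eval { require Bio::ToolBox::db_helper::bigwig; 1 } ) {
    $score = eval {
        Bio::ToolBox::db_helper::bigwig::collect_bigwig_score($params);
    };
}

# The result is either a score or the library's null value.
print "result: ", ( defined $score ? $score : 'undef' ), "\n";
```

For a BigWigSet, the seventh item would instead hold the opened Bio::DB::BigWigSet object and the eighth the dataset name or type within that set.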

SEE ALSO

Bio::ToolBox::Data::Feature, Bio::ToolBox::db_helper, Bio::DB::BigWig, Bio::DB::BigWigSet

AUTHOR

 Timothy J. Parnell, PhD
 Howard Hughes Medical Institute
 Dept of Oncological Sciences
 Huntsman Cancer Institute
 University of Utah
 Salt Lake City, UT, 84112

This package is free software; you can redistribute it and/or modify it under the terms of the Artistic License 2.0.