NAME

Bio::ToolBox::Data::core - Common functions for the Bio::ToolBox::Data family

DESCRIPTION

Common methods for metadata and manipulation in a Bio::ToolBox::Data data table and Bio::ToolBox::Data::Stream file stream. This module should not be used directly. See the respective modules for more information.

METHODS REFERENCE

For reference only. Please use Bio::ToolBox::Data or Bio::ToolBox::Data::Stream instead.

new

Generates a new object.

verify

Verify the integrity of the Data object. Checks multiple things, including metadata, table integrity (consistent number of rows and columns), and special file format structure.
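The table integrity part of this check (a consistent number of columns in every row) can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the module's implementation; the function name table_is_consistent and the list-of-lists table layout are assumptions of the sketch.

```python
def table_is_consistent(table):
    """Return True when every row has as many columns as the header row.

    `table` is a hypothetical list-of-lists layout with the header first;
    it is an assumption of this sketch, not the module's internal structure.
    """
    if not table:
        return False
    width = len(table[0])
    return all(len(row) == width for row in table)


good = [["Chromosome", "Start", "Stop"], ["chr1", 100, 200]]
bad = [["Chromosome", "Start", "Stop"], ["chr1", 100]]
```

The real method also checks metadata and file-format-specific structure, which this sketch does not attempt to model.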

open_database

Open the database that is listed in the metadata. Returns the database connection. Pass a true value to force a new database connection to be opened, rather than returning a cached connection object (useful when forking).

verify_dataset($dataset)

Verifies the existence of a dataset or data file before collecting data from it. Multiple datasets may be verified. This is a convenience wrapper around Bio::ToolBox::db_helper::verify_or_request_feature_types().

delete_column(@indices)

Delete one or more columns in a data table.

reorder_column(@indices)

Reorder the columns in a data table. Allows for skipping (deleting) and duplicating columns.
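The skipping and duplicating behavior follows naturally from passing a list of source indices. The sketch below shows the idea in Python, not the module's Perl code; the function name reorder_columns, the list-of-lists table, and 0-based indexing are all assumptions made for illustration.

```python
def reorder_columns(table, indices):
    """Build a new table whose columns follow the order given in `indices`.

    An index left out of `indices` drops that column; an index listed
    twice duplicates it. Indices are 0-based in this sketch (an assumption).
    """
    return [[row[i] for i in indices] for row in table]


table = [
    ["Chromosome", "Start", "Stop", "Score"],
    ["chr1", 100, 200, 0.5],
    ["chr2", 300, 400, 1.5],
]

# Keep columns 0 and 1, duplicate column 1, skip column 2, keep column 3.
reordered = reorder_columns(table, [0, 1, 1, 3])
```

Here the "Stop" column is dropped because its index is omitted, and "Start" appears twice because its index is repeated.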

feature

Returns or sets the string of the feature name listed in the metadata.

feature_type

Returns "named", "coordinate", or "unknown" based on what kind of feature is present in the data table.

program

Returns or sets the program string in the metadata.

database

Returns or sets the name of the database in the metadata.

gff

Returns or sets the GFF version value in the metadata.

bed

Returns or sets the number of BED columns in the metadata.

ucsc

Returns or sets the number of columns in a UCSC-type file format, including genePred and refFlat.

vcf

Returns or sets the VCF version value in the metadata.

number_columns

Returns the number of columns in the data table.

last_row

Returns the array index of the last row in the data table.

filename

Returns the complete filename listed in the metadata.

basename

Returns the base name of the filename listed in the metadata.

path

Returns the path portion of the filename listed in the metadata.

extension

Returns the recognized extension of the filename listed in the metadata.
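The relationship between filename, path, basename, and extension can be sketched as a three-way split. This is a Python illustration only: the function name split_filename and the KNOWN_EXTENSIONS list are hypothetical, and the module's own list of recognized extensions and its treatment of the trailing path separator may differ.

```python
import posixpath

# Hypothetical list of recognized extensions, for illustration only.
KNOWN_EXTENSIONS = (".bed.gz", ".txt.gz", ".gff3", ".bed", ".txt")


def split_filename(filename):
    """Split a filename into (path, basename, extension).

    The longest matching known extension wins, so "peaks.bed.gz" yields
    ".bed.gz" rather than ".gz". An unrecognized extension is left as
    part of the basename.
    """
    path, leaf = posixpath.split(filename)
    for ext in sorted(KNOWN_EXTENSIONS, key=len, reverse=True):
        if leaf.endswith(ext):
            return path, leaf[: -len(ext)], ext
    return path, leaf, ""


parts = split_filename("/data/peaks.bed.gz")
```

Matching against a list of known extensions, rather than splitting at the last dot, is what lets compound extensions of compressed files be treated as a single unit.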

comments

Returns an array of comment lines present in the metadata.

add_comment($string)

Adds a string to the list of comments to be included in the metadata.

delete_comment($index)

Deletes the indicated array index from the metadata comments array.

list_columns

Returns an array of the column names.

name($index)
name($index, $newname)

Returns or sets the name of the column.

metadata($index, $key)
metadata($index, $key, $value)

Returns or sets the metadata key/value pair for a specific column.

delete_metadata($index, $key)

Deletes the metadata key for a column.

copy_metadata($source, $target)

Copies the metadata values from one column to another column.

find_column("string")

Returns the column index for the column with the specified name. Name searches are case insensitive and can tolerate a # prefix character. The first match is returned.
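The matching rules can be sketched as follows. This is a Python illustration of the behavior described, not the module's Perl implementation; the function name, 0-based indices, the None return for a missing column, and the choice of whole-name matching are assumptions of the sketch.

```python
def find_column(names, query):
    """Return the index of the first column whose name matches `query`.

    Matching ignores case and a single leading "#" on the column name.
    Returns None when nothing matches (an assumption of this sketch).
    """
    wanted = query.lower()
    for index, name in enumerate(names):
        candidate = name.lower()
        if candidate.startswith("#"):
            candidate = candidate[1:]
        if candidate == wanted:
            return index
    return None


columns = ["#Chrom", "Start", "End", "chrom"]
first_match = find_column(columns, "CHROM")
```

Both "#Chrom" and "chrom" would match the query, but only the earlier one is returned, which mirrors the first-match rule.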

chromo_column

Returns the index of the column that best represents the chromosome column.

start_column

Returns the index of the column that best represents the start, position, or transcription start column.

stop_column
end_column

Returns the index of the column that best represents the stop or end column.

strand_column

Returns the index of the column that best represents the strand.

name_column

Returns the index of the column that best represents the name.

type_column

Returns the index of the column that best represents the type.

id_column

Returns the index of the column that represents the Primary_ID column used in databases.

AUTHOR

 Timothy J. Parnell, PhD
 Dept of Oncological Sciences
 Huntsman Cancer Institute
 University of Utah
 Salt Lake City, UT, 84112

This package is free software; you can redistribute it and/or modify it under the terms of the Artistic License 2.0.