Bio::ToolBox::Data::core - Common functions to the Bio::ToolBox::Data family
Common methods for metadata and manipulation in a Bio::ToolBox::Data data table and Bio::ToolBox::Data::Stream file stream. This module should not be used directly. See the respective modules for more information.
For quick reference only. Please see Bio::ToolBox::Data for implementation.
Generate new object. Used as a common base for Bio::ToolBox::Data and Bio::ToolBox::Data::Stream.
Verify the integrity of the Data object. Checks multiple things, including metadata, table integrity (consistent number of rows and columns), and special file format structure.
This is a wrapper method that tries to do the right thing and passes on to either the "open_meta_database" or the "open_new_database" method. It is essentially a legacy method for "open_meta_database".
Open the database that is listed in the metadata. Returns the database connection. Pass a true value to force a new database connection to be opened, rather than returning a cached connection object (useful when forking).
Convenience method for opening a second or new database that is not specified in the metadata, useful for data collection. This is a shortcut to "open_db_connection" in Bio::ToolBox::db_helper. Pass the database name.
Verifies the existence of a dataset or data file before collecting data from it. Multiple datasets may be verified. This is a convenience method to "verify_or_request_feature_types" in Bio::ToolBox::db_helper. Pass the name of the dataset to verify.
Delete one or more columns in a data table. Pass a list of the indices to delete.
Reorder the columns in a data table. Allows for skipping (deleting) and duplicating columns. Pass a list of the new index order.
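The reordering semantics described above (skipping and duplicating by index) can be illustrated with a minimal plain-Perl sketch. This is not the module's implementation: the reorder_columns helper and the sample table are hypothetical, and the table is assumed to be an array of row arrays.

```perl
use strict;
use warnings;

# Hypothetical helper: rebuild every row using the given index order.
# Indices left out of the list are dropped; repeated indices are duplicated.
sub reorder_columns {
    my ($table, @order) = @_;
    my $last = $#{ $table->[0] };
    for my $i (@order) {
        die "invalid column index $i\n" if $i !~ /^\d+$/ || $i > $last;
    }
    # Take an array slice of each row in the requested order.
    return [ map { [ @{$_}[@order] ] } @$table ];
}

my $table = [
    [ 'Chromosome', 'Start', 'Stop', 'Name' ],
    [ 'chr1',       100,     200,    'geneA' ],
    [ 'chr2',       300,     400,    'geneB' ],
];

# Move Name to the front, drop Stop, and duplicate Start.
my $new = reorder_columns($table, 3, 0, 1, 1);
print join("\t", @$_), "\n" for @$new;
```

Validating the indices before touching the table means a bad index leaves the original data untouched.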
Returns or sets the string of the feature name listed in the metadata.
Returns "named", "coordinate", or "unknown" based on what kind of feature is present in the data table.
Returns or sets the program string in the metadata.
Returns or sets the name of the database in the metadata.
Returns or sets the short name of the BAM adapter being used: "sam" or "hts".
Returns or sets the short name of the bigWig and bigBed adapter being used: "ucsc" or "big".
Returns a text string describing the format of the file contents, such as gff3, gtf, bed, genePred, narrowPeak, etc.
Returns or sets the GFF version value in the metadata.
Returns or sets the number of BED columns in the metadata.
Returns or sets the number of columns in a UCSC-type file format, including genePred and refFlat.
Returns or sets the VCF version value in the metadata.
Returns the number of columns in the data table.
Returns the number of rows in the data table.
Returns the array index of the last column in the data table.
Returns the array index of the last row in the data table.
Returns the complete filename listed in the metadata.
Returns the base name of the filename listed in the metadata.
Returns the path portion of the filename listed in the metadata.
Returns the recognized extension of the filename listed in the metadata.
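How a complete filename breaks down into path, base name, and extension can be sketched with the core File::Basename module. The suffix list and the example path below are illustrative assumptions only; the set of extensions the module actually recognizes is not reproduced here.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Illustrative suffix patterns only (an assumption for this sketch).
my @suffixes = (
    qr/\.txt(?:\.gz)?/,
    qr/\.bed(?:\.gz)?/,
    qr/\.gff3?(?:\.gz)?/,
);

# fileparse returns the base name, the directory portion, and the
# matched suffix (empty string if no pattern matches).
my ($basename, $path, $extension) =
    fileparse('/data/peaks/sample1.bed.gz', @suffixes);

print "path=$path basename=$basename extension=$extension\n";
```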
Returns an array of comment lines present in the metadata.
Adds a string to the list of comments to be included in the metadata.
Deletes the indicated array index from the metadata comments array.
Partially parses VCF metadata header lines into a hash structure.
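As a rough illustration of what parsing VCF header lines into a hash can look like, here is a minimal sketch. The parse_vcf_headers helper and the resulting hash layout are assumptions made for this example and are not necessarily the structure the module produces.

```perl
use strict;
use warnings;

# Hypothetical parser: split '##key=value' header lines into a hash.
# Structured values such as <ID=DP,Number=1,...> become nested hashes
# keyed by their ID; simple values are stored as plain strings.
sub parse_vcf_headers {
    my @lines = @_;
    my %headers;
    for my $line (@lines) {
        next unless $line =~ /^##([^=]+)=(.*)$/;
        my ($key, $value) = ($1, $2);
        if ($value =~ /^<(.*)>$/) {
            my $body = $1;
            my %fields;
            # Values are either double-quoted (and may contain commas) or bare.
            while ($body =~ /\G,?([^=,]+)=("[^"]*"|[^,]*)/g) {
                my ($k, $v) = ($1, $2);
                $v =~ s/^"|"$//g;    # strip surrounding quotes
                $fields{$k} = $v;
            }
            my $id = delete $fields{ID};
            $headers{$key}{ defined $id ? $id : '' } = \%fields;
        }
        else {
            $headers{$key} = $value;
        }
    }
    return \%headers;
}

my $headers = parse_vcf_headers(
    '##fileformat=VCFv4.2',
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Total depth, all samples">',
);
print "format: $headers->{fileformat}\n";
print "DP type: $headers->{INFO}{DP}{Type}\n";
```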
Rewrites the VCF headers back into the metadata comments array.
Returns an array of the column names.
Returns or sets the name of the column. Pass the index, and optionally a new name.
Returns or sets the metadata key/value pair for a specific column. Pass the index, key, and optionally a new value.
Deletes the metadata key for a column. Pass the index and key.
Copies the metadata values from one column to another column. Pass the source and target indices.
Returns the column index for the column with the specified name. Name searches are case insensitive and can tolerate a # prefix character. The first match is returned. Pass the name to search.
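The matching rules described above (case insensitive, optional # prefix, first match wins) can be sketched in plain Perl. The find_column_index helper is hypothetical and assumes a whole-name comparison; the module's own matching may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper: return the index of the first column whose name
# matches $name, ignoring case and an optional leading '#'.
# Returns undef (in scalar context) when nothing matches.
sub find_column_index {
    my ($name, @columns) = @_;
    (my $wanted = lc $name) =~ s/^#//;
    for my $i (0 .. $#columns) {
        (my $candidate = lc $columns[$i]) =~ s/^#//;
        return $i if $candidate eq $wanted;
    }
    return;
}

my @columns = ('#Chromosome', 'Start', 'Stop', 'Score', 'score');

my $chromo = find_column_index('chromosome', @columns);  # matches '#Chromosome'
my $score  = find_column_index('SCORE', @columns);       # first of two matches
print "chromosome => $chromo, score => $score\n";
```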
Returns the index of the column that best represents the chromosome column.
Returns the index of the column that best represents the start, position, or transcription start column.
Returns the index of the column that best represents the stop or end column.
Returns the index of the column that best represents the strand.
Returns the index of the column that best represents the name.
Returns the index of the column that best represents the type.
Returns the index of the column that represents the Primary_ID column used in databases.
Returns the index of the column that represents the Score column in certain formats, such as GFF, BED, bedGraph, etc.
Returns true (1) or false (0) depending on whether the coordinate system appears to be an interbase (half-open, zero-based) coordinate system. This is based on the file type, e.g. .bed, or on whether the start coordinate column is named start0. The coordinate system can also be changed explicitly by passing an appropriate value; note that this also changes the start coordinate column name as appropriate.
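For readers unfamiliar with the two conventions, the arithmetic relating 1-based inclusive coordinates to interbase coordinates is sketched below. The helper names are hypothetical and the code is independent of the module; it only shows that the two systems differ in the start coordinate.

```perl
use strict;
use warnings;

# Convert a 1-based, inclusive interval to interbase (0-based, half-open).
sub to_interbase {
    my ($start, $stop) = @_;
    return ($start - 1, $stop);
}

# Convert an interbase interval back to 1-based, inclusive coordinates.
sub from_interbase {
    my ($start0, $stop) = @_;
    return ($start0 + 1, $stop);
}

my ($start0, $stop) = to_interbase(101, 200);

# Length is stop - start0 in interbase, and stop - start + 1 in 1-based.
printf "start0=%d stop=%d length=%d\n", $start0, $stop, $stop - $start0;
```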
Returns the stored SeqFeature object for a given row.
Bio::ToolBox::Data
Timothy J. Parnell, PhD Dept of Oncological Sciences Huntsman Cancer Institute University of Utah Salt Lake City, UT, 84112
This package is free software; you can redistribute it and/or modify it under the terms of the Artistic License 2.0.