Bio::DB::GFF::Adaptor::dbi::mysqlopt -- Optimized Bio::DB::GFF adaptor for mysql
See Bio::DB::GFF
This adaptor is similar to Bio::DB::GFF::Adaptor::mysqlopt, except that it implements several optimizations:
It uses a hierarchical binning scheme to dramatically accelerate feature queries that use positional information.
Because mysql is slow when fetching substrings out of large text BLOBs, this adaptor uses Bio::DB::Fasta to fetch DNA segments rapidly out of FASTA files.
Features can be linked to ACEDB objects, allowing this module to be used as a replacement for the Ace::Sequence module.
The schema is identical to Bio::DB::GFF::Adaptor::dbi, except for the fdata table:
    fid            feature ID (integer)
    fref           reference sequence name (string)
    fstart         start position relative to reference (integer)
    fstop          stop position relative to reference (integer)
    fbin           bin containing this feature (float)
    ftypeid        feature type ID (integer)
    fscore         feature score (float); may be null
    fstrand        strand; one of "+" or "-"; may be null
    fphase         phase; one of 0, 1 or 2; may be null
    gid            group ID (integer)
    ftarget_start  for similarity features, the target start position (integer)
    ftarget_stop   for similarity features, the target stop position (integer)
The only difference is the "fbin" field, which indicates the interval in which the feature is contained. This module uses a hierarchical set of bins, the smallest of which is 1 kb and the largest of which is 100 megabases.
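The binning idea can be illustrated with a minimal sketch. It assumes that bin sizes grow tenfold from 1 kb to 100 megabases and that a feature is assigned to the smallest bin that contains it entirely. The helper name bin_for and the "tier.index" encoding of the bin value are hypothetical, chosen only to match the float type of the fbin column; the module's own implementation may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the smallest bin tier that fully contains
# the feature, starting at $minbin and growing tenfold up to $maxbin.
sub bin_for {
    my ($start, $stop, $minbin, $maxbin) = @_;
    $minbin //= 1_000;          # assumed default: 1 kb
    $maxbin //= 100_000_000;    # assumed default: 100 megabases
    my $tier = $minbin;
    while ($tier < $maxbin) {
        # Stop once start and stop fall into the same bin at this tier.
        last if int($start / $tier) == int($stop / $tier);
        $tier *= 10;
    }
    # Encode tier and bin index as one number-like string.
    return sprintf("%d.%06d", $tier, int($start / $tier));
}

print bin_for(5_200, 5_900), "\n";     # fits inside a single 1 kb bin
print bin_for(5_200, 12_000), "\n";    # crosses 1 kb and 10 kb boundaries
```

Short features land in small bins and long features in large ones, so a positional query only has to inspect a handful of bins per tier instead of scanning every feature on the reference sequence.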
In the call to initialize() you can set the following options:
    -minbin               minimum value to use for binning
    -maxbin               maximum value to use for binning
    -straight_join_limit  size of range over which it is faster to force mysql to use the range for indexing
-minbin and -maxbin indicate the minimum and maximum sizes of features, and are important for range query optimization. They default to reasonable values; in particular, the maximum bin size is set to 100 megabases. Do not change them unless you know what you are doing.
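To see why the bin limits matter for range queries, consider the following sketch. It assumes the same tenfold tiers and "tier.index" bin encoding as an illustration only; bins_to_search is a hypothetical helper, not part of the module. For a query range it lists, per tier, the lowest and highest bin that could hold an overlapping feature.

```perl
use strict;
use warnings;

# Hypothetical helper: for a query range, list the [lowest, highest] bin
# at each tier that could hold a feature overlapping the range.
sub bins_to_search {
    my ($start, $stop, $minbin, $maxbin) = @_;
    my @ranges;
    for (my $tier = $minbin; $tier <= $maxbin; $tier *= 10) {
        push @ranges, [
            sprintf("%d.%06d", $tier, int($start / $tier)),
            sprintf("%d.%06d", $tier, int($stop  / $tier)),
        ];
    }
    return @ranges;
}

# Print one candidate bin interval per tier for the range 5,200..12,000.
for my $range (bins_to_search(5_200, 12_000, 1_000, 100_000)) {
    print "fbin between $range->[0] and $range->[1]\n";
}
```

The number of tiers, and therefore the number of bin intervals a query must test, is fixed by the ratio of the maximum to the minimum bin size. A feature found in a candidate bin still has to be checked against its actual start and stop positions.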
 Title   : new
 Usage   : $db = Bio::DB::GFF->new(@args)
 Function: create a new adaptor
 Returns : a Bio::DB::GFF object
 Args    : see below
 Status  : Public
The new constructor is identical to the "dbi" adaptor's new() method, except that the prefix "dbi:mysql" is added to the database DSN identifier automatically if it is not there already.
  Argument       Description
  --------       -----------

  -dsn           the DBI data source, e.g. 'dbi:mysql:ens0040' or "ens0040"

  -fasta         path to a directory containing FASTA files for this database
                 (e.g. "/usr/local/share/fasta")

  -acedb         an acedb URL to use when converting features into ACEDB
                 objects (e.g. sace://localhost:2005)

  -user          username for authentication

  -pass          the password for authentication

  -minbin        minimum value to use for binning

  -maxbin        maximum value to use for binning
The path indicated by -fasta must be writable by the current process. This is needed in order to build an index of the fasta files.
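The DSN handling described for the constructor can be sketched as follows. The helper normalize_dsn is hypothetical and only illustrates the documented behavior; the exact test the adaptor applies (here, a case-insensitive check for a leading "dbi:") is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the documented behavior: prepend
# "dbi:mysql:" unless the DSN already carries a "dbi:" prefix.
sub normalize_dsn {
    my ($dsn) = @_;
    return $dsn =~ /^dbi:/i ? $dsn : "dbi:mysql:$dsn";
}

print normalize_dsn('ens0040'), "\n";             # bare database name
print normalize_dsn('dbi:mysql:ens0040'), "\n";   # already a full DSN
```

Both calls yield the same data source, which is why the -dsn argument accepts either 'dbi:mysql:ens0040' or "ens0040". With BioPerl installed and a database available, this adaptor is typically selected by passing -adaptor => 'dbi::mysqlopt' to Bio::DB::GFF->new alongside the arguments listed above.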
 Title   : freshen_ace
 Usage   : $flag = Bio::DB::GFF->freshen_ace;
 Function: Refresh internal acedb handle
 Returns : flag if correctly freshened
 Args    : none
 Status  : Public
ACeDB has an annoying way of timing out, leaving dangling database handles. This method will invoke the ACeDB reopen() method, which causes dangling handles to be refreshed. It has no effect if you are not using ACeDB to create ACeDB objects.