Bio::MUST::Core::Ali::Temporary - Thin wrapper for a temporary mapped Ali written on disk
version 0.182420
    #!/usr/bin/env perl

    use Modern::Perl '2011';
    # same as:
    # use strict;
    # use warnings;
    # use feature qw(say);

    use Bio::MUST::Core;
    use aliased 'Bio::MUST::Core::Ali::Temporary';

    # build Ali::Temporary object from existing ALI file
    my $temp_db = Temporary->new( seqs => 'database.ali' );

    # get properties
    my $db = $temp_db->filename;
    my $dbtype = $temp_db->type;

    # pass it to external program
    system("makeblastdb -in $db -dbtype $dbtype");

    # alternative constructor call
    # build Ali::Temporary object from existing Ali object
    use aliased 'Bio::MUST::Core::Ali';
    my $ali = Ali->load('queries.ali');
    my $temp_qu = Temporary->new( seqs => $ali );

    # pass it to external program
    use File::Temp;
    my $query = $temp_qu->filename;
    my $out = File::Temp->new( UNLINK => 0, SUFFIX => '.blastp' );
    system("blastp -query $query -db $db -out $out");
    say "report: $out";

    # later... when parsing the BLAST report
    # let's say $id is a BLAST hit in database.ali
    my $id = 'seq2';
    my $long_id = $temp_db->long_id_for($id);
    say "hit id: $long_id";

    # ...

    # more alternative constructor calls
    # build Ali::Temporary object from list of Seq objects
    my @seqs = $ali->filter_seqs( sub { $_->seq_len >= 500 } );
    my $temp_ls = Temporary->new( seqs => \@seqs );

    # build Ali::Temporary object preserving gaps in Seq objects
    # (and persistent associated FASTA file)
    my $temp_gp = Temporary->new(
        seqs => \@seqs,
        args => { degap => 0, persistent => 1 }
    );
    my $filename = $temp_gp->filename;
    # later...
    unlink $filename;
This module implements a class representing a temporary FASTA file where sequence ids are automatically abbreviated (seq1, seq2...) for maximum compatibility with external programs. To this end, it combines an internal Bio::MUST::Core::Ali object and a Bio::MUST::Core::IdMapper object.
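The idea of abbreviated ids backed by a mapping can be illustrated with core Perl only. The sketch below is not the module's implementation: the helper write_abbreviated_fasta, the plain hash standing in for the IdMapper, and the two example ids and sequences are all hypothetical, chosen only to show the principle.

```perl
use strict;
use warnings;
use File::Temp;

# Hypothetical helper (not part of Bio::MUST::Core): write sequences under
# abbreviated ids (seq1, seq2...) and return the temporary file object
# together with a hash mapping each abbreviated id back to its long id.
sub write_abbreviated_fasta {
    my ($pairs) = @_;    # ArrayRef of [ long_id, sequence ] pairs

    my $fh = File::Temp->new( SUFFIX => '.fasta' );  # removed on destruction
    my %long_id_for;
    my $n = 0;
    for my $pair ( @{$pairs} ) {
        my ($long_id, $seq) = @{$pair};
        my $abbr_id = 'seq' . ++$n;
        $long_id_for{$abbr_id} = $long_id;
        print {$fh} ">$abbr_id\n$seq\n";
    }
    $fh->flush;    # make sure external programs see the full content

    return ($fh, \%long_id_for);
}

# made-up ids and sequences, for illustration only
my ($fh, $long_id_for) = write_abbreviated_fasta( [
    [ 'Homo sapiens_9606@P12345',  'MKVLAAGIV' ],
    [ 'Mus musculus_10090@Q67890', 'MKVLSAGIV' ],
] );

print 'file: ', $fh->filename, "\n";
print "seq2 is: $long_id_for->{seq2}\n";
```

An external program only ever sees the short ids; the hash is what allows its output to be translated back to the original ids afterwards.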
An Ali::Temporary can be built from an existing ALI (or FASTA) file or on-the-fly from a list (ArrayRef) of Bio::MUST::Core::Seq objects (see the SYNOPSIS for examples).
Its sequences can be aligned or not, but by default they are degapped before the associated temporary FASTA file is written. If gaps are to be preserved, this behavior can be altered via the optional args attribute.
Bio::MUST::Core::Ali object (required)
This required attribute (seqs in the constructor) contains the Bio::MUST::Core::Seq objects that are written to the associated temporary FASTA file. It can be specified as a path to an ALI/FASTA file, as an Ali object, or as an ArrayRef of Seq objects (see the SYNOPSIS for examples).
For now, it provides the following methods: count_comments, all_comments, get_comment, guessing, all_seq_ids, has_uniq_ids, is_protein, is_aligned, get_seq, get_seq_with_id, first_seq, all_seqs, filter_seqs and count_seqs (see Bio::MUST::Core::Ali).
HashRef (optional)
When specified, this optional attribute (args in the constructor) is passed to the temp_fasta method of the internal Ali object. Its purpose is to allow fine-tuning of the format of the associated temporary FASTA file.
By default, its contents are clean => 1 and degap => 1, so as to generate a FASTA file of degapped sequences where ambiguous and missing states are replaced by X.
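As a rough illustration of what degapping and cleaning amount to for a protein sequence, here is a core-Perl sketch. It does not reproduce the module's actual rules: the function degap_and_clean is hypothetical, and both the set of gap symbols (dash, asterisk, whitespace) and the set of unambiguous states (the 20 standard amino acids) are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical function (not part of Bio::MUST::Core).
sub degap_and_clean {
    my ($seq) = @_;

    my $out = uc $seq;
    $out =~ s/[-*\s]//g;                  # degap: drop assumed gap symbols
    $out =~ tr/ACDEFGHIKLMNPQRSTVWY/X/c;  # clean: any other state becomes X

    return $out;
}

print degap_and_clean('MK-L?B*AV '), "\n";
```

In this sketch, the missing state ? and the ambiguous state B are both turned into X, while the gap symbols disappear entirely.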
Additionally, if you want to keep your temporary files around for debugging purposes, you can pass the option persistent => 1. This will disable the autoremoval of the file on object destruction.
Path::Class::File object (auto)
This attribute is automatically initialized with the path of the associated temporary FASTA file. Thus, it cannot be user-specified.
It provides the following methods: remove and filename (see below).
Bio::MUST::Core::IdMapper object (auto)
This attribute is automatically initialized with the mapper associating the long ids of the internal Ali object to the abbreviated ids used in the associated temporary FASTA file. Thus, it cannot be user-specified.
It provides the following methods: all_long_ids, all_abbr_ids, long_id_for and abbr_id_for (see Bio::MUST::Core::IdMapper).
The filename method returns the stringified filename of the associated temporary FASTA file.
This method does not accept any arguments.
The type method returns the type of the sequences in the internal Ali object using BLAST denomination (prot or nucl). See Bio::MUST::Core::Seq::is_protein for the exact test performed.
The remove method removes (unlinks) the associated temporary FASTA file.
Since this method is in principle automatically invoked on object destruction, users should not need it. Note that persistent temporary files (see object constructor) have to be removed manually, which requires getting and storing their filename before object destruction.
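The need to store the filename before destruction can be seen in a small analogy built on File::Temp alone. This is only a sketch under the assumption that a persistent temporary file behaves like a File::Temp object created with UNLINK => 0; it does not use Ali::Temporary itself, and the file content is made up.

```perl
use strict;
use warnings;
use File::Temp;

my $filename;
{
    # UNLINK => 0 plays the role of persistent => 1 in this analogy
    my $tmp = File::Temp->new( UNLINK => 0, SUFFIX => '.fasta' );
    print {$tmp} ">seq1\nMKVLAAGIV\n";
    $tmp->flush;
    $filename = $tmp->filename;    # store the path while the object is alive
}   # $tmp is destroyed here, but the file stays on disk

print "still on disk: $filename\n" if -e $filename;

# manual cleanup, once the file is no longer needed
unlink $filename or warn "could not remove $filename: $!";
```

Once the object is gone, the stored path is the only remaining handle on the file, which is why it has to be saved beforehand.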
Denis BAURAIN <denis.baurain@uliege.be>
This software is copyright (c) 2013 by University of Liege / Unit of Eukaryotic Phylogenomics / Denis BAURAIN.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.