Bio::SeqIO - Handler for SeqIO Formats
    use Bio::SeqIO;

    $in  = Bio::SeqIO->new(-file => "inputfilename",   -format => 'Fasta');
    $out = Bio::SeqIO->new(-file => ">outputfilename", -format => 'EMBL');

    # read each sequence from the input stream and write it to the output
    while ( my $seq = $in->next_seq() ) {
        $out->write_seq($seq);
    }
or
    use Bio::SeqIO;

    $in  = Bio::SeqIO->new(-file => "inputfilename",   -format => 'Fasta');
    $out = Bio::SeqIO->new(-file => ">outputfilename", -format => 'EMBL');

    # tie the streams to filehandles so that <> and print work on sequences
    tie *INPUT,  'Bio::SeqIO::Handler', $in;
    tie *OUTPUT, 'Bio::SeqIO::Handler', $out;

    while ( my $seq = <INPUT> ) {
        print OUTPUT $seq;
    }
Bio::SeqIO is a handler module for the formats in the SeqIO set (e.g., Bio::SeqIO::Fasta). It is the officially sanctioned way of getting at the format objects, and it is the interface most people should use.
The SeqIO system replaces the old parse_XXX functions in the Seq object.
The idea is that you request a stream object for a particular format. Every stream object has a notion of an internal file that is read from or written to (the same object handles both input and output). A concrete example of a stream object is the Bio::SeqIO::Fasta object.
Each stream object has functions
$stream->next_seq();
and
$stream->write_seq($seq);
also
$stream->type() # returns 'INPUT' or 'OUTPUT'
As an added bonus, any stream object can be tied to the SeqIO handler, so that you can use it as if it were a Perl-style filehandle, except that rather than producing (or writing) lines it produces (or writes) sequence objects. So you can do this:
    use Bio::SeqIO;

    $stream = Bio::SeqIO->new(-file => $filename, -format => 'Fasta');
    tie *SEQ, 'Bio::SeqIO::Handler', $stream;

    while ( my $seq = <SEQ> ) {
        # do something with $seq
    }
print SEQ $seq; # when stream is in output mode
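The tie mechanism described above can be illustrated without Bioperl installed. What follows is only a minimal sketch of how a tied handle might forward `<FH>` and `print FH` to a stream object; the packages My::MockStream and My::Handler are hypothetical stand-ins invented for this example, and the real Bio::SeqIO::Handler may be implemented differently.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a format stream object (not part of Bioperl).
package My::MockStream;

sub new {
    my ($class, @seqs) = @_;
    return bless { in => [@seqs], out => [] }, $class;
}

# Return the next sequence, or undef once the stream is exhausted.
sub next_seq {
    my $self = shift;
    return shift @{ $self->{in} };
}

# Record a sequence as written.
sub write_seq {
    my ($self, $seq) = @_;
    push @{ $self->{out} }, $seq;
    return 1;
}

# Hypothetical handler class: forwards filehandle operations to a stream.
package My::Handler;

sub TIEHANDLE {
    my ($class, $stream) = @_;
    return bless { stream => $stream }, $class;
}

# Invoked by <FH>; hands back one sequence per call.
sub READLINE {
    my $self = shift;
    return $self->{stream}->next_seq();
}

# Invoked by print FH LIST; writes each item to the stream.
sub PRINT {
    my ($self, @seqs) = @_;
    $self->{stream}->write_seq($_) for @seqs;
    return 1;
}

package main;

my $in  = My::MockStream->new('acgt', 'ttga');
my $out = My::MockStream->new();

tie *INPUT,  'My::Handler', $in;
tie *OUTPUT, 'My::Handler', $out;

# Plain strings stand in for sequence objects here.
while ( defined( my $seq = <INPUT> ) ) {
    print OUTPUT uc($seq);
}
```

Because the loop only sees filehandle operations, the same loop body would work unchanged with any stream object that offers next_seq and write_seq.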
This makes the simplest ever reformatter
    #!/usr/local/bin/perl

    use Bio::SeqIO;

    $format1 = shift;
    $format2 = shift || die "Usage: reformat format1 format2 < input > output";

    $in  = Bio::SeqIO->new(-fh => \*STDIN,  -format => $format1);
    $out = Bio::SeqIO->new(-fh => \*STDOUT, -format => $format2);

    tie *INPUT,  'Bio::SeqIO::Handler', $in;
    tie *OUTPUT, 'Bio::SeqIO::Handler', $out;

    while ( my $seq = <INPUT> ) {
        print OUTPUT $seq;
    }
Note that the reformatter converts only the information held in the Seq object, which at the moment is just the sequence and the id. Richer information will be converted once the expanded sequence object that the Bioperl developers are discussing becomes available.
It is therefore not suitable for reformatting GenBank to EMBL, but it was never designed for that task anyway.
User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to one of the Bioperl mailing lists. Your participation is much appreciated.
    vsns-bcd-perl@lists.uni-bielefeld.de       - General discussion
    vsns-bcd-perl-guts@lists.uni-bielefeld.de  - Technically-oriented discussion
    http://bio.perl.org/MailList.html          - About the mailing lists
Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via email or the web:
    bioperl-bugs@bio.perl.org
    http://bio.perl.org/bioperl-bugs/
Email birney@sanger.ac.uk
The rest of the documentation details each of the object methods. Internal methods are usually prefixed with an underscore (_).
 Title   : new
 Usage   : $stream = Bio::SeqIO->new(-file => $filename, -format => 'Format')
 Function: Returns a new seqstream
 Returns : A Bio::SeqIO::Handler initialised with the appropriate format
 Args    : -file   => $filename
           -format => format
           -fh     => filehandle to attach to
 Title   : _load_format_module
 Usage   : *INTERNAL SeqIO stuff*
 Function: Loads up (like use) a module at run time on demand
 Example :
 Returns :
 Args    :
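Loading a module at run time, as _load_format_module is described as doing, is usually done in Perl with require inside an eval. The sketch below shows that general technique only; the helper name load_module is hypothetical, and the actual body of _load_format_module is not shown in this documentation and may differ. The core module File::Basename is used purely as a stand-in for a format module.

```perl
use strict;
use warnings;

# Hypothetical helper: load a module by name at run time.
# Returns 1 on success and 0 on failure.
sub load_module {
    my ($module) = @_;

    # Accept only plain package names such as Foo::Bar.
    return 0 unless $module =~ /\A\w+(?:::\w+)*\z/;

    # require with a string needs a file path: Foo::Bar becomes Foo/Bar.pm.
    ( my $file = "$module.pm" ) =~ s{::}{/}g;

    # eval traps the exception that require throws if the file is missing.
    my $ok = eval { require $file; 1 };
    warn "Could not load $module: $@" unless $ok;

    return $ok ? 1 : 0;
}

# Load the stand-in module on demand, then call it by its full name.
if ( load_module('File::Basename') ) {
    print File::Basename::basename('/tmp/seqs.fasta'), "\n";
}
```

Unlike use, which runs at compile time, this defers loading until the format is actually requested, so only the format modules a program needs are ever compiled.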