Bio::Lite - Lightweight and fast module with a simplified API to ease scripting in bioinformatics
version 0.003
    # Reverse complementing a sequence
    my $seq = reverseComplemente("ATGC");

    # Reading a FASTQ file
    my $it = seqFileIterator('file.fastq','fastq');
    while(my $entry = $it->()) {
      print "Sequence name : $entry->{name} Sequence : $entry->{seq} Sequence quality: $entry->{qual}","\n";
    }

    # Reading paired-end files easier
    my $it = pairedEndSeqFileIterator($file);
    while (my $entry = $it->()) {
      print "Read_1 : $entry->{read1}->{seq} Read_2 : $entry->{read2}->{seq}";
    }

    # Parsing a GFF file
    my $it = gffFileIterator($file);
    while (my $annot = $it->()) {
      print "chr : $annot->{chr} start : $annot->{start} end : $annot->{end}";
    }
Bio::Lite is a set of subroutines that aim to answer the same kinds of questions as the BioPerl distribution, in a FAST and SIMPLE way.
Bio::Lite does not use complex data structures or objects, which would slow execution.
All methods can be imported with a single "use Bio::Lite".
Bio::Lite is a lightweight single module with NO DEPENDENCIES.
Reverse complement the (nucleotide) sequence given as argument.
Example:
my $seq_revcomp = reverseComplemente($seq);
reverseComplemente is more than 100x faster than Bio-Perl revcom_as_string()
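To illustrate the operation itself, here is a minimal pure-Perl sketch of reverse complementing. The name revcomp_sketch is hypothetical and this is not necessarily how Bio::Lite implements reverseComplemente; it assumes the input contains only A, C, G and T (upper or lower case) and leaves any other character unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper, not Bio::Lite's own code: reverse the string,
# then swap each base for its complement with tr///.
sub revcomp_sketch {
    my ($seq) = @_;
    my $rc = reverse $seq;            # scalar context: reverses the characters
    $rc =~ tr/ACGTacgt/TGCAtgca/;     # A<->T, C<->G, case preserved
    return $rc;
}

print revcomp_sketch("ATGC"), "\n";   # prints GCAT
```

Using tr/// keeps the work inside a single built-in operation instead of a per-character loop, which is the kind of choice that favors speed over generality.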
Convert strand from the '+/-' standard to the '1/-1' standard, and vice versa.

    say "Forward a: ",convertStrand('+');
    say "Forward b: ",convertStrand(1);
    say "Reverse a: ",convertStrand('-');
    say "Reverse b: ",convertStrand(-1);

will print

    Forward a: 1
    Forward b: +
    Reverse a: -1
    Reverse b: -
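The two-way conversion shown above can be pictured as a single lookup table. The following is only a sketch under that assumption: convert_strand_sketch is a hypothetical name, and returning undef for unknown values is a choice made here, not documented behavior of convertStrand.

```perl
use strict;
use warnings;

# Hypothetical lookup table covering both directions of the conversion.
my %STRAND_MAP = (
    '+' => 1,
    '-' => -1,
    1   => '+',
    -1  => '-',
);

# Returns the converted strand, or undef for an unrecognized value
# (an assumption of this sketch).
sub convert_strand_sketch {
    my ($strand) = @_;
    return $STRAND_MAP{$strand};
}

print "Forward a: ", convert_strand_sketch('+'), "\n";
print "Reverse b: ", convert_strand_sketch(-1),  "\n";
```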
These are tools for reading (bio) files.
Open FASTA or FASTQ files (which may be gzipped). seqFileIterator detects the format from the file extension, but you can force it by passing the format as a second parameter: 'fasta' or 'fastq'.

    my $it = seqFileIterator('file.fastq','fastq');
    while(my $entry = $it->()) {
      print "Sequence name : $entry->{name} Sequence : $entry->{seq} Sequence quality: $entry->{qual}","\n";
    }
Return: HashRef
    {
      name => 'sequence_identifier',
      seq  => 'sequence_value',
      qual => 'sequence_quality', # only defined for FASTQ files
    }

seqFileIterator is more than 50x faster than Bio-Perl Bio::SeqIO for FASTQ files
seqFileIterator is 4x faster than Bio-Perl Bio::SeqIO for FASTA files
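The iterator pattern used by seqFileIterator (a code reference that returns one hashref per call, then a false value at end of file) can be sketched with a closure. This is an illustration only: fastq_iterator_sketch is a hypothetical name, it assumes strict four-line FASTQ records, and it reads from an already opened handle rather than detecting the format from a file name.

```perl
use strict;
use warnings;

# Hypothetical sketch: build a closure that yields one FASTQ record per call.
sub fastq_iterator_sketch {
    my ($fh) = @_;
    return sub {
        my $header = <$fh>;
        return unless defined $header;    # end of input
        my $seq  = <$fh>;
        my $plus = <$fh>;                 # '+' separator line, ignored
        my $qual = <$fh>;
        return unless defined $qual;      # truncated record: stop iterating
        chomp($header, $seq, $qual);
        $header =~ s/^@//;                # drop the leading '@' of the name
        return { name => $header, seq => $seq, qual => $qual };
    };
}

# Demo on an in-memory file handle so the example needs no file on disk.
my $data = "\@read1\nACGT\n+\nIIII\n\@read2\nTTGA\n+\nFFFF\n";
open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";
my $it = fastq_iterator_sketch($fh);
while (my $entry = $it->()) {
    print "$entry->{name}\t$entry->{seq}\t$entry->{qual}\n";
}
close $fh;
```

Returning plain hashrefs from a closure avoids building an object per record, which matches the design goal stated in the description.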
Open Paired-End Sequence files using seqFileIterator()
Paired-end files are generated by Next Generation Sequencing technologies (like Illumina) where two reads are sequenced from the same DNA fragment and saved in separate files.

    my $it = pairedEndSeqFileIterator($file);
    while (my $entry = $it->()) {
      print "Read_1 : $entry->{read1}->{seq} Read_2 : $entry->{read2}->{seq}";
    }

    {
      read1 => 'see seqFileIterator() return',
      read2 => 'see seqFileIterator() return'
    }
pairedEndSeqFileIterator has no equivalent in Bio-Perl
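One way to picture the read1/read2 structure is as two record iterators advanced in lockstep. The sketch below assumes exactly that; paired_iterator_sketch and array_iterator are hypothetical names, the records are in-memory placeholders, and nothing here reflects how pairedEndSeqFileIterator locates or opens its files.

```perl
use strict;
use warnings;

# Hypothetical sketch: wrap two record iterators so each call returns one pair.
sub paired_iterator_sketch {
    my ($it1, $it2) = @_;
    return sub {
        my $read1 = $it1->();
        my $read2 = $it2->();
        # Stop as soon as either input is exhausted.
        return unless defined($read1) && defined($read2);
        return { read1 => $read1, read2 => $read2 };
    };
}

# Stand-in iterator over in-memory records (placeholder for a file reader).
sub array_iterator {
    my @records = @_;
    return sub { return shift @records };
}

my $it = paired_iterator_sketch(
    array_iterator({ name => 'frag1/1', seq => 'ACGT' },
                   { name => 'frag2/1', seq => 'GGCA' }),
    array_iterator({ name => 'frag1/2', seq => 'TTAG' },
                   { name => 'frag2/2', seq => 'CATG' }),
);
while (my $entry = $it->()) {
    print "Read_1 : $entry->{read1}->{seq} Read_2 : $entry->{read2}->{seq}\n";
}
```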
Manages the GFF3 and GTF2 file formats.

    my $it = gffFileIterator($file);
    while (my $annot = $it->()) {
      print "chr : $annot->{chr} start : $annot->{start} end : $annot->{end}";
    }

Returns a hashref with the parsed annotation:

    {
      chr        => 'field_1',
      source     => 'field_2',
      feature    => 'field_3',
      start      => 'field_4',
      end        => 'field_5',
      score      => 'field_6',
      strand     => 'field_7',
      frame      => 'field_8',
      attributes => { 'attribute_id' => 'attribute_value', ...}
    }
gffFileIterator is 5x faster than Bio-Perl Bio::Tools::GFF
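To show how one annotation line maps onto the hashref above, here is a minimal sketch that parses a single GFF3 line. The name parse_gff3_line_sketch is hypothetical, the sample line is made up for illustration, and the sketch assumes GFF3-style key=value attributes separated by semicolons; GTF2 attributes use a different syntax and are not handled here.

```perl
use strict;
use warnings;

# Hypothetical sketch: parse one GFF3 line into the hashref layout shown above.
sub parse_gff3_line_sketch {
    my ($line) = @_;
    chomp $line;
    return if $line =~ /^#/ || $line !~ /\S/;   # skip comments and blank lines

    # The nine tab-separated GFF columns.
    my ($chr, $source, $feature, $start, $end, $score, $strand, $frame, $attr)
        = split /\t/, $line;
    $attr = '' unless defined $attr;

    # Column 9: semicolon-separated key=value pairs.
    my %attributes;
    for my $pair (split /;/, $attr) {
        my ($key, $value) = split /=/, $pair, 2;
        $attributes{$key} = $value if defined($key) && length($key);
    }

    return {
        chr    => $chr,    source => $source, feature => $feature,
        start  => $start,  end    => $end,    score   => $score,
        strand => $strand, frame  => $frame,
        attributes => \%attributes,
    };
}

my $annot = parse_gff3_line_sketch(
    "chr1\texample\tgene\t100\t200\t.\t+\t.\tID=gene1;Name=demo\n");
print "chr : $annot->{chr} start : $annot->{start} end : $annot->{end}\n";
print "ID  : $annot->{attributes}->{ID}\n";
```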
Return a file handle for the file given as argument. Displays an error if the file cannot be opened, and manages gzipped files (based on the .gz file extension).

    my $fh = getReadingFileHandle('file.txt.gz');
    while(<$fh>) {
      print $_;
    }
    close $fh;

    my $fh = getWritingFileHandle('file.txt.gz');
    print $fh "Hello world\n";
    close $fh;
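Extension-based handling of gzipped files, for both reading and writing, can be sketched as follows. This is an assumption-laden illustration, not Bio::Lite's code: reading_handle_sketch and writing_handle_sketch are hypothetical names, and the sketch relies on the core modules IO::Uncompress::Gunzip and IO::Compress::Gzip, which may differ from what the module does internally.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Compress::Gzip qw($GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Hypothetical sketch: pick a plain or gzip-aware reader from the extension.
sub reading_handle_sketch {
    my ($path) = @_;
    if ($path =~ /\.gz$/) {
        my $gz = IO::Uncompress::Gunzip->new($path)
            or die "Cannot open $path: $GunzipError\n";
        return $gz;
    }
    open(my $fh, '<', $path) or die "Cannot open $path: $!\n";
    return $fh;
}

# Hypothetical sketch: same idea for writing.
sub writing_handle_sketch {
    my ($path) = @_;
    if ($path =~ /\.gz$/) {
        my $gz = IO::Compress::Gzip->new($path)
            or die "Cannot open $path: $GzipError\n";
        return $gz;
    }
    open(my $fh, '>', $path) or die "Cannot open $path: $!\n";
    return $fh;
}

# Round trip through a temporary gzipped file.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/file.txt.gz";
my $out  = writing_handle_sketch($path);
print $out "Hello world\n";
close $out;

my $in = reading_handle_sketch($path);
while (my $line = <$in>) {
    print $line;
}
close $in;
```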
Jérôme Audoux <jaudoux@cpan.org>
This software is Copyright (c) 2014 by Jérôme Audoux.
This is free software, licensed under:
The GNU General Public License, Version 3, June 2007
To install Bio::Lite, copy and paste the appropriate command into your terminal.
cpanm
cpanm Bio::Lite
CPAN shell
perl -MCPAN -e shell
install Bio::Lite
For more information on module installation, please visit the detailed CPAN module installation guide.