NAME

Bio::Lite - Lightweight and fast module with a simplified API to ease scripting in bioinformatics

VERSION

version 0.003

SYNOPSIS

  # Reverse complementing a sequence
  my $seq = reverseComplemente("ATGC");

  # Reading a FASTQ file
  my $it = seqFileIterator('file.fastq','fastq');
  while(my $entry = $it->()) {
    print "Sequence name   : $entry->{name}
           Sequence        : $entry->{seq}
           Sequence quality: $entry->{qual}","\n";
  }

  # Reading paired-end files
  my $it = pairedEndSeqFileIterator($file);
  while (my $entry = $it->()) {
    print "Read_1 : $entry->{read1}->{seq}
           Read_2 : $entry->{read2}->{seq}";
  }

  # Parsing a GFF file
  my $it = gffFileIterator($file);
  while (my $annot = $it->()) {
    print "chr    : $annot->{chr}
           start  : $annot->{start}
           end    : $annot->{end}";
  }

DESCRIPTION

Bio::Lite is a set of subroutines that aims to cover the same needs as the BioPerl distribution in a FAST and SIMPLE way.

Bio::Lite does not use complex data structures or objects, which would slow down execution.

All subroutines can be imported with a single "use Bio::Lite".

Bio::Lite is a lightweight, single-file module with NO DEPENDENCIES.

UTILS

reverseComplemente

Reverse complement the (nucleotide) sequence given as argument.

Example:

  my $seq_revcomp = reverseComplemente($seq);

reverseComplemente is more than 100x faster than Bio-Perl revcom_as_string().
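
A common way to reverse complement a sequence in pure Perl is to combine reverse with tr///. The sketch below illustrates the operation only: revcomp_sketch is a hypothetical name, and this is not necessarily how Bio::Lite implements reverseComplemente. It assumes plain A, C, G, T input and leaves any other character (such as N) unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Lite: reverse the string,
# then swap each base for its complement with tr///.
sub revcomp_sketch {
    my ($seq) = @_;
    my $rc = reverse $seq;          # scalar context: reverses the characters
    $rc =~ tr/ACGTacgt/TGCAtgca/;   # complement, preserving case
    return $rc;
}

print revcomp_sketch("ATGC"), "\n";   # prints GCAT
```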

convertStrand

Convert a strand from the '+/-' notation to the '1/-1' notation, and vice versa.

Example:

  say "Forward a: ",convertStrand('+');
  say "Forward b: ",convertStrand(1);
  say "Reverse a: ",convertStrand('-');
  say "Reverse b: ",convertStrand(-1);

will print

  Forward a: 1
  Forward b: +
  Reverse a: -1
  Reverse b: -
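
A two-way conversion like this can be expressed as a single lookup table. The following is a minimal sketch under that assumption: convert_strand_sketch and %STRAND_MAP are hypothetical names, and the behaviour of the real convertStrand on unknown input is not documented here (this sketch simply returns undef).

```perl
use strict;
use warnings;

# Hypothetical lookup table covering both directions of the conversion.
my %STRAND_MAP = (
    '+'  => 1,
    '-'  => -1,
    '1'  => '+',
    '-1' => '-',
);

# Hypothetical helper, not part of Bio::Lite.
sub convert_strand_sketch {
    my ($strand) = @_;
    return $STRAND_MAP{$strand};   # undef for unknown values
}

print "Forward a: ", convert_strand_sketch('+'), "\n";
print "Forward b: ", convert_strand_sketch(1),   "\n";
print "Reverse a: ", convert_strand_sketch('-'), "\n";
print "Reverse b: ", convert_strand_sketch(-1),  "\n";
```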

PARSING

These tools aim to read common bioinformatics files, such as:

Sequence files : FASTA, FASTQ
Annotation files : GFF3, GTF2, BED6, BED12, ...
Alignment files : SAM, BAM

seqFileIterator

Open FASTA or FASTQ files (which can be gzipped). seqFileIterator detects the format automatically from the file extension, but you can force it with a second parameter giving the format: 'fasta' or 'fastq'.

Example:

  my $it = seqFileIterator('file.fastq','fastq');
  while(my $entry = $it->()) {
    print "Sequence name   : $entry->{name}
           Sequence        : $entry->{seq}
           Sequence quality: $entry->{qual}","\n";
  }

Return: HashRef

  { name => 'sequence_identifier',
    seq  => 'sequence_value',
    qual => 'sequence_quality', # only defined for FASTQ files
  }

seqFileIterator is more than 50x faster than Bio-Perl Bio::SeqIO for FASTQ files, and 4x faster for FASTA files.
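
The iterator pattern used above (a code reference that returns one record per call, then undef) can be built with a closure. The sketch below is an illustration under stated assumptions, not the module's implementation: fastq_iterator_sketch is a hypothetical name, it takes an already opened file handle, and it assumes strict four-line FASTQ records.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Lite: returns a closure that yields
# one FASTQ record per call and undef once the handle is exhausted.
sub fastq_iterator_sketch {
    my ($fh) = @_;
    return sub {
        my $header = <$fh>;
        return unless defined $header;     # end of file
        my $seq  = <$fh>;
        my $plus = <$fh>;                  # '+' separator line, ignored
        my $qual = <$fh>;
        return unless defined $qual;       # truncated record
        chomp($header, $seq, $qual);
        $header =~ s/^@//;                 # drop the leading '@'
        return { name => $header, seq => $seq, qual => $qual };
    };
}

# In-memory FASTQ data so the example runs without an input file.
my $data = "\@read1\nATGC\n+\nIIII\n\@read2\nGGCC\n+\nHHHH\n";
open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";

my $it = fastq_iterator_sketch($fh);
while (my $entry = $it->()) {
    print "$entry->{name}\t$entry->{seq}\t$entry->{qual}\n";
}
close $fh;
```

Keeping the file handle inside the closure means the caller only ever sees plain hashrefs, which matches the no-objects approach described for this module.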

pairedEndSeqFileIterator

Open Paired-End Sequence files using seqFileIterator()

Paired-end files are generated by Next Generation Sequencing technologies (like Illumina), where two reads are sequenced from the same DNA fragment and saved in separate files.

Example:

  my $it = pairedEndSeqFileIterator($file);
  while (my $entry = $it->()) {
    print "Read_1 : $entry->{read1}->{seq}
           Read_2 : $entry->{read2}->{seq}";
  }

Return: HashRef

  { read1 => 'see seqFileIterator() return',
    read2 => 'see seqFileIterator() return'
  }

pairedEndSeqFileIterator has no equivalent in Bio-Perl.
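
Conceptually, a paired-end iterator advances two single-end iterators in lockstep and bundles their records. The sketch below shows that idea with hypothetical helpers (paired_iterator_sketch and array_iterator_sketch); it uses in-memory records instead of files and does not claim to reproduce the module's actual argument handling.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap two record iterators and advance them together.
sub paired_iterator_sketch {
    my ($it1, $it2) = @_;
    return sub {
        my $read1 = $it1->();
        my $read2 = $it2->();
        # Stop as soon as either side runs out of records.
        return unless defined $read1 && defined $read2;
        return { read1 => $read1, read2 => $read2 };
    };
}

# Hypothetical stand-in for a file-backed iterator: walks over a list.
sub array_iterator_sketch {
    my @records = @_;
    return sub { return shift @records };
}

my $it = paired_iterator_sketch(
    array_iterator_sketch({ name => 'frag1/1', seq => 'ATGC' }),
    array_iterator_sketch({ name => 'frag1/2', seq => 'GGCC' }),
);
while (my $entry = $it->()) {
    print "Read_1 : $entry->{read1}->{seq}\n";
    print "Read_2 : $entry->{read2}->{seq}\n";
}
```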

gffFileIterator

Parse files in the GFF3 and GTF2 formats.

Example:

  my $it = gffFileIterator($file);
  while (my $annot = $it->()) {
    print "chr    : $annot->{chr}
           start  : $annot->{start}
           end    : $annot->{end}";
  }

Return: a hashref with the parsed annotation:

  { chr         => 'field_1',
    source      => 'field_2',
    feature     => 'field_3',
    start       => 'field_4',
    end         => 'field_5',
    score       => 'field_6',
    strand      => 'field_7',
    frame       => 'field_8',
    attributes  => { 'attribute_id' => 'attribute_value', ...}
  }

gffFileIterator is 5x faster than Bio-Perl Bio::Tools::GFF.
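
To show how one annotation line maps onto the hashref above, here is a minimal sketch that parses a single GFF3 line. parse_gff3_line_sketch is a hypothetical name, not a Bio::Lite subroutine. It assumes tab-separated columns and GFF3-style 'key=value' attributes separated by semicolons; GTF2 attributes use a different syntax and are not handled. It needs Perl 5.10 or later for the // operator.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Lite: split one GFF3 line into the
# nine standard columns and parse the attribute column into a hash.
sub parse_gff3_line_sketch {
    my ($line) = @_;
    chomp $line;
    my ($chr, $source, $feature, $start, $end,
        $score, $strand, $frame, $attr_string) = split /\t/, $line;

    my %attributes;
    for my $pair (split /;/, $attr_string // '') {
        my ($key, $value) = split /=/, $pair, 2;
        $attributes{$key} = $value if defined $key && length $key;
    }

    return {
        chr        => $chr,
        source     => $source,
        feature    => $feature,
        start      => $start,
        end        => $end,
        score      => $score,
        strand     => $strand,
        frame      => $frame,
        attributes => \%attributes,
    };
}

my $annot = parse_gff3_line_sketch(
    "chr1\tensembl\tgene\t1000\t2000\t.\t+\t.\tID=gene1;Name=ABC\n"
);
print "chr   : $annot->{chr}\n";
print "start : $annot->{start}\n";
print "end   : $annot->{end}\n";
print "ID    : $annot->{attributes}->{ID}\n";
```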

FILES IO

getReadingFileHandle

Return a file handle for reading the file given as argument. Displays an error if the file cannot be opened, and handles gzipped files (detected from the .gz file extension).

Example:

  my $fh = getReadingFileHandle('file.txt.gz');
  while(<$fh>) {
    print $_;
  }
  close $fh;
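
Extension-based gzip handling can be sketched as follows. This is an assumption-laden illustration, not the module's code: reading_handle_sketch is a hypothetical name, and the sketch relies on the core modules IO::Uncompress::Gunzip, IO::Compress::Gzip, and File::Temp, whereas Bio::Lite may open gzipped files in a different way.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Hypothetical helper, not part of Bio::Lite: pick the opener from the
# file extension and die with a readable message on failure.
sub reading_handle_sketch {
    my ($file) = @_;
    my $fh;
    if ($file =~ /\.gz$/) {
        $fh = IO::Uncompress::Gunzip->new($file)
            or die "Cannot open gzipped file '$file': $GunzipError\n";
    }
    else {
        open($fh, '<', $file) or die "Cannot open file '$file': $!\n";
    }
    return $fh;
}

# Create a small gzipped file in a temporary directory for the demo.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/demo.txt.gz";
my $text = "Hello world\n";
gzip(\$text => $path) or die "gzip failed: $GzipError\n";

my $fh = reading_handle_sketch($path);
while (my $line = <$fh>) {
    print $line;
}
close $fh;
```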

getWritingFileHandle

Return a file handle for writing to the file given as argument. Displays an error if the file cannot be opened, and handles gzipped files (detected from the .gz file extension).

Example:

  my $fh = getWritingFileHandle('file.txt.gz');
  print $fh "Hello world\n";
  close $fh;

TODO

AUTHOR

Jérôme Audoux <jaudoux@cpan.org>

COPYRIGHT AND LICENSE

This software is Copyright (c) 2014 by Jérôme Audoux.

This is free software, licensed under:

  The GNU General Public License, Version 3, June 2007