NAME

Bio::FASTASequence - Parse sequence information in FASTA format.

VERSION

version 0.07

SYNOPSIS

  use Bio::FASTASequence;
  my $fasta = qq~>sp|P01815|HV2B_HUMAN Ig heavy chain V-II region COR - Homo sapiens (Human).
QVTLRESGPALVKPTQTLTLTCTFSGFSLSSTGMCVGWIRQPPGKGLEWLARIDWDDDKY
YNTSLETRLTISKDTSRNQVVLTMDPVDTATYYCARITVIPAPAGYMDVWGRGTPVTVSS
  ~;
  my $seq = Bio::FASTASequence->new($fasta);

DESCRIPTION

This Perl module is a simple utility for common bioinformatics tasks. It extracts several pieces of information from a given FASTA sequence, such as:

accession number
description
sequence itself
length of sequence
crc64 checksum (as it is used by SWISS-PROT)
seq2xml

METHODS

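new

        my $seq = Bio::FASTASequence->new($fasta);

creates a new object from a string containing a sequence in FASTA format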

getAccessionNr

        my $accession = $seq->getAccessionNr();

returns the accession number of the FASTA sequence

getDescription

        my $description = $seq->getDescription();

returns the description from the header line of the FASTA entry (without the accession number)

getSequence

        my $sequence = $seq->getSequence();

returns the sequence

getCrc64

        my $crc64_checksum = $seq->getCrc64();

returns the CRC64 checksum of the sequence. This checksum corresponds to the CRC64 checksum used by SWISS-PROT

addDBRef

        $seq->addDBRef(DB, REFERENCE_AC);

DB is the name of the referenced database

REFERENCE_AC is the accession number in the referenced database

seq2file

        $seq->seq2file(FILENAME);

FILENAME is the path of the file in which the sequence is to be stored.

allIndexesOf

        my $indexes = $seq->allIndexesOf(EXPR);

returns a reference to an array containing all indexes of EXPR in the sequence
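
To illustrate how such a lookup can work, here is a minimal sketch built on Perl's core index function. The helper name all_indexes_of is hypothetical, and this is not necessarily how the module implements allIndexesOf. It assumes EXPR is a literal substring and that positions are zero-based, which is consistent with the EXAMPLE section adding 1 to each index.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::FASTASequence.
# Returns a reference to an array of zero-based start positions of
# $needle in $sequence (literal substring match, overlaps included).
sub all_indexes_of {
    my ($sequence, $needle) = @_;
    my @indexes;
    return \@indexes if !defined $needle || $needle eq '';

    my $pos = index($sequence, $needle);
    while ($pos != -1) {
        push @indexes, $pos;
        # Continue searching one character after the last hit.
        $pos = index($sequence, $needle, $pos + 1);
    }
    return \@indexes;
}

my $hits = all_indexes_of('QVTLRESGPALVKPTQTLTLTCTF', 'T');
print join(', ', @{$hits}), "\n";
```

Returning an array reference mirrors the documented return type, so the caller dereferences it with @{...} just as with the real method.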

getSequenceLength

        my $length = $seq->getSequenceLength();

returns the length of the sequence

getDBRefs

        my $hashref = $seq->getDBRefs();

returns a hash reference. The hash contains all database references, e.g. {'SWISS-PROT' => 'P01815'}.
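
As a small illustration of working with the returned structure, the sketch below iterates over a hash reference of the documented shape (database name mapped to accession number). The literal hash is a stand-in for the value returned by getDBRefs, so the snippet runs without the module installed.

```perl
use strict;
use warnings;

# Stand-in for the value returned by $seq->getDBRefs(); the shape
# (database name => accession number) is taken from the text above.
my $dbrefs = { 'SWISS-PROT' => 'P01815' };

# Sort the keys so the output order is stable.
for my $db (sort keys %{$dbrefs}) {
    print "$db => $dbrefs->{$db}\n";
}
```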

getFASTA

        my $fasta_sequence = $seq->getFASTA();

returns the sequence in FASTA-format

EXAMPLE

        use Bio::FASTASequence;
        my $fasta = qq~>sp|P01815|HV2B_HUMAN Ig heavy chain V-II region COR - Homo sapiens (Human).
        QVTLRESGPALVKPTQTLTLTCTFSGFSLSSTGMCVGWIRQPPGKGLEWLARIDWDDDKY
        YNTSLETRLTISKDTSRNQVVLTMDPVDTATYYCARITVIPAPAGYMDVWGRGTPVTVSS
        ~;

        my $seq = Bio::FASTASequence->new($fasta);

        print 'The sequence of '.$seq->getAccessionNr().' is '.$seq->getSequence(),"\n";
        print 'This sequence contains '.scalar(@{$seq->allIndexesOf('C')}).' cysteine residues at the following positions: ';
        print join(', ', map { $_ + 1 } @{$seq->allIndexesOf('C')}), "\n";

ABSTRACT

  Bio::FASTASequence is a Perl module to parse information out of a FASTA sequence.

ADDITIONAL INFORMATION

accepted formats

This module can parse the following formats:

>P02656 APC3_HUMAN Apolipoprotein C-III precursor (Apo-CIII).
>IPI:IPI00166553|REFSEQ_XP:XP_290586|ENSEMBL:ENSP00000331094|TREMBL:Q8N3H0 T Hypothetical protein
>sp|P01815|HV2B_HUMAN Ig heavy chain V-II region COR - Homo sapiens (Human).
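
To make the header layout concrete, the following sketch splits a header of the third form (">sp|ACCESSION|ENTRY_NAME description") into its parts. The helper parse_sp_header and the entry_name key are hypothetical; the module's actual parsing rules, in particular for the other two formats, are not shown here and may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::FASTASequence. Handles only
# headers of the form ">sp|ACCESSION|ENTRY_NAME description".
sub parse_sp_header {
    my ($header) = @_;
    if ($header =~ /^>sp\|([^|\s]+)\|(\S+)\s+(.*?)\s*$/) {
        return {
            accession_nr => $1,
            entry_name   => $2,
            description  => $3,
        };
    }
    return;    # not an "sp|" style header
}

my $info = parse_sp_header(
    '>sp|P01815|HV2B_HUMAN Ig heavy chain V-II region COR - Homo sapiens (Human).'
);
print "$info->{accession_nr}: $info->{description}\n" if $info;
```

The accession_nr and description keys follow the names used in the structure section, so the result can be compared directly with the dump shown there.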

structure

The structure of the hash for the example is:

        $VAR1 = {
                 'seq_length' => 120,
                 'accession_nr' => 'P01815',
                 'text' => 'QVTLRESGPALVKPTQTLTLTCTFSGFSLSSTGMCVGWIRQPPGKGLEWLARIDWDDDKYYNTSLETRLTISKDTSRNQVVLTMDPVDTATYYCARITVIPAPAGYMDVWGRGTPVTVSS',
                 'crc64' => '158A8B29AE7EEB98',
                 'dbrefs' => {},
                 'description' => 'Ig heavy chain V-II region COR - Homo sapiens (Human).'
               }

If something is missing, please contact me.

BUGS

There are no known bugs. If you experience any problems, please contact me.

MODIFICATIONS

More FASTA-Description lines are accepted.

AUTHOR

Renee Baecker <reneeb@cpan.org>

COPYRIGHT AND LICENSE

This is free software, licensed under:

  The Artistic License 2.0 (GPL Compatible)
