

Bio::Phylo::Parsers::Fasta - Parser used by Bio::Phylo::IO, no serviceable parts inside


A very simplistic FASTA file parser. To use it, pass an argument that specifies the data type of the FASTA records to the parse function, for example:

 use Bio::Phylo::IO 'parse';

 my $project = parse(
    -type       => 'dna', # or rna, protein
    -format     => 'fasta',
    -file       => 'infile.fa',
    -as_project => 1,
 );

For each FASTA record, the first "word" on the definition line is used as the name of the produced datum object. The entire line is assigned to:

 $datum->set_generic( 'fasta_def_line' => $line )

So you can retrieve it by calling:

 my $line = $datum->get_generic('fasta_def_line');
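The retrieval step can be sketched end to end as follows. This is a minimal illustration, not the module's own test code: the two FASTA records are invented, the input is passed with the -string argument of Bio::Phylo::IO instead of -file, and the traversal assumes that the returned project exposes its matrices through get_matrices and that each matrix lists its datum objects through get_entities.

```perl
use strict;
use warnings;
use Bio::Phylo::IO 'parse';

# Two made-up records, used only for illustration.
my $fasta = <<'FASTA';
>seq1 first example record
ACGTACGT
>seq2 second example record
ACGTTCGT
FASTA

# Parse from a string instead of a file; -type is still required.
my $project = parse(
    -type       => 'dna',
    -format     => 'fasta',
    -string     => $fasta,
    -as_project => 1,
);

# Assumption: the project holds one or more matrices, each of which
# contains one datum object per FASTA record.
for my $matrix ( @{ $project->get_matrices } ) {
    for my $datum ( @{ $matrix->get_entities } ) {
        my $name = $datum->get_name;                        # first "word" of the definition line
        my $line = $datum->get_generic('fasta_def_line');   # the stored definition line
        print "$name\t$line\n";
    }
}
```

Each printed row pairs the datum name with the definition line it was taken from, which is a quick way to check what the parser kept for every record.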

BioPerl actually parses definition lines to extract GIs and similar identifiers, so if that is what you need, use Bio::SeqIO from the bioperl-live distribution. You can then pass the resulting Bio::Seq objects to Bio::Phylo::Matrices::Datum->new_from_bioperl to turn them into Bio::Phylo::Matrices::Datum objects.
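A rough sketch of that route is shown here. It assumes BioPerl is installed alongside Bio::Phylo, reuses the infile.fa file name from the example above, and relies only on the new_from_bioperl constructor named in this document plus the standard Bio::SeqIO reading loop.

```perl
use strict;
use warnings;
use Bio::SeqIO;
use Bio::Phylo::Matrices::Datum;

# Let BioPerl read the file and parse the definition lines.
my $in = Bio::SeqIO->new(
    -file   => 'infile.fa',
    -format => 'fasta',
);

my @data;
while ( my $seq = $in->next_seq ) {

    # BioPerl has already split the definition line into an ID and a description.
    printf "%s\t%s\n", $seq->display_id, $seq->desc // '';

    # Convert the Bio::Seq object into a Bio::Phylo datum object.
    push @data, Bio::Phylo::Matrices::Datum->new_from_bioperl($seq);
}
```

The resulting datum objects are collected in @data here purely for illustration; what to do with them afterwards depends on the analysis.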


There is a mailing list at!forum/bio-phylo for any user or developer questions and discussions.


The fasta parser is called by the Bio::Phylo::IO object. Look there to learn more about parsing.


Also see the manual: Bio::Phylo::Manual and


If you use Bio::Phylo in published research, please cite it:

Rutger A Vos, Jason Caravas, Klaas Hartmann, Mark A Jensen and Chase Miller, 2011. Bio::Phylo - phyloinformatic analysis using Perl. BMC Bioinformatics 12:63.