Bio::Phylo::Parsers::Newick - Parser used by Bio::Phylo::IO, no serviceable parts inside
This module parses tree descriptions in parenthetical (newick) format. It is called by the Bio::Phylo::IO facade; do not call it directly. Several additional flags can be passed to the Bio::Phylo::IO parse and parse_tree functions to control how complex newick strings are handled:
-keep => [ ...list of taxa names... ]
The -keep flag retains only the listed taxa of interest and ignores all others while building the tree object.
-ignore_comments => 1,
This treats square brackets, which normally delimit comments, as ordinary taxon name characters, so that names such as Choristoneura diversana|BC ZSM Lep 23401[05/* are parsed "successfully". (Note: square brackets should NOT be used in this way, as doing so will break many parsers.)
-keep_whitespace => 1,
This treats unescaped whitespace as an ordinary taxon name character. By default, whitespace is retained only inside quoted strings (e.g. 'Homo sapiens'); elsewhere the convention is to use underscores (Homo_sapiens). Unquoted whitespace is discarded by default because some programs introduce whitespace to prettify a newick string, e.g. to indicate indentation or depth, in which case you almost certainly want to ignore it. The option to keep it is provided for dealing with incorrectly formatted data.
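The flags above are passed alongside the usual -format and -string arguments. The following is a minimal sketch rather than a definitive recipe: the newick strings and taxon names are made up for illustration, and the exact shape of the pruned tree is not asserted here. The -ignore_comments flag is passed in the same way.

```perl
use strict;
use warnings;
use Bio::Phylo::IO qw(parse parse_tree);

# Illustrative input; the taxa are arbitrary examples.
my $newick = '((Homo_sapiens,Pan_troglodytes),(Gorilla_gorilla,Pongo_pygmaeus));';

# Plain call: parse() returns a forest holding every tree in the string.
my $forest = parse(
    -format => 'newick',
    -string => $newick,
);

# -keep: retain only the listed taxa while the tree is built.
# parse_tree() returns the first tree instead of a forest.
my $pruned = parse_tree(
    -format => 'newick',
    -string => $newick,
    -keep   => [ 'Homo_sapiens', 'Pan_troglodytes', 'Gorilla_gorilla' ],
);
print $_->get_name, "\n" for @{ $pruned->get_terminals };

# -keep_whitespace: treat unquoted whitespace as part of the taxon name.
# Intended for incorrectly formatted data such as this string.
my $spaced = parse_tree(
    -format          => 'newick',
    -string          => '(Homo sapiens,Pan troglodytes);',
    -keep_whitespace => 1,
);
print $_->get_name, "\n" for @{ $spaced->get_terminals };
```

Without -keep_whitespace the second string would have its unquoted spaces discarded, so quoting the names or using underscores remains the better fix when you control the input.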
There is a mailing list at https://groups.google.com/forum/#!forum/bio-phylo for any user or developer questions and discussions.
The newick parser is called by the Bio::Phylo::IO object. Look there to learn how to parse newick strings.
If you use Bio::Phylo in published research, please cite it:
Rutger A Vos, Jason Caravas, Klaas Hartmann, Mark A Jensen and Chase Miller, 2011. Bio::Phylo - phyloinformatic analysis using Perl. BMC Bioinformatics 12:63. http://dx.doi.org/10.1186/1471-2105-12-63