NAME

biotree - Tree manipulations based on BioPerl

SYNOPSIS

biotree [options] <tree file>

biotree [-h | --help | -V | --version | --man]

 biotree -t tree.newick                   # preview [t]ext tree
 biotree -l tree.newick                   # total tree [l]ength
 biotree -m tree.newick                   # [m]id-point rooting
 biotree -u tree.newick                   # list all OT[u]s
 biotree -d 'otu1,otu2,otu3' tree.newick  # [d]elete these OTUs
 biotree -s 'otu1,otu2' tree.newick       # [s]ubset these OTUs
 biotree -D '0.9' tree.newick             # [D]elete low-support (< 0.9) branches
 biotree -r 'otu1' tree.newick            # [r]eroot with an OTU as outgroup
 biotree -o 'tabtree' tree.newick         # [o]utput tree in text format
 biotree --ci 'binary-trait' tree         # consistency indices at informative sites

DESCRIPTION

Designed as a UNIX-like utility, biotree reads a tree file and reformats branches and nodes based on these BioPerl modules: Bio::TreeIO, Bio::Tree::Tree, Bio::Tree::Node, and Bio::Tree::TreeFunctionsI.

Trees can be in any format supported by Bio::TreeIO in BioPerl. However, biotree has not been tested on all possible formats, so behavior may be unexpected with some. Currently, biotree does not support multiple trees per file.

biotree supports reading from STDIN, so multiple tree manipulations can be chained with pipes ("|").
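
As a rough sketch of such chaining, the script below writes a small example tree and passes it through three biotree calls, using only options listed in the SYNOPSIS. The OTU names, branch lengths, and the file name example.newick are made up for illustration, and the sketch assumes that biotree reads STDIN when no tree file is given.

```shell
# Write a small example tree (hypothetical OTU names and branch lengths).
printf '%s\n' '((otu1:0.1,otu2:0.2):0.05,(otu3:0.3,otu4:0.4):0.05);' > example.newick

if command -v biotree >/dev/null 2>&1; then
    # Mid-point root the tree, delete otu4, then preview the result as text.
    # Assumption: each later biotree call reads the tree from STDIN.
    biotree -m example.newick | biotree -d 'otu4' | biotree -t
else
    echo 'biotree not found on PATH; skipping the pipeline' >&2
fi
```

Each stage emits a tree that the next stage consumes, so intermediate files are not needed.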

OPTIONS

--as-text, -t

Draw an ASCII tree for quick preview (needs refinement). The default maximum screen width is 100 characters.

--ci, -c 'binary-trait-file'

Attach a file containing binary trait values and print the consistency index for informative sites (not verified).

--clean-br, -b

Remove branch lengths from all nodes.

--clean-boot, -B

Remove all branch support values.

--cut-tree "length"

Identify clades based on branches that are bisected by a cut line (halfway to the maximum by default).

Throws an error if the cut is greater than the least deep OTU (based on node height, which is the distance from the root).

--cut-sis "cutoff"

Prefix a node id with "cut_", marking it to be cut, if its number of descendant nodes is less than the cutoff.

--del-otus, -d 'otu1,otu2,etc'

Get a subtree by removing the specified OTUs.

--del-low-boot, -D 'cutoff'

Remove branches supported by less than specified cutoff value, creating a multi-furcating tree.
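
A minimal sketch of this option, using the invocation from the SYNOPSIS. The tree, its support values (0.95 and 0.60), and the file name support.newick are made up for illustration; the sketch assumes that support values are stored as internal node labels in the Newick string.

```shell
# Example tree with support values as internal node labels (hypothetical values).
printf '%s\n' '((otu1:0.1,otu2:0.2)0.95:0.05,(otu3:0.3,otu4:0.4)0.60:0.05,otu5:0.5);' > support.newick

if command -v biotree >/dev/null 2>&1; then
    # Remove branches whose support is below 0.9.
    biotree -D '0.9' support.newick
else
    echo 'biotree not found on PATH; skipping' >&2
fi
```

With a cutoff of 0.9, the branch labeled 0.60 falls below the threshold while the branch labeled 0.95 does not.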

--del-short-br, -E 'cutoff'

Remove branches shorter than specified cutoff value, creating a multi-furcating tree.

--depth 'node'

Prints depth to root. Accepts node names and/or IDs.

--dist 'node1,node2'

Prints the distance between a pair of nodes or leaves.

--dist-all

Prints half-matrix list of distances between all leaves.

--ead

Edge-length abundance distribution, a statistic of tree balance (O'Dwyer et al., PNAS 2015).

--ids-all

Print ids for all nodes (internal nodes included) in the order of tree traversal from root

--input, -i 'format'

Input file format. Accepts newick and nhx, and now also a parent-child table.

--label-nodes

Prepends ID to each leaf/node label. Useful when identifying unlabeled nodes, such as when using --otus-desc or --subset.

--label-selected-nodes 'file'

Adds clade labels to selected internal nodes, based on a file containing, on each line, an internal id and a label. Internal ids can be obtained by using "--label-nodes" or "-U 'all'".

Nodes not in the file are unlabeled (or removed if bootstrap value exists).

--lca 'node1,node2,node3,etc'

Returns ID of most recent common ancestor across provided nodes. Returns direct ancestor if single leaf/node provided.

--length, -l

Print total branch length.

--length-all, -L

Prints all nodes and branch lengths.

--ltt 'number_of_bins'

For making a lineage-through-time plot: divides the tree into the specified number of segments and counts branches up to the height of each segment. Returns: bin_number, branch_count, bin_floor, bin_ceiling.

--mid-point, -m

Reroot tree at mid-point

--multi2bi

Force a multi-furcating tree into a bifurcating tree (by randomly resolving nodes with multiple descendants).

--otus-all, -u

Print leaf nodes with branch lengths.

--otus-desc, -U 'internal_node_id' | 'all'

Prints all OTUs that are descended from the given internal node (identifiable by running --label-nodes). If 'all', a complete list of all internal nodes and their descendants is returned instead (given in the order of "walking" through the tree from the root node).

--otus-num, -n

Print total number of OTUs (leaves).

--output, -o 'format'

Output file format. Accepts newick, nhx, and tabtree.

--random sample_size

Builds a tree of a random subset of the original tree's OTUs.

--ref <OTU>

Rotate <OTU> to be the top tip

--rename-tips <file>

Rename tips according to a 2-column table
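
A minimal sketch of preparing such a table. The column order (current name, then new name) and the tab delimiter are assumptions, as are the OTU names and the file names rename.newick and rename.tsv; check the table format against your biotree version before relying on it.

```shell
# Example tree and rename table (all names are hypothetical).
printf '%s\n' '((otu1:0.1,otu2:0.2):0.05,otu3:0.3);' > rename.newick

# Assumed layout: current tip name, a tab, then the new tip name.
printf 'otu1\tspecies_A\notu2\tspecies_B\n' > rename.tsv

if command -v biotree >/dev/null 2>&1; then
    biotree --rename-tips rename.tsv rename.newick
else
    echo 'biotree not found on PATH; skipping' >&2
fi
```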

--reroot, -r 'newroot'

Reroot the tree at the specified node by creating a new branch, using either an OTU name (-r "otu:id") or an internal node id (-r "intid:xxx"). Note that an OTU can be named either way, but an internal node only by the "intid" tag.

--rotate-node 'inode_internal_id'

Flip the two descendant nodes of an internal node (dies if the node is multi-furcating). Useful for plotting.

--sis-pairs

For each pair of OTUs, print 1 if they are sister OTUs and 0 if they are not.

--subset, -s 'node1,node2,node3,etc'

Creates a tree of only the specified leaves/nodes and their descendants. Specifying a single internal node produces a subtree from that node.

--swap-otus 'OTU'

Output tree with each possible pair swapped (can't remember why this method was written, please ignore).

--tips-to-root

Print all tip distances to root

--tree-shape

Print a matrix of tree shapes (input file for R Package apTreeshape)

--trim-tips 'num'

Trim from tips, as opposed to --cut-tree (which cuts from the root).

--walk, -w 'otu'

Walks along the tree starting from the specified OTU and prints the total distance traveled while reaching each other OTU. Does not count any segment more than once. Useful when calculating evolutionary distance from a reference OTU.

--walk-edge 'otu'

Traverses the tree starting from the specified OTU and prints the edges while reaching each other OTU. Does not count any edge more than once. Used for, e.g., transforming a tree into a network.

Options common to all BpWrappers utilities

--help, -h

Print a brief help message and exit.

--man (but not "-m")

Print the manual page and exit.

--version, -V

Print current release version of this command and exit.

SEE ALSO

CONTRIBUTORS

  • Rocky Bernstein (testing & release)

  • Yözen Hernández yzhernand at gmail dot com (initial design of implementation)

  • Weigang Qiu <weigang@genectr.hunter.cuny.edu> (maintainer)

TO ADD

  • Insert an option in biotree

  • Insert new code in lib/Bio/BPWrapper/TreeManipulations.pm. Test by using or adding a test file in test-files/.

  • Add documentation in POD in biotree

TO DO

  • Place holder

TO CITE

  • Hernandez, Bernstein, Pagan, Vargas, McCaig, Laracuente, Di, Vieira, and Qiu (2018). "BpWrapper: BioPerl-based sequence and tree utilities for rapid prototyping of bioinformatics pipelines". BMC Bioinformatics. 19:76. https://doi.org/10.1186/s12859-018-2074-9

  • Stajich et al (2002). "The BioPerl Toolkit: Perl Modules for the Life Sciences". Genome Research 12(10):1611-1618.