extract_bgc_sequences.pl - Extracts protein sequences for different BGC scales into a FASTA file
version 0.191800
extract_bgc_sequences.pl extracts sequences from Palantir (or antiSMASH) annotations and writes them to a FASTA file. Sequences can be extracted at one of four levels: cluster, gene, module, or domain.
extract_bgc_sequences.pl [options] --report-file [=] <infile>
Path to the antiSMASH output file, which can be either a biosynML.xml file (antiSMASH 3-4) or a regions.js file (antiSMASH 5).
BGC annotation to use for extracting sequences. Allowed annotations: palantir or antismash [default: palantir].
Filter clusters by one or more specific types.
Types allowed: acyl_amino_acids, amglyccycl, arylpolyene, bacteriocin, butyrolactone, cyanobactin, ectoine, hserlactone, indole, ladderane, lantipeptide, lassopeptide, microviridin, nrps, nucleoside, oligosaccharide, otherks, phenazine, phosphonate, proteusin, PUFA, resorcinol, siderophore, t1pks, t2pks, t3pks, terpene.
Any combination of these types, such as nrps-t1pks or t1pks-nrps, is also allowed. The argument is repeatable.
Prefix string to use in sequence IDs (e.g., with the prefix Strain1: >Strain1@Cluster...)
FASTA output filename.
BGC scale at which to extract sequences: cluster, gene, module, or domain [default: gene].
Print the usual program information.
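The cluster-type filter described above accepts either a single listed type or a hyphen-joined combination such as nrps-t1pks. The following Python sketch shows one way such a value could be checked before running the tool. It is an illustration only and is not taken from extract_bgc_sequences.pl or Bio::Palantir: the names ALLOWED_TYPES, is_valid_type and invalid_types are hypothetical, and it assumes that matching is case-sensitive and that a combination is valid whenever each of its parts is a listed type.

```python
# Illustrative sketch only: not the tool's own validation code.
# The type names are copied from the list of allowed types above.
ALLOWED_TYPES = frozenset({
    "acyl_amino_acids", "amglyccycl", "arylpolyene", "bacteriocin",
    "butyrolactone", "cyanobactin", "ectoine", "hserlactone", "indole",
    "ladderane", "lantipeptide", "lassopeptide", "microviridin", "nrps",
    "nucleoside", "oligosaccharide", "otherks", "phenazine", "phosphonate",
    "proteusin", "PUFA", "resorcinol", "siderophore", "t1pks", "t2pks",
    "t3pks", "terpene",
})


def is_valid_type(value: str) -> bool:
    """Return True if every hyphen-separated part is a listed type.

    Assumption: matching is case-sensitive and the order of the parts
    does not matter (nrps-t1pks and t1pks-nrps are both accepted).
    """
    parts = value.split("-")
    return all(part in ALLOWED_TYPES for part in parts)


def invalid_types(values):
    """Return the requested type filters that fail the check, in order."""
    return [value for value in values if not is_valid_type(value)]
```

Because the argument is repeatable, a wrapper script could pass every requested value to invalid_types and stop early if the returned list is not empty. Whether the tool itself rejects unknown types or repeated parts such as nrps-nrps is not stated here, so this check may be stricter or looser than the real behaviour.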
Loic MEUNIER <lmeunier@uliege.be>
Denis BAURAIN <denis.baurain@uliege.be>
This software is copyright (c) 2019 by University of Liege / Unit of Eukaryotic Phylogenomics / Loic MEUNIER and Denis BAURAIN.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
To install Bio::Palantir, copy and paste the appropriate command into your terminal.
cpanm:
    cpanm Bio::Palantir
CPAN shell:
    perl -MCPAN -e shell
    install Bio::Palantir
For more information on module installation, please visit the detailed CPAN module installation guide.