NAME

extract_bgc_sequences.pl - Extracts protein sequences for different BGC scales into a FASTA file

VERSION

version 0.192560

DESCRIPTION

This tool extracts sequences from Palantir (or antiSMASH) annotations of biosynthetic gene clusters (BGCs) and writes them to a FASTA file. Sequences can be extracted at four levels: cluster, gene, module or domain.

USAGE

        $0 [options] --report-file [=] <infile>

REQUIRED ARGUMENTS

--report[-file] [=] <infile>

Path to the antiSMASH output file, which can be either a biosynML.xml file (antiSMASH 3-4) or a regions.js file (antiSMASH 5).
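
As a rough illustration of the two accepted inputs, the following shell sketch guesses the antiSMASH generation from the name of the report file. The function name report_format and the labels it prints are hypothetical and not part of the tool, which may well detect the format in a different way.

```shell
# Hypothetical helper: classify a report path by its file name.
# Assumption: antiSMASH 3-4 reports end in "biosynML.xml" and
# antiSMASH 5 reports are named "regions.js", as described above.
report_format() {
    case "$(basename "$1")" in
        *biosynML.xml) echo "antismash3-4" ;;
        regions.js)    echo "antismash5" ;;
        *)             echo "unknown" ;;
    esac
}

# Example call with a placeholder path.
report_format "results/strain1/regions.js"
```

Such a check can be useful in a wrapper script that collects report files from several antiSMASH runs before passing them to --report-file.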

OPTIONAL ARGUMENTS

--annotation [=] <str>

BGC annotation to use for extracting sequences. Allowed values: palantir or antismash [default: palantir].

--types [=] <str>...

Restrict the extraction to clusters of one or more specific types.

Types allowed: acyl_amino_acids, amglyccycl, arylpolyene, bacteriocin, butyrolactone, cyanobactin, ectoine, hserlactone, indole, ladderane, lantipeptide, lassopeptide, microviridin, nrps, nucleoside, oligosaccharide, otherks, phenazine, phosphonate, proteusin, PUFA, resorcinol, siderophore, t1pks, t2pks, t3pks, terpene.

Any combination of these types, such as nrps-t1pks or t1pks-nrps, is also allowed. The argument is repeatable.
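
The combination rule can be sketched as a small shell check: a value is accepted when every hyphen-separated part belongs to the list above. The helper name is_valid_type is hypothetical, and the script itself may validate its arguments differently (for instance, it may or may not be case-sensitive).

```shell
# Allowed single types, copied from the list above.
ALLOWED_TYPES="acyl_amino_acids amglyccycl arylpolyene bacteriocin \
butyrolactone cyanobactin ectoine hserlactone indole ladderane \
lantipeptide lassopeptide microviridin nrps nucleoside oligosaccharide \
otherks phenazine phosphonate proteusin PUFA resorcinol siderophore \
t1pks t2pks t3pks terpene"

# Hypothetical helper: succeed if every hyphen-separated part of $1
# is one of the allowed types (comparison is case-sensitive here).
is_valid_type() {
    [ -n "$1" ] || return 1
    old_ifs=$IFS
    IFS=-
    set -- $1            # split the combination on hyphens
    IFS=$old_ifs
    for part in "$@"; do
        case " $(echo $ALLOWED_TYPES) " in
            *" $part "*) ;;          # known type, keep checking
            *) return 1 ;;           # unknown type, reject
        esac
    done
    return 0
}

is_valid_type "nrps-t1pks" && echo "nrps-t1pks is accepted"
```

Since no single type name contains a hyphen, splitting on hyphens is unambiguous.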

--prefix [=] <str>

Prefix string to use in sequence IDs (e.g., with the prefix Strain1: >Strain1@Cluster...).

--outfile [=] <filename>

FASTA output filename.

--scale [=] <str>

BGC scale at which to extract sequences: cluster, gene, module or domain [default: gene].

--version
--usage
--help
--man

Print the usual program information.
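
Putting the options above together, a typical call might look like the sketch below. The file names and the Strain1 prefix are placeholders, not values taken from the tool's documentation. The command is only printed (dry run), so that it can be checked before being run for real.

```shell
# Placeholder inputs (hypothetical names, adjust to your data).
report="regions.js"
outfile="strain1_nrps_domains.fasta"

# Assemble the command from the documented options.
cmd="extract_bgc_sequences.pl --report-file $report"
cmd="$cmd --annotation palantir --types nrps"
cmd="$cmd --scale domain --prefix Strain1 --outfile $outfile"

# Print the command instead of executing it.
echo "$cmd"
```

Here the extraction is limited to NRPS clusters and performed at the domain scale; dropping --types and --scale would fall back to all cluster types and the default gene scale.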

AUTHOR

Loic MEUNIER <lmeunier@uliege.be>

CONTRIBUTOR

Denis BAURAIN <denis.baurain@uliege.be>

COPYRIGHT AND LICENSE

This software is copyright (c) 2019 by University of Liege / Unit of Eukaryotic Phylogenomics / Loic MEUNIER and Denis BAURAIN.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.