NAME

export_bgc_sql_tables.pl - Exports SQL tables of BGC results (antiSMASH or Palantir version)

VERSION

version 0.191620

NAME

export_bgc_sql_tables.pl - Creates an SQL database from antiSMASH results and allows queries to be run on it.

VERSION

This documentation refers to version 0.0.1

USAGE

    $0 [options] --infiles <paths>... --taxdir <dir>
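
As an illustration, a typical command line might be assembled as in the sketch below. Every path is a hypothetical placeholder, and only options listed under OPTIONS are used. The command is printed rather than run, so the sketch works without the tool installed.

```shell
# All paths below are hypothetical placeholders; adapt them to your data.
# --db-name is set to its documented default (bgc-db) for clarity.
cmd="export_bgc_sql_tables.pl --infiles org_a/biosynML.xml org_b/biosynML.xml --taxdir taxdump/ --db-name bgc-db --cpu 4"

# Print the command instead of running it.
echo "$cmd"
```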

REQUIRED ARGUMENTS

OPTIONS

--infiles [=] <paths>...

Path to biosynML.xml or regions.js files, including at least the directory containing each file. This option can take multiple values.

--file-table [=] <tsv_file>

TSV (tab-separated values) file giving the unambiguous paths of the XML reports, proteomes and Quast files. Column order: XML reports (1st column), proteomes (2nd column) and Quast files (3rd column). To parse only XML and Quast reports, use this format: "biosynML.xml undef quast.tsv".
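
A minimal sketch of how such a file table could be written from the shell. The file names (file_table.tsv, org_a, org_b) are hypothetical; only the three-column order and the literal "undef" placeholder come from the description above.

```shell
# Hypothetical file names; replace them with your own paths.
table=file_table.tsv

# One row per genome: XML report, proteome, Quast file (tab-separated).
printf '%s\t%s\t%s\n' 'org_a/biosynML.xml' 'org_a.faa' 'org_a.tsv' > "$table"

# Row without a proteome: the second column holds the literal string "undef".
printf '%s\t%s\t%s\n' 'org_b/biosynML.xml' 'undef' 'org_b.tsv' >> "$table"

# Sanity check: every row must have exactly three columns.
awk -F'\t' 'NF != 3 { bad = 1 } END { exit bad + 0 }' "$table" && echo "file table OK"
```

Using printf with explicit \t avoids editors that silently convert tabs to spaces.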

--taxdir [=] <dir>

Path to a local mirror of the NCBI Taxonomy database.

--idm[-file] [=] <file>

Path to an id mapper file used to retrieve the assembly accession numbers. The file should be in tabular format, with the accession number in the second column.
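
A minimal sketch of a tab-separated id mapper file. The file name, the content of the first column and the accession values are all assumptions made for illustration; the description above only states that the accession number sits in the second column.

```shell
idm=assemblies.idm   # hypothetical file name

# Column 1: an identifier of your choice (assumption, not documented above).
# Column 2: the assembly accession number (placeholder values).
printf '%s\t%s\n' 'org_a' 'GCF_000000000.1' >  "$idm"
printf '%s\t%s\n' 'org_b' 'GCA_000000000.1' >> "$idm"

# List the accession numbers, i.e. the second column.
cut -f2 "$idm"
```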

--proteomes

Use the organism proteome to predict domains with external pHMMs, for inclusion in the SQL database.

--quast

Create an additional table "Assemblies" with Quast statistics. This option requires the transposed_report.tsv output of Quast, renamed to match the basename of your XML file. For example, if you use my_org.xml, name your Quast file my_org.tsv.
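
The renaming rule can be sketched as follows, using the my_org.xml example from the paragraph above. The empty placeholder file only makes the sketch runnable on its own; in a real run, transposed_report.tsv is already present in the Quast output directory.

```shell
xml=my_org.xml                    # report name from the example above
quast_out=transposed_report.tsv   # file produced by Quast

# Placeholder so the sketch runs on its own; a real run already has this file.
[ -e "$quast_out" ] || touch "$quast_out"

# Strip the .xml extension and reuse the basename for the Quast file.
target="$(basename "$xml" .xml).tsv"
cp "$quast_out" "$target"
echo "$target"
```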

--new-db

Remove the previous SQL tables to start the database over.

--db-name [=] <name>

Name of your database [default: bgc-db]

--gap-filling [=] <bool>

Tries to find domains when gaps are present in clusters.

--undef-cleaning [=] <bool>

Eliminates undef domains from antiSMASH output that can't be recovered.

--undef-recov [=] <bool>

Tries to recover undef domain values from antiSMASH output.

--evalue-threshold [=] <n>

--mode-debug

Keeps the temporary files.

--cpu [=] <n>

Number of threads/cpus to use.

--more
--version
--usage
--help
--man

Print the usual program information.

AUTHOR

Loic MEUNIER <lmeunier@uliege.be>

CONTRIBUTOR

Denis BAURAIN <denis.baurain@uliege.be>

COPYRIGHT AND LICENSE

This software is copyright (c) 2019 by University of Liege / Unit of Eukaryotic Phylogenomics / Loic MEUNIER and Denis BAURAIN.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.