NAME - Exports SQL tables of BGC results (antiSMASH or Palantir version)


version 0.191620

NAME - This tool creates an SQL database from antiSMASH results and allows queries to be run on it.


This documentation refers to version 0.0.1


    $0 [options] --paths <biosynml_path> --taxdir <dir>         



--infiles [=] <paths>...

Paths to biosynML.xml or regions.js files. Each path must include at least the directory containing the file. This option can take multiple values.

--file-table [=] <tsv_file>

TSV (tab-separated values) file that unambiguously gives the paths of the XML reports, proteomes and Quast files. Column order: XML reports (1st column), proteomes (2nd column) and Quast files (3rd column). If you only want to parse XML and Quast reports, use this format: "biosynML.xml undef quast.tsv".
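
As an illustration, the following minimal Python sketch writes such a file table. The helper name write_file_table and the example paths are hypothetical; only the column order and the "undef" placeholder for a missing proteome come from the description above. Whether "undef" is also accepted in other columns is not stated here, so the sketch uses it only for the proteome column.

```python
import csv
import io


def write_file_table(rows, handle):
    """Write (xml_report, proteome, quast) rows as tab-separated lines.

    A missing entry (None) is written as the literal string "undef".
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for row in rows:
        writer.writerow(["undef" if path is None else path for path in row])


# Hypothetical paths, for illustration only.
rows = [
    ("org1/biosynML.xml", "org1/proteome.faa", "org1/quast.tsv"),
    ("org2/biosynML.xml", None, "org2/quast.tsv"),  # no proteome available
]

buffer = io.StringIO()
write_file_table(rows, buffer)
file_table_text = buffer.getvalue()
```

In practice the handle would be a file opened for writing, and the resulting file would be passed to --file-table.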

--taxdir [=] <dir>

Path to a local mirror of the NCBI Taxonomy database.

--idm[-file] [=] <file>

Path to an id mapper file used to retrieve the assembly accession numbers. The file should be in tabular format with the accession number in the second column.
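
To make the expected layout concrete, here is a minimal Python sketch that reads a two-column id mapper. It assumes that "tabular" means tab-separated and that the first column holds the identifier being mapped; neither point is stated above. The function name read_id_mapper and the example identifiers and accessions are made up for illustration.

```python
def read_id_mapper(lines):
    """Map the first column of each row to the accession in the second column."""
    mapping = {}
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f"line {number}: expected at least 2 columns")
        mapping[fields[0]] = fields[1]
    return mapping


# Made-up rows: identifier in column 1, accession number in column 2.
example_lines = [
    "my_org\tGCF_000000000.1\n",
    "other_org\tGCA_000000000.2\n",
]
id_map = read_id_mapper(example_lines)
```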


Use the organism proteome to predict domains with external pHMMs and include them in the SQL database.


Create an additional table "Assemblies" with Quast statistics. For this option, you need to use the transposed_report.tsv output of Quast and name it with the basename of your XML file. For example, if you use my_org.xml, name your Quast file my_org.tsv.
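
The naming rule can be sketched in Python as follows. The function name expected_quast_name and the directory in the example are hypothetical; the sketch only derives the file name the Quast report should carry and does not touch any files.

```python
from pathlib import PurePosixPath


def expected_quast_name(xml_path):
    """Return the Quast file name matching the basename of the XML report."""
    return PurePosixPath(xml_path).with_suffix(".tsv").name


# For the report my_org.xml, transposed_report.tsv should be renamed to my_org.tsv.
quast_name = expected_quast_name("reports/my_org.xml")
```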


Remove the existing SQL tables to start the database over.

--db-name [=] <name>

Name of your database [default: bgc-db]

--gap-filling [=] <bool>

Tries to find domains when gaps are present in clusters.

--undef-cleaning [=] <bool>

Removes undef domains from the antiSMASH output that cannot be recovered.

--undef-recov [=] <bool>

Tries to recover undef domain values from the antiSMASH output.

--evalue-threshold [=] <n>

Keep the temporary files.

--cpu [=] <n>

Number of threads/cpus to use.


Print the usual program information.




