
NAME

yaml-generator-42.pl - Interactive or batch generator for 42 YAML config files

VERSION

version 0.213470

USAGE

    # wizard: configuration assistant
    ./yaml-generator-42.pl --wizard

    # basic: derive organism names directly from filenames
    ./yaml-generator-42.pl --bank_dir ./banks/ --ref_bank_dir ./ref_banks/ \
    --tax_dir ~/taxdump/ --bank_suffix=.nsq --ref_bank_suffix=.psq \
    --queries queries.idl [optional arguments]

    # use IDM files for banks, orgs and (optionally) tax_filters
    ./yaml-generator-42.pl --bank_dir ./banks/ --ref_bank_dir ./ref_banks/ \
    --tax_dir ~/taxdump/ --bank_suffix=.nsq --ref_bank_suffix=.psq \
    --queries queries.idl \
    --bank_mapper banks/bank_mapper.idm \
    --ref_bank_mapper ref_banks/ref_bank_mapper.idm

    # dynamic choice of tax_filter using NCBI Taxonomy
    ./yaml-generator-42.pl --bank_dir ./banks/ --ref_bank_dir ./ref_banks/ \
    --tax_dir ~/taxdump/ --bank_suffix=.nsq --ref_bank_suffix=.psq \
    --queries queries.idl \
    --bank_mapper banks/bank_mapper.idm \
    --ref_bank_mapper ref_banks/ref_bank_mapper.idm \
    --levels=family

    # interactive choice of tax_filter using NCBI Taxonomy
    ./yaml-generator-42.pl --bank_dir ./banks/ --ref_bank_dir ./ref_banks/ \
    --tax_dir ~/taxdump/ --bank_suffix=.nsq --ref_bank_suffix=.psq \
    --queries queries.idl \
    --bank_mapper banks/bank_mapper.idm \
    --ref_bank_mapper ref_banks/ref_bank_mapper.idm \
    --levels=family --choose_tax_filter

REQUIRED ARGUMENTS

OPTIONAL ARGUMENTS

--wizard

Activate for an interactive, step-by-step configuration.

--outdir [=] <dir>|--out_dir [=] <dir>

Output directory for the generated YAML (and .sh) config files; it is created if needed [default: none]. When unspecified, output files are written to the working directory.

--run_mode=<str>

'phylogenomic' or 'metagenomic'

--out_suffix=<str>
--queries [=] <file>

Organisms to use as queries (one per line = IDL format). Comments are supported.
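As an illustration of the IDL format described above, the following sketch writes a small queries file and counts its entries. The organism names are placeholders, and the use of '#' as the comment marker is an assumption (this page only states that comments are supported).

```shell
# Write a small IDL file: one organism per line.
# ASSUMPTION: '#' starts a comment line; organism names are placeholders.
cat > queries.idl <<'EOF'
# query organisms
Arabidopsis thaliana
Homo sapiens
EOF

# Count the non-comment, non-blank entries as a quick sanity check.
n_queries=$(grep -c -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' queries.idl)
echo "$n_queries"   # prints 2
```

The resulting file can then be passed with --queries queries.idl, as in the USAGE examples.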

--evalue=<n>
--SSUrRNA
--homologues_seg=<str>
--max_target_seqs=<str>
--templates_seg=<str>
--ref_brh=<str>

'on' or 'off'

--ref_bank_dir [=] <dir>

Path to reference bank files directory.

--ref_bank_suffix=<str>
--ref_bank_mapper [=] <file>

TSV file (IDM format) associating each ref_org to its bank [default: none]. When unspecified, the script derives ref_org names from ref_bank filenames. Commenting (or deleting) lines in this file allows the user to reduce the set of ref_orgs.

--ref_org_mul=<n>
--ref_score_mul=<n>
--tol_check=<str>
--tol_db [=] <db>

Path to TreeOfLife database.

--trim_homologues=<str>
--trim_max_shift=<n>
--trim_extra_margin=<n>
--merge_orthologues=<str>
--merge_min_ident=<n>
--merge_min_len=<n>
--aligner_mode=<str>

'off', 'blast', 'exonerate' or 'exoblast'

--ali_skip_self=<str>

'on' or 'off'

--ali_cover_mul=<n>
--ali_keep_old_new_tags=<str>

'on' or 'off'

--bank_dir [=] <dir>

Path to bank files directory.

--bank_suffix=<str>
--bank_mapper [=] <file>

TSV file associating each org to its bank [default: none]. When unspecified, the script derives org names from bank filenames. Commenting (or deleting) lines in this file allows the user to reduce the set of orgs to be processed. Additional columns can be used to specify a tax_filter and a lineage.

--tax_dir [=] <dir>|--taxdir [=] <dir>

Path to taxdump directory.

--levels=<level>...

Taxonomic filter level(s). Several levels are allowed as input; in this case, the first defined level will be returned.

Available levels are: 'superkingdom' 'kingdom' 'phylum' 'subphylum' 'class' 'superorder' 'order' 'suborder' 'infraorder' 'parvorder' 'superfamily' 'family' 'subfamily' 'genus' 'species'
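One plausible reading of "the first defined level will be returned" can be sketched as follows. This is not the script's actual code: the function name first_defined_level, the "rank=taxon" lineage format and the example lineage (which deliberately lacks a family entry) are all hypothetical and serve only to illustrate the selection rule.

```shell
# first_defined_level LINEAGE LEVEL...
# HYPOTHETICAL helper: LINEAGE is a newline-separated list of "rank=taxon"
# pairs; the function prints the taxon of the first requested LEVEL that
# has an entry in the lineage, and returns 1 if none is defined.
first_defined_level() {
    fdl_lineage=$1
    shift
    for fdl_level in "$@"; do
        fdl_taxon=$(printf '%s\n' "$fdl_lineage" \
            | sed -n "s/^${fdl_level}=//p" | head -n 1)
        if [ -n "$fdl_taxon" ]; then
            printf '%s\n' "$fdl_taxon"
            return 0
        fi
    done
    return 1
}

# Placeholder lineage with no 'family' entry.
example_lineage='superkingdom=Eukaryota
phylum=Streptophyta
order=Brassicales
genus=Arabidopsis'

# 'family' is undefined here, so the next requested level is used.
first_defined_level "$example_lineage" family order genus   # prints Brassicales
```

Under this reading, the order in which levels are given matters: the leftmost level takes priority whenever it is defined for a given organism.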

--choose_tax_filter=<n>

Interactively choose taxonomic filter.

    0 => from org mapper file
    1 => from NCBI's taxonomy - auto + prompt for missing
    2 => from NCBI's taxonomy - auto + prompt for all

--tax_reports=<str>

'on' or 'off'

--best_hit

Overrides 'tax_' parameters and auto-sets 'tax_score_mul' to compute LCA in a MEGAN-like mode, i.e., based on bitscore.

--megan_like

Overrides 'tax_' parameters and auto-sets 'tax_score_mul' to compute LCA in a MEGAN-like mode, i.e., based on bitscore.

--tax_max_hits=<n>
--tax_min_hits=<n>
--tax_min_ident=<n>
--tax_min_len=<n>
--tax_min_score=<n>
--tax_score_mul=<n>
--ali_keep_lengthened_seqs=<str>

'keep' or 'remove'

--code=<n>
--version
--usage
--help
--man

Print the usual program information.

AUTHOR

Denis BAURAIN <denis.baurain@uliege.be>

CONTRIBUTOR

Mick VAN VLIERBERGHE <mvanvlierberghe@doct.uliege.be>

COPYRIGHT AND LICENSE

This software is copyright (c) 2013 by University of Liege / Unit of Eukaryotic Phylogenomics / Denis BAURAIN.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.