The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

Name

BioX::Wrapper::Gemini::Examples - Documentation describing the use of the BioX::Wrapper::Gemini module.

For more detailed Gemini usage please refer to the gemini documentation at L:<Gemini Read the Docs|http://gemini.readthedocs.org/en/latest/>.

Full Workflow Example

Create the Bash script

This set of commands creates an (obviously fake) vcf file, compressed vcf file, and the correct directory structure.

    mkdir example
    cd example
    touch test1.vcf
    touch test2.vcf.gz

    echo "#!/bin/bash" > gemini_process.sh
    gemini-wrapper.pl --indir `pwd` --outdir `pwd`/processed >> gemini_process.sh

Directory Structure

Gemini wrapper creates your directory structure for you. If the --outdir option does not exist it will be created for you.

    /example
        gemini_process.sh
        /processed
            /gemini-wrapper
                /gemini_sqlite
                /norm_annot_vcf
        test1.vcf
        test2.vcf.gz

The normalized compressed vcf files will be in process/gemini-wrapper/norm_annot_vcf .

The sqlite databases will be in /processed/gemini-wrapper/gemini_sqlite .

See the process file

    #!/bin/bash

    #######################################################################
    # This file was generated with the following options
    #   --indir /home/user/example
    #   --outdir        /home/user/example/processed
    #######################################################################


    #######################################################################
    # Starting Sample Info Section
    #######################################################################

    # test1, test2

    #######################################################################
    # Ending Sample Info Section
    #######################################################################


    #######################################################################
    # Starting Bgzip Section
    #######################################################################
    # The following samples must be bgzipped before processing can begin
    # /home/user/example/test1.vcf
    #######################################################################

    bgzip /home/user/example/test1.vcf && tabix /home/user/example/test1.vcf.gz

    wait


    #######################################################################
    # Finished Bgzip Section
    #######################################################################

    #######################################################################
    # Normalizing with VT and annotating with SNPEFF the following samples
    # test1, test2
    #######################################################################

    bcftools view /home/user/example/test1.vcf.gz | sed 's/ID=AD,Number=./ID=AD,Number=R/' \
        | vt decompose -s - \
        | vt normalize -r $REFGENOME - \
        | java -Xmx4G -jar $SNPEFF/snpEff.jar -c \$SNPEFF/snpEff.config -formatEff -classic GRCh37.75  \
        | bgzip -c > \
        /home/user/example/processed/gemini-wrapper/norm_annot_vcf/test1.norm.snpeff.gz && tabix /home/user/example/processed/gemini-wrapper/norm_annot_vcf/test1.norm.snpeff.gz


    bcftools view /home/user/example/test2.vcf.gz | sed 's/ID=AD,Number=./ID=AD,Number=R/' \
        | vt decompose -s - \
        | vt normalize -r $REFGENOME - \
        | java -Xmx4G -jar $SNPEFF/snpEff.jar -c \$SNPEFF/snpEff.config -formatEff -classic GRCh37.75  \
        | bgzip -c > \
        /home/user/example/processed/gemini-wrapper/norm_annot_vcf/test2.norm.snpeff.gz && tabix /home/user/example/processed/gemini-wrapper/norm_annot_vcf/test2.norm.snpeff.gz


    wait


    #######################################################################
    # Finished Normalize Annotate Section
    #######################################################################

    #######################################################################
    # Gemini is loading the following samples
    # test1, test2
    #######################################################################

    gemini load -v /home/user/example/processed/gemini-wrapper/norm_annot_vcf/test1.norm.snpeff.gz \
         -t snpEff \
         /home/user/example/processed/gemini-wrapper/gemini_sqlite/test1.vcf.db


    gemini load -v /home/user/example/processed/gemini-wrapper/norm_annot_vcf/test2.norm.snpeff.gz \
         -t snpEff \
         /home/user/example/processed/gemini-wrapper/gemini_sqlite/test2.vcf.db


    wait


    #######################################################################
    # Finished Gemini Load Section
    #######################################################################

Notes on Default Values

BioX::Wrapper::Gemini assumes you have environmental variables set for the SnpEff install ($SNPEFF), reference genome ($REFGENOME).

You can change this with

    gemini-wrapper.pl --ref /path/to/reference.fa --snpeff /path/to/snpeff

If you change the snpeff variable you will probably need to change the snpeff_opt

    gemini-wrapper.pl --snpef_opt "-c /path/to/snpeff/snpEff.config -formatEff -classic GRCh37.75"

The gemini db_load_opts has the default set as " -t snpeff". If you would instead like to load it with "--skip_cadd --cores 3 -t snpeff" look at the following.

    gemini_wrapper.pl --db_load_opts "--skip_cadd --cores 3 -t snpeff"

This can be easily accomplished with environment modules. For more information please refer to the modules documentation. http://modules.sourceforge.net/

For more functional documentation check out the wikipedia page. https://en.wikipedia.org/wiki/Environment_Modules_(software)

For easy install try

    sudo yum install environment-modules

See what the variables are with

env |grep MODULE

You can create a custom module path with

    export MODULEPATH=/path/to/user/modules:$MODULEPATH

SnpEff Module example

    #%Module1.0#######################################################################
    ## SnpEff modulefile
    ##
    proc ModulesHelp { } {

            puts stderr "\tAdds the snpeff binaries to your path.\n module load snpeff \n java -jar $SNPEFF/snpEff.jar -c /path/to/configuration/file/snpEff.config <args>"
    }

    module-whatis   "Adds the snpeff binaries to your path."

    module add java/jdk1.7.0_67

    set ROOT_DIR    /system/apps/software
    setenv  SNPEFF  $ROOT_DIR/snpEff

    #For analysis and denormalizing
    module add samtools/1.2
    prepend-path    PATH    $ROOT_DIR/vt

Upcoming Functionality

Add in a VEP workflow option

AUTHOR

Jillian Rowe <jillian.e.rowe@gmail.com>

ACKNOWLEDGEMENTS

This module was originally developed at and for Weill Cornell Medical College in Qatar. With approval from WCMC-Q, this information was generalized and put on github, for which the authors would like to express their gratitude.