
NAME

SmotifTF Template-free Modeling Method

SYNOPSIS

Please read this document completely before running the SmotifTF software on a local computer.

Pre-requisites:

The Smotif-based modeling algorithm requires the query protein sequence as input. Additionally, if the structure of the protein is known from another source, a PDB-formatted structure file is required. This PDB file can be kept in a centralized local directory or in a separate user-designated directory.

Software / data:

1. Psipred (http://bioinf.cs.ucl.ac.uk/psipred/)

2. HHSuite (ftp://toolkit.genzentrum.lmu.de/pub/HH-suite/)

3. Psiblast and Delta-blast (http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastDocs&DOC_TYPE=Download)

4. Modeller (version 9.14 https://salilab.org/modeller/)

5. Local PDB directory (central or user-designated) - updated (http://www.rcsb.org).

Download and install the above-mentioned software and data according to their instructions.

Note: Psipred and Psiblast require legacy BLAST, whereas Delta-blast is part of the BLAST+ package.
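
Before configuring SmotifTF, it can help to confirm that the prerequisite programs are actually installed. The following is a minimal sketch, not part of SmotifTF: check_tools is a hypothetical helper, and the executable names in the example call (psipred, hhblits, hhsearch, blastpgp, deltablast) are assumptions that may differ between installations. SmotifTF itself reads the tool locations from its configuration file, so a PATH lookup is only a rough first check.

```shell
#!/bin/sh
# check_tools: hypothetical helper, not shipped with SmotifTF.
# Prints one "missing: NAME" line for every executable that cannot be
# found on PATH and returns non-zero if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (executable names are assumptions; adjust to your install):
# check_tools psipred hhblits hhsearch blastpgp deltablast
```

If a tool is installed outside PATH, the check reports it as missing even though SmotifTF may still find it through the configuration file.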

Databases required:

1. PDBAA blast database is required (ftp://ftp.ncbi.nlm.nih.gov/blast/db/).

2. HHsuite databases NR20 and PDB70 are required (ftp://toolkit.genzentrum.lmu.de/pub/HH-suite/databases/hhsuite_dbs/).

SmotifTF Download and Installation:

The following components need to be downloaded and installed to run SmotifTF:

1. The software from CPAN (http://)

Installation of the software (also available in the README file):

        tar -zxvf SmotifTF-0.01.tar.gz

        cd SmotifTF-0.01/

        perl Makefile.PL PREFIX=/home/user/SmotifTF-0.01

        make

        make test

        make install

Set up the configuration file:

The configuration file, smotiftf_config.ini, has all the information regarding the required library files and other pre-requisite software.

Set all the paths and executables in this file correctly.

Set the environment variable in your .bashrc file:

        export SMOTIFTF_CONFIG_FILE=/home/user/SmotifTF-0.01/smotiftf_config.ini
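
A quick way to confirm that the variable is visible in a new shell is sketched below. check_config is a hypothetical helper, not part of SmotifTF; it only tests that SMOTIFTF_CONFIG_FILE is set and points to a readable file, and does not validate the contents of the file.

```shell
#!/bin/sh
# check_config: hypothetical helper, not shipped with SmotifTF.
# Succeeds only if SMOTIFTF_CONFIG_FILE is set and names a readable file.
check_config() {
    if [ -z "${SMOTIFTF_CONFIG_FILE:-}" ]; then
        echo "SMOTIFTF_CONFIG_FILE is not set" >&2
        return 1
    fi
    if [ ! -r "$SMOTIFTF_CONFIG_FILE" ]; then
        echo "cannot read $SMOTIFTF_CONFIG_FILE" >&2
        return 1
    fi
    echo "using config: $SMOTIFTF_CONFIG_FILE"
}
```

Calling check_config after opening a new terminal should print the configured path; an error message suggests that .bashrc was not re-read or that the path is wrong.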

Modeling algorithm steps:

         ----------------------------------------------------
        |Step 0:                                             |
        |       Run Pre-requisites                           |
        |       Psipred, HHblits+HHsearch, Psiblast,         |
        |       Delta-blast                                  |
        |                                                    |
        |       Single-core job                              |
        |       Usage: perl smotiftf_prereq.pl --step=all    |
        |              --pdb=1zzz --chain=A --dir=1zzz       |
         ----------------------------------------------------

         ----------------------------------------------------
        |Step 1:                                             |
        |       Compare Smotifs                              |
        |                                                    |
        |       Multi-core / cluster job                     |
        |       Usage: perl smotiftf.pl --step=1 --pdb=1zzz  |
        |              --chain=A --havestructure=0           |
         ----------------------------------------------------

         ----------------------------------------------------
        |Step 2:                                             |
        |       Rank Smotifs                                 |
        |                                                    |
        |       Multi-core / cluster job                     |
        |       Usage: perl smotiftf.pl --step=2 --pdb=1zzz  |
        |              --chain=A --havestructure=0           |
         ----------------------------------------------------

         ----------------------------------------------------
        |Step 3:                                             |
        |       Enumerate all possible combinations of       |
        |       Smotifs (about a million models)             |
        |                                                    |
        |       Multi-core / cluster job                     |
        |       Usage: perl smotiftf.pl --step=3 --pdb=1zzz  |
        |              --chain=A --havestructure=0           |
         ----------------------------------------------------

         ----------------------------------------------------
        |Step 4:                                             |
        |       Rank enumerated structures using a           |
        |       composite energy function                    |
        |                                                    |
        |       Single-core job                              |
        |       Usage: perl smotiftf.pl --step=4 --pdb=1zzz  |
        |              --chain=A --havestructure=0           |
         ----------------------------------------------------

         ----------------------------------------------------
        |Step 5:                                             |
        |       Run Modeller to generate top 5 complete      |
        |       models                                       |
        |                                                    |
        |       Single-core job                              |
        |       Usage: perl smotiftf.pl --step=5 --pdb=1zzz  |
        |              --chain=A --havestructure=0           |
         ----------------------------------------------------
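
The five smotiftf.pl steps shown above can be chained in a small wrapper that stops as soon as one step fails. This is only a sketch under stated assumptions: run_steps and the SMOTIFTF_CMD variable are hypothetical names introduced here for illustration (SMOTIFTF_CMD lets the loop be dry-run with echo), and the wrapper assumes that smotiftf.pl returns a non-zero exit status on error, which this document does not state.

```shell
#!/bin/sh
# run_steps: hypothetical wrapper, not shipped with SmotifTF.
# Runs modeling steps 1 to 5 in order and stops at the first failure.
# Arguments: PDB_CODE CHAIN HAVESTRUCTURE
# SMOTIFTF_CMD (hypothetical) overrides the command that is launched,
# e.g. SMOTIFTF_CMD=echo prints the command-line arguments instead.
run_steps() {
    pdb=$1
    chain=$2
    have=$3
    for step in 1 2 3 4 5; do
        echo "starting step $step"
        # Left unquoted on purpose so "perl smotiftf.pl" splits into words.
        ${SMOTIFTF_CMD:-perl smotiftf.pl} --step="$step" --pdb="$pdb" \
            --chain="$chain" --havestructure="$have" || {
            echo "step $step failed; stopping" >&2
            return 1
        }
    done
    echo "all steps completed"
}

# Example call for a query without a known structure:
# run_steps 1zzz A 0
```

On a cluster, steps 1 to 3 would normally be submitted through the local job scheduler rather than run in a plain loop like this one.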

How to run the program:

1. Create a subdirectory named after a dummy PDB code (e.g. 1abc or 1zzz).

2. Put the query FASTA file (1zzz.fasta) in this directory.

3. Optional: If the structure is known, include a PDB-format structure file in the same directory, e.g. 1abc/pdb1abc.ent or 1zzz/pdb1zzz.ent.

4. Run the pre-requisites step first. This runs Psipred, HHblits+HHsearch, Psiblast and Delta-blast. It then generates the dynamic database of Smotifs and the list of putative Smotifs in the query protein.

        perl smotiftf_prereq.pl --step=all --pdb=1zzz --chain=A --dir=1zzz

5. Run steps 1 to 5 sequentially, as given above. Output from earlier steps is often required by later steps, so wait for each step to complete without errors before starting the next.

6. To run steps 1-5 together, use:

        perl smotiftf.pl --step=all --pdb=1zzz --chain=A --havestructure=0

7. Use multiple cores or a cluster, as available, for steps 1 & 3. These steps are slow and require a lot of computational resources.

8. If the structure is known, use --havestructure=1 in all the steps; otherwise use --havestructure=0.
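
The directory preparation described in items 1 to 3 of this list can be scripted. The sketch below uses a hypothetical helper, setup_query_dir, which is not part of SmotifTF; it follows the file naming given in this document (CODE/CODE.fasta and, optionally, CODE/pdbCODE.ent) and creates the directory under the current working directory.

```shell
#!/bin/sh
# setup_query_dir: hypothetical helper, not shipped with SmotifTF.
# Creates the working directory for one query and copies the inputs in.
# Arguments: PDB_CODE FASTA_FILE [STRUCTURE_FILE]
setup_query_dir() {
    code=$1
    fasta=$2
    structure=${3:-}
    mkdir -p "$code" || return 1
    # The query sequence is stored as CODE/CODE.fasta.
    cp "$fasta" "$code/$code.fasta" || return 1
    # A known structure, if supplied, is stored as CODE/pdbCODE.ent.
    if [ -n "$structure" ]; then
        cp "$structure" "$code/pdb$code.ent" || return 1
    fi
    echo "prepared $code/"
}

# Example calls (input file names are placeholders):
# setup_query_dir 1zzz my_query.fasta
# setup_query_dir 1abc my_query.fasta known_structure.pdb
```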

Results:

The top 5 models are stored in the query subdirectory (1abc or 1zzz) as Model.1.pdb, Model.2.pdb, Model.3.pdb, Model.4.pdb and Model.5.pdb.
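
To confirm that a run produced all five models, the output directory can be checked with a few lines of shell. list_models is a hypothetical helper, not part of SmotifTF; it only tests that the files exist and says nothing about model quality.

```shell
#!/bin/sh
# list_models: hypothetical helper, not shipped with SmotifTF.
# Prints the model files found in DIR and returns non-zero unless all
# five files (Model.1.pdb to Model.5.pdb) are present.
list_models() {
    dir=$1
    found=0
    for n in 1 2 3 4 5; do
        if [ -f "$dir/Model.$n.pdb" ]; then
            echo "$dir/Model.$n.pdb"
            found=$((found + 1))
        fi
    done
    [ "$found" -eq 5 ]
}

# Example call:
# list_models 1zzz || echo "some models are missing"
```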

Reference:

Vallat BK, Fiser A. Modularity of protein folds as a tool for template-free modeling of sequences. Manuscript under review.

Authors:

Brinda Vallat, Carlos Madrid and Andras Fiser.

OPTIONS

-help

Prints a brief help message and exits.

-man

Prints the manual page and exits.

--step

The step to run: 1, 2, 3, 4, 5 or all.

--pdb

Give 4-letter dummy pdb_code

--chain

Give 1-letter chain_id

--havestructure

0 or 1 depending on whether a structure is known for the protein from alternate sources.
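
The accepted values listed above can be checked before a long job is submitted. The following sketch uses a hypothetical function, validate_args, that is not part of SmotifTF; it encodes only the constraints stated in this section, and whether smotiftf.pl performs the same checks itself is not stated here.

```shell
#!/bin/sh
# validate_args: hypothetical helper, not shipped with SmotifTF.
# Checks the four option values against the rules in this section.
# Arguments: STEP PDB_CODE CHAIN HAVESTRUCTURE
validate_args() {
    case $1 in
        1|2|3|4|5|all) ;;
        *) echo "bad --step: $1" >&2; return 1 ;;
    esac
    case $2 in
        ????) ;;   # exactly four characters
        *) echo "bad --pdb: $2" >&2; return 1 ;;
    esac
    case $3 in
        ?) ;;      # exactly one character
        *) echo "bad --chain: $3" >&2; return 1 ;;
    esac
    case $4 in
        0|1) ;;
        *) echo "bad --havestructure: $4" >&2; return 1 ;;
    esac
}

# Example call:
# validate_args all 1zzz A 0 && echo "arguments look valid"
```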

DESCRIPTION

SmotifTF carries out template-free structure prediction: starting from the sequence of a protein, it models the complete structure using the Smotif library.