NAME

SmotifTF Template-free Modeling Method - The pre-requisites

SYNOPSIS

SmotifTF carries out template-free structure prediction using a dynamic library of supersecondary structure fragments obtained from a set of remotely related PDB structures.

This Perl script runs all the pre-requisites (HHblits/HHsearch, Psi-Blast, Delta-Blast and Psipred) required for modeling. The input is the query protein sequence in fasta format, and the outputs are:

1. A dynamic library of supersecondary structure motifs, tailor-made for the query protein (obtained using HHblits/HHsearch, Psi-Blast and Delta-Blast).

2. Definitions for putative Smotifs in the query protein (obtained from Psipred).

Once the pre-requisites are completed, modeling can be carried out using the script smotiftf.pl (use "perldoc smotiftf.pl" for instructions).

PRE-REQUISITES

The Smotif-based modeling algorithm requires the query protein sequence as input.

Software / data:

1. Psipred (http://bioinf.cs.ucl.ac.uk/psipred/)

2. HHSuite (ftp://toolkit.genzentrum.lmu.de/pub/HH-suite/)

3. Psiblast and Delta-blast (http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastDocs&DOC_TYPE=Download)

4. Modeller (version 9.14 https://salilab.org/modeller/)

5. Local PDB directory (central or user-designated) - updated (http://www.rcsb.org).

Download and install the above-mentioned software / data according to their instructions.

Note: Psipred and Psiblast require legacy blast, whereas Delta-blast is part of the Blast+ package.

Databases required:

1. PDBAA blast database is required (ftp://ftp.ncbi.nlm.nih.gov/blast/db/).

2. HHsuite databases NR20 and PDB70 are required (ftp://toolkit.genzentrum.lmu.de/pub/HH-suite/databases/hhsuite_dbs/)
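As a quick sanity check after installing the software listed above, a small shell function along the following lines can report which executables are not found on the PATH. This is only a sketch: the function name missing_tools is hypothetical, and the executable names in the usage comment (psipred, hhblits, hhsearch, blastpgp, deltablast) are assumptions about a typical installation, so adjust them to match yours.

```shell
#!/bin/sh
# Hypothetical helper: print each given executable that is NOT on PATH,
# one per line. Returns 1 if anything is missing, 0 otherwise.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Example call (executable names are assumptions, not taken from SmotifTF):
#   missing_tools psipred hhblits hhsearch blastpgp deltablast
```

An empty result means every named executable was found. Tools that are installed outside the PATH will be reported as missing, which is harmless if you refer to them by their full paths.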

SMOTIFTF DOWNLOAD AND INSTALLATION

Download SmotifTF package from CPAN:

http://search.cpan.org/dist/SmotifTF/

Installation of the software (also available in the README file):

 tar -zxvf SmotifTF-version.tar.gz

 cd SmotifTF-version/

 perl Makefile.PL PREFIX=/home/user/MyPerlLib/

 make

 make test

 make install

SETUP CONFIGURATION FILE

The configuration file, smotiftf_config.ini, has all the information regarding the required library files and other pre-requisite software.

Set all the paths and executables in this file correctly.

Set the environment variable in your .bashrc file:

export SMOTIFTF_CONFIG_FILE=/home/user/MyPerlLib/share/perl5/SmotifTF-version/smotiftf_config.ini
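Before running any SmotifTF script, it can help to confirm that the variable is actually set in the current shell and points to a readable file. The following is a minimal sketch; the function name check_smotiftf_config is hypothetical and is not part of the SmotifTF package.

```shell
#!/bin/sh
# Hypothetical helper: verify that SMOTIFTF_CONFIG_FILE is set and readable.
# Prints the path on success; prints a message to stderr and returns 1 otherwise.
check_smotiftf_config() {
    if [ -z "${SMOTIFTF_CONFIG_FILE:-}" ]; then
        echo "SMOTIFTF_CONFIG_FILE is not set" >&2
        return 1
    fi
    if [ ! -r "$SMOTIFTF_CONFIG_FILE" ]; then
        echo "Cannot read config file: $SMOTIFTF_CONFIG_FILE" >&2
        return 1
    fi
    echo "Using config file: $SMOTIFTF_CONFIG_FILE"
}
```

If the check fails in a new terminal, the export line has probably not been picked up yet; opening a new login shell or sourcing .bashrc usually resolves that.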

PRE-REQUISITES STEPS

      -------------------------------------------------------
     | Run Pre-requisites:                                   |
     | Psipred, HHblits+HHsearch, Psiblast, Delta-blast      |
     |                                                       |
     | Single-core job                                       |
     | Usage:                                                |
     |   perl smotiftf_prereq.pl --sequence_file=1zzz.fasta  |
     |         --dir=1zzz --step=all                         |
      -------------------------------------------------------

HOW TO RUN THE PRE-REQUISITES

1. If installed locally, provide the correct path name to the SmotifTF perl library in this perl script (line 14).

2. Create a subdirectory with a dummy pdb file name (e.g. 1abc or 1zzz).

3. Put the query fasta file (1zzz.fasta) in this directory.

4. Run the pre-requisites step first. This runs Psipred, HHblits+HHsearch, Psiblast and Delta-blast. It will then generate the dynamic database of Smotifs and the list of putative Smotifs in the query protein. For more information about the pre-requisites use: perl smotiftf_prereq.pl -help

   Usage: perl smotiftf_prereq.pl --sequence_file=1zzz.fasta --dir=1zzz --step=all

5. Next, run smotiftf.pl according to the instructions given there. For more information use: perl smotiftf.pl -help
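Steps 2 to 4 above can be sketched as a small shell function. This is an illustration under stated assumptions, not part of the package: the function name prepare_query is hypothetical, and the function only prints the smotiftf_prereq.pl command instead of running it, so that the command can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper covering steps 2-4:
#   $1: 4-letter dummy pdb code used as the directory name (e.g. 1zzz)
#   $2: path to the query fasta file
# Creates the directory, copies the fasta file in as <code>.fasta,
# and prints the pre-requisites command to run next.
prepare_query() {
    code=$1
    fasta=$2
    mkdir -p "$code" || return 1
    cp "$fasta" "$code/$code.fasta" || return 1
    # Printed rather than executed; run it yourself once it looks right.
    echo "perl smotiftf_prereq.pl --sequence_file=$code.fasta --dir=$code --step=all"
}
```

For example, prepare_query 1zzz my_query.fasta creates 1zzz/1zzz.fasta and prints the usage line shown in step 4.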

REFERENCE

Vallat BK, Fiser A. Modularity of protein folds as a tool for template-free modeling of sequences. Manuscript under review.

AUTHORS

Brinda Vallat, Carlos Madrid and Andras Fiser.

OPTIONS

-help

Prints a brief help message and exits.

-man

Prints the manual page and exits.

--step

1,2,3,4,5,6,7,8 or all (to run all steps consecutively)

--sequence_file

Give the name of the fasta file.

--dir

Give the 4-letter dummy pdb_code, or any other directory in which the fasta file is present.
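To show how the three options combine, here is a hedged sketch that checks a --step value and prints the matching command line. It assumes that --step takes a single value (one of 1 to 8, or all); the function names valid_step and step_command are hypothetical and do not exist in SmotifTF.

```shell
#!/bin/sh
# Hypothetical helper: succeed only for a single step number 1-8 or "all".
# Assumption: --step takes one value at a time.
valid_step() {
    case $1 in
        all|[1-8]) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the command line for one step.
#   $1: fasta file name, $2: directory, $3: step
step_command() {
    if ! valid_step "$3"; then
        echo "invalid --step value: $3 (expected 1-8 or all)" >&2
        return 1
    fi
    echo "perl smotiftf_prereq.pl --sequence_file=$1 --dir=$2 --step=$3"
}
```

Calling step_command 1zzz.fasta 1zzz 2 prints the command for step 2 only, which is one way to re-run a single step without repeating the others.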

DESCRIPTION

SmotifTF carries out template-free structure prediction: starting from the sequence of a protein, it models the complete structure using the Smotif library.