NAME

RankEnumeratedStructures

VERSION

Version 0.01

SYNOPSIS

This module ranks all the enumerated structures using a composite energy function with four components: (1) radius of gyration, (2) solvation potential, (3) hydrogen bond potential, and (4) statistical potential.

    use RankEnumeratedStructures;

    rank_structures ($pdbcode,$stericlimit,@indices);

EXPORT

rank_structures, pre_rank_structures, get_energy

pre_rank_structures

Subroutine to prepare for rank_structures.

rank_structures

This subroutine ranks the structures generated by the full enumeration of the candidate smotif combinations. The ranking takes place in two stages: the full set is ranked using a 'coarse' scoring function, and the top 1000 structures are then re-ranked using a 'refined' scoring function. Both functions use 4 scoring component values: radius of gyration, statistical pairwise contact potential, implicit solvation potential, and long-range H-bond potential.
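
The two-stage scheme described above can be sketched as follows. This is only an illustration under stated assumptions: two_stage_rank, the toy data, and both scoring closures are hypothetical, and the sketch assumes that a lower score is better. The module's actual coarse and refined functions and their weights are not documented here.

```perl
use strict;
use warnings;

# Hypothetical helper: rank all candidates with a cheap scorer, then re-rank
# only the best $top_n with a second scorer. Assumes lower score = better.
sub two_stage_rank {
    my ($structures, $coarse, $refined, $top_n) = @_;

    # Stage 1: score every structure with the coarse function, sort ascending.
    my @coarse_ranked =
        sort { $a->{score} <=> $b->{score} }
        map  { +{ item => $_, score => $coarse->($_) } } @$structures;

    # Keep at most $top_n survivors for the refined pass.
    my $last = $#coarse_ranked < $top_n - 1 ? $#coarse_ranked : $top_n - 1;
    my @survivors = map { $_->{item} } @coarse_ranked[0 .. $last];

    # Stage 2: re-score the survivors with the refined function.
    my @refined_ranked =
        sort { $a->{score} <=> $b->{score} }
        map  { +{ item => $_, score => $refined->($_) } } @survivors;

    return (\@coarse_ranked, \@refined_ranked);
}

# Toy data: each structure is an array ref of four component z-scores.
my @structures = (
    [1.7, 1.0, 0.9, 1.2],
    [0.2, 0.1, 0.3, 0.4],
    [2.5, 2.0, 1.9, 2.2],
);

# Placeholder scoring functions, NOT the module's actual functions.
my $coarse  = sub { my ($s) = @_; return $s->[0] + $s->[1] };
my $refined = sub { my ($s) = @_; my $t = 0; $t += $_ for @$s; return $t };

my ($coarse_ranked, $refined_ranked) =
    two_stage_rank(\@structures, $coarse, $refined, 2);

printf "%d coarse-ranked, %d re-ranked\n",
    scalar @$coarse_ranked, scalar @$refined_ranked;
```

In rank_structures the cutoff for the refined pass is 1000 structures; the sketch uses 2 only to keep the toy data small.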

INPUT ARGUMENTS

1) $pdbcode - the 4-character name of the folder in which input and output data are stored.

2) $sterlimit - the number of allowable steric clashes (written as $stericlimit in the SYNOPSIS). These clashes are calculated during the enumeration and are part of the input file; they are not calculated directly by this script.

3) @st - a list of 4 numbers corresponding to the indices of the scoring function components in the tab-delimited output file from the full enumeration (written as @indices in the SYNOPSIS). Index numbering starts from 0 (not 1). See "INPUT FILES" for further details.
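
As an illustration of how the steric limit might be applied, the sketch below keeps only enumeration lines whose final column (the clash count) does not exceed the limit. The helper within_steric_limit is hypothetical, and the use of "less than or equal to" is an assumption; the module may apply the limit differently.

```perl
use strict;
use warnings;

# Hypothetical filter: the steric clash count is the last column of an
# enumeration line (see INPUT FILES). Returns true if the line passes.
# Assumption: a line passes when its clash count is <= $sterlimit.
sub within_steric_limit {
    my ($line, $sterlimit) = @_;
    my @fields = split ' ', $line;    # splits on tabs or spaces
    return $fields[-1] <= $sterlimit;
}

my @lines = (
    "1.437\t0.740\t1.7483\t0.9973\t0.9616\t1.2306\t0",    # shortened toy lines
    "2.100\t1.250\t2.0100\t1.5000\t1.1000\t1.9000\t3",
);

my @kept = grep { within_steric_limit($_, 1) } @lines;
printf "%d of %d lines kept\n", scalar @kept, scalar @lines;
```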

REQUIRED FILES (all to be found in the <pdbcode> directory)

<pdbcode>.out - file containing a list of start and end points of smotifs in the query protein, as well as secondary structure and loop lengths. This is one of the standard output files of the generate_shift_files.pl script.

<pdbcode>_motifs_best.csv - file containing a list of candidates for each putative smotif. This is one of the standard output files of the findranks.pl script.

INPUT FILES

In the <pdbcode> directory, a set of files containing the results of the full enumeration. These are the standard output files from the all_enum.pl script, and have the following format:

Sample line for a structure with 4 smotifs:

    1.437 0.740 1.867 8.377 224162 148918 54194 127698 1.7483 0.9973 0.9616 1.2306 8.8294 58.8240 12 0 0 0 0

Explanation:

    1.437 0.740 1.867 8.377      : RMSDs of the 4 smotif components individually
    224162 148918 54194 127698   : Nids of the 4 smotif components
    1.7483                       : Per-residue radius of gyration z-score
    0.9973                       : Per-residue pairwise contact potential z-score
    0.9616                       : Per-residue solvation potential z-score
    1.2306                       : Long-range H-bond potential z-score
    8.8294                       : Overall structure RMSD (from solved structure)
    58.8240                      : Overall structure GDT_TS score
    12 0 0 0                     : List of indices of smotifs, as found in the <pdbcode>_motifs_best.csv file
    0                            : Number of steric clashes
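
The field layout above can be turned into a small parser. This is a sketch only: parse_enum_line and its hash keys are hypothetical names chosen for illustration, and the expected field count of 3*n + 7 is inferred from the sample line (n RMSDs, n Nids, 4 z-scores, overall RMSD, GDT_TS, n smotif indices, and the clash count).

```perl
use strict;
use warnings;

# Hypothetical parser for one enumeration line describing $n smotifs.
sub parse_enum_line {
    my ($line, $n) = @_;
    my @f = split ' ', $line;    # splits on tabs or spaces
    my $expected = 3 * $n + 7;
    die "expected $expected fields, got " . scalar(@f) . "\n"
        unless @f == $expected;

    return {
        smotif_rmsds  => [ @f[0 .. $n - 1] ],
        nids          => [ @f[$n .. 2 * $n - 1] ],
        rg_z          => $f[2 * $n],
        contact_z     => $f[2 * $n + 1],
        solvation_z   => $f[2 * $n + 2],
        hbond_z       => $f[2 * $n + 3],
        overall_rmsd  => $f[2 * $n + 4],
        gdt_ts        => $f[2 * $n + 5],
        motif_indices => [ @f[2 * $n + 6 .. 3 * $n + 5] ],
        clashes       => $f[3 * $n + 6],
    };
}

my $sample = join "\t", qw(
    1.437 0.740 1.867 8.377
    224162 148918 54194 127698
    1.7483 0.9973 0.9616 1.2306
    8.8294 58.8240
    12 0 0 0
    0
);

my $rec = parse_enum_line($sample, 4);
print "H-bond z-score: $rec->{hbond_z}, clashes: $rec->{clashes}\n";
```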

In this case, the indices for the scoring function components are 8, 9, 10, and 11. In general, the indices run from 2*n through 2*n+3 inclusive, where n is the number of smotifs.
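
The index rule can be written as a one-line helper. The name score_indices is hypothetical and not part of the module; the arithmetic follows directly from the rule above.

```perl
use strict;
use warnings;

# Hypothetical helper: 0-based column indices of the four scoring
# components for a structure built from $n smotifs (2*n .. 2*n+3).
sub score_indices {
    my ($n) = @_;
    return map { 2 * $n + $_ } 0 .. 3;
}

my @indices = score_indices(4);
print "@indices\n";    # prints "8 9 10 11"
```

The resulting list is what would be passed as the third argument of rank_structures.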

OUTPUT FILES

In the <pdbcode> directory:

1) <pdbcode>_ranked_coarse.csv : Top 5000 structures as ranked by the coarse scoring function. The format of each line is the same as in the enumeration output files (see INPUT FILES, above), with an additional final entry representing the scoring function output for each line.

2) <pdbcode>_ranked_refined.csv : Same as 1), but for the top structures re-ranked using the refined scoring function.
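
A minimal sketch of the output line format, assuming tab-delimited lines as in the enumeration files. The helper format_ranked is hypothetical, and the 4-decimal formatting of the score is an assumption; the module's actual number format is not documented here.

```perl
use strict;
use warnings;

# Hypothetical formatter: takes [ $line, $score ] pairs that are already
# sorted best-first, keeps at most $limit of them, and appends the score
# as an additional final tab-separated entry.
sub format_ranked {
    my ($ranked, $limit) = @_;
    my @out;
    for my $pair (@$ranked) {
        last if @out >= $limit;
        my ($line, $score) = @$pair;
        chomp $line;
        push @out, join("\t", $line, sprintf('%.4f', $score));
    }
    return @out;
}

my @ranked = (
    [ "0.2\t0.1\t0.3\t0.4\t0\n", 1.0 ],    # shortened toy lines
    [ "1.7\t1.0\t0.9\t1.2\t0\n", 4.8 ],
    [ "2.5\t2.0\t1.9\t2.2\t1\n", 8.6 ],
);

my @formatted = format_ranked(\@ranked, 2);
print "$_\n" for @formatted;
```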

rank_energies

Subroutine to calculate energy and rank structures, given a list of energy function component scores and weights.
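
One plausible reading of this description is a weighted linear sum of the component scores followed by an ascending sort. The sketch below illustrates that reading only; weighted_energy and rank_by_energy are hypothetical names, the linear form and the lower-is-better ordering are assumptions, and the equal weights are placeholders rather than the module's values.

```perl
use strict;
use warnings;

# Hypothetical: energy as a weighted sum of component scores.
sub weighted_energy {
    my ($components, $weights) = @_;
    die "component/weight count mismatch\n"
        unless @$components == @$weights;
    my $energy = 0;
    $energy += $components->[$_] * $weights->[$_] for 0 .. $#$components;
    return $energy;
}

# Hypothetical: pair each row with its energy and sort ascending
# (assumes lower energy = better rank).
sub rank_by_energy {
    my ($rows, $weights) = @_;
    my @scored = map { [ $_, weighted_energy($_, $weights) ] } @$rows;
    my @sorted = sort { $a->[1] <=> $b->[1] } @scored;
    return @sorted;
}

my @rows    = ([1, 2, 3, 4], [0, 0, 1, 0], [2, 2, 2, 2]);
my @weights = (1, 1, 1, 1);    # placeholder weights

my @ranked_rows = rank_by_energy(\@rows, \@weights);
printf "%d\n", $_->[1] for @ranked_rows;
```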

AUTHORS

Fiserlab Members, <andras at fiserlab.org>

BUGS

Please report any bugs or feature requests to bug-. at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=..

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc RankEnumeratedStructures

LICENSE AND COPYRIGHT

Copyright 2015 Fiserlab Members.

This program is free software; you can redistribute it and/or modify it under the terms of the Artistic License (2.0). You may obtain a copy of the full license at:

http://www.perlfoundation.org/artistic_license_2_0

Any use, modification, and distribution of the Standard or Modified Versions is governed by this Artistic License. By using, modifying or distributing the Package, you accept this license. Do not use, modify, or distribute the Package, if you do not accept this license.

If your Modified Version has been derived from a Modified Version made by someone other than you, you are nevertheless required to ensure that your Modified Version complies with the requirements of this license.

This license does not grant you the right to use any trademark, service mark, tradename, or logo of the Copyright Holder.

This license includes the non-exclusive, worldwide, free-of-charge patent license to make, have made, use, offer to sell, sell, import and otherwise transfer the Package with respect to any patent claims licensable by the Copyright Holder that are necessarily infringed by the Package. If you institute patent litigation (including a cross-claim or counterclaim) against any party alleging that the Package constitutes direct or contributory patent infringement, then this Artistic License to you shall terminate on the date that such litigation is filed.

Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.