
NAME

Bio::Tools::Blast::Run::Webblast.pm - Bioperl module for running Blast analyses using an HTTP interface.

SYNOPSIS

    # Run a Blast
    use Bio::Tools::Blast::Run::Webblast qw(&blast_remote);

    @out_file_names = &blast_remote($object, %named_parameters);

blast_remote is the only exported method of this module and it returns a list of local file names containing the Blast reports. $object is a reference to a Bio::Root::Object.pm object or subclass. See blast_remote() for a description of available parameters.

    # Obtain a list of available databases

    use Bio::Tools::Blast::Run::Webblast qw(@Blast_dbp_remote
                                            @Blast_dbn_remote);

    @amino_dbs      = @Blast_dbp_remote;
    @nucleotide_dbs = @Blast_dbn_remote;

INSTALLATION

This module is included with the central Bioperl distribution:

   http://bio.perl.org/Core/Latest
   ftp://bio.perl.org/pub/DIST

Follow the installation instructions included in the README file.

DESCRIPTION

Bio::Tools::Blast::Run::Webblast.pm contains methods and data necessary for running Blast sequence analyses using a remote server and saving the results locally.

Bio::Tools::Blast::run() provides an interface for Webblast.pm, so, ideally, you shouldn't use Webblast.pm directly, but via Blast.pm.

FEATURES:

  • Supports NCBI Blast1, Blast2, and PSI-Blast servers as well as WashU-Blast servers.

  • Can operate through a proxy server enabling operation from behind a firewall.

  • Can save reports with and without HTML formatting.

  • Uses LWP.

In principle, this module can be customized to use other servers that provide a Blast interface similar to the NCBI or WashU style servers. Such servers could be remote or local. This has not been well tested, however.

DEPENDENCIES

Bio::Tools::Blast::Run::Webblast.pm is used by Bio::Tools::Blast.pm. Development of this module is thus linked to that of the Blast.pm module.

SEE ALSO

 Bio::Tools::Blast.pm                    - Blast object.
 Bio::Tools::Blast::Run::LocalBlast.pm   - Utility module for running Blasts locally.
 Bio::Tools::Blast::HTML.pm              - Blast HTML-formatting utility class.
 Bio::Seq.pm                             - Biosequence object.
 Bio::Root::Object.pm                    - Bioperl base object class.


 http://bio.perl.org/Projects/modules.html  - Online module documentation
 http://bio.perl.org/Projects/Blast/        - Bioperl Blast Project     
 http://bio.perl.org/                       - Bioperl Project Homepage

FEEDBACK

Mailing Lists

User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to one of the Bioperl mailing lists. Your participation is much appreciated.

    vsns-bcd-perl@lists.uni-bielefeld.de          - General discussion
    vsns-bcd-perl-guts@lists.uni-bielefeld.de     - Technically-oriented discussion
    http://bio.perl.org/MailList.html             - About the mailing lists

Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via email or the web:

    bioperl-bugs@bio.perl.org                   
    http://bio.perl.org/bioperl-bugs/           

AUTHOR

Steve A. Chervitz <sac@genome.stanford.edu>
    - Webblast.pm modularized version of webblast script.
Alex Dong Li <ali@genet.sickkids.on.ca>
    - original webblast script.
Ross N. Crowhurst <RCrowhurst@hort.cri.nz>
    - modified Webblast.pm to use LWP to give proxy server support.

VERSION

Bio::Tools::Blast::Run::Webblast.pm, 1.22

COPYRIGHT

Copyright (c) 1998, 1999 Steve A. Chervitz, Alex Dong Li, Ross N. Crowhurst. All Rights Reserved. This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

APPENDIX

Methods whose names begin with an underscore are considered private and are intended for internal use by this module. They are not part of the public interface and are described here for documentation purposes only.

blast_remote

 Usage     : @files = blast_remote( $blast_object,  %namedParameters);
           : This method is exported.
 Purpose   : Run a remote Blast analysis on one or more sequences.
           : NOTE: The name of this method is potentially misleading
           :       since a local server could be specified.
           :       Probably should be called blast_http.
 Returns   : Array containing a list of filenames of the Blast reports.
 Argument  : First argument should be a Bio::Tools::Blast.pm object reference.
           : This object is primarily used for error reporting
           : Remaining arguments are named parameters: 
           : (PARAMETER TAGS CAN BE UPPER OR LOWER CASE).
           :
           :   -ALIGN      => integer, number of alignments (B, 100)
           :   -ALIGN_VIEW => alignment view option (see below)
           :   -CUTOFF     => Blast score cutoff (60-110 or 'default')
           :   -DATABASE   => name of database (see below)
           :   -DESCR      => integer, number of on-line descriptions (V, 100)
           :   -EXPECT     => expect value cutoff
           :   -EXPECT_PSI => expect value for inclusion in PSI-BLAST iteration 1 
           :   -FILTER     => sequence complexity filter ('default' or 'none')
           :   -GAP        => 'on' or 'off'
           :   -GAP_CREATE => gap creation penalty (G, 5) 
           :   -GAP_EXTEND => gap extension penalty (E, 2)
           :   -GEN_CODE   => integer for special genetic code (see below) blastx only
           :   -GRAPH      => 'on' or 'off' (graphical overview not yet supported)
           :   -HISTOGRAM  => 'on' or 'off' or 'both'
           :   -HTML       => 'on' or 'off' or 'both'
           :   -INPUT_TYPE => 'Sequence in FASTA format' or 'Accession or GI'
           :   -MATRIX     => substitution scoring matrix (blast1 only for NCBI server)
           :   -NCBI_GI    => 'on' or 'off'
           :   -MATCH      => match reward (r, 1)       (blastn only)
           :   -MAX_LEN    => max query sequence length to blast
           :   -MIN_LEN    => min query sequence length to blast
           :   -MISMATCH   => mismatch penalty (q, -3)  (blastn only)
           :   -ORGANISM   => organism name to limit Blast2 search.
           :   -ORGANISM_CUSTOM  => custom organism or taxon name.
           :   -OUT_DIR    => output directory to store blast result files
           :   -PROG       => name of blast program (blastp, blastx, etc.)
           :   -SEQS       => ref to an array of Bio::Seq.pm objects. 
           :   -SERVER     => blast server to use (default is NCBI Blast2)
           :   -STRAND     => Default = 'Both' (not used by NCBI servers)
           :   -VERSION    => blast version (1, 2, PSI, WashU)
           :   -WORD       => word size (W, 11 for blastn, 3 for all others)
           :
           : Note (rnc): valid LIST_ORG entries for Blast2 are strings of
           :   at most 50 characters; the default is the empty string.
           :
 Throws    : Exception if:
           :   - Cannot obtain parameters by calling _rearrange() on the
           :     first argument, which should be a Bio::Tools::Blast.pm object ref.
           :   - No sequences are provided.
           :   - Sequence type is incompatible with Blast program type.
           :   - Database name is not one of the valid names.
           :   - Supplied e-mail address looks invalid.
 Comments  :
  -------------------------------------------------------------
  Available programs: blastn, blastx, dbest, blastp, tblastn, tblastx
  Program versions: 1, 2, PSI, WashU (or WU)

  -------------------------------------------------------------
  Available databases:
        nr, month, swissprot, dbest, dbsts, 
        est_mouse, est_human, est_others, pdb, vector, kabat,
        mito, alu, epd, yeast, ecoli, gss, htgs.
 
    These are exported by this module in the @Blast_dbp_remote
    and @Blast_dbn_remote arrays.
  -------------------------------------------------------------
  Available Genetic Codes are (blastx only): 

        (1) Standard                    (2) Vertebrate Mitochondrial
        (3) Yeast Mitochondrial         (4) Mold Mitochondrial; ... 
        (5) Invertebrate Mitochondrial  (6) Ciliate Nuclear; ...
        (9) Echinoderm Mitochondrial    (10) Euplotid Nuclear
        (11) Bacterial                  (12) Alternative Yeast Nuclear
        (13) Ascidian Mitochondrial     (14) Flatworm Mitochondrial
        (15) Blepharisma Macronuclear
 
  -------------------------------------------------------------
  Available values for organism (Blast2):

      (None)   (DEFAULT; note that the parentheses are required.)
      Arabidopsis thaliana 
      Bacillus subtilis 
      Bos taurus 
      Caenorhabditis elegans 
      Danio rerio 
      Dictyostelium discoideum 
      Drosophila melanogaster 
      Escherichia coli 
      Gallus gallus 
      Homo sapiens 
      Human immunodeficiency virus type 1 
      Mus musculus 
      Oryctolagus cuniculus 
      Oryza sativa 
      Ovis aries 
      Plasmodium falciparum 
      Rattus norvegicus 
      Saccharomyces cerevisiae 
      Schizosaccharomyces pombe 
      Simian immunodeficiency virus 
      Xenopus laevis 
      Zea mays 

  -------------------------------------------------------------
  Available values for align_view (Blast2):

       0             Pairwise  (DEFAULT)
       1             master-slave with identities
       2             master-slave without identities
       3             flat master-slave with identities
       4             flat master-slave without identities
 -------------------------------------------------------------
  Available substitution scoring matrices (NCBI):
      
      BLAST2 matrices: BLOSUM80, BLOSUM62, BLOSUM45, PAM30, PAM70

      BLAST1 matrices: BLOSUM62, PAM40, PAM120, PAM250, IDENTITY.
      
      Other members of the BLOSUM and PAM family of matrices
      may be available as well.
      These are exported by this module in the @Blast_matrix_remote array.

      Note that certain combinations of matrices and gap creation/extension
      penalties are disallowed (E.g., PAM250 will work with 12/2 but not 11/1).
 --------------------------------------------------------------
   Limited values for gap creation and extension are supported for 
   blastp, blastx, tblastn.  Some supported and suggested values are:

  Creation     Extension

     10             1
     10             2
     11             1
      8             2
      9             2
  -------------------------------------------------------------
  Available sequence complexity filters:
       SEG, SEG+XNU, XNU, dust, none.

See Also : _set_options(), _adjust_options(), _validate_options(), _blast(), Bio::Tools::Blast.pm
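
The parameter rules above (case-insensitive tags, a fixed set of database names, a limited set of gap penalties) can be illustrated with a small stand-alone sketch. It does not use or reproduce this module's private _validate_options() method. The helper names normalize_tags() and check_options() are hypothetical, and the lookup tables are simply copied from the lists documented above.

```perl
use strict;
use warnings;

# Lookup tables copied from the lists documented above (not from the module).
my %VALID_PROG = map { $_ => 1 } qw(blastn blastp blastx tblastn tblastx);
my %VALID_DB   = map { $_ => 1 } qw(
    nr month swissprot dbest dbsts est_mouse est_human est_others
    pdb vector kabat mito alu epd yeast ecoli gss htgs
);
# Suggested gap creation/extension pairs for blastp, blastx and tblastn.
my %VALID_GAP = map { $_ => 1 } ('10/1', '10/2', '11/1', '8/2', '9/2');

# Hypothetical helper: fold tags such as -Database or -DATABASE to 'database'.
sub normalize_tags {
    my (%args) = @_;
    my %norm;
    for my $tag (keys %args) {
        (my $key = lc $tag) =~ s/^-//;
        $norm{$key} = $args{$tag};
    }
    return %norm;
}

# Hypothetical helper: returns a list of problems (empty if none were found).
sub check_options {
    my (%opt) = normalize_tags(@_);
    my @errors;
    push @errors, "unknown program '$opt{prog}'"
        if defined $opt{prog} && !$VALID_PROG{ lc $opt{prog} };
    push @errors, "unknown database '$opt{database}'"
        if defined $opt{database} && !$VALID_DB{ lc $opt{database} };
    if (defined $opt{gap_create} && defined $opt{gap_extend}) {
        my $pair = "$opt{gap_create}/$opt{gap_extend}";
        push @errors, "gap penalties $pair are not in the suggested list"
            unless $VALID_GAP{$pair};
    }
    return @errors;
}

# Tags are given in mixed case on purpose.
my @problems = check_options(
    -PROG       => 'blastp',
    -database   => 'swissprot',
    -Gap_Create => 3,
    -GAP_EXTEND => 3,
);
print "$_\n" for @problems;
```

Because the documented gap values are only "some supported and suggested values", a check like this is best treated as a warning rather than a hard failure.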

APPENDIX 2: Parameter listings

Parameters for Blast (NCBI ungapped; no longer supported by NCBI, so use of ungapped Blast should be discontinued), Blast2 (NCBI), and PSI-Blast2 (NCBI). WashU-Blast2 has yet to be added, as does PHI-Blast2 (NCBI).

These lists of parameters for posting to blast servers were obtained directly from the respective WWW forms for each server.
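
As a rough illustration of how form fields such as those listed below end up in an HTTP POST, the following sketch encodes a few name/value pairs as an application/x-www-form-urlencoded body using core Perl only. It is not the code this module uses (the module relies on LWP). The helpers form_escape() and form_body() and the query sequence are hypothetical, and the escaping assumes ASCII input.

```perl
use strict;
use warnings;

# Percent-encode one value for an application/x-www-form-urlencoded body.
# Assumes ASCII input; spaces become '+'.
sub form_escape {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord $1)/ge;
    $value =~ tr/ /+/;
    return $value;
}

# Build the POST body from an ordered list of name/value pairs.
sub form_body {
    my @pairs = @_;
    my @fields;
    while (my ($name, $value) = splice(@pairs, 0, 2)) {
        push @fields, form_escape($name) . '=' . form_escape($value);
    }
    return join '&', @fields;
}

# Field names are taken from the ungapped BLAST list; the sequence is made up.
my $body = form_body(
    PROGRAM    => 'blastp',
    DATALIB    => 'swissprot',
    INPUT_TYPE => 'Sequence in FASTA format',
    SEQUENCE   => ">query1\nMKTAYIAKQR",
    EXPECT     => '10',
);
print "$body\n";
```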

Basic ungapped BLAST Search Server Parameters

PROGRAM [default value]:blastn blastp tblastn tblastx blastx

DATALIB [default value]:nr month swissprot dbest dbsts pdb vector kabat mito alu epd yeast gss htgs ecoli

INPUT_TYPE [default value]:Sequence in FASTA format Accession or GI

SEQUENCE

EXPECT [default value]:default 0.0001 0.01 1 10 100 1000

CUTOFF [default value]:default 60 70 80 90 100 110

MATRIX [default value]:default BLOSUM62 PAM40 PAM120 PAM250 IDENTITY

STRAND [default value]:both top bottom

FILTER [default value]:default none dust SEG SEG+XNU XNU

HISTOGRAM [default value]:'' HISTOGRAM

NCBI_GI [default value]:"" NCBI_GI

DESCRIPTIONS [default value]:default 0 10 50 100 250 500

ALIGNMENTS [default value]:default 0 10 50 100 250 500

ADVANCED [default value]:""

EMAIL [default value]:'' IS_SET

PATH [default value]:""

HTML [default value]:'' HTML

Basic Blast 2

PROGRAM [default value]:blastn blastp blastx tblastn tblastx

DATALIB [default value]:nr month swissprot dbest dbsts est_mouse est_human est_others pdb pat vector kabat mito alu epd yeast ecoli gss htgs

UNGAPPED_ALIGNMENT [default value]:'' is_set

FSET [default value]:is_set ''

OVERVIEW [default value]:is_set ''

INPUT_TYPE [default value]:Sequence in FASTA format Accession or GI

SEQUENCE

EMAIL [default value]:'' IS_SET

PATH [default value]:""

HTML [default value]:'' IS_SET

BLAST2 ADVANCED

PROGRAM [default value]:blastn blastp blastx tblastn tblastx

DATALIB [default value]:nr month swissprot dbest dbsts est_mouse est_human est_others pdb pat vector kabat mito alu epd yeast ecoli gss htgs

UNGAPPED_ALIGNMENT [default value]:"" is_set

INPUT_TYPE [default value]:Sequence in FASTA format Accession or GI

SEQUENCE

GI_LIST [default value]:(None) Arabidopsis thaliana Bacillus subtilis Bos taurus Caenorhabditis elegans Danio rerio Dictyostelium discoideum Drosophila melanogaster Escherichia coli Gallus gallus Homo sapiens Human immunodeficiency virus type 1 Mus musculus Oryctolagus cuniculus Oryza sativa Ovis aries Plasmodium falciparum Rattus norvegicus Saccharomyces cerevisiae Schizosaccharomyces pombe Simian immunodeficiency virus Xenopus laevis Zea mays

LIST_ORG

EXPECT [default value]:10 0.0001 0.01 1 10 100 1000

FILTER [default value]:default none

NCBI_GI [default value]:'' is_set

OVERVIEW [default value]:is_set ''

DESCRIPTIONS [default value]:500 0 10 50 100 250 500

ALIGNMENTS [default value]:500 0 10 50 100 250 500

ALIGNMENT_VIEW [default value]:0 #Pairwise 1 #master-slave with identities 2 #master-slave without identities 3 #flat master-slave with identities 4 #flat master-slave without identities

GENETIC_CODE [default value]:Standard (1) Vertebrate Mitochondrial (2) Yeast Mitochondrial (3) Mold Mitochondrial; ... (4) Invertebrate Mitochondrial (5) Ciliate Nuclear; ... (6) Echinoderm Mitochondrial (9) Euplotid Nuclear (10) Bacterial (11) Alternative Yeast Nuclear (12) Ascidian Mitochondrial (13) Flatworm Mitochondrial (14) Blepharisma Macronuclear (15)

MAT_PARAM [default value]:BLOSUM62 11 1 PAM30 9 1 PAM70 10 1 BLOSUM80 10 1 BLOSUM62 11 1 BLOSUM45 14 2 PAM30 7 2 PAM30 6 2 PAM30 5 2 PAM30 10 1 PAM30 9 1 #recommended PAM30 8 1 PAM70 8 2 PAM70 7 2 PAM70 6 2 PAM70 11 1 PAM70 10 1 #recommended PAM70 9 1 BLOSUM80 8 2 BLOSUM80 7 2 BLOSUM80 6 2 BLOSUM80 11 1 BLOSUM80 10 1 #recommended BLOSUM80 9 1 BLOSUM62 9 2 BLOSUM62 8 2 BLOSUM62 7 2 BLOSUM62 12 1 BLOSUM62 11 1 #recommended BLOSUM62 10 1 BLOSUM45 13 3 BLOSUM45 12 3 BLOSUM45 11 3 BLOSUM45 10 3 BLOSUM45 15 2 BLOSUM45 14 2 #recommended BLOSUM45 13 2 BLOSUM45 12 2 BLOSUM45 19 1 BLOSUM45 18 1 BLOSUM45 17 1 BLOSUM45 16 1
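
Each MAT_PARAM value above appears to be a triple of matrix name, gap creation penalty, and gap extension penalty, with a #recommended marker following the preferred triple for each matrix. A minimal sketch of splitting such a string into records, under that reading, is shown here. The function name parse_mat_param() is hypothetical and the input is a shortened excerpt of the list.

```perl
use strict;
use warnings;

# Split a MAT_PARAM option string into matrix / gap-creation / gap-extension
# triples. A '#recommended' marker is assumed to apply to the triple before it.
sub parse_mat_param {
    my ($text) = @_;
    my @entries;
    my @tokens = split ' ', $text;
    while (@tokens) {
        my $token = shift @tokens;
        if ($token eq '#recommended') {
            $entries[-1]{recommended} = 1 if @entries;
            next;
        }
        # The two tokens after a matrix name are its gap penalties.
        my ($create, $extend) = splice(@tokens, 0, 2);
        push @entries, {
            matrix      => $token,
            gap_create  => $create,
            gap_extend  => $extend,
            recommended => 0,
        };
    }
    return @entries;
}

my @entries = parse_mat_param(
    'PAM30 10 1 PAM30 9 1 #recommended PAM30 8 1 BLOSUM62 11 1 #recommended'
);

# Print only the triples flagged as recommended.
for my $e (grep { $_->{recommended} } @entries) {
    print "$e->{matrix}: create=$e->{gap_create} extend=$e->{gap_extend}\n";
}
```

Records of this shape can then be compared against the -MATRIX, -GAP_CREATE, and -GAP_EXTEND values a caller supplies.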

OTHER_ADVANCED [default value]:""

EMAIL [default value]:'' IS_SET

PATH [default value]:""

HTML [default value]:'' IS_SET

PSI BLAST2

PROGRAM [default value]:blastp

DATALIB [default value]:nr month swissprot pdb kabat alu yeast ecoli

GAPPED_ALIGNMENT [default value]:is_set ''

INPUT_TYPE [default value]:Sequence in FASTA format Accession or GI

SEQUENCE

EXPECT [default value]:10 0.0001 0.01 1 10 100 1000

FILTER [default value]:default none

NCBI_GI [default value]:'' is_set

GRAPHIC_OVERVIEW [default value]:is_set ''

DESCRIPTIONS [default value]:500 0 10 50 100 250 500

ALIGNMENTS [default value]:500 0 10 50 100 250 500

E_THRESH [default value]:0.001 #max value is 10

MAT_PARAM [default value]:BLOSUM62 11 1 PAM30 9 1 PAM70 10 1 BLOSUM80 10 1 BLOSUM62 11 1 BLOSUM45 14 2 PAM30 7 2 PAM30 6 2 PAM30 5 2 PAM30 10 1 PAM30 9 1 PAM30 8 1 PAM70 8 2 PAM70 7 2 PAM70 6 2 PAM70 11 1 PAM70 10 1 PAM70 9 1 BLOSUM80 8 2 BLOSUM80 7 2 BLOSUM80 6 2 BLOSUM80 11 1 BLOSUM80 10 1 BLOSUM80 9 1 BLOSUM62 9 2 BLOSUM62 8 2 BLOSUM62 7 2 BLOSUM62 12 1 BLOSUM62 11 1 BLOSUM62 10 1 BLOSUM45 13 3 BLOSUM45 12 3 BLOSUM45 11 3 BLOSUM45 10 3 BLOSUM45 15 2 BLOSUM45 14 2 BLOSUM45 13 2 BLOSUM45 12 2 BLOSUM45 19 1 BLOSUM45 18 1 BLOSUM45 17 1 BLOSUM45 16 1

OTHER_ADVANCED [default value]:""

WashU BLAST2

WU-Blast2 Database Searches http://www2.ebi.ac.uk/blast2/

email ""

title Sequence

srchtype interactive email

database swall swissprot swnew trembl tremblnew pdb gpcrdb prints HLAprot embl emnew est igvec emvec imgt HLAnuc

program WU-blastp WU-blastx WU-blastn

matrix blosum62 blosum30 blosum35 blosum40 blosum45 blosum50 blosum65 blosum70 blosum75 blosum80 blosum85 blosum90 blosum100 GONNET pam10 pam20 pam30 pam40 pam50 pam60 pam70 pam80 pam90 pam100 pam110 pam120 pam130 pam140 pam150 pam160 pam170 pam180 pam190 pam200 pam210 pam220 pam230 pam240 pam250 pam260 pam270 pam280 pam290 pam300 pam310 pam320 pam330 pam340 pam350 pam360 pam370 pam380 pam390 pam400 pam410 pam420 pam430 pam440 pam450 pam460 pam470 pam480 pam490 pam500

strand default top bottom

exp default 1.0 10 100 1000

filter none seg xnu seg+xnu dust

echofilter no yes

histogram no yes

stats sump poisson

sort pvalue count highscore totalscore

scores default 5 10 20 50 100 150 200 250

numal default 5 10 20 50 100 150 200 250

sequence
