NAME

Bio::Phylo::CIPRES - Reusable components for CIPRES REST API access

SYNOPSIS

 my $cipres = Bio::Phylo::CIPRES->new( 
        'infile'    => 'infile.fasta',                 # input data file
        'tool'      => 'MAFFT_XSEDE',                  # tool to run
        'param'     => { 'vparam.runtime_' => 7.5 },   # extra parameters, e.g. max runtime
        'outfile'   => { 'output.mafft' => 'out.fa' }, # name of output data to fetch
        'yml'       => 'config.yml',                   # client credentials
 );
 
 my $url = $cipres->run;
 
 $cipres->clean_job( $url );

DESCRIPTION

The CyberInfrastructure for Phylogenetic RESearch (CIPRES) is a portal that provides access to phylogenetic analysis tools that can be run on the XSEDE HPC infrastructure. The portal has a web browser (point and click) interface, but also a web service interface that can be driven with RESTful requests. The basic workflow is as follows:

  • Launch a job. This is done by issuing an HTTP POST request that includes: 1) HTTP authentication (i.e. a user name and password registered to the realm), 2) the input data to upload, and 3) configuration options for the job. The return value is an XML document that reports the status. If all goes well, it reports that the job was launched successfully and gives a URL to visit to check on the status.

  • Check job status. The job URL is visited periodically (at most once every 60 seconds, as per CIPRES policy). This is done using an authenticated GET request whose return value is an XML document that reports whether the job has finished. Once the job has finished, the document includes a link to another document that lists the output data. Each output is named (e.g. output.mafft) and is identified by a URL from which the data can be retrieved.

  • Get results. Upon completion, the results are fetched from their respective URLs. In simple cases this will be just a single file (e.g. an alignment), but there can be multiple file types, as well as output from STDERR and STDOUT and various job status files that the server generated internally.
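The status-checking step in the workflow above amounts to a polling loop. The following is a minimal sketch of that idea in core Perl, not the module's actual implementation. The helper wait_for_job, its interval and max_polls arguments, and the done and results hash keys are all hypothetical names chosen for illustration. The status check is stubbed out so the example runs without network access.

```perl
use strict;
use warnings;

# Hypothetical helper: polls until the job reports completion.
# $check is a code reference standing in for an authenticated GET on the
# status URL. It is assumed to return a hash reference with a 'done' flag
# and, once finished, a 'results' entry (illustrative key names).
sub wait_for_job {
    my ( $check, %args ) = @_;
    my $interval = $args{interval}  // 60;     # CIPRES policy: at most one poll per 60 s
    my $max      = $args{max_polls} // 100;    # give up eventually
    for my $attempt ( 1 .. $max ) {
        my $status = $check->();
        return $status if $status->{done};
        sleep $interval if $interval > 0;
    }
    die "Job did not finish after $max polls\n";
}

# Stubbed status check: reports completion on the third poll.
my $polls      = 0;
my $fake_check = sub {
    $polls++;
    return $polls < 3
        ? { done => 0 }
        : { done => 1, results => 'https://example.org/job/results' };
};

# interval => 0 keeps the demonstration fast; real use should keep the default.
my $status = wait_for_job( $fake_check, interval => 0 );
print "finished after $polls polls: $status->{results}\n";
```

Capping the number of polls avoids waiting forever on a job that never completes; the cap used here is arbitrary.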

This module hides the complexity of this interaction, so that entire analyses can be run using only the commands shown in the SYNOPSIS section. The general idea is that you can reuse this functionality in other modules and scripts. It also does the heavy lifting for the cipresrun executable, which allows you to run analyses from the terminal.

METHODS

new()

The constructor takes the arguments shown in the SYNOPSIS section. The arguments are a direct translation of the named arguments (not option flags) that are passed on the command line to the cipresrun program. The values of the outfile and param arguments are both hash references.

run()

Runs the entire analysis using the configuration provided to the constructor. Returns key/value pairs where each key is an outfile and each value is the data as text.

launch_job()

Is called by run(). Launches the analysis and returns the status URL at which progress can be inspected.

check_status()

Is called by run(). Consults the status URL. Returns a hash reference whose values specify whether the job is done, and if so, where the results can be fetched.

get_results()

Is called by run(). Returns the named result data as a hash.

yml()

Given the location of the config.yml file, populates properties of the object with the parameter values needed to authenticate the client with the CIPRES server.

ua()

Instantiates an authenticated LWP::UserAgent object.

payload()

Constructs the HTTP POST payload for launching jobs. Returns an array reference of key/value pairs.
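To illustrate what an array reference of key/value pairs can look like, here is a minimal sketch that uses the values from the SYNOPSIS. It is not the module's actual payload() code. The helper build_payload is hypothetical, and the field names tool and input.infile_ are assumptions based on the public CIPRES REST API conventions. Wrapping the file name in an array reference follows the file upload convention of HTTP::Request::Common, which is not loaded here.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's actual payload() implementation.
sub build_payload {
    my ( $tool, $infile, $param ) = @_;
    my @payload = (
        'tool'          => $tool,          # assumed field name for the tool to run
        'input.infile_' => [$infile],      # array ref marks a file upload
    );

    # Append the extra parameters in a stable (sorted) order.
    push @payload, $_ => $param->{$_} for sort keys %$param;
    return \@payload;
}

my $payload = build_payload(
    'MAFFT_XSEDE',
    'infile.fasta',
    { 'vparam.runtime_' => 7.5 },
);

# Walk the flat list two elements at a time.
for ( my $i = 0 ; $i < @$payload ; $i += 2 ) {
    my ( $key, $value ) = @{$payload}[ $i, $i + 1 ];
    $value = "<file $value->[0]>" if ref $value eq 'ARRAY';
    print "$key = $value\n";
}
```

A flat array reference, unlike a hash reference, preserves the order of the fields and permits repeated keys.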

headers()

Constructs the HTTP headers to identify the client app and, optionally, to tell the server that multipart/form-data is being attached as a payload.
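As a rough sketch of the idea, and not the module's actual headers() code, the headers can be built as a list of key/value pairs. The helper build_headers and the placeholder key EXAMPLE_APP_KEY are hypothetical. The header name cipres-appkey is an assumption based on the public CIPRES REST API conventions. The pair Content_Type => 'form-data' is the notation HTTP::Request::Common uses to request a multipart/form-data body.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's actual headers() implementation.
sub build_headers {
    my ( $app_key, $with_upload ) = @_;

    # Assumed header name that identifies the client application.
    my @headers = ( 'cipres-appkey' => $app_key );

    # Only announce a multipart body when a payload is attached.
    push @headers, 'Content_Type' => 'form-data' if $with_upload;
    return @headers;
}

my %launch = build_headers( 'EXAMPLE_APP_KEY', 1 );    # POST with upload
my %status = build_headers( 'EXAMPLE_APP_KEY', 0 );    # plain GET

print "launch: $_ => $launch{$_}\n" for sort keys %launch;
print "status: $_ => $status{$_}\n" for sort keys %status;
```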

clean_job()

Cleans up the job on the server that is identified by the provided input URL (i.e. the status URL, which is the return value of the run() method).

SEE ALSO

Also consult the documentation for cipresrun, which shows the usage of this module from the command line.

LICENSE

MIT License

Copyright (c) 2020 Naturalis Biodiversity Center

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.