NAME

Bio::MaxQuant::ProteinGroups::Response - Analyze MQ proteinGroups for differential responses

VERSION

Version 0.03

SYNOPSIS

This module is tailored for MaxQuant data, but could be applicable elsewhere. The target experiment is one where several cell types have been assayed for responses to different conditions, e.g. cancer cell lines responding to hormones and drugs. The module helps to analyse responses to the conditions within each cell line and differences in those responses between cell lines. Those differences in responses indicate that the proteins involved are markers of the mechanism by which the cells differ in their response, and are therefore good targets not only for biomarker work but also for biological follow-up.

    use Bio::MaxQuant::ProteinGroups::Response;

    my $resp = Bio::MaxQuant::ProteinGroups::Response->new(
        filepath=>'proteinGroups.txt'
    );

    $resp->replicate_comparison(output_directory=>'./replicate_comparisons');
    $resp->calculate_response_comparisons(output_directory=>'./responses');
    $resp->calculate_differential_response_comparisons(output_directory=>'./differential_responses');

The data are output as tables in the directories. They are the printable tables returned from Statistics::Reproducibility.

SUBROUTINES/METHODS

new

creates a new ProteinGroups object.

Options:

filepath - path to the file. Default is proteinGroups.txt.

separator - NOT the table separator! This is the separator used in the experiment name to separate cell line from condition from replicate. Default is full stop (period).

rseparator - used for separating the compared cells/conditions. Default is hyphen (-).

replicate_indicator - used in differential response comparisons to indicate the cell whose individual replicates were compared (with the median of the other cell).

resultsfile

returns a handle to the results file, ready for writing.

This is not called until processing starts, but when it is, it will clobber the old file.

experiments

Returns the list of experiments in the file as a hash. Keys are names, values are listrefs of cellline, condition, replicate. Caches! So once called, it will not re-read the file unless/until you delete $o->{experiments}.

Also populates the celllines, conditions and replicates lists, which are accessible by their own accessors.

quickNormalize

TO BE REMOVED

Does a quick normalization of ALL the input columns. They are each normalized by their own median, and not directly to each other.

Two options are available:

        select => [list of indices]
        exclude => [list of indices]

Select allows to choose a particular subset of rows on which to normalize, e.g. some proteins you know don't change. Exclude allows to choose a particular subset of rows to exclude from the normalization, e.g. contaminants.

    sub quickNormalize {
        my ($o, %opts) = @_;
        my $d = $o->data;
        my $n = $o->{n};
        my @I = (0 .. $n - 1);              # default: use all rows
        if ($opts{exclude}) {
            my %I;
            @I{@I} = @I;
            delete $I{$_} foreach @{ $opts{exclude} };
            @I = sort { $a <=> $b } keys %I;
        }
        if ($opts{select}) {                # select overrides exclude
            @I = @{ $opts{select} };
        }
        $o->{quicknorm} = {
            map {
                # median of the chosen rows of this column
                my $med = median((@{ $d->{$_} })[@I]);
                ($_ => [ map { /\d/ ? $_ - $med : '' } @{ $d->{$_} } ]);
            } keys %$d
        };
    }

TO BE REMOVED

blankRows

Option: select (as for quickNormalize)

This allows blanking the data for a subset (e.g. contaminants) so that they do not contribute to the statistics.

blankItems

Helper function. Accepts a listref and a list of indices to blank (set to ''); returns the listref for your convenience.
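
The behaviour described above can be sketched in a few lines. This is an illustrative stand-alone re-implementation, not the module's own code; the name blank_items and the sample column are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for blankItems: set the given indices of the
# list to '' in place and hand the same listref back.
sub blank_items {
    my ($list, @indices) = @_;
    $list->[$_] = '' foreach @indices;
    return $list;
}

my $column = [0.5, -1.2, 3.1, 0.7];
blank_items($column, 1, 3);    # blank rows 1 and 3 (e.g. contaminants)
print join(',', map { $_ eq '' ? '(blank)' : $_ } @$column), "\n";
```

Because the listref is modified in place, the return value is only a convenience for chaining.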

celllines

Returns the list of cell lines. Ensures experiments() is called.

conditions

Returns the list of conditions. Ensures experiments() is called.

condition_replicates

Returns a hash of key=conditions, value=list of replicates. Ensures experiments() is called.

replicates

Returns the list of replicates. Ensures experiments() is called.

parse_experiment_name

Method to parse the experiment name. Uses $o->{separator} to separate into 3 parts. Uses index and substr, not regexes. Default separator is dot/fullstop/period "." .
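
As a rough illustration of splitting with index and substr rather than a regex, here is a minimal sketch. The function name split_experiment_name and the sample name 'MCF7.E2.1' are hypothetical, and the handling of names with more than two separators (everything after the second separator becomes the replicate) is an assumption, not documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: split "cellline.condition.replicate" at the
# first two occurrences of the separator using index and substr.
sub split_experiment_name {
    my ($name, $sep) = @_;
    $sep = '.' unless defined $sep;
    my $len = length($sep);

    my $first = index($name, $sep);
    return if $first < 0;                      # no separator at all
    my $second = index($name, $sep, $first + $len);
    return if $second < 0;                     # only one separator

    my $cellline  = substr($name, 0, $first);
    my $condition = substr($name, $first + $len, $second - $first - $len);
    my $replicate = substr($name, $second + $len);
    return ($cellline, $condition, $replicate);
}

my @parts = split_experiment_name('MCF7.E2.1');
print join("\t", @parts), "\n";
```

Using index avoids having to escape the separator, which matters here because the default "." is a regex metacharacter.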

parse_response_name

Method to parse the response name. Uses $o->{rseparator} to separate into 3 parts. Uses index and substr, not regexes. Default separator is hyphen "-", which should not be used in experiment name!

replicate_comparison

Uses Statistics::Reproducibility to get normalized values and metrics on each condition.

Caches!

response_comparisons

Returns the list of comparisons that can be made between conditions within each cell line, given the replicates available.

At least 2 replicates must be available for a comparison to be made.

Caches.
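
The rule above (every pair of conditions within a cell line, provided each has at least 2 replicates) can be sketched as follows. This is not the module's implementation: the function name, the input structure of replicate counts, the sample names, and the "cell-condition1-condition2" naming joined with the rseparator are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical input: cell line => condition => number of replicates.
my %replicate_count = (
    MCF7 => { control => 3, E2 => 3, tamoxifen => 1 },
);

# Hypothetical sketch: list every unordered pair of conditions within
# each cell line, skipping conditions with fewer than 2 replicates.
sub list_response_comparisons {
    my ($counts, $rsep) = @_;
    my @comparisons;
    foreach my $cell (sort keys %$counts) {
        my @usable = grep { $counts->{$cell}{$_} >= 2 }
                     sort keys %{ $counts->{$cell} };
        foreach my $i (0 .. $#usable - 1) {
            foreach my $j ($i + 1 .. $#usable) {
                push @comparisons,
                    join($rsep, $cell, $usable[$i], $usable[$j]);
            }
        }
    }
    return @comparisons;
}

print "$_\n" foreach list_response_comparisons(\%replicate_count, '-');
```

In this sketch tamoxifen is dropped because it has only one replicate, so a single comparison remains for the cell line.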

differential_response_comparisons

Returns the list of comparisons that can be made between cell line responses to each condition.

Caches.

data

Reads in all the protein ratios from the proteinGroups file. Also reads other identifying information, such as id and Leading Proteins. Reads each non-normalized ratio column into a list and stores them in a hash by experiment name.

datum

Converts one datum into a logged ratio or an empty string, depending.
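
A minimal sketch of such a conversion is shown below. The function name is hypothetical, and both the log base (2) and the rule for what becomes an empty string (anything that is not a positive number, including NaN) are assumptions; the documentation above does not specify either.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical sketch: return log2 of a positive ratio, or '' for
# anything that cannot be logged (blank, text, NaN, zero, negative).
sub log_ratio_or_blank {
    my ($value) = @_;
    return '' unless defined $value
                  && looks_like_number($value)
                  && $value > 0;
    return log($value) / log(2);
}

printf "%s\t%s\n", $_, log_ratio_or_blank($_) foreach (4, 0.5, 'NaN', '');
```

Returning an empty string rather than undef keeps blanks printable in tab-delimited output.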

calculate_response_comparisons

Calculates the differences between conditions in a cell type and outputs a set of files. You can specify the directory with the output_directory option.

sigfigs

Helper function. Tries FormatSigFigs($_[0],$SigFigs), but only if $_[0] actually looks like a number! $SigFigs is a global in this module and is set to 3.
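
The guard-then-format pattern can be sketched with core Perl only. Note the substitutions: this sketch uses sprintf with %g instead of FormatSigFigs, so its output can differ (for example, %g drops trailing zeros and may switch to exponent notation), and returning non-numeric input unchanged is an assumption. The function name sigfigs_sketch is hypothetical.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

our $SigFigs = 3;    # mirrors the module-level global described above

# Hypothetical sketch: format to $SigFigs significant figures, but
# only when the input looks like a number; otherwise pass it through.
sub sigfigs_sketch {
    my ($value) = @_;
    return $value unless defined $value && looks_like_number($value);
    return sprintf("%.${SigFigs}g", $value);
}

print sigfigs_sketch(1.23456), "\n";
print sigfigs_sketch('n/a'), "\n";
```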

calculate_differential_response_comparisons

medians

calculates the medians for all replicate sets and stores them in $o->{medians}

put_resultsfile_hashtable

a method called by medians() if resultsfile was defined. Calls put_resultsfile with some medians and normalized data.

dumpHashtable

helper function that dumps a HoL as a tab delimited table.
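
One way to turn a hash of lists into a tab-delimited table is sketched below. The orientation (one column per key, one row per list index), the sorted key order, and the function name hol_to_tsv are assumptions for illustration; the module's actual layout may differ.

```perl
use strict;
use warnings;

# Hypothetical sketch: render a hash of lists as tab-delimited text
# with a header row of keys; short lists are padded with blanks.
sub hol_to_tsv {
    my ($hol) = @_;
    my @keys = sort keys %$hol;
    my $rows = 0;
    foreach my $k (@keys) {
        my $n = scalar @{ $hol->{$k} };
        $rows = $n if $n > $rows;
    }
    my @lines = (join("\t", @keys));
    foreach my $i (0 .. $rows - 1) {
        push @lines, join("\t",
            map { defined $hol->{$_}[$i] ? $hol->{$_}[$i] : '' } @keys);
    }
    return join("\n", @lines) . "\n";
}

print hol_to_tsv({ ratio => [0.5, 1.2], id => [10, 11] });
```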

median

helper function that does a simple median calculation
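
For reference, a simple median over a list might look like the sketch below. The function name is hypothetical, and skipping blank entries (using the same /\d/ test that quickNormalize applies) and returning '' for an empty input are assumptions about how blanks should be treated.

```perl
use strict;
use warnings;

# Hypothetical sketch: median of the numeric entries of a list.
# Entries without a digit (e.g. '') are ignored.
sub simple_median {
    my @sorted = sort { $a <=> $b } grep { defined $_ && /\d/ } @_;
    return '' unless @sorted;
    my $mid = int(@sorted / 2);
    return @sorted % 2
        ? $sorted[$mid]                               # odd count
        : ($sorted[$mid - 1] + $sorted[$mid]) / 2;    # even count
}

print simple_median(3, 1, '', 2), "\n";
```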

put_resultsfile

Takes a list of lists (ref) and outputs directly to $o->{resultsfile}. This is an alternative or addition to the output_file options available for some methods, and is called by dump_results_table and others throughout processing.

dump_results_table

Dumps a results table to a file ($o->{complete_results_file}) for later use.

translate_results_table

Helper function that separates out and better labels the different results from Statistics::Reproducibility.

AUTHOR

Jimi, <j at 0na.me>

BUGS

Please report any bugs or feature requests to bug-bio-maxquant-proteingroups-response at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Bio-MaxQuant-ProteinGroups-Response. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Bio::MaxQuant::ProteinGroups::Response

ACKNOWLEDGEMENTS

LICENSE AND COPYRIGHT

Copyright 2014 Jimi.

This program is free software; you can redistribute it and/or modify it under the terms of the Artistic License (2.0). You may obtain a copy of the full license at:

http://www.perlfoundation.org/artistic_license_2_0

Any use, modification, and distribution of the Standard or Modified Versions is governed by this Artistic License. By using, modifying or distributing the Package, you accept this license. Do not use, modify, or distribute the Package, if you do not accept this license.

If your Modified Version has been derived from a Modified Version made by someone other than you, you are nevertheless required to ensure that your Modified Version complies with the requirements of this license.

This license does not grant you the right to use any trademark, service mark, tradename, or logo of the Copyright Holder.

This license includes the non-exclusive, worldwide, free-of-charge patent license to make, have made, use, offer to sell, sell, import and otherwise transfer the Package with respect to any patent claims licensable by the Copyright Holder that are necessarily infringed by the Package. If you institute patent litigation (including a cross-claim or counterclaim) against any party alleging that the Package constitutes direct or contributory patent infringement, then this Artistic License to you shall terminate on the date that such litigation is filed.

Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.