Statistics::Data::Rank - Utilities for ranking data
This is documentation for Version 0.02, released February 2015.
use Statistics::Data::Rank;
my $rankd = Statistics::Data::Rank->new();
my %vars = (
    'nodrug'   => [174, 224, 260],
    'placebo'  => [261, 213, 231],
    'morphine' => [199, 143, 113],
);
my $ranks_href = $rankd->ranks_between(data => \%vars);
# pre-load data:
$rankd->load(\%vars);
$ranks_href = $rankd->ranks_within();
my $sor = $rankd->sum_of_ranks_within(); # or _between()
# or specify which vars to rank/sum-rank:
$sor = $rankd->sum_of_ranks_within(lab => [qw/placebo morphine/]);
Performs ranking of named data, either by an independent, between-variable method (as in the Kruskal-Wallis test), or by a dependent, cross-variable method (as in the Friedman test). Methods return a hash of ranks or of sums of ranks. Data must be pre-loaded (as per Statistics::Data) or sent to the methods with the argument data as a hash-ref of array-refs. Output is tested ahead of installation to ensure it matches published data (Siegel, 1956).
$rankd = Statistics::Data::Rank->new();
Constructor, expecting/accepting no args. Inherited from Statistics::Data.
$rankd->load('a' => [1, 4], 'b' => [3, 7]);
The given data can now be used by any of the following methods. This is inherited from Statistics::Data, and all its other methods are available here via the class object. Only passing of data as a hash of arrays (HOA) is supported for now. Alternatively, give each of the following methods the HOA for the optional named argument data.
$ranks_href = $rankd->ranks_between(data => $values_href);
$ranks_href = $rankd->ranks_between(lab => [qw/fez bop/]); # two, say, of previously loaded data
$ranks_href = $rankd->ranks_between(); # all of any previously loaded data
($ranks_href, $ties_aref, $nties) = $rankd->ranks_between(data => $values_href);
Given a hash of arefs where the keys are names (groups, treatments) of the sample data (each as an aref), returns a hash of the ranks of each value under each name, after pooling all the data and ranking them with a link to their name. Ties are resolved by giving each tied score the mean of the ranks for which it is tied (see Siegel, 1956, p. 188ff). If called in list context, a reference to an array giving, for each rank, the number of variables sharing the same value, and a scalar for the number of ties, are also returned. Before ranking, data are checked for numeracy, and any non-numeric or empty values are culled.
Used, e.g., by Kruskal-Wallis ANOVA, Jonckheere-Terpstra ANOVA, Dwass-Steel comparison, and Worsley-cluster tests.
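The pooling and mean-rank tie handling described above can be sketched in core Perl. This is a minimal illustration only, not the module's own code: the helper name pooled_ranks is hypothetical, and the sketch assumes the data are already numeric (no culling step is shown).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Statistics::Data::Rank:
# pool all values, rank them in ascending order, and give tied
# values the mean of the ranks they span.
sub pooled_ranks {
    my ($data) = @_;    # hash-ref of array-refs, keyed by name

    # Tag each value with its name and index, then sort the pool by value.
    my @pool;
    for my $name (sort keys %{$data}) {
        my $values = $data->{$name};
        push @pool, map { [$name, $_, $values->[$_]] } 0 .. $#{$values};
    }
    @pool = sort { $a->[2] <=> $b->[2] } @pool;

    my %ranks;
    my $i = 0;
    while ($i < @pool) {
        # Find the run of equal values starting at position $i.
        my $j = $i;
        $j++ while $j < $#pool && $pool[$j + 1][2] == $pool[$i][2];

        # Positions $i..$j hold ranks $i+1..$j+1; tied values share their mean.
        my $mean_rank = (($i + 1) + ($j + 1)) / 2;
        $ranks{ $_->[0] }[ $_->[1] ] = $mean_rank for @pool[$i .. $j];
        $i = $j + 1;
    }
    return \%ranks;
}

my %vars = (
    nodrug   => [174, 224, 260],
    placebo  => [261, 213, 231],
    morphine => [199, 143, 113],
);
my $between = pooled_ranks(\%vars);
print "$_: @{ $between->{$_} }\n" for sort keys %{$between};
# morphine: 4 2 1
# nodrug: 3 6 8
# placebo: 9 5 7
```

Each rank stays at the same index as the value it came from, so the returned structure mirrors the shape of the input hash.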
$ranks_href = $rankd->ranks_within(data => $values_href); # pass data now
$ranks_href = $rankd->ranks_within(); # using all of any previously loaded data
($ranks_href, $ties_href) = $rankd->ranks_within();
Given a hash of arefs where the keys are variable names, and the values are their actual sample data (each as an aref), returns a hash of the ranks of each value under each name, calculated dependently (per the values across individual indices). So if 'a' => [1, 3, 7] and 'b' => [4, 5, 6], the values are ranked index by index (1 against 4, 3 against 5, 7 against 6), and the ranks returned will be 'a' => [1, 1, 2] and 'b' => [2, 2, 1]. Ties are resolved by giving each tied score the mean of the ranks for which it is tied (see Siegel, 1956, p. 188ff). If called in list context, a reference to a hash of arefs is also returned, giving the number of variables having the same value at each index for a rank. Before ranking, data are checked for numeracy, and any non-numeric or empty values are culled.
Used, e.g., by Friedman and Page tests.
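Dependent ranking, where the values at each index are ranked against each other across variables, can be sketched as follows. The helper name ranks_per_index is hypothetical, and the sketch assumes numeric data with the same number of values under every name; it is not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Statistics::Data::Rank:
# at each index, rank the variables against each other by their value there.
sub ranks_per_index {
    my ($data) = @_;    # hash-ref of array-refs, keyed by name
    my @names = sort keys %{$data};
    my $n = scalar @{ $data->{ $names[0] } };    # assumes equal lengths

    my %ranks;
    for my $idx (0 .. $n - 1) {
        # Order the variable names by their value at this index.
        my @order = sort { $data->{$a}[$idx] <=> $data->{$b}[$idx] } @names;

        my $i = 0;
        while ($i < @order) {
            # Extend $j over any variables tied with the one at position $i.
            my $j = $i;
            $j++ while $j < $#order
                && $data->{ $order[$j + 1] }[$idx] == $data->{ $order[$i] }[$idx];

            # Tied variables share the mean of ranks $i+1..$j+1.
            my $mean_rank = (($i + 1) + ($j + 1)) / 2;
            $ranks{$_}[$idx] = $mean_rank for @order[$i .. $j];
            $i = $j + 1;
        }
    }
    return \%ranks;
}

my %vars = (
    nodrug   => [174, 224, 260],
    placebo  => [261, 213, 231],
    morphine => [199, 143, 113],
);
my $within = ranks_per_index(\%vars);
print "$_: @{ $within->{$_} }\n" for sort keys %{$within};
# morphine: 2 1 1
# nodrug: 1 3 3
# placebo: 3 2 2
```

With three variables, every index receives the ranks 1 to 3 (or their tied means), so the ranks at each index always sum to 6.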
$sor = $rankd->sum_of_ranks_between(); # all pre-loaded data
$sor = $rankd->sum_of_ranks_between(data => HASHREF); # or using these data
$sor = $rankd->sum_of_ranks_between(lab => STRING); # or for a particular load
Returns the sum of ranks for either (1) the entire dataset, as given in the argument data or as all pre-loaded variables; or (2) a particular pre-loaded dataset (variable), as given in the named argument lab. In either case (assuming more than one variable), the ranks are those assigned after all values have been pooled and ordered by value.
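As an illustration of what a sum of ranks per variable looks like, the sketch below sums ranks that were worked out by hand by pooling the three variables of the SYNOPSIS data and ranking them in ascending order. The helper name sum_ranks is hypothetical, and List::Util stands in for the module's own summing dependency.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Pooled (between-variable) ranks of the SYNOPSIS data,
# worked by hand for this sketch.
my %ranks = (
    nodrug   => [3, 6, 8],
    placebo  => [9, 5, 7],
    morphine => [4, 2, 1],
);

# Hypothetical helper: add up the ranks held under each name.
sub sum_ranks {
    my ($ranks) = @_;    # hash-ref of array-refs of ranks
    my %sum;
    $sum{$_} = sum(@{ $ranks->{$_} }) for keys %{$ranks};
    return \%sum;
}

my $sor = sum_ranks(\%ranks);
printf "%-8s %d\n", $_, $sor->{$_} for sort keys %{$sor};
# morphine 7
# nodrug   17
# placebo  21
```

A quick sanity check is that the sums add up to N(N + 1)/2 for N pooled values; here 7 + 17 + 21 = 45 = 9 * 10 / 2.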
$sor = $rankd->sum_of_ranks_within(); # all pre-loaded data
$sor = $rankd->sum_of_ranks_within(data => HASHREF); # or using these data
$sor = $rankd->sum_of_ranks_within(lab => STRING); # or for a particular load
If called in list context, the href of summed ranks is returned, followed by the href of ties (useful for some statistics). Otherwise, only the href of summed ranks is returned. The sum for a particular named variable can also be requested with the argument lab.
Returns the sum of the squared sums-of-ranks calculated dependently (per the values across individual indices). Used in Friedman ANOVA. Expects a hashref of the variables, keyed by name. Called in list context, also returns a hash of the tied ranks.
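The quantity described here can be sketched from within-index ranks worked out by hand for the SYNOPSIS data. The helper name sumsq_of_rank_sums is hypothetical; the closing lines apply the textbook Friedman chi-square formula without tie correction, purely to show where the value is used, and are not taken from the module.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Within-index (dependent) ranks of the SYNOPSIS data,
# worked by hand for this sketch.
my %ranks = (
    nodrug   => [1, 3, 3],
    placebo  => [3, 2, 2],
    morphine => [2, 1, 1],
);

# Hypothetical helper: square each variable's sum of ranks, then add them up.
sub sumsq_of_rank_sums {
    my ($ranks) = @_;    # hash-ref of array-refs of ranks
    return sum(map { sum(@{$_})**2 } values %{$ranks});
}

my $sumsq = sumsq_of_rank_sums(\%ranks);    # 7**2 + 7**2 + 4**2

# Friedman chi-square (no tie correction) for k variables and n indices:
# 12 * sumsq / (n * k * (k + 1)) - 3 * n * (k + 1)
my $k   = keys %ranks;
my $n   = @{ $ranks{nodrug} };
my $chi = 12 * $sumsq / ($n * $k * ($k + 1)) - 3 * $n * ($k + 1);

printf "sum of squares = %d, chi-square = %.2f\n", $sumsq, $chi;
# sum of squares = 114, chi-square = 2.00
```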
List::AllUtils : used for summing.
Statistics::Data : used as base.
Statistics::Lite : for basic descriptives.
String::Util : string content checking.
Croaked ahead of calculating (sums of) ranks between or within if there is no hashref of data available.
croaked by sum_of_ranks_between and sum_of_ranks_within if the value of the optional argument lab does not exist as pre-loaded data; either in a call to load or add, or as data in the present method.
Siegel, S. (1956). Nonparametric statistics for the behavioral sciences. New York, NY, US: McGraw-Hill.
Roderick Garton, <rgarton at cpan.org>
Please report any bugs or feature requests to bug-statistics-data-rank-0.02 at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Statistics-Data-Rank-0.02. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
You can find documentation for this module with the perldoc command.
perldoc Statistics::Data::Rank
You can also look for information at:
RT: CPAN's request tracker (report bugs here)
http://rt.cpan.org/NoAuth/Bugs.html?Dist=Statistics-Data-Rank-0.02
AnnoCPAN: Annotated CPAN documentation
http://annocpan.org/dist/Statistics-Data-Rank-0.02
CPAN Ratings
http://cpanratings.perl.org/d/Statistics-Data-Rank-0.02
Search CPAN
http://search.cpan.org/dist/Statistics-Data-Rank-0.02/
Statistics::RankCorrelation : loop for dealing with ties in calculating "ranks within" adapted from Boggs' "rank" function.
Copyright 2015 Roderick Garton.
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See http://dev.perl.org/licenses/ for more information.