NAME

Statistics::Sequences - Common methods/interface for sub-module sequential tests (of Runs, Joins, Pot, etc.)

VERSION

This is documentation for Version 0.15 of Statistics::Sequences.

SYNOPSIS

 use Statistics::Sequences 0.15;
 my $seq = Statistics::Sequences->new();
 my @data = (1, 'a', 'a', 1); # ordered list
 $seq->load(\@data); # or @data or 'name' => \@data
 print $seq->observed(stat => 'runs'); # assuming sub-module Runs.pm is installed
 print $seq->test(stat => 'vnomes', length => 2); # assuming sub-module Vnomes.pm is installed
 $seq->dump(stat => 'runs', values => [qw/observed z_value p_value/], exact => 1, tails => 1);
 # see also Statistics::Data for inherited methods

DESCRIPTION

This module provides methods for loading, updating and accessing data as an ordered list of scalar values (numbers, strings), for statistical tests of their sequential properties via sub-modules including Statistics::Sequences::Joins, Statistics::Sequences::Pot, Statistics::Sequences::Runs, Statistics::Sequences::Turns and Statistics::Sequences::Vnomes. None of these sub-modules is installed by default.

It also provides a common interface to access the statistical values returned by these tests, so that several tests can be performed on the same data, with the same class object. Alternatively, use each sub-module directly.

SUBROUTINES/METHODS

new

 $seq = Statistics::Sequences->new();

Returns a new Statistics::Sequences object (inherited from Statistics::Data) by which all the methods for caching, reading and testing data can be accessed, including each of the methods for performing the Runs-, Joins-, Pot-, Turns- or Vnomes-tests.

Each sub-package also has its own new method, so that, e.g., Statistics::Sequences::Runs can be used on its own and its new method called directly:

 use Statistics::Sequences::Runs;
 $runs = Statistics::Sequences::Runs->new();

In this case, data are not automatically shared across packages, and only one test (in this case, the Runs-test) can be accessed through the class-object.

load, add, access, unload

All these operations on the basic data are inherited from Statistics::Data - see that module's documentation for details of these and other available methods.

observed

 $v = $seq->observed(stat => 'joins|pot|runs|turns|vnomes', %args); # gets data from cache, with any args needed by the stat
 $v = $seq->observed(stat => 'joins|pot|runs|turns|vnomes', data => [qw/blah bing blah blah blah/]); # data sent with the call, plus any args needed by the particular stat
 $v = $seq->observed(stat => 'joins|pot|runs|turns|vnomes', label => 'myLabelledLoadedData'); # previously loaded data with this label, plus any args needed by the particular stat

If this method is defined by the sub-module named in the argument stat, returns the observed value of the statistic for the loaded data, or for data sent with this call, e.g., how many runs there are in the sequence (1, 1, 0, 1). See the particular statistic's manpage for any other required or optional arguments.
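
As an illustration of what an observed value means for the runs statistic, here is a minimal plain-Perl sketch. The helper count_runs is hypothetical and is not part of Statistics::Sequences; it simply counts maximal blocks of identical adjacent values, which is assumed to be what the Runs sub-module reports as its observed value.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::Sequences): count runs,
# i.e. maximal blocks of identical adjacent values, in an ordered list.
sub count_runs {
    my @data = @_;
    return 0 if !@data;
    my $runs = 1;    # the first element always opens a run
    for my $i (1 .. $#data) {
        # string comparison, so 1 and 'a' can be mixed in one list
        $runs++ if $data[$i] ne $data[$i - 1];
    }
    return $runs;
}

print count_runs(1, 1, 0, 1), "\n";    # prints 3: (1 1), (0), (1)
```

With the sub-module installed, the same count would be requested through observed(stat => 'runs').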

expected

 $v = $seq->expected(stat => 'joins|pot|runs|turns|vnomes', %args); # gets data from cache, with any args needed by the stat
 $v = $seq->expected(stat => 'joins|pot|runs|turns|vnomes', data => [qw/blah bing blah blah blah/]); # data sent with the call, plus any args needed by the particular stat

If this method is defined by the sub-module named in the argument stat, returns the expected value of the statistic for the loaded data, or for data sent with this call, e.g., how many runs should occur in a 4-length sequence of two possible events. See the statistic's manpage for any other required or optional arguments.
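
For the runs example, the expected value can be sketched in plain Perl. This assumes the standard Wald-Wolfowitz formula, E = 2*n1*n2 / (n1 + n2) + 1; whether the Runs sub-module uses exactly this form should be checked against its own manpage. The helper name expected_runs is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: expected number of runs for a two-valued sequence
# with $n1 events of one kind and $n2 of the other, using the standard
# Wald-Wolfowitz formula E = 2*n1*n2 / (n1 + n2) + 1 (an assumption here).
sub expected_runs {
    my ($n1, $n2) = @_;
    my $n = $n1 + $n2;
    return 0 if $n == 0;    # no data, no runs
    return 2 * $n1 * $n2 / $n + 1;
}

print expected_runs(2, 2), "\n";    # 4-length sequence, two of each event: prints 3
print expected_runs(3, 1), "\n";    # counts as in (1, 1, 0, 1): prints 2.5
```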

variance

 $seq->variance(stat => 'joins|pot|runs|turns|vnomes', %args); # gets data from cache, with any args needed by the stat
 $seq->variance(stat => 'joins|pot|runs|turns|vnomes', data => [qw/blah bing blah blah blah/]); # data sent with the call, plus any args needed by the particular stat

Returns the expected range of deviation in the statistic's observed value for the given number of trials, if this method is defined by the sub-module named in the argument stat.

obsdev

 $v = $seq->obsdev(stat => 'joins|pot|runs|turns|vnomes', %args); # gets data from cache, with any args needed by the stat
 $v = $seq->obsdev(stat => 'joins|pot|runs|turns|vnomes', data => [qw/blah bing blah blah blah/]); # data sent with the call, plus any args needed by the particular stat

Returns the deviation of (difference between) the observed and expected values of the statistic for the loaded/given sequence (O - E), if this method is defined by the sub-module named in the argument stat.

stdev

 $v = $seq->stdev(stat => 'joins|pot|runs|turns|vnomes', %args); # gets data from cache, with any args needed by the stat
 $v = $seq->stdev(stat => 'joins|pot|runs|turns|vnomes', data => [qw/blah bing blah blah blah/]); # data sent with the call, plus any args needed by the particular stat

Returns the square root of the variance, if this method is defined by the sub-module named in the argument stat.

z_value

 $v = $seq->z_value(stat => 'joins|pot|runs|turns|vnomes', %args); # gets data from cache, with any args needed by the stat
 $v = $seq->z_value(stat => 'joins|pot|runs|turns|vnomes', data => [qw/blah bing blah blah blah/]); # data sent with the call, plus any args needed by the particular stat

Returns the deviation ratio: the observed deviation divided by the standard deviation. Use the argument ccorr for continuity correction.
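
The relation between the observed deviation, standard deviation and z-value can be sketched as follows. The helper z_from_moments is hypothetical, and the continuity correction shown (reducing the absolute deviation by 0.5, not below zero) is one common convention assumed for illustration; the sub-modules may apply ccorr differently. The example values O = 3, E = 2.5 and V = 0.25 are those given by the usual Wald-Wolfowitz formulas for the runs in (1, 1, 0, 1).

```perl
use strict;
use warnings;

# Hypothetical helper: z = (O - E) / sqrt(V). When $ccorr is true, the
# absolute deviation is reduced by 0.5 (not below zero) before dividing;
# this form of continuity correction is an assumption of the sketch.
sub z_from_moments {
    my ($observed, $expected, $variance, $ccorr) = @_;
    return if !defined $variance || $variance <= 0;    # z is undefined without spread
    my $obsdev = $observed - $expected;                # the O - E deviation
    my $stdev  = sqrt $variance;                       # square root of the variance
    if ($ccorr) {
        my $abs = abs($obsdev) - 0.5;
        $abs = 0 if $abs < 0;
        $obsdev = $obsdev < 0 ? -$abs : $abs;          # keep the original sign
    }
    return $obsdev / $stdev;
}

print z_from_moments(3, 2.5, 0.25), "\n";       # prints 1
print z_from_moments(3, 2.5, 0.25, 1), "\n";    # prints 0
```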

p_value

 $p = $seq->p_value(stat => 'runs'); # same for 'joins', 'turns'
 $p = $seq->p_value(stat => 'pot', state => 'a value appearing in the data');
 $p = $seq->p_value(stat => 'vnomes', length => 'an integer greater than zero and less than sample-size');

Returns the probability of observing so many runs, joins, etc., according to whatever such method is defined by the sub-module named in the argument stat.
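
When a p-value is derived from a z-value by normal approximation, the calculation can be sketched as below. This is an illustration only: the helper p_from_z is hypothetical, each sub-module defines its own p_value method (and may offer an exact test instead), and POSIX::erfc is assumed to be available, which requires Perl 5.22 or later.

```perl
use strict;
use warnings;
use POSIX ();    # POSIX::erfc needs Perl 5.22 or later

# Hypothetical helper: p-value for a z-value under the standard normal
# distribution; $tails is 1 or 2 (defaults to 2 in this sketch).
sub p_from_z {
    my ($z, $tails) = @_;
    $tails = 2 if !defined $tails;
    my $two_tailed = POSIX::erfc(abs($z) / sqrt(2));
    return $tails == 1 ? $two_tailed / 2 : $two_tailed;
}

printf "%.4f\n", p_from_z(1.96);       # two-tailed, about .05
printf "%.4f\n", p_from_z(1.96, 1);    # one-tailed, half the two-tailed value
```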

stats_hash

 $href = $seq->stats_hash(values => [qw/observed expected variance z_value p_value/]);
 $href = $seq->stats_hash(values => {observed => 1, expected => 1, variance => 1, z_value => 1, p_value => 1});

Returns a hashref with values for any of the methods for the specified statistic (e.g., observed() value for runs). The named argument values is for an array-ref of stats that correspond to the method names for the given stat (which is really a class name, e.g., runs, pot, for a Statistics::Sequences sub-module). The hash-reference of stat-values as keys (also shown in example above) is only in place for the purpose of setting optional args per value in a future version.

Include other required or optional arguments relevant to any of the values requested, as defined in the sub-module manpages, e.g., ccorr if getting a z_value, tails and exact if getting a p_value, state if testing pot, prob if testing joins. The args precision_s and precision_p apply to all values, although the latter specifically applies to any p_value.
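
The two accepted forms of the values argument can be pictured with a small plain-Perl sketch. The helper requested_values is hypothetical and only shows how an array-ref and a hash-ref can name the same set of methods; it is not the module's own implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: accept 'values' as an array-ref of method names or
# as a hash-ref keyed by method name, and return the names as a sorted list.
sub requested_values {
    my ($values) = @_;
    my @names;
    if (ref($values) eq 'ARRAY') {
        @names = @{$values};
    }
    elsif (ref($values) eq 'HASH') {
        @names = keys %{$values};
    }
    die "No values requested\n" if !@names;
    return sort @names;
}

print join(',', requested_values([qw/observed p_value/])), "\n";             # prints observed,p_value
print join(',', requested_values({ p_value => 1, observed => 1 })), "\n";    # prints observed,p_value
```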

dump

 $seq->dump(stat => 'runs|joins|pot ...', values => {}, format => 'string|labline|table', flag => '1|0', precision_s => 'integer', precision_p => 'integer');

Alias: print_summary

Print results of the last-conducted test to STDOUT. By default, if no parameters to dump are passed, a single line of test statistics is printed. Options are as follows.

values => hashref

Hashref of the statistical parameters to dump. Default is observed value and p-value for the given stat.

flag => boolean

If true, the p-value associated with the z-value is appended with a single asterisk if the value is below .05, and with two asterisks if it is below .01.

If false (default), nothing is appended to the p-value.

format => 'table|labline|csv'

Default is 'csv', to print the stats hash as a comma-separated string (no newline), e.g., '4.0000,0.8596800'. If specifying 'labline', you get something like "observed = 4.0000, p_value = 0.8596800\n". If specifying 'table', this is a dump from Text::SimpleTable with the stat methods as headers and column length set to the maximum required for the given headers, level of precision, flag, etc. For example, with precision_s => 4 and precision_p => 7, you get:

 .-----------+-----------.
 | observed  | p_value   |
 +-----------+-----------+
 | 4.0000    | 0.8596800 |
 '-----------+-----------'

verbose => 1|0

If true, includes a title giving the name of the statistic, details about the hypothesis tested (if p_value => 1 in the values hashref), etc. No effect if format is not defined or equals 'csv'.

precision_s => 'non-negative integer'

Precision of the statistic values (observed, expected, variance, z_value).

precision_p => 'non-negative integer'

Specify rounding of the probability associated with the z-value to so many digits. If zero or undefined, you get everything available.
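
The 'csv' and 'labline' layouts, together with the flag and precision options, can be mimicked in plain Perl as a rough sketch. The helper format_stats and its argument names (stats, as an array-ref of name/value pairs) are hypothetical; the actual output of dump may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper mimicking the documented 'csv' and 'labline' layouts.
# 'stats' is an array-ref of [name, value] pairs in print order; both
# precision arguments are required in this sketch.
sub format_stats {
    my (%args) = @_;
    my $format = defined $args{format} ? $args{format} : 'csv';
    my @cells;
    for my $pair (@{ $args{stats} }) {
        my ($name, $value) = @{$pair};
        my $is_p = $name eq 'p_value';
        my $str = sprintf '%.*f', ($is_p ? $args{precision_p} : $args{precision_s}), $value;
        if ($is_p && $args{flag}) {
            # one asterisk below .05, two below .01
            $str .= $value < .01 ? '**' : $value < .05 ? '*' : '';
        }
        push @cells, [$name, $str];
    }
    return join(',', map { $_->[1] } @cells) if $format eq 'csv';
    return join(', ', map { "$_->[0] = $_->[1]" } @cells) . "\n";
}

my @stats = ([observed => 4], [p_value => 0.85968]);
print format_stats(stats => \@stats, precision_s => 4, precision_p => 7), "\n";
# prints 4.0000,0.8596800
print format_stats(stats => \@stats, precision_s => 4, precision_p => 7, format => 'labline');
# prints observed = 4.0000, p_value = 0.8596800
```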

dump_data

 $seq->dump_data(delim => "\n");

Prints to STDOUT a space-separated line of the tested data - as dichotomized and put to test. Optionally, give a value for delim to specify how the datapoints should be separated. Inherited from Statistics::Data.

DIAGNOSTICS

Requested sequences module '$class' is not available

Croaked when any method is called that is not defined for the sub-module named as stat.

Method '$method' is not defined or correctly called for $class

Croaked when a method, like observed(), is called for a particular class (named by the argument stat in this parent module) but does not exist for it, e.g., 'kurtosis' among the 'pot' statistics; or when the other arguments for the method are invalid, as when it is called without any data.

No values requested to return in hash

Croaked from stats_hash, including via dump(), if the array-ref or hash-ref named values is not given in the call.

Cannot print data-string

Croaked from the dump() method when an attempt to print a string as a single line or as a table (via Text::SimpleTable's draw) fails.

BUNDLING

This module uses its sub-modules implicitly - so a bundled program using this module might need to explicitly use its sub-modules if these need to be included in the bundle itself.
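
Because the sub-modules are loaded implicitly, a bundler that only scans for explicit use statements may miss them; listing the needed ones (e.g., use Statistics::Sequences::Runs;) near the top of the program makes the dependency visible. As a complementary sketch, the following checks at run-time which sub-modules can actually be loaded. The helper available_submodules is hypothetical and not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper: given sub-module short names, return the full class
# names of those that can be loaded in the current environment.
sub available_submodules {
    my @found;
    for my $name (@_) {
        my $class = "Statistics::Sequences::$name";
        (my $file = "$class.pm") =~ s{::}{/}g;    # class name to relative path
        push @found, $class if eval { require $file; 1 };
    }
    return @found;
}

my @found = available_submodules(qw/Joins Pot Runs Turns Vnomes/);
print @found ? "Loadable: @found\n" : "No sub-modules are loadable\n";
```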

AUTHOR

Roderick Garton, <rgarton at cpan.org>

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Statistics::Sequences

LICENSE AND COPYRIGHT

This program is free software. It may be used, redistributed and/or modified under the same terms as Perl-5.6.1 (or later) (see http://www.perl.com/perl/misc/Artistic.html).

Disclaimer

To the maximum extent permitted by applicable law, the author of this module disclaims all warranties, either express or implied, including but not limited to implied warranties of merchantability and fitness for a particular purpose, with regard to the software and the accompanying documentation.