NAME
Statistics::Sequences - Manage sequences (ordered lists of literals) for testing their runs, joins, turns, trinomes, potential energy, etc.
VERSION
This is documentation for Version 0.14 of Statistics::Sequences.
SYNOPSIS
use Statistics::Sequences 0.14;
$seq = Statistics::Sequences->new();
my @data = (1, 'a', 'a', 1); # ordered list of literal scalars (numbers, strings), as permitted by specific test
$seq->load(\@data); # or @data or dataname => \@data
print $seq->observed(stat => 'runs'); # or expected, variance, z_value, p_value - assuming submodule Runs.pm is installed
print $seq->test(stat => 'vnomes', length => 2); # assuming submodule Vnomes.pm is installed
$seq->dump(stat => 'runs', values => {observed => 1, z_value => 1, p_value => 1}, exact => 1, tails => 1);
# see also Statistics::Data for inherited methods
DESCRIPTION
Loading, updating and accessing data as ordered lists of literal scalars (numbers, strings) for statistical tests of their sequential structure via Statistics::Sequences::Joins, Statistics::Sequences::Pot, Statistics::Sequences::Runs, Statistics::Sequences::Turns and Statistics::Sequences::Vnomes. Note that none of these submodules is installed by default; to use this module as intended, install one or more of them.
To access the tests, use this base module to create a Statistics::Sequences object with new, then load data into it and access each test by calling the test method, specifying the stat attribute: either joins, pot, runs, turns or vnomes, where the relevant submodule is installed. This allows running several tests on the same data, as the data are immediately available to each test (of joins, pot, runs, turns or vnomes). See the SYNOPSIS for a simple example.
Alternatively, use each submodule directly, and restrict analyses to the submodule's test; this module is used implicitly as their base. That is, to perform a test of one type (e.g., runs), use the relevant subpackage, load data via its constructor; see the SYNOPSIS for the particular test, i.e., Joins, Pot, Runs, Turns or Vnomes. You won't be able to access other tests of the same data by this approach, unless you create another object for that test, and then specifically pass the data from the earlier object into the new one.
SUBROUTINES/METHODS
new
$seq = Statistics::Sequences->new();
Returns a new Statistics::Sequences object (inherited from Statistics::Data) by which all the methods for caching, reading and testing data can be accessed, including each of the methods for performing the Runs, Joins, Pot, Turns or Vnomes tests.
Subpackages also have their own new method, so that, e.g., Statistics::Sequences::Runs can be individually imported and its own new method called:
use Statistics::Sequences::Runs;
$runs = Statistics::Sequences::Runs->new();
In this case, data are not automatically shared across packages, and only one test (in this case, the Runs test) can be accessed through the class object returned by new.
load, add, access, unload
All these operations on the basic data are inherited from Statistics::Data; see its documentation for details of these and other possible methods.
Dichotomous data: Both the runs and joins tests expect dichotomous data: a binary or binomial or Bernoulli sequence, but with whatever characters to symbolize the two possible events. They test their "loads" to make sure the data are dichotomous. To reduce numerical and categorical data to a dichotomous level, see the pool, match, split, swing, shrink (boolwin) and other methods in Statistics::Data::Dichotomize.
observed, observation
$v = $seq->observed(stat => 'joins|pot|runs|turns|vnomes', %args); # gets data from cache, with any args needed by the stat
$v = $seq->observed(stat => 'joins|pot|runs|turns|vnomes', data => [qw/blah bing blah blah blah/]); # just needs args for the particular stat
$v = $seq->observed(stat => 'joins|pot|runs|turns|vnomes', label => 'myLabelledLoadedData'); # just needs args for the particular stat
Returns the observed value of the statistic for the loaded data, or for data sent with this call, e.g., how many runs there are in the sequence (1, 1, 0, 1). See the particular statistic's manpage for any other required or optional arguments.
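To make the example concrete, here is a minimal sketch in core Perl of what "how many runs" means for a sequence. The function name count_runs is hypothetical and this is not the code used by Statistics::Sequences::Runs; it only illustrates the quantity that observed(stat => 'runs') is described as returning.

```perl
use strict;
use warnings;

# Count runs: maximal stretches of identical, adjacent elements.
# Hand-rolled illustration only; count_runs is a hypothetical name.
sub count_runs {
    my @seq = @_;
    return 0 unless @seq;
    my $runs = 1;    # the first element always opens a run
    for my $i (1 .. $#seq) {
        $runs++ if $seq[$i] ne $seq[$i - 1];    # a change of state opens a new run
    }
    return $runs;
}

print count_runs(1, 1, 0, 1), "\n";    # prints 3: the runs are (1 1), (0), (1)
```

Elements are compared as strings, so the same sketch works for sequences such as ('a', 'a', 'b').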
expected, expectation
$v = $seq->expected(stat => 'joins|pot|runs|turns|vnomes', %args); # gets data from cache, with any args needed by the stat
$v = $seq->expected(stat => 'joins|pot|runs|turns|vnomes', data => [qw/blah bing blah blah blah/]); # just needs args for the particular stat
Returns the expected value of the statistic for the loaded data, or for data sent with this call, e.g., how many runs should occur in a 4-length sequence of two possible events. See the statistic's manpage for any other required or optional arguments.
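As an illustration of an expected value, the following sketch computes the expected number of runs under the classical Wald-Wolfowitz formula, E = 2*n1*n2/n + 1, where n1 and n2 are the counts of the two states and n is their sum. That this is the formula used by Statistics::Sequences::Runs is an assumption here; check that module's manpage. The function name expected_runs is hypothetical.

```perl
use strict;
use warnings;

# Expected number of runs in a dichotomous sequence (Wald-Wolfowitz formula).
# Assumption: illustration only, not taken from the module's source.
sub expected_runs {
    my @seq = @_;
    my %freq;
    $freq{$_}++ for @seq;    # tally each state
    die "need exactly two distinct states\n" unless keys(%freq) == 2;
    my ($n1, $n2) = values %freq;
    my $n = $n1 + $n2;
    return 2 * $n1 * $n2 / $n + 1;
}

print expected_runs(1, 1, 0, 1), "\n";    # prints 2.5 (n1 = 3, n2 = 1, n = 4)
```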
variance
$v = $seq->variance(stat => 'joins|pot|runs|turns|vnomes', %args); # gets data from cache, with any args needed by the stat
$v = $seq->variance(stat => 'joins|pot|runs|turns|vnomes', data => [qw/blah bing blah blah blah/]); # just needs args for the particular stat
Returns the variance of the statistic, i.e., the expected spread of its observed value around the expected value for the given number of trials.
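For the runs statistic, the classical Wald-Wolfowitz variance is 2*n1*n2*(2*n1*n2 - n) / (n**2 * (n - 1)). The sketch below computes it in core Perl; whether Statistics::Sequences::Runs uses exactly this expression is an assumption, and runs_variance is a hypothetical name.

```perl
use strict;
use warnings;

# Variance of the number of runs in a dichotomous sequence (Wald-Wolfowitz).
# Assumption: illustration only, not taken from the module's source.
sub runs_variance {
    my @seq = @_;
    my %freq;
    $freq{$_}++ for @seq;    # tally each state
    die "need exactly two distinct states\n" unless keys(%freq) == 2;
    my ($n1, $n2) = values %freq;
    my $n     = $n1 + $n2;
    my $twice = 2 * $n1 * $n2;
    return $twice * ($twice - $n) / ($n**2 * ($n - 1));
}

print runs_variance(1, 1, 0, 1), "\n";    # prints 0.25
```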
obsdev, observed_deviation
$v = $seq->obsdev(stat => 'joins|pot|runs|turns|vnomes', %args); # gets data from cache, with any args needed by the stat
$v = $seq->obsdev(stat => 'joins|pot|runs|turns|vnomes', data => [qw/blah bing blah blah blah/]); # just needs args for the particular stat
Returns the deviation of (difference between) observed and expected values of the statistic for the loaded/given sequence (O - E).
stdev, standard_deviation
$v = $seq->stdev(stat => 'joins|pot|runs|turns|vnomes', %args); # gets data from cache, with any args needed by the stat
$v = $seq->stdev(stat => 'joins|pot|runs|turns|vnomes', data => [qw/blah bing blah blah blah/]); # just needs args for the particular stat
Returns the square root of the variance.
z_value, zscore
$v = $seq->zscore(stat => 'joins|pot|runs|turns|vnomes', %args); # gets data from cache, with any args needed by the stat
$v = $seq->zscore(stat => 'joins|pot|runs|turns|vnomes', data => [qw/blah bing blah blah blah/]); # just needs args for the particular stat
Returns the deviation ratio: the observed deviation divided by the standard deviation. Set the argument ccorr to a true value to apply the continuity correction.
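The deviation ratio can be sketched as follows. The continuity correction shown here, reducing the absolute deviation by 0.5 (and not below zero), is the common textbook form and is an assumption about what ccorr does; see Statistics::Zed for the actual behaviour. The function name z_value_sketch is hypothetical.

```perl
use strict;
use warnings;

# z = (observed - expected) / sqrt(variance), with optional continuity correction.
# Assumption: the correction subtracts 0.5 from the absolute deviation.
sub z_value_sketch {
    my ($obs, $exp, $var, $ccorr) = @_;
    die "variance must be positive\n" unless $var > 0;
    my $dev = $obs - $exp;
    if ($ccorr) {
        my $abs = abs($dev) - 0.5;
        $abs = 0 if $abs < 0;              # never push the deviation past zero
        $dev = $dev < 0 ? -$abs : $abs;    # restore the original sign
    }
    return $dev / sqrt($var);
}

printf "%.4f\n", z_value_sketch(3, 2.5, 0.25);       # prints 1.0000
printf "%.4f\n", z_value_sketch(3, 2.5, 0.25, 1);    # prints 0.0000
```

The inputs in the usage lines (3 observed, 2.5 expected, variance 0.25) are the Wald-Wolfowitz runs values for the sequence (1, 1, 0, 1).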
p_value, test
$p = $seq->test(stat => 'runs');
$p = $seq->test(stat => 'joins');
$p = $seq->test(stat => 'turns');
$p = $seq->test(stat => 'pot', state => 'a value appearing in the data');
$p = $seq->test(stat => 'vnomes', length => 'an integer greater than zero and less than sample-size');
Returns the probability of observing so many runs, joins, etc., versus those expected, relative to the expected variance.
When using a Statistics::Sequences class object, this method requires naming which test to perform, i.e., runs, joins, pot, turns or vnomes. This is not required when the class object already refers to one of the submodules, as created by the new method within Statistics::Sequences::Runs, Statistics::Sequences::Joins, Statistics::Sequences::Pot, Statistics::Sequences::Turns and Statistics::Sequences::Vnomes.
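For tests that rely on a z-value, the p-value is the normal-curve area beyond the observed deviation ratio. A minimal sketch, assuming Perl 5.22 or later (where the core POSIX module exports erfc on request); the function name p_from_z is hypothetical, and the module itself may compute the probability differently (e.g., exactly, or from a chi-square distribution, depending on the test).

```perl
use strict;
use warnings;
use POSIX qw(erfc);    # erfc is available from POSIX in Perl 5.22 and later

# Normal-approximation p-value for a z-value (illustration only).
sub p_from_z {
    my ($z, $tails) = @_;
    $tails = 2 unless defined $tails;
    my $p_two = erfc(abs($z) / sqrt(2));    # area in both tails beyond |z|
    return $tails == 1 ? $p_two / 2 : $p_two;
}

printf "%.4f\n", p_from_z(1);       # prints 0.3173
printf "%.4f\n", p_from_z(1, 1);    # prints 0.1587
```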
Common options
Options common to all the subpackage tests are as follows.
 data => 'string'

Optionally specify the name of the data to be tested. By default, this is not required: the data tested are those that were last loaded, either anonymously, or as returned by one of the Statistics::Data::Dichotomize methods. Otherwise, if the data are already ready for testing in a dichotomous format, data that were previously loaded by name can be individually tested. For example, here are two sets of data that are loaded by name, and then a single test of one of them is performed.
@chimps = (qw/banana banana cheese banana cheese banana banana banana/);
@mice = (qw/banana cheese cheese cheese cheese cheese cheese cheese/);
$seq->load(chimps => \@chimps, mice => \@mice);
$p = $seq->test(stat => 'runs', data => 'chimps');
 ccorr => boolean

Specify whether or not to perform the continuity-correction on the observed deviation. Default is false. Relevant only for those tests relying on a Z-test. See Statistics::Zed.
 tails => 1|2

Specify whether the z-value is calculated for both sides of the normal (or chi-square) distribution (2, the default for most tested data) or only one side (1, the default for data prepared with the swing method).
Test-specific required settings and options
Some subpackage tests need to have parameters defined in the call to test, and/or have specific options, as follows.
Joins : The Joins test optionally allows the setting of a probability value; see test in the Statistics::Sequences::Joins manpage.
Pot : The Pot test requires the setting of a state to be tested; see test in the Statistics::Sequences::Pot manpage.
Vnomes : The Serial test for v-nomes requires a length, i.e., the value of v; see test in the Statistics::Sequences::Vnomes manpage.
Runs, Turns : There are presently no specific requirements or options for the Runs and Turns tests.
stats_hash
$href = $seq->stats_hash(stat => 'runs', values => {observed => 1, expected => 1, variance => 1, z_value => 1, p_value => 1});
Returns a hashref with values for any of the descriptives and probability value relevant to the specified statistic. Include other required or optional arguments relevant to any of the values requested, e.g., ccorr if getting a z_value, tails and exact if getting a p_value, state if testing pot, prob if testing joins, ... precision_s, precision_p ...
dump
$seq->dump(stat => 'runs|joins|pot ...', values => {}, format => 'string|table', flag => '1|0', precision_s => 'integer', precision_p => 'integer');
Alias: print_summary
Print results of the last-conducted test to STDOUT. By default, if no parameters to dump are passed, a single line of test statistics is printed. Options are as follows.
 values => hashref

Hashref of the statistical parameters to dump. Default is the observed value and p-value for the given stat.
 flag => boolean

If true, the p-value associated with the z-value is appended with a single asterisk if the value is below .05, and with two asterisks if it is below .01.
If false (default), nothing is appended to the p-value.
 format => 'table|labline|csv'

Default is 'csv', to print the stats hash as a comma-separated string (no newline), e.g., '4.0000,0.8596800'. If specifying 'labline', you get something like "observed = 4.0000, p_value = 0.8596800\n". If specifying 'table', this is a dump from Text::SimpleTable with the stat methods as headers and column length set to the maximum required for the given headers, level of precision, flag, etc. For example, with precision_s => 4 and precision_p => 7, you get:
.-----------+-----------.
| observed  | p_value   |
+-----------+-----------+
| 4.0000    | 0.8596800 |
'-----------+-----------'
 verbose => 1|0

If true, includes a title giving the name of the statistic, details about the hypothesis tested (if p_value => 1 in the values hashref), etc. No effect if format is not defined or equals 'csv'.
 precision_s => 'non-negative integer'

Precision of the statistic values (observed, expected, variance, z_value).
 precision_p => 'non-negative integer'

Specify rounding of the probability associated with the z-value to so many digits. If zero or undefined, you get everything available.
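How the precision and flag options combine in a 'labline'-style summary can be sketched in core Perl as follows. The helper names format_p and labline are hypothetical and the code is not taken from the module; it only mirrors the behaviour described for the options above.

```perl
use strict;
use warnings;

# Format a p-value to $prec decimal places, optionally flagging significance.
# Hypothetical helper for illustration only.
sub format_p {
    my ($p, $prec, $flag) = @_;
    my $str = $prec ? sprintf('%.*f', $prec, $p) : "$p";    # zero/undef precision: unrounded
    if ($flag) {
        $str .= $p < .01 ? '**' : $p < .05 ? '*' : '';      # ** below .01, * below .05
    }
    return $str;
}

# Build a 'labline'-style string from an observed value and a p-value.
sub labline {
    my ($observed, $p, %opt) = @_;
    my $obs = $opt{precision_s} ? sprintf('%.*f', $opt{precision_s}, $observed) : "$observed";
    return "observed = $obs, p_value = " . format_p($p, $opt{precision_p}, $opt{flag});
}

print labline(4, 0.85968, precision_s => 4, precision_p => 7), "\n";
# prints: observed = 4.0000, p_value = 0.8596800
print format_p(0.0312, 3, 1), "\n";    # prints 0.031*
print format_p(0.004, 3, 1), "\n";     # prints 0.004**
```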
dump_data
$seq->dump_data(delim => "\n");
Prints to STDOUT a space-separated line of the tested data, as dichotomized and put to test. Optionally, give a value for delim to specify how the data-points should be separated. Inherited from Statistics::Data.
BUNDLING
This module uses its submodules implicitly, so a bundled program using this module might need to explicitly use its submodules if these need to be included in the bundle itself.
AUTHOR
Roderick Garton, <rgarton at cpan.org>
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Statistics::Sequences
You can also look for information at:
RT: CPAN's request tracker (report bugs here)
http://rt.cpan.org/NoAuth/Bugs.html?Dist=Statistics-Sequences-0.14
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
Search CPAN
LICENSE AND COPYRIGHT
 Copyright (c) 2006-2016 Roderick Garton

This program is free software. It may be used, redistributed and/or modified under the same terms as Perl 5.6.1 (or later) (see http://www.perl.com/perl/misc/Artistic.html).
 Disclaimer

To the maximum extent permitted by applicable law, the author of this module disclaims all warranties, either express or implied, including but not limited to implied warranties of merchantability and fitness for a particular purpose, with regard to the software and the accompanying documentation.