NAME

dbrvstatdiff - evaluate statistical differences between two random variables

SYNOPSIS

    dbrvstatdiff [-f format] [-c ConfRating] 
        [-h HypothesizedDifference] m1c sd1c n1c m2c sd2c n2c

OR

    dbrvstatdiff [-f format] [-c ConfRating] m1c n1c m2c n2c

DESCRIPTION

Produce statistics on the difference of sets of random variables.

Random variables are specified by:

m1c, m2c

The column names of means of random variables.

sd1c, sd2c

The column names of standard deviations of random variables.

n1c, n2c

The column names of the sample counts for each random variable.

These values can be computed with dbcolstats.

Creates up to ten new columns:

diff

The difference of RV 2 - RV 1.

diff_pct

The percentage difference, (RV2 - RV1) / RV1.

diff_conf_{half,low,high} and diff_conf_pct_{half,low,high}

The half-width of the confidence interval, and its low and high bounds, for the absolute and the relative (percentage) difference.

t_test

The T-test value for the given hypothesized difference.

t_test_result

Given the confidence rating, does the test pass? Will be either "rejected" or "not-rejected".

t_test_break

The hypothesized value that is the break-even point for the T-test.

t_test_break_pct

Break-even point as a percent of m1c.

Confidence intervals are not printed if standard deviations are not provided. Confidence intervals assume normal distributions with common variances.
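The arithmetic behind the diff, diff_pct and diff_conf_* columns can be sketched with awk. This is an illustration of a standard pooled-variance confidence interval, assuming normal distributions with a common variance as stated above; it is not taken from the dbrvstatdiff source. The function name rv_conf, its argument order, and the practice of passing the Student's t critical value in by hand are all assumptions made for the example.

```shell
# Sketch only: pooled-variance confidence interval for (RV2 - RV1).
# rv_conf and its argument order are hypothetical, not part of Fsdb.
# tcrit is the two-sided Student's t critical value, looked up by the caller.
rv_conf() {
    awk -v m1="$1" -v sd1="$2" -v n1="$3" \
        -v m2="$4" -v sd2="$5" -v n2="$6" -v tcrit="$7" 'BEGIN {
        diff = m2 - m1
        diff_pct = 100 * diff / m1
        # Pooled variance, assuming both RVs share a common variance.
        sp2 = ((n1 - 1) * sd1 * sd1 + (n2 - 1) * sd2 * sd2) / (n1 + n2 - 2)
        # Half-width of the interval: critical value times standard error.
        half = tcrit * sqrt(sp2 * (1 / n1 + 1 / n2))
        printf "%.5g %.5g %.5g %.5g %.5g\n", diff, diff_pct, half, diff - half, diff + half
    }'
}

# Values from the first sample in SAMPLE USAGE; 2.365 is the tabulated
# t(0.975) for 5 + 4 - 2 = 7 degrees of freedom.
rv_conf 0.17 0.0020 5 0.22 0.0010 4 2.365
```

With these inputs the five printed values agree with the diff, diff_pct, diff_conf_half, diff_conf_low and diff_conf_high columns of the first sample output.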

T-tests are only computed if a hypothesized difference is provided. Hypothesized differences should be preceded by <=, >=, or =. T-tests assume normal distributions with common variances.
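As a rough illustration of such a test, the sketch below computes a textbook pooled-variance t statistic for the null hypothesis (RV2 - RV1) <= h0 and compares it with a one-sided critical value. The helper name rv_ttest, the caller-supplied critical value, and the decision rule are assumptions for the example; the exact rule dbrvstatdiff applies is not documented here.

```shell
# Sketch only: pooled-variance t statistic for H0: (RV2 - RV1) <= h0.
# rv_ttest is a hypothetical helper; tcrit is a one-sided critical value
# supplied by the caller, and the decision rule is an assumption.
rv_ttest() {
    awk -v m1="$1" -v sd1="$2" -v n1="$3" \
        -v m2="$4" -v sd2="$5" -v n2="$6" -v h0="$7" -v tcrit="$8" 'BEGIN {
        # Pooled variance and standard error of the difference of means.
        sp2 = ((n1 - 1) * sd1 * sd1 + (n2 - 1) * sd2 * sd2) / (n1 + n2 - 2)
        se = sqrt(sp2 * (1 / n1 + 1 / n2))
        t = ((m2 - m1) - h0) / se
        printf "%.5g %s\n", t, (t > tcrit ? "rejected" : "not-rejected")
    }'
}

# First-sample values with h0 = 0; 1.895 is the tabulated t(0.95) for 7
# degrees of freedom.
rv_ttest 0.17 0.0020 5 0.22 0.0010 4 0 1.895
```

Here the observed difference is far larger than its standard error, so the sketch reports the hypothesis as rejected.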

OPTIONS

-c FRACTION or --confidence FRACTION

Specify FRACTION for the confidence interval. Defaults to 0.95 for a 95% confidence factor (alpha = 0.05).

-f FORMAT or --format FORMAT

Specify a printf(3)-style format for output statistics. Defaults to %.5g.
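To see what a given format does to a value before passing it with -f, the shell's own printf can be used, assuming it supports floating-point conversions (common shells do). The number below is arbitrary.

```shell
# The default %.5g keeps five significant digits and drops trailing zeros.
printf '%.5g\n' 29.4117647
printf '%.5g\n' 0.0500000

# A fixed-point alternative with two decimal places.
printf '%.2f\n' 29.4117647
```

The first two lines print 29.412 and 0.05; the last prints 29.41.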

-h DIFF or --hypothesis DIFF

Specify the hypothesized difference as DIFF, where DIFF is something like <=0 or >=0, etc.
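The shape of a DIFF argument (a comparison operator followed by a number) can be illustrated with a small shell function. split_hypothesis is a hypothetical helper written for this example, not the parser dbrvstatdiff uses. Note that DIFF must be quoted on the command line, since an unquoted < or > would be taken by the shell as a redirection.

```shell
# Sketch only: split a DIFF such as '<=0' into its operator and value.
# split_hypothesis is a hypothetical helper, not dbrvstatdiff's own parser.
split_hypothesis() {
    case "$1" in
        '<='*) printf '<= %s\n' "${1#??}" ;;   # strip the two-character operator
        '>='*) printf '>= %s\n' "${1#??}" ;;
        '='*)  printf '= %s\n' "${1#?}" ;;     # strip the one-character operator
        *)     echo "missing <=, >= or = prefix: $1" >&2; return 1 ;;
    esac
}

# Quote DIFF so the shell does not treat < or > as a redirection.
split_hypothesis '<=0'
split_hypothesis '>=1.5'
```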

This module also supports the standard fsdb options:

-d

Enable debugging output.

-i or --input InputSource

Read from InputSource, typically a file name, or - for standard input, or (if in Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.

-o or --output OutputDestination

Write to OutputDestination, typically a file name, or - for standard output, or (if in Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.

--autorun or --noautorun

By default, programs process automatically, but Fsdb::Filter objects in Perl do not run until you invoke the run() method. The --(no)autorun option controls that behavior within Perl.

--help

Show help.

--man

Show full manual.

SAMPLE USAGE

Input:

    #fsdb title mean2 stddev2 n2 mean1 stddev1 n1
    example6.12 0.17 0.0020 5 0.22 0.0010 4

Command:

    cat data.fsdb | dbrvstatdiff mean2 stddev2 n2 mean1 stddev1 n1

Output:

    #fsdb title mean2 stddev2 n2 mean1 stddev1 n1 diff diff_pct diff_conf_half diff_conf_low diff_conf_high diff_conf_pct_half diff_conf_pct_low diff_conf_pct_high
    example6.12 0.17    0.0020  5       0.22    0.0010  4       0.05    29.412  0.0026138       0.047386        0.052614        1.5375  27.874  30.949
    #  | dbrvstatdiff mean2 stddev2 n2 mean1 stddev1 n1

Input 2:

(example 7.10 from Scheaffer and McClave):

    #fsdb title n1 x1 sd1 n2 x2 sd2
    example7.10 9 35.22 24.44 9 31.56 20.03

Command 2:

    dbrvstatdiff -h '<=0' x2 sd2 n2 x1 sd1 n1

Output 2:

    #fsdb title n1 x1 sd1 n2 x2 sd2 diff diff_pct diff_conf_half diff_conf_low diff_conf_high diff_conf_pct_half diff_conf_pct_low diff_conf_pct_high t_test t_test_result
    example7.10 9 35.22 24.44 9 31.56 20.03 3.66 0.11597 4.7125 -1.0525 8.3725 0.14932 -0.033348 0.26529 1.6465 not-rejected
    #  | /global/us/edu/ucla/cs/ficus/users/johnh/BIN/DB/dbrvstatdiff -h <=0 x2 sd2 n2 x1 sd1 n1

Case 3:

A common use case is to have one file with a set of trials from two experiments, and to use dbrvstatdiff to see if they are different.

Input 3:

    #fsdb case trial value
    a  1  1
    a  2  1.1
    a  3  0.9
    a  4  1
    a  5  1.1
    b  1  2
    b  2  2.1
    b  3  1.9
    b  4  2
    b  5  1.9

Command 3:

    cat two_trial.fsdb | 
        dbmultistats -k case value |
        dbcolcopylast mean stddev n |
        dbrvstatdiff mean stddev n copylast_mean copylast_stddev copylast_n

Output 3:

SEE ALSO

Fsdb. dbcolstats. dbcolcopylast.

CLASS FUNCTIONS

new

    $filter = new Fsdb::Filter::dbrvstatdiff(@arguments);

Create a new dbrvstatdiff object, taking command-line arguments.

set_defaults

    $filter->set_defaults();

Internal: set up defaults.

parse_options

    $filter->parse_options(@ARGV);

Internal: parse command-line arguments.

setup

    $filter->setup();

Internal: set up and parse headers.

run

    $filter->run();

Internal: run over each row.

AUTHOR and COPYRIGHT

Copyright (C) 1991-2008 by John Heidemann <johnh@isi.edu>

This program is distributed under the terms of the GNU General Public License, version 2. See the file COPYING with the distribution for details.