NAME

Statistics::Basic::LeastSquareFit - find the least square fit for two lists

SYNOPSIS

A machine to calculate the Least Square Fit of given vectors x and y.

The module returns the alpha and beta that satisfy this formula:

$y = $beta * $x + $alpha

for a given set of x and y co-ordinate pairs.

Say you have the set of Cartesian coordinates:

my @points = ( [1,1], [2,2], [3,3], [4,4] );

The simplest way to find the LSF is as follows:

use Statistics::Basic qw(:all);

my $lsf = lsf()->set_size(int @points);
$lsf->insert(@$_) for @points;

Or this way:

my $xv  = vector( map {$_->[0]} @points );   # x is the first element of each pair
my $yv  = vector( map {$_->[1]} @points );   # y is the second element of each pair
my $lsf = lsf($xv, $yv);

And then either query the values or print them like so:

print "The LSF for $xv and $yv: $lsf\n";
my ($yint, $slope) = my ($alpha, $beta) = $lsf->query;

LSF is meant for finding a line of best fit. $beta is the slope of the line and $alpha is the y-offset. Suppose you want to draw the line. Use these methods to calculate the y for a given x, or vice versa:

my $y = $lsf->y_given_x( 7 );
my $x = $lsf->x_given_y( 7 );

(Note that x_given_y() can sometimes produce a divide-by-zero error since it has to divide by $beta, which is zero for a horizontal line.)
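
The arithmetic behind these two calls follows directly from the formula. Here is a minimal sketch in plain Perl. The helper names line_y() and line_x() are hypothetical stand-ins, not part of the module, and returning nothing for a zero slope is an assumption of this sketch rather than the module's documented behavior:

```perl
use strict;
use warnings;

# Hypothetical stand-in for y_given_x(): evaluate the line at x.
sub line_y {
    my ($alpha, $beta, $x) = @_;
    return $beta * $x + $alpha;
}

# Hypothetical stand-in for x_given_y(): invert the line.
# Returns nothing (undef in scalar context) when beta is zero,
# because a horizontal line has no single x for a given y.
sub line_x {
    my ($alpha, $beta, $y) = @_;
    return if $beta == 0;
    return ($y - $alpha) / $beta;
}

my ($alpha, $beta) = (1, 2);    # the line y = 2*x + 1
print "y at x=7: ", line_y($alpha, $beta, 7), "\n";
print "x at y=7: ", line_x($alpha, $beta, 7), "\n";
```

Checking the slope before dividing avoids the divide-by-zero case without needing eval.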

Create a 20 point "moving" LSF like so:

use Statistics::Basic qw(:all nofill);

my $sth = $dbh->prepare("select x,y from points where something");
my $len = 20;
my $lsf = lsf()->set_size($len);

$sth->execute or die $dbh->errstr;
$sth->bind_columns( \my ($x, $y) ) or die $dbh->errstr;

while( $sth->fetch ) {
    $lsf->insert( $x, $y );

    # Only report once the object has been filled with $len points.
    if( $lsf->query_filled ) {
        my ($yint, $slope) = $lsf->query;
        print "LSF: y= $slope*x + $yint\n";
    }

    # This would also work:
    # print "$lsf\n" if $lsf->query_filled;
}
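
The idea of a "moving" fit can also be illustrated without a database or the module. This rough sketch in plain Perl assumes that the window holds the most recent $len points and that nothing is reported until the window is full. The helpers fit_points() and moving_fits() are hypothetical names and are not part of Statistics::Basic:

```perl
use strict;
use warnings;

# Hypothetical helper: least-squares ($alpha, $beta) for a list of [x, y] pairs.
# Returns an empty list when all x values are equal (no unique line).
sub fit_points {
    my @pts = @_;
    my $n   = @pts;
    my ($sx, $sy, $sxx, $sxy) = (0, 0, 0, 0);
    for my $p (@pts) {
        my ($x, $y) = @$p;
        $sx  += $x;
        $sy  += $y;
        $sxx += $x * $x;
        $sxy += $x * $y;
    }
    my $den = $n * $sxx - $sx * $sx;
    return if $den == 0;
    my $beta  = ($n * $sxy - $sx * $sy) / $den;
    my $alpha = ($sy - $beta * $sx) / $n;
    return ($alpha, $beta);
}

# Hypothetical helper: feed points one at a time, keep only the last $len,
# and collect one [alpha, beta] pair each time the window is full.
sub moving_fits {
    my ($len, @stream) = @_;
    my (@window, @fits);
    for my $p (@stream) {
        push @window, $p;
        shift @window while @window > $len;
        next if @window < $len;          # window not yet filled
        my @ab = fit_points(@window);
        push @fits, \@ab if @ab;
    }
    return @fits;
}

# Six points on the line y = 3*x + 2, fitted with a 3-point window.
my @stream = map { [ $_, 3 * $_ + 2 ] } 1 .. 6;
for my $fit ( moving_fits(3, @stream) ) {
    my ($yint, $slope) = @$fit;
    print "LSF: y = $slope*x + $yint\n";
}
```

Refitting the whole window on every insert keeps the sketch short. It says nothing about how the module updates its own state.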

METHODS

This list skips the methods inherited from Statistics::Basic::_TwoVectorBase (such as insert() and ginsert()).

new()

Create a new Statistics::Basic::LeastSquareFit object. This constructor takes two arguments, each of which can be either an arrayref or a Statistics::Basic::Vector object. It is called when the leastsquarefit() shortcut function is called.

query()

LSF is meant for finding a line of best fit. $beta is the slope of the line and $alpha is the y-offset.

my ($alpha, $beta) = $lsf->query;

y_given_x()

Automatically calculate the y-value on the line for a given x-value.

my $y = $lsf->y_given_x( 7 );

x_given_y()

Automatically calculate the x-value on the line for a given y-value.

my $x = $lsf->x_given_y( 7 );

x_given_y() can sometimes produce a divide-by-zero error since it has to divide by $beta. Wrapping the call in eval might be helpful:

if( defined( my $x = eval { $lsf->x_given_y(7) } ) ) {
    print "x (given y=7): $x\n";

} else {
    warn "there is no x value for 7";
}

query_vector1()

Returns the Statistics::Basic::Vector object for the first vector used in the computation of alpha and beta.

query_vector2()

Returns the Statistics::Basic::Vector object for the second vector used in the computation of alpha and beta.

query_mean1()

Returns the Statistics::Basic::Mean object for the first vector used in the computation of alpha and beta.

query_variance1()

Returns the Statistics::Basic::Variance object for the first vector used in the computation of alpha and beta.

query_covariance()

Returns the Statistics::Basic::Covariance object used in the computation of alpha and beta.
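
How the mean, variance and covariance combine into alpha and beta is not spelled out here. This minimal sketch in plain Perl assumes the textbook closed form for a least-squares line: beta = covariance(x, y) / variance(x), and alpha = mean(y) - beta * mean(x). The helper names mean() and least_square_fit() are hypothetical, and the module's internals may differ in detail (for example, in how a zero variance is handled):

```perl
use strict;
use warnings;

# Hypothetical helper: arithmetic mean of a list.
sub mean {
    my @v   = @_;
    my $sum = 0;
    $sum += $_ for @v;
    return $sum / @v;
}

# Hypothetical helper: returns ($alpha, $beta) for two equal-length arrayrefs.
sub least_square_fit {
    my ($xs, $ys) = @_;
    my ($mean_x, $mean_y) = ( mean(@$xs), mean(@$ys) );

    # Sums of products of deviations. The 1/n factors of the covariance
    # and the variance cancel in the ratio, so they are omitted.
    my ($cov, $var) = (0, 0);
    for my $i (0 .. $#$xs) {
        $cov += ($xs->[$i] - $mean_x) * ($ys->[$i] - $mean_y);
        $var += ($xs->[$i] - $mean_x) ** 2;
    }
    die "x has zero variance; the slope is undefined\n" if $var == 0;

    my $beta  = $cov / $var;                  # slope
    my $alpha = $mean_y - $beta * $mean_x;    # y-offset
    return ($alpha, $beta);
}

# Four points on the line y = 2*x + 1.
my ($alpha, $beta) = least_square_fit( [1, 2, 3, 4], [3, 5, 7, 9] );
print "y = $beta * x + $alpha\n";
```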