# NAME

Statistics::Basic::LeastSquareFit - find the least square fit for two lists

# SYNOPSIS

A machine to calculate the Least Square Fit of given vectors x and y.

The module returns the alpha and beta that satisfy this formula:

``    \$y = \$beta * \$x + \$alpha``

for a given set of x and y coordinate pairs.
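
For reference, the textbook ordinary least-squares estimates for such a line can be written in terms of the means, variance, and covariance of the two lists. This is the standard formula rather than something stated by the module itself, so treat it as an assumption about what is computed internally; it does line up with the mean, variance, and covariance accessors listed under METHODS.

```latex
\beta = \frac{\operatorname{cov}(x, y)}{\operatorname{var}(x)},
\qquad
\alpha = \bar{y} - \beta\,\bar{x}
```

Because the slope is a ratio, the result is the same whether the covariance and variance are both divided by $n$ or both by $n - 1$.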

Say you have the set of Cartesian coordinates:

``    my @points = ( [1,1], [2,2], [3,3], [4,4] );``

The simplest way to find the LSF is as follows:

``````    my \$lsf = lsf()->set_size(int @points);
\$lsf->insert(@\$_) for @points;``````

Or this way:

``````    my \$xv  = vector( map {\$_->[0]} @points );
my \$yv  = vector( map {\$_->[1]} @points );
my \$lsf = lsf(\$xv, \$yv);``````

And then either query the values or print them like so:

``````    print "The LSF for \$xv and \$yv: \$lsf\n";
# \$alpha (a.k.a. \$yint) is the y-offset; \$beta (a.k.a. \$slope) is the slope
my (\$yint, \$slope) = my (\$alpha, \$beta) = \$lsf->query;``````
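
To show what values to expect from the example points, here is a minimal sketch that computes the same kind of fit by hand in plain Perl. The `fit_line` helper is hypothetical and is not part of Statistics::Basic; it assumes the textbook least-squares formula (covariance over variance), which may differ in detail from the module's internals.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper (not part of Statistics::Basic): ordinary least-squares
# fit of y = beta * x + alpha over a list of [x, y] pairs.
sub fit_line {
    my @points = @_;
    my $n      = @points;
    my $mean_x = sum( map { $_->[0] } @points ) / $n;
    my $mean_y = sum( map { $_->[1] } @points ) / $n;

    # Population covariance of x and y, and population variance of x.
    my $cov = sum( map { ( $_->[0] - $mean_x ) * ( $_->[1] - $mean_y ) } @points ) / $n;
    my $var = sum( map { ( $_->[0] - $mean_x )**2 } @points ) / $n;

    die "all x values are identical; the slope is undefined\n" if $var == 0;

    my $beta  = $cov / $var;
    my $alpha = $mean_y - $beta * $mean_x;
    return ( $alpha, $beta );
}

my @points = ( [1,1], [2,2], [3,3], [4,4] );
my ( $alpha, $beta ) = fit_line(@points);
print "alpha=$alpha beta=$beta\n";    # prints "alpha=0 beta=1"
```

The four points lie exactly on the line y = x, so the slope is 1 and the y-offset is 0.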

LSF is meant for finding a line of best fit: `\$beta` is the slope of the line and `\$alpha` is the y-offset (the y-intercept). Suppose you want to draw the line. Use these to calculate the `y` for a given `x` or vice versa:

``````    my \$y = \$lsf->y_given_x( 7 );
my \$x = \$lsf->x_given_y( 7 );``````

(Note that `x_given_y()` can produce a divide-by-zero error when `\$beta` is zero, since it has to divide by `\$beta`.)
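
The arithmetic behind these two lookups can be sketched with plain functions. The names `y_for_x` and `x_for_y` are hypothetical stand-ins, not the module's methods, and returning `undef` for a zero slope is a design choice of this sketch rather than documented module behavior.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the lookups, written as plain functions over an
# (alpha, beta) pair describing the line y = beta * x + alpha.
sub y_for_x {
    my ( $alpha, $beta, $x ) = @_;
    return $beta * $x + $alpha;
}

sub x_for_y {
    my ( $alpha, $beta, $y ) = @_;
    return if $beta == 0;    # horizontal line: no unique x, avoid dividing by zero
    return ( $y - $alpha ) / $beta;
}

my ( $alpha, $beta ) = ( 1, 2 );    # the line y = 2x + 1

my $y = y_for_x( $alpha, $beta, 7 );
print "y (given x=7): $y\n";        # prints "y (given x=7): 15"

my $x = x_for_y( $alpha, $beta, 7 );
print defined $x ? "x (given y=7): $x\n" : "no unique x for y=7\n";
```

Checking the slope before dividing is an alternative to trapping the error with `eval`.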

Create a 20 point "moving" LSF like so:

``````    use Statistics::Basic qw(:all nofill);

my \$sth = \$dbh->prepare("select x,y from points where something");
my \$len = 20;
my \$lsf = lsf()->set_size(\$len);

\$sth->execute or die \$dbh->errstr;
\$sth->bind_columns( my (\$x, \$y) ) or die \$dbh->errstr;

my \$count = \$len;
while( \$sth->fetch ) {
    \$lsf->insert( \$x, \$y );
    if( defined( my (\$yint, \$slope) = \$lsf->query ) ) {
        print "LSF: y= \$slope*x + \$yint\n";
    }

    # This would also work:
    # print "\$lsf\n" if \$lsf->query_filled;
}``````
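
The idea of a "moving" fit can also be sketched without the module or a database. The `make_moving_fit` helper below is hypothetical: it keeps only the most recent pairs and returns nothing until the window is full, which is an assumption about what "filled" means here rather than a description of how `nofill` is implemented.

```perl
use strict;
use warnings;

# Hypothetical sketch: keep only the most recent $size pairs and report a fit
# once the window is full. Returns a closure that accepts one (x, y) pair.
sub make_moving_fit {
    my ($size) = @_;
    my @window;

    return sub {
        my ( $x, $y ) = @_;
        push @window, [ $x, $y ];
        shift @window while @window > $size;    # drop the oldest pair
        return if @window < $size;              # not filled yet: no result

        my $n = @window;
        my ( $sx, $sy, $sxx, $sxy ) = ( 0, 0, 0, 0 );
        for my $p (@window) {
            $sx  += $p->[0];
            $sy  += $p->[1];
            $sxx += $p->[0]**2;
            $sxy += $p->[0] * $p->[1];
        }
        my $denom = $n * $sxx - $sx**2;
        return if $denom == 0;                  # all x equal: slope undefined

        my $beta  = ( $n * $sxy - $sx * $sy ) / $denom;
        my $alpha = ( $sy - $beta * $sx ) / $n;
        return ( $alpha, $beta );
    };
}

my $insert = make_moving_fit(3);
for my $p ( [1,2], [2,4], [3,6], [4,9] ) {
    # The list assignment is false while the closure returns an empty list.
    if ( my ( $yint, $slope ) = $insert->(@$p) ) {
        print "LSF: y = $slope*x + $yint\n";
    }
}
```

With a window of 3, the first two pairs print nothing; a line is reported for the third pair and again, over the shifted window, for the fourth.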

# METHODS

This list skips the methods inherited from Statistics::Basic::_TwoVectorBase (things like insert() and ginsert()).

new()

Create a new Statistics::Basic::LeastSquareFit object. This function takes two arguments, each of which can be either an arrayref or a Statistics::Basic::Vector object. It is called when the leastsquarefit() shortcut function is called.

query()

Returns the `\$alpha` and `\$beta` of the line of best fit. `\$beta` is the slope of the line and `\$alpha` is the y-offset.

``    my (\$alpha, \$beta) = \$lsf->query;``

y_given_x()

Automatically calculate the y-value on the line for a given x-value.

``    my \$y = \$lsf->y_given_x( 7 );``

x_given_y()

Automatically calculate the x-value on the line for a given y-value.

``    my \$x = \$lsf->x_given_y( 7 );``

`x_given_y()` can produce a divide-by-zero error when `\$beta` is zero, since it has to divide by `\$beta`. This might be helpful:

``````    if( defined( my \$x = eval { \$lsf->x_given_y(7) } ) ) {
    print "x (given y=7): \$x\n";

} else {
    warn "there is no x value for 7";
}``````

query_vector1()

Returns the Statistics::Basic::Vector object for the first vector used in the computation of alpha and beta.

query_vector2()

Returns the Statistics::Basic::Vector object for the second vector used in the computation of alpha and beta.

query_mean1()

Returns the Statistics::Basic::Mean object for the first vector used in the computation of alpha and beta.

query_variance1()

Returns the Statistics::Basic::Variance object for the first vector used in the computation of alpha and beta.

query_covariance()

Returns the Statistics::Basic::Covariance object used in the computation of alpha and beta.

# AUTHOR

Paul Miller `<jettero@cpan.org>`