NAME

Statistics::GaussHelmert - General weighted least squares estimation

VERSION

This document refers to version 0.03

SYNOPSIS

use Statistics::GaussHelmert;

# create an empty model
my $estimation = new Statistics::GaussHelmert;

# setup the model given observations $y, covariance matrices
# $Sigma_yy, an initial guess $b0 for the unknown parameters.
$estimation->observations($y);
$estimation->covariance_observations($Sigma_yy);
$estimation->initial_guess($b0);

# specify the implicit model function and its Jacobians by using
# closures. 
$estimation->observation_equations(sub { ... });
$estimation->Jacobian_unknowns(sub { ... });
$estimation->Jacobian_observations(sub { ... });

# Maybe we want to impose some constraints on the unknown
# parameters, this is not mandatory
$estimation->constraints(sub { ... });
$estimation->Jacobian_constraints(sub { ... });

# start estimation
$estimation->start(verbose => 1);

# print result
print $estimation->estimated_unknown(),
   $estimation->covariance_unknown();

DESCRIPTION

This module is a flexible tool for estimating model parameters given a set of observations. The module provides function for a linear estimation model, the underlying model is called Gauss-Helmert model.

Statistics::GaussHelmert is different to modules such as Statistics::OLS in the sense that it may fit arbitrary functions to data of any dimensions. You have to specify an implicit minimization function (in contrast to explicit functions as in traditional regression methods) and its derivatives with respects to the unknown and the observations. You may also specify constraint function on the unknowns (with its derivative). Furthermore you already need an approximate solution. For some problems it is easy finding approximate solutions by directly solving for the unknown parameters with some well chosen observations.

Setting up the model

Assume you have a some measurements (or observations) which are related to some unknown parameters that you want to know. For example, you measure points in 2D which should lie on an unknown line.

The points that you measure in 2D will be put in a PDL::Matrix object $y (a so-called multipiddle i.e. multi-PDL object, a vector of vectors). The unknown parameter vector $b is again a PDL::Matrix object, just a vector. Additionally, you should have an idea on how precise your measurements are and specify this in a covariance matrix - actually again a multipiddle: a vector of matrices.

Before you use this module to solve your problem, you first have to sit down and calculate the equations which relate your observations with the unknown parameters. The Gauss-Helmert model requires that you specify this equation in the form w($y,$b) = 0, where $y are the observations (lined up in a PDL::Matrix vector of vectors, see below) and $b is the unknown parameter vector. This equation w($y,$b) = 0 is called "observation equation". Note that in general this is a "vector equation", i.e. a set of scalar equations.

Then you have to think about, what constraints you want to impose on the unkown parameter vector $b. It could be that you want it to have the (euclidean) length 1, or you want to set one parameter to a certain value. So you specify a function of the form h($b) = 0, again in general being a "vector equation".

To do the minimization according to the model w($y,$b)=0 and h($b) = 0, you also need to specify the Jacobians of the observation function w($y,$b) and the constraint function h($b). We need the Jacobians with respect to the observation vector $y and the unknown vector $b. Right now it is not possible to do numerical differentiation - in that case the Jacobians wouldn't be needed.

Coding the model

Let us say you have an array @y of n different observation vectors, each of the vectors being a PDL::Matrix vector, for example generated by the special constructor vpdl from PDL::Matrix. To put then in the estimation, you have line them up in a multipiddle $y using $y = PDL::cat(@y).

The same applies for the set of covariance matrices @Sigma_yy for the elements of the array @y.

The observation function and its Jacobians (also the constraints on the unknown and its Jacobians) may be defined as closures. The Jacobian closures return PDL::Matrix matrices, the dimensions depend on the dimensions of the vectors in @y and $b.

You may want to look at some examples in the t/ directory or the example/ directory for more detailed information.

Furthermore you may pass some options to the estimation:

verbose() ; verbose(2) ; verbose("/path/to/my/logfile")

If this variable is set to 1, a short protocol is printed on STDOUT, If it is set to 2, a more elaborated output is written to the file "./GaussHelmert.log". You may also pass a filename to verbose so the elaborated output will go to this file.

The value of verbose() defaults to 0, i.e. no output at all.

max_iteration() ; max_iteration(10)

returns or sets the maximum number of iteration which may be reached if the abort criteria is not met. Default is 15.

eps() ; eps($small_number)

Returns or sets a small number which is used for the abort criteria of the iteration. This refers to the largest difference of change that is allowed in the unknown vector divided by its standard deviation. It defaults to 10^(-2), this means we want to be within a precision of 1% of the standard deviation.

Estimation results

You may now start and retrieve the results:

start() ; start(verbose => 2)

Starts the estimation and returns the object itself.

estimated_unknown()

returns the estimated unknown parameter vector as a PDL::Matrix object.

sigma0_squared()

returns the estimated covariance factor.

covariance_unknown()

returns the estimated covariance matrix, without the covariance factor.

estimated_observations()

returns the fitted observations as a multipiddle.

estimated_unknowns_iterations()

returns a list of estimated unknown parameter vectors for each iteration.

number_of_iterations()

returns the used number of iterations.

EXAMPLE USAGE

See t/ and example/ folder (to be expanded).

NOT YET DOCUMENTED

There is a subclass Statistics::GaussHelmertBlocks, which can deal with blocks of observation vectors. This is not documented yet, but used in Math::UncertainGeometry.

LITERATURE

Press et al. (1992) Numerical Recipes in C, 2nd Ed., Cambridge University Press, Chapter 15.

Chapter 15, "Modeling of Data", deals with general weighted least squares estimation, though it describes the Levenber-Marquardt method in more detail. Additionally, it is assumed that only one maesurement is observed.

Mikhail, E.M. and Ackermann, F. (1976): Observations and Least Squares University Press of America

This book covers the classical Gauss-Helmert model.

Koch, K. (1999) Parameter Estimation and Hypothesis Testing in Linear Models, Springer Verlag, 2nd edition

This book covers the Gauss-Markoff model, a cousin of the Gauss-Helmert model, but modelling explicit functions.

DIAGNOSTICS

Not really done yet, meanwhile check the output with the verbose flag.

BUGS

Probably there are some, but this function has been extensively and succesfully tested in conjunction with the Math::UncertainGeometry library.

TODO

numerical differentiation instead of explicit Jacobians
Better diagnostics, more examples
Speed Optimization not done at all.
More examples
A more complete TODO list

SEE ALSO

This module uses the Perl Data Language, PDL, especially PDL::Matrix, to perform matrix operations.

For a complex example on how this module is used, see Math::UncertainGeometry::Estimation for estimating points, lines and planes in 2D and 3D.

AUTHOR

Stephan Heuel (perl@heuel.org)

COPYRIGHT

Copyright (C) 2000/2001 Stephan Heuel and Institute for Photogrammetry, University of Bonn. All rights reserved. There is no warranty. You are allowed to redistribute this software / documentation under certain conditions, see the GNU Public License, GPL for details, http://www.gnu.org/copyleft/gpl.html.