The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Math::Polynomial::Solve - Find the roots of polynomial equations.

SYNOPSIS

  use Math::Complex;  # The roots may be complex numbers.
  use Math::Polynomial::Solve qw(poly_roots);

  my @x = poly_roots(@coefficients);

or

  use Math::Complex;  # The roots may be complex numbers.
  use Math::Polynomial::Solve qw(:numeric);  # See the EXPORT section

  #
  # Find roots using the matrix method.
  #
  my @x = poly_roots(@coefficients);

  #
  # Alternatively, use the classical methods instead of the matrix
  # method if the polynomial degree is less than five.
  #
  poly_option(hessenberg => 0);
  @x = poly_roots(@coefficients);

or

  use Math::Complex;  # The roots may be complex numbers.
  use Math::Polynomial::Solve qw(:classical);  # See the EXPORT section

  #
  # Find the polynomial roots using the classical methods.
  #

  # Find the roots of ax + b
  my @x1 = linear_roots($a, $b);

  # Find the roots of ax**2 + bx +c
  my @x2 = quadratic_roots($a, $b, $c);

  # Find the roots of ax**3 + bx**2 +cx + d
  my @x3 = cubic_roots($a, $b, $c, $d);

  # Find the roots of ax**4 + bx**3 +cx**2 + dx + e
  my @x4 = quartic_roots($a, $b, $c, $d, $e);

or use Math::Complex; # The roots may be complex numbers. use Math::Polynomial; use Math::Polynomial::Solve qw(:classical ascending_order);

  ascending_order(1); # Change default coefficient order for M::P::S.

  #
  # Form 8*x**3 - 6*x - 1
  #
  my $p1 = Math::Polynomial->new(-1, -6, 0, 8);

  #
  # Use Math::Polynomial's coefficient order.
  # If ascending_order() had not been called,
  # the statement would be:
  #
  # my @roots = poly_roots(reverse $p1->coefficients);
  #
  my @roots = poly_roots($p1->coefficients);

or

  use Math::Polynomial::Solve qw(:utility);

  my @coefficients = (89, 23, 23, 432, 27);

  # Return a version of the polynomial with no leading zeroes
  # and the leading coefficient equal to 1.
  my @monic_form = simplified_form(@coefficients);

  # Find the y-values of the polynomial at selected x-values.
  my @xvals = (0, 1, 2, 3, 5, 7);
  my @yvals = poly_evaluate(\@coefficients, \@xvals);

or

  use Math::Polynomial::Solve qw(:sturm);

  # Find the number of unique real roots of the polynomial.
  my $no_of_unique_roots = poly_real_root_count(@coefficients);

DESCRIPTION

This package supplies a set of functions that find the roots of polynomials, along with some utility functions.

Roots will be either real or of type Math::Complex.

Functions making use of the Sturm sequence are also available, letting you find the number of real roots present in a range of X values.

In addition to the root-finding functions, the internal functions have also been exported for your use.

EXPORTED BY DEFAULT

ascending_order()

Changes the default order of the coefficents to the functions.

When Math::Polynomial::Solve was originally written, it followed the calling convention of Math::Polynomial, using the highest degree coefficient, followed by the next highest degree coefficient, and so on in descending order.

Later Math::Polynomial was re-written, and the order of the coefficients were put in ascending order, e.g.:

  use Math::Polynomial;

  #
  # Create the polynomial 8*x**3 - 6*x - 1.
  #
  $fpx = Math::Polynomial->new(-1, -6, 0, 8);

If you use Math::Polynomial with this module, it will probably be more convenient to change the default parameter list of Math::Polynomial::Solve's functions, using the ascending_order() function:

  use Math::Polynomial;
  use Math::Polynomial::Solve qw(:classical :numeric);

  ascending_order(1);

  my $fp4 = Math::Polynomial->interpolate([1 .. 4], [14, 19, 25, 32]);

  #
  # Find roots of $fp4.
  #
  my @fp4_roots = quartic_roots($fp4->coefficients);

or

  my @fp4_roots = poly_roots($fp4->coefficients);

If ascending_order(1) had not been called, the previous line of code would have been written instead as

  my @fp4_roots = poly_roots(reverse $fp4->coefficients);

The function is a temporary measure to help with the change in the API when version 3.00 of this module is released. At that point coefficients will be in ascending order by default, and you will need to use ascending_order(0) to use the old (current) style, although you will get a deprecation warning.

Numeric Functions

These are the functions that calculate the polynomial's roots through numeric algorithms. They are all exported under the tag "numeric".

poly_roots()

Returns the roots of a polynomial equation, regardless of degree. Unlike the other root-finding functions, it will check for coefficients of zero for the highest power, and 'step down' the degree of the polynomial to the appropriate case. Additionally, it will check for coefficients of zero for the lowest power terms, and add zeros to its root list before calling one of the root-finding functions.

By default, poly_roots() will use the Hessenberg matrix method for solving polynomials. This can be changed by calling "poly_options()".

The method of poly_roots() is almost equivalent to

  @x = hqr_eigen_hessenberg(
        balance_matrix(build_companion(@coefficients))
        );

except this wouldn't check for leading and trailing zero coefficients, and it ignores the settings of poly_options().

poly_option()

Set options that affect the behavior of the poly_roots() function. All options are set to either 1 ("on") or 0 ("off"). See also "poly_iteration()" and "poly_tolerance()".

Options may be set and saved:

  #
  # Set a few options...
  #
  poly_option(hessenberg => 0, root_function => 1);

  #
  # Get all of the current options and their values.
  #
  my %all_options = poly_option();

  #
  # Set some options but save the former option values
  # for later.
  #
  my %changed_options = poly_option(hessenberg => 1, varsubst => 1);

The current options available are:

hessenberg

Use the QR Hessenberg matrix method to solve the polynomial. By default, this is set to 1. If set to 0, poly_roots() uses one of the classical root-finding functions listed below, if the degree of the polynomial is four or less.

root_function

Use the root() function from Math::Complex if the polynomial is of the form ax**n + c. This will take precedence over the other solving methods.

varsubst

Reduce polynomials to a lower degree through variable substitution, if possible.

For example, with varsubst set to one and the polynomial to solve being 9x**6 + 128x**3 + 21, poly_roots() will reduce the polynomial to 9y**2 + 128y + 21 (where y = x**3), and solve the quadratic (either classically or numerically, depending on the hessenberg option). Taking the cube root of each quadratic root completes the operation.

This has the benefit of having a simpler matrix to solve, or if the hessenberg option is set to zero, has the effect of being able to use one of the classical methods on a polynomial of high degree. In the above example, the order-six polynomial gets solved with the quadratic_roots() function if the hessenberg option is zero.

Currently the variable substitution is fairly simple and will only look for gaps of zeros in the coefficients that are multiples of the prime numbers less than or equal to 31 (2, 3, 5, et cetera).

build_companion

Creates the initial companion matrix. Returns an array of arrays (the internal representation of a matrix). This may be used as an argument to the Math::Matrix contructor:

  my @cm = build_companion(@coef);

  my $m = Math::Matrix->new(@cm);
  $m->print();

The Wikipedia article at http://en.wikipedia.org/wiki/Companion_matrix/ has more information on the subject.

balance_matrix

Balances the matrix (makes the rows and columns have similar norms) by applying a matrix transformation with a diagonal matrix of powers of two. This is used to help prevent any rounding errors that occur if the elements of the matrix differ greatly in magnitude.

hqr_eigen_hessenberg

Returns the roots of the polynomial equation by solving the matrix created by build_companion() and balance_matrix(). See "poly_roots()".

Classical Functions

These are the functions that solve polynomials via the classical methods. Quartic, cubic, quadratic, and even linear equations may be solved with these functions. They are all exported under the tag "classical".

"poly_roots()" will use these functions if the hessenberg option is set to 0, and if the degree of the polynomial is four or less.

The leading coefficient $a must always be non-zero for the classical functions.

linear_roots()

Here for completeness's sake more than anything else. Returns the solution for

  ax + b = 0

by returning -b/a. This may be in either a scalar or an array context.

quadratic_roots()

Gives the roots of the quadratic equation

  ax**2 + bx + c = 0

using the well-known quadratic formula. Returns a two-element list.

cubic_roots()

Gives the roots of the cubic equation

  ax**3 + bx**2 + cx + d = 0

by the method described by R. W. D. Nickalls (see the "ACKNOWLEDGMENTS" section below). Returns a three-element list. The first element will always be real. The next two values will either be both real or both complex numbers.

quartic_roots()

Gives the roots of the quartic equation

  ax**4 + bx**3 + cx**2 + dx + e = 0

using Ferrari's method (see the "ACKNOWLEDGMENTS" section below). Returns a four-element list. The first two elements will be either both real or both complex. The next two elements will also be alike in type.

Sturm Functions

These are the functions that create and make use of the Sturm sequence. They are all exported under the tag "sturm".

poly_real_root_count()

Return the number of unique, real roots of the polynomial.

  $unique_roots = poly_real_root_count(@coefficients);

For example, the equation (x + 3)**3 forms the polynomial x**3 + 9x**2 + 27x + 27, but since all three of its roots are identical, poly_real_root_count(1, 9, 27, 27) will return 1.

Likewise, poly_real_root_count(1, -8, 25) will return 0 because the two roots of x**2 -8x + 25 are both complex.

This function is the all-in-one function to use instead of

  my @chain = poly_sturm_chain(@coefficients);

  return sturm_sign_count(sturm_sign_minus_inf(\@chain)) -
          sturm_sign_count(sturm_sign_plus_inf(\@chain));

if you don't intend to use the Sturm chain for anything else.

sturm_real_root_range_count()

Return the number of unique, real roots of the polynomial between two X values.

  my($x0, $x1) = (0, 1000);

  my @chain = poly_sturm_chain(@coefficients);
  $unique_roots = sturm_real_root_range_count(\@chain, $x0, $x1);

This is equivalent to:

  my($x0, $x1) = (0, 1000);

  my @chain = poly_sturm_chain(@coefficients);
  my @signs = sturm_sign_chain(\@chain, [$x0, $x1]);
  $no_unique_roots = sturm_sign_count(@{$signs[0]}) - sturm_sign_count(@{$signs[1]});

sturm_bisection_roots()

Return the real roots counted by "sturm_real_root_range_count()". Uses the bisection method combined with sturm_real_root_range_count() to narrow the range to a single root, then uses "laguerre()" to find its value.

  my($from, $to) = (-1000, 0);
  my @chain = poly_sturm_chain(@coefficients);
  my @roots = sturm_bisection_roots(\@chain, $from, $to);

As it is using the Sturm functions, it will find only the real roots.

poly_sturm_chain()

Returns the chain of Sturm functions used to evaluate the number of roots of a polynomial in a range of X values.

If you feed in a sequence of X values to the Sturm functions, you can tell where the (real, not complex) roots of the polynomial are by counting the number of times the Y values change sign.

See "poly_real_root_count()" above for an example of its use.

Sturm Sign Sequence Functions

sturm_sign_chain()

sturm_sign_minus_inf()

sturm_sign_plus_inf()

These functions return the array of signs that are used by the functions "poly_real_root_count()" and "sturm_real_root_range_count()" to find the number of real roots in a polynomial.

In normal use you will probably never need to use them, unless you want to examine the internals of the Sturm functions:

  #
  # Examine the sign changes that occur at each endpoint of
  # the x range.
  #
  my(@coefficients) = (1, 4, 7, 23);
  my(@xvals) = (-12, 12);

  my @chain = poly_sturm_chain( @coefficients);
  my @signs = sturm_sign_chain(\@chain, \@xvals);  # An array of arrays.

  print "\nPolynomial: [", join(", ", @coefficients), "]\n";

  foreach my $j (0..$#signs)
  {
    my @s = @{$signs[$j]};
    print $xval[$j], "\n",
          "\t", join(", ", @s), "], sign count = ",
          sturm_sign_count(@s), "\n\n";
  }

Similar examinations can be made at plus and minus infinity:

  #
  # Examine the sign changes that occur between plus and minus
  # infinity.
  #
  my @coefficients = (1, 4, 7, 23);

  my @chain = poly_sturm_chain( @coefficients);
  my @smi = sturm_sign_minus_inf(\@chain);
  my @spi = sturm_sign_plus_inf(\@chain);

  print "\nPolynomial: [", join(", ", @coefficients), "]\n";

  print "Minus Inf:\n",
        "\t", join(", ", @smi), "], sign count = ",
        sturm_sign_count(@smi), "\n\n";

  print "Plus Inf:\n",
        "\t", join(", ", @spi), "], sign count = ",
        sturm_sign_count(@spi), "\n\n";

sturm_sign_count()

Counts and returns the number of sign changes in a sequence of signs, such as those returned by the "Sturm Sign Sequence Functions"

See "poly_real_root_count()" and "sturm_real_root_range_count()" for examples of its use.

Utility Functions

These are internal functions used by the other functions listed above that may also be useful to the user, or which affect the behavior of other functions. They are all exported under the tag "utility".

epsilon()

Returns the machine epsilon value that was calculated when this module was loaded.

The value may be changed, although this in general is not recommended.

  my $old_epsilon = epsilon($new_epsilon);

The previous value of epsilon may be saved to be restored later.

The Wikipedia article at http://en.wikipedia.org/wiki/Machine_epsilon/ has more information on the subject.

fltcmp()

Compare two floating point numbers within a degree of accuracy.

Like most functions ending in "cmp", this one returns -1 if the first argument tests as less than the second argument, 1 if the first tests greater than the second, and 0 otherwise. Comparisons are made within a tolerance range that may be set with "poly_tolerance()".

  #
  # Set a very forgiving comparison tolerance.
  #
  poly_tolerance(fltcmp => 1e-5);
  my @x = poly_roots(@cubic);
  my @y = poly_evaluate(\@cubic, \@x);

  if (fltcmp($y[0], 0.0) == 0 and
      fltcmp($y[1], 0.0) == 0 and
      fltcmp($y[2], 0.0) == 0)
  {
    print "Roots found: (", join(", ", @x), ")\n";
  }
  else
  {
    print "Problem root-finding for [", join(", ", @cubic), "]\n";
  }

laguerre()

A numerical method for finding a root of an equation, especially made for polynomials.

  @roots = laguerre(\@coefficients, \@xvalues);
  push @roots, laguerre(\@coefficients, $another_xvalue);

For each x value the function will attempt to find a root closest to it. The function will return real roots only.

This is the function used by "sturm_bisection_roots()" after narrowing its search to a range containing a single root.

newtonraphson()

Like "laguerre()", a numerical method for finding a root of an equation.

  @roots = newtonraphson(\@coefficients, \@xvalues);
  push @roots, newtonraphson(\@coefficients, $another_xvalue);

For each x value the function will attempt to find a root closest to it. The function will return real roots only.

This function is provided as an alternative to laguerre(). It is not used internally by any other functions.

poly_iteration()

Sets the limit to the number of iterations that a solving method may go through before giving up trying to find a root. Each method of root-finding used by "poly_roots()", "sturm_bisection_roots()", and "laguerre()" has its own iteration limit, which may be found, like "poly_option()", simply by looking at the return value of poly_iteration().

  #
  # Get all of the current iteration limits.
  #
  my %its_limits = poly_iteration();

  #
  # Double the limit for the hessenberg method, but set the limit
  # for Laguerre's method to 20.
  #
  my %old_limits = poly_iteration(hessenberg => $its_limits{hessenberg} * 2,
                      laguerre => 20);

  #
  # Reset the limits with the former values, but save the values we had
  # for later.
  #
  my %hl_limits = poly_iteration(%old_limits);

There are iteration limit values for:

hessenberg

The numeric method used by poly_roots(), if the hessenberg option is set. Its default value is 60.

laguerre

The numeric method used by "laguerre()". Laguerre's method is used within sturm_bisection_roots() once it has narrowed its search in on an individual root, and of course laguerre() may be called independently. Its default value is 60.

newtonraphson

The numeric method used by newtonraphson(). The Newton-Raphson method is offered as an alternative to Laguerre's method. Its default value is 60.

sturm_bisection

The bisection method used to find roots within a range. Its default value is 100.

poly_tolerance()

Set the degree of accuracy needed for comparisons to be equal or roots to be found. Amongst the root finding functions this currently only affects laguerre() and newtonraphson(), as the Hessenberg matrix method determines how close it needs to get using a complicated formula based on "epsilon()".

  #
  # Print the tolerances.
  #
  my %tolerances = poly_tolerance();
  print "Default tolerances:\n";
  foreach my $k (keys %tolerances)
  {
    print "$k => ", $tolerances{$k}, "\n";
  }

  #
  # Quadruple the tolerance for Laguerre's method.
  #
  poly_tolerance(laguerre => 4 * $tolerances{laguerre});

Tolerances may be set for:

laguerre

The numeric method used by laguerre(). Laguerre's method is used within sturm_bisection_roots() once an individual root has been found within a range, and of course it may be called independently.

newtonraphson

The numeric method used by newtonraphson(). Newton-Raphson is, like Laguerre's method, a method for finding a root near the starting X value.

fltcmp

A comparison function that determines if one argument is less than, equal to, or greater than, the other. Comparisons are made within a range determined by the tolerance.

poly_derivative()

  @derivative = poly_derivative(@coefficients);

Returns the coefficients of the first derivative of the polynomial. Leading zeros are removed before returning the derivative, so the length of the returned polynomial may be even shorter than expected from the length of the original polynomial. Returns an empty list if the polynomial is a simple constant.

poly_antiderivative()

Returns the coefficients of the antiderivative of the polynomial. The constant term is set to zero; to override this use

  @integral = poly_antiderivative(@coefficients);
  $integral[$#integral] = $const_term;

simplified_form()

Return the polynomial adjusted by removing any leading zero coefficients and placing it in a monic polynomial form (all coefficients divided by the coefficient of the highest power).

poly_evaluate()

Returns the values of the polynomial given a list of arguments. Unlike most of the above functions, this takes the reference of the coefficient list, which lets the function take a single x-value or multiple x-values passed in as a reference.

The function may return a list...

  my @coefficients = (1, -12, 0, 8, 13);
  my @xvals = (0, 1, 2, 3, 5, 7);
  my @yvals = poly_evaluate(\@coefficients, \@xvals);

  print "Polynomial: [", join(", ", @coefficients), "]\n";

  for my $j (0..$#yvals)
  {
    print "Evaluates at ", $xvals[$j], " to ", $yvals[$j], "\n";
  }

or return a scalar.

  my $x_median = ($xvals[0] + $xvals[$#xvals])/2.0;
  my $y_median = poly_evaluate(\@coefficients, $x_median);

poly_derivaluate();

Given an X value, returns the y-values of the polynomial, its first derivative, and its second derivative.

  my($y, $dy, $ddy) = poly_derivaluate(\@coefficients, $x);

Note that unlike "poly_evaluate()", this takes a single x-value.

If the polynomial is a linear equation, the second derivative value will be zero. Similarly, if the "equation" is a constant, the first derivative value will be zero.

poly_nonzero_term_count()

Returns a simple count of the number of coefficients that aren't zero.

poly_constmult()

Simple function to multiply all of the coefficients by a constant. Like poly_evaluate(), uses the reference of the coefficient list.

  my @coefficients = (1, 7, 0, 12, 19);
  my @coef3 = poly_constmult(\@coefficients, 3);

poly_divide()

Divide one polynomial by another. Like poly_evaluate(), the function takes a reference to the coefficient list. It returns a reference to both a quotient and a remainder.

  my @coefficients = (1, -13, 59, -87);
  my @polydiv = (3, -26, 59);
  my($q, $r) = poly_divide(\@coefficients, \@polydiv);
  my @quotient = @$q;
  my @remainder = @$r;

EXPORT

Currently there is one default export, ascending_order, although this function will be deprecated after version 3.00 of this module is released.

The remaining functions may be individually named in an export list, but there are also four export tags: classical, numeric, sturm, and utility.

ACKNOWLEDGMENTS

The cubic

The cubic is solved by the method described by R. W. D. Nickalls, "A New Approach to solving the cubic: Cardan's solution revealed," The Mathematical Gazette, 77, 354-359, 1993.

Dr. Nickalls was kind enough to send me his article, with notes and revisions, and directed me to a Matlab script that was based on that article, written by Herman Bruyninckx, of the Dept. Mechanical Eng., Div. PMA, Katholieke Universiteit Leuven, Belgium. This function is an almost direct translation of that script, and I owe Herman Bruyninckx for creating it in the first place.

Beginning with version 2.51, Dr. Nikalls's paper is included in the references directory of this package. Dr. Nickalls has also made his paper available at http://www.nickalls.org/dick/papers/maths/cubic1993.pdf.

This article is also available on http://www.2dcurves.com/cubic/cubic.html, and there is a nice discussion of his method at http://www.sosmath.com/algebra/factor/fac111/fac111.html.

Dick Nickalls, dick@nickalls.org

Herman Bruyninckx, Herman.Bruyninckx@mech.kuleuven.ac.be, has web page at http://www.mech.kuleuven.ac.be/~bruyninc. His matlab cubic solver is at http://people.mech.kuleuven.ac.be/~bruyninc/matlab/cubic.m.

Andy Stein has written a version of cubic.m that will work with vectors. It is included with this package in the eg directory.

The quartic

The method for quartic solution is Ferrari's, as described in the web page Karl's Calculus Tutor at http://www.karlscalculus.org/quartic.html. I also made use of some short cuts mentioned in web page Ask Dr. Math FAQ, at http://forum.swarthmore.edu/dr.math/faq/faq.cubic.equations.html.

Quintic and higher.

Back when this module could only solve polynomials of degrees 1 through 4, Matz Kindahl, the original author of Math::Polynomial, suggested the poly_roots() function. Later on, Nick Ing-Simmons, who was working on a perl binding to the GNU Scientific Library, sent a perl translation of Hiroshi Murakami's Fortran implementation of the QR Hessenberg algorithm, and it fit very well into the poly_roots() function. Quintics and higher degree polynomials can now be solved, albeit through numeric analysis methods.

Hiroshi Murakami's Fortran routines were at http://netlib.bell-labs.com/netlib/, but do not seem to be available from that source anymore. However, his files have been located and are now included in the references/qralg directory.

He referenced the following articles:

  • R. S. Martin, G. Peters and J. H. Wilkinson, "The QR Algorithm for Real Hessenberg Matrices", Numer. Math. 14, 219-231(1970).

  • B. N. Parlett and C. Reinsch, "Balancing a Matrix for Calculation of Eigenvalues and Eigenvectors", Numer. Math. 13, 293-304(1969).

    Fortran code for this routine is at http://netlib.sandia.gov/eispack/balanc.f, and is the basis for "balance_matrix()".

  • Alan Edelman and H. Murakami, "Polynomial Roots from Companion Matrix Eigenvalues", Math. Comp., v64,#210, pp.763-776(1995).

For an overview (and useful algorithms), this is probably the book to start with.

Sturm's Sequence and Laguerre's Method

  • D�rrie, Heinrich. 100 Great Problems of Elementary Mathematics; Their History and Solution. New York: Dover Publications, 1965. Translated by David Antin.

Discusses Charles Sturm's 1829 paper with an eye towards mathematical proof rather than an algorithm, but is still very useful.

  • Glassner, Andrew S. Graphics Gems. Boston: Academic Press, 1990.

The chapter "Using Sturm Sequences to Bracket Real Roots of Polynomial Equations" (by D. G. Hook and P. R. McAree) has a clearer description of the actual steps needed to implement Sturm's method.

  • Acton, Forman S. Numerical Methods That Work. New York: Harper & Row, Publishers, 1970.

Lively, opinionated book on numerical equation solving. I looked it up when it became obvious that everyone was quoting Acton when discussing Laguerre's method.

Newton-Raphson

Commonly known as Newton's method. Almost every introduction to calculus text book will have a section on it; a Wikipedia article is at http://en.wikipedia.org/wiki/Newton%27s_method.

SEE ALSO

  • Forsythe, George E., Michael A. Malcolm, and Cleve B. Moler Computer Methods for Mathematical Computations. Prentice-Hall, 1977.

  • William Press, Brian P. Flannery, Saul A. Teukolsky, and William T. Vetterling Numerical Recipes in C. Cambridge University Press, 1988. http://www.nr.com/.

AUTHOR

John M. Gamble may be found at jgamble@cpan.org