Newsgroups: sci.math.num-analysis
From: hbaker@netcom.com (Henry G. Baker)
Subject: FAQ: # real poly roots in an interval, perhaps (oo,oo)
Date: Wed, 7 Feb 1996 17:03:43 GMT
Summary: Sturm chains/sequences, gcd-with-multipliers(p(x),p'(x))

From  "100 Great Problems of Elementary Mathematics: Their History and
Solution", by Heinrich Doerrie, translated by David Antin, Dover,
1965.  ISBN 0-486-61348-8.

24.  Sturm's Problem of the Number of Roots

"Find the number of real roots of an algebraic equation with real
coefficients over a given interval".

This very important algebraic problem was solved in a surprisingly
simple way in 1829 by the French mathematician Charles Sturm
(1803-1855).  The paper containing the famous Sturm theorem appeared
in the eleventh volume of the Bulletin des sciences de Ferussac and
bears the title, "Memoire sur la resolution des equations numeriques."

"With this discovery," says Liouville, "Sturm at once simplified and
perfected the elements of algebra, enriching them with new results."

SOLUTION.  We distinguish two cases:

I.  The real roots of the equation in question are all simple over the
given interval.

II.  The equation also possesses multiple real roots over the interval.

We will first show that the second case leads us back to the first.

Let the prescribed equation F(x)=0 have the distinct roots alpha,
beta, gamma, ..., and let the root alpha be a-fold, beta b-fold, gamma
c-fold, ..., so that

F(x) = (x-alpha)^a (x-beta)^b (x-gamma)^c ...

For the derivative F'(x) of F(x) we obtain

F'(x)/F(x) = a/(x-alpha) + b/(x-beta) + c/(x-gamma) + ...

             a(x-beta)(x-gamma)(x-delta)...+b(x-alpha)(x-gamma)(x-delta)...+...
           = ------------------------------------------------------------------
                              (x-alpha)(x-beta)(x-gamma)...

If we then call the numerator of this fraction p(x) and the
denominator q(x) and set the whole rational function F(x)/q(x) equal
to G(x), then

F(x) = G(x) q(x)     and     F'(x) = G(x) p(x).

Now the functions p(x) and q(x) have no common divisor.  (The factor
x-beta of q(x) may, for example, go into all the terms of p(x) except
the second with no remainder.)  It follows from this that G(x) is the
greatest common divisor of F(x) and F'(x).  This can be determined
easily from the divisional algorithm and can therefore be considered
known, as a result of which q(x) is known also.

The equation F(x)=0 then falls into the two equations

q(x)=0     and     G(x)=0,

the first of which possesses only simple roots, while the second can
be further reduced in the same way that F(x)=0 was.

An equation with multiple roots can therefore always be transformed
into equations (with known coefficients) possessing only simple roots.

Consequently, it is sufficient to solve the problem for the first
case.

Let f(x)=0 be an algebraic equation all of whose roots are simple.
The derivative f'(x) of f(x) then vanishes for none of these roots and
the highest common divisor of the functions f(x) and f'(x) is a
constant K that differs from zero.  We use the divisional algorithm to
determine the highest common divisor of f(x) and f'(x), writing, for
the sake of convenience in representation, f0(x) and f1(x) instead of
f(x) and f'(x), and calling the quotients resulting from the
successive divisions q0(x), q1(x), q2(x), ... and the remainders
-f2(x), -f3(x), ....

If we also drop the argument sign for the sake of brevity, we obtain the
following scheme:

(0)     f0 = q0 f1 - f2

(1)     f1 = q1 f2 - f3

(2)     f2 = q2 f3 - f4, etc.

In this scheme there must at last appear -- at the very latest with
the remainder K -- a remainder -fs(x) that does not vanish at any
point of the interval and consequently possesses the same sign over
the whole interval.  Here we break off the algorithm.  The functions
involved

f0, f1, f2, ..., fs

form a "Sturm chain" and in this connections are called Sturm functions.

The Sturm functions possess the following three properties:

1.  Two neighboring functions do not vanish simultaneously at any
point of the interval.

2.  At a null point of a Sturm function its two neighboring functions
are of different sign.

3.  Within a sufficiently small area surrounding a zero point of
f0(x), f1(x) is everywhere greater than zero or everywhere smaller
than zero.

PROOF OF 1.  If, for example, f2 and f3 vanish at any point of an
interval, f4 [according to (2)] also vanishes at this point, and
consequently f5 also [according to (3)], and so forth, so that finally
[according to the last line of the algorithm] fs also vanishes, which,
however, contradicts our assumption.

PROOF OF 2.  If the function f3 vanishes at the point sigma, for
example, of the interval, then it follows from (2) that

f2(sigma) = -f4(sigma).

PROOF OF 3.  This proof follows from the known theorem: A function
[f0(x)] rises or falls at a point depending on whether its derivative
[f1(x)] at that point is greater or smaller than zero.

We now select any point x of the interval, note the sign of the values
f0(x), f1(x), ..., fs(x), and obtain a "Sturm sign chain" (to obtain
an unequivocal sign, however, it must be assumed that none of the
designated s+1 function values is zero).  The sign chain will contain
sign sequences (++ and --) and sign changes (+- and -+).

We will consider the number Z(x) of sign changes in the sign chain and
the changes undergone by Z(x) when x passes through the interval.  A
change can occur only if one or more of the Sturm functions changes
sign, i.e., passes over from negative (positive) values through zero
to positive (negative) values.  We will accordingly study the effect
produced on Z(x) by the passage of a function fv(x) through zero.

Let k be a point at which fv disappears, h a point situated to the
left, and l a point to the right of k and so close to k that over the
interval h to l the following holds true:

(1) fv(x) does not vanish except when x=k;

(2) every neighbor (fv+1,fv-1) of fv does not change sign.

We must distinguish between the cases v>0 and v=0; in the first case
we are concerned with the triplet fv-1, fv, fv+1, in the second, with
the pair f0, f1.

In the triplet, fv-1 and fv+1 possess either the + and - sign or the -
and + sign at all three points h, k, l.  Thus, whatever the sign of fv
may be at these points, the triplet possesses one change of sign for
each of the three arguments h, k, l.  The passage through zero of the
function fv does not change the number of sign changes in the chain!

In the pair, f1 has either the + or - sign at all three points h, k,
l.  In the first case, f0 is increasing and is thus negative at h and
positive at l.  In the second case, f0 is decreasing and is positive
at point h, and negative at l.  In both cases a sign change is lost.

From  our investigation we learn that: The Sturm sign chain undergoes a
change in the number of sign changes Z(x) only when x passes through a
null point of f(x); and specifically, the chain then loses (with an
increasing x) exactly one sign change.  Thus, if x passes through the
interval (the ends of which do not represent roots of f(x)=0) from
left to right, the sign chain loses exactly as many sign changes as
there are null points of f(x) within the interval.

Result:

STURM'S THEOREM: The number of real roots of an algebraic equation
with real coefficients whose real roots are simple over an interval
the end points of which are not roots is equal to the difference
between the numbers of sign changes of the Sturm sign chains formed
for the interval ends.

NOTE.  The same considerations can also be applied unchanged to the
series formed when we multiply f0, f1, f2, ..., fs by any positive
constants; this series is then likewise designated as a Sturm chain.
In the formation of the Sturm function chain all fractional
coefficients are accordingly avoided.

EXAMPLE 1.  Determine the number and situation of the real roots of
the equation x^5 - 3 x - 1 = 0.

The Sturm chain is

f0 = x^5 - 3 x - 1,

f1 = 5 x^4 - 3,

f2 = 12 x + 5,

f3 = 1.

The signs of f for x = -2, -1, 0, +1, +2 are

-----------------------
 x | f0 | f1 | f2 | f3
-----------------------
-2 |  - |  + |  - |  +
-1 |  + |  + |  - |  +
 0 |  - |  - |  + |  +
+1 |  - |  + |  + |  +
+2 |  + |  + |  + |  +
-----------------------

The equation thus has three real roots: one between -2 and -1, one
between -1 and 0, one between +1 and +2.  The other two roots are
complex.

EXAMPLE 2.  Determine the number of real roots of the equation
x^5-ax-b=0 when a and b are positive magnitudes and
4^4 a^5 > 5^5 b^4.

The Sturm chain reads

x^5-ax-b,

5x^4-a,

4ax+5b,

4^4 a^5 - 5^5 b^4.

For the values x = -oo [- infinity] and +oo it has the signs -+-+ and
++++, respectively.  The equation has three real roots and two complex
roots.

EOF

==============================================================================
From: rjohnson@apple.com (Robert Johnson)
Newsgroups: sci.math.num-analysis,sci.math
Subject: Re: [Q] Polynomial root interval estimation ...
Date: 8 Feb 1996 19:39:42 -0800

In article <SHREINER.96Jan26143025@timon.cam-ani.co.uk>,
Dave Shreiner <shreiner@sgi.com> wrote:
>   In a book on ray tracing, the author made reference to something I
>believe called "Sturm sequences", which could be used to generate intervals
>in which roots of a polynomial reside in ... or at least that's the
>impression that I got.  Could someone please point me to some additional
>resources or give the cocktail-napkin explanation?

Sturm sequences are used to find how many real roots of a polynomial P
lie between two real numbers.  By dividing P by gcd(P,P'), we can reduce
all multiple roots to simple roots.  Therefore, assume that P has only
simple roots.

Generate the sequence of polynomials { P_k } as follows:

    P_0 = P
    P_1 = P'
    P_k = -(P_{k-2} mod P_{k-1})        for k>1                 [1]

Except for the sign change, [1] is the polynomial Euclidean Algorithm.
Therefore, the last polynomial in the sequence is a constant polynomial
because P and P' are relatively prime.  Furthermore, for k>0,

    if P_k(x) = 0, then P_{k+1}(x) = -P_{k-1}(x)                [2]

This means that if P_k(x) = P_{k-1}(x) = 0, then P is identically 0.
Assume this is not the case.

Now, define Sturm(P,x) to be the number of sign changes in the sequence
{ P_k(x) }.  The only place where Sturm(P,x) can change is where one of
the P_k vanishes.  However, for k>0, [2] says that if P_k(x) = 0, then
P_{k+1}(x) and P_{k-1}(x) have opposite signs.  Therefore, no matter how
P_k behaves near x, it provides no change in Sturm(P,x).

Thus, only where P (P_0) vanishes can Sturm(P,x) change.  If P increases
through 0 at x, then P' (P_1) is positive; thus, Sturm(P,x) decreases by
1.  If P decreases through 0 at x, then P' is negative; thus, Sturm(P,x)
decreases by 1 again.  Thus, Sturm(P,x) decreases by 1 at every root of
P.  Thus, assuming neither P(x) nor P(y) is 0, the number of roots of P
in [x,y] is given by

    Sturm(P,x) - Sturm(P,y)                                     [3]

Rob Johnson
Apple Computer, Inc.
rjohnson@apple.com

==============================================================================

From: rjohnson@apple.com (Robert Johnson)
Newsgroups: sci.math.num-analysis,sci.math
Subject: Re: Determining if the Roots of a Quartic are Real
Date: 9 Feb 1996 03:13:00 -0800

In article <4fa4j3$4ca@news.fsu.edu>,
Matthew Brenneman <brennman@magnet.fsu.edu> wrote:
>I am working with a quartic eqn having the general form:
>
>  x^4 + a3 x^3 + a2 x^2 + a1 x + a0 = 0
>
>What I need to know is:   IF all of the roots are real
>                          THEN is there some condition that the
>                          coefficients have to satisfy?

Use Sturm sequences to count the number of real roots.  Strictly speaking,
Sturm sequences only work for polynomials with simple roots, but dividing
a polynomial P by gcd(P,P') reduces all multiple roots to simple ones.
If all roots of P/gcd(P,P') are real then all roots of P are real.

Here is a copy of an article on Sturm sequences that I just posted to
sci.math and sci.math.num-analysis.

[previous article quoted; now deleted -- djr]

By looking at the degree and sign of the highest order term in each P_k,
we can compute Sturm(P,-oo) - Sturm(P,+oo) which is the number of real
roots of P.

Rob Johnson
Apple Computer, Inc.
rjohnson@apple.com

==============================================================================

Newsgroups: sci.math.num-analysis,sci.math
From: bpc@netcom.com (Benjamin P. Carter)
Subject: Re: Determining if the Roots of a Quartic are Real
Date: Sat, 10 Feb 1996 08:07:40 GMT

rjohnson@apple.com (Robert Johnson) writes:

>In article <4fa4j3$4ca@news.fsu.edu>,
>Matthew Brenneman <brennman@magnet.fsu.edu> wrote:
>>I am working with a quartic eqn having the general form:
>>
>>  x^4 + a3 x^3 + a2 x^2 + a1 x + a0 = 0
>>
>>What I need to know is:   IF all of the roots are real
>>                          THEN is there some condition that the
>>                          coefficients have to satisfy?

I assume the coefficients are all real; otherwise the roots will not
all be real.

>Use Sturm sequences to count the number of real roots. ... [snip]

That is one approach.  Another is to look at discriminants.  If the
discriminant of the given polynomial does not vanish, then there are
no repeated roots.  If the discriminant is negative, then there are
two real roots and two non-real (complex) roots.  If the discriminant
is positive, then either all roots are real, or none of them are.
These two sub-cases can be distinguished by examining the derivative
of the given polynomial.  The sign of the discriminant of the derivative
determines how many real roots it has.  If the derivative has only one
real root, then the given polynomial has < 4 real roots.  If the 
derivative has three real roots, then the outer two are the x values
where the given polynomial has local minima.  If the given polynomial is 
ever negative, then it has some real roots.
-- 
    Ben Carter                  internet address: bpc@netcom.com


==============================================================================
[Actually the discriminant IS part of the chain of polynomials to be
computed when calculating sturm sequences -- djr]
==============================================================================