Newsgroups: sci.math.num-analysis From: hbaker@netcom.com (Henry G. Baker) Subject: FAQ: # real poly roots in an interval, perhaps (oo,oo) Date: Wed, 7 Feb 1996 17:03:43 GMT Summary: Sturm chains/sequences, gcd-with-multipliers(p(x),p'(x)) From "100 Great Problems of Elementary Mathematics: Their History and Solution", by Heinrich Doerrie, translated by David Antin, Dover, 1965. ISBN 0-486-61348-8. 24. Sturm's Problem of the Number of Roots "Find the number of real roots of an algebraic equation with real coefficients over a given interval". This very important algebraic problem was solved in a surprisingly simple way in 1829 by the French mathematician Charles Sturm (1803-1855). The paper containing the famous Sturm theorem appeared in the eleventh volume of the Bulletin des sciences de Ferussac and bears the title, "Memoire sur la resolution des equations numeriques." "With this discovery," says Liouville, "Sturm at once simplified and perfected the elements of algebra, enriching them with new results." SOLUTION. We distinguish two cases: I. The real roots of the equation in question are all simple over the given interval. II. The equation also possesses multiple real roots over the interval. We will first show that the second case leads us back to the first. Let the prescribed equation F(x)=0 have the distinct roots alpha, beta, gamma, ..., and let the root alpha be a-fold, beta b-fold, gamma c-fold, ..., so that F(x) = (x-alpha)^a (x-beta)^b (x-gamma)^c ... For the derivative F'(x) of F(x) we obtain F'(x)/F(x) = a/(x-alpha) + b/(x-beta) + c/(x-gamma) + ... a(x-beta)(x-gamma)(x-delta)...+b(x-alpha)(x-gamma)(x-delta)...+... = ------------------------------------------------------------------ (x-alpha)(x-beta)(x-gamma)... If we then call the numerator of this fraction p(x) and the denominator q(x) and set the whole rational function F(x)/q(x) equal to G(x), then F(x) = G(x) q(x) and F'(x) = G(x) p(x). Now the functions p(x) and q(x) have no common divisor. (The factor x-beta of q(x) may, for example, go into all the terms of p(x) except the second with no remainder.) It follows from this that G(x) is the greatest common divisor of F(x) and F'(x). This can be determined easily from the divisional algorithm and can therefore be considered known, as a result of which q(x) is known also. The equation F(x)=0 then falls into the two equations q(x)=0 and G(x)=0, the first of which possesses only simple roots, while the second can be further reduced in the same way that F(x)=0 was. An equation with multiple roots can therefore always be transformed into equations (with known coefficients) possessing only simple roots. Consequently, it is sufficient to solve the problem for the first case. Let f(x)=0 be an algebraic equation all of whose roots are simple. The derivative f'(x) of f(x) then vanishes for none of these roots and the highest common divisor of the functions f(x) and f'(x) is a constant K that differs from zero. We use the divisional algorithm to determine the highest common divisor of f(x) and f'(x), writing, for the sake of convenience in representation, f0(x) and f1(x) instead of f(x) and f'(x), and calling the quotients resulting from the successive divisions q0(x), q1(x), q2(x), ... and the remainders -f2(x), -f3(x), .... If we also drop the argument sign for the sake of brevity, we obtain the following scheme: (0) f0 = q0 f1 - f2 (1) f1 = q1 f2 - f3 (2) f2 = q2 f3 - f4, etc. In this scheme there must at last appear -- at the very latest with the remainder K -- a remainder -fs(x) that does not vanish at any point of the interval and consequently possesses the same sign over the whole interval. Here we break off the algorithm. The functions involved f0, f1, f2, ..., fs form a "Sturm chain" and in this connections are called Sturm functions. The Sturm functions possess the following three properties: 1. Two neighboring functions do not vanish simultaneously at any point of the interval. 2. At a null point of a Sturm function its two neighboring functions are of different sign. 3. Within a sufficiently small area surrounding a zero point of f0(x), f1(x) is everywhere greater than zero or everywhere smaller than zero. PROOF OF 1. If, for example, f2 and f3 vanish at any point of an interval, f4 [according to (2)] also vanishes at this point, and consequently f5 also [according to (3)], and so forth, so that finally [according to the last line of the algorithm] fs also vanishes, which, however, contradicts our assumption. PROOF OF 2. If the function f3 vanishes at the point sigma, for example, of the interval, then it follows from (2) that f2(sigma) = -f4(sigma). PROOF OF 3. This proof follows from the known theorem: A function [f0(x)] rises or falls at a point depending on whether its derivative [f1(x)] at that point is greater or smaller than zero. We now select any point x of the interval, note the sign of the values f0(x), f1(x), ..., fs(x), and obtain a "Sturm sign chain" (to obtain an unequivocal sign, however, it must be assumed that none of the designated s+1 function values is zero). The sign chain will contain sign sequences (++ and --) and sign changes (+- and -+). We will consider the number Z(x) of sign changes in the sign chain and the changes undergone by Z(x) when x passes through the interval. A change can occur only if one or more of the Sturm functions changes sign, i.e., passes over from negative (positive) values through zero to positive (negative) values. We will accordingly study the effect produced on Z(x) by the passage of a function fv(x) through zero. Let k be a point at which fv disappears, h a point situated to the left, and l a point to the right of k and so close to k that over the interval h to l the following holds true: (1) fv(x) does not vanish except when x=k; (2) every neighbor (fv+1,fv-1) of fv does not change sign. We must distinguish between the cases v>0 and v=0; in the first case we are concerned with the triplet fv-1, fv, fv+1, in the second, with the pair f0, f1. In the triplet, fv-1 and fv+1 possess either the + and - sign or the - and + sign at all three points h, k, l. Thus, whatever the sign of fv may be at these points, the triplet possesses one change of sign for each of the three arguments h, k, l. The passage through zero of the function fv does not change the number of sign changes in the chain! In the pair, f1 has either the + or - sign at all three points h, k, l. In the first case, f0 is increasing and is thus negative at h and positive at l. In the second case, f0 is decreasing and is positive at point h, and negative at l. In both cases a sign change is lost. From our investigation we learn that: The Sturm sign chain undergoes a change in the number of sign changes Z(x) only when x passes through a null point of f(x); and specifically, the chain then loses (with an increasing x) exactly one sign change. Thus, if x passes through the interval (the ends of which do not represent roots of f(x)=0) from left to right, the sign chain loses exactly as many sign changes as there are null points of f(x) within the interval. Result: STURM'S THEOREM: The number of real roots of an algebraic equation with real coefficients whose real roots are simple over an interval the end points of which are not roots is equal to the difference between the numbers of sign changes of the Sturm sign chains formed for the interval ends. NOTE. The same considerations can also be applied unchanged to the series formed when we multiply f0, f1, f2, ..., fs by any positive constants; this series is then likewise designated as a Sturm chain. In the formation of the Sturm function chain all fractional coefficients are accordingly avoided. EXAMPLE 1. Determine the number and situation of the real roots of the equation x^5 - 3 x - 1 = 0. The Sturm chain is f0 = x^5 - 3 x - 1, f1 = 5 x^4 - 3, f2 = 12 x + 5, f3 = 1. The signs of f for x = -2, -1, 0, +1, +2 are ----------------------- x | f0 | f1 | f2 | f3 ----------------------- -2 | - | + | - | + -1 | + | + | - | + 0 | - | - | + | + +1 | - | + | + | + +2 | + | + | + | + ----------------------- The equation thus has three real roots: one between -2 and -1, one between -1 and 0, one between +1 and +2. The other two roots are complex. EXAMPLE 2. Determine the number of real roots of the equation x^5-ax-b=0 when a and b are positive magnitudes and 4^4 a^5 > 5^5 b^4. The Sturm chain reads x^5-ax-b, 5x^4-a, 4ax+5b, 4^4 a^5 - 5^5 b^4. For the values x = -oo [- infinity] and +oo it has the signs -+-+ and ++++, respectively. The equation has three real roots and two complex roots. EOF ============================================================================== From: rjohnson@apple.com (Robert Johnson) Newsgroups: sci.math.num-analysis,sci.math Subject: Re: [Q] Polynomial root interval estimation ... Date: 8 Feb 1996 19:39:42 -0800 In article , Dave Shreiner wrote: > In a book on ray tracing, the author made reference to something I >believe called "Sturm sequences", which could be used to generate intervals >in which roots of a polynomial reside in ... or at least that's the >impression that I got. Could someone please point me to some additional >resources or give the cocktail-napkin explanation? Sturm sequences are used to find how many real roots of a polynomial P lie between two real numbers. By dividing P by gcd(P,P'), we can reduce all multiple roots to simple roots. Therefore, assume that P has only simple roots. Generate the sequence of polynomials { P_k } as follows: P_0 = P P_1 = P' P_k = -(P_{k-2} mod P_{k-1}) for k>1 [1] Except for the sign change, [1] is the polynomial Euclidean Algorithm. Therefore, the last polynomial in the sequence is a constant polynomial because P and P' are relatively prime. Furthermore, for k>0, if P_k(x) = 0, then P_{k+1}(x) = -P_{k-1}(x) [2] This means that if P_k(x) = P_{k-1}(x) = 0, then P is identically 0. Assume this is not the case. Now, define Sturm(P,x) to be the number of sign changes in the sequence { P_k(x) }. The only place where Sturm(P,x) can change is where one of the P_k vanishes. However, for k>0, [2] says that if P_k(x) = 0, then P_{k+1}(x) and P_{k-1}(x) have opposite signs. Therefore, no matter how P_k behaves near x, it provides no change in Sturm(P,x). Thus, only where P (P_0) vanishes can Sturm(P,x) change. If P increases through 0 at x, then P' (P_1) is positive; thus, Sturm(P,x) decreases by 1. If P decreases through 0 at x, then P' is negative; thus, Sturm(P,x) decreases by 1 again. Thus, Sturm(P,x) decreases by 1 at every root of P. Thus, assuming neither P(x) nor P(y) is 0, the number of roots of P in [x,y] is given by Sturm(P,x) - Sturm(P,y) [3] Rob Johnson Apple Computer, Inc. rjohnson@apple.com ============================================================================== From: rjohnson@apple.com (Robert Johnson) Newsgroups: sci.math.num-analysis,sci.math Subject: Re: Determining if the Roots of a Quartic are Real Date: 9 Feb 1996 03:13:00 -0800 In article <4fa4j3$4ca@news.fsu.edu>, Matthew Brenneman wrote: >I am working with a quartic eqn having the general form: > > x^4 + a3 x^3 + a2 x^2 + a1 x + a0 = 0 > >What I need to know is: IF all of the roots are real > THEN is there some condition that the > coefficients have to satisfy? Use Sturm sequences to count the number of real roots. Strictly speaking, Sturm sequences only work for polynomials with simple roots, but dividing a polynomial P by gcd(P,P') reduces all multiple roots to simple ones. If all roots of P/gcd(P,P') are real then all roots of P are real. Here is a copy of an article on Sturm sequences that I just posted to sci.math and sci.math.num-analysis. [previous article quoted; now deleted -- djr] By looking at the degree and sign of the highest order term in each P_k, we can compute Sturm(P,-oo) - Sturm(P,+oo) which is the number of real roots of P. Rob Johnson Apple Computer, Inc. rjohnson@apple.com ============================================================================== Newsgroups: sci.math.num-analysis,sci.math From: bpc@netcom.com (Benjamin P. Carter) Subject: Re: Determining if the Roots of a Quartic are Real Date: Sat, 10 Feb 1996 08:07:40 GMT rjohnson@apple.com (Robert Johnson) writes: >In article <4fa4j3$4ca@news.fsu.edu>, >Matthew Brenneman wrote: >>I am working with a quartic eqn having the general form: >> >> x^4 + a3 x^3 + a2 x^2 + a1 x + a0 = 0 >> >>What I need to know is: IF all of the roots are real >> THEN is there some condition that the >> coefficients have to satisfy? I assume the coefficients are all real; otherwise the roots will not all be real. >Use Sturm sequences to count the number of real roots. ... [snip] That is one approach. Another is to look at discriminants. If the discriminant of the given polynomial does not vanish, then there are no repeated roots. If the discriminant is negative, then there are two real roots and two non-real (complex) roots. If the discriminant is positive, then either all roots are real, or none of them are. These two sub-cases can be distinguished by examining the derivative of the given polynomial. The sign of the discriminant of the derivative determines how many real roots it has. If the derivative has only one real root, then the given polynomial has < 4 real roots. If the derivative has three real roots, then the outer two are the x values where the given polynomial has local minima. If the given polynomial is ever negative, then it has some real roots. -- Ben Carter internet address: bpc@netcom.com ============================================================================== [Actually the discriminant IS part of the chain of polynomials to be computed when calculating sturm sequences -- djr] ==============================================================================