From: Ray Chason
Subject: Re: Fast multiplication and division
Date: 22 Oct 1999 05:56:15 GMT
Newsgroups: sci.math
Keywords: Karatsuba multiplication of large integers

Quanta wrote:
>Hello, do you know where I can find any full description of the
>Karatsuba or Knuth algorithm to multiply and divide two big integers
>as fast as possible, to translate into C?

Most fast integer multiplication algorithms are really polynomial
multiplication algorithms.  You break the integers into pieces, consider
each of the pieces a coefficient of a polynomial, multiply the
polynomials, and then evaluate the product polynomial.

For instance:  1234 * 5678
    (x^3 + 2*x^2 + 3*x + 4)(5*x^3 + 6*x^2 + 7*x + 8)    (x = 10)
    5*x^6 + 16*x^5 + 34*x^4 + 60*x^3 + 61*x^2 + 52*x + 32
which is 7006652 at x=10.

So, on to some algorithms.

1) Karatsuba multiplication is simple to explain:

   * Start with your factors A and B.

   * Consider each to be a binomial (A1*x+A0) and (B1*x+B0), choosing x
     so as to break the two numbers in half.

   * The product of these two binomials is:
         A1*B1*x^2 + (A1*B0+A0*B1)*x + A0*B0

   * The trick is to evaluate the center term as:
         (A1+A0)*(B1+B0) - A1*B1 - A0*B0
     Since you need to evaluate A1*B1 and A0*B0 only once, this formula
     requires only one additional multiplication.  So you multiply
     numbers twice as long with only three times as many operations,
     where the grammar school algorithm would require four times as
     many.

   * Use recursion to evaluate the three products.  Fall back to the
     grammar school algorithm once the factors are small enough (it's
     much like optimizing a Quicksort with an insertion sort).
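To make the trick concrete, here is a one-level sketch in C, multiplying
32-bit factors split into 16-bit halves (so x = 2^16).  This is only an
illustration: the function name is made up, and a real bignum routine
would apply the same formula recursively to arrays of digits rather than
to machine words.

    #include <stdio.h>
    #include <stdint.h>

    /* One level of Karatsuba: multiply two 32-bit numbers with three
       16x16-bit multiplications instead of four.
       Write A = A1*x + A0 and B = B1*x + B0 with x = 2^16. */
    static uint64_t karatsuba32(uint32_t a, uint32_t b)
    {
        uint32_t a1 = a >> 16, a0 = a & 0xFFFF;
        uint32_t b1 = b >> 16, b0 = b & 0xFFFF;

        uint64_t hi = (uint64_t)a1 * b1;        /* A1*B1 */
        uint64_t lo = (uint64_t)a0 * b0;        /* A0*B0 */
        /* center term: (A1+A0)*(B1+B0) - A1*B1 - A0*B0
                      = A1*B0 + A0*B1 */
        uint64_t mid = (uint64_t)(a1 + a0) * (b1 + b0) - hi - lo;

        return (hi << 32) + (mid << 16) + lo;
    }

    int main(void)
    {
        /* prints 7006652, as in the example above */
        printf("%llu\n", (unsigned long long)karatsuba32(1234, 5678));
        return 0;
    }

The recursive version splits each digit array in half, calls itself for
the three sub-products, and falls back to the grammar school loop below
some threshold length.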
2) I suppose anything called a "Knuth algorithm" would be in one of his
   books.  By this, do you perhaps mean an FFT-based algorithm?

   A Fast Fourier Transform evaluates a polynomial at a set of complex
   roots of unity.  Once you've done this with your two factors, you
   multiply each element of FFT(A) by the corresponding element of
   FFT(B).  Follow with the inverse transform and you have your product
   polynomial.

   There is a trap in this algorithm: floating point arithmetic is only
   so accurate.  There will be some roundoff noise in the product, and
   you want to be sure it's to the right of the decimal point, or your
   product will be wrong.  Use long double if your compiler supports
   it, and put only a few bits in each coefficient.

3) A variant of the above uses the number theoretic transform.  An NTT
   is an FFT carried out in a finite integer field (arithmetic modulo a
   suitable prime).  The convolution with NTTs yields the coefficients
   of the product polynomial only as residues modulo the modulus
   (argh!) of the finite field; but you can do multiple NTT-based
   convolutions (typically three, with different moduli) and apply the
   Chinese Remainder Theorem to recover the coefficients.

   NTT-based multiplication can be faster than FFT, despite having to
   do three sets of transforms, because it uses only integer
   arithmetic; it also doesn't suffer from roundoff noise, and so it
   can put more bits in each coefficient.

--
--------------===============<[ Ray Chason ]>===============--------------
       PGP public key at http://www.smart.net/~rchason/pubkey.asc
                        Delenda est Windoze!

==============================================================================

From: Randy Poe
Subject: Re: Fast multiplication and division
Date: Sun, 24 Oct 1999 10:22:39 -0400
Newsgroups: sci.math

Ray Chason wrote:
>
> Quanta wrote:
>
> >Hello, do you know where I can find any full description of the
> >Karatsuba or Knuth algorithm to multiply and divide two big integers
> >as fast as possible, to translate into C?
>
> Most fast integer multiplication algorithms are really polynomial
> multiplication algorithms.  You break the integers into pieces,
> consider each of the pieces a coefficient of a polynomial, multiply
> the polynomials, and then evaluate the product polynomial.
> For instance:  1234 * 5678
>     (x^3 + 2*x^2 + 3*x + 4)(5*x^3 + 6*x^2 + 7*x + 8)    (x = 10)
>     5*x^6 + 16*x^5 + 34*x^4 + 60*x^3 + 61*x^2 + 52*x + 32
> which is 7006652 at x=10.

[snip a nice post]

If your polynomials use the decimal digits, what you are doing is
equivalent to long multiplication as we were taught to do in school.

I have implemented efficient long multiplication in assembly language.
For efficiency, you can use a natural base of the computer.  For
instance, break your numbers down into 16-bit words and use 16-bit
multiply operations (and 32-bit adds).  You are in effect doing long
multiplication in base 2^16.

The same thing can easily be done in C, using shift and logical AND
operations to extract the bits you need.  Be VERY careful about the use
of signed and unsigned integers when you do this.  An arithmetic right
shift of a number with a 1 in the high bit will fill the result with
1's from the left.  This is fine if you are interpreting the result as
a negative number, but disastrous if not.  I like assembly because it
gives me explicit control of such fine points.
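For reference, here is a minimal sketch of that base-2^16 long
multiplication in portable C.  The function name, digit layout, and
demo values are all illustrative; note the unsigned types throughout,
per the warning above.

    #include <stdio.h>
    #include <stdint.h>

    /* Grammar school multiplication in base 2^16.  a and b are arrays
       of 16-bit "digits", least significant first; out must have room
       for na+nb digits. */
    static void longmul16(const uint16_t *a, int na,
                          const uint16_t *b, int nb, uint16_t *out)
    {
        int i, j;

        for (i = 0; i < na + nb; i++)
            out[i] = 0;

        for (i = 0; i < na; i++) {
            uint32_t carry = 0;
            for (j = 0; j < nb; j++) {
                /* 16x16 -> 32-bit multiply, plus digit and carry;
                   this cannot overflow a 32-bit unsigned */
                uint32_t t = (uint32_t)a[i] * b[j] + out[i + j] + carry;
                out[i + j] = (uint16_t)t;    /* keep the low 16 bits */
                carry = t >> 16;             /* propagate the high 16 */
            }
            out[i + nb] = (uint16_t)carry;
        }
    }

    int main(void)
    {
        /* 0x12345678 * 0x9ABCDEF0, each as two base-2^16 digits,
           least significant digit first */
        uint16_t a[2] = { 0x5678, 0x1234 };
        uint16_t b[2] = { 0xDEF0, 0x9ABC };
        uint16_t p[4];

        longmul16(a, 2, b, 2, p);
        printf("%04X%04X%04X%04X\n", p[3], p[2], p[1], p[0]);

        /* cross-check against a direct 64-bit multiply */
        printf("%016llX\n", 0x12345678ULL * 0x9ABCDEF0ULL);
        return 0;
    }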
When I read a little farther in your post, I realized that you were not
talking about long multiplication at all, but actually the FFT
algorithms I have heard of.

> * The trick is to evaluate the center term as:
>       (A1+A0)*(B1+B0) - A1*B1 - A0*B0
>   Since you need to evaluate A1*B1 and A0*B0 only once, this formula
>   requires only one additional multiplication.  So you multiply
>   numbers twice as long with only three times as many operations,
>   where the grammar school algorithm would require four times as
>   many.

This is, I believe, the "butterfly" at the heart of the FFT.

Thanks for a very nice review of FFT-based multiplication algorithms.
It goes in my save file.

                 Randy Poe
                 VisionPlace.com
                 info1@visionplace.com

==============================================================================

From: Ray Chason
Subject: Re: Fast multiplication and division
Date: 25 Oct 1999 06:20:30 GMT
Newsgroups: sci.math

Randy Poe wrote:
>Ray Chason wrote:
>>
>> Quanta wrote:
>>
>> >Hello, do you know where I can find any full description of the
>> >Karatsuba or Knuth algorithm to multiply and divide two big
>> >integers as fast as possible, to translate into C?
>>
>> Most fast integer multiplication algorithms are really polynomial
>> multiplication algorithms.  You break the integers into pieces,
>> consider each of the pieces a coefficient of a polynomial, multiply
>> the polynomials, and then evaluate the product polynomial.
>> For instance:  1234 * 5678
>>     (x^3 + 2*x^2 + 3*x + 4)(5*x^3 + 6*x^2 + 7*x + 8)    (x = 10)
>>     5*x^6 + 16*x^5 + 34*x^4 + 60*x^3 + 61*x^2 + 52*x + 32
>> which is 7006652 at x=10.
>
>[snip a nice post]
>
>If your polynomials use the decimal digits, what you are doing is
>equivalent to long multiplication as we were taught to do in school.

Well, yes.  But viewing an integer as a polynomial points the way to
the faster algorithms, such as Karatsuba or the FFT.

>I have implemented efficient long multiplication in assembly language.
>For efficiency, you can use a natural base of the computer.  For
>instance, break your numbers down into 16-bit words and use 16-bit
>multiply operations (and 32-bit adds).  You are in effect doing long
>multiplication in base 2^16.
>
>The same thing can easily be done in C, using shift and logical AND
>operations to extract the bits you need.  Be VERY careful about the
>use of signed and unsigned integers when you do this.

It's also much faster in assembly language, as I'm sure you've noticed.
I'd do it in C only for something I was going to distribute, so that it
could be made to work on any hardware.

I see from your headers that you're using Windows 98.  That means at
least a Pentium.  Have you tried using 32-bit multiplies?  The MUL
instruction can multiply two 32-bit unsigned integers into a 64-bit
product; it makes a good fast long-multiplication function that much
easier to write.  Maybe you're using a DOS-based compiler.  Check out
http://www.delorie.com/djgpp/ if you need some development tools that
support the 32-bit instructions.

>When I read a little farther in your post, I realized that you were
>not talking about long multiplication at all, but actually the FFT
>algorithms I have heard of.

Mainly the Karatsuba algorithm, since you mentioned it by name.

>> * The trick is to evaluate the center term as:
>>       (A1+A0)*(B1+B0) - A1*B1 - A0*B0
>>   Since you need to evaluate A1*B1 and A0*B0 only once, this formula
>>   requires only one additional multiplication.  So you multiply
>>   numbers twice as long with only three times as many operations,
>>   where the grammar school algorithm would require four times as
>>   many.
>
>This is, I believe, the "butterfly" at the heart of the FFT.

Actually this is the key to the Karatsuba method, and has nothing to do
with FFTs.  The "butterfly" is really a two-element DFT and looks like
this:

Decimation in frequency:
    (A[i], A[i+step]) <-- (A[i]+A[i+step], w*(A[i]-A[i+step]))

Decimation in time:
    (A[i], A[i+step]) <-- (A[i]+w*A[i+step], A[i]-w*A[i+step])

(A short C sketch of the decimation-in-time form follows this post.)

Here are some FFT links for you:

http://www.fftw.org/
http://www.jjj.de/fxt/
http://www.eptools.com/tn/T0001/INDEX.HTM

--
--------------===============<[ Ray Chason ]>===============--------------
       PGP public key at http://www.smart.net/~rchason/pubkey.asc
                        Delenda est Windoze!
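To make the two butterflies above concrete, here is a minimal C sketch
of the decimation-in-time form.  The struct and function names are
illustrative, not taken from any particular FFT package; w would be the
appropriate root of unity at each stage of a full FFT.

    #include <stdio.h>

    /* One radix-2 decimation-in-time butterfly on complex values:
       (x, y) <-- (x + w*y, x - w*y),
       i.e. (A[i], A[i+step]) <-- (A[i]+w*A[i+step], A[i]-w*A[i+step]) */
    typedef struct { double re, im; } cplx;

    static void butterfly_dit(cplx *x, cplx *y, cplx w)
    {
        cplx t;                        /* t = w * y */
        t.re = w.re * y->re - w.im * y->im;
        t.im = w.re * y->im + w.im * y->re;
        y->re = x->re - t.re;          /* new A[i+step] */
        y->im = x->im - t.im;
        x->re += t.re;                 /* new A[i] */
        x->im += t.im;
    }

    int main(void)
    {
        /* with w = 1 this is a two-point DFT: (3, 5) -> (8, -2) */
        cplx a = { 3.0, 0.0 }, b = { 5.0, 0.0 }, w = { 1.0, 0.0 };
        butterfly_dit(&a, &b, w);
        printf("(%g, %g)\n", a.re, b.re);
        return 0;
    }

Note how this differs from the Karatsuba center-term trick quoted
above: the butterfly is a linear transform applied at every stage of
the FFT, while Karatsuba's identity saves a multiplication in a product
of binomials.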