From: "Alan Miller" Subject: Re: Problems with multiplying large matrices Date: Wed, 09 Feb 2000 01:43:25 GMT Newsgroups: sci.math.num-analysis Summary: [missing] E. Robert Tisdale wrote in message <38A0BE33.42F2AC6D@netwood.net>... >Alan Miller wrote: > >> I recommend that you use a QR factorization >> >> X = QR >> >> where Q is an orthonormal transformation (Q'Q = I) >> and R is the upper-triangular Cholesky matrix, >> which in your case will be the upper triangle of a 15x15 matrix. > >This needs a little explanation. >Q is a 100 million by 100 million element matrix. > Q is 100 million x 100 million, but it is never formed. It needs a lot of explanation. Anyone who is not familiar with modern methods of least squares calculation (post about 1970), should read a book on it such as: `Solving Least Squares Problems' by C.L.Lawson & R.J.Hanson which has been reprinted as a paperback by SIAM (1995). (First edition - Prentice-Hall, 1974) The original contained the code for each algorithm. This can be downloaded from the lawson-hanson directory at netlib. For a more advanced (far more mathematical) text, see: `Numerical Methods for Least Squares Problems' by Ake Bjorck, also published by SIAM as a paperback (1996). -- Alan Miller, Retired Scientist (Statistician) CSIRO Mathematical & Information Sciences Alan.Miller -at- vic.cmis.csiro.au http://www.ozemail.com.au/~milleraj ============================================================================== From: hrubin@odds.stat.purdue.edu (Herman Rubin) Subject: Re: Problems with multiplying large matrices Date: 9 Feb 2000 12:35:05 -0500 Newsgroups: sci.math.num-analysis In article , Alan Miller wrote: >LazyLizard wrote in message <87q5bc$75o$1@nnrp1.deja.com>... >>In article <389F3291.97EF9B3C@netwood.net>, >>Thanks for the suggestion but I need to take care of the number of >>values (that would be cases of the variable set) of the variables. >>Those cases are of the order of 10-100 million, its part of a data- >>minig problem. So, basically, I would have limited number of columns >>(say 20) and almost unlimited number of rows (say 10-100 million) for >>the input set. How do I transpose such a matrix? Or better still, how >>can I find (X^T)times(X)? I would still need X^T in order to solve the >>matrix equation, because I need to pre-multiply D with X^T. >>Let me list out the steps that I am using to solve the linear >>regression equation: >>1. Find X^T >>2. Premultiply both sides of the equation with X^T >> ((X^T)X)W = (X^T)D >>3. Find ((X^T)X) inverse >>4. Premultiply both sides of the equation with result of step 3 >> W = (inverse of (X^T)X)((X^T)D) >>Here, X is a matrix dimension:(100million X n), n<40 >> W is a matrix dimension:(n X 1), n<40 >> D is a matrix dimension:(100million X 1) >>By the way I am using Java to develop the application. >>So, now can you help me? >>Thanks in anticipation ... >This is not a good way to carry out linear least-squares >calculations. I recommend that you use a QR factorization >X = QR >where Q is an orthonormal transformation (Q'Q = I) and R >is the upper-triangular Cholesky matrix, which in your case >will be the upper triangle of a 15x15 matrix. The number of operations in the QR factorization is roughly twice that in computing by the usual method, and the memory requirement is larger. >Using planar (Jacobi/Givens) rotations, this can be done >one row at a time, so that you never have to store your >1million x 15 matrix X in memory. No, but there would have to be large movements of data in and out. To carry out the operations on the right side, the 15 vectors, of essentially full size, will still have to be used repeatedly. On the other hand, if the X matrix is available by rows, one can multiply X^T by X, or by D, without doing anything involving much transient storage. One can do this for multidimensional D easily, and this, using the symmetry, gives as good a way to do it to obtain X^T*X as any. This was even done for regressions using large data sets BC (before computers). Just write out the matrix multiplication as a loop, adding the product of the (transpose of one row of X) with (one row of D) to the partial sum at a time. Only the partial sums need to be stored as intermediates. -- This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University. Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399 hrubin@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558 ============================================================================== From: "Alan Miller" Subject: Least squares, 10 million cases Date: Thu, 10 Feb 2000 00:25:50 GMT Newsgroups: sci.math.num-analysis Some further comments: Herman Rubin is only partially correct. The `straight' planar rotation algorithm does take about twice as many operations as forming the sum of squares and products matrix (X'X). My version (either at my web site or Applied Statistics algorithm AS 274) uses Sven Hammarling's algorithm which requires about 50% more operations than forming X'X. N.B. My version stores the Cholesky factor, R, as a vector, which saves time with subscript calculations compared with 2D storage. There is a fast planar rotation algorithm by Anda & Park, but I don't know of any public domain code for this. It takes about the same number of operations as forming X'X. I don't understand his comments about the Y variable. It is treated in the same way as one of the X variables. With 10 million cases, and 15 X-values and a Y-value for each case, access to the data is obviously an important consideration. The planar rotation updating method accesses the data by rows. Presumably, when he talks about punched cards, he was also accessing the data by rows (cases), rather than by columns (variables). My understanding is that Java stores data by rows, so that the planar rotation algorithm would be a very appropriate one for this language, whereas Householder reduction, which operates on columns, can be very efficient in Fortran. I did a web search for `QR Least squares Java' and found several sites with least-squares code in Java, but all of those which I looked at used translations of Householder reduction code which had originally been coded in Fortran. I have seen two translations of my code into C (using f2c), but I don't know of a translation into Java. -- Alan Miller, Retired Scientist (Statistician) CSIRO Mathematical & Information Sciences Alan.Miller -at- vic.cmis.csiro.au http://www.ozemail.com.au/~milleraj