From: Paige Miller
Subject: Re: PCA on covariance or correlation matrices?
Date: Tue, 09 May 2000 08:11:52 -0400
Newsgroups: sci.math.num-analysis,sci.stat.math
To: A T

A T wrote:
>
> Sorry to post this again. I had the answer a while ago from the newsgroup but
> my machine crashed and I lost my files.
>
> Could someone explain to me again the difference between using the covariance
> matrix and the correlation matrix for working out the principal components?

There is no difference in the calculations: either way you extract the
eigenvalues and eigenvectors of the matrix you chose. The only difference
is in interpretation.

If you use the covariance matrix, then you are implicitly assuming that all
of the original variables are in the same units. If they are not, for
example if one variable is pH (range 0 to 14) and another variable is the
RPMs of a motor (range 1000 to 10000 rpm), then the variable with the larger
variability (in this case RPMs) will dominate the loadings that you find.
Variables with larger variances will dominate the results (which may or may
not be what you want).

If you use the correlation matrix, then you force all of the original
variables onto the same scale, since each one is in effect standardized to
unit variance. Thus, they will be given a priori the same weight in the
analysis (which may or may not be what you want).

In the case I cited, where one variable was pH and another variable was
RPMs, I believe most people would select the correlation matrix as the
approach.

-- 
Paige Miller
Eastman Kodak Company
paige.miller@kodak.com

"It's nothing until I call it!" -- Bill Klem, NL Umpire
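
For anyone who wants to see the pH/RPM point numerically, here is a minimal sketch in Python with NumPy. Everything in it is an assumption made for illustration: the data are synthetic, the sample size, slope and noise level are made up, and the helper name first_pc is hypothetical. It only shows that the same eigen-decomposition is applied to both matrices, and that the loadings differ because of scale.

```python
import numpy as np

def first_pc(matrix):
    """Unit eigenvector for the largest eigenvalue of a symmetric matrix."""
    eigvals, eigvecs = np.linalg.eigh(matrix)  # eigenvalues come back in ascending order
    return eigvecs[:, -1]

# Synthetic data (made up for illustration): pH on a 0..14 scale and a
# loosely related motor speed in rpm, on a scale of thousands.
rng = np.random.default_rng(0)
n = 500
ph = rng.uniform(0.0, 14.0, size=n)
rpm = 5500.0 + 300.0 * (ph - 7.0) + rng.normal(0.0, 2000.0, size=n)
data = np.column_stack([ph, rpm])

cov = np.cov(data, rowvar=False)        # covariance matrix, original units
corr = np.corrcoef(data, rowvar=False)  # correlation matrix, unit-free

# The correlation matrix is just the covariance matrix of the standardized data.
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
cov_of_z = np.cov(z, rowvar=False)

pc_cov = first_pc(cov)    # loadings on (pH, rpm) from the covariance matrix
pc_corr = first_pc(corr)  # loadings on (pH, rpm) from the correlation matrix

print("variances (pH, rpm):      ", np.diag(cov))
print("first PC, covariance:     ", np.abs(pc_cov))
print("first PC, correlation:    ", np.abs(pc_corr))
```

With the covariance matrix the first component is almost entirely the rpm variable, simply because its variance is several orders of magnitude larger. With the correlation matrix the two variables get equal weight (for two variables the first component always loads equally on both, in absolute value).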