From: kovarik@mcmail.cis.McMaster.CA (Zdislav V. Kovarik)
Subject: Re: C Code for Robust Linear Regression
Date: 4 Aug 2000 18:01:05 -0400
Newsgroups: sci.math.num-analysis
Summary: [missing]

In article <398B2727.916D9405@flashcom.net>,
Vince Castelli wrote:
:Hi!
:
:I am looking for some C code that performs robust linear regression
:(i.e. VERY resistant to outliers). The standard sum-of-squares
:algorithm is quite useless in these situations. Does anyone have a
:pointer to some code (other than the one in Numerical Recipes) or has a
:routine that they would be willing to share?
:
:Regards,
:Vince

The l_1 (ell-one) regression is oblivious to outliers. It consists of
minimizing the sum of absolute values of the residuals (in contrast
with summing the squares of these residuals), but there is a price:
the problem belongs to linear programming rather than to linear
equation systems.

To illustrate the insensitivity to outliers, let me remind you that
for just a sample (not paired with any other variable), the
intermediate value which is immune to outliers is the median, and it
minimizes the sum of absolute values of individual deviations.

Vasek Chvatal has in his book Linear Programming a section about l_1
problems. (I wish I had the book near me to give details.)

If you have just one independent and one dependent variable, and the
sample has length N, there is a slow, brute-force method (O(N^3) time,
since each of the O(N^2) candidate lines costs O(N) to evaluate,
unless you can parallelize): from the theory, we know that an optimal
solution passes through two data points (to be found), so you scan
all N*(N-1)/2 pairs, each time add up the absolute values of the
residuals, and pick the winner.

By this time, some efficient l_1 (first-power) approximation solvers
should be in existence.

Hope it helps, ZVK(Slavek).
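The brute-force pair scan described in the post above can be sketched in C roughly as follows. This is only a minimal illustration, not code from any of the posters: the function names l1_cost and l1_fit_bruteforce are made up for this sketch, the model is assumed to be the straight line y = a + b*x, and pairs with equal x are simply skipped.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Sum of absolute residuals of the line y = a + b*x over n points. */
static double l1_cost(const double *x, const double *y, size_t n,
                      double a, double b)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += fabs(y[i] - (a + b * x[i]));
    return s;
}

/*
 * Hypothetical helper (name and interface are assumptions of this sketch).
 * Tries the line through every pair of points with distinct x and keeps
 * the one with the smallest sum of absolute residuals.
 * Returns 0 on success, -1 if no two points have distinct x.
 */
int l1_fit_bruteforce(const double *x, const double *y, size_t n,
                      double *a_out, double *b_out)
{
    assert(n >= 2);
    int found = 0;
    double best = 0.0;

    for (size_t i = 0; i + 1 < n; ++i) {
        for (size_t j = i + 1; j < n; ++j) {
            if (x[i] == x[j])
                continue;               /* vertical line, not of the form a + b*x */
            double b = (y[j] - y[i]) / (x[j] - x[i]);
            double a = y[i] - b * x[i];
            double c = l1_cost(x, y, n, a, b);
            if (!found || c < best) {   /* strict '<' keeps the first of tied lines */
                best = c;
                *a_out = a;
                *b_out = b;
                found = 1;
            }
        }
    }
    return found ? 0 : -1;
}
```

For the six points (0,1), (1,3), (2,5), (3,7), (4,9), (5,100), the call l1_fit_bruteforce(x, y, 6, &a, &b) returns 0 with a = 1 and b = 2, so the outlier at x = 5 does not move the line. The two nested loops over pairs and the inner cost evaluation make the cubic running time visible.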
==============================================================================

From: "Alan Miller"
Subject: Re: C Code for Robust Linear Regression
Date: Fri, 04 Aug 2000 23:21:21 GMT
Newsgroups: sci.math.num-analysis

Applied Statistics algorithm AS 282 is for robust linear (multiple)
regression. You can download it from statlib (http://lib.stat.cmu.edu)
or from my ozemail web site (I think I put it in with my least squares
code). You will need to translate it from Fortran.

--
Alan Miller, Retired Scientist (Statistician)
CSIRO Mathematical & Information Sciences
Alan.Miller -at- vic.cmis.csiro.au
http://www.ozemail.com.au/~milleraj
http://users.bigpond.net.au/amiller/

[quote of original message deleted --djr]

==============================================================================

From: Thomas Ruedas
Subject: Re: C Code for Robust Linear Regression
Date: Sat, 05 Aug 2000 01:44:02 +0200
Newsgroups: sci.math.num-analysis

I had this problem as well recently, and among other suggestions got
the recommendation in this group to apply a conventional (possibly
weighted) least-squares (i.e. L2) method iteratively, which has the
advantage that you can use an existing LSQ routine. For the
implementation, the following references might be useful for you:

@BOOK{PJHuber81,
  AUTHOR    = {Huber, P. J.},
  TITLE     = {Robust Statistics},
  PUBLISHER = {J. Wiley},
  ADDRESS   = {New York},
  YEAR      = {1981}}

@ARTICLE{OLeary90,
  AUTHOR  = {O'Leary, D. P.},
  TITLE   = {Robust regression computation using iteratively
             reweighted least squares},
  JOURNAL = {SIAM J. Matrix Anal. Appl.},
  VOLUME  = {11},
  NUMBER  = {3},
  YEAR    = {1990},
  PAGES   = {466--480}}

Page 18 of Huber outlines a simple algorithm.
HTH,
--
------------------------------------------------------------------------
Thomas Ruedas
Institute of Meteorology and Geophysics, J.W.Goethe University Frankfurt
e-mail: ruedas@geophysik.uni-frankfurt.de
http://www.geophysik.uni-frankfurt.de/~ruedas/
------------------------------------------------------------------------
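
The repeated (possibly weighted) least-squares approach recommended in the last post might look roughly like the C sketch below for a straight line y = a + b*x. It is an illustration under stated assumptions, not the algorithm from Huber's book or O'Leary's paper: the weight rule 1/max(|residual|, eps), which steers the fit toward the l_1 solution, is one common choice picked here for brevity, and the names wls_line and irls_line, the caller-supplied workspace w, and the fixed iteration count are all hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/*
 * Weighted least-squares fit of y = a + b*x via the 2x2 normal equations.
 * Returns 0 on success, -1 if the system is singular.
 */
static int wls_line(const double *x, const double *y, const double *w,
                    size_t n, double *a, double *b)
{
    double sw = 0.0, sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sw  += w[i];
        sx  += w[i] * x[i];
        sy  += w[i] * y[i];
        sxx += w[i] * x[i] * x[i];
        sxy += w[i] * x[i] * y[i];
    }
    double det = sw * sxx - sx * sx;
    if (det == 0.0)
        return -1;
    *b = (sw * sxy - sx * sy) / det;
    *a = (sy - *b * sx) / sw;
    return 0;
}

/*
 * Hypothetical driver (name, interface and weight rule are assumptions).
 * Starts from ordinary least squares (all weights 1), then repeats:
 * fit, recompute weights w[i] = 1 / max(|residual_i|, eps), fit again.
 * w is caller-supplied workspace of length n; on return it holds the
 * final weights. Returns 0 on success, -1 if a fit was singular.
 */
int irls_line(const double *x, const double *y, double *w, size_t n,
              int max_iter, double eps, double *a, double *b)
{
    assert(n >= 2 && max_iter > 0 && eps > 0.0);

    for (size_t i = 0; i < n; ++i)
        w[i] = 1.0;

    for (int it = 0; it < max_iter; ++it) {
        if (wls_line(x, y, w, n, a, b) != 0)
            return -1;
        for (size_t i = 0; i < n; ++i) {
            double r = fabs(y[i] - (*a + *b * x[i]));
            w[i] = 1.0 / (r > eps ? r : eps);   /* large residual, small weight */
        }
    }
    return 0;
}
```

A call such as irls_line(x, y, w, n, 100, 1e-9, &a, &b) runs 100 reweighting passes. On data lying on a line except for one gross outlier, the outlier's weight shrinks from pass to pass and the fit should approach the line through the remaining points. A production version would normally add a convergence test and a robust scale estimate, which are left out here to keep the sketch short.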