From: borchers@newshost.nmt.edu (Brian Borchers)
Subject: Re: regularization
Date: 5 Oct 1999 10:24:39 -0600
Newsgroups: sci.math.num-analysis
Keywords: error propagation, regularization

smarsly@mpikg-golm.mpg.de wrote:
>who can help me with a problem concerning the error propagation of
>fitting data WITH additional regularization (or some other constraints)?
>
>The problem: in least square fits of data with a fitting function f(c),
>c being the vector containing the fitting parameters, one has to minimize
>
>    |f(c) - data|^2 + r*Reg(c).
>
>Reg(c) is a regularization term.  Let F be a matrix containing the
>derivatives d f(c)/dc AND the regularization terms; then the solution c
>of the fit is c = inv(F) * data (neglecting weighting, etc.) with the
>covariance matrix Cc = inv(F) * C_data * inv(F)' (Cc provides the
>intervals of "confidence" for the fitting parameters c, and C_data
>represents the data errors).
>
>My question: is it correct to include the regularization for the
>calculation of the covariance matrix (i.e. the intervals of confidence)
>in this way?  If not, how is the regularization taken into account
>correctly in calculating the error propagation for the parameters c?

There are several important things to consider here:

- First, you should always be aware that the estimates produced with
  regularization are biased, in the sense that the expected value of the
  regularized solution is not the value of the "true" solution.
  Depending on how large the regularization parameter r is, the results
  may be extremely biased.

- If you don't include the regularization in computing the confidence
  intervals, then you'll get confidence intervals that are just as large
  as those you would get from simple least squares.  Since you wouldn't
  be using regularization unless the least squares problem were badly
  conditioned, these intervals are likely to be so large that they're
  useless.

- If you do include the regularization in computing the confidence
  intervals, you have to be aware that the confidence intervals are also
  biased.  As r gets larger, the confidence intervals get tighter...

Two extreme cases illustrate the problem.  Suppose that you're using
simple 0th order Tikhonov regularization on a linear inverse problem
Ax = b.  That is, you're minimizing

    ||Ax - b||^2 + r^2 ||x||^2.

Suppose further that the least squares problem is very badly
conditioned.  If we use r = 0, we get the least squares solution with
very large confidence intervals for the parameters.  On the other hand,
in the limit as r goes to infinity, we get the solution x = 0 with
confidence intervals of +-0.  Which do you prefer: an unbiased solution
that tells you that the data don't tell you anything, or an incredibly
biased solution (a solution to the wrong problem!) with very tight
confidence intervals for the parameters?  This trade-off should be
considered in constructing confidence intervals for the parameters.

Note that papers in the geophysics literature often do include such
confidence intervals.  The typical approach is to find the generalized
inverse matrix (including the regularization at the level which was used
in constructing the inverse solution) and use it to compute confidence
intervals for the parameters.  An alternative approach is to use a
Monte Carlo method to generate lots of inverse solutions for simulated
data sets (with the same statistics as the actual data set) and then
generate confidence intervals from these solutions...  Rough sketches
of both ideas are given below.
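To make the generalized-inverse idea concrete, here is a minimal Python
(NumPy) sketch, assuming a made-up smoothing-kernel test problem, an
assumed noise level, and 0th order Tikhonov regularization; none of the
specific numbers come from the discussion above.  The parameter
covariance is obtained by propagating C_data through the regularized
generalized inverse G_r = inv(A'A + r^2 I) A'.

# Hedged illustration only: 0th order Tikhonov regularization of an
# ill-conditioned linear problem Ax = b, with the data covariance
# propagated through the regularized generalized inverse.
import numpy as np

rng = np.random.default_rng(0)

# Assumed test problem: a nearly singular Gaussian smoothing kernel.
n = 20
t = np.linspace(0.0, 1.0, n)
A = np.exp(-((t[:, None] - t[None, :]) ** 2) / 0.01)

x_true = np.sin(2.0 * np.pi * t)        # "true" parameters (assumed)
sigma = 0.01                            # assumed data noise std. dev.
b = A @ x_true + sigma * rng.standard_normal(n)
C_data = sigma ** 2 * np.eye(n)         # data covariance matrix

def tikhonov(A, b, r):
    # Solution and generalized inverse for min ||Ax - b||^2 + r^2 ||x||^2.
    G = np.linalg.solve(A.T @ A + r ** 2 * np.eye(A.shape[1]), A.T)
    return G @ b, G

# Small r: huge intervals (badly conditioned least squares).
# Large r: solution pushed toward 0, tiny intervals, large bias.
for r in (1e-6, 1e-3, 1e-1, 10.0):
    x_r, G = tikhonov(A, b, r)
    C_x = G @ C_data @ G.T              # propagated parameter covariance
    half = 1.96 * np.sqrt(np.diag(C_x)) # nominal 95% half-widths
    print(f"r = {r:7.1e}   max half-width = {half.max():9.3e}   "
          f"||x_r - x_true|| = {np.linalg.norm(x_r - x_true):9.3e}")

Running this shows the trade-off described above: the intervals shrink
as r grows, while the distance from the "true" parameters grows.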
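A similarly minimal sketch of the Monte Carlo alternative, under the
same assumed test problem: simulate many data sets with the same noise
statistics as the actual data, invert each one at the same
regularization level, and read confidence intervals off the percentiles
of the resulting solutions.  Intervals built this way reflect the
variance of the regularized estimator but not its bias.

# Hedged illustration only: Monte Carlo confidence intervals for the
# same assumed test problem and regularization level.
import numpy as np

rng = np.random.default_rng(1)

n = 20
t = np.linspace(0.0, 1.0, n)
A = np.exp(-((t[:, None] - t[None, :]) ** 2) / 0.01)
sigma = 0.01
b = A @ np.sin(2.0 * np.pi * t) + sigma * rng.standard_normal(n)

def tik_solve(A, b, r):
    # Regularized solution of min ||Ax - b||^2 + r^2 ||x||^2.
    return np.linalg.solve(A.T @ A + r ** 2 * np.eye(A.shape[1]), A.T @ b)

r = 1e-1                                # regularization level actually used
x_hat = tik_solve(A, b, r)              # inverse solution for the real data

# Simulated data sets with the same noise statistics as the actual data,
# each inverted with the same regularization.
n_trials = 2000
samples = np.array([tik_solve(A, A @ x_hat + sigma * rng.standard_normal(n), r)
                    for _ in range(n_trials)])

lo, hi = np.percentile(samples, [2.5, 97.5], axis=0)   # 95% band per parameter
print("max 95% half-width:", (hi - lo).max() / 2.0)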
There's a brief discussion of constructing confidence intervals for
discrete ill-posed problems (along with some references) in Per Hansen's
book, "Rank-Deficient and Discrete Ill-Posed Problems", on page 123.
--
Brian Borchers                           borchers@nmt.edu
Department of Mathematics                http://www.nmt.edu/~borchers/
New Mexico Tech                          Phone: 505-835-5813
Socorro, NM 87801                        FAX: 505-835-5366