Date: Fri, 28 Apr 1995 17:17:42 +0200 From: khenry@dip1.ee.uct.ac.za (Karen Henry) To: rusin@math.niu.edu Subject: Re: smallest enclosing ellipse hello Dave I must admit, I didnt specify the problem very well.Sorry. What I have is a tightly clustered set of points in a 2-dim planeThe distribution of the points will be in an 'elliptical shaped group' - hence the desire to fit an ellipse, in particular, to the data .I want to fit an ellipse around these points.The ellipse must contain all data points within(some upon) its boundaries.The cluster will always be best represented by an ellipse,NOT a circle. > What does it mean to 'fit an ellipse to points'? If the goal is to find > an ellipse enclosing them all, simply compute the center of mass (average the > coordinates), compute all the distances to this center (retaining the > largest value found along the way) and output the circle with center at > the center-of-mass with radius being that maximal distance. This requires > two passes over the data points. > The major axis may/may not be parallel to the coordinate axes - it,s orientation will be determined by the data points.This means that the above method of finding the centre of mass and using farthest dist to define radius(actually need foci length, or a,b dist from centre along major,minor axes) will not work in some cases. It may be that your above method will have to be the first step.I will then have to test each point to see if it lies within this boundary , and adjust the ellipse accordingly.I was hoping that someone knew of a more efficient method than this, though. when one is better than another (smaller area? smaller eccentricity? > smaller perimeter?) I havent thought much about this; ideally the total distance from the ellipse boundary to the data points farthest from the ellipse centre(or rather closest to the ellipse boundary ) should have the minimum value for the 'best' fit ellipse ,much like a least-squares fit to a straight line. If you think the points actually lie on the curve of an ellipse > (or if they are scattered a little around such a curve) then you can > try to select the curve which is optimal in some way (perhaps in the > least-squares sense) passing near the data. This is a linear regression > problem and is easily handled with linear algebra or with standard > statistical software. > None of the points must lie beyond the boundary of the ellipse. I hope the above detail makes things a little clearer thanks very much. karen ============================================================================== Date: Fri, 28 Apr 95 11:58:21 CDT From: rusin (Dave Rusin) To: khenry@dip1.ee.uct.ac.za, rusin@math.niu.edu Subject: Re: smallest enclosing ellipse > The major axis may/may not be parallel to the coordinate axes - it,s >orientation will be determined by the data points.This means that the >above method of finding the centre of mass and using farthest dist to >define radius(actually need foci length, or a,b dist from centre along >major,minor axes) will not work in some cases. It may be that your >above method will have to be the first step.I will then have to test >each point to see if it lies within this boundary , and adjust the >ellipse accordingly.I was hoping that someone knew of a more efficient >method than this, though. OK, try this: first average the points to get a good guess of the center. Translate everything so the center is at the origin. Now imagine the ellipse going around the points. It could be parameterized as (x,y) = (a cos(theta+phi), b sin(theta+phi) ) for fixed constants a, b, and phi, where theta ranges from 0 to 2pi. Using the angle-addition formulas, this can also be written (a*c cos(theta) - a*s sin(theta), b*s cos(theta) + b*c cos(theta) ) where s=sin(phi) and c=cos(phi) are really just two constants satisfying c^2+s^2=1. All of your points will be inside this ellipse iff each point is closer to the origin than the corresponding point on the ellipse: in other words, if each point is expressed in the form P_i = r_i (cos(theta_i), sin(theta_i)), then you need r_i^2 <= [a*c cos(theta_i) - a*s sin(theta_i)]^2 + [b*s cos(theta_i) + b*c cos(theta_i)]^2 for all i. It's probably easier to multiply thru by r_i^2 since then you may write x_i for r_i cos(theta_i) and likewise for y_i. Thus the condition that the ellipse enclose all the points is the condition that (x_i^2+y_i^2)^2 <= [a*c x_i - a*s y_i]^2 + [b*s x_i + b*c y_i]^2 for all points P_i=(x_i, y_i). So now you've written this as a quadratic optimization problem: you want to choose four variable a, b, c, s to minimize something quadratic (probably a*b, which is essentially the area of the ellipse) subject to a quadratic equation (c^2+s^2=1) and N quartic inequalities (each of the inequalities above). If you can guess or otherwise fix the orientation phi, then c and s are also known, and each of the inequalities is simply K_i <= L_i A + M_i B where A=a^2, B=b^2, and the coefficients can now be computed from the point P_i. If you don't mind changing your optimization function to something like sqrt(a^2+b^2) (which gets small if the ellipse is of small area and not too far from a circle), then you have a linear optimization problem: minimize A+B subject to a bunch of linear inequalities (including A>=0, B>=0). The simplex algorithm works well in practice, and ellipsoidal methods are known to produce an answer in polynomial time. So you have to guess the center and the orientation and accept a screwy notion of "best" ellipse; then you have a procedure to find that best one. Probably in your shoes I would just punt and choose the point P_0 which is furthest from the origin, and then assume that's where the major axis points. This would fix c = x_0/r_0, s = -y_0/r_0. Moreover, I would choose b (the length of the semi-major axis) to be r_0. Then each of the previous inequalities simply becomes a^2 A_i >= B_i, so I would select a as the minimum of all the expressions sqrt(B_i/A_i). If you have more computation power, you can argue as follows: in general, an ellipse at the origin is determined by 3 constants (a, b, phi) or if you allow the center to move as well, it's determined by 5 constants. Thus in general, any choice of 3 (or 5) points will determine a unique ellipse passing thru them all. Also you can argue that except in special conditions, an ellipse will have to pass thru 3 (or 5) or else it is not smallest. Thus one possibility is to compute all the ellipses passing thru some points and check to see which ones contain all the other points. For N points, this will require N^4 resp N^6 operations, so this is not the sort of thing you'll want to do very often, but in critical situations it might be worth it. Possibly there are other efficient methods for polynomial optimization (linear, quadratic, ...) but I don't keep up with that area of research.