Date: Fri, 28 Apr 1995 17:17:42 +0200
From: khenry@dip1.ee.uct.ac.za (Karen Henry)
To: rusin@math.niu.edu
Subject: Re: smallest enclosing ellipse

hello Dave

I must admit, I didnt specify the problem very well.Sorry.

What I have is a tightly clustered set of points in a 2-dim planeThe distribution of the points will be in an 'elliptical shaped group'  - hence the desire to fit an ellipse, in particular, to the data .I want to fit an ellipse around these points.The ellipse must contain all data points within(some upon) its boundaries.The cluster will always be best represented by an ellipse,NOT a circle.
 
> What does it mean to 'fit an ellipse to points'? If the goal is to find
> an ellipse enclosing them all, simply compute the center of mass (average the
> coordinates), compute all the distances to this center (retaining the
> largest value found along the way) and output the circle with center at
> the center-of-mass with radius being that maximal distance. This requires
> two passes over the data points.
>
 The major axis may/may not be parallel to the coordinate axes - it,s orientation will be determined by the data points.This means that the above method of finding the centre of mass and using farthest dist to define radius(actually need foci length, or a,b dist from centre along major,minor axes) will not work in some cases.
It may be that your above method will have to be the first step.I will then have to test each point to see if it lies within this boundary , and adjust the ellipse accordingly.I was hoping that someone knew of a more efficient method than this, though.

 
 when one is better than another (smaller area? smaller eccentricity? 
> smaller perimeter?) 

I havent thought much about this; ideally the total distance from the ellipse boundary to the data points farthest from the ellipse centre(or rather closest to the ellipse boundary ) should have the minimum value for the 'best' fit ellipse ,much like a least-squares fit to a straight line.


 If you think the points actually lie on the curve of an ellipse
> (or if they are scattered a little around such a curve) then you can
> try to select the curve which is optimal in some way (perhaps in the
> least-squares sense) passing near the data. This is a linear regression
> problem and is easily handled with linear algebra or with standard
> statistical software.
> 
 None of the points must lie beyond the boundary of the ellipse.
 
I hope the above detail makes things a little clearer

thanks very much.

karen

==============================================================================

Date: Fri, 28 Apr 95 11:58:21 CDT
From: rusin (Dave Rusin)
To: khenry@dip1.ee.uct.ac.za, rusin@math.niu.edu
Subject: Re: smallest enclosing ellipse

> The major axis may/may not be parallel to the coordinate axes - it,s
>orientation will be determined by the data points.This means that the
>above method of finding the centre of mass and using farthest dist to
>define radius(actually need foci length, or a,b dist from centre along
>major,minor axes) will not work in some cases.  It may be that your
>above method will have to be the first step.I will then have to test
>each point to see if it lies within this boundary , and adjust the
>ellipse accordingly.I was hoping that someone knew of a more efficient
>method than this, though.

OK, try this: first average the points to get a good guess of the center.
Translate everything so the center is at the origin.

Now imagine the ellipse going around the points. It could be parameterized
as (x,y) = (a cos(theta+phi), b sin(theta+phi) ) for fixed constants  
a, b, and  phi, where  theta ranges from  0  to 2pi. Using the angle-addition
formulas, this can also be written
	(a*c cos(theta) - a*s sin(theta), b*s cos(theta) + b*c cos(theta) )
where  s=sin(phi) and c=cos(phi) are really just two constants satisfying
c^2+s^2=1.

All of your points will be inside this ellipse iff each point is closer to
the origin than the corresponding point on the ellipse: in other words, if
each point is expressed in the form  P_i = r_i (cos(theta_i), sin(theta_i)),
then you need  
	r_i^2 <= [a*c cos(theta_i) - a*s sin(theta_i)]^2 
		+ [b*s cos(theta_i) + b*c cos(theta_i)]^2
for all  i. It's probably easier to multiply thru by r_i^2 since then
you may write x_i for r_i cos(theta_i) and likewise for y_i. Thus the
condition that the ellipse enclose all the points is the condition that
	(x_i^2+y_i^2)^2 <= [a*c x_i - a*s y_i]^2 + [b*s x_i + b*c y_i]^2
for all points  P_i=(x_i, y_i). 

So now you've written this as a quadratic optimization problem: you
want to choose four variable a, b, c, s to minimize something
quadratic (probably a*b, which is essentially the area of the ellipse)
subject to a quadratic equation (c^2+s^2=1) and  N quartic inequalities
(each of the inequalities above). 

If you can guess or otherwise fix the orientation  phi, then  c  and  s
are also known, and each of the inequalities is simply
	K_i <= L_i A + M_i B
where  A=a^2, B=b^2, and  the coefficients can now be computed from the
point  P_i. If you don't mind changing your optimization function to 
something like  sqrt(a^2+b^2)  (which gets small if the ellipse is of
small area and not too far from a circle), then you have a linear
optimization problem: minimize  A+B subject to a bunch of linear
inequalities (including  A>=0, B>=0). The simplex algorithm works well in
practice, and ellipsoidal methods are known to produce an answer in 
polynomial time.

So you have to guess the center and the orientation and accept a screwy
notion of "best" ellipse; then you have a procedure to find that best one.

Probably in your shoes I would just punt and choose the point  P_0  which
is furthest from the origin, and then assume that's where the major axis
points. This would fix  c =  x_0/r_0,  s = -y_0/r_0. Moreover, I would
choose  b (the length of the semi-major axis) to be  r_0. Then each of the
previous inequalities simply becomes  a^2 A_i >= B_i, so I would select
a  as the minimum of all the expressions  sqrt(B_i/A_i).

If you have more computation power, you can argue as follows: in general,
an ellipse at the origin is determined by 3 constants (a, b, phi) or if
you allow the center to move as well, it's determined by  5  constants. Thus
in general, any choice of 3 (or 5) points will determine a unique ellipse
passing thru them all. Also you can argue that except in special conditions,
an ellipse will have to pass thru  3  (or  5)  or else it is not smallest.
Thus one possibility is to compute all the ellipses passing thru some points
and check to see which ones contain all the other points. For  N  points,
this will require N^4 resp N^6 operations, so this is not the sort of thing
you'll want to do very often, but in critical situations it might be worth it.
Possibly there are other efficient methods for polynomial optimization
(linear, quadratic, ...) but I don't keep up with that area of research.