From: spellucci@mathematik.tu-darmstadt.de (Peter Spellucci)
Subject: Re: Numerical integration: a question
Date: 13 Apr 2000 09:25:38 GMT
Newsgroups: sci.math.num-analysis

In article <38F556B7.4AD3080E@NoSPAMeecs.umich.edu>,
 Thomas Kragh <tkraghNOSPAM@NoSPAMeecs.umich.edu> writes: 
|> If your data is given to you on a uniformly-sampled set of points, and
|> you do not have any other information about the function, I would say
|> that an iterated Simpson's Rule is probably your best bet.
|> 
|> Note that the "standard" Simpson's rule is >exact< for polynomials up to
|> 3rd-power, so fitting a cubic spline is a waste of time - the cubic
|> polynomial fit is "built into" the numerical integration algorithm
|> already.
This is not completely correct. Think about what Simpson's rule does: it interpolates
three consecutive points by a parabola, integrates this parabola exactly, and sums up.
By coincidence, if these three points come from a cubic, it integrates this cubic
exactly (because of the symmetry of weights and nodes with respect to the midpoint
of the interval).
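
A minimal sketch of that exactness property, assuming nothing beyond the basic
three-point rule; the test polynomials and the interval [0, 2] are arbitrary
choices made for illustration. Exact rational arithmetic is used so that the
comparison is not blurred by rounding.

```python
from fractions import Fraction

def simpson(f, a, b):
    """Basic three-point Simpson rule on [a, b]."""
    m = (a + b) / 2
    return (b - a) / 6 * (f(a) + 4 * f(m) + f(b))

def cubic(x):
    # Arbitrary cubic chosen for illustration.
    return x**3 - 2 * x**2 + 5 * x - 1

def quartic(x):
    return x**4

a, b = Fraction(0), Fraction(2)

# Antiderivative of the cubic: x^4/4 - 2x^3/3 + 5x^2/2 - x, evaluated on [0, 2].
exact_cubic = Fraction(20, 3)
# Integral of x^4 on [0, 2] is 2^5 / 5.
exact_quartic = Fraction(32, 5)

err_cubic = simpson(cubic, a, b) - exact_cubic
err_quartic = simpson(quartic, a, b) - exact_quartic

print("error on cubic:  ", err_cubic)    # zero: the rule is exact for cubics
print("error on quartic:", err_quartic)  # nonzero: exactness stops at degree 3
```

The parabola through the three nodes is not the cubic itself; only its integral
agrees, which is the symmetry effect described above.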
Now, what does the questioner's code do? It interpolates the data globally and
obtains a cubic  b e t w e e n  a n y  t w o  grid points, evaluates this
piecewise cubic on a refined grid with half the step size, and integrates that.
If the data are indeed smooth, the order of the error is O(h^4) in both cases.
But... assume that his data are subject to some (hopefully small) errors.
Then he can use a smoothing spline, do exactly the same thing, and will get
a much more meaningful result than by simply applying Simpson's rule to the raw
data. His question, whether there exists some "better" method, is hard to answer.
In principle one can use either piecewise integration by higher-order
Newton-Cotes formulae, or integration of an interpolating spline of higher
order (no problem to compute such), or smoothing splines of higher order
(also no problem in principle, but are there ready-to-use codes out there, say
for a fifth- or seventh-degree smoothing spline?).
But all this makes sense only if the errors in the data are very small, ideally
zero,  a n d   t h e   h i g h e r  d e r i v a t i v e s  of the underlying
function grow more slowly in magnitude, for order k, than (1/h)^k, where h is
his grid size.
For smooth data and high precision arithmetic, one could decide that on the
basis of the higher order divided differences of the data, but for data
subject to some noise this makes no sense.
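
A rough sketch of why the difference test breaks down under noise. On a
uniform grid the k-th divided difference equals the k-th forward difference
divided by k! h^k, so the forward differences are inspected directly. The test
function sin(x), the step h = 0.05, and the alternating-sign perturbation of
size eps = 1e-3 are all assumptions made for illustration (the alternating
pattern is a worst case, not a model of real measurement error).

```python
import math

def forward_differences(values, order):
    """Return the order-th forward differences of a uniformly sampled sequence."""
    d = list(values)
    for _ in range(order):
        d = [d[i + 1] - d[i] for i in range(len(d) - 1)]
    return d

h = 0.05                      # assumed grid size
xs = [i * h for i in range(21)]
smooth = [math.sin(x) for x in xs]

eps = 1e-3                    # hypothetical error level in the data
noisy = [s + eps * (-1) ** i for i, s in enumerate(smooth)]

for k in range(1, 5):
    ds = max(abs(v) for v in forward_differences(smooth, k))
    dn = max(abs(v) for v in forward_differences(noisy, k))
    print(f"order {k}: smooth {ds:.3e}   perturbed {dn:.3e}")

smooth_d4 = max(abs(v) for v in forward_differences(smooth, 4))
noisy_d4 = max(abs(v) for v in forward_differences(noisy, 4))
```

For the smooth samples the differences shrink roughly like h^k, while the
perturbation is amplified by up to 2^k, so the higher differences of the
perturbed data say nothing about the derivatives of the underlying function.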
hope that helps
peter
==============================================================================

From: "r.e.s." <rs.1@mindspring.com>
Subject: Simpson's paradox, continuous-case
Date: Sun, 16 Apr 2000 10:57:38 -0700
Newsgroups: sci.math,sci.stat.math

A general form of "Simpson's paradox" is as follows:

Random variables X,Y,Z, are distributed such that,
for y,z in the support of Y,Z,

     E(X|Y=y) is strictly increasing in y,
     but, for every z,
     E(X|Y=y,Z=z) is strictly decreasing in y.

(The distribution of (X,Y,Z) could be discrete,
continuous, or mixed.)

Are there especially nice & simple examples of this
in which E(X|Y=y) and E(X|Y=y,Z=z) are *continuous*
functions?

--r.e.s.
==============================================================================

From: Rich Ulrich <wpilib@pitt.edu>
Subject: Re: Simpson's paradox, continuous-case
Date: Mon, 17 Apr 2000 16:16:53 -0400
Newsgroups: sci.math,sci.stat.math

On Sun, 16 Apr 2000 10:57:38 -0700, "r.e.s." <rs.1@mindspring.com>
wrote:

> A general form of "Simpson's paradox" is as follows:
> 
> Random variables X,Y,Z, are distributed such that,
> for y,z in the support of Y,Z,
> 
 < snip;  x is correlated with y;  controlling for z, x is negatively
correlated with y .> 

> Are there especially nice & simple examples of this
> in which E(X|Y=y) and E(X|Y=y,Z=z) are *continuous*
> functions?


The number of firemen attending to a fire and the cost of damages are
pretty-much continuous variables.  Before you control for seriousness,
it does appear that firemen are responsible for damages.

Simpson's paradox is also known as "Ecological Fallacy,"  and I think
that the examples under that name may be more apt to be continuous.

-- 
Rich Ulrich, wpilib@pitt.edu
http://www.pitt.edu/~wpilib/index.html
==============================================================================

From: "r.e.s." <rs.1@mindspring.com>
Subject: Re: Simpson's paradox, continuous-case
Date: Tue, 18 Apr 2000 00:28:55 -0700
Newsgroups: sci.math,sci.stat.math

"Rich Ulrich" <wpilib@pitt.edu> wrote ...
| "r.e.s." <rs.1@mindspring.com> wrote:
|
| > A general form of "Simpson's paradox" is as follows:
| >
| > Random variables X,Y,Z, are distributed such that,
| > for y,z in the support of Y,Z,
| >
|  < snip; x is correlated with y;
|    controlling for z, x is negatively correlated with y.>

Of course the part that you snipped, viz.,

     E(X|Y=y) is strictly increasing in y,
     but, for every z,
     E(X|Y=y,Z=z) is strictly decreasing in y.

is more general than correlation effects.

(It reads as though you were saying they're equivalent.
I agree that specializing to correlations is a first
step toward getting a "nice & simple" (linear) example.)

| > Are there especially nice & simple examples of this
| > in which E(X|Y=y) and E(X|Y=y,Z=z) are *continuous*
| > functions?
|
| The number of firemen attending to a fire and the cost of
| damages are pretty-much continuous variables.  Before you
| control for seriousness, it does appear that firemen are
| responsible for damages.

The example seems to be this:

X=#firemen,
Y=cost of damages,
Z=some other measure of "seriousness".

It does seem reasonable that E(X|Y=y) might be increasing
in y;  however, does it seem likely to you that,
for fixed z, E(X|Y=y,Z=z) is strictly decreasing in y?

(Because of the likely strong associations among X, Y, Z,
it seems to me that conditioning on Z=z might very much
weaken, but not "reverse", the dependency on y.  So this
seems to be a case of "confounded" variates, and more an
example of the Ecological Fallacy than of a generalized
Simpson's paradox, which would involve the "reversal"
behavior.)

| Simpson's paradox is also known as "Ecological Fallacy,"
| and I think that the examples under that name may be more
| apt to be continuous.

Thanks, I hadn't seen that terminology.  But from what
I've read so far, the Ecological Fallacy doesn't require
the pseudo-paradoxical "reversal" behavior -- i.e. the
reversal from strictly increasing to strictly decreasing
y-dependency between E(X|Y=y) and E(X|Y=y,Z=z) -- that's
characteristic of Simpson's "paradox".

http://www2.chass.ncsu.edu/garson/pa765/datalevl.htm
has this to say about the term:

"Coined by Robinson (1950), the ecological fallacy is
assuming that individual-level correlations are the
same as aggregate-level correlations.  Robinson showed
that individual level correlations may be larger,
smaller, or even reverse in sign compared to aggregate
level correlations."

Elsewhere, the definition of "ecological fallacy" is
extended to refer to the use of aggregate data to draw
inferences about individuals -- a very broad definition,
indeed!

--r.e.s.
==============================================================================

From: Rich Ulrich <wpilib@pitt.edu>
Subject: Re: Simpson's paradox, continuous-case
Date: Tue, 18 Apr 2000 11:53:14 -0400
Newsgroups: sci.math,sci.stat.math

Concerning my response to his post,
on Tue, 18 Apr 2000 00:28:55 -0700, "r.e.s." <rs.1@mindspring.com>
wrote:
 < ... >
> The example seems to be this:
> 
> X=#firemen,
> Y=cost of damages,
> Z=some other measure of "seriousness".
> 
> It does seem reasonable that E(X|Y=y) might be increasing
> in y;  however, does it seem likely to you that,
> for fixed z, E(X|Y=y,Z=z) is strictly decreasing in y?
 < ... >
Yes, there is that reversal, so they go in opposite directions.

Pay attention to "reversal."  It is not hard to write a description of
"strictly increasing... decreasing"  from a concrete example, but it
is trickier to take the abstract and construct the concrete in terms
that still sound natural.   

Reversal:
 - For *fixed*  seriousness, expect less damage with more firemen.  
 - Whereas, overall, expect more damage with more firemen.

I snipped the E() lines the first time because I find them confusing;
now, I guess that r.e.s. did, too.  


< snip; concerning Ecological Fallacy, which I described as another
name for Simpson's Paradox -- > 

> http://www2.chass.ncsu.edu/garson/pa765/datalevl.htm
> has this to say about the term:
> 
> "Coined by Robinson (1950), the ecological fallacy is
> assuming that individual-level correlations are the
> same as aggregate-level correlations.  Robinson showed
> that individual level correlations may be larger,
> smaller, or even reverse in sign compared to aggregate
> level correlations."
> 
> Elsewhere, the definition of "ecological fallacy" is
> extended to refer to the use of aggregate data to draw
> inferences about individuals -- a very broad definition,
> indeed!

Very good, and very well;  I concede that the Ecological Fallacy is a
broader term than Simpson's Paradox.  I guess, the latter can provide
the harsh examples of the former.  I had not paid attention to that,
but in the future, I will try to be more specific with my description.

-- 
Rich Ulrich, wpilib@pitt.edu
http://www.pitt.edu/~wpilib/index.html
==============================================================================

From: israel@math.ubc.ca (Robert Israel)
Subject: Re: Simpson's paradox, continuous-case
Date: 18 Apr 2000 19:21:37 GMT
Newsgroups: sci.math,sci.stat.math

In article <8dcut8$pis$1@slb6.atl.mindspring.net>,
 "r.e.s." <rs.1@mindspring.com> writes:
> A general form of "Simpson's paradox" is as follows:
> 
> Random variables X,Y,Z, are distributed such that,
> for y,z in the support of Y,Z,
> 
>      E(X|Y=y) is strictly increasing in y,
>      but, for every z,
>      E(X|Y=y,Z=z) is strictly decreasing in y.
> 
> (The distribution of (X,Y,Z) could be discrete,
> continuous, or mixed.)
> 
> Are there especially nice & simple examples of this
> in which E(X|Y=y) and E(X|Y=y,Z=z) are *continuous*
> functions?

Suppose (Y,Z) have the joint density
f(y,z) = 2 for 0 <= z <= y <= 1
         0 otherwise

and X = Z - Y/4.  Given Y=y, Z is uniform on [0,y], so E[Z|Y=y] = y/2
and E[X|Y=y] = y/2 - y/4 = y/4 is strictly increasing in y, but
E[X|Y=y,Z=z] = z - y/4 is strictly decreasing in y.
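
A Monte Carlo sketch of this example, offered only as a sanity check. It
assumes that taking the larger and smaller of two independent uniforms on
[0,1] gives (Y,Z) with density 2 on 0 <= z <= y <= 1. The sample size, the
seed, the helper ols_slope, and the band 0.40 <= z < 0.45 used as a stand-in
for conditioning on Z=z are all choices made here for illustration.

```python
import random

def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 100_000

samples = []
for _ in range(n):
    u, v = rng.random(), rng.random()
    y, z = max(u, v), min(u, v)      # (Y, Z) on the triangle z <= y
    samples.append((y, z, z - y / 4))  # X = Z - Y/4

# Slope of X on Y over all samples: theory says E[X|Y=y] = y/4.
slope_all = ols_slope([s[0] for s in samples], [s[2] for s in samples])

# Slope of X on Y with Z held (nearly) fixed: theory says z - y/4.
band = [s for s in samples if 0.40 <= s[1] < 0.45]
slope_band = ols_slope([s[0] for s in band], [s[2] for s in band])

print("slope of X on Y, all data:     ", slope_all)   # should be near +1/4
print("slope of X on Y, Z in the band:", slope_band)  # should be near -1/4
```

The sign of the slope flips once Z is held approximately fixed, which is the
reversal asked for in the original question.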

Robert Israel                                israel@math.ubc.ca
Department of Mathematics        http://www.math.ubc.ca/~israel 
University of British Columbia            
Vancouver, BC, Canada V6T 1Z2