From: "Thane M. Larson" <thane_larson@hp.com>
Subject: Re: Ratio of trial pass/fail to large populations
Date: Tue, 14 Dec 1999 11:58:14 -0800
Newsgroups: sci.math
Keywords: Incomplete Beta Function, pass/fail tests

Hi Kirk,

Being the friend at work, I thought I would respond with an answer to your question.

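The limits are given by the inverse beta function, qbeta(x,a,b) if using Mathcad.
For this problem, x = (confidence + 1)/2, with the confidence expressed as a
fraction (95% confidence gives x = 0.975).  The upper bound will be
Pmax(x,a1,b1) = qbeta(x,a1,b1) and the lower bound will be
Pmin(x,a2,b2) = 1 - qbeta(x,a2,b2), where a1 = c+1, b1 = n-c, a2 = n-c+1, b2 = c,
and c is the number of failures in your experiment of n trials.  Both beta
parameters must be positive, so if c = 0 take Pmin = 0, and if c = n take Pmax = 1.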
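
For anyone without Mathcad, here is a minimal sketch of the same bounds in plain
Python. It is an illustration under stated assumptions, not the Mathcad
implementation: the names reg_inc_beta, qbeta and failure_rate_bounds are my own,
the incomplete beta function is evaluated through its binomial-tail identity
(valid only for the integer a and b that arise here), and the inverse is found by
simple bisection.

```python
from math import comb

def reg_inc_beta(x, a, b):
    """Regularized incomplete beta I_x(a, b) for positive INTEGER a, b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x)**(n - j) for j in range(a, n + 1))

def qbeta(p, a, b, tol=1e-12):
    """Inverse of reg_inc_beta in x, found by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if reg_inc_beta(mid, a, b) < p:
            lo = mid          # root lies above mid
        else:
            hi = mid          # root lies at or below mid
    return (lo + hi) / 2

def failure_rate_bounds(c, n, confidence=0.95):
    """Two-sided bounds on the population failure rate.

    c = failures observed, n = trials, confidence given as a fraction.
    """
    x = (confidence + 1) / 2
    # The beta parameters must be positive, so the edge cases are handled directly.
    upper = 1.0 if c == n else qbeta(x, c + 1, n - c)
    lower = 0.0 if c == 0 else 1 - qbeta(x, n - c + 1, c)
    return lower, upper

if __name__ == "__main__":
    low, high = failure_rate_bounds(7, 15)
    print(f"7 failures in 15 trials: {low:.3f} to {high:.3f}")
```

Calling failure_rate_bounds over c = 0..n for a fixed n and plotting both results
against c/n should give one pair of bands like those on the 1955 graph; changing
the confidence argument gives the bands for other confidence levels.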

The incomplete beta function I_x(a,b) is the integral of the beta distribution,
normalized to run from 0 to 1, namely:

                 Integral { t^(a-1) * (1-t)^(b-1)  dt } evaluated from 0 to x
I_x(a,b)     =  -----------------------------------------------------------
                 Integral { t^(a-1) * (1-t)^(b-1)  dt } evaluated from 0 to 1

qbeta is the inverse of this function: qbeta(p,a,b) is the value of x for which
I_x(a,b) = p.
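
The ratio of integrals above can also be evaluated directly by numerical
integration. The following is only a rough Python sketch using the midpoint rule;
the function name incomplete_beta and the step count are my own choices, and it
assumes a >= 1 and b >= 1 so that the integrand stays bounded at both ends.

```python
def incomplete_beta(x, a, b, steps=20000):
    """Ratio of the two integrals by the midpoint rule (assumes a >= 1, b >= 1)."""
    def integral(upper):
        # Midpoint rule for the integral of t^(a-1) * (1-t)^(b-1) over [0, upper].
        h = upper / steps
        return h * sum(
            ((i + 0.5) * h) ** (a - 1) * (1 - (i + 0.5) * h) ** (b - 1)
            for i in range(steps)
        )
    return integral(x) / integral(1.0)

if __name__ == "__main__":
    # With a = b = 1 the integrand is constant, so the ratio is just x.
    print(incomplete_beta(0.3, 1, 1))
```

The result rises from 0 at x = 0 to 1 at x = 1, which is why it can be inverted
to find the x that corresponds to a chosen probability.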


For more information regarding the Incomplete Beta Function see:

   Section 26.5
   Handbook of Mathematical Functions
   Edited by M. Abramowitz and I.A. Stegun
   US Dept. of Commerce
   National Bureau of Standards
   Applied Mathematics Series 55

Thane Larson

Kirk Bresniker wrote:

> I'm working through some design of experiment problems with a friend at work and
> we're trying to find a reference or solution to the following problem:
>
> I have a population of units, P. Of the P units, a certain percentage will fail
> in testing.  Unfortunately, I can't test every unit. I want to sample a much smaller
> number of units, s, which I will choose at random from P. What I am looking for
> is the relationship that for a given ratio of pass/fails from my test population
> what is the range of pass/fail ratios in the full population that could
> generate the experimental results, given a confidence interval on the results.
>
> Right now we have a graph, circa 1955, included as an appendix to a
> stats text, with no real information on how it was generated.  The graph
> has the pass/fail ratio of the experiment on the x-axis and the
> pass/fail ratio of the full population on the y-axis.  Above and below
> the x=y line are bands for different experiment populations.  To use the
> graph I draw a vertical line at the pass/fail ratio for my experiment.
> Where this line intersects the bands which match my experimental
> population, I draw two horizontal lines.  Where these lines intersect
> the y-axis represents the range of pass/fail in the general population
> that, with a confidence of 95%, could generate these results.
>
> Here is an example for a sample size of 15:
>
> 100%                                         +   +
>                                          +     .
>  90%                                 +       .
>                                  +         .
>  80%    ---------------------+           .       +
>                              |         .
>  70%                     +   |       .
>                              |     .
>  60%                 +       |   .           +
>                              | .
>  50%             +           .            +
>                            . |
>  40%         +           .   |
>                        .     |       +
>  30%                 .       |
>          +         .         |   +
>  20%             .           |
>          ------.-------------+
>  10%         .           +   |
>            .         +       |
>   0%     .   +   +           |
>
>          0   1   2   3   4   5   6   7   8   9   1
>          %   0   0   0   0   0   0   0   0   0   0
>              %   %   %   %   %   %   %   %   %   0
>                                                  %
> So, if I get a 50% fail in my experiment on 15 units, I know with 95% confidence
> that the large population has a fail range between 15% and 80%. Looks like I
> should have chosen a bigger sample!
>
> We're interested in reproducing the graphs with different confidence assumptions,
> as well as seeing the effects of populations which are not 'very large', but
> rather on the order of three or four orders of magnitude larger than the
> test population.
>
> The one reference included on the graph leads to a muddled reference to
> the incomplete beta function, and we're still trying to follow up those
> leads.
>
> Kirk Bresniker
> kirkb@rose.hp.com