From: "Thane M. Larson"
Subject: Re: Ratio of trial pass/fail to large populations
Date: Tue, 14 Dec 1999 11:58:14 -0800
Newsgroups: sci.math
Keywords: Incomplete Beta Function, pass/fail tests
Hi Kirk,
Being the friend at work, I thought I would respond with an answer to your question.
The limits are given by the inverse of the cumulative beta distribution, which is
qbeta(x,a,b) in Mathcad.
For this problem, x = (confidence + 1)/2, with the confidence expressed as a
fraction (0.95 gives x = 0.975). With c = number of failures in your experiment
of n trials, the bounds on the failure fraction are:
  upper bound: Pmax(x,a1,b1) = qbeta(x,a1,b1)      where a1 = c+1,   b1 = n-c
  lower bound: Pmin(x,a2,b2) = 1 - qbeta(x,a2,b2)  where a2 = n-c+1, b2 = c
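For anyone without Mathcad, here is a minimal Python sketch of the same bounds using only the standard library. It assumes a and b are positive integers (true here, since c and n are counts), which lets the cumulative beta be written as a binomial tail sum and inverted by bisection. The names reg_inc_beta, qbeta and fail_rate_bounds are made up for this sketch and are not Mathcad's implementation; the c = 0 and c = n cases are pinned to 0 and 1 by assumption, since one beta parameter would otherwise be zero.

```python
from math import comb

def reg_inc_beta(x, a, b):
    """I_x(a, b) for positive integers a, b, via the binomial tail identity
    I_x(a, b) = P(Binomial(a+b-1, x) >= a)."""
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1 - x)**(m - j) for j in range(a, m + 1))

def qbeta(p, a, b, tol=1e-12):
    """Stand-in for Mathcad's qbeta: the x in [0, 1] with I_x(a, b) = p,
    found by bisection (I_x is increasing in x)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if reg_inc_beta(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def fail_rate_bounds(c, n, confidence=0.95):
    """Lower and upper bound on the failure fraction for c failures in n trials."""
    x = (confidence + 1) / 2
    upper = 1.0 if c == n else qbeta(x, c + 1, n - c)       # a1 = c+1,   b1 = n-c
    lower = 0.0 if c == 0 else 1 - qbeta(x, n - c + 1, c)   # a2 = n-c+1, b2 = c
    return lower, upper

# Tabulate the two bands for a sample of 15, one row per observed failure count.
for c in range(16):
    lower, upper = fail_rate_bounds(c, 15)
    print(f"{c:2d}/15 failed: {lower:.3f} to {upper:.3f}")
```

Changing the confidence argument (say 0.90 or 0.99) gives the bands for other confidence assumptions. Like the formulas above, this treats the population as very large compared with the sample.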
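The incomplete beta function is the integral of the beta density, normalized by
the complete integral, namely:

             Integral { t^(a-1) * (1-t)^(b-1) dt } evaluated from 0 to x
  I_x(a,b) = -----------------------------------------------------------
             Integral { t^(a-1) * (1-t)^(b-1) dt } evaluated from 0 to 1

qbeta(p,a,b) is the inverse of this function: the value of x for which
I_x(a,b) = p. In the bounds above, p is (confidence + 1)/2.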
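As a rough sanity check, the ratio of integrals above can be evaluated directly by numerical quadrature. The sketch below uses composite Simpson's rule and assumes a >= 1 and b >= 1 so the integrand stays finite at the endpoints; simpson and beta_ratio are names invented for this illustration.

```python
def simpson(f, lo, hi, steps=2000):
    """Composite Simpson's rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def beta_ratio(x, a, b):
    """Integral of t^(a-1) * (1-t)^(b-1) from 0 to x, divided by the same
    integral from 0 to 1. Assumes a >= 1 and b >= 1."""
    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)
    return simpson(integrand, 0.0, x) / simpson(integrand, 0.0, 1.0)

# With a = 1 and b = n the ratio has the closed form 1 - (1 - x)^n,
# which gives something to compare against.
print(beta_ratio(0.3, 1, 10), 1 - 0.7 ** 10)
```

The ratio rises from 0 at x = 0 to 1 at x = 1, so finding the x that gives a chosen value is a one-dimensional root search.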
For more information regarding the Incomplete Beta Function see:
Section 26.5
Handbook of Mathematical Functions
Edited by M. Abramowitz and I. A. Stegun
US Dept. of Commerce
National Bureau of Standards
Applied Mathematics Series 55
Thane Larson
Kirk Bresniker wrote:
> I'm working through some design of experiment problems with a friend at work and
> we're trying to find a reference or solution to the following problem:
>
> I have a population of units, P. Of the P units, a certain percentage will fail
> in testing. Unfortunately, I can't test every unit. I want to sample a much smaller
> number of units, s, which I will choose at random from P. What I am looking for
> is the relationship that for a given ratio of pass/fails from my test population
> what is the range of pass/fail ratios in the full population that could
> generate the experimental results, given a confidence interval on the results.
>
> Right now we have a graph, circa 1955, included as an appendix to a
> stats text, with no real information on how it was generated. The graph
> has the pass/fail ratio of the experiment on the x-axis and the
> pass/fail ratio of the full population on the y-axis. Above and below
> the x=y line are bands for different experiment populations. To use the
> graph I draw a vertical line at the pass/fail ratio for my experiment.
> Where this line intersects the bands which match my experimental
> population, I draw two horizontal lines. Where these lines intersect
> the y-axis gives the range of pass/fail in the general population
> that, with a confidence of 95%, could generate these results.
>
> Here is an example for a sample size of 15:
>
> [Figure: fail percentage in the experiment on the x-axis (0% to 100%)
> against fail percentage in the full population on the y-axis (0% to 100%),
> showing the x=y line and the upper and lower 95% bands for a sample size
> of 15, with a vertical line at 50% and horizontal lines where it meets
> the two bands.]
>
> So, if I get a 50% fail in my experiment on 15 units, I know with 95% confidence
> that the large population has a fail range between 15% and 80%. Looks like I
> should have chosen a bigger sample!
>
> We're interested in reproducing the graphs with different confidence assumptions,
> as well as seeing the effects of populations which are not 'very large', but
> rather on the order of three or four orders of magnitude larger than the
> test population.
>
> The one reference included on the graph leads to a muddled reference to
> the incomplete beta function, and we're still trying to follow up those
> leads.
>
> Kirk Bresniker
> kirkb@rose.hp.com