From: "Thane M. Larson"
Subject: Re: Ratio of trial pass/fail to large populations
Date: Tue, 14 Dec 1999 11:58:14 -0800
Newsgroups: sci.math
Keywords: Incomplete Beta Function, pass/fail tests

Hi Kirk,

Being the friend at work, I thought I would respond with an answer to
your question.

The limits are given by the inverse beta function, qbeta(x,a,b) if you
are using Mathcad. For this problem, x = (confidence + 1)/2. The upper
bound will be

   Pmax(x,a1,b1) = qbeta(x,a1,b1)

and the lower bound will be

   Pmin(x,a2,b2) = 1 - qbeta(x,a2,b2)

where a1 = c+1, b1 = n-c, a2 = n-c+1, b2 = c, and c is the number of
failures in your experiment of n trials.

The incomplete beta function is the integral of the beta distribution,
namely:

            Integral { t^(a-1) * (1-t)^(b-1) dt } evaluated from 0 to x
I_x(a,b) = -----------------------------------------------------------
            Integral { t^(a-1) * (1-t)^(b-1) dt } evaluated from 0 to 1

qbeta(p,a,b) is the inverse of this function: it returns the x for
which I_x(a,b) = p.

For more information regarding the Incomplete Beta Function see:

   Section 26.5
   Handbook of Mathematical Functions
   Edited by M. Abramowitz and I.A. Stegun
   US Dept. of Commerce
   National Bureau of Standards
   Applied Mathematics Series 55

Thane Larson

Kirk Bresniker wrote:
> I'm working through some design of experiment problems with a friend at work and
> we're trying to find a reference or solution to the following problem:
>
> I have a population of units, P. Of the P units, a certain percentage will fail
> in testing. Unfortunately, I can't test every unit. I want to sample a much smaller
> number of units, s, which I will choose at random from P. What I am looking for
> is the relationship that for a given ratio of pass/fails from my test population
> what is the range of pass/fail ratios in the full population that could
> generate the experimental results, given a confidence interval on the results.
>
> Right now we have a graph, circa 1955, included as an appendix to a
> stats text, with no real information on how it was generated.
> The graph has the pass/fail ratio of the experiment on the x-axis and the
> pass/fail ratio of the full population on the y-axis. Above and below
> the x=y line are bands for different experiment populations. To use the
> graph I draw a vertical line at the pass/fail ratio for my experiment.
> Where this line intersects the bands which match my experimental
> population, I draw two horizontal lines. The points where these lines
> intersect the y-axis give the range of pass/fail in the general population
> that, with a confidence of 95%, could generate these results.
>
> Here is an example for a sample size of 15:
>
> [ASCII plot: pass/fail ratio of the full population (y-axis, 0% to 100%)
> against pass/fail ratio of the experiment (x-axis, 0% to 100%), showing
> the x=y line, the bands for a sample size of 15, and a vertical line at
> 50% with two horizontal lines drawn across to the y-axis.]
>
> So, if I get a 50% fail in my experiment on 15 units, I know with 95% confidence
> that the large population has a fail range between 15% and 80%. Looks like I
> should have chosen a bigger sample!
>
> We're interested in reproducing the graphs with different confidence assumptions,
> as well as seeing the effects of populations which are not 'very large', but
> rather on the order of three or four orders of magnitude larger than the
> test population.
>
> The one reference included on the graph leads to a muddled reference to
> the incomplete beta function, and we're still trying to follow up those
> leads.
>
> Kirk Bresniker
> kirkb@rose.hp.com
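
For anyone without Mathcad, the recipe in the reply can be sketched in plain Python. This is only an illustration under stated assumptions, not the method behind the 1955 chart. The names pbeta and qbeta are borrowed from Mathcad but reimplemented here, and failure_rate_bounds is a made-up helper name. Because a and b are whole numbers in this problem, the sketch evaluates the incomplete beta ratio through its equivalent binomial tail sum and inverts it by bisection. The handling of c = 0 and c = n (lower bound 0, upper bound 1) is an assumption, since the reply does not cover those cases.

```python
from math import comb


def pbeta(x, a, b):
    """Incomplete beta ratio I_x(a, b) for positive integers a and b.

    Assumption: integer a, b only, using the identity
    I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1.0 - x) ** (m - j) for j in range(a, m + 1))


def qbeta(p, a, b, tol=1e-12):
    """Inverse of pbeta in x, found by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if pbeta(mid, a, b) < p:
            lo = mid  # ratio too small, so the answer lies to the right
        else:
            hi = mid
    return (lo + hi) / 2.0


def failure_rate_bounds(c, n, confidence=0.95):
    """Return (Pmin, Pmax) for c failures observed in n trials."""
    x = (confidence + 1.0) / 2.0
    # Assumed edge cases: b would be 0, so the bound is pinned to 1 or 0.
    p_max = 1.0 if c == n else qbeta(x, c + 1, n - c)
    p_min = 0.0 if c == 0 else 1.0 - qbeta(x, n - c + 1, c)
    return p_min, p_max


if __name__ == "__main__":
    low, high = failure_rate_bounds(7, 15)
    print(f"7 failures in 15 trials: {low:.3f} to {high:.3f}")
```

Calling failure_rate_bounds over a range of c for a fixed n, at several confidence levels, gives the points needed to redraw the bands on the chart. Note that this covers only the 'very large' population case; it does not address the finite-population question.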