Date: Mon, 18 Mar 96 13:49:47 CST
From: rusin (Dave Rusin)
To: lasmith@athena.mit.edu
Subject: Re: A fun question on words formed from an alphabet without replacement
Newsgroups: sci.math
In article <4ic5lq$9ok@senator-bedfellow.MIT.EDU> you write:
>
>Given are n letters a1,a2,...,an.
>There is a chance p_i of choosing letter i.
>What is the chance that a k letter word formed from
>the letters *without replacement* (no letter used twice)
>misses letter an?
So if a_1 is selected first, with what probability will the various
letters be selected next? I assume you mean for the probability that
a_2 is selected next, for example, to be p_2/(p_2+...+p_n), i.e.,
the remaining letters will be selected with probabilities proportional
to their original probabilities, but of course scaled up to sum to 1.
I don't know if there is a _compact_ expression for the resulting
answer, but it's easy to describe. For example, when k=2, the
event you want to find the probability of is the union of these:
(a_i is selected first and then a_n is not selected second)
(for i=1, 2, ..., n-1). These n-1 events are mutually exclusive,
so we get a probability of
Sum p_i*(1-p_n/(1-p_i))
=Sum p_i - p_n Sum p_i/(1-p_i)
=1-p_n - p_n Sum [-1 + 1/(1-p_i) ]
= 1 + (n-1)p_n - p_n Sum[ 1/(1-p_i) ]
but that's as simple as I can make it. There is, for example, no way to
express this in terms of p_n only.
So you have to accept cumbersome answers. Once you've granted that, it's
possible to write down correct (and cumbersome) expressions for k>2, too.
Here's the same as above but for k=3:
Sum p_i*(p_j/(1-p_i))*(1-p_n/(1-p_i-p_j))
the sum taken over all pairs i<>j of integers in [1, ..., n-1] .
I won't even bother trying to manipulate it to something simpler.
I will concede that the resulting formulas will have to be rational
functions in p_1, ..., p_(n-1), and p_n, which are symmetric
with respect to the first n-1 parameters; thus in principle you can
express the answers as rational functions of p_n and a generating
set of symmetric functions of the p_i; for example, you should be
able to express the answer in terms of p_n and the "moments"
m_1=p_1+...+p_(n-1) (= 1 - p_n),
m_2=p_1^2+...+p_(n-1)^2,
...
m_(n-1) = p_1^(n-1)+...+p_(n-1)^(n-1)
or instead in terms of p_n and the "elementary symmetric functions"
s_1=m_1,
s_2= Sum( p_i p_j : i < j ),
...
s_(n-1) = Prod(p_i)
But it's not clear that these are any better unless in your application
some set of symmetric functions of the p_i is already in use.
dave