I have a statistics question.

I understand that there are several methods for obtaining a confidence
interval for a proportion under simple random sampling.  The usual method
(the normal approximation) is inappropriate for small samples and for
cases where the population proportion is close to 0 or 1; in such cases an
exact interval based on the hypergeometric distribution should be used.
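
To illustrate the problem with the normal approximation, here is a minimal
Python sketch.  The numbers (x=1 occurrence in a sample of n=20 from a
population of N=500) and the function name wald_ci are made up for
illustration, and the variance estimator with the finite population
correction is my assumption about the "usual method".  With these inputs
the lower limit comes out negative, which is impossible for a proportion.

```python
from math import sqrt

def wald_ci(x, n, N, z=1.96):
    """Normal-approximation CI for a proportion under SRS without replacement."""
    p = x / n
    # Assumed variance estimator with finite population correction:
    # (1 - n/N) * p * (1 - p) / (n - 1)
    se = sqrt((1 - n / N) * p * (1 - p) / (n - 1))
    return p - z * se, p + z * se

# Hypothetical example: 1 occurrence in a sample of 20 from 500 units.
lo, hi = wald_ci(1, 20, 500)
print(lo, hi)
```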

What methods are appropriate for sampling with unequal selection
probabilities (such as proportional to size sampling) and for stratified
random sampling?

From my reading of the documentation for PROC SURVEYFREQ, it looks like SAS
uses the normal approximation method.  Is this appropriate if the
population proportion is close to 0 or 1?  What about for small samples?

Background - The hypergeometric calculation produces a confidence interval
which is potentially asymmetric.  Let H(x, n-x, A, N-A) be the
hypergeometric probability of finding x occurrences in a sample of size n
drawn from a population of size N that contains A occurrences.

H(x, n-x, A, N-A) = [A choose x] * [(N-A) choose (n-x)] / [N choose n]

The upper 95% confidence limit for the population number of occurrences, A,
is given by finding the smallest integer Au such that
     sum from j=0 to x {H(j, n-j, Au, N-Au)} <= 0.025.
The lower 95% confidence limit for the population number of occurrences is
given by the largest integer Al such that
     sum from j=x to n {H(j, n-j, Al, N-Al)} <= 0.025.
The upper limit for the proportion is then Pu=Au/N, and the lower limit is
Pl=Al/N.
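
The calculation described above can be sketched in Python as follows.
This is only a rough illustration, not a tested implementation: the
function names, the linear search, the fallback values when no integer
satisfies a condition (Au=N, Al=0), and the example numbers are all my
own assumptions, and the lower proportion limit is taken to be Al/N by
analogy with Pu=Au/N.

```python
from math import comb

def H(j, n, A, N):
    """H(j, n-j, A, N-A): probability of j occurrences in a sample of n."""
    return comb(A, j) * comb(N - A, n - j) / comb(N, n)

def lower_tail(x, n, A, N):
    """sum from j=0 to x of H(j, n-j, A, N-A)."""
    return sum(H(j, n, A, N) for j in range(0, x + 1))

def upper_tail(x, n, A, N):
    """sum from j=x to n of H(j, n-j, A, N-A)."""
    return sum(H(j, n, A, N) for j in range(x, n + 1))

def hyper_ci(x, n, N, alpha=0.05):
    """Return (Al, Au) using the definitions given in the text."""
    half = alpha / 2
    # Au: smallest integer A with lower tail <= alpha/2.
    # Assumption: fall back to N if no such A exists (e.g. x == n).
    Au = next((A for A in range(N + 1)
               if lower_tail(x, n, A, N) <= half), N)
    # Al: largest integer A with upper tail <= alpha/2.
    # Assumption: fall back to 0 if no such A exists (e.g. x == 0).
    Al = next((A for A in range(N, -1, -1)
               if upper_tail(x, n, A, N) <= half), 0)
    return Al, Au

# Hypothetical example: 1 occurrence in a sample of 20 from 500 units.
x, n, N = 1, 20, 500
Al, Au = hyper_ci(x, n, N)
Pl, Pu = Al / N, Au / N
print(Al, Au, Pl, Pu)
```

The linear search is slow for large N but follows the definitions
directly; since the tail sums are monotone in A, a bisection search could
replace it.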

--Alex Cavallo
Navigant Consulting

