AP Statistics Curriculum 2007 Estim Proportion

From Socr

Revision as of 05:47, 4 February 2008 by IvoDinov (Talk | contribs)

General Advance-Placement (AP) Statistics Curriculum - Estimating a Population Proportion

Estimating a Population Proportion

When the sample size is large, the sampling distribution of the sample proportion \hat{p} is approximately Normal by the Central Limit Theorem (CLT), because the sample proportion can be written as the sample average of Bernoulli random variables. When the sample size is small, the normal approximation may be inadequate. To accommodate this, we modify the sample-proportion \hat{p} slightly and obtain the corrected-sample-proportion \tilde{p}:

\hat{p}={y\over n} \longrightarrow \tilde{p}={y+0.5z_{\alpha \over 2}^2 \over n+z_{\alpha \over 2}^2},

where y is the number of successes among the n observations and z_{\alpha \over 2} is the normal critical value we saw earlier.

The standard error of \hat{p} also needs a slight modification:

SE_{\hat{p}} =  \sqrt{\hat{p}(1-\hat{p})\over n} \longrightarrow SE_{\tilde{p}} =  \sqrt{\tilde{p}(1-\tilde{p})\over n+z_{\alpha \over 2}^2}.
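The two formulas above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical (they do not come from the SOCR materials), and the critical value z_{\alpha \over 2} is obtained from the standard library's `statistics.NormalDist` (Python 3.8 or later).

```python
from math import sqrt
from statistics import NormalDist


def critical_value(alpha):
    """Upper alpha/2 critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def corrected_proportion(y, n, z):
    """Corrected sample proportion: (y + 0.5 z^2) / (n + z^2)."""
    return (y + 0.5 * z**2) / (n + z**2)


def se_plain(y, n):
    """Standard error of the plain sample proportion y / n."""
    p_hat = y / n
    return sqrt(p_hat * (1 - p_hat) / n)


def se_corrected(y, n, z):
    """Standard error of the corrected sample proportion."""
    p_tilde = corrected_proportion(y, n, z)
    return sqrt(p_tilde * (1 - p_tilde) / (n + z**2))


z = critical_value(0.05)  # roughly 1.96 for a 95% interval
# Hypothetical small sample: 2 successes in 15 trials.
print("p_hat   =", 2 / 15, " SE =", se_plain(2, 15))
print("p_tilde =", corrected_proportion(2, 15, z), " SE =", se_corrected(2, 15, z))
```

Note that the correction pulls the estimate toward 0.5: when y = n/2 the corrected and plain proportions coincide.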

Confidence intervals for proportions

The confidence intervals for the population proportion, based on the sample proportion \hat{p} and on the corrected-sample-proportion \tilde{p}, are given by

\hat{p}\pm z_{\alpha\over 2} SE_{\hat{p}}
\tilde{p}\pm z_{\alpha\over 2} SE_{\tilde{p}}
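To illustrate why the correction matters for small samples, the following sketch computes the exact probability that each interval contains the true proportion when the number of successes is Binomial(n, p). The setting (n = 10, p = 0.1) and all function names are assumptions chosen for illustration, not part of the original curriculum.

```python
from math import comb, sqrt

Z = 1.96  # normal critical value for a 95% interval


def plain_interval(y, n, z=Z):
    """Interval p_hat +/- z * SE(p_hat)."""
    p_hat = y / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se


def corrected_interval(y, n, z=Z):
    """Interval p_tilde +/- z * SE(p_tilde)."""
    p_tilde = (y + 0.5 * z**2) / (n + z**2)
    se = sqrt(p_tilde * (1 - p_tilde) / (n + z**2))
    return p_tilde - z * se, p_tilde + z * se


def coverage(interval, n, p):
    """Exact probability that the interval contains p when Y ~ Binomial(n, p)."""
    total = 0.0
    for y in range(n + 1):
        lo, hi = interval(y, n)
        if lo <= p <= hi:
            total += comb(n, y) * p**y * (1 - p) ** (n - y)
    return total


n, p = 10, 0.1  # hypothetical small-sample setting
print("plain    :", round(coverage(plain_interval, n, p), 3))
print("corrected:", round(coverage(corrected_interval, n, p), 3))
```

In this setting the plain interval collapses to the single point 0 whenever no successes are observed, so it misses the true proportion in all of those samples and its coverage falls well below the nominal 95%. The corrected interval does not degenerate in this way.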

Example

Suppose a researcher is interested in studying the effect of aspirin in reducing heart attacks. He randomly recruits 500 subjects with evidence of early heart disease and has them take one aspirin daily for two years. At the end of the two years he finds that during the study only 17 subjects had a heart attack. Calculate a 95% (α = 0.05) confidence interval for the true (unknown) proportion of subjects with early heart disease that have a heart attack while taking aspirin daily. Note that z_{\alpha \over 2} = z_{0.025}=1.96:

\hat{p} = {17\over 500}=0.034 ; \tilde{p} = {17+0.5z_{0.025}^2\over 500+z_{0.025}^2}= {17+1.92\over 500+3.84}=0.038
SE_{\hat{p}}= \sqrt{0.034(1-0.034)\over 500}=0.0081; SE_{\tilde{p}}= \sqrt{0.038(1-0.038)\over 500+3.84}=0.0085

And the corresponding confidence intervals are given by

\hat{p}\pm 1.96 SE_{\hat{p}}=[0.0181, 0.0499]
\tilde{p}\pm 1.96 SE_{\tilde{p}}=[0.0213, 0.0547]
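The arithmetic of this example can be checked with a short script. This is a minimal sketch using only the numbers given in the example (17 events, 500 subjects, z = 1.96); the variable names are illustrative. Because the script carries full precision instead of rounding \tilde{p} to 0.038 before computing the standard error, its corrected interval may differ from the hand calculation in the third decimal place.

```python
from math import sqrt

y, n, z = 17, 500, 1.96  # values from the aspirin example

# Plain sample proportion and its standard error
p_hat = y / n
se_hat = sqrt(p_hat * (1 - p_hat) / n)

# Corrected sample proportion and its standard error
p_tilde = (y + 0.5 * z**2) / (n + z**2)
se_tilde = sqrt(p_tilde * (1 - p_tilde) / (n + z**2))

# 95% confidence intervals: estimate +/- z * SE
ci_hat = (p_hat - z * se_hat, p_hat + z * se_hat)
ci_tilde = (p_tilde - z * se_tilde, p_tilde + z * se_tilde)

print(f"p_hat   = {p_hat:.4f}  SE = {se_hat:.4f}  CI = [{ci_hat[0]:.4f}, {ci_hat[1]:.4f}]")
print(f"p_tilde = {p_tilde:.4f}  SE = {se_tilde:.4f}  CI = [{ci_tilde[0]:.4f}, {ci_tilde[1]:.4f}]")
```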

Sample-size estimation

For a given margin of error we can derive the minimum sample size that guarantees an interval estimate within that margin. Here the margin of error is taken to be the standard error of the corrected-sample-proportion:

SE_{\tilde{p}} = \sqrt{\tilde{p}(1-\tilde{p})\over n+z_{\alpha \over 2}^2}.

This equation has one unknown parameter (n), which we can solve for if we are given an upper limit for the margin of error:

SE_{\tilde{p}} \geq \sqrt{\tilde{p}(1-\tilde{p})\over n+z_{\alpha \over 2}^2} \longrightarrow n \geq {\tilde{p}(1-\tilde{p})\over SE_{\tilde{p}}^2} - z_{\alpha \over 2}^2.

Example

How many subjects are needed if the heart-researchers want SE < 0.005 for a 95% CI, and have a guess based on previous research that \tilde{p}= 0.04?

n \geq {0.04(1-0.04)\over 0.005^2} - 1.96^2=1532.16 \longrightarrow n=1533.
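The sample-size bound can also be evaluated programmatically. The sketch below assumes a hypothetical helper, `min_sample_size`, that rounds the bound up to the next whole subject; the name and the default z = 1.96 are illustrative choices, not part of the original text.

```python
from math import ceil, sqrt


def min_sample_size(p_guess, se_max, z=1.96):
    """Smallest whole n with sqrt(p(1-p) / (n + z^2)) <= se_max."""
    bound = p_guess * (1 - p_guess) / se_max**2 - z**2
    return max(1, ceil(bound))


n = min_sample_size(0.04, 0.005)
print("minimum n:", n)

# Sanity check: the standard error at this n should not exceed the target.
print("SE at n  :", sqrt(0.04 * (1 - 0.04) / (n + 1.96**2)))
```

Rounding up instead of to the nearest integer matters here: rounding down would leave the standard error slightly above the requested limit.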


