# AP Statistics Curriculum 2007 Hypothesis Var

Revision as of 23:10, 6 February 2008, by IvoDinov (older revision: 19:01, 14 June 2007).

## General Advance-Placement (AP) Statistics Curriculum - Testing a Claim about a Standard Deviation or Variance

Assessing the amount of variation in a process, natural phenomenon or experiment is of paramount importance in many fields. For instance, a computer manufacturer may reject a batch of computer chips if their clock speed, heat emissions or energy consumption vary by more than certain tolerance levels.

### Background

Recall that the sample-variance ($s^2$) is an unbiased point estimate for the population variance $\sigma^2$, and similarly, the sample-standard-deviation ($s$) is a good point estimate for the population-standard-deviation $\sigma$.

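For a random sample of size $n$ from a Normal population, the scaled sample-variance follows a Chi-square distribution with $n-1$ degrees of freedom (for other populations this holds only roughly):

$\chi_o^2 = {(n-1)s^2 \over \sigma^2} \sim \chi_{(df=n-1)}^2$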
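The distributional claim above can be checked empirically. The following is a minimal simulation sketch, not part of the original curriculum: the function name `simulate_scaled_variances`, the sample size, the value of $\sigma$ and the number of replications are all illustrative assumptions. It relies on the known facts that a Chi-square variable with $df$ degrees of freedom has mean $df$ and variance $2\,df$.

```python
import random
import statistics


def simulate_scaled_variances(n, sigma, reps, seed=0):
    """Return (n-1)*s^2/sigma^2 for `reps` Normal samples of size n.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        s2 = statistics.variance(sample)  # unbiased sample variance (divides by n-1)
        values.append((n - 1) * s2 / sigma ** 2)
    return values


# Assumed settings: n = 10 observations per sample, sigma = 3, 5000 replications.
n, sigma = 10, 3.0
scaled = simulate_scaled_variances(n, sigma, reps=5000)

sim_mean = statistics.fmean(scaled)
sim_var = statistics.pvariance(scaled)

print(f"simulated mean     = {sim_mean:.2f} (Chi-square mean is df = {n - 1})")
print(f"simulated variance = {sim_var:.2f} (Chi-square variance is 2*df = {2 * (n - 1)})")
```

The simulated mean and variance should land close to $n-1$ and $2(n-1)$, respectively, which is consistent with a $\chi^2_{(df=n-1)}$ distribution.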

### Testing a Claim about the Variance ($\sigma^2$)

For Normally distributed random variables, to test $H_o: \sigma^2 = \sigma_o^2$ vs. $H_1: \sigma^2 \not= \sigma_o^2$ we use the fact that, under $H_o$, the test statistic ${(n-1) s^2 \over \sigma_o^2}$ has a $\chi^2_{(df=n - 1)}$ distribution, where $s^2$ is the sample variance.

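Notice that the Chi-square distribution is not symmetric (it is positively skewed); therefore, the two critical values for each level of confidence $(1-\alpha)$ must be determined separately. The value $\chi_L^2$ represents the left-tail critical value and $\chi_R^2$ represents the right-tail critical value, each cutting off a tail area of ${\alpha \over 2}$, and $H_o$ is rejected when the test statistic falls below $\chi_L^2$ or above $\chi_R^2$. For various degrees of freedom and areas, you can compute all critical values either using the [SOCR Chi-Square Distribution](http://socr.ucla.edu/htmls/SOCR_Distributions.html) or using the [SOCR Chi-square distribution calculator](http://socr.ucla.edu/Applets.dir/Normal_T_Chi2_F_Tables.htm).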
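For readers without access to the SOCR tools, the critical values can also be approximated numerically. The sketch below is an illustration under stated assumptions, not the SOCR implementation: `chi2_cdf` and `chi2_quantile` are hypothetical helper names, the cumulative distribution is evaluated with the standard series for the regularized lower incomplete gamma function, and the quantile is found by simple bisection.

```python
import math


def chi2_cdf(x, df):
    """P(X <= x) for X ~ Chi-square(df), using the incomplete-gamma series."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    for k in range(1, 10000):
        term *= z / (a + k)
        total += term
        if term < total * 1e-16:  # remaining terms are negligible
            break
    return math.exp(-z + a * math.log(z) - math.lgamma(a)) * total


def chi2_quantile(p, df):
    """Value x with P(X <= x) = p, found by bisection on the CDF."""
    lo, hi = 0.0, df + 10.0 * math.sqrt(2.0 * df) + 10.0  # generous upper bound
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Two-sided critical values for alpha = 0.05 and df = 29 (a sample of size 30).
alpha, df = 0.05, 29
chi_left = chi2_quantile(alpha / 2, df)
chi_right = chi2_quantile(1 - alpha / 2, df)
print(f"chi_L^2 = {chi_left:.3f}, chi_R^2 = {chi_right:.3f}")
```

Bisection is slower than specialized quantile algorithms, but it is easy to verify and needs only the standard library; the two critical values are clearly not equidistant from the mean $df$, which reflects the skewness noted above.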

• Example: A random sample of size 30 drawn from a Normal distribution has sample-variance $s^2 = 5$. Test at the $\alpha = 0.05$ level of significance if this is consistent with $H_o: \sigma^2 = 2$.
Test statistic: $\chi_o^2 = {29\times 5 \over 2} = 72.5$.
Left and Right Chi-Square critical values (for $\alpha = 0.05$ and $df = 29$) are $\chi_L^2=16.047$ and $\chi_R^2=45.722$. Since $\chi_o^2 = 72.5 > \chi_R^2=45.722$, $H_o$ is rejected at the $\alpha = 0.05$ level of significance.
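The decision rule in this example can be written as a few lines of code. This is a minimal sketch; the function name `variance_test` and its signature are assumptions made for illustration, and the critical values are passed in as looked-up constants rather than computed.

```python
def variance_test(n, s2, sigma0_sq, chi_left, chi_right):
    """Two-sided Chi-square test of H_o: sigma^2 = sigma0_sq.

    Returns the observed statistic and whether H_o is rejected.
    Hypothetical helper for illustration only.
    """
    chi_obs = (n - 1) * s2 / sigma0_sq
    reject = chi_obs < chi_left or chi_obs > chi_right
    return chi_obs, reject


# Numbers from the example: n = 30, s^2 = 5, H_o: sigma^2 = 2, alpha = 0.05.
chi_obs, reject = variance_test(n=30, s2=5.0, sigma0_sq=2.0,
                                chi_left=16.047, chi_right=45.722)
print(f"chi_o^2 = {chi_obs}, reject H_o: {reject}")
```

Keeping the critical values as explicit arguments makes the same function usable for any $\alpha$ and sample size once the table lookup is done.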

### Testing a Claim about the Standard Deviation (σ)

As the standard deviation is just the square root of the variance ($\sigma = |\sqrt{\sigma^2}|$), we do significance testing for the standard deviation analogously.

For Normally distributed random variables, to test $H_o: \sigma = \sigma_o$ vs. $H_1: \sigma \not= \sigma_o$ we again use the fact that, under $H_o$, ${(n-1) s^2 \over \sigma_o^2}$ has a $\chi^2_{(df=n - 1)}$ distribution, where $s^2$ is the square of the sample standard deviation.

### Hands-on activities

• Formulate appropriate hypotheses and assess the significance of the evidence to reject the null hypothesis for the population standard deviation ($\sigma$), assuming the observations below represent a random sample of the liquid content (in fluid ounces) of 16 beverage cans and can be considered as Normally distributed. Use a 90% level of confidence ($\alpha = 0.1$).
 14.816 14.863 14.814 14.998 14.965 14.824 14.884 14.838 14.916 15.021 14.874 14.856 14.860 14.772 14.980 14.919
• Hypotheses: $H_o: \sigma = 0.04$ vs. $H_1: \sigma \not= 0.04$ .
• Get the sample statistics from [SOCR Charts](http://socr.ucla.edu/htmls/SOCR_Charts.html) (e.g., Index Plot): Sample-Mean=14.8875, Sample-SD=0.072700298, Sample-Var=0.005285333.
• Identify the degrees of freedom ($df = n - 1 = 15$) and the tail area (${\alpha \over 2}=0.05$ in each tail, as we are looking for a $(1-\alpha)100\%$ confidence interval, CI($\sigma$)).
• Find the left and right critical values, $\chi_L^2=7.261$ and $\chi_R^2=24.9958$, as in the image below.
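• CI($\sigma^2$), using ${(n-1)s^2 \over \chi_R^2} \leq \sigma^2 \leq {(n-1)s^2 \over \chi_L^2}$ with $s^2 \approx 0.0053$:
${15\times 0.0053 \over 24.9958} \leq \sigma^2 \leq {15\times 0.0053 \over 7.261}$, i.e., approximately $0.0032 \leq \sigma^2 \leq 0.0109$
• CI($\sigma$)
$\sqrt{15\times 0.0053 \over 24.9958} \leq \sigma \leq \sqrt{15\times 0.0053 \over 7.261}$, i.e., approximately $0.056 \leq \sigma \leq 0.105$
• Conclusion: the hypothesized value $\sigma_o = 0.04$ lies below the lower limit of CI($\sigma$), so $H_o$ is rejected at the $\alpha = 0.1$ level of significance.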
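The steps of this activity can be reproduced with the Python standard library. The sketch below is one possible way to organize the calculation, not the SOCR workflow: the variable names are illustrative assumptions, the critical values are copied from the activity rather than computed, and the unrounded sample variance is used, so later digits may differ slightly from hand calculations that round $s^2$.

```python
import math
import statistics

# Liquid content (fluid ounces) of the 16 beverage cans from the activity.
cans = [14.816, 14.863, 14.814, 14.998, 14.965, 14.824, 14.884, 14.838,
        14.916, 15.021, 14.874, 14.856, 14.860, 14.772, 14.980, 14.919]

sigma0 = 0.04                          # hypothesized population SD under H_o
chi_left, chi_right = 7.261, 24.9958   # critical values for df = 15, alpha = 0.1

n = len(cans)
df = n - 1
mean = statistics.fmean(cans)
s2 = statistics.variance(cans)         # unbiased sample variance
s = math.sqrt(s2)                      # sample standard deviation

# Test statistic for H_o: sigma = sigma0 (same statistic as the variance test).
chi_obs = df * s2 / sigma0 ** 2
reject = chi_obs < chi_left or chi_obs > chi_right

# (1 - alpha)100% confidence intervals for the variance and the SD.
var_ci = (df * s2 / chi_right, df * s2 / chi_left)
sd_ci = (math.sqrt(var_ci[0]), math.sqrt(var_ci[1]))

print(f"mean = {mean:.4f}, s = {s:.6f}, s^2 = {s2:.9f}")
print(f"chi_o^2 = {chi_obs:.2f}, reject H_o: {reject}")
print(f"CI(sigma^2) = ({var_ci[0]:.4f}, {var_ci[1]:.4f})")
print(f"CI(sigma)   = ({sd_ci[0]:.4f}, {sd_ci[1]:.4f})")
```

The test and the confidence interval lead to the same decision: the observed statistic exceeds $\chi_R^2$ exactly when $\sigma_o$ falls below the lower confidence limit.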

### More examples

• You randomly select and measure the contents of 15 bottles of cough syrup. The results (in fluid ounces) are shown below. Use a 95% level of confidence to construct a confidence interval for the standard deviation ($\sigma$), assuming the contents of these cough syrup bottles are Normally distributed. Does this CI($\sigma$) suggest that the variation in the bottles is at an acceptable level if the population standard deviation of the bottles' contents should be less than 0.025 fluid ounces?
 4.211 4.246 4.269 4.241 4.260 4.293 4.189 4.248 4.220 4.239 4.253 4.209 4.300 4.256 4.290
• The gray whale has the longest annual migration distance of any mammal. Gray whales leave Baja California and western Mexico in the spring, migrating to the Bering and Chukchi seas for the summer months. Tracking a sample of 50 whales for a year provided a mean migration distance of 11,064 miles with a standard deviation of 860 miles. Construct a 90% confidence interval for the variance of the migration distances. Assume that the population of migration distances is Normally distributed.
• For the hot-dogs dataset, construct a 97% CI for the population standard deviation of the calorie and sodium contents, separately.