# AP Statistics Curriculum 2007 Infer 2Means Indep

## General Advanced-Placement (AP) Statistics Curriculum - Inferences about Two Means: Independent Samples

In the previous section we discussed inference on two paired random samples. Now we show how to carry out inference on two independent samples.

### Independent Samples Designs

Independent samples designs are experimental or observational designs in which the measurements within each group are independent of one another and the two groups are independent of each other. The groups may be drawn from different populations with different distributional characteristics.

### Background

• Recall that for a random sample {$X_1, X_2, X_3, \cdots , X_n$} of the process, the population mean may be estimated by the sample average, $\overline{X_n}={1\over n}\sum_{i=1}^n{X_i}$.
• The standard error of $\overline{x}$ is given by ${{1\over \sqrt{n}} \sqrt{\sum_{i=1}^n{(x_i-\overline{x})^2\over n-1}}}$.
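The two background formulas can be checked numerically. The following is a minimal sketch, assuming the sample is held in a plain Python list; the function names and the data values are made up for illustration and are not part of the curriculum.

```python
import math

def sample_mean(xs):
    """Sample average: (1/n) * sum of the observations."""
    return sum(xs) / len(xs)

def standard_error(xs):
    """Standard error of the sample mean: s / sqrt(n),
    where s^2 uses the n - 1 denominator."""
    n = len(xs)
    mean = sample_mean(xs)
    sample_var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return math.sqrt(sample_var / n)

# Hypothetical data, chosen only to keep the arithmetic easy to follow.
data = [2.0, 4.0, 4.0, 6.0]
print("mean:", sample_mean(data))
print("standard error:", standard_error(data))
```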

### Analysis Protocol for Independent Designs

To study independent samples we examine the difference between the two group means. Suppose {$X_1, X_2, X_3, \cdots , X_{n_1}$} and {$Y_1, Y_2, Y_3, \cdots , Y_{n_2}$} represent the two independent samples, of sizes $n_1$ and $n_2$. We then compare the difference of the two sample means to the internal variation of the samples. If the two samples were drawn from populations with different centers, we would expect the two sample averages to differ by more than sampling variation alone can explain.

#### Large Samples

• Significance Testing: We have a standard null hypothesis $H_o: \mu_X - \mu_Y = \mu_o$ (e.g., $\mu_o = 0$). Then the test statistic is:
$Z_o = {\overline{x}-\overline{y}-\mu_o \over SE(\overline{x}-\overline{y})} \sim N(0,1)$.
$z_o= {\overline{x}-\overline{y}-\mu_o \over \sqrt{{1\over {n_1}} {\sum_{i=1}^{n_1}{(x_i-\overline{x})^2\over n_1-1}} + {1\over {n_2}} {\sum_{i=1}^{n_2}{(y_i-\overline{y})^2\over n_2-1}}}}$
• Confidence Intervals: A $(1-\alpha)100\%$ confidence interval for $\mu_X - \mu_Y$ is
$CI(\alpha): \overline{x}-\overline{y} \pm z_{\alpha\over 2} SE(\overline{x}-\overline{y})= \overline{x}-\overline{y} \pm z_{\alpha\over 2} \sqrt{{1\over {n_1}} {\sum_{i=1}^{n_1}{(x_i-\overline{x})^2\over n_1-1}} + {1\over {n_2}} {\sum_{i=1}^{n_2}{(y_i-\overline{y})^2\over n_2-1}}}$. Note that $SE(\overline{x} -\overline{y})=\sqrt{SE^2(\overline{x})+SE^2(\overline{y})}$, as the samples are independent. Also, $z_{\alpha\over 2}$ is the critical value of the Standard Normal distribution with upper-tail probability ${\alpha\over 2}$.
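The large-sample test and interval can be sketched in a few lines using only the Python standard library. This is an illustrative sketch, not a reference implementation: the function name `two_sample_z`, its parameters, and the data are assumptions, and the two-sided p-value is one common choice of alternative hypothesis. The toy samples are far too small for the Normal approximation and serve only to show the arithmetic.

```python
import math
from statistics import NormalDist, mean, variance

def two_sample_z(x, y, mu0=0.0, alpha=0.05):
    """Large-sample z statistic, two-sided p-value and (1 - alpha) CI
    for the difference of two independent means."""
    n1, n2 = len(x), len(y)
    diff = mean(x) - mean(y)
    # statistics.variance uses the n - 1 denominator, as in the text.
    se = math.sqrt(variance(x) / n1 + variance(y) / n2)
    z = (diff - mu0) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    ci = (diff - z_crit * se, diff + z_crit * se)
    return z, p_value, ci

# Hypothetical data for illustration only.
x = [10.0, 12.0, 14.0]
y = [8.0, 9.0, 10.0]
z, p, ci = two_sample_z(x, y)
print("z =", z, "p =", p, "CI =", ci)
```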

#### Small Samples

• Significance Testing: Again, we have a standard null hypothesis $H_o: \mu_X - \mu_Y = \mu_o$ (e.g., $\mu_o = 0$). Then the test statistic is:
$T_o = {\overline{x}-\overline{y}-\mu_o \over SE(\overline{x}-\overline{y})} \sim T(df)$.
The degrees of freedom are: $df={\left( SE^2(\overline{x})+SE^2(\overline{y}) \right)^2 \over {SE^4(\overline{x}) \over n_1-1} + {SE^4(\overline{y}) \over n_2-1} } \approx n_1+n_2-2$. In general $df \le n_1+n_2-2$, and the approximation is close when the two samples have similar sizes and variances.
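When reading critical values from a table, round the degrees of freedom down to the next smaller integer; this is the conservative choice, since it gives a slightly larger critical value.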

$t_o= {\overline{x}-\overline{y}-\mu_o \over \sqrt{{1\over {n_1}} {\sum_{i=1}^{n_1}{(x_i-\overline{x})^2\over n_1-1}} + {1\over {n_2}} {\sum_{i=1}^{n_2}{(y_i-\overline{y})^2\over n_2-1}}}}$
• Confidence Intervals: A $(1-\alpha)100\%$ confidence interval for $\mu_X - \mu_Y$ is
$CI(\alpha): \overline{x}-\overline{y} \pm t_{df, {\alpha\over 2}} SE(\overline{x}-\overline{y})= \overline{x}-\overline{y} \pm t_{df, {\alpha\over 2}} \sqrt{{1\over {n_1}} {\sum_{i=1}^{n_1}{(x_i-\overline{x})^2\over n_1-1}} + {1\over {n_2}} {\sum_{i=1}^{n_2}{(y_i-\overline{y})^2\over n_2-1}}}$. Note that $SE(\overline{x} -\overline{y})=\sqrt{SE^2(\overline{x})+SE^2(\overline{y})}$, as the samples are independent.
The degrees of freedom are: $df={\left( SE^2(\overline{x})+SE^2(\overline{y}) \right)^2 \over {SE^4(\overline{x}) \over n_1-1} + {SE^4(\overline{y}) \over n_2-1} } \approx n_1+n_2-2$.
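When reading critical values from a table, round the degrees of freedom down to the next smaller integer; this is the conservative choice, since it gives a slightly wider interval.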


Also, $t_{df, {\alpha\over 2}}$ is the critical value of Student's T distribution with $df$ degrees of freedom and upper-tail probability ${\alpha\over 2}$.
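The small-sample statistic and its degrees of freedom can be computed as in the sketch below. It is a minimal illustration under stated assumptions: the names `welch_t` and `welch_ci` and the data are hypothetical, and the critical value `t_crit` is supplied by the caller (for example from a T table), because the Python standard library has no Student's T quantile function. The degrees of freedom are returned unrounded.

```python
import math
from statistics import mean, variance

def welch_t(x, y, mu0=0.0):
    """Return (t statistic, degrees of freedom, SE of the difference)
    for two independent samples with possibly unequal variances."""
    n1, n2 = len(x), len(y)
    se2_x = variance(x) / n1  # SE^2 of the mean of x
    se2_y = variance(y) / n2  # SE^2 of the mean of y
    se = math.sqrt(se2_x + se2_y)
    t = (mean(x) - mean(y) - mu0) / se
    # SE^4 in the text is the square of SE^2.
    df = (se2_x + se2_y) ** 2 / (
        se2_x ** 2 / (n1 - 1) + se2_y ** 2 / (n2 - 1)
    )
    return t, df, se

def welch_ci(x, y, t_crit):
    """Confidence interval for the mean difference, given a critical
    value t_crit looked up by the caller for the chosen df and alpha."""
    _, _, se = welch_t(x, y)
    diff = mean(x) - mean(y)
    return diff - t_crit * se, diff + t_crit * se

# Hypothetical data for illustration only.
x = [10.0, 12.0, 14.0]
y = [8.0, 9.0, 10.0]
t, df, se = welch_t(x, y)
print("t =", t, "df =", df, "SE =", se)
```

If SciPy is available, `scipy.stats.t.ppf(1 - alpha / 2, df)` is one way to obtain the critical value to pass as `t_crit`.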