# AP Statistics Curriculum 2007 Contingency Fit


## General Advanced-Placement (AP) Statistics Curriculum - Multinomial Experiments: Chi-Square Goodness-of-Fit

The Chi-Square Goodness-of-Fit Test is used to test whether a data sample comes from a population with a specified distribution. The test is applied to binned data (data put into classes or categories). In most situations, a frequency histogram of the data may be obtained and the test applied to the bin frequencies. The test requires a sufficiently large sample for the Chi-Square approximation to be valid; a common rule of thumb is that every expected count should be at least 5.

The Kolmogorov-Smirnov Test is an alternative to the Chi-Square Goodness-of-Fit Test. However, the Kolmogorov-Smirnov Test is restricted to continuous distributions, whereas the Chi-Square Goodness-of-Fit Test may also be applied to discrete distributions such as the Binomial and the Poisson.

## Motivational Example

Mendel's Pea Experiment relates to the transmission of hereditary characteristics from parent organisms to their offspring; it underlies much of genetics. Suppose a tall offspring is the event of interest and that the true proportion of tall peas (based on a 3:1 phenotypic ratio) is 3/4, or p = 0.75. We would like to show that Mendel's data follow this 3:1 phenotypic ratio.

| | Observed (O) | Expected (E) |
|---|---|---|
| Tall | 787 | 798 |
| Dwarf | 277 | 266 |

## Calculations

Suppose there were N = 1064 data measurements with Observed(Tall) = 787 and Observed(Dwarf) = 277. These are the O’s (observed values). To calculate the E’s (expected values), we take the hypothesized proportions under $H_o$ and multiply them by the total sample size N: Expected(Tall) = (0.75)(1064) = 798 and Expected(Dwarf) = (0.25)(1064) = 266. As a quick check, the expected counts sum to the sample size: 798 + 266 = 1064 = N.
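The expected counts above can be reproduced with a few lines of Python. This is only a minimal sketch; the variable names (`n_total`, `hypothesized`, `expected`) are illustrative and do not come from the original material.

```python
# Expected counts under H_o for Mendel's pea data (values from the text).
n_total = 1064                                   # total sample size, N
hypothesized = {"Tall": 0.75, "Dwarf": 0.25}     # proportions under the 3:1 ratio

# Expected count for each category = hypothesized proportion * N
expected = {category: p * n_total for category, p in hypothesized.items()}

for category, count in expected.items():
    print(f"Expected({category}) = {count:g}")

# Sanity check: the expected counts must add up to N
print("Total expected:", sum(expected.values()))
```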

• The Hypotheses:
$H_o$: P(tall) = 0.75 (No effect, follows a 3:1 phenotypic ratio)
P(dwarf) = 0.25
$H_a$: P(tall) ≠ 0.75
P(dwarf) ≠ 0.25
• Test Statistic:
$\chi_o^2 = \sum_{\text{all categories}}{(O-E)^2 \over E} \sim \chi_{(df=number\_of\_categories - 1)}^2$
• Results:

For Mendel's Pea Experiment, the Chi-Square test statistic is:

$\chi_o^2 = {(787-798)^2 \over 798} + {(277-266)^2 \over 266} = 0.152+0.455=0.607$.
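p-value = $P(\chi_{(df=1)}^2 > \chi_o^2)=0.436$

Because this p-value is far above conventional significance levels (for example, 0.05), we fail to reject $H_o$: the observed counts are consistent with the 3:1 phenotypic ratio.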
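The test statistic and p-value can be checked numerically. The sketch below uses only the Python standard library and relies on the fact that, for one degree of freedom only, the upper-tail Chi-Square probability has the closed form $P(\chi_{(df=1)}^2 > x) = \mathrm{erfc}(\sqrt{x/2})$. The variable names are illustrative.

```python
import math

observed = [787, 277]      # Tall, Dwarf (from the text)
expected = [798, 266]      # 0.75 * 1064 and 0.25 * 1064

# Chi-square statistic: sum over categories of (O - E)^2 / E
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For df = 1 only: P(chi2_1 > x) = erfc(sqrt(x / 2)),
# so no statistics library is needed here.
p_value = math.erfc(math.sqrt(chi_sq / 2))

print(f"chi-square = {chi_sq:.3f}")
print(f"p-value    = {p_value:.3f}")
```

For more than two categories the closed form above no longer applies, and a Chi-Square distribution calculator or a statistics library would be needed for the p-value.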

## Examples

### Butterfly Hotspots

A hotspot is defined as a 10 km² area that is species rich (heavily populated by the species of interest). Suppose that in a study of butterfly hotspots in a particular region, 165 of a sample of 2,588 such 10 km² areas are butterfly hotspots. In theory, 5% of the areas should be butterfly hotspots. Do the data provide evidence that the proportion of butterfly hotspots exceeds the theoretical 5%? Test using α = 0.01.
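One possible way to set up this problem is sketched below. It treats the two categories (hotspot, not a hotspot) as a goodness-of-fit problem with df = 1. Because the question is directional, the sketch assumes that the two-sided p-value may be halved when the observed count exceeds the expected count; that choice, like the variable names, is an assumption of this sketch rather than something prescribed by the text.

```python
import math

n_areas = 2588        # sampled 10 km^2 areas (from the text)
n_hotspots = 165      # observed hotspots (from the text)
p_null = 0.05         # theoretical hotspot proportion under H_o
alpha = 0.01          # significance level (from the text)

observed = [n_hotspots, n_areas - n_hotspots]
expected = [p_null * n_areas, (1 - p_null) * n_areas]

# Chi-square statistic: sum over categories of (O - E)^2 / E
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Two-sided p-value for df = 1: P(chi2_1 > x) = erfc(sqrt(x / 2))
p_two_sided = math.erfc(math.sqrt(chi_sq / 2))

# Assumption: one-sided alternative (proportion > 5%). With df = 1 the
# statistic is a squared z-score, so the one-sided p-value is half the
# two-sided one when the observed count lies above the expected count.
if observed[0] > expected[0]:
    p_one_sided = p_two_sided / 2
else:
    p_one_sided = 1 - p_two_sided / 2

print(f"expected hotspots = {expected[0]:.1f}")
print(f"chi-square        = {chi_sq:.2f}")
print(f"one-sided p-value = {p_one_sided:.5f}")
print("Reject H_o" if p_one_sided < alpha else "Fail to reject H_o")
```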

### Cell-Phone Usage

Among 250 randomly selected cell phone users, is there evidence of a difference in area of home residence, defined as Northern California (North), Southern California (South), or Out of State (Out)? Without further information, suppose we have P(North) = 0.24, P(South) = 0.45, and P(Out) = 0.31. Is there any evidence suggesting different use of cell phones in these three groups of users?
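The text gives no observed counts for this example, so only the setup can be sketched. The code below assumes that the stated probabilities are the hypothesized proportions under $H_o$ and computes the corresponding expected counts. The helper `chi_square_three_groups` is a hypothetical name introduced here; it takes the observed counts as an argument once they are available. With three categories df = 2, and for df = 2 the upper-tail probability is exactly $e^{-x/2}$.

```python
import math

n_users = 250                                              # from the text
proportions = {"North": 0.24, "South": 0.45, "Out": 0.31}  # assumed to be H_o

# Expected count for each region = hypothesized proportion * sample size
expected = {region: p * n_users for region, p in proportions.items()}

for region, count in expected.items():
    print(f"Expected({region}) = {count:g}")


def chi_square_three_groups(observed_counts):
    """Return (chi-square statistic, p-value) for the three regions.

    `observed_counts` maps each region name to its observed count.
    The text does not supply these counts; they must come from the data.
    """
    chi_sq = sum(
        (observed_counts[region] - expected[region]) ** 2 / expected[region]
        for region in expected
    )
    # df = 3 - 1 = 2, and P(chi2_2 > x) = exp(-x / 2) exactly
    p_value = math.exp(-chi_sq / 2)
    return chi_sq, p_value
```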

### Brain Cancer

Suppose 200 randomly selected cancer patients were asked if their primary diagnosis was brain cancer and if they owned a cell phone before their diagnosis. The results are presented in the table below:

| Cell Phone Use | Brain Cancer: Yes | Brain Cancer: No | Total |
|---|---|---|---|
| Yes | 18 | 80 | 98 |
| No | 7 | 95 | 102 |
| Total | 25 | 175 | 200 |

Does there seem to be an association between brain cancer and cell phone use? Of the brain cancer patients, 18/25 = 0.72 owned a cell phone before their diagnosis. Thus P(CP|BC) = 0.72 is the estimated probability of owning a cell phone given that the patient has brain cancer.

Of the other cancer patients, 80/175 ≈ 0.46 owned a cell phone before their diagnosis. Thus P(CP|NBC) ≈ 0.46 is the estimated probability of owning a cell phone given that the patient has another cancer.
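The two conditional proportions can be recomputed from the table, and the apparent difference can be examined with a Chi-Square test of independence. The text itself stops at the proportions, so the test below is an illustrative extension under the usual assumptions (expected count = row total × column total / N, df = 1 for a 2×2 table). Variable names are illustrative.

```python
import math

# 2x2 table from the text: rows = cell phone use (Yes, No),
# columns = brain cancer (Yes, No)
table = [[18, 80],
         [7, 95]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Conditional proportions quoted in the text
p_cp_given_bc = table[0][0] / col_totals[0]     # 18 / 25
p_cp_given_nbc = table[0][1] / col_totals[1]    # 80 / 175

# Expected counts under independence: row total * column total / N
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic summed over all four cells
chi_sq = sum(
    (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2)
    for j in range(2)
)

# df = (2 - 1) * (2 - 1) = 1, so P(chi2_1 > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_sq / 2))

print(f"P(CP|BC)   = {p_cp_given_bc:.2f}")
print(f"P(CP|NBC)  = {p_cp_given_nbc:.2f}")
print(f"chi-square = {chi_sq:.3f}")
print(f"p-value    = {p_value:.4f}")
```

Even if the test indicates an association, it says nothing about causation.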

