# AP Statistics Curriculum 2007 Chi-Square


## General Advanced Placement (AP) Statistics Curriculum - Chi-Square Distribution

### Chi-Square Distribution

The Chi-Square distribution is used in chi-square tests for the goodness of fit of an observed distribution to a theoretical one, and for the independence of two criteria of classification of qualitative data. It is also used in confidence interval estimation of the population standard deviation of a normal distribution from a sample standard deviation. The Chi-Square distribution with $k$ degrees of freedom is a special case of the Gamma distribution, with shape $k/2$ and scale 2.

PDF:
$f(x;k)=\frac{1}{2^{k/2}\Gamma(k/2)}\; x^{k/2-1} e^{-x/2}$ for $x \in [0, +\infty)$, where $k$ is the number of degrees of freedom

CDF:
$\frac{1}{\Gamma(k/2)}\,\gamma\left(\frac{k}{2},\frac{x}{2}\right)$, where $\gamma(s,t)$ is the lower incomplete gamma function

Mean:
$k$

Median:
$\approx k\bigg(1-\frac{2}{9k}\bigg)^3$

Mode:
max{ k − 2, 0 }

Variance:
2k

Moments:
The $n$th raw moment for a distribution with $k$ degrees of freedom is: $2^n \frac{\Gamma\left(n+\tfrac{1}{2}k\right)}{\Gamma\left(\tfrac{1}{2}k\right)}$

The $n$th central moment is: $2^n U\left(-n,\,1-n-\tfrac{1}{2}k,\,-\tfrac{1}{2}k\right)$,

where U(a,b,x) is a confluent hypergeometric function of the second kind.
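
The summary values above can be checked numerically. The following is a minimal Python sketch, not part of the original SOCR material. The helper names (`chi2_cdf`, `median_approx`, `chi2_raw_moment`) are hypothetical, `k` stands for the degrees of freedom, and the CDF is evaluated with the standard power series for the regularized lower incomplete gamma function $P(k/2, x/2)$, which is one possible way to compute it and not necessarily how the SOCR calculator does.

```python
import math


def chi2_cdf(x, k, max_terms=1000):
    """Chi-square CDF, P(k/2, x/2), via the power series of the
    regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    s, t = k / 2.0, x / 2.0
    # gamma(s, t) = t^s e^{-t} * sum_{n>=0} t^n / (s (s+1) ... (s+n))
    term = 1.0 / s
    total = term
    for n in range(1, max_terms):
        term *= t / (s + n)
        total += term
        if term < 1e-16 * total:
            break
    # Divide by Gamma(s) in log space to avoid overflow.
    return total * math.exp(s * math.log(t) - t - math.lgamma(s))


def median_approx(k):
    """Approximate median k (1 - 2/(9k))^3."""
    return k * (1.0 - 2.0 / (9.0 * k)) ** 3


def chi2_raw_moment(n, k):
    """n-th raw moment 2^n * Gamma(n + k/2) / Gamma(k/2)."""
    return math.exp(n * math.log(2.0)
                    + math.lgamma(n + k / 2.0)
                    - math.lgamma(k / 2.0))


k = 11
mean = chi2_raw_moment(1, k)                 # should equal k
variance = chi2_raw_moment(2, k) - mean**2   # should equal 2k
cdf_at_median = chi2_cdf(median_approx(k), k)  # should be close to 0.5

print(f"mean = {mean:.6f}, variance = {variance:.6f}")
print(f"CDF at approximate median = {cdf_at_median:.4f}")
```

The first raw moment recovers the mean $k$, and the second raw moment minus the squared mean recovers the variance $2k$. Evaluating the CDF at the approximate median gives a value near 0.5, which is what the approximation is meant to achieve.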

### Applications

$\cdot$ Chi-Square goodness of fit

$\cdot$ Independence of two criteria of classification of qualitative data

$\cdot$ Confidence Interval estimation for a population standard deviation of a normal distribution from a sample standard deviation

$\cdot$ ANOVA: The F distribution is the distribution of the ratio of two independent chi-square random variables, each divided by its degrees of freedom
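
The relationship between the chi-square and F distributions in the last item can be illustrated by simulation. This is a rough sketch under stated assumptions: the degrees of freedom `d1 = 5` and `d2 = 10`, the seed, and the sample size are arbitrary illustrative choices, and the helper names are hypothetical. It draws chi-square variates as Gamma variates with shape $k/2$ and scale 2, forms the ratio, and compares the sample mean with the known mean of the F distribution, $d_2/(d_2-2)$ for $d_2 > 2$.

```python
import random


def chi2_sample(k, rng):
    """One chi-square(k) draw, using chi-square(k) = Gamma(shape=k/2, scale=2)."""
    return rng.gammavariate(k / 2.0, 2.0)


def f_sample(d1, d2, rng):
    """Ratio of two independent chi-square draws, each divided by its df."""
    return (chi2_sample(d1, rng) / d1) / (chi2_sample(d2, rng) / d2)


rng = random.Random(0)   # fixed seed so the run is repeatable
d1, d2 = 5, 10           # illustrative degrees of freedom
draws = [f_sample(d1, d2, rng) for _ in range(100_000)]

sample_mean = sum(draws) / len(draws)
theoretical_mean = d2 / (d2 - 2)   # mean of F(d1, d2), valid for d2 > 2

print(f"sample mean = {sample_mean:.3f}, theoretical mean = {theoretical_mean:.3f}")
```

With this many draws the sample mean should land close to 1.25, though the exact figure depends on the random number generator.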

### Example

Chi-Square Test for Goodness of Fit: There are 60 people in a statistics class, and we have data on their birth months. Our null hypothesis is that births are spread evenly across the 12 months, so the expected count for each month is 60/12 = 5. We can use a chi-square test with 12 - 1 = 11 degrees of freedom to compare the observed counts against our null hypothesis.

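| Birth Month | Observed | Expected | Residual (Obs - Exp) | (Obs - Exp)² | (Obs - Exp)² / Exp |
|---|---|---|---|---|---|
| Jan | 3 | 5 | -2 | 4 | 0.8 |
| Feb | 4 | 5 | -1 | 1 | 0.2 |
| Mar | 8 | 5 | 3 | 9 | 1.8 |
| April | 4 | 5 | -1 | 1 | 0.2 |
| May | 2 | 5 | -3 | 9 | 1.8 |
| June | 3 | 5 | -2 | 4 | 0.8 |
| July | 6 | 5 | 1 | 1 | 0.2 |
| Aug | 6 | 5 | 1 | 1 | 0.2 |
| Sept | 4 | 5 | -1 | 1 | 0.2 |
| Oct | 3 | 5 | -2 | 4 | 0.8 |
| Nov | 2 | 5 | -3 | 9 | 1.8 |
| Dec | 5 | 5 | 0 | 0 | 0 |
| Total | | | | | 8.8 |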

Our chi-square statistic is 8.8. At 11 degrees of freedom, the SOCR Chi-Square Distribution Calculator gives a cumulative (lower-tail) probability of about 0.36 for this value, so the p-value, which is the upper-tail probability $P(\chi^2_{11} \ge 8.8)$, is about 0.64. We do not reject our null hypothesis. The observed data do not show evidence of a non-uniform distribution of birth months.
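
The arithmetic of this example can be reproduced with a short script. This is a minimal sketch using only the Python standard library, not the SOCR calculator itself: the function name `chi2_cdf` is hypothetical, and the CDF is computed with a power series for the regularized lower incomplete gamma function. The statistic is $\chi^2 = \sum_i (O_i - E_i)^2 / E_i$. Note that the observed counts as printed in the table sum to 50 rather than 60; the sketch uses the table values and the expected count of 5 exactly as given.

```python
import math


def chi2_cdf(x, k, max_terms=1000):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    s, t = k / 2.0, x / 2.0
    term = 1.0 / s
    total = term
    for n in range(1, max_terms):
        term *= t / (s + n)
        total += term
        if term < 1e-16 * total:
            break
    return total * math.exp(s * math.log(t) - t - math.lgamma(s))


# Observed counts for Jan..Dec, copied from the table as printed.
observed = [3, 4, 8, 4, 2, 3, 6, 6, 4, 3, 2, 5]
expected = [5] * 12   # expected count per month under the null hypothesis

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

lower_tail = chi2_cdf(chi_sq, df)   # cumulative probability P(X <= chi_sq)
p_value = 1.0 - lower_tail          # upper-tail probability P(X >= chi_sq)

print(f"chi-square = {chi_sq:.1f}, df = {df}")
print(f"lower tail = {lower_tail:.2f}, upper tail (p-value) = {p_value:.2f}")
```

The lower-tail value is about 0.36 and the upper-tail value is about 0.64. Either way the statistic sits well inside the bulk of the distribution, so the null hypothesis is not rejected at conventional significance levels.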

http://www.distributome.org/ -> SOCR -> Distributions -> Distributome

http://www.distributome.org/ -> SOCR -> Distributions -> Chi-Square Distribution

http://www.distributome.org/ -> SOCR -> Functors -> Chi-Square Distribution

http://www.distributome.org/ -> SOCR -> Analyses -> Chi-Square Test Contingency Table

http://www.distributome.org/ -> SOCR -> Analyses -> Chi-Square Model Goodness-of-Fit Test

http://www.distributome.org/ -> SOCR -> Modeler -> ChiSquareFit_Modeler

SOCR Chi-Square Distribution Calculator (http://socr.ucla.edu/htmls/dist/ChiSquare_Distribution.html)