AP Statistics Curriculum 2007 Normal Prob
Revision as of 22:44, 31 January 2008
General Advanced Placement (AP) Statistics Curriculum - Nonstandard Normal Distribution & Experiments: Finding Probabilities
Due to the Central Limit Theorem, the normal distribution is perhaps the most important model for studying various quantitative phenomena. Many numerical measurements (e.g., weight, time, etc.) can be well approximated by the normal distribution. While the mechanisms underlying natural processes may often be unknown, the use of the normal model can be theoretically justified by assuming that many small, independent effects are additively contributing to each observation.
General Normal Distribution
The (general) Normal distribution is a continuous distribution. The areas under its density over intervals symmetric about the mean μ, measured in units of the standard deviation σ, equal the areas under the Standard Normal density over the corresponding intervals symmetric about the origin:
- The area: μ − σ < x < μ + σ = 0.8413 − 0.1587 = 0.6826
- The area: μ − 2σ < x < μ + 2σ = 0.9772 − 0.0228 = 0.9544
- The area: μ − 3σ < x < μ + 3σ = 0.9987 − 0.0013 = 0.9974
- General Normal density function: f(x) = 1/(σ√(2π)) · e^(−(x − μ)²/(2σ²))
- See the special case of the Standard Normal distribution, where the mean is set to zero and the variance to one.
- The relation between the Standard and the General Normal distribution is provided by these simple linear transformations (suppose X denotes a General and Z a Standard Normal random variable):
- Z = (X − μ)/σ converts general normal scores to standard (Z) values.
- X = μ + Zσ converts standard scores to general normal values.
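The two linear transformations and the symmetric-interval areas above can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper names `to_z`, `to_x` and `central_area` are illustrative and not part of the SOCR materials. Note that the exact areas may differ in the fourth decimal from the values listed above, which are differences of table entries rounded to four places.

```python
from statistics import NormalDist


def to_z(x, mu, sigma):
    """Convert a general normal score x to a standard (Z) value."""
    return (x - mu) / sigma


def to_x(z, mu, sigma):
    """Convert a standard score z back to a general normal value."""
    return mu + z * sigma


def central_area(k, mu=0.0, sigma=1.0):
    """Area under N(mu, sigma^2) over mu - k*sigma < x < mu + k*sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


if __name__ == "__main__":
    # The area depends only on k, not on the particular mu and sigma.
    for k in (1, 2, 3):
        general = central_area(k, mu=130, sigma=20)
        standard = central_area(k)
        print(f"k={k}: general={general:.4f}, standard={standard:.4f}")
```

For example, `to_z(100, 130, 20)` gives the standard score of an observation of 100 from a Normal distribution with mean 130 and standard deviation 20, and `to_x` inverts it.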
Systolic Arterial Pressure Example
The SOCR Distributions help page may be useful for understanding the SOCR Distribution Applet.
Suppose that the average systolic blood pressure (SBP) for a Los Angeles freeway commuter follows a Normal distribution with mean 130 mmHg and standard deviation 20 mmHg. Denote X to be the random variable representing the SBP measure for a randomly chosen commuter. Then X ∼ N(μ = 130, σ² = 20²).
- Find the percentage of LA freeway commuters that have an SBP less than 100 mmHg. That is, compute the probability p = P(X < 100). (p ≈ 0.0668, i.e., about 6.7%)
- If normal SBP is defined by the range [110; 140], and we take a random sample of 1,000 commuters and measure their SBP, how many would be expected to have normal SBP? (Number = 1,000 × P(110 < X < 140) = 1,000 × 0.532807 = 532.807, i.e., about 533 commuters.)
- What is the 90th percentile for the SBP? That is, what is x_0 such that P(X < x_0) = 0.9?
- What is the range of SBP values that contains the central 80% of the SBPs for all commuters? That is, what are x_0 and x_1 such that P(x_0 < X < x_1) = 0.8 and (x_0 + x_1)/2 = μ = 130 (i.e., they are symmetric around the mean)? (x_0 ≈ 104, x_1 ≈ 156)
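The four questions above can also be worked without the applet. The sketch below uses Python's standard-library `statistics.NormalDist` as a stand-in for the SOCR Distribution Applet; the variable names are illustrative, and the results may differ slightly from applet output in the last digits.

```python
from statistics import NormalDist

# SBP model from the example: mean 130 mmHg, standard deviation 20 mmHg.
sbp = NormalDist(mu=130, sigma=20)

# 1. Proportion of commuters with SBP below 100: P(X < 100).
p_below_100 = sbp.cdf(100)

# 2. Expected count with normal SBP in a sample of 1,000: 1000 * P(110 < X < 140).
p_normal = sbp.cdf(140) - sbp.cdf(110)
expected_normal = 1000 * p_normal

# 3. 90th percentile: the value x_0 with P(X < x_0) = 0.9.
x_90 = sbp.inv_cdf(0.90)

# 4. Central 80%: cut 10% from each tail, which makes the interval symmetric about the mean.
x_lo, x_hi = sbp.inv_cdf(0.10), sbp.inv_cdf(0.90)

if __name__ == "__main__":
    print(f"P(X < 100)          = {p_below_100:.4f}")
    print(f"Expected normal SBP = {expected_normal:.1f} of 1000")
    print(f"90th percentile     = {x_90:.2f} mmHg")
    print(f"Central 80% range   = ({x_lo:.2f}, {x_hi:.2f}) mmHg")
```

Because the central 80% interval leaves 10% in each tail, its upper endpoint coincides with the 90th percentile from the third question.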
Assessing Normality
How can we tell if data collected from a process or experiment we observe is normally distributed? There are several methods for checking normality:
- Symmetry: Are the mean and median of the dataset approximately equal (Mean ≈ Median)?
- Plot a histogram, box-and-whisker plot or dotplot and look for bias (skewness), asymmetry, outliers, etc.
- Empirical Rule - check the percent of data that falls within 1, 2 and 3 SDs from the mean (should be approximately 68%, 95% and 99.7%).
- Or we can do a Quantile-Quantile Probability plot comparing the quantiles of the data against their Normal distribution counterparts.
- Why do we care if the data is normally distributed? Having evidence that the data we are analyzing is normally distributed allows us to use the (General) Normal distribution as a model to calculate the probabilities of various events and assess significant observations.
- Example: Suppose we are given the heights of 11 women. If we want to use the normal distribution to make inferences about women's heights, we first need to show that there is no evidence suggesting that the Normal and Data distributions are significantly distinct.
Height (in.) | 61.0 | 62.5 | 63.0 | 64.0 | 64.5 | 65.0 | 66.5 | 67.0 | 68.0 | 68.5 | 70.5 |
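The symmetry and Empirical Rule checks listed above can be applied to these 11 heights. This is a minimal sketch in standard-library Python; the helper `fraction_within` is an illustrative name, and using the sample standard deviation (divisor n − 1) is an assumption, since the text does not say which version to use. With only 11 observations the percentages are coarse, so they give informal evidence rather than a formal test.

```python
from statistics import mean, median, stdev

# Heights (in.) of the 11 women from the table above.
heights = [61.0, 62.5, 63.0, 64.0, 64.5, 65.0, 66.5, 67.0, 68.0, 68.5, 70.5]

m = mean(heights)      # sample mean
med = median(heights)  # sample median
s = stdev(heights)     # sample standard deviation (n - 1 divisor, an assumption)


def fraction_within(data, center, spread, k):
    """Fraction of observations within k spreads of the center."""
    return sum(abs(x - center) <= k * spread for x in data) / len(data)


# Compare against the Empirical Rule targets of about 68%, 95% and 99.7%.
fractions = {k: fraction_within(heights, m, s, k) for k in (1, 2, 3)}

if __name__ == "__main__":
    print(f"mean = {m:.2f}, median = {med:.2f}, SD = {s:.3f}")
    for k, frac in fractions.items():
        print(f"within {k} SD: {frac:.1%}")
```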
Normal Probability Plot
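A normal probability plot pairs each ordered observation with the Standard Normal quantile expected at its rank; points falling close to a straight line suggest no strong departure from normality. The sketch below computes those pairs for the 11 heights, assuming the plotting positions (i − 0.5)/n, which is one common convention and not necessarily the one used by the SOCR Quantile-Quantile chart. The correlation of the pairs is reported only as a rough summary of linearity, not as a formal test.

```python
from statistics import NormalDist, mean

# Heights (in.) of the 11 women from the example above.
heights = [61.0, 62.5, 63.0, 64.0, 64.5, 65.0, 66.5, 67.0, 68.0, 68.5, 70.5]


def normal_quantile_pairs(data):
    """Return (theoretical z quantile, ordered observation) pairs.

    Uses the plotting positions (i - 0.5) / n for i = 1..n (an assumed convention).
    """
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [(std_normal.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(ordered, start=1)]


def pearson_r(pairs):
    """Pearson correlation of the paired values."""
    zs, xs = zip(*pairs)
    mz, mx = mean(zs), mean(xs)
    sxy = sum((z - mz) * (x - mx) for z, x in pairs)
    szz = sum((z - mz) ** 2 for z in zs)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / (szz * sxx) ** 0.5


pairs = normal_quantile_pairs(heights)
r = pearson_r(pairs)

if __name__ == "__main__":
    for z, x in pairs:
        print(f"z = {z:+.3f}   height = {x:.1f}")
    print(f"correlation of the plotted points: {r:.4f}")
```

Plotting the height values against the z values (for instance in a spreadsheet or any charting tool) gives the normal probability plot itself.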
References
- SOCR Home page: http://www.socr.ucla.edu