AP Statistics Curriculum 2007 Normal Std


General Advance-Placement (AP) Statistics Curriculum - Standard Normal Variables and Experiments

Standard Normal Distribution

The Standard Normal Distribution is a continuous distribution with the following density and cumulative distribution functions:

• Standard Normal density function $f(x)= {e^{-x^2 \over 2} \over \sqrt{2 \pi}}.$
• Standard Normal cumulative distribution function $\Phi(y)= \int_{-\infty}^{y}{{e^{-x^2 \over 2} \over \sqrt{2 \pi}} dx}.$
• Why are these two functions, $f(x)$ and $\Phi(y)$, well-defined density and distribution functions, i.e., why is $\int_{-\infty}^{\infty} {f(x)dx}=1$? See the appendix below.

Note that the following approximate areas (rounded to four decimal places) are bounded between the Standard Normal Density Function and the x-axis on these symmetric intervals around the origin:

• The area over -1.0 < x < 1.0: $\Phi(1.0)-\Phi(-1.0)$ = 0.8413 - 0.1587 = 0.6826
• The area over -2.0 < x < 2.0: $\Phi(2.0)-\Phi(-2.0)$ = 0.9772 - 0.0228 = 0.9544
• The area over -3.0 < x < 3.0: $\Phi(3.0)-\Phi(-3.0)$ = 0.9987 - 0.0013 = 0.9974
• Note that the inflection points (f''(x) = 0) of the Standard Normal density function are at $x=\pm 1$.
• The Standard Normal distribution is also a special case of the more general normal distribution where the mean is set to zero and the variance is set to one. The Standard Normal distribution is often called the bell curve because the graph of its probability density resembles a bell.
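
The areas listed above can be reproduced numerically. The sketch below is one possible way to do so in Python, assuming the standard identity $\Phi(x)={1 \over 2}\left(1+\mathrm{erf}\left({x \over \sqrt{2}}\right)\right)$. The helper names `std_normal_cdf` and `central_area` are illustrative only and are not part of SOCR.

```python
import math

def std_normal_cdf(x: float) -> float:
    """Phi(x) for the Standard Normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def central_area(k: float) -> float:
    """Area under the Standard Normal density over the interval (-k, k)."""
    return std_normal_cdf(k) - std_normal_cdf(-k)

# Print the area over each symmetric interval used in the list above.
for k in (1.0, 2.0, 3.0):
    print(f"-{k} < x < {k}: area = {central_area(k):.4f}")
```

Computed in full precision, the three areas are approximately 0.6827, 0.9545 and 0.9973. The small differences from the values listed above arise because those values subtract table entries that were already rounded to four decimals.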

Experiments

Suppose we decide to test the state of 100 used batteries. To do that, we connect each battery to a volt-meter by randomly attaching the positive (+) and negative (-) battery terminals to the volt-meter's connections. Electrical current always flows from + to -, i.e., in the direction of the voltage drop. Depending on which way the battery is connected to the volt-meter, we can observe positive or negative voltage recordings (voltage is just a difference, which forces current to flow from the higher to the lower voltage). Denote $X_i$ = {measured voltage for battery i}; this is a random variable with mean 0 and unit variance. Assume the distribution of all $X_i$ is Standard Normal, $X_i \sim N(0,1)$. Use the Normal Distribution (with mean=0 and variance=1) in the SOCR Distribution applet to address the following questions. This Distributions help-page may be useful in understanding the SOCR Distribution Applet. How many batteries, from the sample of 100, can we expect to have:

• Voltage > 1? P(X>1) = 0.1587, thus we expect 15-16 batteries to have voltage exceeding 1.
• |Voltage| > 1? P(|X|>1) = 1 - 0.682689 = 0.3173, thus we expect 31-32 batteries to have absolute voltage exceeding 1.
• Voltage < -2? P(X<-2) = 0.0228, thus we expect 2-3 batteries to have voltage less than -2.
• Voltage <= -2? P(X<=-2) = 0.0228, thus we expect 2-3 batteries to have voltage less than or equal to -2. For a continuous distribution P(X=-2) = 0, so the answer is the same as for the strict inequality.
• -1.7537 < Voltage < 0.8465? P(-1.7537 < X < 0.8465) = 0.761622, thus we expect 76 batteries to have voltage in this range.
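
For readers without access to the SOCR applet, the probabilities and expected counts above can be approximated with a short script. This is only a sketch under the stated assumption $X_i \sim N(0,1)$; the names `std_normal_cdf`, `expected_count` and `questions` are hypothetical helpers introduced for illustration.

```python
import math

N_BATTERIES = 100  # sample size taken from the experiment described above

def std_normal_cdf(x: float) -> float:
    """Phi(x) for the Standard Normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_count(probability: float, n: int = N_BATTERIES) -> float:
    """Expected number of batteries (out of n) satisfying an event."""
    return n * probability

# Each question is mapped to the probability of the corresponding event.
questions = {
    "Voltage > 1": 1.0 - std_normal_cdf(1.0),
    "|Voltage| > 1": 1.0 - (std_normal_cdf(1.0) - std_normal_cdf(-1.0)),
    "Voltage < -2": std_normal_cdf(-2.0),
    "-1.7537 < Voltage < 0.8465": std_normal_cdf(0.8465) - std_normal_cdf(-1.7537),
}

for label, p in questions.items():
    print(f"{label}: P = {p:.4f}, expected batteries = {expected_count(p):.1f}")
```

The expected count is simply the sample size multiplied by the event probability, which is why a probability of about 0.16 corresponds to 15-16 batteries out of 100.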

Appendix

The derivation below illustrates why the standard normal density function, $f(x)= {e^{-x^2 \over 2} \over \sqrt{2 \pi}}$, represents a well-defined density function, i.e., $f(x)\ge 0$ and $\int_{-\infty}^{\infty} {f(x)dx}=1$.

• Clearly the exponential function $e^{-x^2 \over 2}$ is always non-negative (in fact, it is strictly positive for every real argument), so $f(x)\ge 0$.
• To show that $\int_{-\infty}^{\infty} {f(x)dx}=1$, let $A=\int_{-\infty}^{\infty} {f(x)dx}$. Then $A^2=\int_{-\infty}^{\infty} {f(x)dx}\times \int_{-\infty}^{\infty} {f(w)dw}$. Thus,
$A^2=\int_{-\infty}^{\infty} {\int_{-\infty}^{\infty} { {e^{-x^2 \over 2} \over \sqrt{2 \pi}} \times {e^{-w^2 \over 2} \over \sqrt{2 \pi}} dw}dx}$,
Change variables from Cartesian to polar coordinates:
$x=r\cos(\theta)$,
$w=r\sin(\theta)$, $0\le r<\infty$, $0\le \theta\le 2\pi$.
Hence,
$x^2+w^2=r^2$,
$e^{-x^2 \over 2} \times e^{-w^2 \over 2} =e^{-r^2 \over 2}$, and
$dx\,dw=r\,dr\,d\theta$ (the Jacobian determinant of the transformation is $r$).
Therefore, $A^2=\int_{0}^{2\pi} {\int_{0}^{\infty} {{e^{-r^2 \over 2} \over 2 \pi}r\,dr}\,d\theta}$, and
$A^2={1 \over 2\pi}\int_{0}^{2\pi} {d\theta} \times \int_{0}^{\infty} {e^{-r^2 \over 2}d{\frac{r^2}{2}}}=1$, since
$\int_{0}^{2\pi} {d\theta}=2\pi$, and
$\int_{0}^{\infty} {e^{-u}du}=1$, where $u={r^2 \over 2}$.
Because $f(x)>0$, we have $A>0$, so $A^2=1$ implies $A=1$.
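
As a numerical sanity check (not a proof), the total area under $f(x)$ can be approximated with the trapezoid rule. In the sketch below, the truncation of the real line to $[-10, 10]$ and the number of steps are assumptions chosen for illustration, and `std_normal_pdf` and `trapezoid_integral` are hypothetical helper names.

```python
import math

def std_normal_pdf(x: float) -> float:
    """Standard Normal density f(x) = exp(-x^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def trapezoid_integral(func, a: float, b: float, steps: int) -> float:
    """Approximate the integral of func over [a, b] with the trapezoid rule."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h

# The tails beyond +/-10 are negligible, so a finite interval is used here.
approx_area = trapezoid_integral(std_normal_pdf, -10.0, 10.0, 20_000)
print(f"Approximate total area: {approx_area:.6f}")
```

The printed value should agree with 1 to within the numerical error of the method, consistent with the polar-coordinate derivation.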