# AP Statistics Curriculum 2007 Limits LLN

Revision as of 22:12, 1 March 2008 (previous revision: 19:10, 15 February 2008).

## General Advanced-Placement (AP) Statistics Curriculum - The Law of Large Numbers

### Motivation

Suppose we repeat one experiment independently many times, and we are interested in the relative frequency of one event whose probability of occurring in each experiment is p. The ratio of the observed frequency of that event to the total number of repetitions converges towards p as the number of (identical and independent) experiments increases. This is an informal statement of the Law of Large Numbers (LLN).

For a more concrete example, suppose we study the average height of a class of 100 students. Compared to the average height of 3 randomly chosen students from this class, the average height of 10 randomly chosen students is most likely closer to the real average height of all 100 students. Because the sample of 10 is larger than the sample of 3, it is a better representation of the entire class. At one extreme, a sample of 99 of the 100 students will produce a sample average height almost exactly the same as the average height for all 100 students. At the other extreme, sampling a single student will give a highly variable estimate of the overall class average height.

### The Law of Large Numbers (LLN)

The [complete formal statements of the LLN are discussed here](http://en.wikipedia.org/wiki/Law_of_large_numbers). It is useful to draw the parallels between the formal LLN statements (in terms of sample averages) and the common interpretations of the LLN (in terms of probabilities of various events).

Suppose we observe the same process independently multiple times. Assume a binarized (dichotomous) function of the outcome of each trial is of interest (e.g., failure may denote the event that the continuous voltage measure < 0.5V, and the complement, success, that voltage ≥ 0.5V; this is the situation in electronic chips which binarize electric signals to 0 or 1). Researchers are often interested in the event of observing a success at a given trial, or in the number of successes in an experiment consisting of multiple trials. Let p = P(success) denote the probability of success at each trial. Then the ratio of the total number of successes to the number of trials (n) is the average $\overline{X_n}={1\over n}\sum_{i=1}^n{X_i}$, where $X_i = \begin{cases}0,& \texttt{failure},\\ 1,& \texttt{success}\end{cases}$ represents the outcome of the $i$-th trial. Thus, $\overline{X_n}=\hat{p}$, the ratio of the observed frequency of that event to the total number of repetitions, estimates the true p = P(success). Because the LLN applies to the sample average $\overline{X_n}$, it follows that $\hat{p}$ converges towards p as the number of (identical and independent) trials increases.
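The convergence of $\hat{p}$ can be made more tangible with a short, standard argument based on Chebyshev's inequality. This is only a sketch of the weak form of the LLN for this binary setting, not the complete formal statement; the tolerance $\varepsilon > 0$ is a symbol introduced here for illustration.

$$
\begin{aligned}
E(\hat{p}) &= \frac{1}{n}\sum_{i=1}^n E(X_i) = p, \qquad
Var(\hat{p}) = \frac{1}{n^2}\sum_{i=1}^n Var(X_i) = \frac{p(1-p)}{n}, \\
P\left(|\hat{p}-p| \ge \varepsilon\right) &\le \frac{Var(\hat{p})}{\varepsilon^2}
= \frac{p(1-p)}{n\,\varepsilon^2} \le \frac{1}{4\,n\,\varepsilon^2} \longrightarrow 0
\quad \text{as } n \rightarrow \infty .
\end{aligned}
$$

For instance, with $\varepsilon = 0.01$ the bound ${1 \over 4n\varepsilon^2}$ falls to 0.05 once $n \ge 50{,}000$, whatever the value of p. The bound is conservative, so in practice far fewer trials are usually enough.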

### SOCR LLN Activity

Go to SOCR Experiments and select the Coin Toss LLN Experiment from the drop-down list of experiments in the top-left panel. This applet consists of a control toolbar at the top, followed by a graph panel in the middle and a results table at the bottom. Use the toolbar to flip coins one at a time, 10, 100, or 1,000 at a time, or continuously. The toolbar also allows you to stop or reset an experiment and to select the probability of Heads (p) using the slider. The graph panel in the middle dynamically plots the values of the two variables of interest (the proportion of Heads and the difference between the numbers of Heads and Tails). The outcome table at the bottom presents the summaries of all trials of this experiment.
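For readers without access to the applet, the following minimal Python sketch mimics the two quantities plotted in the graph panel. It is not the SOCR implementation; the function name `coin_toss_lln` and its parameters are hypothetical and chosen only for this illustration.

```python
import random


def coin_toss_lln(n_tosses, p=0.5, seed=0):
    """Simulate n_tosses coin flips with P(Heads) = p.

    Returns two lists of length n_tosses:
      proportions[i] = proportion of Heads after i + 1 tosses
      differences[i] = (number of Heads) - (number of Tails) after i + 1 tosses
    (Hypothetical helper, not part of SOCR.)
    """
    rng = random.Random(seed)  # seeded generator, so runs are reproducible
    heads = 0
    proportions = []
    differences = []
    for i in range(1, n_tosses + 1):
        if rng.random() < p:  # Heads with probability p
            heads += 1
        tails = i - heads
        proportions.append(heads / i)
        differences.append(heads - tails)
    return proportions, differences


if __name__ == "__main__":
    props, diffs = coin_toss_lln(100_000, p=0.5, seed=1)
    # Inspect the same checkpoints the applet's toolbar offers, and beyond.
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(f"n={n:>6}  proportion of Heads={props[n - 1]:.4f}  "
              f"Heads - Tails={diffs[n - 1]}")
```

Running the sketch should show the proportion of Heads settling near p, while the raw difference between Heads and Tails need not shrink at all: its typical size grows roughly like $\sqrt{n}$. The LLN is a statement about the proportion, not about the difference.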

### LLN Application

One application of the law of large numbers is that it yields practical algorithms for estimating transcendental numbers. The two best-known transcendental numbers are π and e.

The SOCR E-Estimate Experiment provides the complete details of this simulation. In a nutshell, we can estimate the value of e, the base of the natural logarithm, using random sampling from the Uniform distribution. Suppose $X_1, X_2, \cdots$ are drawn independently from the uniform distribution on (0, 1), and define $U= \min { \left \{ n : X_1+X_2+\cdots+X_n > 1 \right \} }$, the number of draws needed for the running sum to first exceed 1. Note that all $X_i \ge 0$, so the running sum never decreases.

Now, the expected value is $E(U) = e \approx 2.71828$. Therefore, by the LLN, the average of the values $\left \{ U_1, U_2, U_3, ..., U_k \right \}$, each computed from a fresh sequence of random draws $X_1, X_2, \cdots \sim Uniform(0,1)$ as described above, provides an increasingly accurate estimate of e as $k \rightarrow \infty$.
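The estimation procedure described above can be sketched in a few lines of Python. This is an illustrative sketch under the stated definitions, not the code behind the SOCR experiment; the helper names `draw_u` and `estimate_e` are hypothetical.

```python
import random


def draw_u(rng):
    """Return one value of U: the smallest n with X_1 + ... + X_n > 1,
    where the X_i are independent Uniform(0, 1) draws."""
    total = 0.0
    n = 0
    while total <= 1.0:
        total += rng.random()  # one Uniform[0, 1) draw
        n += 1
    return n


def estimate_e(k, seed=0):
    """Average k independent values of U; by the LLN this approaches E(U) = e."""
    rng = random.Random(seed)  # seeded generator, so runs are reproducible
    return sum(draw_u(rng) for _ in range(k)) / k


if __name__ == "__main__":
    for k in (10, 1_000, 100_000):
        print(f"k={k:>7}  estimate of e = {estimate_e(k, seed=1):.5f}")
```

Each value of U is at least 2, because a single draw from (0, 1) can never exceed 1. The estimate improves slowly: the standard error shrinks in proportion to $1/\sqrt{k}$, so each additional correct decimal digit requires roughly 100 times more samples.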

The Uniform E-Estimate Experiment, part of SOCR Experiments, provides a hands-on demonstration of how the LLN facilitates stochastic simulation-based estimation of e.
