SOCR EduMaterials Activities LawOfLargeNumbers

Revision as of 18:47, 20 February 2007

SOCR Educational Materials - Activities - SOCR Law of Large Numbers Activity

This is part I of a heterogeneous activity that demonstrates the Law of Large Numbers (LLN). Part II of this activity contains more examples and diverse experiments.

Example

The average weight of 10 students sampled at random from a class of 100 is likely to be closer to the true average weight of all 100 students than the average weight of 3 randomly chosen students from the same class. This is because a sample of 10 is larger than a sample of 3 and therefore better represents the entire class. At one extreme, a sample of 99 of the 100 students will produce a sample average almost exactly equal to the average for all 100 students. At the other extreme, the weight of a single sampled student is a highly variable estimate of the overall class average weight.
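The example can be checked with a short simulation. The sketch below is only an illustration: the class of 100 weights is made up (drawn from a normal distribution with mean 70 kg and standard deviation 12 kg, both assumptions), and `mean_abs_error` is a hypothetical helper name. It repeatedly draws samples of a given size without replacement and reports how far, on average, the sample mean lands from the class mean.

```python
import random
import statistics

def mean_abs_error(weights, sample_size, repeats, rng):
    """Average distance between a sample mean and the full-class mean."""
    class_mean = statistics.fmean(weights)
    errors = []
    for _ in range(repeats):
        # rng.sample draws without replacement, like picking distinct students
        sample = rng.sample(weights, sample_size)
        errors.append(abs(statistics.fmean(sample) - class_mean))
    return statistics.fmean(errors)

rng = random.Random(2007)
# Hypothetical class of 100 students; the weights (in kg) are made-up numbers.
weights = [rng.gauss(70, 12) for _ in range(100)]

errors = {k: mean_abs_error(weights, k, 2000, rng) for k in (1, 3, 10, 99)}
for k, err in errors.items():
    print(f"sample size {k:>2}: typical error {err:.2f} kg")
```

The typical error should shrink as the sample size grows from 1 to 3 to 10 to 99, which is the pattern the example describes.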

Statement of the Law of Large Numbers

If an event of probability p is observed repeatedly during independent repetitions, the ratio of the observed frequency of that event to the total number of repetitions converges towards p as the number of repetitions becomes arbitrarily large.
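One common way to make this statement precise is the weak form of the law, sketched below. The symbols X_n (the number of times the event occurs in n independent repetitions) and ε (a tolerance) are notation introduced here for illustration and do not appear in the original statement.

```latex
% Weak law of large numbers for an event of probability p
\[
  \lim_{n\to\infty} P\!\left( \left| \frac{X_n}{n} - p \right| \ge \varepsilon \right) = 0
  \qquad \text{for every } \varepsilon > 0 .
\]
% Chebyshev's inequality gives an explicit (if conservative) rate,
% using Var(X_n / n) = p(1-p)/n:
\[
  P\!\left( \left| \frac{X_n}{n} - p \right| \ge \varepsilon \right)
  \;\le\; \frac{p(1-p)}{n\,\varepsilon^{2}}
  \;\le\; \frac{1}{4\,n\,\varepsilon^{2}} .
\]
```

The bound shows that the chance of the observed frequency missing p by more than ε can be made as small as desired by increasing n.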

Complete details about the LLN can be found here

Exercise 1

This exercise illustrates the statement and validity of the LLN in the situation of tossing (biased or fair) coins repeatedly. Let H and T denote Heads and Tails, and let the probabilities of observing a Head or a Tail at each trial be 0 < p < 1 and 0 < 1 − p < 1, respectively. The sample space of this experiment consists of sequences of H's and T's. For example, an outcome may be {H,H,T,H,H,T,T,T,....}. If we toss a coin n times, the size of the sample space is 2^n, because each of the n tosses has two possible results. Since the tosses are independent, the Binomial Distribution gives the probability of observing k Heads (0 ≤ k ≤ n) in n experiments, which is evaluated by the binomial probability mass function at k.
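As a small worked example of the binomial probabilities mentioned above, the following sketch evaluates P(k Heads in n tosses) = C(n, k) p^k (1 − p)^(n − k) with the standard library. The function name `binomial_pmf` and the choice n = 10, p = 0.5 are illustrative assumptions, not part of the SOCR applet.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k Heads in n independent tosses with P(Head) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(f"number of possible H/T sequences: {2**n}")
print(f"P(5 Heads in 10 tosses) = {pmf[5]:.4f}")
print(f"sum of probabilities over k = {sum(pmf):.4f}")
```

The probabilities over k = 0, ..., n sum to 1, which is a quick sanity check on any hand calculation.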

In this case we will be interested in two random variables associated with this process. The first variable will be the proportion of Heads and the second will be the difference between the number of Heads and the number of Tails. This will empirically demonstrate the LLN and its most common misconceptions (presented below). Point your browser to the SOCR Experiments and select the Coin Toss LLN Experiment from the drop-down list of experiments in the top-left panel. This applet consists of a control toolbar on the top followed by a graph panel in the middle and a results table at the bottom. Use the toolbar to flip coins one at a time, 10, 100 or 1,000 at a time, or continuously. The toolbar also allows you to stop or reset an experiment and select the probability of Heads (p) using the slider. The graph panel in the middle will dynamically plot the values of the two variables of interest (proportion of Heads and difference of Heads and Tails). The outcome table at the bottom presents the summaries of all trials of this experiment. From this table, you can copy and paste the summary for further processing using other computational resources (e.g., SOCR Modeler or MS Excel).

Now, select n=100 and p=0.5. The figure below shows a snapshot of the applet. Remember that each time you run the applet the random samples will be different and the figures and results will generally vary. Click on the Run or Step buttons to perform the experiment and observe how the proportion of Heads and the differences evolve over time. Choosing Continuous from the number of experiments drop-down list in the toolbar will run the experiment in a continuous mode (use the Stop button to terminate the experiment in this case). The statement of the LLN in this experiment is simply that as the number of experiments increases the sample proportion of Heads (red curve) will approach the theoretical (user preset) value of p (in this case p=0.5). Try to change the value of p and run the experiment interactively several times. Notice the behaviour of the graphs of the two variables we study. Try to pose and answer questions like these:

  • If we set p=0.4, how large a sample size is needed to ensure that the sample proportion stays within [0.4; 0.6]?
  • What is the behaviour of the curve representing the differences of Heads and Tails (red curve)?
  • Is the convergence of the sample-proportion to the theoretical proportion (that we preset) dependent on p?
  • Remember that the more experiments you run, the closer the theoretical and sample proportions will tend to be (by the LLN). Switch to Continuous run mode and watch the convergence of the sample proportion to p. Can you explain in words why we cannot expect the second variable of interest (the difference of Heads and Tails) to converge?
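For readers without access to the applet, the experiment above can be approximated offline. The following is a minimal sketch, not the applet's actual implementation: the function name `coin_toss_lln`, the seed, and the checkpoints are all assumptions chosen for illustration. It tracks the same two variables, the running proportion of Heads and the running difference Heads − Tails.

```python
import random

def coin_toss_lln(n, p, seed=None):
    """Toss a coin n times; return running proportions of Heads and running Heads - Tails."""
    rng = random.Random(seed)
    heads = 0
    proportions, differences = [], []
    for i in range(1, n + 1):
        if rng.random() < p:      # a Head occurs with probability p
            heads += 1
        tails = i - heads
        proportions.append(heads / i)
        differences.append(heads - tails)
    return proportions, differences

proportions, differences = coin_toss_lln(n=100_000, p=0.5, seed=1)
for i in (10, 100, 1_000, 10_000, 100_000):
    print(f"n={i:>7}: proportion of Heads = {proportions[i - 1]:.4f}, "
          f"Heads - Tails = {differences[i - 1]:+d}")
```

Changing `p` and `seed` and rerunning gives a feel for the questions in the list: the proportion column settles near p, while the difference column keeps wandering.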

Common Misconceptions regarding the LLN

  • Misconception 1: If we observe a streak of 10 consecutive Heads (when p=0.5, say), the probability that the 11th trial is a Head is greater than p. This is, of course, incorrect, as the coin tosses are independent trials (an example of a memoryless process).
  • Misconception 2: If we run a large number of coin tosses, the number of Heads and the number of Tails become more and more equal. This is incorrect, as the LLN only guarantees that the sample proportion of Heads will converge to the true population proportion (the p parameter that we selected). In fact, the difference |Heads - Tails| does not converge: its typical size grows with the number of tosses (on the order of the square root of n when p=0.5), even as the proportion of Heads settles near p.
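Both misconceptions can be probed empirically. The sketch below is an illustration under stated assumptions: the helper names `next_after_streak` and `mean_abs_difference`, the seed, and the sample sizes are all made up for this example. The first function estimates how often a Head follows a run of at least 10 consecutive Heads; the second estimates the average size of |Heads - Tails| after n tosses.

```python
import random

def next_after_streak(n, p, streak, rng):
    """Fraction of Heads among tosses that follow at least `streak` Heads in a row."""
    run = 0              # length of the current run of consecutive Heads
    followers = 0        # tosses made right after a long enough run
    follower_heads = 0   # how many of those tosses were Heads
    for _ in range(n):
        is_head = rng.random() < p
        if run >= streak:
            followers += 1
            follower_heads += is_head
        run = run + 1 if is_head else 0
    return follower_heads / followers if followers else float("nan")

def mean_abs_difference(n, p, runs, rng):
    """Average of |Heads - Tails| after n tosses, over `runs` repeated experiments."""
    total = 0
    for _ in range(runs):
        heads = sum(rng.random() < p for _ in range(n))
        total += abs(heads - (n - heads))
    return total / runs

rng = random.Random(42)
freq = next_after_streak(n=1_000_000, p=0.5, streak=10, rng=rng)
small_gap = mean_abs_difference(n=100, p=0.5, runs=200, rng=rng)
large_gap = mean_abs_difference(n=10_000, p=0.5, runs=200, rng=rng)

print(f"P(Head | previous 10 tosses were Heads) is about {freq:.3f}")
print(f"average |Heads - Tails| after    100 tosses: {small_gap:.1f}")
print(f"average |Heads - Tails| after 10,000 tosses: {large_gap:.1f}")
```

The estimated frequency after a streak should stay close to p, and the average absolute difference should be larger for 10,000 tosses than for 100, even though the difference divided by n shrinks.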

Part II of this activity



