SOCR EduMaterials ModelerActivities MixtureModel 1

From Socr

(Difference between revisions)
Line 12:
* '''Model Fitting''': Now go back to the SOCR [http://socr.stat.ucla.edu/htmls/SOCR_Modeler.html Modeler] browser (where you did the data sampling). Choose Mixed-Model-Fit from the drop-down list in the left panel. <center>[[Image:SOCR_ModelerActivities_MixtureModelFit_Dinov_011707_Fig4.jpg|400px]]</center>
-
We will now try to fit a 2-component mixture of Gaussian (Normal) distributions to this Bimodal Laplace distribution (of the generated sample).  
+
* We will now try to fit a 2-component mixture of Gaussian (Normal) distributions to this bimodal Laplace distribution (of the generated sample). You may need to click the Re-Initialize button a few times. The Expectation-Maximization algorithm used to estimate the mixture distribution parameters is sensitive to its starting values and can produce somewhat different results for different initial conditions. <center>[[Image:SOCR_ModelerActivities_MixtureModelFit_Dinov_011707_Fig5.jpg|400px]]</center>
 +
* Notice the quantitative results of this mixture model fitting protocol (in the Results panel). Recall that we sampled 100 observations from a Laplace distribution with mean of zero (not a Normal (Gaussian) distribution, which we could also have done, and the fit would of course have been much better) and then another 100 observations from a Laplace distribution with mean = 20.0. The reported estimates of the means of the two Gaussian mixture components are 0 and 22 (pretty close to the original/theoretical means of 0 and 20). We could also have fit a mixture of 3 (or more) Gaussian components, if we had a reason to believe that the mixture distribution is tri- (or higher-)modal.
 +
<center>[[Image:SOCR_ModelerActivities_MixtureModelFit_Dinov_011707_Fig6.jpg|400px]]</center>
 +
 
 +
 
<hr>

Revision as of 19:57, 17 January 2007

SOCR Modeler Activities - SOCR Mixture Model Fitting Activity

This is a SOCR Activity that demonstrates random sampling and the fitting of mixture models to data.

  • Data Generation: You typically have investigator-acquired data that you need to fit a model to. In this case we will generate the data by randomly sampling using the SOCR resource. Go to the SOCR Modeler and select the Data Generation tab from the right panel.
    • Now, click the Raw Data check-box in the left panel, select Laplace Distribution (or any other distribution you want to sample data from), choose the sample-size to be 100 (keep the center, mu, at zero) and click Sample. Then go to the Data tab, in the right panel. There you should see the 100 random Laplace observations stored as a column vector.
    • Next, go back to the Data Generation tab from the right panel and change the center of the Laplace distribution (set Mu=20, say). Click Sample again and you will see the list of randomly generated data in the Data tab expand to 200 (as you sampled another set of 100 random Laplace observations).
  • Exploratory Data Analysis (EDA): Go to the Data tab and select all observations in the data column (use Ctrl-A, or mouse-copy). Then open another web browser and go to SOCR Charts. Choose HistogramChartDemo2, say, clear the default data (Data tab) and paste (Ctrl-V or mouse paste-in) into the first column the 200 observations that you sampled in the SOCR Modeler Data Generator (above). Then you need to map the values: go to the Mapping tab, select the first column, where you pasted the data (C1), and click XValue. This will move the C1 column label from the right bin to the bottom-right bin. Finally, click Update Chart and go to the Graph tab to see your histogram of the 200 (bimodal) Laplace observations. Notice that you can change the width of the histogram bin to see the bimodality of the distribution of these 200 measurements clearly. Of course, this is due to the fact that we sampled from two distinct Laplace distributions, one with mean of zero and the second with mean of 20.0.
  • Model Fitting: Now go back to the SOCR Modeler browser (where you did the data sampling). Choose Mixed-Model-Fit from the drop-down list in the left panel.
  • We will now try to fit a 2-component mixture of Gaussian (Normal) distributions to this bimodal Laplace distribution (of the generated sample). You may need to click the Re-Initialize button a few times. The Expectation-Maximization algorithm used to estimate the mixture distribution parameters is sensitive to its starting values and can produce somewhat different results for different initial conditions.
  • Notice the quantitative results of this mixture model fitting protocol (in the Results panel). Recall that we sampled 100 observations from a Laplace distribution with mean of zero (not a Normal (Gaussian) distribution, which we could also have done, and the fit would of course have been much better) and then another 100 observations from a Laplace distribution with mean = 20.0. The reported estimates of the means of the two Gaussian mixture components are 0 and 22 (pretty close to the original/theoretical means of 0 and 20). We could also have fit a mixture of 3 (or more) Gaussian components, if we had a reason to believe that the mixture distribution is tri- (or higher-)modal.
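
For readers who want to reproduce the idea outside the SOCR applets, the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SOCR Modeler implementation: the function names (sample_laplace, fit_two_gaussians) are hypothetical, the Laplace scale parameter is assumed to be 1 (the activity only specifies the centers 0 and 20), and the fit is a textbook Expectation-Maximization loop for a 2-component, one-dimensional Gaussian mixture. The starting means are passed in explicitly, which plays the role of the Re-Initialize button.

```python
import math
import random


def sample_laplace(rng, mu, b, n):
    """Draw n Laplace(mu, b) values; the difference of two independent
    exponentials with mean b follows a Laplace(0, b) distribution."""
    return [mu + rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
            for _ in range(n)]


def normal_pdf(x, mean, sd):
    """Density of a Normal(mean, sd) distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(data, init_means, iterations=100):
    """Textbook EM for a 2-component 1-D Gaussian mixture.

    init_means holds the two starting means; different starting values
    may lead to somewhat different fits.
    """
    means = list(init_means)
    spread = (max(data) - min(data)) / 4.0 or 1.0  # crude starting sd
    sds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and sd of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue  # component has collapsed; keep its old parameters
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            sds[k] = max(math.sqrt(var), 1e-3)  # keep sd away from zero
    return weights, means, sds


if __name__ == "__main__":
    rng = random.Random(2007)  # arbitrary seed, chosen only for repeatability
    # 100 observations centered at 0, then 100 more centered at 20.
    data = sample_laplace(rng, 0.0, 1.0, 100) + sample_laplace(rng, 20.0, 1.0, 100)
    # Re-initialize a few times from randomly chosen observations.
    for attempt in range(3):
        start = rng.sample(data, 2)
        weights, means, sds = fit_two_gaussians(data, start)
        print(attempt, [round(m, 2) for m in means], [round(w, 2) for w in weights])
```

Running the loop with several random starts mirrors clicking Re-Initialize: when the two starting means fall in different clusters the fit usually settles near the two sampling centers, while other starts can take longer or settle elsewhere.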




