# SOCR EduMaterials Activities BMI Modeling Activity

## SOCR Educational Materials - Activities - SOCR Body Mass Index (BMI) Activity and Applications of the Chi-Squared Test

Often, when solving a problem from an introductory textbook, we are told to assume that a population follows a normal distribution. Other times, a graph of the data suggests that the data are approximately normal. Either way, the assumption of normality permits a number of statistical analyses later on.

## Motivation and Goals

The following activity will demonstrate one of the ways to test for normality, using the Chi-Squared test for Goodness-of-Fit. The model to fit will be the normal model. We will run this test on a human characteristic often assumed to fit at least some kind of normal model: BMI.

## Summary

This activity uses a simplified version of the full BMI data set. Four cases were excluded because their extremely high BMIs suggested data-entry errors. Ten variables from the original data set are omitted here, though the same process may be applied to them for additional practice.

## Data

### Data Description

• Number of cases: 248
• Variables:
  • Underwater Density: density determined via a graduated-cylinder type test
  • Body fat: calculated from body density and tissue-type proportions using Siri’s equation (see the full dataset page)
  • Height (m)
  • Weight (kg)
  • BMI: Body Mass Index, calculated as $$\frac{\text{weight (kg)}}{\text{height (m)}^2}$$.
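
As a minimal sketch of the BMI formula above, the helper below divides weight in kilograms by squared height in metres. The function name `bmi` and the example height and weight are illustrative assumptions, not part of the original activity.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight in kilograms divided by squared height in metres."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Illustrative values (assumed for this sketch): 69.96662 kg and 1.72085 m.
example_bmi = bmi(69.96662, 1.72085)
print(round(example_bmi, 4))
```

Keeping the units explicit in the parameter names guards against the most common mistake with this formula, which is passing height in centimetres.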

### Data Summary

| Statistic | Underwater Density ($$\frac{g}{cm^3}$$) | Body Fat | Height (m) | Weight (kg) | BMI |
|---|---|---|---|---|---|
| Mean | 1.0562 | 18.854 | 1.787 | 80.547 | 25.18643319 |
| SD | 0.0184 | 8.0663 | 0.0659 | 12.0076 | 3.146481308 |

### Raw Dataset

Underwater_Density(g/cm3) Body_Fat Height(m) Weight(kg) BMI
1.070812.31.7208569.9666223.6268
1.08536.11.8351578.5848823.33436
1.041425.31.6827569.8532224.66876
1.075110.41.8351583.8011924.88325
1.03428.71.8097583.5743925.51738
1.050220.91.8986595.367826.45525
1.054919.21.7716582.1002226.15703
1.070412.41.841579.8322623.54155
1.094.11.879686.6361424.5227
1.072211.71.866989.9246925.80102
1.0837.11.892384.4815823.59294
1.08127.81.930497.9759526.29208
1.051320.81.765381.8734226.27277
1.050521.21.8097593.0998328.42574
1.048422.11.765385.1619727.32805
1.051220.91.676473.8221626.26827
1.0333291.803488.7907127.3013
1.046822.91.803494.914229.18415
1.0622161.7208583.347628.14538
1.06116.51.866996.0481827.55796
1.055119.11.727281.1930327.21658
1.06415.21.7716590.9452728.97505
1.063115.61.7335563.6163321.16878
1.058417.71.77867.4718721.34319
1.0668141.7208568.6058523.16728
1.09113.71.816172.2345821.90109
1.08117.91.714559.647420.29161
1.046822.91.714567.1316722.83771
1.0913.71.6446560.4411822.34529
1.0798.81.752672.9149723.73838
1.071611.91.8732582.5538123.52587
1.08625.71.8097572.6881822.19354
1.071911.81.8097576.2035223.26686
1.050221.31.803499.1099330.47425
1.026332.31.8669112.150732.17806
1.010140.11.65186.9763431.90854
1.043824.21.77891.7390629.01956
1.034628.41.7335589.244329.69667
1.025832.61.701892.0792531.79397
1.027931.61.77898.4295431.13594
1.0269321.816196.1615829.15561
1.08147.71.727256.8124419.044
1.06713.91.8605574.5025521.52229
1.074210.81.714560.5545820.60023
1.06655.61.8097567.3584720.56625
1.067813.61.739961.5751620.34028
1.090341.6954557.8330320.11898
1.075610.21.8351571.7809921.31407
1.0846.61.752663.1627420.56342
1.080781.7208562.2555521.02287
1.08486.31.866969.2862319.87947
1.09063.91.714561.8019621.02458
1.047322.61.828889.8112926.85335
1.052420.41.727282.3270227.5967
1.0356281.765391.2854629.29305
1.02831.51.7970591.8524528.44267
1.04324.61.6700581.5332329.23316
1.039626.11.8605597.9759528.30328
1.031729.81.739981.0796426.78325
1.029830.71.7843587.6567327.5312
1.040325.81.701880.7394427.87845
1.026432.31.77893.2132329.48588
1.0313301.714583.234228.31567
1.049921.51.7970568.7192421.27933
1.067313.81.816170.1934221.28222
1.08476.31.7589570.4202222.76095
1.069312.91.816171.100621.55727
1.043924.31.816175.9767223.03568
1.07888.81.7462566.5646821.82886
1.07968.51.8732572.9149720.77903
1.06813.51.625656.6990521.45598
1.07211.81.6700564.8637123.25642
1.066618.51.714567.2450722.87628
1.0798.81.765373.7087623.65277
1.048322.21.739980.6260426.63341
1.049821.51.7843573.1417722.97235
1.05618.81.7589577.6776925.10668
1.028331.41.7208574.2757525.08193
1.038226.81.7081568.1522523.3576
1.056818.41.8478586.2959525.27301
1.0377271.77877.450924.49982
1.0378271.7589576.2035224.63021
1.038626.61.714575.7499325.76958
1.064814.91.7081571.554224.52354
1.046223.11.6700572.5747826.02117
1.088.31.841580.1724523.64186
1.066614.11.854279.8322623.22016
1.05220.51.77880.2858525.3966
1.057318.21.765381.5332326.16361
1.07958.51.790774.9561423.37553
1.042424.91.8224587.3165326.28968
1.078591.892383.5743923.33959
1.099117.41.97485101.831526.11042
1.0779.61.8605585.6155624.73261
1.07311.31.689173.7087625.83499
1.058217.81.7335570.9872123.62149
1.048422.21.828889.357726.71773
1.050621.21.866990.0380925.83355
1.052420.41.828878.8116723.56449
1.05320.11.8097578.3580823.92471
1.04822.31.8732589.244325.4325
1.041225.41.7589580.2858525.94968
1.0578181.739975.0695424.79792
1.054719.31.866990.8318726.0613
1.056918.31.8859592.1926525.92006
1.059317.31.917787.9969223.92799
1.0521.41.7589576.4303124.70351
1.053819.71.739977.450925.58456
1.0355281.77883.120826.29337
1.048622.11.77880.8528425.57595
1.050321.31.7843573.9355623.22166
1.038426.71.8224579.4920623.93385
1.060716.71.7589571.6675923.16412
1.052920.11.8478580.3992523.54608
1.067113.91.828881.1930324.27651
1.040425.81.879686.6361424.5227
1.057518.11.8351585.0485725.25363
1.035827.91.892393.6668226.15808
1.041425.31.816184.0279925.47678
1.065214.71.7462572.6881823.83696
1.0623161.6954568.7192423.90608
1.067413.81.689173.0283725.59652
1.058717.51.701875.7499326.15563
1.037327.21.7462580.5126526.40288
1.05917.41.7208569.0594423.32046
1.051520.81.8605587.2031325.19123
1.064814.91.7716574.9561423.88094
1.057518.11.816177.9044923.62017
1.047222.71.790777.6776924.22427
1.045223.61.8605589.357725.81364
1.039826.11.6954571.21424.77396
1.043524.41.765376.3169224.48972
1.037427.11.7716584.3681826.8796
1.049121.81.7970575.6365323.42131
1.032529.41.879685.1619724.10543
1.048122.41.8097576.3169223.30149
1.052220.41.90596.5017826.59165
1.042224.91.803480.1724524.65137
1.057118.31.765378.5848825.2175
1.045923.31.7208575.7499325.57974
1.07759.41.8351572.4613821.5161
1.075410.31.968585.343422.02415
1.066414.21.7970570.7604121.91139
1.05519.21.8478594.5740127.69736
1.032229.61.7716593.6668229.84214
1.08735.31.841565.203919.22782
1.041625.21.78435101.151131.76951
1.07769.41.752669.0594422.48316
1.054219.61.8923109.65630.62333
1.075810.11.8351566.2244919.66416
1.06116.51.7081571.100624.36808
1.051211.866990.8318726.0613
1.059417.31.9113577.7910921.29362
1.028731.21.752693.3266330.38365
1.0761101.8351582.7806124.5802
1.070412.51.7462561.9153620.30419
1.047722.51.816180.3992524.37656
1.07759.41.8351568.6058520.37127
1.065314.61.854288.904125.85882
1.069131.7462583.5743927.40693
1.064415.11.790763.5029319.80378
1.03727.31.828899.2233329.66753
1.054919.21.8732598.4295428.05007
1.049221.81.727275.4097325.27797
1.052520.31.83515101.944930.27069
1.01834.31.7653103.532533.22306
1.06116.51.765378.3580825.14472
1.092631.7208569.0594423.32046
1.09830.71.663757.0392420.60742
1.052120.51.803480.3992524.7211
1.060316.91.816179.9456624.23904
1.041425.31.82245102.852130.9672
1.07639.91.7589565.8842921.29486
1.068913.11.701868.4924523.6497
1.031629.91.8161109.429233.17827
1.047722.51.7589584.9351727.45242
1.060316.91.8923106.480829.7366
1.038726.61.8859599.4501327.9605
1.108901.727253.750718.01768
1.072511.51.7081566.1110922.65804
1.071312.11.7716572.2345823.01385
1.058717.51.8859577.337521.74352
1.07948.61.816175.9767223.03568
1.045323.61.88595105.573629.68212
1.052420.41.828895.4811928.54864
1.05220.51.841591.7390627.05271
1.043424.41.7335583.9145927.92317
1.072811.41.7589569.3996322.43108
1.01438.11.9304110.789929.73073
1.062415.91.790787.7701227.37165
1.042924.71.89865101.944928.27976
1.04722.81.8478573.8221621.61988
1.041125.51.7335581.6466327.16849
1.0488221.752670.8738123.07386
1.058317.71.816176.2035223.10444
1.08416.61.8478575.8633222.21767
1.046223.61.714577.450926.34823
1.070912.21.7843580.8528425.39424
1.048422.11.7589568.0388621.99126
1.03428.71.816190.9452727.57405
1.085461.879683.46123.62396
1.020934.81.77165101.151132.22662
1.06116.61.854294.6874127.54096
1.02532.91.663775.2963327.20344
1.025432.81.841588.4505126.08296
1.07719.61.7843572.8015822.8655
1.074210.81.7970572.4613822.43811
1.08297.11.727263.7297321.36273
1.037327.21.892398.0893527.39314
1.054319.51.8224576.3169222.97786
1.056118.71.7970588.3371127.35413
1.054319.51.854278.3580822.79138
1.067813.61.7716567.6986621.56871
1.08197.51.77870.0800222.16821
1.043324.51.8224590.3782827.21152
1.0646151.7589570.0800222.65099
1.070612.41.790769.5130321.67807
1.0399261.83515104.326230.97778
1.072611.51.714573.3685724.95945
1.08745.21.7081564.5235122.11393
1.07410.91.7462581.5332326.73756
1.070312.51.6954557.3794319.96118
1.06514.81.7335576.8839125.58366
1.041825.21.8859590.0380925.3143
1.064714.91.765379.1518725.39944
1.0601171.739976.0901225.13505
1.074510.61.6700567.0182724.02892
1.06216.11.8224582.6672124.88984
1.063615.41.816179.6054624.13589
1.038426.71.7081573.3685725.14537
1.040325.81.714571.554224.34222
1.056318.61.714576.5437126.03961
1.042424.81.8351586.8629425.79238
1.037227.31.765399.4047731.89849
1.070512.41.765370.4202222.5975
1.031629.91.6700586.0691530.85948
1.0599171.6700557.8330320.73562
1.0207351.73355101.831533.88515
1.030430.41.8288106.25431.76968
1.025632.61.84785103.305730.25456
1.0334291.739990.4916829.89235
1.064115.21.7589570.5336122.7976
1.030830.21.790797.7491630.48368
1.0736111.701860.8947821.02631
1.023633.61.7716591.1720729.04731
1.032829.31.676484.7083830.14193
1.0399261.790786.5227426.98265
1.027131.91.77894.1204229.77285

## Exploratory data analyses (EDA)

Before we run any quantitative tests, let’s examine what these variables look like in graphical form. Keep an eye out for which variables appear to follow a normal distribution.

## Quantitative Data Analyses (QDA)

In this section, we will test the BMI variable for normality, although the same analysis can be carried out for the other variables. As the name goodness-of-fit implies, we first need to create a normal model to compare against. We will use the sample mean (25.18643319) and standard deviation (3.146481308) as the parameters of the normal distribution.

The next few steps use the SOCR Distributions Applet (see Distribution Activities). Open the applet in a Java-enabled browser.

### Data Modeling

Enter the values for the mean and standard deviation, then drag the graph down to see the full distribution.

To run a goodness-of-fit test, we need a set of bins over which to compare the observed distribution with the expected normal one. For simplicity, we will use 16 bins of width 1, beginning at BMI = 18 and ending at BMI = 34. To find the probability of each bin under the normal distribution, click on the edges of the bin (as accurately as possible) in the normal distribution applet. Alternatively, use the normal CDF function: the probability of a bin is the CDF at its upper edge minus the CDF at its lower edge.
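
The normal CDF route can be sketched in Python as follows. This is only an illustration of the calculation, not the SOCR applet itself. It assumes half-open bins [18, 19), [19, 20), ..., [33, 34) and uses `statistics.NormalDist` from the standard library (Python 3.8 or later). The variable names are hypothetical.

```python
from statistics import NormalDist

# Parameters quoted in the text: sample mean and standard deviation of BMI.
model = NormalDist(mu=25.18643319, sigma=3.146481308)

# 16 bins of width 1 covering BMI 18 to 34 (bin edges 18, 19, ..., 34).
edges = list(range(18, 35))

# Probability of a bin = CDF(upper edge) - CDF(lower edge).
bin_probs = [model.cdf(hi) - model.cdf(lo) for lo, hi in zip(edges, edges[1:])]

for lo, p in zip(edges, bin_probs):
    print(f"{lo}-{lo + 1}: {p:.6f}")

# The bins do not cover the tails below 18 and above 34,
# so the probabilities sum to slightly less than 1.
print("covered probability:", round(sum(bin_probs), 4))
```

Because the tails outside 18 to 34 are left out, the expected frequencies derived from these probabilities will not add up exactly to the number of cases.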

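After calculating the probability of each bin, multiply each probability by the total number of cases to obtain the frequency expected under the normal model. The sample here contains 248 cases; note, however, that the estimated frequencies tabulated below equal the bin probabilities multiplied by 252, the number of cases before the four exclusions. Place these estimated frequencies next to the observed frequencies, which were found by simple counting: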

| Bin (Simplified) | Bin (Actual) | Normal Probability | Estimated Normal Frequency | Observed Frequency |
|---|---|---|---|---|
| 18-19 | 18.000-18.999 | 0.013454035 | 3.39041682 | 1 |
| 19-20 | 19.000-19.999 | 0.025001757 | 6.300442764 | 6 |
| 20-21 | 20.000-20.999 | 0.042032155 | 10.59210306 | 11 |
| 21-22 | 21.000-21.999 | 0.06392769 | 16.10977788 | 23 |
| 22-23 | 22.000-22.999 | 0.087962245 | 22.16648574 | 20 |
| 23-24 | 23.000-23.999 | 0.109497598 | 27.5933947 | 38 |
| 24-25 | 24.000-24.999 | 0.123313845 | 31.07508894 | 26 |
| 25-26 | 25.000-25.999 | 0.125638222 | 31.66083194 | 31 |
| 26-27 | 26.000-26.999 | 0.115806485 | 29.18323422 | 26 |
| 27-28 | 27.000-27.999 | 0.096570457 | 24.33575516 | 21 |
| 28-29 | 28.000-28.999 | 0.072854419 | 18.35931359 | 9 |
| 29-30 | 29.000-29.999 | 0.049724263 | 12.53051428 | 15 |
| 30-31 | 30.000-30.999 | 0.030702836 | 7.737114672 | 10 |
| 31-32 | 31.000-31.999 | 0.017150734 | 4.321984968 | 6 |
| 32-33 | 32.000-32.999 | 0.008667207 | 2.184136164 | 2 |
| 33-34 | 33.000-33.999 | 0.00396249 | 0.99854748 | 3 |
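
The two frequency columns can be sketched in Python as shown below. The probabilities are copied from the table. The helper name `expected_frequencies` and the five sample BMI values are assumptions for illustration only. The tabulated estimated frequencies appear to correspond to 252 cases rather than 248, so both totals are computed for comparison.

```python
import math
from collections import Counter

# Bin probabilities copied from the Normal Probability column of the table.
normal_probs = [
    0.013454035, 0.025001757, 0.042032155, 0.06392769,
    0.087962245, 0.109497598, 0.123313845, 0.125638222,
    0.115806485, 0.096570457, 0.072854419, 0.049724263,
    0.030702836, 0.017150734, 0.008667207, 0.00396249,
]


def expected_frequencies(probs, n_cases):
    """Expected count per bin: bin probability times number of cases."""
    return [p * n_cases for p in probs]


expected_248 = expected_frequencies(normal_probs, 248)
expected_252 = expected_frequencies(normal_probs, 252)  # matches the table values
print(round(expected_248[0], 4), round(expected_252[0], 4))

# Observed frequencies come from plain counting: each BMI value falls in the
# bin labelled by its integer part. These five values are illustrative only.
bmi_values = [23.6268, 23.33436, 24.66876, 24.88325, 25.51738]
observed_counts = Counter(math.floor(v) for v in bmi_values)
print(sorted(observed_counts.items()))
```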

It can be useful to see graphically how these expected frequencies match the observed results. Note the differences between the two data sets in the Quantile-Quantile charts (ignore the stacked shape of the normal estimation, which is due to binning the data). Note that the line is a better fit in the latter case.
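
For readers without the applet, the points of a normal quantile-quantile chart can be computed as in the sketch below. This is a generic construction and not necessarily the one the SOCR charts use. The plotting positions (i - 0.5)/n are one common convention, assumed here, and the ten BMI values are illustrative only.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(sample):
    """Return (theoretical quantile, observed value) pairs for a normal Q-Q chart."""
    ordered = sorted(sample)
    n = len(ordered)
    # Fit the normal model with the sample mean and sample standard deviation.
    model = NormalDist(mean(ordered), stdev(ordered))
    # Plotting positions (i - 0.5) / n: an assumed, commonly used convention.
    theoretical = [model.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Illustrative BMI values only; substitute the full BMI column in practice.
sample_bmi = [23.6268, 23.33436, 24.66876, 24.88325, 25.51738,
              26.45525, 26.15703, 23.54155, 24.5227, 25.80102]
qq_points = normal_qq_points(sample_bmi)
for theo, obs in qq_points:
    print(f"{theo:8.3f} {obs:8.3f}")
```

If the data are close to normal, the printed pairs lie near the line where the theoretical quantile equals the observed value.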

### Chi-Square Goodness-of-Fit Test

With these values settled, we can begin the Chi-square analysis. Open the SOCR Analyses Applet in a Java-enabled browser, then select Chi-Square Goodness of Fit from the pull-down menu on the left.

Next, enter the data into two columns using the Paste button.

Name the two columns Observed (for the actual results) and Expected (for the normal model estimates).

Click on the Mapping tab and map the Observed and Expected columns to their corresponding variables.

Click the Calculate button. A window should pop up asking for the number of parameters. Recall that the normal distribution is defined by two parameters, the mean and the standard deviation, both of which were estimated from the sample. Enter “2” and press “OK”.

The results page should come up with the following text:

Observed Data = Observed
Expected Data = Expected

Chi-Square Goodness of Fit Results:

Total Counts = 16
Number of Parameters = 2
Chi-Square Goodness of Fit Results:
********** Chi-Square Statistic is: 21.044 *********
********** Chi-Square Degrees of Freedom is: 16 - 2 - 1 = 13 *********
********** Chi-Square p-value is: .072 *********

At α = 0.05, the p-value of 0.072 exceeds the significance level, so there is not enough evidence to conclude that the BMI data do not follow a normal distribution. It is worth noting, however, that the result comes close to the rejection threshold. Understanding this distribution is very important to health officials; for example, it helps create the charts that doctors nationwide use. In addition, a deviation from that distribution can be used to chart changes in the overall health of the nation (Penman, 2006).
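
The applet output above can be cross-checked with a short Python sketch. The observed and estimated frequencies are taken from the bin table. The helper names `chi_square_statistic` and `chi_square_sf` are hypothetical. The p-value uses the standard recurrence for the upper tail of a chi-square distribution with integer degrees of freedom, which avoids any third-party dependency.

```python
import math

# Frequencies taken from the bin table (BMI 18-19 through 33-34).
observed = [1, 6, 11, 23, 20, 38, 26, 31, 26, 21, 9, 15, 10, 6, 2, 3]
expected = [3.39041682, 6.300442764, 10.59210306, 16.10977788,
            22.16648574, 27.5933947, 31.07508894, 31.66083194,
            29.18323422, 24.33575516, 18.35931359, 12.53051428,
            7.737114672, 4.321984968, 2.184136164, 0.99854748]


def chi_square_statistic(obs, exp):
    """Sum over bins of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


def chi_square_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)              # exact tail for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))   # exact tail for df = 1
    # Recurrence Q(a + 1, t) = Q(a, t) + t^a * exp(-t) / Gamma(a + 1).
    while 2 * a < df:
        q += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1.0
    return q


n_params = 2                              # mean and standard deviation
df = len(observed) - n_params - 1         # 16 - 2 - 1
stat = chi_square_statistic(observed, expected)
p_value = chi_square_sf(stat, df)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value:.3f}")
```

The largest contributions to the statistic come from the 23-24, 28-29, and 33-34 bins, which is where the observed counts differ most from the normal model.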

### Linear Regression

Finally, we can explore the correlation between the observed frequencies and the model-predicted frequencies of the BMI data within each of the 16 bins. Good agreement (e.g., a high correlation) would indicate that the normal distribution model fits the observed BMI frequencies well.

Copy and paste the two-column data (observed and predicted frequencies) into the Simple Linear Regression applet of SOCR Analyses.

On the Mapping tab, map the predicted and observed frequencies to the Dependent and Independent variables, respectively.

Click the Calculate button to view the results, which include the following regression model:

PredictedFreq = 1.71070 + 0.8918062570205698 * ObservedFreq
Correlation(ObservedFreq, PredictedFreq) = .91394
R-Square = .83529
Intercept:
Parameter Estimate: 1.71070
Standard Error: 2.00468
T-Statistics: .85335
P-Value: .40783
Slope:
Parameter Estimate: .89181
Standard Error: .10584
T-Statistics: 8.42598
P-Value: .00000
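
The slope, intercept, and correlation reported above can be reproduced with an ordinary least-squares sketch in plain Python. The frequencies are taken from the bin table, with the observed frequency as the independent variable and the predicted frequency as the dependent variable. The function name `simple_linear_regression` is an assumption for illustration, and standard errors are not computed here.

```python
import math

# Frequencies taken from the bin table (BMI 18-19 through 33-34).
observed = [1, 6, 11, 23, 20, 38, 26, 31, 26, 21, 9, 15, 10, 6, 2, 3]
predicted = [3.39041682, 6.300442764, 10.59210306, 16.10977788,
             22.16648574, 27.5933947, 31.07508894, 31.66083194,
             29.18323422, 24.33575516, 18.35931359, 12.53051428,
             7.737114672, 4.321984968, 2.184136164, 0.99854748]


def simple_linear_regression(x, y):
    """Least-squares fit y = intercept + slope * x, plus Pearson correlation."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = simple_linear_regression(observed, predicted)
print(f"PredictedFreq = {intercept:.5f} + {slope:.5f} * ObservedFreq")
print(f"correlation = {r:.5f}, R-square = {r * r:.5f}")
```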

### Regression Graphs

The Graphs tab includes a regression model plot, scatter plot with confidence/prediction limits, and various plots of the residuals.

## Practice problems

• Try this method on one of the other variables and see whether it departs from the normal distribution.
• Many biological measures are said to follow a normal distribution. Look under the data header “Biomedical Data” in the SOCR Free Datasets and check this claim using a variable that interests you.