SOCR EduMaterials Activities BoxPlot

Revision as of 02:57, 22 July 2007

SOCR Educational Materials - Activities - SOCR Box-and-Whisker Plot Activity

Summary

This activity describes the construction of the box-and-whisker plot in SOCR. The applets can be accessed at SOCR Charts by clicking on Miscellaneous.

Goals

The aims of this activity are to:

  • show the importance of the box plot in exploratory data analysis (EDA)
  • illustrate how to use SOCR to construct a box plot
  • present some peculiarities of a box plot

Background & Motivation

The box plot (or box-and-whisker plot), introduced by John Tukey in 1977, is an efficient way of presenting data, especially for comparing multiple groups of data. The box plot marks off the five-number summary of a data set (minimum, 25th percentile, median, 75th percentile, maximum). The box contains the middle 50% of the data. The upper edge of the box represents the 75th percentile, while the lower edge represents the 25th percentile. The median is represented by a line drawn inside the box. If the median is not in the middle of the box, the data are skewed. The ends of the lines (called whiskers) represent the minimum and maximum values of the data set, unless there are outliers, in which case the whiskers typically end at the most extreme observations that are not outliers. Outliers are observations below Q1 − 1.5(IQR) or above Q3 + 1.5(IQR), where Q1 is the 25th percentile, Q3 is the 75th percentile, and IQR = Q3 − Q1 (called the interquartile range). The advantages of a box plot are that it shows graphically the location and the spread of the data set, gives an idea of its skewness, and allows variables to be compared by constructing side-by-side box plots.
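The definitions above can be checked numerically. The following is a minimal Python sketch, not part of SOCR: the function names are illustrative, and the quartiles are computed with the "inclusive" interpolation rule of Python's statistics module. Percentile conventions differ between packages, so the quartiles SOCR reports may differ slightly from these.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers.

    Assumption: quartiles use the "inclusive" interpolation rule;
    other software may use a different percentile convention.
    """
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)


def outlier_fences(q1, q3):
    """Return the cut-offs Q1 - 1.5(IQR) and Q3 + 1.5(IQR)."""
    iqr = q3 - q1  # interquartile range
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(data):
    """Return the observations strictly below or above the fences."""
    _, q1, _, q3, _ = five_number_summary(data)
    lower, upper = outlier_fences(q1, q3)
    return [x for x in data if x < lower or x > upper]


# Made-up sample with one unusually large value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 40]
print(five_number_summary(sample))
print(outlier_fences(3.25, 7.75))
print(find_outliers(sample))
```

In this made-up sample the value 40 lies above the upper fence, so a box plot would draw it as a separate point rather than extend the whisker to it.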

Exercises

  • Exercise 1: Go to SOCR Charts, click on the Miscellaneous tab, and then on BoxAndWhiskerChartDemo1. The Demo1 box plot shows side-by-side box plots of two categories for each of three series. These demonstration data can be viewed by clicking on DATA. Clicking on MAPPING lets you choose the variables. Clicking on SHOW ALL makes the applet present the graph, the data, and the mapping environment together. Let’s clear this data set (click on CLEAR) so that we can enter our own data. After you click on CLEAR, click on DATA to open the spreadsheet. Enter the following data (don’t forget to separate the values by commas!):
C1         C2            C3
Series 1   1,2,3,4,5,6   2,4,6,8,10,12
Series 2   3,4,5,6,7,8   6,8,10,12,14,16,18
Series 3   5,6,7,8,9     10,16,18,20,22

The following snapshot shows how the above data were entered into SOCR:

The following snapshot shows the mapping of the data:


The following snapshot shows the side-by-side box plots:

The following snapshot shows the data, the mapping, and the box plots in one screen:
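As a cross-check on the box plots drawn by the applet, the five-number summary of each group in Exercise 1 can be computed by hand or with a short script. The Python sketch below is only an illustration: the category labels "first" and "second" are placeholders for the two data columns, and the quartiles use the "inclusive" interpolation rule of Python's statistics module, which may not match the rule SOCR applies.

```python
import statistics

# Data from Exercise 1. The labels "first" and "second" are placeholders
# for the two data columns entered for each series.
exercise_data = {
    "Series 1": {"first": [1, 2, 3, 4, 5, 6],
                 "second": [2, 4, 6, 8, 10, 12]},
    "Series 2": {"first": [3, 4, 5, 6, 7, 8],
                 "second": [6, 8, 10, 12, 14, 16, 18]},
    "Series 3": {"first": [5, 6, 7, 8, 9],
                 "second": [10, 16, 18, 20, 22]},
}


def summarize(values):
    """Five-number summary plus outliers for one group of values."""
    # Assumption: "inclusive" quartile rule; SOCR may use another one.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "outliers": [x for x in values if x < lower or x > upper],
    }


# One summary per (series, category) pair, i.e. per box in the chart.
summaries = {
    (series, category): summarize(values)
    for series, categories in exercise_data.items()
    for category, values in categories.items()
}

for (series, category), summary in summaries.items():
    print(series, category, summary)
```

Under this quartile rule none of the six groups contains an outlier, so each whisker should reach the group's minimum and maximum. In the second column of Series 3 the value 10 falls exactly on the lower cut-off Q1 − 1.5(IQR) = 10, so a different quartile convention could classify it differently.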


  • Exercise 2:


Applications



