AP Statistics Curriculum 2007 Hypothesis Basics

From Socr

Revision as of 01:46, 6 February 2008

General Advanced Placement (AP) Statistics Curriculum - Fundamentals of Hypothesis Testing

Fundamentals of Hypothesis Testing

A (statistical) hypothesis test is a method of making statistical decisions about populations or processes based on experimental data. Hypothesis testing answers one question: how well do the findings fit the possibility that chance alone is responsible for the observed discrepancy between the theoretical model and the empirical observations? This is accomplished by asking and answering a hypothetical question: how likely are the observed summary statistics of interest if the data did come from the distribution specified by the null hypothesis? One use of hypothesis testing is deciding whether experimental results contain enough information to cast doubt on conventional wisdom.

Example: Consider determining whether a suitcase contains some radioactive material. Placed under a Geiger counter, the suitcase produces 10 clicks (counts) per minute. The null hypothesis is that there is no radioactive material in the suitcase and that all measured counts are due to ambient radioactivity typical of the surrounding air and of harmless objects in a suitcase. We can then calculate how likely it is to observe 10 or more counts per minute if the null hypothesis is true. If that is likely, for example if the null hypothesis predicts 9 counts per minute on average, we say that the suitcase is compatible with the null hypothesis. This does not imply that there is no radioactive material; we just cannot detect any from the 1-minute sample we took using this specific method. On the other hand, if the null hypothesis predicts, for example, 1 count per minute, then the observation is not compatible with the null hypothesis, and some other factor is likely responsible for the increased radioactive counts.
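The suitcase calculation can be sketched numerically. The snippet below is only an illustration: it assumes that background counts in one minute follow a Poisson distribution (an assumption not stated in the text above), and the function names are hypothetical. It computes the probability of seeing at least the observed 10 counts under each of the two background rates mentioned in the example.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_upper_tail(k_obs: int, lam: float) -> float:
    """P(X >= k_obs) for X ~ Poisson(lam), via the complement of P(X < k_obs)."""
    return 1.0 - sum(poisson_pmf(k, lam) for k in range(k_obs))

OBSERVED_COUNTS = 10  # clicks per minute, from the example

# Assumed Poisson model: background averaging 9 counts per minute.
p_if_background_is_9 = poisson_upper_tail(OBSERVED_COUNTS, 9.0)
# Assumed Poisson model: background averaging 1 count per minute.
p_if_background_is_1 = poisson_upper_tail(OBSERVED_COUNTS, 1.0)

print(f"P(X >= 10 | mean 9) = {p_if_background_is_9:.4f}")
print(f"P(X >= 10 | mean 1) = {p_if_background_is_1:.2e}")
```

Under this model the first probability is roughly 0.41, so 10 counts is unremarkable against a background of 9, while the second is on the order of one in ten million, which is why the observation is judged incompatible with a background of 1.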

Hypothesis testing is also known as statistical significance testing. The null hypothesis is a conjecture that exists solely to be disproved, rejected or falsified using the sample statistics that estimate the unknown population parameters. Statistical significance is one possible finding of the test: that a sample like the one observed would be unlikely to occur by chance in this process if the null hypothesis were true. The name of the test describes its formulation and its possible outcome. One characteristic of hypothesis testing is its crisp decision about the null hypothesis: reject or do not reject (which is not the same as accept).

Null and Alternative (Research) Hypotheses

A null hypothesis is a thesis set up to be nullified or refuted in order to support an alternative (research) hypothesis. The null hypothesis is presumed true until statistical evidence, in the form of a hypothesis test, indicates otherwise. In science, the null hypothesis is used to test differences between treatment and control groups, and the assumption at the outset of the experiment is that no difference exists between the two groups for the variable of interest (e.g., population means). The null hypothesis proposes something initially presumed true, and it is rejected only when the data make it implausible, that is, when data at least as extreme as those observed would be unlikely if the null hypothesis were true. Conventional thresholds for "unlikely" are 5% or 1%, corresponding to the 95% and 99% confidence levels.

An Example

If we want to compare the test scores of two random samples, one of men and one of women, a null hypothesis would be that the mean score of the male population is the same as the mean score of the female population:

H₀ : μ_men = μ_women

where:

H₀ = the null hypothesis

μ_men = the mean of the males (population 1), and

μ_women = the mean of the females (population 2).

Alternatively, the null hypothesis can postulate that the two samples are drawn from the same population, so that the center, variance and shape of the distributions are equal.

Formulation of the null hypothesis is a vital step in testing statistical significance. Having formulated such a hypothesis, one can compute the probability of observing the obtained data, or data deviating even further from the prediction of the null hypothesis, if the null hypothesis is true. That probability is commonly called the p-value, or observed significance level, of the results.

In many scientific experimental designs we predict that a particular factor will produce an effect on our dependent variable — this is our alternative hypothesis. We then consider how often we would expect to observe our experimental results, or results even more extreme, if we were to take many samples from a population where there was no effect (i.e. we test against our null hypothesis). If we find that this happens rarely (up to, say, 5% of the time), we can conclude that our results support our experimental prediction — we reject our null hypothesis.
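The idea of asking how often results at least as extreme would arise when there is no effect can be sketched with a permutation test on the men/women example. This is one possible approach, not a method prescribed by the text: the scores are made-up illustrative data, and the function name and parameters are hypothetical.

```python
import random

def permutation_p_value(group_a, group_b, n_resamples=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in means.

    Under the null hypothesis the group labels are interchangeable, so we
    reshuffle the pooled scores and count how often the reshuffled difference
    in means is at least as large as the observed one.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_resamples + 1)

# Hypothetical test scores, for illustration only.
men_scores = [72, 85, 90, 66, 78, 81, 74, 88]
women_scores = [75, 83, 91, 70, 79, 84, 77, 86]

p_value = permutation_p_value(men_scores, women_scores)
print(f"estimated p-value: {p_value:.3f}")
print("reject H0 at the 5% level" if p_value <= 0.05 else "do not reject H0 at the 5% level")
```

With these made-up scores the two sample means differ by less than two points, so the estimated p-value is large and the null hypothesis of equal means is not rejected.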

Type I Error, Type II Error and Power

Directly related to hypothesis testing are three concepts:

Type I Error: A false positive, the error of rejecting the null hypothesis when it is actually true; e.g., a court finding a person guilty of a crime that they did not actually commit.

Type II Error: A false negative, the error of failing to reject the null hypothesis when the alternative hypothesis is actually true; e.g., a court finding a person not guilty of a crime that they did actually commit.

Statistical Power: The power of a statistical test is the probability that the test will reject a false null hypothesis (that is, that it will not make a Type II error). As power increases, the chance of a Type II error decreases. The probability of a Type II error is referred to as the false negative rate (β), so power is equal to 1 − β.

Test result	Actual condition: Present	Actual condition: Absent
Positive	True Positive	False Positive (Type I error)
Negative	False (invalid) Negative (Type II error)	True (accurate) Negative
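To make the relationship between the two error rates and power concrete, the sketch below revisits the suitcase example. Several things here are assumptions rather than part of the text: counts are modeled as Poisson, the rate of 15 counts per minute for a suitcase that does contain radioactive material is hypothetical, the symbol α for the Type I error rate and the 5% threshold are conventional choices, and all names are illustrative.

```python
import math

def poisson_upper_tail(k: int, lam: float) -> float:
    """Return P(X >= k) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam**j / math.factorial(j) for j in range(k))
    return 1.0 - below

def critical_count(lam_null: float, alpha: float) -> int:
    """Smallest count c such that P(X >= c | null rate) <= alpha."""
    c = 0
    while poisson_upper_tail(c, lam_null) > alpha:
        c += 1
    return c

LAM_NULL = 9.0   # background rate under the null hypothesis (from the example)
LAM_ALT = 15.0   # hypothetical rate if radioactive material is present
ALPHA = 0.05     # conventional cap on the Type I error rate

# Decision rule: reject the null hypothesis when the count reaches the threshold.
threshold = critical_count(LAM_NULL, ALPHA)

type_i_rate = poisson_upper_tail(threshold, LAM_NULL)  # reject although null is true
power = poisson_upper_tail(threshold, LAM_ALT)         # reject when alternative is true
type_ii_rate = 1.0 - power                             # beta: fail to reject a false null

print(f"reject H0 when count >= {threshold}")
print(f"Type I error rate   = {type_i_rate:.4f}")
print(f"Type II error rate  = {type_ii_rate:.4f}")
print(f"power (1 - beta)    = {power:.4f}")
```

Because counts are discrete, the achieved Type I error rate falls somewhat below the 5% cap. Raising the hypothetical alternative rate, or counting for longer than one minute, would increase the power under this model.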

Hands-on activities

Step-by-step practice problems.

TBD

References

TBD

SOCR Home page: http://www.socr.ucla.edu


