AP Statistics Curriculum 2007 MultivariateNormal

From Socr

(Difference between revisions)
(Created page with '== EBook - Multivariate Normal Distribution== The multivariate normal distribution, or multivariate Gaussian distribution, is a generalization…')
(Bivariate (2D) case)
 
(12 intermediate revisions not shown)
Line 4: Line 4:
=== Definition===
=== Definition===
-
In k-dimensions, a random vector <math>X = (X<sub>1</sub>, \cdots, X<sub>k</sub>)</math> is multivariate normally distributed if it satisfies any one of the following ''equivalent'' conditions <ref>Gut, Allan: An Intermediate Course in Probability, Springer 2009, chapter 5, http://books.google.com/books?id=ufxMwdtrmOAC, ISBN 9781441901613</ref>:
+
In k-dimensions, a random vector <math>X = (X_1, \cdots, X_k)</math> is multivariate normally distributed if it satisfies any one of the following ''equivalent'' conditions (Gut, 2009):
-
* Every linear combination of its components ''Y''&nbsp;=&nbsp;''a''<sub>1</sub>''X''<sub>1</sub> + … + ''a<sub>k</sub>X<sub>k</sub>'' is [[AP_Statistics_Curriculum_2007_Normal_Prob|normally distributed]]. In other words, for any constant vector {{nowrap|''a'' ∈ '''R'''<sup>''k''</sup>}}, the linear combination (which is univariate random variable) <math>Y = a′X = \sum_{i=1,\cdots,k}{a_iX_i}</math> has a univariate normal distribution.
+
* Every linear combination of its components ''Y''&nbsp;=&nbsp;''a''<sub>1</sub>''X''<sub>1</sub> + … + ''a<sub>k</sub>X<sub>k</sub>'' is [[AP_Statistics_Curriculum_2007_Normal_Prob|normally distributed]]. In other words, for any constant vector <math>a\in R^k</math>, the linear combination (which is univariate random variable) <math>Y = a^TX = \sum_{i=1}^{k}{a_iX_i}</math> has a univariate normal distribution.
-
* There exists a random ''ℓ''-vector ''Z'', whose components are independent normal random variables, a ''k''-vector ''μ'', and a ''k×ℓ'' [[matrix (math)|matrix]] ''A'', such that {{nowrap|1=''X'' = ''AZ'' + ''μ''}}. Here ''ℓ'' is the ''rank'' of the covariance matri
+
* There exists a random ''ℓ''-vector ''Z'', whose components are independent normal random variables, a ''k''-vector ''μ'', and a ''k×ℓ'' matrix ''A'', such that <math>X = AZ + \mu</math>. Here ''ℓ'' is the ''rank'' of the variance-covariance matrix.
* There is a ''k''-vector ''μ'' and a symmetric, nonnegative-definite ''k×k'' matrix Σ, such that the characteristic function of ''X'' is  
* There is a ''k''-vector ''μ'' and a symmetric, nonnegative-definite ''k×k'' matrix Σ, such that the characteristic function of ''X'' is  
: <math>
: <math>
-
     \varphi_X(u) = \exp\Big( iu'\mu - \tfrac{1}{2} u'\Sigma u \Big).
+
     \varphi_X(u) = \exp\Big( iu^T\mu - \tfrac{1}{2} u^T\Sigma u \Big).
   </math>
   </math>
Line 18: Line 18:
: <math>
: <math>
     f_X(x) = \frac{1}{ (2\pi)^{k/2}|\Sigma|^{1/2} }
     f_X(x) = \frac{1}{ (2\pi)^{k/2}|\Sigma|^{1/2} }
-
             \exp\!\Big( {-\tfrac{1}{2}}(x-\mu)'\Sigma^{-1}(x-\mu) \Big),
+
             \exp\!\Big( {-\tfrac{1}{2}}(x-\mu)'\Sigma^{-1}(x-\mu) \Big)
-
   </math>
+
   </math>, where |Σ| is the determinant of Σ, and where (2π)<sup>''k''/2</sup>|Σ|<sup>1/2</sup> = |2πΣ|<sup>1/2</sup>.  This formulation reduces to the density of the univariate normal distribution if Σ is a scalar (i.e., a 1×1&nbsp;matrix).
-
where |Σ| is the determinant of Σ, and where (2π)<sup>''k''/2</sup>|Σ|<sup>1/2</sup> = |2πΣ|<sup>1/2</sup>.  This formulation reduces to the density of the univariate normal distribution if Σ is a scalar (i.e., a 1×1&nbsp;matrix).
+
If the variance-covariance matrix is singular, the corresponding distribution has no density.  An example of this case is the distribution of the vector of residual-errors in the ordinary least squares regression.  Note also that the ''X''<sub>''i''</sub> are in general ''not'' independent; they can be seen as the result of applying the matrix ''A'' to a collection of independent Gaussian variables ''Z''.
If the variance-covariance matrix is singular, the corresponding distribution has no density.  An example of this case is the distribution of the vector of residual-errors in the ordinary least squares regression.  Note also that the ''X''<sub>''i''</sub> are in general ''not'' independent; they can be seen as the result of applying the matrix ''A'' to a collection of independent Gaussian variables ''Z''.
-
 
-
<center>[[Image:SOCR_EBook_Dinov_RV_Normal_013108_Fig14.jpg|500px]]</center>
 
===Bivariate (2D) case===
===Bivariate (2D) case===
-
In 2-dimensions, the nonsingular bi-variate Normal distribution with ({{nowrap|1=''k'' = rank(Σ) = 2}}), the probability density function of a (bivariate) vector {{nowrap|[''X'' ''Y'']′}} is
+
: See the SOCR Bivariate Normal Distribution [[SOCR_BivariateNormal_JS_Activity| Activity]] and corresponding [http://socr.ucla.edu/htmls/HTML5/BivariateNormal/ Webapp].
 +
 
 +
In 2-dimensions, the nonsingular bi-variate Normal distribution with (<math>k=rank(\Sigma) = 2</math>), the probability density function of a (bivariate) vector (X,Y) is
: <math>
: <math>
     f(x,y) =
     f(x,y) =
Line 46: Line 45:
   </math>
   </math>
-
In the bivariate case, the first equivalent condition for multivariate normality is less restrictive: it is sufficient to verify that countably many distinct linear combinations of X and Y are normal in order to conclude that the vector {{nowrap|[X Y]′}} is bivariate normal.
+
In the bivariate case, the first equivalent condition for multivariate normality is less restrictive: it is sufficient to verify that countably many distinct linear combinations of X and Y are normal in order to conclude that the vector <math> [ X, Y ] ^T</math> is bivariate normal.
===Properties===
===Properties===
Line 53: Line 52:
====Two normally distributed random variables need not be jointly bivariate normal====
====Two normally distributed random variables need not be jointly bivariate normal====
-
The fact that two random variables ''X'' and ''Y'' both have a normal distribution does not imply that the pair (''X'',&nbsp;''Y'') has a joint normal distribution.  A simple example is one in which X has a normal distribution with expected value 0 and variance 1, and ''Y''&nbsp;=&nbsp;''X'' if |''X''|&nbsp;>&nbsp;''c'' and ''Y''&nbsp;=&nbsp;−''X'' if |''X''|&nbsp;<&nbsp;''c'', where ''c'' is about 1.54.
+
The fact that two random variables ''X'' and ''Y'' both have a normal distribution does not imply that the pair (''X'',&nbsp;''Y'') has a joint normal distribution.  A simple example is provided below:
 +
: Let X ~ N(0,1).
 +
: Let <math>Y = \begin{cases} X,& |X| > 1.33,\\
 +
-X,& |X| \leq 1.33.\end{cases}</math>
 +
Then, both X and Y are individually Normally distributed; however, the pair (X,Y) is '''not''' jointly bivariate Normal distributed (of course, the constant c=1.33 is not special, any other non-trivial constant also works).
 +
 +
Furthermore, as X and Y are not independent, the sum Z = X+Y is not guaranteed to be a (univariate) Normal variable. In this case, it's clear that Z is not Normal:
 +
: <math>Z = \begin{cases} 0,& |X| \leq 1.33,\\
 +
2X,& |X| > 1.33.\end{cases}</math>
 +
 +
===Applications===
 +
[[SOCR_EduMaterials_Activities_2D_PointSegmentation_EM_Mixture| This SOCR activity demonstrates the use of 2D Gaussian distribution, expectation maximization and mixture modeling for classification of points (objects) in 2D]].
===[[EBook_Problems_MultivariateNormal|Problems]]===
===[[EBook_Problems_MultivariateNormal|Problems]]===
Line 61: Line 71:
===References===
===References===
-
 
+
* Gut, A. (2009): [http://books.google.com/books?id=ufxMwdtrmOAC An Intermediate Course in Probability, Springer 2009, chapter 5, ISBN 9781441901613].
<hr>
<hr>

Current revision as of 00:00, 22 July 2012

EBook - Multivariate Normal Distribution

The multivariate normal distribution, or multivariate Gaussian distribution, is a generalization of the univariate (one-dimensional) normal distribution to higher dimensions. A random vector is said to be multivariate normally distributed if every linear combination of its components has a univariate normal distribution. The multivariate normal distribution may be used to study different associations (e.g., correlations) between real-valued random variables.

Definition

In k-dimensions, a random vector X = (X_1, \cdots, X_k) is multivariate normally distributed if it satisfies any one of the following equivalent conditions (Gut, 2009):

  • Every linear combination of its components Y = a_1X_1 + … + a_kX_k is normally distributed. In other words, for any constant vector a\in R^k, the linear combination (which is a univariate random variable) Y = a^TX = \sum_{i=1}^{k}{a_iX_i} has a univariate normal distribution.
  • There exists a random ℓ-vector Z, whose components are independent normal random variables, a k-vector μ, and a k×ℓ matrix A, such that X = AZ + μ. Here ℓ is the rank of the variance-covariance matrix.
  • There is a k-vector μ and a symmetric, nonnegative-definite k×k matrix Σ, such that the characteristic function of X is

    \varphi_X(u) = \exp\Big( iu^T\mu - \tfrac{1}{2} u^T\Sigma u \Big).
  • When the support of X is the entire space R^k, there exists a k-vector μ and a symmetric positive-definite k×k variance-covariance matrix Σ, such that the probability density function of X can be expressed as

    f_X(x) = \frac{1}{ (2\pi)^{k/2}|\Sigma|^{1/2} }
             \exp\!\Big( {-\tfrac{1}{2}}(x-\mu)^T\Sigma^{-1}(x-\mu) \Big),
    where |Σ| is the determinant of Σ, and where (2π)^{k/2}|Σ|^{1/2} = |2πΣ|^{1/2}. This formulation reduces to the density of the univariate normal distribution if Σ is a scalar (i.e., a 1×1 matrix).
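
The reduction to the univariate case can be written out explicitly. The following is a short worked step, where σ^2 is notation assumed here for the scalar variance (so Σ = σ^2, |Σ| = σ^2 and Σ^{-1} = 1/σ^2):

```latex
% Setting k = 1 and \Sigma = \sigma^2 in the density above:
f_X(x) = \frac{1}{(2\pi)^{1/2}(\sigma^2)^{1/2}}
         \exp\!\Big( -\tfrac{1}{2}(x-\mu)\,\frac{1}{\sigma^2}\,(x-\mu) \Big)
       = \frac{1}{\sqrt{2\pi}\,\sigma}
         \exp\!\Big( -\frac{(x-\mu)^2}{2\sigma^2} \Big).
```

The right-hand side is the familiar N(μ, σ^2) density.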

If the variance-covariance matrix is singular, the corresponding distribution has no density with respect to the Lebesgue measure on R^k. An example of this case is the distribution of the vector of residuals in ordinary least squares regression. Note also that the X_i are in general not independent; they can be seen as the result of applying the matrix A to a collection of independent Gaussian variables Z.
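
The construction X = AZ + μ and the singular case can be illustrated with a small simulation. This is only a sketch using the Python standard library; the function names, the matrix A, the vector μ, the sample size and the seed are all illustrative assumptions, not part of the original text. Here k = 2 and ℓ = 1, so Σ = AA^T = [[1, 2], [2, 4]] has rank 1 and determinant 0.

```python
import random

def sample_affine_normal(A, mu, n, seed=0):
    """Draw n samples of X = A Z + mu, where Z has independent N(0, 1) components."""
    rng = random.Random(seed)
    k, ell = len(A), len(A[0])
    samples = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(ell)]
        x = [sum(A[i][j] * z[j] for j in range(ell)) + mu[i] for i in range(k)]
        samples.append(x)
    return samples

def sample_covariance(samples):
    """Unbiased sample variance-covariance matrix of a list of k-vectors."""
    n, k = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(k)]
    return [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
             for j in range(k)] for i in range(k)]

# Hypothetical example: a 2x1 matrix A, so every sample lies on one line in the plane.
A = [[1.0], [2.0]]
mu = [0.5, -1.0]
xs = sample_affine_normal(A, mu, n=20000)
S = sample_covariance(xs)
det_S = S[0][0] * S[1][1] - S[0][1] * S[1][0]

print("sample covariance:", S)   # should be close to A A^T = [[1, 2], [2, 4]]
print("determinant:", det_S)     # should be zero up to rounding error
```

Because every sample satisfies X_2 - μ_2 = 2(X_1 - μ_1), all the probability mass sits on a line, which is why no density on the plane exists in this example.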

Bivariate (2D) case

See the SOCR Bivariate Normal Distribution Activity and corresponding Webapp.

In 2 dimensions, for the nonsingular bivariate Normal distribution (k = rank(Σ) = 2), the probability density function of a (bivariate) vector (X,Y) is


    f(x,y) =
      \frac{1}{2 \pi  \sigma_x \sigma_y \sqrt{1-\rho^2}}
      \exp\left(
        -\frac{1}{2(1-\rho^2)}\left[
          \frac{(x-\mu_x)^2}{\sigma_x^2} +
          \frac{(y-\mu_y)^2}{\sigma_y^2} -
          \frac{2\rho(x-\mu_x)(y-\mu_y)}{\sigma_x \sigma_y}
        \right]
      \right),

where ρ is the correlation between X and Y. In this case,


    \mu = \begin{pmatrix} \mu_x \\ \mu_y \end{pmatrix}, \quad
    \Sigma = \begin{pmatrix} \sigma_x^2 & \rho \sigma_x \sigma_y \\
                             \rho \sigma_x \sigma_y  & \sigma_y^2 \end{pmatrix}.
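
As a sanity check, the closed-form bivariate density can be compared numerically with the general matrix form of the density for k = 2. The sketch below uses only the Python standard library; the function names and the parameter values are illustrative assumptions.

```python
import math

def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Closed-form bivariate normal density (requires |rho| < 1)."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    q = (zx * zx + zy * zy - 2.0 * rho * zx * zy) / (1.0 - rho * rho)
    norm = 2.0 * math.pi * sigma_x * sigma_y * math.sqrt(1.0 - rho * rho)
    return math.exp(-0.5 * q) / norm

def matrix_form_pdf(x, y, mu, Sigma):
    """General k = 2 density using det(Sigma) and the explicit 2x2 inverse."""
    (a, b), (c, d) = Sigma
    det = a * d - b * c
    dx, dy = x - mu[0], y - mu[1]
    # (x - mu)^T Sigma^{-1} (x - mu), with Sigma^{-1} = [[d, -b], [-c, a]] / det
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

# Hypothetical parameter values chosen only for illustration.
mu_x, mu_y, sigma_x, sigma_y, rho = 1.0, -2.0, 1.5, 0.5, 0.6
Sigma = [[sigma_x ** 2, rho * sigma_x * sigma_y],
         [rho * sigma_x * sigma_y, sigma_y ** 2]]

p_closed = bivariate_normal_pdf(0.3, -1.7, mu_x, mu_y, sigma_x, sigma_y, rho)
p_matrix = matrix_form_pdf(0.3, -1.7, [mu_x, mu_y], Sigma)
print(p_closed, p_matrix)  # the two values should agree up to rounding
```

The agreement follows from |Σ| = σ_x^2 σ_y^2 (1 - ρ^2) and from expanding the quadratic form with the inverse of Σ.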

In the bivariate case, the first equivalent condition for multivariate normality is less restrictive: it is sufficient to verify that countably infinitely many distinct linear combinations of X and Y are normal in order to conclude that the vector [X,Y]^T is bivariate normal.

Properties

Normally distributed and independent

If X and Y are normally distributed and independent, then they are "jointly normally distributed"; that is, the pair (X, Y) must have a bivariate normal distribution. However, a pair of jointly normally distributed variables need not be independent: they could be correlated.
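
The first statement can be checked directly in the nondegenerate case. As a short worked step, assume X ~ N(μ_x, σ_x^2) and Y ~ N(μ_y, σ_y^2) are independent with σ_x, σ_y > 0, so that their joint density is the product of the marginal densities:

```latex
f_X(x)\,f_Y(y)
  = \frac{1}{\sqrt{2\pi}\,\sigma_x}\exp\!\Big(-\frac{(x-\mu_x)^2}{2\sigma_x^2}\Big)
    \cdot
    \frac{1}{\sqrt{2\pi}\,\sigma_y}\exp\!\Big(-\frac{(y-\mu_y)^2}{2\sigma_y^2}\Big)
  = \frac{1}{2\pi\,\sigma_x\sigma_y}
    \exp\!\Big(-\frac{1}{2}\Big[\frac{(x-\mu_x)^2}{\sigma_x^2}
                               + \frac{(y-\mu_y)^2}{\sigma_y^2}\Big]\Big).
```

This is exactly the bivariate normal density from the previous section with ρ = 0.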

Two normally distributed random variables need not be jointly bivariate normal

The fact that two random variables X and Y both have a normal distribution does not imply that the pair (X, Y) has a joint normal distribution. A simple example is provided below:

Let X ~ N(0,1).
Let Y = \begin{cases} X,& |X| > 1.33,\\
-X,& |X| \leq 1.33.\end{cases}

Then, both X and Y are individually Normally distributed (Y is standard Normal because the N(0,1) distribution is symmetric about 0); however, the pair (X,Y) does not have a bivariate Normal distribution. Of course, the constant c=1.33 is not special; any other constant 0 < c < ∞ also works.

Furthermore, as X and Y are not independent, the sum Z = X+Y is not guaranteed to be a (univariate) Normal variable. In this case, Z is clearly not Normal, because it equals 0 with positive probability:

Z = \begin{cases} 0,& |X| \leq 1.33,\\
2X,& |X| > 1.33.\end{cases}
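
The counterexample can also be explored by simulation. The sketch below uses only the Python standard library; the function name, the sample size and the seed are illustrative assumptions. It draws X, builds Y and Z = X + Y as defined above, and compares the observed fraction of exact zeros in Z with P(|X| ≤ 1.33) = 2Φ(1.33) - 1.

```python
import random
from statistics import NormalDist, fmean, pvariance

C = 1.33  # threshold used in the example above

def simulate(n=100000, seed=1):
    """Return samples of X ~ N(0, 1), Y (sign-flipped inside [-C, C]) and Z = X + Y."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [x if abs(x) > C else -x for x in xs]
    zs = [x + y for x, y in zip(xs, ys)]
    return xs, ys, zs

xs, ys, zs = simulate()

# Y should look standard Normal marginally: mean near 0, variance near 1.
y_mean, y_var = fmean(ys), pvariance(ys)

# Z has a point mass at 0, which no Normal variable with positive variance has.
frac_zero = sum(1 for z in zs if z == 0.0) / len(zs)
expected_zero = 2.0 * NormalDist().cdf(C) - 1.0  # P(|X| <= C)

print("mean and variance of Y:", y_mean, y_var)
print("fraction of Z equal to 0:", frac_zero, "theoretical:", expected_zero)
```

Since Z = a^T (X, Y)^T with a = (1, 1) is a linear combination that is not Normal, the first equivalent condition in the definition fails, which confirms that (X, Y) is not bivariate Normal.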

Applications

This SOCR activity demonstrates the use of 2D Gaussian distribution, expectation maximization and mixture modeling for classification of points (objects) in 2D.

Problems


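References

  • Gut, A. (2009): An Intermediate Course in Probability, Springer 2009, chapter 5, ISBN 9781441901613. http://books.google.com/books?id=ufxMwdtrmOAC
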
