
The Bivariate Normal Distribution

This is Section 4.7 of the 1st edition (2002) of the book Introduction to Probability, by D. P. Bertsekas and J. N. Tsitsiklis. The material in this section was not included in the 2nd edition (2008).

Let U and V be two independent normal random variables, and consider two new random variables X and Y of the form

$$X = aU + bV, \qquad Y = cU + dV,$$

where a, b, c, d are some scalars. Each one of the random variables X and Y is normal, since it is a linear function of independent normal random variables. Furthermore, because X and Y are linear functions of the same two independent normal random variables, their joint PDF takes a special form, known as the bivariate normal PDF. The bivariate normal PDF has several useful and elegant properties and, for this reason, it is a commonly employed model. In this section, we derive many such properties, both qualitative and analytical, culminating in a closed-form expression for the joint PDF. To keep the discussion simple, we restrict ourselves to the case where X and Y have zero mean.

Jointly Normal Random Variables

Two random variables X and Y are said to be jointly normal if they can be expressed in the form

$$X = aU + bV, \qquad Y = cU + dV,$$

where U and V are independent normal random variables.

Note that if X and Y are jointly normal, then any linear combination

$$Z = s_1 X + s_2 Y$$

has a normal distribution.

(Footnote: For the purposes of this section, we adopt the following convention. A random variable which is always equal to a constant will also be called normal, with zero variance, even though it does not have a PDF. With this convention, the family of normal random variables is closed under linear operations. That is, if X is normal, then aX + b is also normal, even if a = 0.)

The reason is that if we have X = aU + bV and Y = cU + dV for some independent normal random variables U and V, then

$$Z = s_1(aU + bV) + s_2(cU + dV) = (as_1 + cs_2)U + (bs_1 + ds_2)V.$$

Thus, Z is the sum of the independent normal random variables $(as_1 + cs_2)U$ and $(bs_1 + ds_2)V$, and is therefore normal.
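As a concrete illustration, here is a minimal simulation sketch (ours, not part of the text); the scalars a, b, c, d and s1, s2 are arbitrary choices, and U, V are taken to be standard normal, so the variance of Z can be predicted from its representation in terms of U and V:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Independent standard normal random variables U and V.
U = rng.standard_normal(n)
V = rng.standard_normal(n)

# Jointly normal X and Y as linear functions of U and V.
a, b, c, d = 1.0, 2.0, -1.0, 0.5
X = a * U + b * V
Y = c * U + d * V

# Z = s1*X + s2*Y = (a*s1 + c*s2)*U + (b*s1 + d*s2)*V, so its variance
# is the sum of the squared coefficients (U and V have unit variance).
s1, s2 = 0.7, -1.3
Z = s1 * X + s2 * Y
print(Z.var(), (a*s1 + c*s2)**2 + (b*s1 + d*s2)**2)  # close agreement
```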

A very important property of jointly normal random variables, one that will be the starting point for our development, is that zero correlation implies independence.

Zero Correlation Implies Independence

If two random variables X and Y are jointly normal and are uncorrelated, then they are independent.

This property can be verified using multivariate transforms, as follows. Suppose that U and V are independent zero-mean normal random variables, and that X = aU + bV and Y = cU + dV, so that X and Y are jointly normal. We assume that X and Y are uncorrelated, and we wish to show that they are independent. Our first step is to derive a formula for the multivariate transform $M_{X,Y}(s_1, s_2)$ associated with X and Y. Recall that if Z is a zero-mean normal random variable with variance $\sigma_Z^2$, the associated transform is

$$E[e^{sZ}] = M_Z(s) = e^{\sigma_Z^2 s^2/2},$$

which implies that

$$E[e^{Z}] = M_Z(1) = e^{\sigma_Z^2/2}.$$

Let us fix some scalars $s_1$, $s_2$, and let $Z = s_1X + s_2Y$. The random variable Z is normal, by our earlier discussion, with variance

$$\sigma_Z^2 = s_1^2\sigma_X^2 + s_2^2\sigma_Y^2,$$

since the cross term $2s_1s_2E[XY]$ vanishes when X and Y are uncorrelated.

This leads to the following formula for the multivariate transform associated with the uncorrelated pair X and Y:

$$M_{X,Y}(s_1, s_2) = E\big[e^{s_1X + s_2Y}\big] = E[e^{Z}] = e^{(s_1^2\sigma_X^2 + s_2^2\sigma_Y^2)/2}.$$

Let now $\bar X$ and $\bar Y$ be independent zero-mean normal random variables with the same variances $\sigma_X^2$ and $\sigma_Y^2$ as X and Y, respectively. Since $\bar X$ and $\bar Y$ are independent, they are also uncorrelated, and the preceding argument yields

$$M_{\bar X,\bar Y}(s_1, s_2) = e^{(s_1^2\sigma_X^2 + s_2^2\sigma_Y^2)/2}.$$


Thus, the two pairs of random variables $(X, Y)$ and $(\bar X, \bar Y)$ are associated with the same multivariate transform. Since the multivariate transform completely determines the joint PDF, it follows that the pair $(X, Y)$ has the same joint PDF as the pair $(\bar X, \bar Y)$. Since $\bar X$ and $\bar Y$ are independent, X and Y must also be independent, which establishes our claim.
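The conclusion can also be checked numerically. The sketch below (our illustration, with arbitrary coefficients chosen so that ac + bd = 0, making X and Y uncorrelated) compares a joint probability with the product of the marginal probabilities, and also compares a sample estimate of the multivariate transform with the closed form derived above:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
U = rng.standard_normal(n)
V = rng.standard_normal(n)

# With unit-variance U, V, choosing a*c + b*d = 0 gives cov(X, Y) = 0.
X = 2.0 * U + 1.0 * V
Y = 1.0 * U - 2.0 * V
print(np.corrcoef(X, Y)[0, 1])  # approximately 0

# Independence: joint probabilities factor into products of marginals.
print(np.mean((X > 1.0) & (Y < -0.5)),
      np.mean(X > 1.0) * np.mean(Y < -0.5))

# The transform matches e^{(s1^2 sx^2 + s2^2 sy^2)/2} for uncorrelated X, Y.
s1, s2 = 0.3, 0.2
print(np.mean(np.exp(s1 * X + s2 * Y)),
      np.exp((s1**2 * X.var() + s2**2 * Y.var()) / 2))
```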

The Conditional Distribution of X Given Y

We now turn to the problem of estimating X given the value of Y. To avoid uninteresting degenerate cases, we assume that both X and Y have positive variance. Let us define

$$\hat X = \rho\,\frac{\sigma_X}{\sigma_Y}\,Y, \qquad \tilde X = X - \hat X,$$

where

$$\rho = \frac{E[XY]}{\sigma_X\sigma_Y}$$

is the correlation coefficient of X and Y. Since X and Y are linear combinations of independent normal random variables U and V, it follows that Y and $\tilde X$ are also linear combinations of U and V. In particular, Y and $\tilde X$ are jointly normal.

Furthermore,

$$E[Y\tilde X] = E[YX] - E[Y\hat X] = \rho\,\sigma_X\sigma_Y - \rho\,\frac{\sigma_X}{\sigma_Y}\,\sigma_Y^2 = 0.$$

Thus, Y and $\tilde X$ are uncorrelated and, therefore, independent. Since $\hat X$ is a scalar multiple of Y, it follows that $\hat X$ and $\tilde X$ are independent.

We have so far decomposed X into a sum of two independent normal random variables, namely,

$$X = \hat X + \tilde X = \rho\,\frac{\sigma_X}{\sigma_Y}\,Y + \tilde X.$$

We take conditional expectations of both sides, given Y, to obtain

$$E[X \mid Y] = \rho\,\frac{\sigma_X}{\sigma_Y}\,E[Y \mid Y] + E[\tilde X \mid Y] = \rho\,\frac{\sigma_X}{\sigma_Y}\,Y = \hat X,$$

where we have made use of the independence of Y and $\tilde X$ to set $E[\tilde X \mid Y] = 0$. We have therefore reached the important conclusion that the conditional expectation $E[X \mid Y]$ is a linear function of the random variable Y.
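This linearity can be seen empirically by averaging X over narrow bins of Y and comparing with the line $\rho\,(\sigma_X/\sigma_Y)\,y$; a sketch of ours, with arbitrary parameters:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
U = rng.standard_normal(n)
V = rng.standard_normal(n)
X = U + 0.5 * V   # a correlated, zero-mean jointly normal pair
Y = U - 0.5 * V

sx, sy = X.std(), Y.std()
rho = np.mean(X * Y) / (sx * sy)  # correlation coefficient (zero means)

# Average X over a narrow bin around each y0; compare with rho*(sx/sy)*y0.
for y0 in (-1.0, 0.0, 1.0):
    in_bin = np.abs(Y - y0) < 0.05
    print(y0, X[in_bin].mean(), rho * (sx / sy) * y0)
```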

Using the above decomposition, it is now easy to determine the conditional PDF of X. Given a value of Y, the random variable $\hat X = \rho\,\sigma_X Y/\sigma_Y$ becomes a known constant, but the normal distribution of the random variable $\tilde X$ is unaffected, since $\tilde X$ is independent of Y. Therefore, the conditional distribution of X given Y is the same as the unconditional distribution of $\tilde X$, shifted by $\hat X$. Since $\tilde X$ is normal with mean zero and some variance $\sigma_{\tilde X}^2$, we conclude that the conditional distribution of X given Y is also normal with mean $\hat X$ and the same variance $\sigma_{\tilde X}^2$.

(Footnote: Comparing with the formulas in the preceding section, it is seen that $\hat X$ is defined to be the linear least squares estimator of X, and $\tilde X$ is the corresponding estimation error, although these facts are not needed for the argument that follows.)

The variance of $\tilde X$ can be found with the following calculation:

$$\sigma_{\tilde X}^2 = E\left[\left(X - \rho\,\frac{\sigma_X}{\sigma_Y}\,Y\right)^2\right] = \sigma_X^2 - 2\rho\,\frac{\sigma_X}{\sigma_Y}\,\rho\,\sigma_X\sigma_Y + \rho^2\,\frac{\sigma_X^2}{\sigma_Y^2}\,\sigma_Y^2 = (1 - \rho^2)\sigma_X^2,$$

where we have made use of the property $E[XY] = \rho\,\sigma_X\sigma_Y$.

We summarize our conclusions below. Although our discussion used the zero-mean assumption, these conclusions also hold for the non-zero mean case, and we state them with this added generality; see the end-of-chapter problems.

Properties of Jointly Normal Random Variables

Let X and Y be jointly normal random variables.

• X and Y are independent if and only if they are uncorrelated.

• The conditional expectation of X given Y satisfies

$$E[X \mid Y] = E[X] + \rho\,\frac{\sigma_X}{\sigma_Y}\,\big(Y - E[Y]\big).$$

It is a linear function of Y and has a normal PDF.

• The estimation error $\tilde X = X - E[X \mid Y]$ is zero-mean, normal, and independent of Y, with variance

$$\sigma_{\tilde X}^2 = (1 - \rho^2)\sigma_X^2.$$

• The conditional distribution of X given Y is normal with mean $E[X \mid Y]$ and variance $\sigma_{\tilde X}^2$.
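These boxed properties are easy to verify by simulation; the sketch below (ours, with arbitrary coefficients) checks that the estimation error $\tilde X$ is uncorrelated with Y and has variance $(1 - \rho^2)\sigma_X^2$:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000
U = rng.standard_normal(n)
V = rng.standard_normal(n)
X = U + 2.0 * V   # jointly normal, zero-mean pair
Y = 3.0 * U - V

sx, sy = X.std(), Y.std()
rho = np.mean(X * Y) / (sx * sy)

X_hat = rho * (sx / sy) * Y   # E[X | Y] in the zero-mean case
X_tilde = X - X_hat           # estimation error

print(np.corrcoef(X_tilde, Y)[0, 1])        # approximately 0
print(X_tilde.var(), (1 - rho**2) * sx**2)  # close agreement
```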

The Form of the Bivariate Normal PDF

Having determined the parameters of the PDF of $\tilde X$ and of the conditional PDF of X, we can give explicit formulas for these PDFs. We keep assuming that X and Y have zero means and positive variances. Furthermore, to avoid the degenerate case where $\tilde X$ is identically zero, we assume that $|\rho| < 1$. We have

$$f_{\tilde X}(\tilde x) = f_{\tilde X \mid Y}(\tilde x \mid y) = \frac{1}{\sqrt{2\pi}\,\sqrt{1 - \rho^2}\,\sigma_X}\, e^{-\tilde x^2/2\sigma_{\tilde X}^2},$$

and

$$f_{X \mid Y}(x \mid y) = \frac{1}{\sqrt{2\pi}\,\sqrt{1 - \rho^2}\,\sigma_X}\, \exp\left\{-\frac{\left(x - \rho\,\frac{\sigma_X}{\sigma_Y}\,y\right)^2}{2\sigma_{\tilde X}^2}\right\},$$

where

$$\sigma_{\tilde X}^2 = (1 - \rho^2)\sigma_X^2.$$

Using also the formula for the PDF of Y,

$$f_Y(y) = \frac{1}{\sqrt{2\pi}\,\sigma_Y}\, e^{-y^2/2\sigma_Y^2},$$

and the multiplication rule $f_{X,Y}(x, y) = f_Y(y)\,f_{X \mid Y}(x \mid y)$, we can obtain the joint PDF of X and Y. This PDF is of the form

$$f_{X,Y}(x, y) = c\,e^{-q(x,y)},$$

where the normalizing constant is

$$c = \frac{1}{2\pi\sqrt{1 - \rho^2}\,\sigma_X\sigma_Y}.$$

The exponent term q(x, y) is a quadratic function of x and y,

$$q(x, y) = \frac{y^2}{2\sigma_Y^2} + \frac{\left(x - \rho\,\frac{\sigma_X}{\sigma_Y}\,y\right)^2}{2(1 - \rho^2)\sigma_X^2},$$

which after some straightforward algebra simplifies to

$$q(x, y) = \frac{\dfrac{x^2}{\sigma_X^2} - \dfrac{2\rho\,xy}{\sigma_X\sigma_Y} + \dfrac{y^2}{\sigma_Y^2}}{2(1 - \rho^2)}.$$

An important observation here is that the joint PDF is completely determined by $\sigma_X$, $\sigma_Y$, and $\rho$.
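To make this concrete, here is a sketch (ours, not from the text) that evaluates the closed form $c\,e^{-q(x,y)}$ directly and compares it with SciPy's multivariate normal PDF for the covariance matrix implied by $\sigma_X$, $\sigma_Y$, and $\rho$:

```python
import numpy as np
from scipy.stats import multivariate_normal

def bivariate_normal_pdf(x, y, sx, sy, rho):
    """Zero-mean bivariate normal PDF, evaluated as c * exp(-q(x, y))."""
    c = 1.0 / (2.0 * np.pi * np.sqrt(1.0 - rho**2) * sx * sy)
    q = (x**2 / sx**2 - 2.0 * rho * x * y / (sx * sy) + y**2 / sy**2) \
        / (2.0 * (1.0 - rho**2))
    return c * np.exp(-q)

sx, sy, rho = 1.5, 0.8, 0.6
cov = [[sx**2, rho * sx * sy], [rho * sx * sy, sy**2]]
x, y = 0.7, -0.3
print(bivariate_normal_pdf(x, y, sx, sy, rho))                # closed form
print(multivariate_normal(mean=[0, 0], cov=cov).pdf([x, y]))  # same value
```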

In the special case where X and Y are uncorrelated ($\rho = 0$), the joint PDF takes the simple form

$$f_{X,Y}(x, y) = \frac{1}{2\pi\,\sigma_X\sigma_Y}\, e^{-\frac{x^2}{2\sigma_X^2} - \frac{y^2}{2\sigma_Y^2}}.$$
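For completeness, a short worked step (ours, not in the original text) shows that this $\rho = 0$ form factors into the product of the two marginal normal PDFs, consistent with the fact that zero correlation implies independence:

$$\frac{1}{2\pi\,\sigma_X\sigma_Y}\, e^{-\frac{x^2}{2\sigma_X^2} - \frac{y^2}{2\sigma_Y^2}} = \underbrace{\frac{1}{\sqrt{2\pi}\,\sigma_X}\, e^{-x^2/2\sigma_X^2}}_{f_X(x)} \cdot \underbrace{\frac{1}{\sqrt{2\pi}\,\sigma_Y}\, e^{-y^2/2\sigma_Y^2}}_{f_Y(y)}.$$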
