Chapter 10 Joint densities - Yale University


Chapter 10

Joint densities

Consider the general problem of describing probabilities involving two random variables, X and Y. If both have discrete distributions, with X taking values x₁, x₂, . . . and Y taking values y₁, y₂, . . ., then everything about the joint behavior of X and Y can be deduced from the set of probabilities

P{X = xᵢ, Y = yⱼ} for i = 1, 2, . . . and j = 1, 2, . . .

We have been working for some time with problems involving such pairs of random variables, but we have not needed to formalize the concept of a joint distribution. When both X and Y have continuous distributions, it becomes more important to have a systematic way to describe how one might calculate probabilities of the form P{(X, Y) ∈ B} for various subsets B of the plane. For example, how could one calculate P{X < Y} or P{X² + Y² ≤ 9} or P{X + Y ≤ 7}?
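Probabilities of this kind are often easiest to estimate by simulation. As an illustration only (the text has not yet fixed distributions for X and Y, so the choice of independent standard normals here is an assumption for the sketch), a Monte Carlo estimate of two such probabilities; for independent N(0, 1) variables, P{X < Y} = 1/2 by symmetry, and X² + Y² is exponentially distributed so P{X² + Y² ≤ 9} = 1 − e^{−9/2}:

```python
import math
import random

random.seed(12345)
n = 100_000

# Draw n independent points (X, Y) with X, Y ~ N(0, 1) -- an assumed
# distribution, chosen purely to illustrate the calculations.
pts = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]

p_less = sum(x < y for x, y in pts) / n            # P{X < Y} = 1/2 by symmetry
p_disk = sum(x * x + y * y <= 9 for x, y in pts) / n  # = 1 - exp(-9/2)

print(p_less, p_disk)
```

Both estimates should land within a percent or so of the exact values 0.5 and 1 − e^{−4.5} ≈ 0.989.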

joint density

Definition. Say that random variables X and Y have a jointly continuous distribution with joint density function f(·, ·) if

P{(X, Y) ∈ B} = ∫∫_{(x,y)∈B} f(x, y) dx dy

for each subset B of R². In particular, for a small region Δ around a point (x₀, y₀),

P{(X, Y) ∈ Δ} ≈ (area of Δ) f(x₀, y₀),

at least if f is continuous at (x₀, y₀).

To ensure that P{(X, Y) ∈ B} is nonnegative and that it equals one when B is the whole of R², we must require

f ≥ 0 and ∫∫_{R²} f(x, y) dx dy = 1.

Apart from the replacement of single integrals by double integrals, and the replacement of intervals of small length by regions of small area, the definition of a joint density is the same as the definition for densities on the real line in Chapter 6.

The small region Δ can be chosen in many ways--small rectangles, small disks, small blobs, small shapes that don't have any particular name--whatever suits the needs of a particular calculation.

Example. When X has density g(x) and Y has density h(y), and X is independent of Y, the joint density is particularly easy to calculate. Let Δ be a small rectangle with one corner at (x₀, y₀) and small sides of lengths δx > 0 and δy > 0:

Δ = {(x, y) ∈ R² : x₀ ≤ x ≤ x₀ + δx, y₀ ≤ y ≤ y₀ + δy}

By independence,

P{(X, Y) ∈ Δ} = P{x₀ ≤ X ≤ x₀ + δx} P{y₀ ≤ Y ≤ y₀ + δy}

Statistics 241: 28 October 1997

© David Pollard


marginal densities

Invoke the defining property of the densities g and h to approximate the last product by

(g(x₀)δx + smaller order terms)(h(y₀)δy + smaller order terms) ≈ δx δy g(x₀)h(y₀)

Thus f(x₀, y₀) = g(x₀)h(y₀). That is, the joint density f is the product of the marginal densities g and h. The word marginal is used here to distinguish the joint density for (X, Y) from the individual densities g and h.
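The small-rectangle approximation behind this factorization can be checked numerically. A minimal sketch, taking X and Y to be independent Exp(1) variables (my choice for illustration, not from the text): the exact rectangle probability, computed from the exponential CDF F(t) = 1 − e^{−t}, agrees with (area of Δ)·g(x₀)h(y₀) up to smaller order terms.

```python
import math

x0, y0 = 0.5, 1.2
delta = 1e-3  # side length of the small rectangle

# Exact probability of the rectangle for independent Exp(1) variables,
# computed from the CDF F(t) = 1 - exp(-t).
exact = (math.exp(-x0) - math.exp(-(x0 + delta))) * \
        (math.exp(-y0) - math.exp(-(y0 + delta)))

# Small-region approximation: (area of rectangle) * g(x0) * h(y0).
approx = delta**2 * math.exp(-x0) * math.exp(-y0)

print(exact, approx)
```

The ratio exact/approx equals ((1 − e^{−δ})/δ)², which tends to 1 as δ shrinks.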


When pairs of random variables are not independent it takes more work to find a joint density. The prototypical case, where new random variables are constructed as linear functions of random variables with a known joint density, illustrates a general method for deriving joint densities.

Exercise. Suppose X and Y have a jointly continuous distribution with joint density f(x, y). For constants a, b, c, d with ad − bc ≠ 0 define

U = aX + bY and V = cX + dY

Find the joint density function ψ(u, v) for (U, V).

Solution: Think of the pair (U, V) as defining a new random point in R². That is, (U, V) = T(X, Y), where T maps the point (x, y) in R² to the point (u, v) in R² with

u = ax + by and v = cx + dy,

or, in matrix notation,

(u, v) = (x, y)A    where A = ( a  c
                                b  d )

Notice that det A = ad − bc. The assumption that ad − bc ≠ 0 ensures that the transformation is invertible:

(u, v)A⁻¹ = (x, y)    where A⁻¹ = (1/(ad − bc)) (  d  −c
                                                  −b   a )

That is,

x = (du − bv)/(ad − bc) and y = (−cu + av)/(ad − bc)

Notice that det A⁻¹ = 1/(ad − bc) = 1/det A.

It helps to distinguish between the two roles for R², referring to the domain of T as the (X, Y)-plane and the range as the (U, V)-plane.
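A quick numerical sanity check of the inversion formulas x = (du − bv)/(ad − bc) and y = (−cu + av)/(ad − bc), with coefficients chosen arbitrarily for illustration:

```python
# Arbitrary invertible coefficients: ad - bc = 2*3 - 1*1 = 5, nonzero.
a, b, c, d = 2.0, 1.0, 1.0, 3.0
x, y = 0.7, -1.3

# Forward transformation T: (x, y) -> (u, v).
u, v = a * x + b * y, c * x + d * y

# Inverse transformation, applying the displayed formulas.
det = a * d - b * c
x_back = (d * u - b * v) / det
y_back = (-c * u + a * v) / det
print(x_back, y_back)  # recovers (0.7, -1.3)
```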

The joint density function ψ(u, v) is characterized by the property that

P{u₀ ≤ U ≤ u₀ + δu, v₀ ≤ V ≤ v₀ + δv} ≈ ψ(u₀, v₀) δu δv

for each (u₀, v₀) in the (U, V)-plane, and small δu, δv. To calculate the probability on the left-hand side we need to find the region R in the (X, Y)-plane corresponding to the small rectangle Δ, with corners at (u₀, v₀) and (u₀ + δu, v₀ + δv), in the (U, V)-plane.

The linear transformation A⁻¹ maps parallel straight lines in the (U, V)-plane into parallel straight lines in the (X, Y)-plane. The region R must be a parallelogram, with vertices

(x₀, y₀) = (u₀, v₀)A⁻¹ and (x₀ + δx, y₀) = (u₀ + δu, v₀)A⁻¹
(x₀, y₀ + δy) = (u₀, v₀ + δv)A⁻¹ and (x₀ + δx, y₀ + δy) = (u₀ + δu, v₀ + δv)A⁻¹

More succinctly,

(δx, δy) = (δu, δv)A⁻¹ = δu r₁ + δv r₂

where r₁ and r₂ denote the rows of A⁻¹.

[Figure: the small rectangle Δ with corners (u₀, v₀) and (u₀ + δu, v₀ + δv) in the (U, V)-plane, and the corresponding parallelogram R with vertex (x₀, y₀) in the (X, Y)-plane, its sides given by the rows of A⁻¹ scaled by δu and δv.]

From the formula in the Appendix, the parallelogram R has area

δu δv |det A⁻¹| = δu δv / |det A|

For small δu > 0 and δv > 0,

ψ(u₀, v₀) δu δv ≈ P{(U, V) ∈ Δ}
 = P{(X, Y) ∈ R}
 ≈ (area of R) f(x₀, y₀)
 ≈ δu δv f(x₀, y₀)/|det A|

It follows that (U, V) has joint density

ψ(u, v) = f(x, y)/|det A|    where (x, y) = (u, v)A⁻¹

In effect, we have calculated a Jacobian by first principles.
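The change-of-variable formula can be verified by simulation. A sketch under assumptions of my own choosing (X and Y independent standard normals, so f(x, y) = e^{−(x²+y²)/2}/(2π), and one particular invertible A): the empirical density of (U, V) over a small square centred at (u₀, v₀) should approach f((u₀, v₀)A⁻¹)/|det A|.

```python
import math
import random

random.seed(7)
a, b, c, d = 1.0, 2.0, 0.0, 1.0   # illustrative coefficients; det A = 1
n = 1_000_000
u0, v0, h = 0.3, -0.2, 0.1        # small square of side h centred at (u0, v0)

count = 0
for _ in range(n):
    x, y = random.gauss(0, 1), random.gauss(0, 1)
    u, v = a * x + b * y, c * x + d * y
    if abs(u - u0) <= h / 2 and abs(v - v0) <= h / 2:
        count += 1
est = count / (n * h * h)         # empirical density of (U, V) near (u0, v0)

det = a * d - b * c
xm = (d * u0 - b * v0) / det      # (x, y) = (u0, v0) A^{-1}
ym = (-c * u0 + a * v0) / det
formula = math.exp(-(xm * xm + ym * ym) / 2) / (2 * math.pi * abs(det))
print(est, formula)
```

With a million draws the two numbers agree to within Monte Carlo error of a few percent.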

Example. Suppose X and Y are independent random variables, each distributed N(0, 1). By the Example above, the joint density for (X, Y) equals

f(x, y) = (1/(2π)) exp(−(x² + y²)/2)

By the Exercise above, the joint distribution of the random variables

U = aX + bY and V = cX + dY

has the joint density

ψ(u, v) = (1/(2π|ad − bc|)) exp( −(1/2)((du − bv)/(ad − bc))² − (1/2)((−cu + av)/(ad − bc))² )

 = (1/(2π|ad − bc|)) exp( −((c² + d²)u² − 2(bd + ac)uv + (a² + b²)v²) / (2(ad − bc)²) )

You'll learn more about joint normal distributions in Chapter 12.
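The two displayed forms of ψ must agree, since the second merely expands the squares in the first. A numeric spot check, with coefficients and a point (u, v) chosen arbitrarily:

```python
import math

a, b, c, d = 1.5, 0.5, -1.0, 2.0   # arbitrary coefficients, ad - bc = 3.5
det = a * d - b * c
u, v = 0.8, -0.4

# First form: substitute (x, y) = (u, v) A^{-1} into the normal density.
x = (d * u - b * v) / det
y = (-c * u + a * v) / det
psi1 = math.exp(-(x * x + y * y) / 2) / (2 * math.pi * abs(det))

# Second form: the expanded quadratic in u and v.
quad = (c * c + d * d) * u * u - 2 * (d * b + a * c) * u * v \
       + (a * a + b * b) * v * v
psi2 = math.exp(-quad / (2 * det * det)) / (2 * math.pi * abs(det))

print(psi1, psi2)  # identical up to rounding
```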

The calculations in the Exercise above for linear transformations give a good approximation for more general smooth transformations when applied to small regions. Densities describe the behaviour of distributions in small regions; in small regions smooth transformations are approximately linear; and the density formula for linear transformations gives the density formula for smooth transformations in small regions.

Exercise. Suppose X and Y are independent random variables, with X having a gamma(α) distribution and Y having a gamma(β) distribution. Show that X/(X + Y) has a beta(α, β) distribution, independent of X + Y, which has a gamma(α + β) distribution.

Solution: Write U for X/(X + Y) and V for X + Y. The pair (X, Y) takes values ranging over the positive quadrant (0, ∞)², with joint density function

f(x, y) = (x^{α−1} e^{−x}/Γ(α)) · (y^{β−1} e^{−y}/Γ(β))    for x > 0, y > 0.


Jacobian matrix

The pair (U, V) takes values in the strip (0, 1) × (0, ∞). That is, 0 < U < 1 and 0 < V < ∞. The joint density function ψ(u, v) for (U, V) remains to be determined.

Consider ψ(·, ·) near a point (u₀, v₀) in the strip. If U = u₀ and V = v₀ then

X = UV = u₀v₀ and Y = V − UV = (1 − u₀)v₀

Moreover, (U, V) lies near (u₀, v₀) when (X, Y) lies near (x₀, y₀), where

x₀ = u₀v₀ and y₀ = (1 − u₀)v₀

Notice how each (u₀, v₀) with 0 < u₀ < 1 and 0 < v₀ < ∞ corresponds to a unique (x₀, y₀) with 0 < x₀ < ∞ and 0 < y₀ < ∞.

For small positive δu and δv, determine the region R in the (X, Y) quadrant corresponding to the small rectangle

Δ = {(u, v) : u₀ ≤ u ≤ u₀ + δu, v₀ ≤ v ≤ v₀ + δv}

in the (U, V) strip. First locate the points corresponding to the corners of Δ:

(u₀ + δu, v₀) ↦ (x₀, y₀) + (δu v₀, −δu v₀)
(u₀, v₀ + δv) ↦ (x₀, y₀) + (δv u₀, δv(1 − u₀))
(u₀ + δu, v₀ + δv) ↦ (x₀, y₀) + (δu v₀ + δv u₀, −δu v₀ + δv(1 − u₀)) + (δu δv, −δu δv)

In matrix notation,

(u₀, v₀) + (δu, 0) ↦ (x₀, y₀) + (δu, 0)J
(u₀, v₀) + (0, δv) ↦ (x₀, y₀) + (0, δv)J
(u₀, v₀) + (δu, δv) ↦ (x₀, y₀) + (δu, δv)J + smaller order terms

where J = (  v₀    −v₀
             u₀  1 − u₀ )

You might recognize J as the Jacobian matrix of partial derivatives

J = ( ∂x/∂u  ∂y/∂u
      ∂x/∂v  ∂y/∂v )

evaluated at (u₀, v₀). For small perturbations, the transformation from (u, v) to (x, y) is approximately linear.
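The Jacobian matrix can be confirmed by finite differences. A sketch, with the point (u₀, v₀) = (0.3, 2.0) in the strip chosen arbitrarily: each row of J is the image of a small step in u or in v, rescaled.

```python
u0, v0 = 0.3, 2.0   # an arbitrary point in the (U, V) strip
eps = 1e-6

def T(u, v):
    # The transformation (x, y) = (uv, v - uv) = (uv, (1 - u)v).
    return (u * v, v - u * v)

x0, y0 = T(u0, v0)
# Rows of J: images of small steps in u and in v, scaled by 1/eps.
row_u = tuple((p - q) / eps for p, q in zip(T(u0 + eps, v0), (x0, y0)))
row_v = tuple((p - q) / eps for p, q in zip(T(u0, v0 + eps), (x0, y0)))
det_J = row_u[0] * row_v[1] - row_u[1] * row_v[0]

print(row_u, row_v, det_J)  # rows near (v0, -v0) and (u0, 1 - u0); det near v0
```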

[Figure: the small rectangle Δ with corners (u₀, v₀) and (u₀ + δu, v₀ + δv) in the (U, V)-strip, and the corresponding region R near (x₀, y₀) in the (X, Y)-quadrant.]

The region R is approximately a parallelogram, with edges oblique to the coordinate axes. To a good approximation, the area of R is equal to δu δv times the area of the parallelogram with corners at

(0, 0) and a = (v₀, −v₀) and b = (u₀, 1 − u₀) and a + b

From the Appendix, the area of this parallelogram equals |det(J)| = v₀. The rest of the calculation of the joint density ψ(·, ·) for (U, V) is easy:

ψ(u₀, v₀) δu δv ≈ P{(U, V) ∈ Δ} = P{(X, Y) ∈ R}


beta vs. gamma

 ≈ f(x₀, y₀)(area of R)

 ≈ (x₀^{α−1} e^{−x₀}/Γ(α)) (y₀^{β−1} e^{−y₀}/Γ(β)) δu δv v₀

Substitute x₀ = u₀v₀ and y₀ = (1 − u₀)v₀, then rearrange factors, to get the joint density

ψ(u₀, v₀) = (u₀^{α−1} v₀^{α−1} e^{−u₀v₀}/Γ(α)) · ((1 − u₀)^{β−1} v₀^{β−1} e^{−v₀+u₀v₀}/Γ(β)) · v₀

If we write

g(u) = u^{α−1}(1 − u)^{β−1}/B(α, β)    (the beta(α, β) density)

h(v) = v^{α+β−1} e^{−v}/Γ(α + β)    (the gamma(α + β) density)

then

ψ(u, v) = (B(α, β)Γ(α + β)/(Γ(α)Γ(β))) g(u)h(v)    for 0 < u < 1 and 0 < v < ∞
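The substitution algebra can be spot-checked numerically: f(x₀, y₀)·v₀ should equal u₀^{α−1}(1 − u₀)^{β−1} v₀^{α+β−1} e^{−v₀}/(Γ(α)Γ(β)), the rearranged product. A sketch with α, β and the point (u₀, v₀) chosen arbitrarily:

```python
import math

alpha, beta = 2.0, 3.0                 # illustrative parameter values
u0, v0 = 0.35, 1.8                     # a point in the (U, V) strip
x0, y0 = u0 * v0, (1 - u0) * v0        # the corresponding (X, Y) point

# psi(u0, v0) = f(x0, y0) * |det J|, with |det J| = v0.
f = (x0**(alpha - 1) * math.exp(-x0) / math.gamma(alpha)) * \
    (y0**(beta - 1) * math.exp(-y0) / math.gamma(beta))
psi = f * v0

# Rearranged form: u^{a-1}(1-u)^{b-1} v^{a+b-1} e^{-v} / (Gamma(a)Gamma(b)).
rearranged = (u0**(alpha - 1) * (1 - u0)**(beta - 1)
              * v0**(alpha + beta - 1) * math.exp(-v0)
              / (math.gamma(alpha) * math.gamma(beta)))

print(psi, rearranged)  # equal up to rounding
```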

I have dropped the subscripting zeros because I no longer need to keep your attention fixed on a particular (u₀, v₀) in the (U, V) strip. The jumble of constants involving beta and gamma functions must reduce to the constant 1, because

1 = P{0 < U < 1, 0 < V < ∞}

 = ∫∫_{0<u<1, 0<v<∞} ψ(u, v) du dv

 = (B(α, β)Γ(α + β)/(Γ(α)Γ(β))) ∫₀¹ g(u) du ∫₀^∞ h(v) dv

Notice how the double integral has split into a product of two single integrals because the joint density factorized into a product of a function of u and a function of v. Both of the single integrals equal 1 because both g and h are density functions. We have earned a bonus,

B(α, β) = Γ(α)Γ(β)/Γ(α + β)    for α > 0 and β > 0
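The bonus identity is easy to test numerically: compare a direct numerical integral of u^{α−1}(1 − u)^{β−1} over (0, 1) with Γ(α)Γ(β)/Γ(α + β). A sketch using the midpoint rule (the parameter values are arbitrary):

```python
import math

def beta_fn(a, b, n=100_000):
    # Midpoint-rule integral of t^{a-1} (1-t)^{b-1} over (0, 1).
    h = 1.0 / n
    return sum(((i + 0.5) * h)**(a - 1) * (1 - (i + 0.5) * h)**(b - 1)
               for i in range(n)) * h

checks = []
for a, b in [(2.0, 3.0), (0.5, 2.5), (1.7, 4.2)]:
    lhs = beta_fn(a, b)                                     # B(a, b) by quadrature
    rhs = math.gamma(a) * math.gamma(b) / math.gamma(a + b)  # the identity
    checks.append((lhs, rhs))
    print(a, b, lhs, rhs)
```

The midpoint rule handles the integrable endpoint singularities (when α < 1 or β < 1) gracefully, since it never evaluates the integrand at 0 or 1.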

which is a useful expression relating beta and gamma functions.

The factorization of the joint density implies that the random variables U and V are independent. To see why, consider any subset A of the real line. The defining property of the joint density gives

P{U ∈ A} = P{U ∈ A, 0 < V < ∞}

 = ∫∫_{u∈A, 0<v<∞} ψ(u, v) du dv

 = ∫_{u∈A} ψ_U(u) du

where ψ_U(u) = ∫₀^∞ ψ(u, v) dv. That is, we get the marginal density ψ_U(u) for U by integrating the joint density with respect to v over its whole range. Specifically,

ψ_U(u) = ∫₀^∞ g(u)h(v) dv = g(u)

0

That is,

U has a beta(, ) distribution.

Similarly, V has a continuous distribution with density

1

1

V (v) = (u, v) du = g(u)h(v) du = h(v)

0

0

That is, V has a gamma( + ) distribution.
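The conclusion (U beta distributed, V gamma distributed, and the two independent) can be checked by simulation. A sketch with α = 2, β = 3 chosen for illustration, using Python's random.gammavariate with unit scale: the sample means of U and V should approach the beta(2, 3) mean 2/5 and the gamma(5) mean 5, and the sample correlation of U and V should be near zero.

```python
import random

random.seed(2024)
alpha, beta, n = 2.0, 3.0, 200_000   # illustrative parameters

us, vs = [], []
for _ in range(n):
    x = random.gammavariate(alpha, 1.0)   # X ~ gamma(alpha)
    y = random.gammavariate(beta, 1.0)    # Y ~ gamma(beta)
    us.append(x / (x + y))                # U = X/(X+Y)
    vs.append(x + y)                      # V = X+Y

mu_u = sum(us) / n                        # beta(2, 3) mean = 2/5
mu_v = sum(vs) / n                        # gamma(5) mean = 5
cov = sum((u - mu_u) * (v - mu_v) for u, v in zip(us, vs)) / n
sd_u = (sum((u - mu_u) ** 2 for u in us) / n) ** 0.5
sd_v = (sum((v - mu_v) ** 2 for v in vs) / n) ** 0.5
corr = cov / (sd_u * sd_v)

print(mu_u, mu_v, corr)
```

Zero correlation does not by itself prove independence, of course; it is merely a cheap necessary check on the exact argument above.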
