Chapter 3. Multivariate Distributions. (University of Chicago)
All of the most interesting problems in statistics involve looking at more than a single measurement
at a time, at relationships among measurements and comparisons between them. In order to permit us to
address such problems, indeed to even formulate them properly, we will need to enlarge our mathematical
structure to include multivariate distributions, the probability distributions of pairs of random variables,
triplets of random variables, and so forth. We will begin with the simplest such situation, that of pairs of
random variables or bivariate distributions, where we will already encounter most of the key ideas.
3.1 Discrete Bivariate Distributions.
If X and Y are two random variables defined on the same sample space S; that is, defined in reference
to the same experiment, so that it is both meaningful and potentially interesting to consider how they may
interact or affect one another, we will define their bivariate probability function by
p(x, y) = P(X = x and Y = y).    (3.1)
In a direct analogy to the case of a single random variable (the univariate case), p(x, y) may be thought of
as describing the distribution of a unit mass in the (x, y) plane, with p(x, y) representing the mass assigned
to the point (x, y), considered as a spike at (x, y) of height p(x, y). The total for all possible points must be
one:
∑_{all x} ∑_{all y} p(x, y) = 1.    (3.2)
[Figure 3.1]
Example 3.A. Consider the experiment of tossing a fair coin three times, and then, independently of the
first coin, tossing a second fair coin three times. Let
X = #Heads for the first coin
Y = #Tails for the second coin
Z = #Tails for the first coin.
The two coins are tossed independently, so for any pair of possible values (x, y) of X and Y we have, if
{X = x} stands for the event "X = x",

p(x, y) = P(X = x and Y = y)
        = P({X = x} ∩ {Y = y})
        = P({X = x}) · P({Y = y})
        = pX(x) · pY(y).
On the other hand, X and Z refer to the same coin, and so
p(x, z) = P(X = x and Z = z)
        = P({X = x} ∩ {Z = z})
        = pX(x)   if z = 3 − x
        = 0       otherwise.

This is because we must necessarily have x + z = 3, which means {X = x} and {Z = 3 − x} describe the
same event. If z ≠ 3 − x, then {X = x} and {Z = z} are mutually exclusive and the probability both occur
is zero. These bivariate distributions can be summarized in the form of tables, whose entries are p(x, y) and
p(x, z) respectively:
p(x, y):

            y = 0    y = 1    y = 2    y = 3
   x = 0     1/64     3/64     3/64     1/64
   x = 1     3/64     9/64     9/64     3/64
   x = 2     3/64     9/64     9/64     3/64
   x = 3     1/64     3/64     3/64     1/64

p(x, z):

            z = 0    z = 1    z = 2    z = 3
   x = 0      0        0        0       1/8
   x = 1      0        0       3/8       0
   x = 2      0       3/8       0        0
   x = 3     1/8       0        0        0
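The two tables can be generated mechanically. Below is a minimal Python sketch of Example 3.A using exact rational arithmetic; the helper names (p_binom, p_xy, p_xz) are our own, not from the text.

```python
from fractions import Fraction
from math import comb

def p_binom(k, n=3):
    """Binomial(n, 1/2) pmf: P(k heads in n fair tosses)."""
    return Fraction(comb(n, k), 2 ** n)

# Independent coins: p(x, y) = pX(x) * pY(y), giving the first table
p_xy = {(x, y): p_binom(x) * p_binom(y) for x in range(4) for y in range(4)}

# Same coin: Z = 3 - X, so p(x, z) = pX(x) when z = 3 - x, and 0 otherwise
p_xz = {(x, z): p_binom(x) if z == 3 - x else Fraction(0)
        for x in range(4) for z in range(4)}

# Both tables satisfy (3.2): total mass one
assert sum(p_xy.values()) == 1
assert sum(p_xz.values()) == 1

print(p_xy[(1, 2)])  # 9/64
print(p_xz[(1, 2)])  # 3/8
```

Using Fraction rather than floating point keeps every table entry exact, so the totals come out to exactly 1.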
Now, if we have specified a bivariate probability function such as p(x, y), we can always deduce the
respective univariate distributions from it, by addition:
pX(x) = ∑_{all y} p(x, y),    (3.3)

pY(y) = ∑_{all x} p(x, y).    (3.4)
The rationale for these formulae is that we can decompose the event {X = x} into a collection of smaller
sets of outcomes. For example,
{X = x} = {X = x and Y = 0} ∪ {X = x and Y = 1} ∪ {X = x and Y = 2} ∪ · · ·

where the values of y on the righthand side run through all possible values of Y . But then the events of the
righthand side are mutually exclusive (Y cannot have two values at once), so the probability of the righthand
side is the sum of the events' probabilities, ∑_{all y} p(x, y), while the lefthand side has probability pX(x).
When we refer to these univariate distributions in a multivariate context, we shall call them the marginal
probability functions of X and Y . This name comes from the fact that when the addition in (3.3) or (3.4)
is performed upon a bivariate distribution p(x, y) written in tabular form, the results are most naturally
written in the margins of the table.
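Summing out one variable, as (3.3) and (3.4) prescribe, is a one-line operation on a table stored as a dictionary. A sketch assuming the independent-coins table of Example 3.A (the variable names p, p_X, p_Y are ours):

```python
from fractions import Fraction
from math import comb

# p(x, y) for the two independent coins of Example 3.A
p = {(x, y): Fraction(comb(3, x) * comb(3, y), 64)
     for x in range(4) for y in range(4)}

# Equations (3.3) and (3.4): sum over the other variable
p_X = {x: sum(p[(x, y)] for y in range(4)) for x in range(4)}
p_Y = {y: sum(p[(x, y)] for x in range(4)) for y in range(4)}

print(p_X[1])  # 3/8
```

Each marginal comes out Binomial(3, 1/2), i.e. {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}, as the example's tables show.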
Example 3.A (continued). For our coin example, we have the marginal distributions of X, Y , and Z:
p(x, y) with its margins:

            y = 0    y = 1    y = 2    y = 3    pX(x)
   x = 0     1/64     3/64     3/64     1/64     1/8
   x = 1     3/64     9/64     9/64     3/64     3/8
   x = 2     3/64     9/64     9/64     3/64     3/8
   x = 3     1/64     3/64     3/64     1/64     1/8
   pY(y)      1/8      3/8      3/8      1/8

p(x, z) with its margins:

            z = 0    z = 1    z = 2    z = 3    pX(x)
   x = 0      0        0        0       1/8      1/8
   x = 1      0        0       3/8       0       3/8
   x = 2      0       3/8       0        0       3/8
   x = 3     1/8       0        0        0       1/8
   pZ(z)     1/8      3/8      3/8      1/8
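The claim that both tables share identical margins while differing in their interiors is easy to verify by machine. A sketch under the setup of Example 3.A (the helper names pmf, p_xy, p_xz, margins are ours):

```python
from fractions import Fraction
from math import comb

pmf = {k: Fraction(comb(3, k), 8) for k in range(4)}  # Binomial(3, 1/2)

# The two bivariate tables of Example 3.A
p_xy = {(x, y): pmf[x] * pmf[y] for x in range(4) for y in range(4)}
p_xz = {(x, z): pmf[x] if z == 3 - x else Fraction(0)
        for x in range(4) for z in range(4)}

def margins(p):
    """Row and column sums of a 4x4 bivariate table, as in (3.3)-(3.4)."""
    row = {i: sum(p[(i, j)] for j in range(4)) for i in range(4)}
    col = {j: sum(p[(i, j)] for i in range(4)) for j in range(4)}
    return row, col

# Identical margins, different interiors
assert margins(p_xy) == margins(p_xz)
assert p_xy != p_xz
```

The assertions confirm the moral of the example: the margins alone cannot determine the joint table.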
This example highlights an important fact: you can always find the marginal distributions from the
bivariate distribution, but in general you cannot go the other way: you cannot reconstruct the interior of
a table (the bivariate distribution) knowing only the marginal totals. In this example, both tables have
exactly the same marginal totals; in fact X, Y , and Z all have the same Binomial(3, 1/2) distribution, but
the bivariate distributions are quite different. The marginal distributions pX (x) and pY (y) may describe our
uncertainty about the possible values, respectively, of X considered separately, without regard to whether
or not Y is even observed, and of Y considered separately, without regard to whether or not X is even
observed. But they cannot tell us about the relationship between X and Y ; they alone cannot tell us
whether X and Y refer to the same coin or to different coins. However, the example also gives a hint as to
just what sort of information is needed to build up a bivariate distribution from component parts. In one
case the knowledge that the two coins were independent gave us p(x, y) = pX(x) · pY(y); in the other case
the complete dependence of Z on X gave us p(x, z) = pX(x) or 0 as z = 3 − x or not. What was needed was
information about how the knowledge of one random variable's outcome may affect the other: conditional
information. We formalize this as a conditional probability function, defined by
p(y|x) = P(Y = y | X = x),    (3.5)
which we read as "the probability that Y = y given that X = x." Since "Y = y" and "X = x" are events,
this is just our earlier notion of conditional probability re-expressed for discrete random variables, and from
(1.7) we have that
p(y|x) = P(Y = y | X = x)
       = P(X = x and Y = y) / P(X = x)
       = p(x, y) / pX(x),    (3.6)
as long as pX (x) > 0, with p(y|x) undefined for any x with pX (x) = 0.
If p(y|x) = pY (y) for all possible pairs of values (x, y) for which p(y|x) is defined, we say X and Y
are independent variables. From (3.6), we would equivalently have that X and Y are independent random
variables if
p(x, y) = pX(x) · pY(y)  for all x, y.    (3.7)
Thus X and Y are independent only if all pairs of events "X = x" and "Y = y" are independent; if (3.7)
should fail to hold for even a single pair (x0, y0), X and Y would be dependent. In Example 3.A, X and Y
are independent, but X and Z are dependent. For example, for x = 2, p(z|x) is given by
p(z|2) = p(2, z) / pX(2)
       = 1   if z = 1
       = 0   otherwise,
so p(z|x) ≠ pZ(z) for x = 2, z = 1 in particular (and for all other values as well).
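The conditional probability function (3.6) for the X, Z pair can be computed directly from the table. A sketch (helper names ours):

```python
from fractions import Fraction
from math import comb

pmf = {k: Fraction(comb(3, k), 8) for k in range(4)}  # marginal of X (and of Z)
p_xz = {(x, z): pmf[x] if z == 3 - x else Fraction(0)
        for x in range(4) for z in range(4)}

def p_z_given_x(z, x):
    """Equation (3.6): p(z|x) = p(x, z) / pX(x), defined when pX(x) > 0."""
    return p_xz[(x, z)] / pmf[x]

# Complete dependence: given X = 2, Z is certainly 1
assert p_z_given_x(1, 2) == 1
assert p_z_given_x(0, 2) == 0

# ...and p(z|x) differs from the marginal pZ(z), so X and Z are dependent
assert p_z_given_x(1, 2) != pmf[1]
```

The last assertion is exactly the comparison made in the text: p(1|2) = 1, while pZ(1) = 3/8.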
By using (3.6) in the form
p(x, y) = pX(x) p(y|x)  for all x, y,    (3.8)
it is possible to construct a bivariate distribution from two components: either marginal distribution and the
conditional distribution of the other variable given the one whose marginal distribution is specified. Thus
while marginal distributions are themselves insufficient to build a bivariate distribution, the conditional
probability function captures exactly what additional information is needed.
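As a concrete illustration of (3.8), the sketch below rebuilds the p(x, z) table of Example 3.A from the marginal pX and the (degenerate) conditional p(z|x); the function and variable names are our own.

```python
from fractions import Fraction
from math import comb

p_X = {x: Fraction(comb(3, x), 8) for x in range(4)}  # marginal of X

def p_z_given_x(z, x):
    # Degenerate conditional for Example 3.A: given X = x, Z must be 3 - x
    return Fraction(1) if z == 3 - x else Fraction(0)

# Equation (3.8): p(x, z) = pX(x) * p(z|x)
p_xz = {(x, z): p_X[x] * p_z_given_x(z, x)
        for x in range(4) for z in range(4)}

assert p_xz[(1, 2)] == Fraction(3, 8)   # matches the table in Example 3.A
assert sum(p_xz.values()) == 1          # a valid bivariate distribution
```

The same construction with p(y|x) = pY(y) in place of the degenerate conditional would instead reproduce the independent table p(x, y).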
3.2 Continuous Bivariate Distributions.
The distribution of a pair of continuous random variables X and Y defined on the same sample space
(that is, in reference to the same experiment) is given formally by an extension of the device used in the
univariate case, a density function. If we think of the pair (X, Y ) as a random point in the plane, the
bivariate probability density function f (x, y) describes a surface in 3-dimensional space, and the probability
that (X, Y ) falls in a region in the plane is given by the volume over that region and under the surface
f (x, y). Since volumes are given as double integrals, the rectangular region with a < X < b and c < Y < d
has probability
P(a < X < b and c < Y < d) = ∫_c^d ∫_a^b f(x, y) dx dy.    (3.9)
[Figure 3.3]
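The double integral in (3.9) can be checked numerically. A sketch using the example density f(x, y) = e^(−x−y) for x, y > 0 (our choice, not from the text), for which the rectangle probability factors in closed form:

```python
from math import exp

def f(x, y):
    # Example density: independent Exponential(1) coordinates
    return exp(-x - y) if x > 0 and y > 0 else 0.0

def rect_prob(a, b, c, d, n=400):
    """Midpoint-rule approximation of the double integral in (3.9)."""
    dx, dy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += f(a + (i + 0.5) * dx, c + (j + 0.5) * dy)
    return total * dx * dy

approx = rect_prob(0.0, 1.0, 0.0, 2.0)
exact = (1 - exp(-1)) * (1 - exp(-2))   # the integral factors for this f
assert abs(approx - exact) < 1e-4
```

The agreement between the Riemann sum and the closed form is the volume-under-the-surface picture of (3.9) made computational.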
It will necessarily be true of any bivariate density that
f(x, y) ≥ 0   for all x, y,    (3.10)

and

∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1,    (3.11)

that is, the total volume between the surface f(x, y) and the (x, y) plane is 1. Also, any function f(x, y)
satisfying (3.10) and (3.11) describes a continuous bivariate probability distribution.
It can help the intuition to think of a continuous bivariate distribution as a unit mass resting squarely
on the plane, not concentrated as spikes at a few separated points, as in the discrete case. It is as if the mass
is made of a homogeneous substance, and the function f (x, y) describes the upper surface of the mass.
If we are given a bivariate probability density f (x, y), then we can, as in the discrete case, calculate the
marginal probability densities of X and of Y ; they are given by
fX(x) = ∫_{−∞}^{∞} f(x, y) dy   for all x,    (3.12)

fY(y) = ∫_{−∞}^{∞} f(x, y) dx   for all y.    (3.13)
Just as in the discrete case, these give the probability densities of X and Y considered separately, as
continuous univariate random variables.
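Equation (3.12) can also be checked numerically. A sketch with the example density f(x, y) = e^(−x−y) for x, y > 0 (a density of our choosing), whose marginal fX(x) = e^(−x) is known in closed form:

```python
from math import exp

def f(x, y):
    # Example density: f(x, y) = e^(-x-y) on the positive quadrant
    return exp(-x - y) if x > 0 and y > 0 else 0.0

def f_X(x, y_max=40.0, n=20000):
    """Approximate (3.12) by the midpoint rule over y.

    The infinite range is truncated at y_max, beyond which the
    tail of this particular density is negligible."""
    dy = y_max / n
    return sum(f(x, (j + 0.5) * dy) for j in range(n)) * dy

# For this f the marginal is fX(x) = e^(-x)
assert abs(f_X(1.0) - exp(-1)) < 1e-6
```

Replacing the inner sum over y with a sum over x gives (3.13) in the same way.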
The relationships (3.12) and (3.13) are rather close analogues to the formulae for the discrete case, (3.3)
and (3.4). They may be justified as follows: for any a < b, the events "a < X ≤ b" and "a < X ≤ b and
−∞ < Y < ∞" are in fact two ways of describing the same event. The second of these has probability

∫_{−∞}^{∞} ∫_a^b f(x, y) dx dy = ∫_a^b ∫_{−∞}^{∞} f(x, y) dy dx
                               = ∫_a^b [ ∫_{−∞}^{∞} f(x, y) dy ] dx.

We must therefore have

P(a < X ≤ b) = ∫_a^b [ ∫_{−∞}^{∞} f(x, y) dy ] dx   for all a < b,

and thus ∫_{−∞}^{∞} f(x, y) dy fulfills the definition of fX(x) (given in Section 1.7): it is a function of x that gives
the probabilities of intervals as areas, by integration.
In terms of the mass interpretation of bivariate densities, (3.12) amounts to looking at the mass "from
the side," in a direction parallel to the y axis. The integral

∫_x^{x+dx} [ ∫_{−∞}^{∞} f(u, y) dy ] du ≈ [ ∫_{−∞}^{∞} f(x, y) dy ] · dx

gives the total mass (for the entire range of y) between x and x + dx, and so, just as in the univariate case,
the integrand ∫_{−∞}^{∞} f(x, y) dy gives the density of the mass at x.
Example 3.B. Consider the bivariate density function

[The statement of this density, and the remainder of the chapter, are not preserved in this copy.]