
Decomposing Variance

Kerby Shedden

Department of Statistics, University of Michigan

October 10, 2021


Law of total variation

For any regression model involving a response $y \in \mathbb{R}$ and a covariate vector $x \in \mathbb{R}^p$, we can decompose the marginal variance of $y$ as follows:

$$\mathrm{var}(y) = \mathrm{var}_x\, E[y \mid x] + E_x\, \mathrm{var}[y \mid x].$$

If the population is homoscedastic, $\mathrm{var}[y \mid x]$ does not depend on $x$, so we can simply write $\mathrm{var}[y \mid x] = \sigma^2$, and we get $\mathrm{var}(y) = \mathrm{var}_x\, E[y \mid x] + \sigma^2$. If the population is heteroscedastic, $\mathrm{var}[y \mid x]$ is a function $\sigma^2(x)$ with expected value $\bar{\sigma}^2 = E_x\, \sigma^2(x)$, and again we get $\mathrm{var}(y) = \mathrm{var}_x\, E[y \mid x] + \bar{\sigma}^2$.

If we write $y = f(x) + \epsilon$ with $E[\epsilon \mid x] = 0$, then $E[y \mid x] = f(x)$, and $\mathrm{var}_x\, E[y \mid x]$ summarizes the variation of $f(x)$ over the marginal distribution of $x$.
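The decomposition can be checked numerically. The sketch below is not part of the slides; the mean function $f(x) = 1 + x^2$ and variance function $\sigma^2(x) = 0.5 + x$ are arbitrary choices used to simulate a heteroscedastic population and compare the two sides of the identity:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Heteroscedastic population: x ~ Uniform(0, 2),
# E[y|x] = f(x) = 1 + x^2,  var[y|x] = sigma2(x) = 0.5 + x
x = rng.uniform(0, 2, size=n)
f_x = 1 + x**2
sig2_x = 0.5 + x
y = f_x + rng.normal(size=n) * np.sqrt(sig2_x)

# Law of total variation: var(y) = var_x E[y|x] + E_x var[y|x]
lhs = y.var()
rhs = f_x.var() + sig2_x.mean()
print(lhs, rhs)  # the two sides agree up to Monte Carlo error
```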


Law of total variation

[Figure: conditional and marginal distributions of $y$ plotted against $x$. Orange curves: conditional distributions of $y$ given $x$; purple curve: marginal distribution of $y$; black dots: conditional means $E(y \mid x)$.]


Pearson correlation

The population Pearson correlation coefficient of two jointly distributed random variables $x \in \mathbb{R}$ and $y \in \mathbb{R}$ is

$$\rho_{xy} = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y}.$$

Given data $y = (y_1, \ldots, y_n)'$ and $x = (x_1, \ldots, x_n)'$, the Pearson correlation coefficient is estimated by

$$\hat{\rho}_{xy} = \frac{\widehat{\mathrm{cov}}(x, y)}{\hat{\sigma}_x \hat{\sigma}_y} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2} \cdot \sqrt{\sum_i (y_i - \bar{y})^2}} = \frac{(x - \bar{x})'(y - \bar{y})}{\|x - \bar{x}\| \cdot \|y - \bar{y}\|}.$$

When we write $y - \bar{y}$ here, this means $y - \bar{y} \cdot 1$, where $1$ is a vector of 1's and $\bar{y}$ is a scalar.
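The centered-vector form of the estimator translates directly into code. A minimal sketch with simulated data (the slope 0.6 and sample size 200 are arbitrary choices, not from the slides), cross-checked against NumPy's built-in estimator:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
y = 0.6 * x + rng.normal(size=n)  # correlated pair

# Pearson correlation via the centered-vector formula:
# rho_hat = (x - xbar)'(y - ybar) / (||x - xbar|| * ||y - ybar||)
xc = x - x.mean()
yc = y - y.mean()
rho_hat = xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc))

# Cross-check against NumPy's built-in estimator
rho_np = np.corrcoef(x, y)[0, 1]
print(rho_hat, rho_np)  # identical up to floating point
```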


Pearson correlation

By the Cauchy–Schwarz inequality, $-1 \le \rho_{xy} \le 1$ and $-1 \le \hat{\rho}_{xy} \le 1$.

The sample correlation coefficient is slightly biased, but the bias is so small that it is usually ignored.
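A small Monte Carlo sketch can illustrate how small this bias is. The sample size $n = 20$, correlation $\rho = 0.5$, and replication count below are arbitrary choices; for bivariate normal data the bias is approximately $-\rho(1 - \rho^2)/(2n)$, here about $-0.009$:

```python
import numpy as np

rng = np.random.default_rng(2)
rho, n, reps = 0.5, 20, 100_000

# Draw many bivariate-normal samples of size n with correlation rho
x = rng.normal(size=(reps, n))
y = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=(reps, n))

# Sample correlation within each replicate (row-wise)
xc = x - x.mean(axis=1, keepdims=True)
yc = y - y.mean(axis=1, keepdims=True)
r = (xc * yc).sum(axis=1) / np.sqrt((xc**2).sum(axis=1) * (yc**2).sum(axis=1))

# The mean of the estimates sits slightly below rho
print(rho, r.mean())
```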

