Linear Regression with One Regressor


Michael Ash Lecture 12

Goodness of Fit

What fraction of the variation in Y is explained by X? Reminder (by definition):

Y_i = \hat{Y}_i + \hat{u}_i

Total Sum of Squares (TSS) expresses the total variation in Y_i (ignoring X) around the mean of Y:

TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2

Explained Sum of Squares (ESS) expresses the variation in \hat{Y}_i, the prediction of Y_i using X, around the mean of Y:

ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2

The R2 (R-squared) I

If the prediction of Y_i using X captures a lot of the overall variation in Y_i, then the regression has high explanatory value. In a perfect regression, because \hat{Y}_i = Y_i, the prediction of Y_i using X would capture all of the overall variation in Y_i.

R^2 = \frac{ESS}{TSS}

0 \le R^2 \le 1
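As a sketch of the definition (not from the lecture; the data are simulated and the OLS formulas are the standard bivariate ones), R^2 = ESS/TSS can be computed directly with numpy:

```python
# Sketch: computing TSS, ESS, and R^2 = ESS/TSS for a bivariate OLS fit.
# The data below are made up purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 + 1.5 * x + rng.normal(size=100)

# OLS slope and intercept via the usual closed-form formulas
b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

tss = np.sum((y - y.mean()) ** 2)       # total variation around the mean
ess = np.sum((y_hat - y.mean()) ** 2)   # variation of the predictions
r2 = ess / tss
print(f"TSS = {tss:.2f}, ESS = {ess:.2f}, R^2 = {r2:.3f}")
```

Because ESS can never exceed TSS for an OLS fit with an intercept, the ratio stays between 0 and 1.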

The R^2 (R-squared) II

The Sum of Squared Residuals (SSR) expresses the variation in Y_i around the prediction \hat{Y}_i, i.e., the part of the variation in Y_i not predicted by the regression.

SSR = \sum_{i=1}^{n} \hat{u}_i^2

TSS = ESS + SSR
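The decomposition TSS = ESS + SSR can be checked numerically. The sketch below (simulated data, standard bivariate OLS formulas; not code from the lecture) shows the identity holding because OLS residuals are uncorrelated with the fitted values:

```python
# Sketch: verifying TSS = ESS + SSR for an OLS fit with an intercept.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 1.0 - 0.5 * x + rng.normal(size=200)

b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
u_hat = y - y_hat

tss = np.sum((y - y.mean()) ** 2)
ess = np.sum((y_hat - y.mean()) ** 2)
ssr = np.sum(u_hat ** 2)

# Holds exactly (up to floating point) because the cross term
# sum((y_hat - ybar) * u_hat) is zero for OLS with an intercept.
print(np.isclose(tss, ess + ssr))
```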

All of the variation can be decomposed into explained and unexplained variation. (This is not self-evident; it depends on the absence of correlation between the explained and unexplained portions.) In the worst possible regression, \hat{Y}_i = \bar{Y}: the prediction of Y_i using X would capture none of the overall variation in Y_i.

R^2 = 1 - \frac{SSR}{TSS}

0 \le R^2 \le 1

The R^2 (R-squared) III

In bivariate regression, R^2 = r^2: R-squared is the square of the sample correlation between X and Y, a direct measure of how well a line fits the data.
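This equality is easy to confirm numerically. A minimal sketch (simulated data, not from the lecture), comparing R^2 from the regression with the squared sample correlation:

```python
# Sketch: in bivariate OLS, R^2 equals the squared correlation of X and Y.
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = 3.0 + 0.8 * x + rng.normal(size=50)

b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

r2 = np.sum((y_hat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)
r = np.corrcoef(x, y)[0, 1]   # sample correlation coefficient

print(np.isclose(r2, r ** 2))
```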

The R^2 (R-squared) IV

Three competing regressions

1. No-information regression: ignore X; always predict the same Y.

Y_i = \mu_Y + v_i, \quad \hat{Y}_i = \bar{Y}, \quad \hat{v}_i = Y_i - \bar{Y}

2. OLS regression: does X add any explanatory value?

Y_i = \beta_0 + \beta_1 X_i + u_i, \quad \hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i, \quad \hat{u}_i = Y_i - \hat{Y}_i

3. Magical regression: know Y_i perfectly.

\hat{Y}_i = Y_i, \quad \hat{w}_i = 0

Method for ranking regressions

SSR_1 = \sum_{i=1}^{n} \hat{v}_i^2, \quad R^2 = 0

SSR_2 = \sum_{i=1}^{n} \hat{u}_i^2

SSR_3 = \sum_{i=1}^{n} \hat{w}_i^2 = 0, \quad R^2 = 1

R^2 expresses how OLS (method 2) fares between method 1 (guessing the mean every time) and method 3 (predicting all of the Y_i perfectly).
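The ranking of the three methods can be illustrated with simulated data (a sketch, not code from the lecture): the no-information SSR equals TSS, OLS does at least as well, and the magical regression has zero SSR by construction.

```python
# Sketch: comparing the SSR of the three competing regressions.
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=100)
y = 1.0 + 2.0 * x + rng.normal(size=100)
tss = np.sum((y - y.mean()) ** 2)

# 1. No-information regression: always predict the mean (SSR_1 = TSS, R^2 = 0)
ssr1 = np.sum((y - y.mean()) ** 2)

# 2. OLS regression
b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
ssr2 = np.sum((y - (b0 + b1 * x)) ** 2)

# 3. Magical regression: predict every Y_i exactly (SSR_3 = 0, R^2 = 1)
ssr3 = 0.0

# OLS lies between the two extremes: it minimizes SSR over all lines,
# and the no-information fit is one such line (slope zero).
print(ssr3 <= ssr2 <= ssr1)
```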

What's a "good" R2?

Completely context dependent:
- Time-series macroeconomics: typical R^2 ≈ 0.9
- Models of individual wages: typical R^2 ≈ 0.3

Coefficient estimates can be large and statistically significant even when R^2 is low: there is lots of individual randomness (u_i) in the data. Regression results are then useful for averages (budgeting, etc.) but not for individual prediction.
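This point can be made concrete with a simulation (a sketch with made-up parameters, not from the lecture): a true slope of 1 buried in large idiosyncratic noise yields a highly significant estimate alongside a small R^2.

```python
# Sketch: a precisely estimated, significant slope can coexist with low R^2.
import numpy as np

rng = np.random.default_rng(4)
n = 5000
x = rng.normal(size=n)
# True slope is 1, but the noise standard deviation is 5,
# so X explains only a small share of the variance of Y.
y = 1.0 * x + rng.normal(scale=5.0, size=n)

b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
u = y - (b0 + b1 * x)

# Classical (homoskedastic) standard error of the slope
se_b1 = np.sqrt(np.sum(u ** 2) / (n - 2) / np.sum((x - x.mean()) ** 2))
t = b1 / se_b1
r2 = 1 - np.sum(u ** 2) / np.sum((y - y.mean()) ** 2)

# The slope is many standard errors from zero, yet R^2 stays small.
print(f"t = {t:.1f}, R^2 = {r2:.3f}")
```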
