Inferential Methods in Regression and Correlation


Chapter 11

Back to Ch.3 (Linear Regression):

• Recall Simple Linear Regression:

• Fit a line to the data when you see a linear trend

• Minimize the errors using the Least Squares (LS) method

• Get estimates of the slope and intercept accordingly

• The residuals are random

• In this chapter, we treat the regression data as a sample from which we want to draw inference

Concept

• Remember when we used x̄ to estimate μ? What did we do?

• Used confidence intervals to give a range of plausible values where μ falls

• Used hypothesis testing to check specific hypotheses about μ

• We treat the regression line similarly

• Need to understand the sampling distribution again!

Regression line

• We learned the regression line as ŷ = a + bx

• That is the sample regression line

• The true regression line we write as a model:

yi = α + βxi + ei

• In this model:

• ei is the "error" term for the ith observation

• Without the error term, y = α + βx is called the population regression line

• This means that, without the error term, every point would fall exactly on the line

• ei is assumed to follow a normal distribution with mean 0 and standard deviation σ

• Additionally, all ei's are assumed independent of each other
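To make the model concrete, here is a minimal simulation sketch in Python. The values of α, β, and σ are hypothetical "true" population values chosen for illustration; each observed y is the point on the population line plus an independent normal error.

```python
import random

# Simulate data from the model y_i = alpha + beta*x_i + e_i.
# alpha, beta, sigma are hypothetical "true" population values.
random.seed(1)
alpha, beta, sigma = 2.0, 1.5, 0.8

x = [float(i) for i in range(1, 21)]

# Each error e_i is an independent draw from a normal distribution
# with mean 0 and standard deviation sigma.
e = [random.gauss(0.0, sigma) for _ in x]
y = [alpha + beta * xi + ei for xi, ei in zip(x, e)]

# Without the error term, every point would fall exactly on the line:
on_line = [alpha + beta * xi for xi in x]
```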

Let's visualize using an example

• Suppose we use Age to predict Blood Pressure

• Which is X? Which is Y?

• Draw a picture...

• For any fixed x, the dependent variable y has a normal distribution

• The mean of y falls on the "population regression line"

• Another way to say the same thing is just:

ei ~ N(0, σ)
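A quick simulation sketch, loosely in the spirit of the age/blood-pressure example (all numbers hypothetical): hold x fixed, generate many y values from the model, and check that they are centered on the population line.

```python
import random
import statistics

# For one fixed x, the model says y = alpha + beta*x + e, e ~ N(0, sigma).
# alpha, beta, sigma are hypothetical values for illustration.
random.seed(42)
alpha, beta, sigma = 100.0, 0.5, 5.0   # e.g. blood pressure vs. age
x_fixed = 40.0

ys = [alpha + beta * x_fixed + random.gauss(0.0, sigma)
      for _ in range(100_000)]

# The simulated y's are (approximately) normal, with mean on the
# population regression line: alpha + beta * x_fixed = 120.
mean_y = statistics.fmean(ys)
sd_y = statistics.stdev(ys)
```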

Estimating the slope and intercept

• Still apply the same formulas from Chapter 3 for the Least Squares estimates

• The Least Squares method gives:

b = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²   and   a = ȳ − b·x̄
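As a sketch, the Chapter 3 least-squares formulas can be computed directly on a small hypothetical (x, y) sample:

```python
# Least-squares estimates for a small hypothetical sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Slope: b = sum((xi - xbar)(yi - ybar)) / sum((xi - xbar)^2)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
Sxx = sum((xi - xbar) ** 2 for xi in x)
b = Sxy / Sxx

# Intercept: a = ybar - b * xbar
a = ybar - b * xbar
```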

Only estimates

• Of course, these are only sample estimates

• If we took a different sample, we would get different estimates

• We need to use these estimates a and b to draw inference about the "real" slope and intercept, α and β

• We need to know the sampling distribution...

• No problem! We already know it's normal; the important statement is this again: ei ~ N(0, σ)
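This sample-to-sample variability can be demonstrated with a simulation sketch (hypothetical α, β, σ): refit the line on many samples drawn from the same model and watch the slope estimate b vary around the true β.

```python
import random
import statistics

# Draw many samples from the same model and refit the line each time.
# alpha, beta, sigma are hypothetical "true" values for illustration.
random.seed(7)
alpha, beta, sigma = 2.0, 1.5, 1.0
x = [float(i) for i in range(1, 11)]
xbar = sum(x) / len(x)
Sxx = sum((xi - xbar) ** 2 for xi in x)

slopes = []
for _ in range(5000):
    y = [alpha + beta * xi + random.gauss(0.0, sigma) for xi in x]
    ybar = sum(y) / len(y)
    Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slopes.append(Sxy / Sxx)   # least-squares slope for this sample

# The slopes are approximately normal, centered at the true beta,
# with standard deviation sigma / sqrt(Sxx).
mean_b = statistics.fmean(slopes)
sd_b = statistics.stdev(slopes)
```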

Estimating the error variance

• From the model, we know that ei ~ N(0, σ)

• To estimate σ, we use the residuals

• After the slope and intercept are estimated, the residuals are calculated as: êi = yi − ŷi

• SSE is calculated as: SSE = Σ êi² = Σ (yi − ŷi)²

• SSE is used to estimate σ by:

se = √( SSE / (n − 2) ) = √MSE

• Why n − 2? We are calculating two estimates, a and b, hence we lose 2 degrees of freedom
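A short sketch putting the pieces together on hypothetical data: fit the line, compute the residuals and SSE, then the estimate se = √MSE.

```python
import math

# Estimate sigma from the residuals on a small hypothetical sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# Least-squares slope and intercept (Chapter 3 formulas).
b = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
     / sum((xi - xbar) ** 2 for xi in x))
a = ybar - b * xbar

# Residuals e_i = y_i - yhat_i, then SSE = sum of squared residuals.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
SSE = sum(e ** 2 for e in residuals)

# Divide by n - 2: estimating a and b costs 2 degrees of freedom.
MSE = SSE / (n - 2)
se = math.sqrt(MSE)
```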
