Lecture 5: Multiple Linear Regression
CS109A Introduction to Data Science
Pavlos Protopapas and Kevin Rader
Lecture Outline
Simple Regression:
- Predictor variables Standard Errors
- Evaluating Significance of Predictors
- Hypothesis Testing
- How well do we know $\hat{f}$?
- How well do we know $\hat{y}$?
Multiple Linear Regression:
- Categorical Predictors
- Collinearity
- Hypothesis Testing
- Interaction Terms
Polynomial Regression
CS109A, PROTOPAPAS, RADER
Standard Errors
The standard deviations of the estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ are called their standard errors, $SE(\hat{\beta}_0)$ and $SE(\hat{\beta}_1)$.
If our data is drawn from a larger set of observations, then we can empirically estimate the standard errors $SE(\hat{\beta}_0)$ and $SE(\hat{\beta}_1)$ through bootstrapping.
If we know the variance $\sigma^2$ of the noise $\epsilon$, we can compute $SE(\hat{\beta}_0)$ and $SE(\hat{\beta}_1)$ analytically, using the formulae below:
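$$SE(\hat{\beta}_0) = \sigma \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum_i (x_i - \bar{x})^2}}$$
$$SE(\hat{\beta}_1) = \frac{\sigma}{\sqrt{\sum_i (x_i - \bar{x})^2}}$$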
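The two analytic formulae can be evaluated directly. The following is a minimal sketch, assuming the noise standard deviation is known; the function name `analytic_standard_errors` and the predictor values are hypothetical and chosen only for illustration.

```python
import math


def analytic_standard_errors(x, sigma):
    """Return (SE(beta0_hat), SE(beta1_hat)) for simple linear regression.

    Assumes the noise standard deviation `sigma` is known.
    """
    n = len(x)
    x_bar = sum(x) / n
    # Sum of squared deviations of the predictor from its mean.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    se_b0 = sigma * math.sqrt(1.0 / n + x_bar ** 2 / sxx)
    se_b1 = sigma / math.sqrt(sxx)
    return se_b0, se_b1


# Hypothetical predictor values and noise level, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
se_b0, se_b1 = analytic_standard_errors(x, sigma=2.0)
print("SE(beta0_hat) =", se_b0)
print("SE(beta1_hat) =", se_b1)
```

Both quantities scale linearly with `sigma`, so halving the noise level halves both standard errors.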
In practice, we do not know the theoretical value of $\sigma$, since we do not know the exact distribution of the noise $\epsilon$.
Remember:
$$y_i = f(x_i) + \epsilon_i \implies \epsilon_i = y_i - f(x_i)$$
However, if we make the following assumptions,
- the errors $\epsilon_i = y_i - \hat{y}_i$ and $\epsilon_j = y_j - \hat{y}_j$ are uncorrelated, for $i \neq j$,
- each $\epsilon_i$ is normally distributed with mean 0 and variance $\sigma^2$,
then we can empirically estimate $\sigma^2$ from the data and our regression line:
$$\sigma \approx \sqrt{\frac{n \cdot \mathrm{MSE}}{n-2}} = \sqrt{\frac{\sum_i (y_i - \hat{y}_i)^2}{n-2}} = \sqrt{\frac{\sum_i (\hat{f}(x_i) - y_i)^2}{n-2}}$$
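The residual-based estimate of the noise level can be sketched as follows. This is an illustrative example, not the course's own code: the helper names `fit_line` and `estimate_sigma` and the small data set are assumptions made for the example.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def estimate_sigma(x, y):
    """Estimate the noise standard deviation from the residuals.

    Divides the residual sum of squares by n - 2, since two
    parameters (intercept and slope) were estimated from the data.
    """
    b0, b1 = fit_line(x, y)
    n = len(x)
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(rss / (n - 2))


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
sigma_hat = estimate_sigma(x, y)
print("estimated sigma =", sigma_hat)
```

The resulting `sigma_hat` can then be plugged into the analytic formulae for $SE(\hat{\beta}_0)$ and $SE(\hat{\beta}_1)$ in place of the unknown $\sigma$.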
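More data: $n$ and $\sum_i (x_i - \bar{x})^2$ increase.
Larger coverage: $\mathrm{var}(x)$, or equivalently $\sum_i (x_i - \bar{x})^2$, increases.
Better data: $\sigma^2$ decreases.
Better model: the residuals $(\hat{f}(x_i) - y_i)$ decrease.
$$SE(\hat{\beta}_0) = \sigma \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum_i (x_i - \bar{x})^2}}$$
$$SE(\hat{\beta}_1) = \frac{\sigma}{\sqrt{\sum_i (x_i - \bar{x})^2}}$$
$$\sigma \approx \sqrt{\frac{\sum_i (\hat{f}(x_i) - y_i)^2}{n-2}}$$
Question: What happens to $SE(\hat{\beta}_0)$, $SE(\hat{\beta}_1)$ under these scenarios?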
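One way to explore the question is to evaluate the slope formula under each scenario. The sketch below uses hypothetical, evenly spaced predictor values and a made-up noise level; it only illustrates the direction of the effect on $SE(\hat{\beta}_1)$.

```python
import math


def se_slope(x, sigma):
    """SE(beta1_hat) = sigma / sqrt(sum_i (x_i - x_bar)^2)."""
    x_bar = sum(x) / len(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sigma / math.sqrt(sxx)


# Hypothetical designs, for illustration only.
base = [float(i) for i in range(10)]    # 10 points on [0, 9]
more_data = [i / 2 for i in range(19)]  # 19 points on the same range [0, 9]
wider = [2.0 * i for i in range(10)]    # 10 points on [0, 18]

baseline = se_slope(base, sigma=1.0)
with_more_data = se_slope(more_data, sigma=1.0)
with_larger_coverage = se_slope(wider, sigma=1.0)
with_less_noise = se_slope(base, sigma=0.5)

print("baseline        :", baseline)
print("more data       :", with_more_data)
print("larger coverage :", with_larger_coverage)
print("less noise      :", with_less_noise)
```

In each of the three modified scenarios the slope's standard error comes out smaller than the baseline, which matches what the formula predicts: a larger $\sum_i (x_i - \bar{x})^2$ or a smaller $\sigma$ shrinks $SE(\hat{\beta}_1)$.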
The following results are the standard errors of the coefficient for TV advertising:

Method             Standard error
Analytic Formula   0.0061
Bootstrap          0.0061

The standard errors of the coefficient for TV advertising, but restricting the coverage of x, are:

Method             Standard error
Analytic Formula   0.0068
Bootstrap          0.0068

The standard errors of the coefficient for TV advertising, but with added extra noise, are:
This makes no sense?

Method             Standard error
Analytic Formula   0.0028
Bootstrap          0.0023
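A comparison between the analytic formula and the bootstrap can be sketched along the following lines. This example does not use the advertising data: the data are synthetic, and the true coefficients, noise level, sample size, number of bootstrap replicates, and function names are all assumptions made for illustration.

```python
import math
import random
import statistics


def fit_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def analytic_se_slope(x, y):
    """SE of the slope, with sigma estimated from the residuals (n - 2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = fit_slope(x, y)
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(rss / (n - 2))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sigma_hat / math.sqrt(sxx)


def bootstrap_se_slope(x, y, n_boot, rng):
    """SE of the slope from resampling (x, y) pairs with replacement."""
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = rng.choices(range(n), k=n)
        slopes.append(fit_slope([x[i] for i in idx], [y[i] for i in idx]))
    return statistics.stdev(slopes)


# Synthetic data (NOT the advertising data set), for illustration only.
rng = random.Random(0)
x = [rng.uniform(0.0, 300.0) for _ in range(200)]
y = [7.0 + 0.05 * xi + rng.gauss(0.0, 3.0) for xi in x]

se_analytic = analytic_se_slope(x, y)
se_boot = bootstrap_se_slope(x, y, n_boot=500, rng=rng)
print("analytic  SE(beta1_hat) =", se_analytic)
print("bootstrap SE(beta1_hat) =", se_boot)
```

When the model assumptions hold reasonably well, the two estimates would be expected to be close, though the bootstrap value varies with the random seed and the number of replicates.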
Importance of predictors
We have discussed finding the importance of predictors, by determining the cumulative distribution from to 0.