Modern regression 2: The lasso
Ryan Tibshirani Data Mining: 36-462/36-662
March 21 2013
Optional reading: ISL 6.2.2, ESL 3.4.2, 3.4.3
Reminder: ridge regression and variable selection
Recall our setup: given a response vector $y \in \mathbb{R}^n$, and a matrix $X \in \mathbb{R}^{n \times p}$ of predictor variables (predictors on the columns)
Last time we saw that ridge regression,
$$\hat{\beta}^{\mathrm{ridge}} = \operatorname*{argmin}_{\beta \in \mathbb{R}^p} \; \|y - X\beta\|_2^2 + \lambda \|\beta\|_2^2,$$
can have better prediction error than linear regression in a variety of scenarios, depending on the choice of $\lambda$. It worked best when a subset of the true coefficients were small or zero
But it will never set coefficients to zero exactly, and therefore cannot perform variable selection in the linear model. While this didn't seem to hurt its prediction ability, it is not desirable for the purposes of interpretation (especially if the number of variables p is large)
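To see this contrast concretely, here is a minimal Python sketch using scikit-learn's Ridge and Lasso on simulated data; the sample sizes and penalty values below are illustrative assumptions, not the ones used in the lecture:

```python
# Minimal sketch: ridge shrinks coefficients but leaves them nonzero,
# while the lasso sets some coefficients exactly to zero.
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
n, p = 50, 30
X = rng.standard_normal((n, p))
beta = np.concatenate([rng.uniform(0.5, 1.0, 10), np.zeros(20)])  # 10 big, 20 zero
y = X @ beta + rng.standard_normal(n)

ridge = Ridge(alpha=25.0).fit(X, y)   # alpha plays the role of lambda (illustrative value)
lasso = Lasso(alpha=0.1).fit(X, y)    # illustrative value

print("ridge coefficients exactly zero:", np.sum(ridge.coef_ == 0.0))  # typically 0
print("lasso coefficients exactly zero:", np.sum(lasso.coef_ == 0.0))  # typically > 0
```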
Recall our example: n = 50, p = 30; true coefficients: 10 are nonzero and pretty big, 20 are zero
[Figure: left panel plots the linear regression MSE together with the ridge MSE, ridge squared bias, and ridge variance as functions of $\lambda$; right panel plots the ridge coefficient estimates, distinguishing the true nonzero from the true zero coefficients.]
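A rough sketch of the kind of simulation behind such a figure is below; it is not the lecture's code, and the fixed design, coefficient values, $\lambda$ grid, and number of repetitions are all assumptions for illustration:

```python
# Estimate prediction MSE of linear regression vs. ridge over a grid of lambda,
# in a setting with 10 large and 20 zero true coefficients.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(1)
n, p, reps = 50, 30, 200
beta = np.concatenate([np.full(10, 1.0), np.zeros(20)])
X = rng.standard_normal((n, p))      # design held fixed across repetitions
mu = X @ beta                        # true mean
lams = np.linspace(0.01, 25, 50)

mse_lin = 0.0
mse_ridge = np.zeros(len(lams))
for _ in range(reps):
    y = mu + rng.standard_normal(n)
    lin = LinearRegression(fit_intercept=False).fit(X, y)
    mse_lin += np.mean((lin.predict(X) - mu) ** 2)
    for j, lam in enumerate(lams):
        fit = Ridge(alpha=lam, fit_intercept=False).fit(X, y)
        mse_ridge[j] += np.mean((fit.predict(X) - mu) ** 2)

print("linear regression MSE:", mse_lin / reps)
print("best ridge MSE:       ", (mse_ridge / reps).min())
```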
Example: prostate data
Recall the prostate data example: we are interested in the level of prostate-specific antigen (PSA), elevated in men who have prostate cancer. We have measurements of PSA on n = 97 men with prostate cancer, and p = 8 clinical predictors. Ridge coefficients:
[Figure: ridge coefficient paths for the 8 predictors (lcavol, lweight, age, lbph, svi, lcp, gleason, pgg45), plotted against $\lambda$ (left) and against the degrees of freedom df($\lambda$) (right).]
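One way to produce ridge paths like these is sketched below; "prostate.csv" is a hypothetical local copy of the prostate data with the 8 predictor columns and the response lpsa, and the standardization and $\lambda$ grid are assumptions:

```python
# Sketch: ridge coefficient paths for the prostate data over a grid of lambda.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import Ridge

df = pd.read_csv("prostate.csv")   # hypothetical path to the prostate data
predictors = ["lcavol", "lweight", "age", "lbph", "svi", "lcp", "gleason", "pgg45"]
X = (df[predictors] - df[predictors].mean()) / df[predictors].std()   # standardize
y = df["lpsa"] - df["lpsa"].mean()

lams = np.logspace(-1, 3, 100)
paths = np.array([Ridge(alpha=lam, fit_intercept=False).fit(X, y).coef_ for lam in lams])

for j, name in enumerate(predictors):
    plt.plot(lams, paths[:, j], label=name)
plt.xscale("log")
plt.xlabel("lambda")
plt.ylabel("Coefficients")
plt.legend()
plt.show()
```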
What if the people who gave this data want us to derive a linear model using only a few of the 8 predictor variables to predict the level of PSA?
Now the lasso coefficient paths:
[Figure: lasso coefficient paths for the same 8 predictors, plotted against $\lambda$ (left) and against df($\lambda$) (right).]
We might report the first 3 coefficients to enter the model: lcavol (the log cancer volume), svi (seminal vesicle invasion), and lweight (the log prostate weight)
How would we choose 3 (i.e., how would we choose $\lambda$)? We'll talk about this later
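A small sketch of computing the lasso path and reading off the order in which variables enter is below; it uses scikit-learn's lasso_path, and the data file and preprocessing are the same hypothetical assumptions as in the ridge sketch above:

```python
# Sketch: lasso coefficient paths for the prostate data, and the order of entry.
import numpy as np
import pandas as pd
from sklearn.linear_model import lasso_path

df = pd.read_csv("prostate.csv")   # hypothetical path to the prostate data
predictors = ["lcavol", "lweight", "age", "lbph", "svi", "lcp", "gleason", "pgg45"]
X = ((df[predictors] - df[predictors].mean()) / df[predictors].std()).to_numpy()
y = (df["lpsa"] - df["lpsa"].mean()).to_numpy()

# alphas are returned in decreasing order, so the first index at which a
# coefficient becomes nonzero is the step where that variable enters the model
alphas, coefs, _ = lasso_path(X, y)            # coefs has shape (n_features, n_alphas)
entry = [np.argmax(coefs[j] != 0) if np.any(coefs[j] != 0) else len(alphas)
         for j in range(coefs.shape[0])]
print("order of entry:", [predictors[j] for j in np.argsort(entry)])

# One common way to pick lambda (and hence how many variables to keep) is
# cross-validation, e.g. sklearn's LassoCV; this choice is discussed later.
```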