
SOME PERSPECTIVES OF SPARSE STATISTICAL MODELING

A DISSERTATION SUBMITTED TO THE DEPARTMENT OF STATISTICS

AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

Hui Zou

May 2005

© Copyright by Hui Zou 2005. All Rights Reserved.


I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Trevor Hastie, Principal Adviser

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Robert Tibshirani

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Bradley Efron

Approved for the University Committee on Graduate Studies.


Abstract

In this thesis we develop some new sparse modeling techniques and related theory. We first point out the fundamental drawbacks of the lasso in two scenarios: (1) the number of predictors (greatly) exceeds the number of observations; (2) the predictors are highly correlated and form "groups". A typical example in which both scenarios naturally occur is gene selection in microarray analysis. We then propose the elastic net, a new regularization and variable selection method that improves upon the lasso. We show that the elastic net often outperforms the lasso while enjoying a similar sparsity of representation. In addition, the elastic net encourages a grouping effect, whereby strongly correlated predictors tend to enter or leave the model together. The elastic net is particularly useful when the number of predictors is much larger than the number of samples. We also propose an algorithm, LARS-EN, for efficiently computing the entire elastic net regularization path, much as the LARS algorithm does for the lasso.
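The elastic net penalty combines the L1 and L2 norms: it minimizes ||y − Xβ||² + λ₁||β||₁ + λ₂||β||². A minimal numpy sketch of one way to solve this criterion (coordinate descent with soft-thresholding, a standard technique; the function names and the toy data are illustrative assumptions, not the thesis's LARS-EN algorithm) shows the grouping effect on two nearly collinear predictors:

```python
import numpy as np

def soft(z, g):
    """Soft-thresholding operator S(z, g) = sign(z) * max(|z| - g, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - g, 0.0)

def naive_elastic_net(X, y, lam1, lam2, n_iter=200):
    """Coordinate descent for min ||y - X b||^2 + lam1*||b||_1 + lam2*||b||^2.
    (Illustrative sketch, not the LARS-EN path algorithm.)"""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]          # partial residual excluding x_j
            b[j] = soft(X[:, j] @ r, lam1 / 2.0) / (col_sq[j] + lam2)
    return b

# Toy data (assumption): x1 and x2 are a highly correlated "group",
# x3 is an irrelevant predictor.
rng = np.random.default_rng(0)
n = 50
z = rng.standard_normal(n)
X = np.column_stack([z + 0.01 * rng.standard_normal(n),
                     z + 0.01 * rng.standard_normal(n),
                     rng.standard_normal(n)])
y = 3.0 * z + 0.5 * rng.standard_normal(n)
b = naive_elastic_net(X, y, lam1=1.0, lam2=1.0)
# The L2 term pulls the correlated coefficients b[0] and b[1] toward each
# other (grouping effect), while the L1 term keeps the irrelevant b[2] small.
```

With λ₂ = 0 the same code reduces to a plain lasso, which tends to pick only one predictor from a correlated group; the strictly convex L2 term is what makes the group enter together.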

The second part of the thesis presents an application of the elastic net to deriving principal components with sparse loadings. We propose a principled approach, SPCA, that modifies PCA through a novel sparse PCA criterion in which an elastic net constraint produces sparse loadings. To solve the optimization problem in SPCA, we use an alternating algorithm that iterates between the elastic net and the reduced-rank Procrustes rotation. SPCA allows flexible control of the sparse structure of the resulting loadings and can identify important variables.
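The alternating scheme can be sketched in a few lines of numpy. This simplified version (an assumption-laden illustration, not the thesis's exact algorithm) uses the λ₂ → ∞ limit, in which the elastic net step for each loading reduces to soft-thresholding X'X·aⱼ, and updates the orthonormal matrix A by a Procrustes rotation computed from an SVD:

```python
import numpy as np

def soft(z, g):
    """Soft-thresholding operator."""
    return np.sign(z) * np.maximum(np.abs(z) - g, 0.0)

def spca(X, k, lam1, n_iter=100):
    """Simplified SPCA sketch: alternate between a soft-thresholding step for
    the sparse loadings B (elastic net step in the lam2 -> inf limit) and a
    reduced-rank Procrustes rotation for the orthonormal matrix A."""
    G = X.T @ X
    # initialize A with the ordinary PCA loadings
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    A = Vt[:k].T
    B = G @ A
    for _ in range(n_iter):
        B = soft(G @ A, lam1 / 2.0)                       # sparse loadings step
        U, _, Wt = np.linalg.svd(G @ B, full_matrices=False)
        A = U @ Wt                                        # Procrustes rotation
    norms = np.linalg.norm(B, axis=0)
    norms[norms == 0] = 1.0                               # guard all-zero columns
    return B / norms                                      # unit-norm loadings

# Illustrative usage on random data (assumption): larger lam1 zeroes out more
# entries of the loadings; lam1 = 0 recovers ordinary PCA loadings.
rng = np.random.default_rng(1)
X = rng.standard_normal((40, 8))
B = spca(X, k=3, lam1=8.0)
```

Note that with λ₁ = 0 the soft-thresholding step is the identity, so the iteration has the ordinary principal component loadings as a fixed point, which is a useful sanity check on any implementation.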

In the third part of the thesis, we study the degrees of freedom of the lasso in the framework of SURE theory. We prove that the number of non-zero coefficients is an unbiased estimate of the degrees of freedom of the lasso, a conclusion that requires no special assumptions on the predictors. Our analysis also provides mathematical support for a related conjecture by Efron et al. (2004). As an application, various model selection criteria (Cp, AIC and BIC) are defined which, together with the LARS algorithm, provide a principled and efficient approach to obtaining the optimal lasso fit with the computational effort of a single ordinary least-squares fit. The degrees of freedom of the elastic net can be obtained by similar arguments.
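The unbiasedness claim can be checked numerically in the simplest setting. Under an orthonormal design the lasso fit is coordinatewise soft-thresholding of y, and the Stein/SURE definition df = (1/σ²) Σᵢ cov(ŷᵢ, yᵢ) can be estimated by Monte Carlo and compared with the average number of non-zero coefficients (the means, λ, and simulation sizes below are illustrative assumptions):

```python
import numpy as np

# Monte Carlo check of df(lasso) = E[# nonzero coefficients] under an
# orthonormal design, where the lasso fit is soft-thresholding of y.
rng = np.random.default_rng(0)
n, sims, lam, sigma = 20, 20000, 1.0, 1.0
mu = np.linspace(-2.0, 2.0, n)                 # arbitrary true means (assumption)
y = mu + sigma * rng.standard_normal((sims, n))
yhat = np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)   # lasso fit per simulation

# Stein/SURE definition: df = (1/sigma^2) * sum_i cov(yhat_i, y_i),
# estimated with the sample covariance across simulations.
df_cov = (((yhat - yhat.mean(axis=0)) * (y - mu)).mean(axis=0)).sum() / sigma**2

# The thesis's unbiased estimate: the number of non-zero coefficients.
mean_nnz = (yhat != 0).sum(axis=1).mean()
# df_cov and mean_nnz agree up to Monte Carlo error.
```

Once df is available, an unbiased Cp-type criterion such as ||y − ŷ||²/σ² − n + 2·df can be evaluated along the regularization path to select λ, which is the application described above.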
