Model Selection: General Techniques - Stanford University

Statistics 203: Introduction to Regression and Analysis of Variance

Model Selection: General Techniques

Jonathan Taylor

Today

Outline:
- Today
- Crude outlier detection test
- Bonferroni correction
- Simultaneous inference for $\beta$
- Model selection: goals
- Model selection: general
- Model selection: strategies
- Possible criteria
- Mallows' $C_p$
- AIC & BIC
- Maximum likelihood estimation
- AIC for a linear model
- Search strategies
- Implementations in R
- Caveats

- Outlier detection / simultaneous inference.
- Goals of model selection.
- Criteria to compare models.
- (Some) model selection.

Crude outlier detection test

- If an observation's studentized residual is large, the observation may be an outlier.

- Problem: if $n$ is large and we threshold at $t_{1-\alpha/2,\,n-p-1}$, we will flag many observations as outliers by chance, even if the model is correct.

- Solution: Bonferroni correction, i.e. threshold at $t_{1-\alpha/(2n),\,n-p-1}$.
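
A quick worked calculation may help show the size of the problem. It assumes, purely for illustration, $n = 1000$ observations and $\alpha = 0.05$ (neither number comes from the lecture), and that each studentized residual follows a $t_{n-p-1}$ distribution when the model is correct:

```latex
\begin{align*}
E[\#\text{flagged at } t_{1-\alpha/2,\,n-p-1}]
  &= n \cdot \alpha = 1000 \times 0.05 = 50, \\
E[\#\text{flagged at } t_{1-\alpha/(2n),\,n-p-1}]
  &= n \cdot \frac{\alpha}{n} = \alpha = 0.05.
\end{align*}
```

So under these assumptions the uncorrected threshold flags about 50 observations from a correct model, while the corrected threshold flags about 0.05 on average.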

Bonferroni correction

- If we are doing many $t$ (or other) tests, say $m > 1$ of them, we can control the overall false positive rate at $\alpha$ by testing each one at level $\alpha/m$.

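- Proof:

\[
\begin{aligned}
P(\text{at least one false positive})
&= P\left(\bigcup_{i=1}^{m} \left\{ |T_i| \ge t_{1-\alpha/(2m),\,n-p-1} \right\}\right) \\
&\le \sum_{i=1}^{m} P\left( |T_i| \ge t_{1-\alpha/(2m),\,n-p-1} \right) \\
&= \sum_{i=1}^{m} \frac{\alpha}{m} = \alpha.
\end{aligned}
\]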

- This is known as "simultaneous inference": controlling the overall false positive rate at $\alpha$ while performing many tests.
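
The bound can be checked numerically with a small Monte Carlo sketch. It assumes that all $m$ null hypotheses are true and that the $m$ p-values are independent Uniform(0, 1) draws; independence is a simplification made only for this simulation (the Bonferroni bound itself does not need it). The function name `familywise_error_rate` and the values of `m`, `n_rep` and `seed` are illustrative choices, not part of the lecture.

```python
import random


def familywise_error_rate(m, per_test_level, n_rep=2000, seed=1):
    """Estimate P(at least one false positive) among m true-null tests.

    Each p-value is simulated as Uniform(0, 1), which is its distribution
    when the null hypothesis holds; a test rejects when p < per_test_level.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        # One replication: run m tests and record whether any rejects.
        if any(rng.random() < per_test_level for _ in range(m)):
            hits += 1
    return hits / n_rep


alpha, m = 0.05, 20
fwer_naive = familywise_error_rate(m, alpha)           # each test at level alpha
fwer_bonferroni = familywise_error_rate(m, alpha / m)  # each test at level alpha/m
print(f"naive: {fwer_naive:.3f}, Bonferroni: {fwer_bonferroni:.3f}")
```

In this setup the uncorrected rate should land near $1 - 0.95^{20} \approx 0.64$, while the corrected rate should stay close to, and on average below, $\alpha = 0.05$.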

Simultaneous inference for $\beta$

- Another common situation in which simultaneous inference arises is simultaneous inference for the coefficient vector $\beta$.

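- Using the facts that

\[
\hat\beta \sim N\left(\beta,\ \sigma^2 (X^T X)^{-1}\right), \qquad
\hat\sigma^2 \sim \sigma^2 \cdot \frac{\chi^2_{n-p}}{n-p},
\]

along with the independence of $\hat\beta$ and $\hat\sigma^2$, leads to

\[
\frac{(\hat\beta - \beta)^T (X^T X)(\hat\beta - \beta)/p}{\hat\sigma^2}
\sim \frac{\chi^2_p / p}{\chi^2_{n-p}/(n-p)}
\sim F_{p,\,n-p}.
\]

- $(1 - \alpha) \cdot 100\%$ simultaneous confidence region:

\[
\left\{ \beta : (\hat\beta - \beta)^T (X^T X)(\hat\beta - \beta) \le p\,\hat\sigma^2 F_{p,\,n-p,\,1-\alpha} \right\}
\]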
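
As a rough sketch of how the region could be used, the check below tests whether a candidate coefficient vector lies inside it. The function names, the toy values of $X^T X$, $\hat\beta$ and $\hat\sigma^2$, and in particular `f_crit` are all hypothetical: `f_crit` stands in for the quantile $F_{p,\,n-p,\,1-\alpha}$, which in practice would come from tables or statistical software rather than the placeholder used here.

```python
def quadratic_form(d, A):
    """Return d^T A d for a vector d and a square matrix A (nested lists)."""
    k = len(d)
    return sum(d[i] * A[i][j] * d[j] for i in range(k) for j in range(k))


def in_confidence_region(beta, beta_hat, XtX, sigma2_hat, f_crit):
    """Check (beta_hat - beta)^T (X^T X) (beta_hat - beta) <= p * sigma2_hat * f_crit."""
    p = len(beta_hat)
    d = [bh - b for bh, b in zip(beta_hat, beta)]
    return quadratic_form(d, XtX) <= p * sigma2_hat * f_crit


# Toy inputs, made up for illustration only.
XtX = [[10.0, 0.0], [0.0, 20.0]]
beta_hat = [1.0, 2.0]
sigma2_hat = 1.5
f_crit = 3.2  # placeholder for the F quantile, not a looked-up value

inside_center = in_confidence_region([1.0, 2.0], beta_hat, XtX, sigma2_hat, f_crit)
inside_near = in_confidence_region([1.5, 2.5], beta_hat, XtX, sigma2_hat, f_crit)
inside_origin = in_confidence_region([0.0, 0.0], beta_hat, XtX, sigma2_hat, f_crit)
print(inside_center, inside_near, inside_origin)
```

With these toy numbers the right-hand side is $2 \times 1.5 \times 3.2 = 9.6$; the quadratic form is 0 at $\hat\beta$ itself, 7.5 at $(1.5, 2.5)$ and 90 at the origin, so only the first two candidates fall inside the region.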

Model selection: goals

- When we have many predictors (with many possible interactions), it can be difficult to find a good model.

- Which main effects do we include?
- Which interactions do we include?
- Model selection tries to "simplify" this task.

Model selection: general

- This is an "unsolved" problem in statistics: there are no magic procedures to get you the "best model."

- In some sense, model selection is "data mining."

- Data miners / machine learners often work with very many predictors.

Model selection: strategies

- To "implement" this, we need:
  - a criterion or benchmark to compare two models;
  - a search strategy.

- With a limited number of predictors, it is possible to search all possible models.
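
The two ingredients above can be sketched as an exhaustive search: with $p$ candidate predictors there are $2^p$ subsets to score. In the sketch below, `all_subsets`, `best_model` and `toy_criterion` are hypothetical names, and the toy criterion is an arbitrary scoring rule invented for the demonstration. A real criterion would be computed from a fitted model.

```python
from itertools import combinations


def all_subsets(predictors):
    """Yield every subset of the predictors, from the empty model to the full one."""
    for k in range(len(predictors) + 1):
        for subset in combinations(predictors, k):
            yield subset


def best_model(predictors, criterion):
    """Exhaustive search: return the subset with the smallest criterion value."""
    return min(all_subsets(predictors), key=criterion)


def toy_criterion(subset):
    """Made-up score (smaller is better); stands in for a real model criterion."""
    return (len(subset) - 2) ** 2 + 0.1 * ("x3" in subset) + 0.5 * ("x1" not in subset)


predictors = ["x1", "x2", "x3"]
models = list(all_subsets(predictors))
best = best_model(predictors, toy_criterion)
print(len(models), best)
```

The number of subsets doubles with each added predictor, which is why this approach is only practical when the number of predictors is limited.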
