SELECTION = STEPWISE in PROC REG



Stepwise Multiple Regression

Your introductory lesson for multiple regression with SAS involved developing a model for predicting graduate students' Grade Point Average. We had data from 30 graduate students on the following variables: GPA (graduate grade point average), GREQ (score on the quantitative section of the Graduate Record Exam, a commonly used entrance exam for graduate programs), GREV (score on the verbal section of the GRE), MAT (score on the Miller Analogies Test, another graduate entrance exam), and AR, the Average Rating that the student received from three professors who interviewed them prior to the admission decision. GPA can exceed 4.0, since this university attaches pluses and minuses to letter grades. We used a simultaneous multiple regression, entering all of the predictors at once. Now we shall learn how to conduct stepwise regressions, in which variables are entered and/or deleted according to statistical criteria. Please run the program STEPWISE.SAS from my SAS Programs page.

Forward Selection

In a forward selection analysis we start out with no predictors in the model. Each of the available predictors is evaluated with respect to how much R2 would be increased by adding it to the model. The one which would most increase R2 is added if it meets the statistical criterion for entry. With SAS, the statistical criterion is the significance level of the increase in R2 produced by adding the predictor. If no predictor meets that criterion, the analysis stops. If a predictor is added, the second step involves re-evaluating all of the available predictors which have not yet been entered into the model. If any satisfy the criterion for entry, the one which most increases R2 is added. This procedure is repeated until no remaining predictors are eligible for entry.

Look at the program. The first model (A:) asks for a forward selection analysis. The SLENTRY= value specifies the significance level for entry into the model. The defaults are 0.50 for forward selection and 0.15 for fully stepwise selection. I set the entry level at .05 -- I think that is unreasonably low for a forward selection analysis, but I wanted to show you a possible consequence of sticking with the .05 criterion.
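The syntax of STEPWISE.SAS is not reproduced in this summary, so here is a minimal sketch of what Model A amounts to. SELECTION= and SLENTRY= are genuine PROC REG options; the data set name (grades) and the exact variable names are my assumptions based on the description above, and the real program may differ in those details.

    proc reg data=grades;
      /* Model A: forward selection, entry criterion p < .05 */
      A: model GPA = GRE_Q GRE_V MAT AR
         / selection=forward slentry=0.05;
    run;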
Look at the output. The "Statistics for Entry" on page 1 show that all four predictors meet the criterion for entry. The one which most increases R2 is the Average Rating, so that variable is entered. Now look at the Step 2 Statistics for Entry. The F values there test the null hypotheses that entering a particular predictor would not change the R2 at all. Notice that all of these F values are smaller than they were at Step 1, because each of the predictors is somewhat redundant with the AR variable which is now in the model. Now look at the Step 3 Statistics for Entry. The F values are down again, reflecting additional redundancy with the newly entered GRE_Verbal predictor. Neither predictor available for entry meets the criterion, so the procedure stops. We are left with a two-predictor model, AR and GRE_V, which accounts for 54% of the variance in grades.

Backwards Elimination

In a backwards elimination analysis we start out with all of the predictors in the model. At each step we evaluate the predictors which are in the model and eliminate any that meet the criterion for removal.

Look at the program. Model B asks for a backwards elimination analysis. The SLSTAY= value specifies the significance level for staying in the model. The defaults are 0.10 for BACKWARD and 0.15 for STEPWISE. I set it at .05.
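A comparable sketch of Model B, under the same assumptions about data set and variable names; only the selection method and its criterion change:

    proc reg data=grades;
      /* Model B: backward elimination; predictors with p > .05 are candidates for removal */
      B: model GPA = GRE_Q GRE_V MAT AR
         / selection=backward slstay=0.05;
    run;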
Look at the output for Step 1. Of the variables eligible for removal (those with p > .05), removing AR would least reduce the R2, so AR is removed. Recall that AR was the first variable to be entered in our forward selection analysis. AR is the best single predictor of grades, but in the context of the other three predictors it has the smallest unique contribution toward predicting grades. The Step 2 statistics show that only GRE_V is eligible for removal, so it is removed. We are left with a two-predictor model containing GRE_Q and MAT and accounting for 58% of the variance in grades.

Does it make you a little distrustful of stepwise procedures to see that one such procedure produces a two-variable model containing only predictors A and B, while another produces a two-variable model containing only predictors C and D? It should make you distrustful!

Fully Stepwise Selection

With fully stepwise selection we start out just as in forward selection, but at each step the variables already in the model are first evaluated for removal, and if any are eligible for removal, the one whose removal would least lower R2 is removed. You might wonder why a variable would enter at one point and leave later -- well, a variable might enter early, being well correlated with the criterion variable, but later become redundant with predictors that follow it into the model.

Look at the program. For Model C I asked for a fully stepwise analysis and set both SLSTAY and SLENTRY at .08 (just because I wanted to show you both entry and deletion).
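A sketch of Model C, with the same caveats as above. With SELECTION=STEPWISE both an entry criterion and a staying criterion are in force:

    proc reg data=grades;
      /* Model C: fully stepwise selection; enter at p < .08, remove at p > .08 */
      C: model GPA = GRE_Q GRE_V MAT AR
         / selection=stepwise slentry=0.08 slstay=0.08;
    run;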
Look at the output. AR entered first, and GRE_V second, just as in the forward selection analysis. At this point, Step 3, both GRE_Q and MAT are eligible for entry, given my .08 criterion. MAT has a p value a tiny bit smaller than GRE_Q's, so it is selected for entry. Look at the F for entry of GRE_Q on Step 4 -- it is larger than it was on Step 3, reflecting a suppressor relationship between GRE_Q and MAT. GRE_Q enters. We now have all four predictors in the model, but notice that GRE_V and AR no longer have significant partial effects, and thus become eligible for removal. AR is removed first, then GRE_V.

It appears that the combination of GRE_Q and MAT is better than the combination of GRE_V and AR, due to GRE_Q and MAT having a suppressor relationship. That suppressor relationship is not accounted for, however, until one or the other of GRE_Q and MAT is entered into the model, and with the forward selection analysis neither got the chance to enter.

R2 Selection

SELECTION=RSQUARE finds the best n (BEST=n) combinations of predictors among all possible one-predictor models, then among all possible two-predictor models, then three-predictor models, and so on, where "best" means "highest R2." You may force it to INCLUDE=i the first i predictors, to START=n with n-predictor models, and to STOP=n with n-predictor models. I specified none of these options, so I got every possible one-predictor model, every possible two-predictor model, and so on.
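A sketch of the all-possible-subsets run. CP and MSE are the MODEL statement options that request Mallows' Cp and the mean squared error for each subset; the label D: and the rest of the skeleton are, as before, reconstructed:

    proc reg data=grades;
      /* Model D: best subsets by R2, with Cp and MSE printed for each model */
      /* add BEST=n, INCLUDE=i, START=s, or STOP=t to restrict the list      */
      D: model GPA = GRE_Q GRE_V MAT AR
         / selection=rsquare cp mse;
    run;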

I did request Mallows' Cp statistic and the MSE. One may define the "best model" as that which has a small value of Cp that is also close to p (the number of parameters in the model, including the intercept). A small Cp indicates precision -- small variance in estimating the population regression coefficients. With Cp small and approximately equal to p, the model should fit the data well, and adding additional predictors should not improve precision much. Models with Cp >> p do not fit the data well.
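The handout uses Cp without defining it; the usual definition of Mallows' statistic, and the one PROC REG computes, is

$$ C_p = \frac{SSE_p}{MSE_{full}} - (N - 2p) $$

where SSE_p is the error sum of squares of the candidate model with p parameters, MSE_full is the mean squared error of the model containing all available predictors, and N is the number of cases. For a candidate model with negligible bias, the expected value of Cp is approximately p, which is why Cp close to p signals a well-fitting model.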

The output shows that the best one-predictor model is AR, as we already know. The best two-predictor model is GRE_Q and MAT, which should not surprise us, given our evidence of a suppressor relationship between those two predictors. Adding GRE_V to that model gives us the best three-predictor model, but in such a model GRE_V does not have a significant unique effect and the R2 does not go up much. Even so, we might as well add GRE_V, because the GRE_V scores come along with the GRE_Q scores at no additional cost. For economic reasons, we might well decide to drop the AR predictor, since it does not contribute much beyond what the other three predictors provide, and it is likely much more expensive to gather scores for that predictor than for the other three.

My Opinion of Stepwise Multiple Regression

I think it is fun, but dangerous. For the person who understands multiple regression well, a stepwise analysis can help reveal interesting relationships, such as the suppressor effects we noted here. My experience has been that the typical user of stepwise multiple regression has little understanding of multiple regression, and absolutely no appreciation of how a predictor's unique contribution is affected by the context within which it is evaluated (the other predictors in the model). Too many psychologists think that stepwise regression somehow selects the predictors that are really associated with the criterion and leaves out those which have only spurious or unimportant relationships with the criterion. Stepwise analysis does no such thing. Furthermore, statistical programs such as SPSS for Windows make it all too easy for such psychologists to conduct analyses, such as stepwise multiple regression, which they cannot understand and whose results they are almost certain to misinterpret.

Voodoo Regression

Copyright 2015, Karl L. Wuensch, All Rights Reserved