Biostat 510 - University of Michigan



Homework 8

Due Tuesday, April 3rd, 2007

For this homework, you will use the permanent SAS data set, afifi.sas7bdat, created for homework 5. If you do not have it available, you can access it from my web page. However, you must use the permanent SAS data set rather than just reading in the raw data file. Use alpha = 0.05 for all statistical tests in this homework. Use the formats for your categorical variables only in question 2. They will interfere with the logistic regression output if you use them there (unless you use more advanced SAS coding).

1. Create new variables. You will create some of the same new variables as in homework 7, plus some additional variables. If you already have these variables created, you do not need to create them again:

a) SHOCK1_2:

Create this variable from the dummy variable SHOCK, so for your new variable, the value 1 = shock and 2 = no shock.

b) DIED1_2:

Create this variable from the original variable SURVIVE, so that for your new variable, the value 1 = died and 2 = lived. (You can check the values of SURVIVE in the data set description in your coursepack.)

c) DIED:

Create this variable from the original variable SURVIVE.

1=Died

0=Lived

d) FEMALE:

Create this dummy (indicator) variable from SEX.

1=Female

0=Male
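
The recodes in question 1 could be sketched in a DATA step along the following lines. This is only a sketch under stated assumptions: the folder path and the names b510 and afifi2 are hypothetical, SHOCK is taken to be coded 1 = shock and 0 = no shock, and the codes assumed here for SURVIVE (1 = survived, 3 = died) and SEX (1 = male, 2 = female) must be checked against the coursepack before use.

```sas
/* Hypothetical location of the permanent data set; adjust to your own folder. */
libname b510 "C:\temp\b510";

data afifi2;                       /* hypothetical temporary working copy */
  set b510.afifi;

  /* ASSUMPTION: SHOCK is coded 1 = shock, 0 = no shock */
  if shock = 1 then shock1_2 = 1;
  else if shock = 0 then shock1_2 = 2;

  /* ASSUMPTION: SURVIVE is coded 1 = survived, 3 = died (verify in coursepack) */
  if survive = 3 then do;
    died1_2 = 1;
    died = 1;
  end;
  else if survive = 1 then do;
    died1_2 = 2;
    died = 0;
  end;

  /* ASSUMPTION: SEX is coded 1 = male, 2 = female (verify in coursepack) */
  if sex = 2 then female = 1;
  else if sex = 1 then female = 0;
run;
```

Because every assignment sits behind an explicit condition, an observation with a missing or unexpected original value keeps a missing value in the new variable rather than being silently placed in one of the categories.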

2. Create formats for the variables: SHOCK1_2, DIED1_2, SEX, and SHOKTYPE. Refer to your coursepack for information about the coding for each of these variables.

a) Get one-way frequency tabulations for the variables: SHOKTYPE, SHOCK, SHOCK1_2, SURVIVE, DIED, DIED1_2, SEX, and FEMALE.

i. Be sure to assign your formats to the variables SHOCK1_2, DIED1_2, SEX, and SHOKTYPE.

ii. Check to make sure the frequencies match up for the original variable and for your new variables.

iii. Include the output from your oneway frequencies in your writeup.
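
One possible shape for the formats and frequency tables in question 2 is sketched below. The format names shockfmt and diedfmt and the data set name afifi2 (a working copy that already contains the new variables) are hypothetical. Only the two formats whose codes are given in this assignment are shown; formats for SEX and SHOKTYPE would follow the same pattern, using the codes listed in the coursepack.

```sas
proc format;
  /* Codes as defined in question 1 of this assignment */
  value shockfmt 1 = "Shock"
                 2 = "No shock";
  value diedfmt  1 = "Died"
                 2 = "Lived";
run;

proc freq data=afifi2;             /* afifi2 is a hypothetical working data set */
  tables shoktype shock shock1_2 survive died died1_2 sex female;
  /* Formats are attached only for this procedure, not stored in the data set */
  format shock1_2 shockfmt. died1_2 diedfmt.;
run;
```

Attaching the formats with a FORMAT statement inside the procedure, rather than in the DATA step, keeps them out of the later logistic regression runs.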

3. Logistic regression with a continuous predictor variable.

a) Run a logistic regression to predict DIED using the continuous variable SBP1 as the predictor.

b) What is the sample size for this model? What outcome are you predicting? How many patients died? How many lived?

c) Write out the fitted model for the logit of Y (where Y is DIED), using the estimated model parameters.

d) What is the parameter estimate for the effect of SBP1? Why is it negative? Please interpret this parameter estimate in words.

e) Is the effect of SBP1 significant? Write out the test statistic, the degrees of freedom, and the p-value.

f) What is the odds ratio for SBP1? What is the 95% confidence interval for this odds ratio? Interpret this odds ratio in words.

g) What is the pseudo R-square for this model?

h) Include the output from this logistic regression in your writeup.
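
A minimal sketch of the model in question 3 follows, assuming a hypothetical working data set afifi2 that contains DIED coded 1 = died, 0 = lived. The fitted model has the form logit(P(DIED = 1)) = b0 + b1*SBP1, and the odds ratio for a one-unit increase in SBP1 is exp(b1).

```sas
/* DESCENDING makes Proc Logistic model the probability that DIED = 1;
   by default it models the lower ordered value (DIED = 0). */
proc logistic data=afifi2 descending;
  /* RSQUARE requests the generalized R-square and its max-rescaled version */
  model died = sbp1 / rsquare;
run;
```

Which of the two R-square values counts as the pseudo R-square for this course is not stated here, so check the class notes before reporting one.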

4. Logistic regression with a binary predictor.

a) Run a logistic regression with DIED as the dependent variable, and the dummy variable, SHOCK, as the predictor.

b) What are the odds ratio and 95% confidence interval for SHOCK?

c) Please interpret this odds ratio. (Remember, SHOCK is simply an indicator variable for being in shock or not; it is not continuous, so you need to explain its effect in terms of being in shock vs. not being in shock.)

d) Run a crosstab with SHOCK1_2 as the row variable and DIED1_2 as the column variable. Get the odds ratio and the 95% CI for this odds ratio from the SAS output from Proc Freq. (Don’t use the formats for this crosstab.)

e) Compare the odds ratio for the crosstab with the odds ratio from the logistic regression. Are they the same? They should be.

f) Include the output from the logistic regression and the crosstab in your writeup.
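
The two analyses in question 4 might be set up as in the sketch below, again assuming a hypothetical working data set afifi2 that holds the recoded variables.

```sas
/* Logistic regression with the 0/1 indicator SHOCK as the only predictor */
proc logistic data=afifi2 descending;
  model died = shock;
run;

/* 2 x 2 table; RELRISK adds the odds ratio (labeled as the case-control
   estimate) and its 95% confidence limits to the output */
proc freq data=afifi2;
  tables shock1_2 * died1_2 / relrisk;
run;
```

The 1/2 coding puts shock in the first row and died in the first column, so the odds ratio from Proc Freq is the odds of death for shock vs. no shock, the same comparison the logistic regression makes.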

5. Logistic regression with dummy variables for a categorical predictor:

a) Run a logistic regression with DIED as the dependent variable, and the shock dummy variables as predictors. Use SHOKTYPE=2 (non-shock) as the reference category.

b) Write out the fitted model for the logit of Y as a function of the dummy variables in your model.

c) What are the odds ratios and 95% confidence limits for each dummy variable? Please interpret these odds ratios.

d) Run a crosstab with SHOKTYPE as the row variable and DIED1_2 as the column variable. Get a chi-square test for this table.

e) Compare the Score test from Proc Logistic with the Pearson chi-square test from Proc Freq (the crosstab). Are they the same? They should be. Report the test statistic, the degrees of freedom, and the p-value for these tests.

f) Include the logistic regression output and the crosstabulation from Proc Freq in your writeup.
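
A rough sketch of question 5 is given below. It rests on assumptions that must be checked against the coursepack: that SHOKTYPE takes the values 2 through 7, with 2 = non-shock as stated above, and that the dummy variables are named shok3 through shok7. Those names, and the data set names afifi2 and afifi3, are hypothetical; if you already created shock dummy variables for an earlier homework, use those instead.

```sas
data afifi3;                       /* hypothetical working data set */
  set afifi2;
  /* ASSUMPTION: SHOKTYPE codes are 2-7; code 2 (non-shock) gets no dummy
     and therefore serves as the reference category */
  if not missing(shoktype) then do;
    shok3 = (shoktype = 3);
    shok4 = (shoktype = 4);
    shok5 = (shoktype = 5);
    shok6 = (shoktype = 6);
    shok7 = (shoktype = 7);
  end;
run;

proc logistic data=afifi3 descending;
  model died = shok3 shok4 shok5 shok6 shok7;
run;

/* CHISQ requests the Pearson chi-square test for the table */
proc freq data=afifi3;
  tables shoktype * died1_2 / chisq;
run;
```

The Score test to compare with the Pearson chi-square appears in the Proc Logistic table of tests of the global null hypothesis.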

6. Logistic regression using a class statement for a categorical predictor:

a) Run a logistic regression with DIED as the outcome, and SHOKTYPE as a categorical predictor. Use a class statement for SHOKTYPE, and specify that the first level of SHOKTYPE (level 2) is the reference level.

b) Compare the likelihood ratio test from this model to that for the logistic regression model in question 5. Are they the same? They should be.

c) Compare the parameter estimates from this model to those for the previous model. Are they the same? They should be.

d) For this model you will have a Type 3 test for the effect of SHOKTYPE. Report this test statistic, its degrees of freedom and its p-value. What is this test testing?

e) Include the logistic regression results in your homework.
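
The class-statement version in question 6 could look like the sketch below, assuming a hypothetical working data set afifi2.

```sas
proc logistic data=afifi2 descending;
  /* REF=FIRST makes the lowest level of SHOKTYPE the reference level.
     PARAM=REF requests reference (dummy) coding; the default in
     Proc Logistic is effect coding. */
  class shoktype (ref=first) / param=ref;
  model died = shoktype;
run;
```

The PARAM=REF option matters for part c): with the default effect coding, the overall tests would be unchanged, but the parameter estimates would not match those from a model built on 0/1 dummy variables.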

7. Multiple logistic regression using Proc Logistic:

a) Run a logistic regression with DIED as the outcome, and SBP1, URINE1, and SHOCK as the predictors.

b) What type of variable is each of these predictors?

c) Please interpret the odds ratios for each of these predictors. Are they significant? Report the test statistic, degrees of freedom, and p-value for the effect of each predictor.

d) What is the pseudo R-square for this model?

e) Include the output from the logistic regression in your writeup.

8. Carry out the same logistic regression as in Question 7, but use Proc Genmod:

a) Compare the output from Proc Genmod to that from Proc Logistic.

i. Compare the parameter estimates and their standard errors for Proc Genmod with those from Proc Logistic. Are they the same? They should be.

ii. Compare the Type 3 tests from Proc Genmod to those from Proc Logistic. Are they the same? (They will be slightly different, because Proc Logistic bases the Type 3 tests on Wald tests, while Proc Genmod bases the Type 3 tests on likelihood ratio tests.)

b) Include the output from Proc Genmod in your writeup.
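
For questions 7 and 8, the same model can be fit with both procedures roughly as sketched here, assuming a hypothetical working data set afifi2.

```sas
/* Question 7: multiple logistic regression with Proc Logistic */
proc logistic data=afifi2 descending;
  model died = sbp1 urine1 shock / rsquare;
run;

/* Question 8: the same model as a generalized linear model.
   DIST=BINOMIAL with LINK=LOGIT specifies logistic regression;
   TYPE3 requests likelihood ratio Type 3 tests. */
proc genmod data=afifi2 descending;
  model died = sbp1 urine1 shock / dist=binomial link=logit type3;
run;
```

The DESCENDING option is used in both procedures so that each one models the probability that DIED = 1; without it, the signs of the parameter estimates would be reversed.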

9. Include your SAS commands as the first part of your homework. Be sure to run all of your SAS commands at once to check for any errors.

Extra Credit Problem:

Stepwise Logistic Regression using Proc Logistic.

a) Run a stepwise logistic regression with DIED as the dependent variable and SBP1, URINE1, AGE, SHOCK, CARDIAC1, FEMALE, and HGB1 as the possible predictors. Use the default entry and stay p-values. Get the details of the stepwise selection process.

b) What variables are included in your final model?

c) Write out the estimated final model.

d) Include the logistic regression output in your writeup.
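
A sketch of the stepwise run for the extra credit problem is shown below, assuming a hypothetical working data set afifi2 that contains FEMALE and DIED.

```sas
proc logistic data=afifi2 descending;
  /* SELECTION=STEPWISE with no SLENTRY= or SLSTAY= values uses the
     procedure's default entry and stay significance levels;
     DETAILS prints the results of each step of the selection process. */
  model died = sbp1 urine1 age shock cardiac1 female hgb1
        / selection=stepwise details;
run;
```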
