Technology - Winona State University



STAT 405 - Fall 2014 - Homework 9 (41 points)
Due Friday, Nov. 14th

Prostate Cancer Study

It is believed that men's prostate glands grow with age. Therefore, the level of prostate-specific antigen (PSA), a substance made in the prostate, increases slightly as men become older. Oncologists have established that prostate cancer cells are more permeable than normal prostate cells, causing the PSA level to rise in most cases of prostate cancer. When prostate cancer is not detected early enough, the tumor can develop and penetrate the prostatic capsule.

Dr. Donn Young, at The Ohio State University Comprehensive Cancer Center, collected data, including demographic variables and various test results, on a cohort of men with prostate cancer. The interest is in how to use the various test results, and in particular whether variables measured at the baseline exam can be used to predict whether or not the tumor has penetrated the prostatic capsule. You will use binary logistic regression to help identify the indicators of tumor penetration of the prostatic capsule in cancer patients.

Data Files: Prostate Logistic.TXT, Prostate.JMP

Drexam (1 = nodule, 0 = no nodule)

Additional predictors added:

10   Log base 10 of PSA level            log10(mg/ml)                             log10PSA
11   Alpha-numeric recoding of CAPSULE   NP = no penetration, CP = penetration    CapPen

(a) Use a scatterplot to examine the relationship between PSA level and age. Use a different plotting symbol for cases with capsule penetration and those without. Does there appear to be an association between PSA and age? In particular, do these data suggest that PSA increases with age? (2 pts.)

(b) Now plot the log base 10 of the PSA levels vs. age. Comment on the advantage of converting to the log scale for PSA. Does your answer to part (a) change? (1 pt.)

(c) Is there an association between digital rectal exam (Drexam) and capsule penetration? Justify your answer using the appropriate statistical test and reporting the p-value. (3 pts.)

(d) What are the OR and associated CI for capsule penetration associated with the detection of a nodule via digital rectal exam (Drexam) at baseline? (3 pts.)

(e) Fit the simple logistic model for capsule penetration using PSA level as the predictor. Is PSA level a significant predictor of capsule penetration status? Explain. (2 pts.)

(f) What is the multiplicative effect of a 20 mg/ml increase in the PSA level on the odds of capsule penetration? Give a 95% CI for this effect as well. Discuss these results. (3 pts.)

(g) Write out the logistic model for capsule penetration using PSA, age, and the results of the digital rectal exam (Drexam) as the predictors. (1 pt.)

(h) Fit the model from part (g) and include the output. Test the utility of the logistic model. Discuss. (3 pts.)

(i) After adjusting for PSA level and the results of the digital rectal exam at baseline, is the age of the subject of any value in predicting capsule penetration? Explain. (2 pts.)

(j) What is a Gleason Score? Google it! (1 pt.)

(k) Fit the logistic model for capsule penetration using the following predictors: Age, Race, Dpros, Dcaps, PSA, Vol, and Gleason. Summarize the model for Dr. Young. (4 pts.)

(l) Use backward elimination or forward selection to obtain a suitable reduced model. Summarize each predictor in the final model in terms of odds ratios. If you use JMP to fit the stepwise model, choose the Whole Effects option from the Rules drop-down menu as shown below. This keeps JMP from combining levels of the Dpros variable. Summarize these results as if you were writing them for Dr. Young. Be sure to discuss the ORs associated with each term in your final model. For Dpros you will probably want to focus on the ORs that are greater than 1, as they are easier to discuss. (10 pts.)

(m) Examine the case diagnostics for your final reduced model. Identify the case with the highest Cook's distance (or some other suitable measure of influence) and discuss why you think this case stands out. In order to do this you will need to run the model in R. Here are some commands to help you out substantially in this process. (3 pts.)

Read in Prostate.txt from my website:

    Prostate = read.table(file.choose(), header=TRUE, sep=",")
    names(Prostate)
     [1] "ID" "Capsule" "Age" "Race" "Dpros" "Drexam" "Dcaps" "PSA"
     [9] "Vol" "Gleason" "log10PSA" "CapPen"

Fit the full logistic model for part (k):

    fullmodel = glm(Capsule ~ Age + as.factor(Race) + as.factor(Dpros) + as.factor(Dcaps) + PSA + Vol + Gleason,
                    family="binomial", data=Prostate)

The use of as.factor() is necessary to make sure that the nominal variables Race, Dpros, and Dcaps (which are coded numerically) are NOT treated as numeric in the logistic model fit.

Perform backward elimination to obtain a reduced model:

    stepmodel = step(fullmodel)
    summary(stepmodel)

Note: this model will not be the same as the one obtained using stepwise selection in JMP.

    stepmodel2 = update(stepmodel, .~.-Vol)

This will give the same model you obtained in JMP.

    Diagplot.log(stepmodel)

Identify points in the two plots that stand out. Be sure to hit Esc when you are finished identifying points in each plot.

(n) Identify the case which is most poorly fit using a suitable measure and discuss why you think this case is identified as being poorly fit. See the plots obtained in part (m) above. (3 pts.)
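The 20 mg/ml odds-ratio question and the two case-diagnostic questions both come down to a few lines of base R once a model has been fit. The sketch below is only an illustration under stated assumptions: it uses simulated data rather than the Prostate file, the generating coefficients are made up, and the names sim, fit, or20, ci20, cd, and dr are hypothetical. It uses a Wald interval, which is one common choice; confint(fit) would give profile-likelihood intervals instead.

```r
# Simulated stand-in data (NOT the Prostate data); column names mirror the assignment.
set.seed(405)
n <- 200
sim <- data.frame(
  Age    = round(rnorm(n, mean = 66, sd = 6)),
  PSA    = round(rexp(n, rate = 1 / 15), 1),
  Drexam = rbinom(n, size = 1, prob = 0.4)
)
# Made-up coefficients, used only to generate a 0/1 response for the demo.
eta <- -1.5 + 0.04 * sim$PSA + 0.9 * sim$Drexam
sim$Capsule <- rbinom(n, size = 1, prob = plogis(eta))

fit <- glm(Capsule ~ PSA + Age + Drexam, family = "binomial", data = sim)

# Multiplicative effect on the odds of a 20-unit increase in PSA,
# with a 95% Wald interval: exp(20 * (b +/- z * SE)).
b    <- coef(fit)["PSA"]
se   <- sqrt(diag(vcov(fit)))["PSA"]
or20 <- exp(20 * b)
ci20 <- exp(20 * (b + c(-1, 1) * qnorm(0.975) * se))

# Case-by-case influence (Cook's distance) and lack of fit (deviance residuals).
cd <- cooks.distance(fit)
dr <- residuals(fit, type = "deviance")
most_influential <- which.max(cd)
worst_fit        <- which.max(abs(dr))

print(c(OR = unname(or20), lower = ci20[1], upper = ci20[2]))
print(sim[c(most_influential, worst_fit), ])
```

Printing the rows for the flagged cases next to their observed response is usually the quickest way to see why a case stands out, for example a very high PSA paired with no penetration.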