Simple Linear Regression – Assignment #7 ( points)



STAT 602 – Multiple Linear Regression (55 pts.)
Spring 2017

1 – Compressive Strength of Rocks

Source: E. Ali, W. Guang, A. Ibrahim (2014). "Empirical Relations Between Compressive Strength and Microfabrics Properties Using Multivariate Regression, Fuzzy Inference and Neural Networks: A Comparative Study," Engineering Geology, To Appear, Available online 9/16/2014.

Variable Descriptions
ucs = uniaxial compressive strength of rocks (Y)
qtz = quartz content in % (X1)
plag = plagioclase content in % (X2)
k.fds = feldspar content in % (X3)
hb = hornblende content in % (X4)
gs = grain size in pixels/mm (X5)
ga = grain area in pixels/mm² (X6)
sf = shape factor as a measure of circularity of the sample (close to 1 for circular, close to 0 for elongated) (X7)
ar = aspect ratio, which is the ratio of major axis/minor axis lengths (X8)

(a) Examine pairwise correlations between the response and the predictors. Also examine the corresponding scatterplot matrix. Which predictors appear to have the strongest relationship with the response (Y)? (3 pts.)

(b) Based on the scatterplot matrix, it appears that we probably will not need to use both grain size (gs) and grain area (ga) in a multiple regression model. Why would I say this? (2 pts.)

(c) Develop a multiple regression model for compressive strength using the available predictors. Comment on the model's adequacy by examining plots of the residuals. Summarize your findings. (10 pts.)

(d) Using your model in part (c), which factor has the strongest adjusted relationship with the response? Which has the weakest among those in your model? Which feature has a negative adjusted relationship with the response? (4 pts.)

(e) Use your model to predict the compressive strength of a sample with qtz = 40, hb = 20, ga = 750, sf = .60. (2 pts.)

2 – Modeling Viscosity as a Function of Polymer Concentration & Shear Rate (Datafile: Viscosity.JMP)

Source: S. Akbari, S.M. Mahmood, I.M. Tan, A.M. Bharadwaj, H. Hematpour (2017), "Experimental Investigation of the Effect of Different Process Variables on the Viscosity of Sulfonated Polyacrylamide Copolymers," Journal of Petroleum Exploration and Production Technology, Vol. 7, pp. 87-101.

The paper reports on the results of an experiment relating Viscosity (mPa·s) to polymer concentration (100, 1000, 2500, 3500, 5000 ppm) and shear rate (10, 31.6228, 100.776, 316.228, 1000 1/sec). The authors modeled these data using a two-way ANOVA with 1 replicate per combination of polymer concentration and shear rate, each at the five levels above. Thus we have a total of 5 × 5 = 25 observations. You will be analyzing these data using multiple regression, treating both polymer concentration and shear rate as continuous predictors.

Variable Descriptions
Viscosity (mPa·s) – the response (Y)
PolymerConc (ppm) – polymer concentration (X1)
ShearRate (1/sec) – shear rate (X2)

(a) Examine pairwise correlations between the response and the two predictors. Also examine the corresponding scatterplot matrix. Which predictor appears to have the strongest relationship with the response (Y)? (3 pts.)

(b) Develop a multiple regression model for Viscosity using the two predictors. Comment on the model's adequacy by examining plots of the residuals. Summarize your findings. (5 pts.)

(c) Develop a multiple regression model for ln(Viscosity). Comment on the model's adequacy by examining plots of the residuals. Summarize your findings. Does this model appear to be better in terms of assumptions than the model from part (b)? (5 pts.)

(d) Use Graph Builder to construct a plot of ln(Viscosity) vs. Polymer Concentration adjusting for Shear Rate. To do this in Graph Builder, drag Polymer Concentration to the horizontal axis, Log(Viscosity) to the vertical axis, and Shear Rate to the Group X field at the top of the scatterplot of Log(Viscosity) vs. Polymer Concentration. You could also add Viscosity to the vertical axis by dragging it near the top of the vertical axis field.
Now answer these questions: does it appear that the conditional relationship between Log(Viscosity) and/or Viscosity and Polymer Concentration is linear? Is there visual evidence that the nature of the relationship between Log(Viscosity) and/or Viscosity and Polymer Concentration depends on the Shear Rate? (5 pts.)

(e) To further understand the relationship between the response or natural log-response and these factors, you could reverse the roles of Polymer Concentration and Shear Rate in the process outlined above. Using the results from Graph Builder, what additional "nonlinear" terms do you think should be added to the model? Add these terms to your multiple regression model and examine residual plots to assess the adequacy of your model. (6 pts.)

(f) Use Factor Profiling > Profiler or Factor Profiling > Surface Profiler to examine your fitted model and the role Polymer Concentration and Shear Rate play in determining the viscosity/Log(viscosity). Discuss. (4 pts.)

(g) Save the Mean Confidence Limit Formula and the Indiv Confidence Limit Formula to the data table. Give 95% confidence and prediction intervals for Log(Viscosity) and Viscosity when Polymer Concentration = 1000 ppm and Shear Rate = 100.76 1/sec. Remember Log(Viscosity) = ln(Viscosity), so to back-transform to the original scale use e^x. Interpret the intervals in the original scale. (6 pts.)

CHALLENGE PROBLEM 3 – Selling Price of Homes in Polk County, IA from 2012-2013 (Datafile: Polk County 2012.JMP)

The variable descriptions for this analysis are given below. The goal of the analysis is to identify key factors/variables that are related to the selling price of the homes. All sales represent arm's-length situations, i.e. where both the buyer and seller are on equal footing.

Variable – Description
price – This is the response for the prediction problem and is the price the home sold for in U.S. dollars ($).
juris – This is a categorical variable which denotes the region/city within Polk County the home was in.
The codes are: AL = Altoona, ANK = Ankeny, BO = Bondurant, CL = Clive, DM = Des Moines, GR = Grimes, JO = Johnston, O = Other, PC = Polk City, PH = Pleasant Hill, UR = Urbandale, WDM = West Des Moines
month – Month the home sale occurred (1 = January, …, 12 = December)
instrument – Deed or Contract
ZIP* – Zip Code (FOR MAPPING PURPOSES ONLY!!!)
bldg.full – Assessed value of building(s) on the lot ($)
total.full – Assessed value of entire property (land and structures) ($)
land.acres – Lot size in acres
residence.type – Residence type: 1.5 Stories, 2 Stories, 1 Story, Over 2 Stories, Split
bldg.style – Split, Ranch, Other, Early 20s, Conv (conventional), Bungalow
ext.wall – Material used to construct exterior walls: Wood, Vinyl, Metal, Hardboard, Conc.Board, Brick, Other
percent.brick – Percentage of house that is brick, ranges from 0 to 100
roof.type – Gable, Hip, Other
main.living.area – Square feet of main living area
upper.living.area – Square feet of upper living area
fin.attic.area – Square feet of finished attic area
total.living.area – Total square feet of living area
unfin.attic.area – Square feet of unfinished attic area
foundation – Material used to construct foundation: Poured Concrete, Concrete Block, Brick, Other
basement.area – Square feet of basement area
fin.bsmt.area.tot – Square feet of finished basement area
bsmt.walkout – Lineal feet of exposed wall
bsmt.gar.capacity – Number of cars (capacity) that fit in basement garage
att.garage.area – Square feet of attached garage
open.porch.area – Square feet of all attached open porches
enclose.porch.area – Square feet of all attached enclosed porches
patio.area – Square feet of all patio areas
deck.area – Square feet of all deck areas
canopy.area – Square feet of all canopies
veneer.area – Lineal feet of brick veneer on house
carport – Is there a carport? (1 = yes, 0 = no)
bathrooms – Number of full or ¾ baths
toilet.rooms – Number of ½ baths
extra.fixtures – Number of extra fixtures
whirlpools – Number of whirlpool tubs
fireplaces – Number of fireplaces
bedrooms – Number of bedrooms
rooms – Total number of rooms
year.built – Year home was built
gasair – Home has gas furnace with forced air heat (1 = yes, 0 = no)
air.conditioning – Percent of central air conditioning (ranges from 0 = none to 100 = full)
detached – Are there detached structures? (1 = yes, 0 = no)
bsmt.qual – Basement quality: None (i.e. no basement), Low, Average, Average Plus, Living Quarters
Condition – The condition of the house for its age and type of construction: 1 = Very poor, 2 = Poor, 3 = Below Normal, 4 = Normal, 5 = Above Normal, 6 = Very Good, 7 = Excellent
Grade – The quality of original construction, ranging from 1 to 6, with 1 being the best (notice it uses the opposite ordering from Condition)
Grade Adjusted – An adjustment made between the different grades above; it adjusts up or down on the grade scale. The smaller the adjusted grade value, the better the quality of the original construction.

Use multiple regression to develop a regression model for Y = selling price (price), or possibly log10(price), using the other variables in the table above as predictors. Summarize your final model and assess model adequacy by examining the residuals. (25 pts.)
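
If you model log10(price) here, or ln(Viscosity) as in Problem 2, any interval JMP reports is on the log scale and has to be back-transformed before you interpret it. The Python sketch below shows only that arithmetic. The function name `back_transform` and the interval endpoints are hypothetical placeholders chosen for illustration; they are not results from either data set and are not JMP output.

```python
import math


def back_transform(lower, upper, base=math.e):
    """Map interval endpoints from the log scale back to the original scale.

    Use base=math.e for a natural-log response (e^x)
    and base=10 for a log10 response (10^x).
    """
    return base ** lower, base ** upper


# Hypothetical endpoints for illustration only (NOT fitted results).
ln_interval = (2.0, 3.0)       # an interval for ln(Y)
log10_interval = (5.0, 5.5)    # an interval for log10(Y)

print(back_transform(*ln_interval))              # endpoints via e^x
print(back_transform(*log10_interval, base=10))  # endpoints via 10^x
```

Because the log is a monotone transformation, a back-transformed prediction interval is still a prediction interval for an individual response. A back-transformed confidence interval for the mean of the log response is usually read as an interval for the median (geometric mean) of the response rather than its arithmetic mean.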