DS 533 - Western Illinois University



DS 533

Fall 2004

Exam # 3

Name: ___________________

Show all your work.

An automobile rental company wants to predict the yearly maintenance expense (Y) for an automobile using the number of miles driven during the year ($X_1$) and the age of the car ($X_2$, in years) at the beginning of the year. The company has gathered data on 10 automobiles, and the regression output from Excel is presented below. Use this information to answer the following questions.

|Summary measures | |
|Multiple R |0.9689 |
|R-Square |0.9387 |
|Adj R-Square |0.9212 |
|Standard Error |72.218 |

|Regression coefficients | | | | |
| |Coefficient |Std Err |t-value |p-value |
|Constant |33.796 |48.181 |0.7014 |0.5057 |
|Miles Driven |0.0549 |0.0191 |2.8666 |0.0241 |
|Age of car |21.467 |20.573 |1.0434 |0.3314 |

a. Use the information above to estimate the linear regression model.

$\hat{Y} = 33.796 + 0.0549\,X_1 + 21.467\,X_2$

b. Interpret each of the estimated regression coefficients of the regression model in Question a.

For every additional 100 miles driven, the estimated yearly maintenance cost goes up by $5.49, holding the age of the car fixed.

For each additional year of age, the estimated yearly maintenance cost goes up by $21.47, holding the miles driven fixed.

c. Identify and interpret the coefficient of determination ($R^2$) and the standard error of the estimate ($S_{y.x}$) for the model in Question a.

$R^2$ = 0.9387. About 93.87% of the variability in yearly maintenance cost is explained by the miles driven and the age of the car.

$S_{y.x}$ = 72.218. This measures the typical spread of the observed maintenance costs around the fitted regression model, roughly $72.22 per year.
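The summary measures in the Excel output are tied together by standard formulas, and a short check can make that relationship concrete. The sketch below is only an illustration: it assumes n = 10 automobiles and k = 2 explanatory variables, as stated in the problem, and recomputes the adjusted R-square and Multiple R from the reported R-square.

```python
import math

r_square = 0.9387   # R-Square from the Excel output
n = 10              # automobiles in the sample
k = 2               # explanatory variables: miles driven and age of car

# Adjusted R-square penalizes R-square for the number of predictors.
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

# Multiple R is the square root of R-square.
multiple_r = math.sqrt(r_square)

print(f"Adjusted R-square: {adj_r_square:.4f}")
print(f"Multiple R:        {multiple_r:.4f}")
```

Both values should agree with the Adj R-Square and Multiple R rows of the summary table up to rounding.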

d. Does the given set of explanatory variables do a good job of explaining changes in the maintenance costs? Explain why or why not.

The $R^2$ is high, which indicates a good overall fit. However, the age of the car is not a significant predictor of maintenance cost once miles driven is already in the model (p-value = 0.3314 > 0.05), so the age variable may not be needed.

e. Would you recommend that this company examine any other factors to predict maintenance expense? If yes, what other factors would you want to consider? Explain your answer.

This is a good model, with $R^2$ of about 94%. Another variable that may be considered is the make and model of the car.

f. Give a 95% confidence interval for the change in the average yearly maintenance cost of an automobile for every extra mile driven during the year ($\beta_1$).

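$b_1 \pm t_{0.025,\,7}\; s_{b_1} = 0.0549 \pm 2.365(0.0191) = 0.0549 \pm 0.0452$, that is, from about 0.0097 to 0.1001 dollars per mile.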
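The interval for the miles-driven coefficient in part f can be checked with a few lines of arithmetic. This is a minimal sketch, not part of the original key: the critical value 2.365 is taken from a standard t table for 7 degrees of freedom (n - k - 1 = 10 - 2 - 1) and is hard-coded as an assumption rather than computed.

```python
b1 = 0.0549       # coefficient for miles driven (Excel output)
se_b1 = 0.0191    # standard error of that coefficient
n, k = 10, 2      # sample size and number of explanatory variables
df = n - k - 1    # residual degrees of freedom

# Assumed table value: upper 2.5% point of the t distribution with 7 df.
t_crit = 2.365

margin = t_crit * se_b1
lower, upper = b1 - margin, b1 + margin

print(f"df = {df}")
print(f"95% CI for beta_1: {lower:.4f} to {upper:.4f}")
```

Because the interval excludes zero, it is consistent with the p-value of 0.0241 reported for miles driven.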

g. What is the average yearly maintenance cost for a 10-year-old automobile that is driven 12,000 miles per year?

$\hat{Y} = 33.796 + 0.0549(12000) + 21.467(10) = 907.27$, or about $907.27 per year.
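The point prediction in part g is a direct substitution into the fitted equation. The helper below, `predict_maintenance`, is a hypothetical name introduced only for illustration; the coefficients are the ones reported in the Excel output.

```python
def predict_maintenance(miles: float, age: float) -> float:
    """Estimated yearly maintenance cost from the fitted model.

    Coefficients come from the Excel output: constant, miles driven, age of car.
    """
    return 33.796 + 0.0549 * miles + 21.467 * age


# A 10-year-old automobile driven 12,000 miles in the year.
cost = predict_maintenance(miles=12000, age=10)
print(f"Estimated yearly maintenance cost: ${cost:.2f}")
```

This gives only a point estimate; an interval around it would need information that the summary output does not include.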

Mid-Valley Travel Agency (MVTA) has offices in 12 cities. The company believes that its monthly airline bookings are related to the mean income in those cities and has collected the following data:

|Location |Bookings |Income |
|1 |1098 |43299 |
|2 |1131 |45021 |
|3 |1120 |40290 |
|4 |1142 |41893 |
|5 |971 |30620 |
|6 |1403 |48105 |
|7 |855 |27482 |
|8 |1054 |33025 |
|9 |1081 |34687 |
|10 |982 |28725 |
|11 |1098 |37892 |
|12 |1387 |46198 |

The data are analyzed using regression analysis. The partial computer output is given below:

|SUMMARY OUTPUT | |

|Regression Statistics | |
|Multiple R |0.879189 |
|R Square |0.772974 |
|Adjusted R Square |0.750271 |
|Standard Error |78.16735 |
|Observations |12 |

|ANOVA | | | | |
| |df |SS |MS |F |
|Regression |1 |208036.3 |208036.3 |34.04775 |
|Residual |10 |61101.35 |6110.135 | |
|Total |11 |269137.7 | | |

| |Coefficients |Standard Error |t Stat |P-value |
|Intercept |371.6758 |128.5571 |2.891133 |0.016076 |
|X Variable 1 |0.019381 |0.003322 | | |

a) What is the estimated least squares regression line?

$\hat{Y} = 371.6758 + 0.019381\,X$, where Y is monthly bookings and X is mean income.
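The fitted line can be reproduced from the twelve observations in the data table. The following is a sketch using only the textbook least squares formulas and the Python standard library; the variable names are illustrative, and the results should agree with the Excel output up to rounding.

```python
import math

# Data from the table: monthly bookings (y) and mean income (x) for 12 cities.
bookings = [1098, 1131, 1120, 1142, 971, 1403, 855, 1054, 1081, 982, 1098, 1387]
income = [43299, 45021, 40290, 41893, 30620, 48105,
          27482, 33025, 34687, 28725, 37892, 46198]

n = len(bookings)
mean_x = sum(income) / n
mean_y = sum(bookings) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in income)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(income, bookings))
sst = sum((y - mean_y) ** 2 for y in bookings)

slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual sum of squares, R-square, and standard error of the estimate.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(income, bookings))
r_square = 1 - sse / sst
std_error = math.sqrt(sse / (n - 2))

print(f"intercept = {intercept:.4f}, slope = {slope:.6f}")
print(f"R-square = {r_square:.6f}, standard error = {std_error:.3f}")
```

The same sums also give the ANOVA entries: `sst` is the total sum of squares and `sse` is the residual sum of squares.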

b) What is the standard error of the estimate?

$S_{y.x}$ = 78.167

c) Forecast the number of bookings when the mean income is $51385.

$\hat{Y} = 371.6758 + 0.019381(51385) \approx 1367.6$, or about 1,368 bookings.
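The forecast in part c is a substitution into the fitted line, and it is worth checking where the new income value sits relative to the sample. The sketch below uses the coefficients from the Excel output and the incomes from the data table; the extrapolation check is an added caution, not part of the original answer.

```python
intercept = 371.6758   # from the Excel output
slope = 0.019381       # coefficient for income (X Variable 1)

# Incomes observed in the 12 cities (from the data table).
income = [43299, 45021, 40290, 41893, 30620, 48105,
          27482, 33025, 34687, 28725, 37892, 46198]

new_income = 51385
forecast = intercept + slope * new_income

# True when the new value lies outside the observed range of incomes.
is_extrapolation = not (min(income) <= new_income <= max(income))

print(f"Forecast bookings: {forecast:.1f}")
print(f"Outside observed income range: {is_extrapolation}")
```

Since $51,385 is above the largest income in the sample ($48,105), the forecast extends the fitted line beyond the data and should be read with some caution.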

d) Test the significance of the regression coefficient at the 5% level (state the null and alternative hypothesis, the value of your test statistic, the p-value or the decision rule, and your conclusion).

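$H_0: \beta_1 = 0$

$H_a: \beta_1 \neq 0$

$t = b_1 / s_{b_1} = 0.019381 / 0.003322 = 5.834$, with $n - 2 = 10$ degrees of freedom.

Since $t = 5.834$ exceeds $t_{0.005,\,10} = 3.169$, the p-value < 2(0.005) = 0.01, which is below 0.05.

Reject $H_0$.

Mean income is a significant predictor of airline bookings.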
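The test statistic left blank in the Excel output can be filled in from the coefficient and its standard error. This sketch assumes a two-sided test at the 5% level and hard-codes the critical value 2.228 from a standard t table for 10 degrees of freedom. It also uses the fact that, in simple linear regression, the squared t statistic equals the ANOVA F statistic.

```python
import math

b1 = 0.019381       # slope for income (Excel output)
se_b1 = 0.003322    # standard error of the slope
f_stat = 34.04775   # F statistic from the ANOVA table

t_stat = b1 / se_b1

# Assumed table value: upper 2.5% point of the t distribution with 10 df.
t_crit = 2.228
reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.3f}, sqrt(F) = {math.sqrt(f_stat):.3f}")
print(f"Reject H0 at the 5% level: {reject_h0}")
```

The small gap between `t_stat` and the square root of F comes from rounding in the printed coefficient and standard error.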

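e) Give an interval estimate of $\beta_1$ with a 95% confidence coefficient.

$b_1 \pm t_{0.025,\,10}\; s_{b_1} = 0.019381 \pm 2.228(0.003322) = 0.019381 \pm 0.007401$, that is, from about 0.01198 to 0.02678.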
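The interval estimate for the income coefficient follows the same pattern as any t-based interval. As a hedged sketch, the critical value 2.228 below is an assumed table value for 10 degrees of freedom, and the other numbers come from the Excel output.

```python
b1 = 0.019381      # slope for income (Excel output)
se_b1 = 0.003322   # standard error of the slope
n = 12             # observations
df = n - 2         # residual degrees of freedom in simple regression

# Assumed table value: upper 2.5% point of the t distribution with 10 df.
t_crit = 2.228

margin = t_crit * se_b1
lower, upper = b1 - margin, b1 + margin

print(f"95% CI for beta_1: {lower:.5f} to {upper:.5f}")
```

In words, each additional dollar of mean income is associated with roughly 0.012 to 0.027 more monthly bookings.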

Multiple Choice Questions

Select the best answer

1. In choosing the “best-fitting” line through a set of points in linear regression, we choose the one with the:

a. smallest sum of squared residuals **

b. largest sum of squared residuals

c. smallest number of outliers

d. largest number of points on the line

e. none of the above

2. In a multiple regression analysis, there are 25 data points and 5 independent variables, and the sum of the squared differences between observed and predicted values of y is 160. The regression standard error will be:

a. 2.530

b. 3.464

c. 2.902**

d. 5.657

e. none of the above
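The marked answer to Question 2 follows from the usual formula for the regression standard error, the square root of SSE divided by n - k - 1. A minimal check, using only the numbers given in the question:

```python
import math

n = 25       # data points
k = 5        # independent variables
sse = 160    # sum of squared differences between observed and predicted y

# The residual degrees of freedom lose one for each slope and one for the intercept.
df = n - k - 1
std_error = math.sqrt(sse / df)

print(f"df = {df}, regression standard error = {std_error:.3f}")
```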

3. In a simple linear regression analysis, the following sums of squares are produced:

[pic]

The proportion of the variation in y that is explained by the variation in x is:

a. 20%

b. 80%**

c. 25%

d. 50%

e. none of the above

4. Given the least squares regression line $\hat{y} = 8 - 3x$,

a. the relationship between x and y is positive

b. the relationship between x and y is negative**

c. as x increases, so does y

d. as x decreases, so does y

e. there is no relationship between x and y

5. A multiple regression equation includes 6 independent variables, and the coefficient of multiple determination is 0.91. The percentage of the variation in y that is explained by the regression equation is:

a. 91%**

b. 95%

c. 83%

d. about 15%

e. none of the above

6. A “fan” shape in a scatterplot indicates:

a. unequal variance**

b. a nonlinear relationship

c. the absence of outliers

d. sampling error

7. The values of the regression parameters $\beta_i$ are not known. We estimate them from the data.

a) True** b) False c) Not enough information

8. Residual plots can be used to check the aptness of the model for the data.

a) True** b) False c) Not enough information

9. We need to estimate the variance of the error terms because:

I) It gives an indication of the variability of the distribution of y.

II) It is needed for making inference concerning regression function and the prediction of y.

a) Only (I) is true.

b) Only (II) is true.

c) Both (I) and (II) are true.**

d) Neither (I) nor (II) is true.
