DS 533 - Western Illinois University



DS 533

Fall 2004

Final Exam

Name: ___________________

Show all your work.

1. A realtor in a local area would like to predict the selling price of a newly listed home, or of a home whose owner is considering listing it. The realtor plans to predict the selling price from the size of the home (X1, in square feet), the number of rooms (X2), the age of the home (X3, in years), and whether the home has an attached garage (X4). Use the Excel output below to determine whether the realtor will be able to use this information to predict the selling price (in $1,000s).

|Summary measures | |
|Multiple R |0.9439 |
|R-Square |0.8910 |
|Adj. R-Square |0.8474 |
|StErr of Estimate |22.241 |

|Regression coefficients | | | | |
| |Coefficient |Std Err |t-value |p-value |
|Constant |-19.026 |54.769 |-0.3474 |0.7355 |
|Size |7.494 |1.529 |4.9010 |0.0006 |
|Number of Rooms |7.153 |9.211 |0.7767 |0.4553 |
|Age |-0.673 |0.992 |-0.6789 |0.5126 |
|Attached Garage |0.453 |20.192 |0.0224 |0.9826 |

a. Use the information above to estimate the linear regression model.

b. Interpret each of the estimated regression coefficients of the regression model in Question a.

c. Do the variables presented above seem to be significant in predicting the selling price? Explain your answer.

d. Would any of the variables in this model be considered a dummy variable? Explain your answer.

e. Identify and interpret the coefficient of determination (R²) and the standard error of the estimate (se) for the model in Question a.

f. Would you recommend that the realtor use this model to predict the selling price of a home? Would you want to make any changes to this model before using it to predict the selling price of a home? Explain.
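The t-values in the Excel output for Question 1 are each coefficient divided by its standard error, and the p-values are what one compares against a chosen significance level. The following is a minimal sketch of that check, not an answer key. The 0.05 level (`ALPHA`) is an assumption, since the exam does not state one, and the variable names are illustrative.

```python
# Coefficient, standard error, and p-value copied from the Question 1 output.
rows = {
    "Constant":        (-19.026, 54.769, 0.7355),
    "Size":            (7.494,   1.529,  0.0006),
    "Number of Rooms": (7.153,   9.211,  0.4553),
    "Age":             (-0.673,  0.992,  0.5126),
    "Attached Garage": (0.453,   20.192, 0.9826),
}

ALPHA = 0.05  # assumed significance level; not given in the exam

# t-value = coefficient / standard error
t_values = {name: coef / se for name, (coef, se, _) in rows.items()}

# Terms whose p-value falls below the assumed level
significant = [name for name, (_, _, p) in rows.items() if p < ALPHA]

for name, t in t_values.items():
    print(f"{name:16s} t = {t:8.4f}  p = {rows[name][2]:.4f}")
print("Below the assumed 0.05 level:", significant)
```

The recomputed t-values agree with the table to about three decimal places; small differences come from the rounding of the printed coefficients.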

2. Below you will find a regression model that describes the relationship between the average utility bill (Y, in $) for homes of a particular size and the average monthly temperature (X, in degrees Fahrenheit). The data represent monthly values for the past year. The Durbin-Watson statistic is 1.244, and a residual plot is shown below.

|Summary measures | |
|Multiple R |0.0295 |
|R-Square |0.0009 |
|StErr of Estimate |24.8184 |

|ANOVA table | | | | | |
|Source |df |SS |MS |F |p-value |
|Explained |1 |5.3575 |5.3575 |0.0087 |0.9275 |
|Unexplained |10 |6159.5125 |615.9512 | | |

|Regression coefficients | | | | |
| |Coefficient |Std Err |t-value |p-value |
|Constant |112.547 |28.815 |3.9059 |0.0029 |
|Average Monthly Temp |0.0403 |0.4316 |0.0933 |0.9275 |

a. Estimate the regression model. How well does this model fit the given data?

b. Is there a linear relationship between X and Y? Explain how you arrived at your answer (state the null and the alternative hypothesis, test statistic, p-value and your conclusion).

c. In looking at the graph of the residuals, do you see any evidence of any violations of the assumptions regarding the errors of the regression model?

d. Given the Durbin-Watson value presented above, what would you conclude about the data (state the null and the alternative hypothesis, and your decision criteria)?

e. Given your answer in Question d, would you recommend modifying the original regression model? If so, how would you modify it?
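The summary figures in the Question 2 output are linked by the standard ANOVA identities: MS = SS / df, F = MSR / MSE, R-Square = SSR / (SSR + SSE), and the standard error of the estimate is the square root of MSE. A minimal sketch that recomputes them from the SS and df columns follows; the variable names are illustrative.

```python
import math

# Values copied from the ANOVA table in Question 2.
ssr, df_reg = 5.3575, 1        # Explained
sse, df_err = 6159.5125, 10    # Unexplained

msr = ssr / df_reg             # mean square, regression
mse = sse / df_err             # mean square, error
f_ratio = msr / mse            # F statistic
r_square = ssr / (ssr + sse)   # coefficient of determination
multiple_r = math.sqrt(r_square)
std_err = math.sqrt(mse)       # standard error of the estimate

print(f"F = {f_ratio:.4f}, R-Square = {r_square:.4f}, "
      f"Multiple R = {multiple_r:.4f}, StErr = {std_err:.4f}")
```

The results round to the values printed in the output (F of about 0.0087, R-Square of about 0.0009, StErr of about 24.818).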

3. TOD Chevy is using Holt’s Method to forecast weekly car sales. Currently, the level is estimated to be 50 cars per week, and the trend is estimated to be 6 cars per week. During the current week, 30 cars are sold. Forecast the number of cars sold 3 weeks from now. Use α = β = 0.3.
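The Holt's method update in the TOD Chevy question can be sketched as follows. This uses the common textbook form of the level and trend equations and assumes both smoothing constants equal 0.3 (the symbols in the question are garbled, so reading them as α for the level and β for the trend is an assumption). The function name `holt_update` is illustrative.

```python
def holt_update(level, trend, observed, alpha, beta):
    """One step of Holt's linear exponential smoothing.

    new_level = alpha * observed + (1 - alpha) * (level + trend)
    new_trend = beta * (new_level - level) + (1 - beta) * trend
    """
    new_level = alpha * observed + (1 - alpha) * (level + trend)
    new_trend = beta * (new_level - level) + (1 - beta) * trend
    return new_level, new_trend


# Figures from the question; alpha = beta = 0.3 is the assumed reading.
level, trend = holt_update(level=50, trend=6, observed=30, alpha=0.3, beta=0.3)

k = 3  # forecast horizon in weeks
forecast = level + k * trend

print(f"level = {level:.2f}, trend = {trend:.2f}, "
      f"{k}-week-ahead forecast = {forecast:.2f}")
```

Under these assumptions the updated level is about 48.2, the updated trend about 3.66, and the 3-week-ahead forecast about 59.18 cars.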

4. The following specific seasonal factors (in percent) are given for the month of December:

75.4, 86.8, 96.9, 72.6, 80.0, 85.4

Assume a multiplicative decomposition model. If the expected trend-cycle for December is $900 and the mean of the seasonal factors is used, what is the forecast for December?
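In a multiplicative decomposition, the forecast is the trend-cycle value times the seasonal index. A minimal sketch of the December calculation follows, assuming the plain arithmetic mean of the six factors is the seasonal index, as the question states; variable names are illustrative.

```python
# December seasonal factors from the question, in percent.
factors = [75.4, 86.8, 96.9, 72.6, 80.0, 85.4]
trend_cycle = 900.0  # expected trend-cycle for December, in dollars

mean_factor = sum(factors) / len(factors)      # mean seasonal factor, percent
forecast = trend_cycle * mean_factor / 100.0   # multiplicative model

print(f"mean factor = {mean_factor:.2f}%, forecast = ${forecast:.2f}")
```

This gives a mean factor of about 82.85% and a forecast of about $745.65.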

Multiple Choice Questions

Select the best answer

1. If you are going to use a regression equation for prediction, you hope to have a reasonably [pic] and a reasonably [pic].

a. small; large

b. large; small

c. small; small

d. large; large

e. none of the above

2. In choosing the “best-fitting” line through a set of points in linear regression, we choose the one with the:

a. smallest sum of squared residuals

b. largest sum of squared residuals

c. smallest number of outliers

d. largest number of points on the line

e. none of the above

3. In a multiple regression analysis, there are 20 data points and 3 independent variables, and the sum of the squared differences between observed and predicted values of y is 160. The multiple standard error of estimate will be:

a. 3.162

b. 10

c. 9.41

d. 8.42

e. none of the above
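The multiple standard error of estimate in question 3 follows from the usual formula, the square root of SSE / (n - k - 1), where n is the number of data points and k the number of independent variables. A minimal sketch with the figures from the question:

```python
import math

n = 20       # data points
k = 3        # independent variables
sse = 160.0  # sum of squared differences between observed and predicted y

# Error degrees of freedom are n - k - 1 (one more for the intercept).
std_error = math.sqrt(sse / (n - k - 1))

print(f"standard error of estimate = {std_error:.3f}")
```

With these values the result is the square root of 10, about 3.162.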

4. The F-ratio from the ANOVA table is calculated by:

a. MSR / MSE

b. MSE / MSR

c. SST / SSE

d. SSR / SSE

e. none of the above

5. The ______ can be used to test for autocorrelation.

a. regression coefficient

b. correlation coefficient

c. Durbin-Watson statistic

d. F-test

e. t-test

6. A multiple regression equation includes 6 independent variables, and the coefficient of multiple determination is 0.91. The percentage of the variation in y that is explained by the regression equation is:

a. 91%

b. 95%

c. 83%

d. about 15%

e. none of the above

7. In regression analysis, multicollinearity refers to:

a. the response variables being highly correlated

b. the explanatory variables being highly correlated

c. the response variable(s) and the explanatory variable(s) are highly correlated with one another

d. the response variables are highly correlated over time.

e. none of the above

8. When determining whether to include or exclude a variable in regression analysis, if the p-value associated with the variable’s t-value is above some accepted significance value, such as 0.05, then:

a. the variable is a candidate for inclusion

b. the variable is a candidate for exclusion

c. the variable is redundant

d. the variable does not fit the guidelines of parsimony

e. none of the above

9. The following are the values of a time series for the first four time periods:

|t |1 |2 |3 |4 |
|Value |24 |25 |26 |27 |

Using a three-period moving average, the forecasted value for time period 5 is:

a. 20.4

b. 25.5

c. 26

d. none of the above
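A three-period moving average forecast for question 9 is the mean of the three most recent observations. A minimal sketch, with the helper name `moving_average_forecast` chosen for illustration:

```python
def moving_average_forecast(series, span):
    """Forecast the next period as the mean of the last `span` values."""
    window = series[-span:]
    return sum(window) / span


values = [24, 25, 26, 27]  # time periods 1 through 4
forecast_t5 = moving_average_forecast(values, span=3)

print(f"forecast for period 5 = {forecast_t5}")
```

Here the window is 25, 26, 27, so the forecast for period 5 is 26.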

10. When using exponential smoothing, a smoothing constant must be used. The smoothing constant is a value that:

a. ranges between 0 and 1

b. ranges between –1 and +1

c. is equal to the largest observed value in the series

d. represents the strength of the association between the forecasted and observed values

e. none of the above

11. Winters’ model differs from simple exponential smoothing in that it includes a term for:

a. seasonality

b. trend

c. residuals

d. cyclical fluctuations

e. none of the above

Questions 12 through 14 refer to the following table.

The seasonal indexes of sales revenue for People's Bank are:

|January |1.20 |
|February |0.90 |
|March |1.00 |
|April |1.08 |
|May |1.02 |
|June |1.10 |
|July |1.05 |
|August |0.90 |
|September |0.85 |
|October |1.00 |
|November |1.10 |
|December |0.80 |

12. Total revenue for People's Bank in 1999 is forecasted to be $60,000. Based on the seasonal indexes above, sales in the first three months of 1999 should be:

a. $4,800

b. $15,500

c. $14,723

d. $13,500

e. None of the above.

13. If December 1999 revenue for People's Bank amounted to $5,000, a reasonable estimate of revenue for January 2000, based on the seasonal indexes given above would be:

a. $3,000

b. $4,500

c. $4,800

d. $7,500

e. None of the above.

14. If revenue of People's Bank amounted to $5,500 in November 1999, the November 1999 sales revenue, after adjustment for seasonal variation using the indexes given above, would be:

a. $6,500

b. $6,050

c. $5,500

d. $4,500

e. None of the above.
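The three People's Bank questions all use the seasonal indexes the same way: multiply a deseasonalized monthly figure by the index to seasonalize it, or divide an actual figure by the index to deseasonalize it. The sketch below illustrates that arithmetic under two assumptions not stated in the exam: the annual total is spread evenly across months before the indexes are applied, and the underlying deseasonalized level is unchanged from December to January.

```python
index = {
    "January": 1.20, "February": 0.90, "March": 1.00, "April": 1.08,
    "May": 1.02, "June": 1.10, "July": 1.05, "August": 0.90,
    "September": 0.85, "October": 1.00, "November": 1.10, "December": 0.80,
}

# Question 12: first-quarter revenue from an annual forecast of $60,000.
monthly_base = 60_000 / 12
first_quarter = monthly_base * sum(index[m] for m in ("January", "February", "March"))

# Question 13: deseasonalize December, then apply the January index.
january_estimate = 5_000 / index["December"] * index["January"]

# Question 14: seasonally adjusted November revenue.
november_adjusted = 5_500 / index["November"]

print(round(first_quarter), round(january_estimate), round(november_adjusted))
```

Under these assumptions the three figures come to about $15,500, $7,500, and $5,000 respectively.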

15. Suppose that a simple exponential smoothing model is used (with α = 0.40) to forecast monthly sandwich sales at a local sandwich shop. The forecasted demand for September was 1560 and the actual demand was 1480 sandwiches. Given this information, what would be the forecast for October in number of sandwiches?

a. 1480

b. 1528

c. 1560

d. 1592

e. cannot be determined from the information given
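Simple exponential smoothing updates the forecast as a weighted average of the latest actual value and the latest forecast. A minimal sketch for question 15 follows, assuming the smoothing constant in the question is the usual α = 0.40; the function name is illustrative.

```python
def ses_next_forecast(actual, previous_forecast, alpha):
    """Simple exponential smoothing: alpha * actual + (1 - alpha) * forecast."""
    return alpha * actual + (1 - alpha) * previous_forecast


# September figures from the question; alpha = 0.40 is the assumed constant.
october_forecast = ses_next_forecast(actual=1480, previous_forecast=1560, alpha=0.40)

print(f"October forecast = {october_forecast:.0f} sandwiches")
```

With these inputs the October forecast works out to about 1528 sandwiches.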

16. Which of the following is not an attribute of a normal probability distribution?

a. It is symmetrical about the mean.

b. Most observations cluster around the mean.

c. Most observations cluster around zero.

d. The distribution is completely determined by the mean and variance.

e. All the above are correct.

17. When a time series contains no trend, it is said to be

a. nonstationary.

b. seasonal.

c. nonseasonal.

d. stationary.

e. filtered.

18. The difference between seasonal and cyclical components is:

a. Duration.

b. Source.

c. Predictability.

d. Frequency.

e. All the above.

19. A linear trend means that the time series variable changes by:

a. a constant amount each time period

b. a constant percentage each time period

c. a positive amount each time period

d. a negative amount each time period

e. none of the above

20. When using the moving average method, you must select ______, which represent(s) the number of terms in the moving average.

a. a smoothing constant

b. the explanatory variables

c. an alpha value

d. a span

e. none of the above

21. The forecast error is:

a. the difference between this period’s value and the next period’s value

b. the difference between the average value and the expected value of the response variable

c. the difference between the explanatory variable value and the response variable value

d. the difference between the actual value and the forecast

e. none of the above

22. A regression approach can also be used to deal with seasonality by using ______ variables for the seasons.

a. smoothing

b. response

c. residual

d. dummy

e. none of the above
