1 - University of Iowa



6N:216 Fall 2003

Data and Decisions

Quiz 3

• Please write your name at the top of each page

• For work done with Excel, attach a printout of your spreadsheet(s)

1. The exam is open book, open note, and open computer

2. You are expected to do your work entirely alone

3. The exam has a total of 100 points

4. The exam is due in class; if you cannot return it during class, you can email or fax it at your own risk, so long as it is received by the College of Business before 6 p.m. on the night of class.

5. Late exams will not be accepted.

Good luck!

1. Planting Trees (15 points)

Tyler Landscaping estimates jobs based on the length of time expected to perform different landscaping tasks. For instance, their price for tree planting is based on an estimate of one hour of labor per tree. The owner suspects that the actual time required exceeds the one-hour estimate. You may assume that the underlying distribution (of labor hours to plant a tree) is normal. During some recent jobs, she recorded the number of hours required per tree, and obtained the following times:

0.9, 0.7, 1.8, 1.4, 1.6, 1.5, 1.8, 2.2, 0.6, 1.6

a. (3 points) Formulate a null hypothesis for testing whether the labor time is greater than one hour.

b. (2 points) Compute the appropriate test statistic.

c. (5 points) Determine the critical value using ( = 0.05 and state the conclusion of the test.

d. (5 points) What is the smallest level of significance for which we would reject the null hypothesis? What recommendations (if any) would you make based on this result?

2. Educational Status (15 points)

The Excel file quiz3_data.xls contains data on the educational status of persons 25 years and older. We would like to know if marital status and education level have some relationship with each other.

a. (5 points) Construct two tables that show education level vs. marital status. One table should have the observed counts, the other should have the expected counts. Hint: use all six marital status categories and all six education level categories.

b. (5 points) Test the null hypothesis that the two variables are independent. What is the p-value?

e. (5 points) Concisely state the conclusion of your test. Then, closely examine the subset of people with advanced degrees. Which types of marital status are over-represented or under-represented among people with advanced degrees? State (in a few sentences) any conclusions you can reach from this analysis. (Hint: Consider how much each cell contributed to the test statistic.)

3. School Spending (25 points)

The data for this question come from "Getting What You Pay For: The Debate Over Equity in Public School Expenditures" published in Journal of Statistics Education (Guber 1999). The cases are the 50 US states.

The variables are:

expend: Current expenditure per pupil in average daily attendance in public elementary and secondary schools, 1994-95 (in thousands of dollars)

ratio: Average pupil/teacher ratio in public elementary and secondary schools, Fall 1994

salary: Estimated average annual salary of teachers in public elementary and secondary schools, 1994-95 (in thousands of dollars)

takers: Percentage of all eligible students taking the SAT, 1994-95

verbal: Average verbal SAT score, 1994-95

math: Average math SAT score, 1994-95

total: Average total score on the SAT, 1994-95

a) (5 points) Fit a model with the total SAT score as the dependent variable and expend, salary, ratio, and takers as independent variables. Which of the independent variables are significant at α = 0.05?

b) (5 points) Remove the independent variable with the largest p-value from the model in part (a) and refit. How does the R2 value and residual standard error of the new model compare to the original model? Repeat this procedure (remove the independent variable with the largest remaining p-value and refit). What has happened to the R2 value and residual standard error now? Does anything peculiar happen to the p-values of the remaining predictors? Explain the discrepancy between the R2 values and the residual standard errors of all three models.

c) (5 points) Perform four simple regressions with total SAT score as the dependent variable but use expend, salary, ratio, and takers as the independent variables for the first, second, third, and fourth regressions respectively. Perform a multiple regression using expend, salary, and ratio as the independent variables. What conclusions can you draw from these five regressions and why?

d) (10 points) Which of the models that you constructed in parts (a) thru (c) would you recommend? Hint: Examine the residuals for the different models. In particular, check the constant variance assumption for the errors and the normality assumption for errors. Is linearity an appropriate assumption? Provide commentary and any relevant plots. Are there any outliers? If so, what are they? What conclusions can you infer from the regression model?

4. Housing Prices (25 points)

A national homebuilder builds single-family homes and condominium-style townhouses. Quiz3.xls contains a worksheet of data on the selling price, lot cost, type of home, and region of the country (M=Midwest, S=South) for closings during one month.

a) (10 points) Develop a multiple regression model for sales price as a function of lot cost, region of country, and type of home.

b) (5 points) What is the 99% confidence interval on the coefficient of lot cost? At α = 0.05, can you reject the null hypothesis that the intercept is zero?

c) (5 points) What are the slopes of the independent variables? By examining p-values, what can you say about statistical significance? Are any of these slopes surprising? Why or why not?

d) (5 points) Use the pivot table tool to construct tables for performing an analysis of the residuals. What conclusions can you draw from your analysis of residuals?

5. Planting Strategy (20 Points)

Your family farm consists of 2000 acres of land. You must decide what to plant this next Spring, and you can plant either corn or soybeans (or a combination thereof). Corn and soybean seeds can be purchased in either treated or untreated form, with the difference being that treated seeds germinate better in cold, moist Spring weather, while untreated seeds cost less and have higher yields (provided that the Spring weather is warm and dry enough to provide proper germination) if the Summer is very hot. You have identified four possible weather patterns for the upcoming growing season (denoted cw, ch, ww, and wh respectively) corresponding to cold Spring-warm Summer, cold Spring-hot Summer, warm Spring-warm Summer, warm Spring-hot Summer. The per acre yields (in bushels) are given in the following table:

| |cw |ch |ww |wh |seed price/# |

|treated beans |85 |95 |90 |100 |$3.50 |

|untreated beans |50 |60 |100 |120 |$3 |

|treated corn |150 |175 |160 |190 |$2.50 |

|untreated corn |75 |100 |175 |225 |$2 |

Beans require 50 pounds of seeds per acre while corn requires 30 pounds per acre. Corn requires more fertilizer than beans, and the additional fertilization costs incurred will be $5 per acre. The sales price per bushel of beans and corn do not depend on whether you planted treated or untreated seed, but do depend upon the weather which affects total crop yields. The estimated prices at which you can sell your harvest are:

| |cw |ch |ww |wh |

|beans |$9.00 |$8.75 |$8.50 |$7.00 |

|corn |$6.00 |$5.50 |$5.75 |$4.00 |

a. (2 points) Compute the net profit per acre (for each variety of seed & each possible weather pattern) to fill out the following table.

| |cw |ch |ww |wh |

|treated beans | | | | |

| | | | | |

|untreated beans | | | | |

|treated corn | | | | |

| | | | | |

|untreated corn | | | | |

| | | | | |

b. (5 points) You wish to formulate the problem of maximizing your certain or guaranteed net profit as a linear programming problem. That is, although you may make more than this guaranteed profit, you cannot earn less regardless of what the weather does. In doing so, you identify 4 decision variables defined as follows:

BT The # of acres planted in treated beans

BU The # of acres planted in untreated beans

CT The # of acres planted in treated corn

CU The # of acres planted in untreated corn

Using the information from part a, write out 4 equations that give the contingent net profit earned in terms of the 4 decision variables identified immediately above (contingent upon each of the four different weather scenarios). Hint: one equation for each scenario.

c. (5 points) Your model will also need a variable to represent the guaranteed or certain profit. Let Z denote the guaranteed profit. What is (are) the relationship(s) between Z and the contingent net profits identified in part b? Hint: try to represent these relationships as constraints.

d. (4 points) Using your answers from parts a, b, and c, as well as any other information you need from the problem statement, write out the complete specification of the linear programming model.

e. (4 points) Solve the linear programming problem using Excel.

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download