Ch 12 # 42 - JustAnswer



Ch 12 # 42

Martin Motors has in stock three cars of the same make and model. The president would

like to compare the gas consumption of the three cars (labeled car A, car B, and car C)

using four different types of gasoline. For each trial, a gallon of gasoline was added to

an empty tank, and the car was driven until it ran out of gas. The following table shows

the number of miles driven in each trial.

Distance (miles)

Types of Gasoline    Car A   Car B   Car C
Regular              22.4    20.8    21.5
Super regular        17.0    19.4    20.7
Unleaded             19.2    20.2    21.2
Premium unleaded     20.3    18.6    20.4

Using the .05 level of significance:

a. Is there a difference among types of gasoline?

b. Is there a difference in the cars?

SEE EXCEL
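
The spreadsheet itself is not reproduced here. As a rough cross-check, the two-way ANOVA without replication (gasoline type as the treatment, car as the block) can be sketched in plain Python. The variable names and the hardcoded critical values (standard F-table entries at the .05 level) are assumptions of this sketch, not taken from the spreadsheet.

```python
# Two-way ANOVA without replication: gasoline type = treatment, car = block.
# Data come from the table in the problem statement. The critical values are
# standard .05 F-table entries, hardcoded here as an assumption.
miles = {
    "Regular":          [22.4, 20.8, 21.5],
    "Super regular":    [17.0, 19.4, 20.7],
    "Unleaded":         [19.2, 20.2, 21.2],
    "Premium unleaded": [20.3, 18.6, 20.4],
}

rows = list(miles.values())
k = len(rows)        # 4 gasoline types
b = len(rows[0])     # 3 cars
n = k * b
grand_mean = sum(sum(r) for r in rows) / n

# Sums of squares: total, treatments (gasoline), blocks (cars), error
ss_total = sum((x - grand_mean) ** 2 for r in rows for x in r)
ss_gas = b * sum((sum(r) / b - grand_mean) ** 2 for r in rows)
ss_cars = k * sum((sum(col) / k - grand_mean) ** 2 for col in zip(*rows))
ss_error = ss_total - ss_gas - ss_cars

ms_gas = ss_gas / (k - 1)
ms_cars = ss_cars / (b - 1)
ms_error = ss_error / ((k - 1) * (b - 1))

f_gas = ms_gas / ms_error
f_cars = ms_cars / ms_error

F_CRIT_GAS = 4.76    # F(.05; df = 3, 6)
F_CRIT_CARS = 5.14   # F(.05; df = 2, 6)

print(f"gasoline: F = {f_gas:.3f}, reject H0: {f_gas > F_CRIT_GAS}")
print(f"cars:     F = {f_cars:.3f}, reject H0: {f_cars > F_CRIT_CARS}")
```

With these data both F statistics come out below their critical values, so the sketch does not reject either null hypothesis at the .05 level.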

Ch 13 #37

A regional commuter airline selected a random sample of 25 flights and found that the

correlation between the number of passengers and the total weight, in pounds, of luggage

stored in the luggage compartment is 0.94. Using the .05 significance level, can we

conclude that there is a positive association between the two variables?

H0: ρ ≤ 0

Ha: ρ > 0

 

Critical value, from a t table, one tail, with df = n-2 = 23:

1.7139

 

Test statistic:

r*sqrt((n-2)/(1-r^2))

0.94*sqrt(23/(1-0.94^2))

= 13.213

 

The test statistic is MUCH higher than the critical value of 1.7139, so we reject the null hypothesis and conclude that there is a positive correlation between the number of passengers and the weight of the luggage.
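
The arithmetic for the test statistic can be checked with a few lines of Python. This is only a sketch; the function name `correlation_t` is made up for illustration, and the critical value is the table value quoted in the answer.

```python
import math

def correlation_t(r: float, n: int) -> float:
    """t statistic for testing a correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

t_stat = correlation_t(0.94, 25)
T_CRIT = 1.7139  # one-tailed, df = 23, alpha = .05 (table value)

print(f"t = {t_stat:.3f}, reject H0: {t_stat > T_CRIT}")
```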

#40

A suburban hotel derives its gross income from its hotel and restaurant operations. The

owners are interested in the relationship between the number of rooms occupied on a

nightly basis and the revenue per day in the restaurant. Below is a sample of 25 days

(Monday through Thursday) from last year showing the restaurant income and number of

rooms occupied.

Day Income Occupied Day Income Occupied

1 $1,452 23 14 $1,425 27

2 1,361 47 15 1,445 34

3 1,426 21 16 1,439 15

4 1,470 39 17 1,348 19

5 1,456 37 18 1,450 38

6 1,430 29 19 1,431 44

7 1,354 23 20 1,446 47

8 1,442 44 21 1,485 43

9 1,394 45 22 1,405 38

10 1,459 16 23 1,461 51

11 1,399 30 24 1,490 61

12 1,458 42 25 1,426 39

13 1,537 54

Use a statistical software package to answer the following questions.

a. Does the breakfast revenue seem to increase as the number of occupied rooms

increases? Draw a scatter diagram to support your conclusion.

b. Determine the coefficient of correlation between the two variables. Interpret the value.

c. Is it reasonable to conclude that there is a positive relationship between revenue and

occupied rooms? Use the .10 significance level.

d. What percent of the variation in revenue in the restaurant is accounted for by the number of rooms occupied?

SEE EXCEL
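
The spreadsheet is not included here. As a hedged sketch of parts b through d, the correlation, the test statistic, and the coefficient of determination can be computed directly from the 25 observations. The variable names are assumptions of this sketch, and the critical value is a standard t-table entry (one-tailed, df = 23, .10 level).

```python
import math

# Restaurant income (dollars) and rooms occupied, days 1 through 25
income = [1452, 1361, 1426, 1470, 1456, 1430, 1354, 1442, 1394, 1459,
          1399, 1458, 1537, 1425, 1445, 1439, 1348, 1450, 1431, 1446,
          1485, 1405, 1461, 1490, 1426]
rooms = [23, 47, 21, 39, 37, 29, 23, 44, 45, 16,
         30, 42, 54, 27, 34, 15, 19, 38, 44, 47,
         43, 38, 51, 61, 39]

n = len(income)
mean_x = sum(rooms) / n
mean_y = sum(income) / n

# Sums of squares and cross-products about the means
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(rooms, income))
sxx = sum((x - mean_x) ** 2 for x in rooms)
syy = sum((y - mean_y) ** 2 for y in income)

r = sxy / math.sqrt(sxx * syy)                      # part b
t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))      # part c
r_squared = r ** 2                                  # part d

T_CRIT = 1.319  # one-tailed, df = 23, alpha = .10 (table value)

print(f"r = {r:.3f}, t = {t_stat:.3f}, R^2 = {r_squared:.1%}")
print("positive relationship supported:", t_stat > T_CRIT)
```

Run on these data, the sketch gives a correlation of roughly 0.42, which is a moderate positive relationship, and a t statistic above the critical value.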

Ch 14 #17

The district manager of Jasons, a large discount electronics chain, is investigating why

certain stores in her region are performing better than others. She believes that three factors are related to total sales: the number of competitors in the region, the population in the surrounding area, and the amount spent on advertising. From her district, consisting

of several hundred stores, she selects a random sample of 30 stores. For each store she

gathered the following information.

Y= total sales last year in thousands

X1= number of competitors in the region.

X2= population of the region in millions

X3= advertising expense in thousands

The sample data were run on MINITAB, with the following results.

Analysis of variance

SOURCE DF SS MS

Regression 3 3050.00 1016.67

Error 26 2200.00 84.62

Total 29 5250.00

Predictor Coef StDev t-ratio

Constant 14.00 7.00 2.00

X1 -1.00 0.70 -1.43

X2 30.00 5.20 5.77

X3 0.20 0.08 2.50

a. What are the estimated sales for the Bryne store, which has four competitors, a

regional population of 0.4 (400,000), and advertising expense of 30 ($30,000)?

b. Compute the R^2 value.

c. Compute the multiple standard error of estimate.

d. Conduct a global test of hypothesis to determine whether any of the regression coefficients

are not equal to zero. Use the .05 level of significance.

e. Conduct tests of hypotheses to determine which of the independent variables have

significant regression coefficients. Which variables would you consider eliminating?

Use the .05 significance level.

a. What are the estimated sales for the Bryne store, which has four competitors, a

regional population of 0.4 (400,000), and advertising expense of 30 ($30,000)?

Use the regression:

14 - 1*4 + 30*0.4 + 0.2*30

= 28

So the estimated sales would be $28,000, since the regression gives the value in thousands.

b. Compute the value of r^2.

ssr/sst

= 3050/5250

= 0.58095

c. Compute the multiple standard error of estimate.

sqrt(mse)

= sqrt(84.62)

= 9.1989
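
Parts a through c can be reproduced from the MINITAB output with a short script. This is a minimal sketch; `predict_sales` and the other names are hypothetical helpers, not part of the original answer.

```python
import math

# Coefficients from the MINITAB output
CONSTANT, B1, B2, B3 = 14.00, -1.00, 30.00, 0.20

def predict_sales(competitors, population_millions, advertising_thousands):
    """Estimated total sales, in thousands of dollars."""
    return (CONSTANT
            + B1 * competitors
            + B2 * population_millions
            + B3 * advertising_thousands)

# Part a: Bryne store
bryne = predict_sales(4, 0.4, 30)

# Parts b and c: values read from the ANOVA table
ss_reg, ss_err, ss_total = 3050.00, 2200.00, 5250.00
df_err = 26

r_squared = ss_reg / ss_total
std_error = math.sqrt(ss_err / df_err)   # sqrt(MSE)

print(f"estimated sales: {bryne:.1f} thousand")
print(f"R^2 = {r_squared:.5f}, standard error = {std_error:.4f}")
```

The standard error differs from 9.1989 only in the fourth decimal place, because the script divides 2200 by 26 instead of using the rounded MSE of 84.62.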

d. Conduct a global test of hypothesis to determine whether any of the regression coefficients

are not equal to zero. Use the .05 level of significance.

H0: all regression coefficients are zero (β1 = β2 = β3 = 0)

Ha: at least one regression coefficient is not zero

F = 1016.67/84.62

= 12.01454

The critical value, from an F table, with df = 3, 26 is: 2.975

The test statistic is much higher, so we reject the null.

At least one coefficient is non zero.

e. Conduct tests of hypotheses to determine which of the independent variables have

significant regression coefficient. Which variables would you consider eliminating? Use the .05 significance level.

Each hypothesis looks like this:

H0: coefficient for Xn is 0

Ha: coefficient for Xn is non zero

The critical t value, from a table with df = 26, is +/- 2.056

The t value for X1 (-1.43) is not outside that range, so we don't reject.

The t values for X2 (5.77) and X3 (2.50) are outside, so we reject.

X2 and X3 are significant.

Therefore, we could consider removing X1.
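
The global F test and the individual t tests for parts d and e can be sketched as follows. The critical values are the table values quoted in the answer; the dictionary layout and names are assumptions made for illustration.

```python
# Global test: F = MS regression / MS error, from the ANOVA table
ms_reg, ms_err = 1016.67, 84.62
f_stat = ms_reg / ms_err
F_CRIT = 2.975   # F(.05; df = 3, 26), table value

# Individual tests: t = coefficient / standard deviation of the coefficient
coefs = {"X1": (-1.00, 0.70), "X2": (30.00, 5.20), "X3": (0.20, 0.08)}
T_CRIT = 2.056   # two-tailed, df = 26, alpha = .05, table value

t_ratios = {name: coef / sd for name, (coef, sd) in coefs.items()}
significant = [name for name, t in t_ratios.items() if abs(t) > T_CRIT]

print(f"F = {f_stat:.3f}, reject global H0: {f_stat > F_CRIT}")
for name, t in t_ratios.items():
    print(f"{name}: t = {t:.2f}, significant: {abs(t) > T_CRIT}")
```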

#18

Suppose that the sales manager of a large automotive parts distributor wants to estimate

as early as April the total annual sales of a region. On the basis of regional sales,

the total sales for the company can also be estimated. If, based on past experience, it

is found that the April estimates of annual sales are reasonably accurate, then in future

years the April forecast could be used to revise production schedules and maintain the

correct inventory at the retail outlets. Several factors appear to be related to sales, including the number of retail outlets in the region stocking the company’s parts, the number of automobiles in the region registered as of April 1, and the total personal income for the first quarter of the year. Five independent variables were finally selected as being the most important (according to the sales manager). Then the data were gathered for a recent year. The total annual sales for that year for each region were also recorded. Note in the following table that for region 1 there were 1,739 retail outlets stocking the company’s automotive parts, there were 9,270,000 registered automobiles in the region as of April 1 and so on. The sales for that year were $37,702,000.

Annual Sales   # of Retail   # Autos Registered   Personal Income   Avg Age   # of
($ millions)   Outlets       (millions)           ($ billions)      (years)   Supervisors
Y              X1            X2                   X3                X4        X5

37.702         1,739          9.27                85.4              3.5        9.0
24.196         1,221          5.86                60.7              5.0        5.0
32.055         1,846          8.81                68.1              4.4        7.0
 3.611           120          3.81                20.2              4.0        5.0
17.625         1,096         10.31                33.8              3.5        7.0
45.919         2,290         11.62                95.1              4.1       13.0
29.600         1,687          8.96                69.3              4.1       15.0
 8.114           241          6.28                16.3              5.9       11.0
20.116           649          7.77                34.9              5.5       16.0
12.994         1,427         10.92                15.1              4.1       10.0

a. Consider the following correlation matrix. Which single variable has the strongest correlation with the dependent variable? The correlations between the independent variables outlets and income and between cars and outlets are fairly strong. Could this

be a problem? What is this condition called?

           sales    outlets   cars     income   age
outlets    0.899
cars       0.605    0.775
income     0.964    0.825     0.409
age       -0.323   -0.489    -0.447   -0.349
bosses     0.286    0.183     0.395    0.155    0.291

Income has the strongest correlation with sales, the dependent variable (0.964).

The high correlations between the independent variables (0.825 for outlets and income, 0.775 for cars and outlets) could be a problem, because correlated predictors make the individual coefficients unreliable. This condition is called multicollinearity.

b. The output for all five variables is on the following page. What percent of the variation

is explained by the regression equation?

The regression equation is

sales = -19.7 - 0.00063 outlets + 1.74 cars + 0.410 income + 2.04 age - 0.034 bosses

Predictor Coef StDev t-ratio

Constant -19.672 5.422 -3.63

outlets -0.000629 0.002638 -0.24

cars 1.7399 0.5530 3.15

income 0.40994 0.04385 9.35

age 2.0357 0.8779 2.32

bosses -0.0344 0.1880 -0.18

Analysis of Variance

SOURCE DF SS MS

Regression 5 1593.81 318.76

Error 4 9.08 2.27

Total 9 1602.89

We have to calculate the R^2 value:

R^2 = SSReg/SSTotal

= 1593.81/1602.89

= 0.9943, so about 99.43% of the variation is explained by the regression equation.

c. Conduct a global test of hypothesis to determine whether any of the regression coefficients are not zero. Use the .05 significance level.

H0: all regression coefficients are zero

Ha: at least one regression coefficient is not zero

F = 318.76/2.27

= 140.4229

The critical value, from an F table with df = 5, 4, is: 6.26

The test statistic is much higher, so we reject the null.

At least one coefficient is non zero.

d. Conduct a test of hypothesis on each of the independent variables. Would you consider

eliminating “outlets” and “bosses”? Use the .05 significance level.

Each hypothesis looks like this:

H0: coefficient for Xn is 0

Ha: coefficient for Xn is non zero

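The critical t value, from a table with df = 4, is +/- 2.776

The t-ratios for outlets (-0.24) and bosses (-0.18) are well inside that range (don't reject the null), so we can consider removing them. The t-ratio for age (2.32) is also inside the range, so age is not significant at the .05 level in this model, although it is much closer to the cutoff. Cars (3.15) and income (9.35) are outside the range (reject the null), so they are significant.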
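
The comparisons for parts c and d can be checked mechanically against the printed output. This is a sketch only: the t-ratios are copied from the MINITAB table, and the critical values are standard table entries (F with df = 5, 4 and two-tailed t with df = 4, both at the .05 level).

```python
# Part c: global F test from the ANOVA table
f_stat = 318.76 / 2.27
F_CRIT = 6.26     # F(.05; df = 5, 4), table value

# Part d: t-ratios as printed in the regression output
t_ratios = {
    "outlets": -0.24,
    "cars": 3.15,
    "income": 9.35,
    "age": 2.32,
    "bosses": -0.18,
}
T_CRIT = 2.776    # two-tailed, df = 10 - 5 - 1 = 4, alpha = .05

not_significant = [v for v, t in t_ratios.items() if abs(t) <= T_CRIT]
significant = [v for v, t in t_ratios.items() if abs(t) > T_CRIT]

print(f"F = {f_stat:.2f}, reject global H0: {f_stat > F_CRIT}")
print("inside the critical range:", not_significant)
print("outside the critical range:", significant)
```

With only 4 error degrees of freedom the cutoff is wide, which is why a t-ratio as large as 2.32 still falls inside the range.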

e. The regression has been rerun below with “outlets” and “bosses” eliminated. Compute

the coefficient of determination. How much has R2 changed from the previous analysis?

The regression equation is

sales = -18.9 + 1.61 cars +0.400 income +1.96 age

Predictor Coef StDev t-ratio

Constant -18.924 3.636 -5.20

cars 1.6129 0.1979 8.15

income 0.40031 0.01569 25.52

age 1.9637 0.5846 3.36

Analysis of Variance

SOURCE DF SS MS

Regression 3 1593.66 531.22

Error 6 9.23 1.54

Total 9 1602.89

Now we can get the new r^2:

1593.66/1602.89

= 99.42%

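It went down by about 0.01 percentage points (from 99.43% to 99.42%), so dropping outlets and bosses costs almost nothing in explanatory power.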
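
The two coefficients of determination can be compared side by side from the ANOVA tables. A minimal sketch, with variable names chosen for illustration:

```python
ss_total = 1602.89            # same in both runs

ss_reg_full = 1593.81         # five-variable model
ss_reg_reduced = 1593.66      # outlets and bosses removed

r2_full = ss_reg_full / ss_total
r2_reduced = ss_reg_reduced / ss_total
drop_points = (r2_full - r2_reduced) * 100   # change in percentage points

print(f"full:    {r2_full:.2%}")
print(f"reduced: {r2_reduced:.2%}")
print(f"drop:    {drop_points:.4f} percentage points")
```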

f. Following is a histogram and a stem-and-leaf chart of the residuals. Does the normality

assumption appear reasonable?

Histogram of residual N=10

Midpoint   Count
  -1.5       1     *
  -1.0       1     *
  -0.5       2     **
  -0.0       2     **
   0.5       2     **
   1.0       1     *
   1.5       1     *

Stem-and-leaf of residual N=10
Leaf Unit = 0.10

  1   -1   7
  2   -1   2
  2   -0
  5   -0   440
  5    0   24
  3    0   68
  1    1
  1    1   7

There are no outliers in this chart, and it looks bell shaped (normal). So the normality assumption appears reasonable.

Ch. 17 #22

Banner Mattress and Furniture Company wishes to study the number of credit applications received per day for the last 300 days. The information is reported on the next page.

Number of Credit Applications Frequency in Days

0 50

1 77

2 81

3 48

4 31

5 or more 13

To interpret, there were 50 days on which no credit applications were received, 77 days

on which only one application was received, and so on. Would it be reasonable to conclude that the population distribution is Poisson with a mean of 2.0? Use the .05 significance level. Hint: To find the expected frequencies use the Poisson distribution with a mean of 2.0. Find the probability of exactly one success given a Poisson distribution with a mean of 2.0. Multiply this probability by 300 to find the expected frequency for the number of days in which there was exactly one application. Determine the expected frequency for the other days in a similar manner.

H0: distribution is Poisson with mean of 2.0

Ha: distribution is not Poisson with mean of 2.0

The critical value is 11.07, with df = 6-1 = 5

The expected values for each of the 6 categories, based on the Poisson, are:

40.6

81.2

81.2

54.1

27.1

15.8

The chi square value is:

sum ( (observed - expected)^2 / expected )

 

So:

(50-40.6)^2/40.6 + (77-81.2)^2/81.2 + (81-81.2)^2/81.2 + (48-54.1)^2/54.1 + (31-27.1)^2/27.1 + (13-15.8)^2/15.8

= about 4.1

The statistic is below the critical value of 11.07, so we don't reject H0. The data are consistent with a Poisson distribution with a mean of 2.0.
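
The expected frequencies and the chi-square statistic can be reproduced with the standard library alone. This is a sketch under the same setup as the answer (six categories, df = 5); the helper `poisson_pmf` is written out here for illustration. Because it uses unrounded expected frequencies, the statistic may differ from the hand calculation in the second decimal place.

```python
import math

observed = [50, 77, 81, 48, 31, 13]   # days with 0, 1, 2, 3, 4, 5-or-more applications
n_days = sum(observed)
mean = 2.0

def poisson_pmf(k: int, mu: float) -> float:
    """Probability of exactly k events for a Poisson distribution with mean mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)

# Probabilities for 0 through 4, then the remainder for "5 or more"
probs = [poisson_pmf(k, mean) for k in range(5)]
probs.append(1 - sum(probs))

expected = [n_days * p for p in probs]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

CHI2_CRIT = 11.07   # chi-square table value, df = 5, alpha = .05

print([round(e, 1) for e in expected])
print(f"chi-square = {chi2:.2f}, reject H0: {chi2 > CHI2_CRIT}")
```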

 
