Chapter 7, Independent Sampling and Comparing Two Means



Now we’ll begin looking at populations in pairs – or at least, data from two or more samples that were collected independently from one another. Note that two different n’s are possible…we require independent sampling and that both samples be “large”.

[Large samples – n ≥ 30]

The relevant statistic is the difference of the sample means:

$\bar{x}_1 - \bar{x}_2$

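This difference is distributed $N\left(\mu_1 - \mu_2,\ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right)$, where the second entry is the standard deviation.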

We’ll use a new sample standard deviation:

$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

yes, learn this by heart

We’ll use it to form confidence intervals:

$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ using the exact same z’s as before

Hypothesis tests have the same 7 steps and the same probabilities of a Type 1 error.

For example:

A restaurant has changed its menu so that most of the meals now offered cost less than the meals it used to offer. The new meals are also less expensive to prepare and the owner of the restaurant wants to compare the mean net daily income obtained with the lower priced menu to the previous mean net daily incomes for the earlier period. A random sample of 50 net daily incomes for the earlier period is selected and a second random sample of 30 net daily incomes for the current menu is selected.

So: Higher Priced Meals:

[pic]

Lower Priced Meals:

[pic]

Find a 95% Confidence Interval for the difference in mean net daily incomes obtained:

Formula: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

z score: 1.96

difference in the means: 19

combined standard deviation:

hint: $\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \approx 11.28$

Confidence interval: (−3.11, 41.11)
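
The arithmetic above can be checked with a short script. This is a minimal sketch that uses only the summary values quoted in the example (difference in the means 19, combined standard deviation 11.28). The helper name `ci_from_summary` is made up for illustration, and the z value comes from Python's standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist


def ci_from_summary(diff, std_err, confidence=0.95):
    """Large-sample CI for mu1 - mu2 from a difference of sample means
    and its combined standard deviation (hypothetical helper)."""
    # Two-sided z value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * std_err
    return diff - margin, diff + margin


# Values taken from the restaurant example above.
low, high = ci_from_summary(diff=19, std_err=11.28)
print(f"({low:.2f}, {high:.2f})")
```

The lower endpoint comes out negative, so the interval straddles 0.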

This is a bit of a disaster, of course…we’d hoped that the second average is truly more than the first one…i.e. more daily income now than before…but this confidence interval allows for all three possibilities: the second mean is bigger than the first, the two are equal, or the first is bigger than the second…and we don’t know where in the interval the true difference lies.

When a confidence interval for the difference of means contains 0, you’re in a difficult position. Ideally you’d have all positive values (μ1 > μ2) or all negative values (μ1 < μ2).

Rej Reg … z > 1.96

Dec cannot reject Ho.
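
The decision can be reproduced from the numbers in the confidence-interval example. This is a minimal sketch, assuming the test is the two-tailed large-sample z test of Ho: μ1 − μ2 = 0 at α = .05 (the setup implied by the 1.96 cutoff); the variable names are illustrative.

```python
from statistics import NormalDist

diff = 19          # difference in the sample means, from the example
std_err = 11.28    # combined standard deviation, from the example
alpha = 0.05       # assumed significance level (matches the 1.96 cutoff)

# Test statistic under Ho: mu1 - mu2 = 0.
z = (diff - 0) / std_err

# Two-tailed critical value, about 1.96.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject = abs(z) > z_crit
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject Ho: {reject}")
```

The test statistic is about 1.68, short of 1.96, so the test agrees with the confidence interval: neither rules out a difference of 0.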

What does this mean for our restaurant?

Maybe the new menu makes no change in daily net income at all.

Hopefully you’re not more than about half a year into the new menu.

Let’s interpret a confidence interval:

An experiment was conducted to compare two methods of teaching spelling to children, Method 1 and Method 2. The objective of the experiment is to estimate the difference between μ1, the mean score on a standardized spelling test for all children taught by Method 1, and μ2, the mean score on the same test for all children taught by Method 2.

Once the statistics were run, a 99% confidence interval for the difference was found to be (4, 10).

What can we conclude from this?

Another example:

It is often believed that economic status is related to the commission of crimes. To test this out, a sociologist selected a random sample of 70 people who had never even been indicted for a crime, much less convicted, and he noted their annual incomes. Similarly, a random sample of 60 convicted criminals (each one a first time offender) was taken and their incomes noted.

Do the following data provide sufficient evidence at the α = .05 level that the mean income level of criminals prior to committing their first offense is lower than that of the noncriminal public? Income is in thousands of dollars.

| |mean |variance |
|criminals |13.3 |24.2 |
|noncriminals |15.4 |42.6 |

take a minute to note the overlap in the ranges:

So, are these different groups of people or not?

Ho $\mu_1 - \mu_2 = 0$ (1 = criminals, 2 = noncriminals)

Ha

Why is this a less than test?

TS

Rej Reg z < −1.65

Decision
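
Once you have worked the test by hand, a few lines of code can confirm the numbers. This sketch uses the table values and assumes group 1 is the criminals and group 2 the noncriminals, so that Ha is μ1 − μ2 < 0. It uses the exact normal cutoff (about −1.645) rather than the rounded −1.65, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Summary data from the table (incomes in thousands of dollars).
n1, mean1, var1 = 60, 13.3, 24.2   # criminals
n2, mean2, var2 = 70, 15.4, 42.6   # noncriminals
alpha = 0.05

# Large-sample standard deviation of the difference of sample means.
std_err = sqrt(var1 / n1 + var2 / n2)

# Test statistic under Ho: mu1 - mu2 = 0.
z = (mean1 - mean2) / std_err

# Lower-tail critical value for a "less than" test.
z_crit = NormalDist().inv_cdf(alpha)

reject = z < z_crit
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject Ho: {reject}")
```

Note that the table gives variances, not standard deviations, so they go into the formula without squaring.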

OK. What about independently taken small samples?

Well, small samples mean additional assumptions:

the original populations must be approximately normal (recall section 4.6)

the original population variances must be EQUAL

the samples are taken independently

This means that you have some work to do prior to taking your samples…or some explaining to do in your decision.

We will use a pooled sample estimator for variance:

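$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

We use this pooled variance in the sample standard deviation:

$\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$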

This is used in confidence intervals and in hypothesis testing (which uses a t distribution).

For example:

A local TV station wants to find out if sports events or first run movies attract more viewers. A random sample of 13 evenings that had major sports events were chosen and 15 evenings with first run movies were sampled…independently. Note that the station knows that the viewer populations are normally distributed and that the variances for both audiences are the same (this from data gathered at other times).

Do the following data provide sufficient evidence, at the α = .01 level, to indicate a difference in mean number of viewers for sports and movies?

|sports |movies |
|n = 13 |n = 15 |
|mean = 6.8 million |mean = 5.3 million |
|std dev = 1.8 million |std dev = 1.6 million |

Ho $\mu_1 - \mu_2 = 0$

Ha $\mu_1 - \mu_2 \neq 0$

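Calculate the pooled variance:

$s_p^2 = \frac{(13 - 1)(1.8)^2 + (15 - 1)(1.6)^2}{13 + 15 - 2}$ you should get 2.87

TS $t = \frac{(6.8 - 5.3) - 0}{\sqrt{s_p^2\left(\frac{1}{13} + \frac{1}{15}\right)}}$ you should get 2.34

Rej Reg based on $n_1 + n_2 - 2 = 26$ dof

t > 2.779 or t < −2.779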

Decision

Would it help if we backed off to an alpha of .05?
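
The whole small-sample example can be checked numerically. This is a sketch under the example's stated assumptions (normal populations, equal variances, independent samples). The critical values are hard-coded from a t table for 26 degrees of freedom, because Python's standard library has no t distribution, and the variable names are illustrative.

```python
from math import sqrt

# Summary data from the table (millions of viewers).
n1, mean1, sd1 = 13, 6.8, 1.8   # sports
n2, mean2, sd2 = 15, 5.3, 1.6   # movies

dof = n1 + n2 - 2

# Pooled estimate of the common variance.
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / dof

# Standard deviation of the difference of sample means.
std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))

# Test statistic under Ho: mu1 - mu2 = 0.
t = (mean1 - mean2) / std_err

# Two-tailed critical values for 26 dof, copied from a t table.
t_crit = {0.01: 2.779, 0.05: 2.056}

print(f"dof = {dof}, pooled variance = {pooled_var:.2f}, t = {t:.2f}")
for alpha, cutoff in t_crit.items():
    print(f"alpha = {alpha}: reject Ho: {abs(t) > cutoff}")
```

With these cutoffs the test statistic falls inside the acceptance region at α = .01 but outside it at α = .05.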
