Chapter 11 2nd Edition - Faculty Websites



Basic Form of a Confidence Interval for a Parameter:

Let’s Do It! 11.1

The Goodness of Dark Chocolate

An Ann Arbor News article (June 8, 2004) entitled “Dark Chocolate may help blood flow” reported the results of a study in which researchers fed a small 1.6-ounce bar of dark chocolate to each of 22 volunteers daily for two weeks. Half of the subjects were randomly selected and assigned to receive bars containing dark chocolate’s typically high levels of flavonoids, and the other half received placebo bars with just trace amounts of flavonoids.

(a) Why was a placebo used as a control in this experiment?

(b) Randomization was used to divide the 22 subjects into two groups of 11 each, resulting in two independent sets of responses. Design a random allocation scheme and use your calculator (seed value = 50) or a random number table (Row 6, Column 6) to carry out the randomization.

(c) Let μ1 represent the population mean improvement in blood flow for those on the high-flavonoid diet and μ2 represent the population mean improvement in blood flow for those on the placebo diet. The researchers tested the theory that the high-flavonoid group would have a higher mean improvement in blood flow. How would you express the null and alternative hypotheses in terms of μ1 and μ2? Remember that the researchers’ theory should be expressed in the alternative hypothesis.

[pic]

[pic]

(d) The article stated that the ability of the brachial artery to dilate significantly improved for those in the high-flavonoid group compared to those in the placebo group. A significance level of 0.05 was used. Based on the statements, what can you say about the p-value? Clearly circle your answer:

[pic] [pic] Can’t tell

(e) The researchers also found that concentrations of the cocoa flavonoid epicatechin soared in blood samples taken from the group that received the high-flavonoid chocolate, rising from a baseline of 25.6 nmol/L to 204.4 nmol/L. In the group that received the low-flavonoid chocolate, concentrations of epicatechin decreased slightly, from a baseline of 17.9 nmol/L to 17.5 nmol/L. The mean improvement for the high-flavonoid group of 204.4 - 25.6 = 178.8 nmol/L is a (clearly circle all correct answers):

parameter statistic sample mean population mean
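One possible random allocation scheme for part (b) can be sketched in Python. This is only an illustration: the subject labels 1 to 22 and the function name `allocate` are assumptions, and Python's random number generator is not the calculator's, so a seed of 50 here will not reproduce the calculator's or the table's allocation.

```python
import random

def allocate(subject_ids, seed):
    """Randomly split the subjects into two equal-sized groups."""
    rng = random.Random(seed)      # fixed seed so the allocation can be reproduced
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)          # put the labels in random order
    half = len(shuffled) // 2
    # first half of the shuffled labels -> group 1, remainder -> group 2
    return sorted(shuffled[:half]), sorted(shuffled[half:])

subjects = range(1, 23)            # hypothetical labels for the 22 volunteers
high_flavonoid, placebo = allocate(subjects, seed=50)
print("High-flavonoid group:", high_flavonoid)
print("Placebo group:       ", placebo)
```

Shuffling the full list of labels and cutting it in half guarantees two groups of exactly 11, which a coin flip per subject would not.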

11.2 Paired Samples versus Independent Samples

Paired Design 11.1

[pic]

With paired data, we are interested in comparing the responses within each pair. We will analyze the differences of the responses that form each pair.

Paired Data: Response = Annual Salary (in $1000s)

[pic]

Paired Design 11.2

[pic]

DEFINITION:

We have paired or matched samples when we know, in advance, that an observation in one data set is directly related to a specific observation in the other data set. It may be that the related sets of units are each measured once (Paired Design 11.1), or that the same unit is measured twice (Paired Design 11.2). In a paired design, the two sets of data must have the same number of observations.

Independent Samples Design 11.3

[pic]

Independent Samples Data:

Response = Annual Salary (in $1000s)

[pic]

In the two independent samples scenario, we will compare the responses of one treatment group as a whole to the responses of the other treatment group as a whole. We will calculate summary measures for the observations from one treatment group and compare them to similar summary measures calculated from the observations from the other treatment group.

DEFINITION:

We have two independent samples when two unrelated sets of units are measured, one sample from each population, as in Independent Samples Design 11.3. In a design with two independent samples, although the same sample size is often preferable, the sample sizes might be different.

Let’s Do It! 11.2

Paired Samples versus Independent Samples

(a) Three hundred registered voters were selected at random, 30 from each of 10 midwestern counties, to participate in a study on attitudes about how well the president is performing his job. They were each asked to answer a short multiple-choice questionnaire and then they watched a 20-minute video that presented information about the job description of the president. After watching the video, the same 300 selected voters were asked to answer a follow-up multiple-choice questionnaire. The investigator of this study will have two sets of data: the initial questionnaire scores and the follow-up questionnaire scores. Is this a paired or independent samples design?

Circle one: Paired Independent

Explain:

(b) Thirty dogs were selected at random from those residing at the humane society last month. The 30 dogs were split at random into two groups. The first group of 15 dogs was trained to perform a certain task using a reward method. The second group of 15 dogs was trained to perform the same task using a reward-punishment method. The investigator of this study will have two sets of data: the learning times for the dogs trained with the reward method and the learning times for the dogs trained with the reward-punishment method. Is this a paired or independent samples design?

Circle one: Paired Independent

Explain:

Let’s Do It! 11.3

Design a Study

For each of the following research questions, briefly describe how you might design a study to address the question (discuss whether paired or independent samples would be obtained):

(a) Do freshmen students use the library to study more often than senior students?

(b) Do books cost more on average at the local bookstore or through ?

(c) Will taking summer school improve reading levels for Kindergarteners going into first grade?

11.3 Paired Samples

In a paired design, units in each pair are alike (in fact, they may be the same unit), whereas units in different pairs may be quite dissimilar.

[pic]

Since we are interested in the difference for each pair, the differences are what we analyze in paired designs.

Example 11.1 Weight Change

A study was conducted to estimate the mean weight change of a female adult who quits smoking. The weights of eight female adults before they stopped smoking and five weeks after they stopped smoking were recorded. The differences, computed as “after -before,” are given below.

|Subject |1 |2 |3 |4 |5 |6 |7 |8 |

|After |154 |181 |151 |120 |131 |130 |121 |128 |

|Before |148 |176 |153 |116 |129 |128 |120 |132 |

|Difference |6 |5 |-2 |4 |2 |2 |1 |-4 |

Here we have another example of a paired design.

(a) Compute the sample mean difference in weight.

(b) Compute the sample standard deviation of the differences.

(c) Compute the standard error of the mean difference and interpret this.

Solution

(a) The sample mean difference is [pic]=1.75 pounds. Note that the differences computed as “after - before” represent the weight gain for a subject. A positive value indicates weight gain and a negative value indicates a weight loss.

(b) The sample standard deviation is SD = 3.412 pounds.

(c) The standard error of the mean difference is:[pic].

We would estimate the average distance between the possible sample mean differences ([pic]values) and the population mean difference [pic] to be roughly 1.21 pounds.
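The three summary values in this solution can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

after = [154, 181, 151, 120, 131, 130, 121, 128]
before = [148, 176, 153, 116, 129, 128, 120, 132]

# Differences computed as "after - before", one per subject
diffs = [a - b for a, b in zip(after, before)]

d_bar = mean(diffs)              # sample mean difference
s_d = stdev(diffs)               # sample standard deviation (divides by n - 1)
se = s_d / sqrt(len(diffs))      # standard error of the mean difference

print(f"mean difference = {d_bar:.2f}")
print(f"SD of differences = {s_d:.3f}")
print(f"standard error = {se:.2f}")
```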

Example 11.2 Comparing Test Scores

A group of 10 randomly selected children of elementary school age among those in the Mankato County who were recently diagnosed with asthma was tested to see if a new children’s educational video is effective in increasing the children’s knowledge about asthma. A nurse gave the children an oral test containing questions about asthma before and after seeing the animated video. The test scores are given below:

|Child: |1 2 3 4 5 6 7 8 9 10 | |

|Before: |61 60 52 74 64 75 42 63 53 56 |Mean = 60 |

|After: |67 62 54 83 60 89 44 67 62 57 |Mean = 64.5 |

(a) Explain why we have paired data here and not two independent samples.

(b) We are interested in examining the differences in the scores for each child. Compute the differences and find the sample mean difference and the sample standard deviation of the differences.

(c) The researchers wish to assess if the data provide sufficient evidence to conclude that the mean score after viewing the educational video is significantly higher than the mean score before the viewing. The test will be conducted at the 5% level of significance. State the appropriate hypotheses to be tested in terms of the population mean difference in test scores [pic].

(d) What must be assumed about the population for this test to be valid? Make appropriate graphs to check that assumption.

(e) Compute the observed t-test statistic value.

(f) Find the corresponding p-value.

(g) State the decision and conclusion using a 5% significance level.

Solution

(a) Since we have two observations from the same child, we have paired data.

(b) The observed differences, computed here as “after - before,” are as follows:

|Child: |1 2 3 4 5 6 7 8 9 10 | |

|d = After - Before |6 2 2 9 -4 14 2 4 9 1 |Mean diff =4.5 |

The first observed difference is represented by d1 = 6, and the last difference, which is also positive, is represented by d10 = 1. The observed sample mean difference is [pic], which is our estimate of the unknown mean difference, [pic]. The observed sample standard deviation of the differences is [pic], which is our estimate of the unknown population standard deviation [pic].

(c) Since we defined our differences as diff = after - before, it is positive differences that would show some support that the video is effective in improving the mean test score. Thus the corresponding hypotheses to be tested are [pic] versus [pic].

(d) We must assume that the population of all differences follows a normal distribution. We could check this assumption by examining a histogram or making a QQplot of the differences. Both graphs below show no strong departures from normality.

(e) The observed t-test statistic is given by [pic].

This means we observed a sample mean difference that was about 2.78 standard errors above the hypothesized mean difference of zero.

Is this large enough (that is, far enough above zero) to reject the null hypothesis?

(f) The p-value is the probability of getting a test statistic as large as or larger than the observed test statistic of 2.78, computed under the null hypothesis model, which is a t-distribution with nine degrees of freedom.

[pic]

With the TI

1. Using the tcdf( function.

Using the tcdf( function on the TI we have

p-value =[pic]= tcdf(2.78, E99, 9) = 0.0107

2. Using the T-Test function under the STAT TESTS menu.

In the TESTS menu located under the STAT button, we select the 2:T-Test option. With the sample mean of 4.5, the sample standard deviation of 5.126, and the sample size of n = 10, we can use the Stats option of this test. The corresponding input and output screens are shown. Notice that the null or hypothesized value is zero.

[pic]

p-value =[pic]= 0.01077.

With Table IV

In Table IV we focus on the df = 9 row and find that our observed test statistic value of 2.78 falls between the values in the columns headed with 0.025 and 0.01. So we state that the p-value is between 0.01 and 0.025; that is,

0.01 < p-value = [pic] < 0.025.

MTB > Let 'diff' = 'after' - 'before'

MTB > TTest 0.0 'diff';

SUBC> Alternative 1.

TEST OF MU = 0.00 VS MU G.T. 0.00

N MEAN STDEV SE MEAN T P VALUE

diff 10 4.50 5.13 1.62 2.78 0.011

(g) Decision and Conclusion

Since our p-value is less than 0.05, at the [pic] significance level we would reject [pic], and conclude there is sufficient evidence to say that the mean score after viewing the educational video is significantly higher than the mean score before the viewing.
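The whole paired t-test of this example can also be reproduced in Python. The sketch below uses only the standard library, so the upper-tail area of the t-distribution is approximated by numerically integrating the t density with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` are illustrative, and the tail helper as written assumes a test statistic of zero or more.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 with Simpson's rule on [0, t]."""
    h = t / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # area to the right of 0 is 0.5; subtract the area between 0 and t
    return 0.5 - total * h / 3

before = [61, 60, 52, 74, 64, 75, 42, 63, 53, 56]
after = [67, 62, 54, 83, 60, 89, 44, 67, 62, 57]
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)
t_stat = (d_bar - 0) / (s_d / sqrt(n))     # null value is 0
p_value = t_upper_tail(t_stat, n - 1)      # one-sided, H1: mean difference > 0

print(f"t = {t_stat:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value <= 0.05 else "fail to reject H0")
```

The result should agree with the TI and Minitab values above up to rounding.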

Think about it:

Why Not?

Why can’t we simply reject the null hypothesis and support the alternative hypothesis, [pic], since the observed sample mean difference [pic] is greater than 0? If you repeated this study with another group of 10 randomly selected children, would the mean difference be 4.5 again?

Example 11.3 Comparing Two Burn Treatments

Two creams are available by prescription for treating moderate skin burns. A study to compare the effectiveness of the two creams is conducted using 15 patients with moderate burns on their arms. Two spots of the same size and degree of burn are marked on each patient’s arm. One of the two creams is selected at random and applied to the first spot, while the remaining spot is treated with the other cream. The number of days until the burn has healed is recorded for each spot. These data are provided with the difference in healing time (in days).

|Patient Number |1 |2 |3 |

|1. | | | |

|2. | | | |

|3. | | | |

|4. | | | |

|5. | | | |

|6. | | | |

|7. | | | |

|8. | | | |

|9. | | | |

|10. | | | |

Test if the pulse rate is significantly higher after jumping.

[pic] versus

[pic]

Compute the differences and perform the corresponding test. Provide the observed test statistic value, the p-value, give your decision using a 5% significance level, and state your conclusion in a well-written sentence.

Effect Size for the Paired t-test

DEFINITION:

The estimated effect size for a population mean difference is:

[pic]

where [pic] is the sample mean difference, [pic] is the sample standard deviation of the differences, and [pic] is the hypothesized mean difference which is generally 0.
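As a small illustration, the effect size for Example 11.2 can be computed as below. The formula image is missing from these notes, so this sketch assumes the usual standardized form, (sample mean difference - hypothesized value) divided by the sample standard deviation of the differences. The function name is illustrative.

```python
def paired_effect_size(mean_diff, sd_diff, null_value=0.0):
    """Estimated effect size, assuming the form (d_bar - mu_0) / s_D."""
    return (mean_diff - null_value) / sd_diff

# Summary values from Example 11.2: mean difference 4.5, SD of differences 5.126
es = paired_effect_size(4.5, 5.126)
print(f"estimated effect size = {es:.2f}")
```

Unlike the t-statistic, this quantity does not grow with the sample size, since it divides by the standard deviation rather than the standard error.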

11.4 Independent Samples: Comparing Means

[pic]

Population 1 mean = μ1

Population 2 mean = μ2

Population 1 standard deviation = σ1

Population 2 standard deviation = σ2

Assumptions for a Two Independent Samples Design

We have a simple random sample of [pic] observations

from a [pic] population.

We have a simple random sample of [pic] observations

from a [pic] population.

The two random samples are independent of each other.

Notation in Two Independent Samples Design

[pic] = sample size for first sample (number of observations from Population 1)

[pic]= sample size for second sample (number of observations from Population 2)

[pic] = observed sample mean for the first sample.

[pic] = observed sample mean for the second sample.

[pic] = observed sample standard deviation for the first sample.

[pic] = observed sample standard deviation for the second sample.

We are interested in comparing the population means [pic]

and [pic], so the parameter of interest is the difference [pic].

We need to know the sampling distribution of [pic] which is based on three main statistical results that we will not prove here:

1. The mean of the difference of two random variables is the difference of the means.

2. The variance of the difference of two independent random variables is the sum of the variances.

3. The difference of two independent normally distributed random variables is also normally distributed.

Distribution of [pic] for the Two Independent Samples Scenario

The mean of [pic] is [pic].

The standard deviation of [pic] is [pic].

The distribution of [pic] is normal.

Thus, [pic] is [pic].
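A quick simulation can make these results concrete. The sketch below draws many pairs of independent normal samples and compares the simulated mean and standard deviation of the difference in sample means with the values implied by the three results above, namely μ1 - μ2 and the square root of σ1²/n1 + σ2²/n2. All population values, sample sizes, and the seed are hypothetical choices made only for illustration.

```python
import random
from math import sqrt
from statistics import fmean, stdev

def simulate_mean_differences(mu1, sigma1, n1, mu2, sigma2, n2, reps, seed):
    """Return `reps` simulated values of (sample mean 1 - sample mean 2)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        xbar1 = fmean([rng.gauss(mu1, sigma1) for _ in range(n1)])
        xbar2 = fmean([rng.gauss(mu2, sigma2) for _ in range(n2)])
        diffs.append(xbar1 - xbar2)
    return diffs

# Hypothetical populations and sample sizes
mu1, sigma1, n1 = 50.0, 4.0, 10
mu2, sigma2, n2 = 45.0, 3.0, 15

diffs = simulate_mean_differences(mu1, sigma1, n1, mu2, sigma2, n2,
                                  reps=5000, seed=1)

theory_mean = mu1 - mu2
theory_sd = sqrt(sigma1**2 / n1 + sigma2**2 / n2)

print(f"simulated mean = {fmean(diffs):.3f}, theory = {theory_mean:.3f}")
print(f"simulated SD   = {stdev(diffs):.3f}, theory = {theory_sd:.3f}")
```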

Standard Error of the Difference in Sample Means

[pic] where [pic] and [pic] are the two sample standard deviations. The standard error of [pic] estimates, roughly, the average distance of the possible [pic] values from [pic]. The possible values result from considering all possible independent random samples of the same sizes from the same two populations.

Think about it

Should we estimate the common population standard deviation [pic] by just averaging the two sample standard deviations? What if the first sample is of size 10 but the second sample is of size 100? Would you want to weight the corresponding sample standard deviations equally? Which sample standard deviation, [pic] or [pic], would you “trust” more?

Pooled Estimate of [pic]:

[pic].
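The weighting question raised above can be explored numerically. The sketch below assumes the standard pooled estimate, in which each sample variance is weighted by its degrees of freedom and the total is divided by n1 + n2 - 2. The sample standard deviations 5.2 and 4.9 are illustrative values, and the function name is an assumption.

```python
from math import sqrt

def pooled_sd(s1, n1, s2, n2):
    """Pooled estimate of a common standard deviation.

    Each sample variance is weighted by its degrees of freedom (n - 1).
    """
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return sqrt(pooled_var)

# Equal sample sizes: the pooled variance is the plain average of the variances
equal_sizes = pooled_sd(5.2, 10, 4.9, 10)

# Second sample ten times larger: the estimate is pulled toward s2 = 4.9
unequal_sizes = pooled_sd(5.2, 10, 4.9, 100)

print(f"n1 = 10, n2 = 10:  sp = {equal_sizes:.3f}")
print(f"n1 = 10, n2 = 100: sp = {unequal_sizes:.3f}")
```

The larger sample carries more information about the common standard deviation, so its sample standard deviation is "trusted" more.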

Distribution of the Standardized [pic] for the

Two Independent Samples Scenario

If [pic] is [pic]

then [pic] is [pic].

If the true standard deviation [pic] is replaced by an estimate [pic],

where [pic],

then the standardized quantity [pic]

has a t-distribution with [pic] degrees of freedom.

Example 11.4

Comparing Two Headache Treatments

[pic]

Medical researchers are comparing two treatments for migraine headaches. They wish to perform a double-blind experiment to assess if Treatment 2 (the new treatment) is significantly better than Treatment 1 (the standard treatment) using a 5% significance level.

The data

[pic]

(a) State the appropriate hypotheses to be tested. Keep in mind that smaller responses imply a better treatment and Treatment 2 is the new treatment.

(b) State the conditions required for performing a two independent samples pooled t-test.

(c) The mean time to relief for the Treatment 1 subjects was 22.6 minutes, with a standard deviation of 5.2 minutes. The mean time to relief for the Treatment 2 group was 19.4 minutes, with a standard deviation of 4.9 minutes. Recall that one of the assumptions for performing this test is equal population standard deviations. However, 5.2 is not equal to 4.9. Does this imply that the pooled test will not be valid?

(d) Give an estimate of the common population standard deviation.

(e) Compute the pooled t-test statistic.

(f) Find the corresponding p-value.

(g) State the decision and conclusion using a 5% significance level.

Solution

(a) [pic] vs [pic].

(b) The first sample is a random sample from a normal population with mean [pic] and standard deviation [pic]. The second sample is a random sample from a normal population with mean [pic] but same standard deviation [pic]. The two samples are independent.

(c) Even though the sample standard deviations of 5.2 and 4.9 are not equal, this does not mean the equal population standard deviations assumption has been violated. Examining the relative magnitude of the two sample standard deviations is a quick check for this assumption.

(d) An estimate of the equal population standard deviation is

[pic]

Our estimate should make sense. Since it is a weighted average, it should be between the two sample standard deviations of 5.2 and 4.9. When the sample sizes are equal, the pooled variance estimate is the average of the two sample variances.

(e) The observed pooled t-test statistic is [pic]

The value of 1.416 means that we observed two sample means that are about 1.4 standard errors apart. Is this a large enough difference to reject the null hypothesis at a 5% significance level?

(f) The p-value is the probability of observing a test statistic as large as or larger than the observed value of 1.416, computed under the null distribution, which is the t-distribution with 10 + 10 - 2 = 18 degrees of freedom.

[pic]

Using the TI:

1. Using the tcdf( function.

Using the tcdf( function on the TI we have:

p-value =[pic]= tcdf(1.416, E99, 18) = 0.0869.

2. Using the 2-SampTTest function under STAT TESTS.

In the TESTS menu located under the STAT button, we select the 4:2-SampTTest option. With the sample means of 22.6 and 19.4, the sample standard deviations of 5.2 and 4.9, and the sample sizes of 10 and 10, we can use the Stats option of this test. The steps and corresponding input and output screens are shown. Notice that you must specify Yes under the Pooled option. The No Pooled option is discussed at the end of this section as another version of our test.

[pic]

p-value =[pic]= 0.08688.
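The same pooled t-statistic can be recomputed from the summary statistics in Python. In this sketch the two critical values are hard-coded from the df = 18 row of a standard t table (upper-tail areas 0.10 and 0.05); they are an assumption brought in from such a table, not something computed here.

```python
from math import sqrt

# Summary statistics from Example 11.4 (time to relief, in minutes)
n1, mean1, s1 = 10, 22.6, 5.2    # Treatment 1
n2, mean2, s2 = 10, 19.4, 4.9    # Treatment 2

df = n1 + n2 - 2
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)   # pooled SD
se = sp * sqrt(1 / n1 + 1 / n2)                          # SE of the difference
t_stat = (mean1 - mean2 - 0) / se                        # null value is 0

# Upper-tail critical values for df = 18, taken from a standard t table
T_CRIT_10 = 1.330
T_CRIT_05 = 1.734

reject = t_stat >= T_CRIT_05     # one-sided test to the right at the 5% level

print(f"pooled SD = {sp:.3f}, t = {t_stat:.3f}, df = {df}")
print("reject H0" if reject else "fail to reject H0")
```

Because the statistic falls below the 0.05 critical value, the one-sided p-value exceeds 0.05, in line with the TI result of about 0.087.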

With Table IV

In Table IV we focus on the df = 18 row and find that our observed test statistic value of 1.416 falls between the values in the columns headed with 0.10 and 0.05. So we state that the p-value is between 0.05 and 0.10.

0.05 < p-value = [pic] < 0.10.

MTB > TwoSample 95.0 'device 1' 'device 2';

SUBC> Alternative 1;

SUBC> Pooled.

TWOSAMPLE T FOR device 1 VS device 2

N MEAN STDEV SE MEAN

device 1 10 1.1280 0.0413 0.013

device 2 10 1.0710 0.0443 0.014

95 PCT CI FOR MU device 1 - MU device 2: (0.017, 0.097)

TTEST MU device 1 = MU device 2 (VS GT): T= 2.97 P=0.0041 DF= 18

POOLED STDEV = 0.0428

Solution

(a) [pic] versus [pic].

(b) We can see graphically that there is some difference in the median emission levels, with Device 1 producing the higher levels overall. We can also informally check the equal population standard deviations assumption: the IQRs (lengths of the boxes) are quite similar.

(c) In the output, the sample size (N), the observed sample mean (MEAN), and the observed sample standard deviation (STDEV) for each group (that is, device) are provided. The mean amount of nitric oxide emitted by the 10 Device 1 controls was 1.128, with a standard deviation of 0.0413. The mean amount of nitric oxide emitted by the 10 Device 2 controls was 1.071, with a standard deviation of 0.0443. The observed test statistic of t = 2.97 and the corresponding p-value of 0.0041 for the one-sided (to the right) alternative hypothesis are provided.

(d) For any significance level [pic] > 0.0041, we would reject the null hypothesis and conclude that the mean level of emission is greater for Device 1 controls.
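Several entries of the Minitab output above can be recovered from the printed summary statistics alone. The following sketch recomputes SE MEAN, POOLED STDEV, and T. Because it starts from rounded means and standard deviations rather than the raw data, the last digit can differ slightly from Minitab's.

```python
from math import sqrt

# Summary statistics as printed in the Minitab output
n1, mean1, s1 = 10, 1.1280, 0.0413    # device 1
n2, mean2, s2 = 10, 1.0710, 0.0443    # device 2

se_mean1 = s1 / sqrt(n1)              # SE MEAN column, device 1
se_mean2 = s2 / sqrt(n2)              # SE MEAN column, device 2

df = n1 + n2 - 2
pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)   # POOLED STDEV
t_stat = (mean1 - mean2) / (pooled * sqrt(1 / n1 + 1 / n2))  # T

print(f"SE MEAN: {se_mean1:.3f}, {se_mean2:.3f}")
print(f"POOLED STDEV = {pooled:.4f}, DF = {df}")
print(f"T = {t_stat:.2f}")
```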

Let’s Do It! 11.6

Sheep Treatment

An experiment was conducted to compare the mean number of tapeworms in the stomachs of sheep that had been treated for worms against the mean number in those untreated. A sample of 14 worm-infected lambs was randomly divided into two groups. Seven were injected with the drug and the remainder were left untreated. After a six-month period, the lambs were slaughtered and the following counts were recorded:

|Group 1: Drug-treated sheep |18 |43 |28 |50 |16 |32 |13 |

|Group 2: Untreated sheep |40 |54 |26 |63 |21 |37 |39 |

Assume that these data are the observed values of independent random samples where the two distributions are normal with equal population variance.

[pic]

(a) Side-by-side boxplots for the two sets of data are provided at the right. The box plot for the untreated sheep responses establishes that a normal distribution is the appropriate model for number of worms.

Circle one: True False

Explain:

(b) Based on these boxplots, does the assumption of equal population variance seem plausible?

Circle one: Yes No

Explain:

(c) State the hypotheses for testing that the mean number of worms for the treated lambs is less than the mean number of worms for the untreated lambs.

[pic]: ___________________ [pic]: ______________________

Computer output from the SPSS computer package based on the sheep data is provided.

| |Levene’s Test for |t-test of Equality of Means |

| |Equality of Variances | |

| |F |Sig. |

| |F |Sig. | |

| | | |t |

|1 |13.2 (L) |14.0 (R) |-0.8 |

|2 |8.2 (L) |8.8 (R) |-0.6 |

|3 |10.9 (R) |11.2 (L) |-0.3 |

|4 |14.3 (L) |14.2 (R) |0.1 |

|5 |10.7 (R) |11.8 (L) |-1.1 |

|6 |6.6 (L) |6.4 (R) |0.2 |

|7 |9.5 (L) |9.8 (R) |-0.3 |

|8 |10.8 (L) |11.3 (R) |-0.5 |

|9 |8.8 (R) |9.3 (L) |-0.5 |

|10 |13.3 (L) |13.6 (R) |-0.3 |

(e) A scatterplot of the paired observations is provided below. What does this plot tell us about the wear measurements for Material 1 and Material 2?

[pic]

(f) The SPSS output for a paired samples t-test on these data is given below. Report the sample mean difference, the sample standard deviation of the differences, the observed test statistic, p-value, decision and conclusion using a 5% significance level.

Paired Samples Test

| |Paired Differences: Mean |Std. Deviation |Std. Error Mean |95% Confidence Interval of the Difference |t |df |Sig. (2-tailed) |

| | | | |

|Two Dependent Means | |[pic] | |

| | | | |

| | | | |

| |[pic] |[pic] |[pic] |

| | | |where df = n - 1 |

|Two Independent Means | |[pic] |[pic] |

|(equal population variance) | | | |

| |[pic] |[pic] | |

| | | |

| | |where [pic] |

| | |and df = n1 + n2 - 2 |

|Two Independent Proportions | |[pic] | |

|(large sample sizes) | | | |

| |[pic] |[pic] |[pic] |

| | | | |

| | | where [pic] |

TI Quick Steps

Finding Areas under a t-distribution using the TI

Recall from Chapter 10 that the TI has the tcdf( function built into the calculator, which can compute various areas under the t-distribution with a specified degrees of freedom. This function is located under the DISTR menu. Please see page 612 for a review on the use of this function for the TI.

Hypothesis Testing and Confidence Intervals with the TI

Paired t-test for the Mean Difference

Recall that the paired t-test for the mean difference is actually just a one-sample t-test performed on the differences. So the TI steps for a one-sample t-test in Chapter 10 apply again.

If you enter the paired observations into two columns (for example, L1 and L2), then you will first need to compute the differences using a linear transformation. The sequence of buttons for computing the differences as L3 = L2 - L1 is provided next.

[pic]

If the differences have already been computed you can enter them directly, say as L1, and the test can be performed on the 1 sample of Data (differences) entered. If you only have the summary statistics for the differences, you can enter these Stats (the sample mean difference, sample standard deviation of the differences, and sample size). The steps for performing a paired t-test are as follows:

[pic]

Paired Confidence Interval for the Mean Difference

You can either have the 1 sample of Data (differences) entered, for example, as L1, or just know the Stats (the sample mean difference, sample standard deviation of the differences, and sample size). To generate the confidence interval, the steps are as follows:

[pic]

Two Independent Samples t-test for the Difference in Two Means

You can either have the two samples of Data entered, for example, as L1 and L2, or just enter the Stats (the sample means, sample standard deviations, and sample sizes). To perform a test, use the following steps:

[pic]

Two Independent Samples Confidence Interval

for the Difference in Two Means

You can either have the two samples of Data entered, for example, as L1 and L2, or just enter the Stats (the sample means, sample standard deviations, and sample sizes). To generate the confidence interval, use the following steps:

[pic]

Two Independent Samples z-test for

the Difference in Two Proportions

You need to specify the number of “successes” in each sample (denoted by the x’s), the sample sizes, and the direction for the alternative hypothesis. To perform a test follow these steps:

[pic]

Two Independent Samples Confidence Interval for

the Difference in Two Proportions

You need to specify the number of “successes” in each sample (denoted by the x’s), the sample sizes, and the confidence level. To generate the confidence interval, use the following steps:

[pic]

-----------------------

Basic Steps for Testing a Hypothesis about a Parameter

▪ State the population(s) and corresponding parameter(s) of interest.

Remember inference is only valid if the sample(s) is representative of the population(s) of interest.

▪ State the competing theories—that is, the null and alternative hypotheses.

The null hypothesis gives a specific value for the parameter, called the hypothesized value or null value.

▪ State the significance level [pic] for the test.

This level should always be set in advance of examining the results.

▪ Collect and examine the data and assess if the assumptions are valid.

If assumptions are not reasonable, there may be alternative procedures, some of which are discussed in Chapter 15.

▪ Compute a test statistic using the data and determine the p-value.

The test statistic is a measure of the distance between the sample statistic or point estimate of the parameter and the hypothesized value or null value for the parameter.

The general form of a test statistic is that of a standard score:

Test Statistic = (Point Estimate - Null Value) / (Null Standard Deviation or Error)

If the denominator is the actual standard deviation under the null hypothesis, then the test statistic is a z-statistic.

If the denominator is a standard error (an estimated standard deviation) under the null hypothesis, then the test statistic is a t-statistic.

The p-value is found using the null distribution for the test statistic.

For a z-statistic we use a standard normal N(0,1) distribution.

For a t-statistic we use a t-distribution with a certain degrees of freedom

(recall in the one-sample t-test the degrees of freedom were n - 1).

The p-value is the probability of getting a test statistic as extreme or more extreme than observed, assuming H0 is true. The direction of more extreme is determined by the direction in H1.

▪ Make a decision and state a conclusion using a well-written sentence.

We compare the p-value to the set significance level to make the decision.
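The last two steps above can be sketched as two small Python helpers. The function names are illustrative, and the p-value is passed in rather than computed, since it depends on which null distribution (standard normal or t) applies.

```python
def standard_score(point_estimate, null_value, null_sd_or_se):
    """General form of a test statistic: distance from the null value
    measured in units of the null standard deviation (z) or standard error (t)."""
    return (point_estimate - null_value) / null_sd_or_se

def decide(p_value, alpha):
    """Reject H0 when the p-value is less than or equal to alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

# Values from Example 11.2: mean difference 4.5, standard error 1.62, p-value 0.011
t_stat = standard_score(4.5, 0, 1.62)
decision = decide(0.011, alpha=0.05)

print(f"test statistic = {t_stat:.2f}; decision: {decision}")
```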

Learning about the Population Mean Difference μD

Parameter of interest: The population mean difference μD

Data: We have a random sample of differences from the population of all possible differences. The differences have a normal distribution with unknown mean μD and unknown standard deviation σD. Normality is not so crucial if the sample size n is large (n ≥ 30) due to the Central Limit Theorem.

Point Estimate of μD: Sample mean difference [pic], the average of the differences in the sample. This is just like [pic], but we use the d notation to remember we have paired data.

Standard Deviation of a sample mean difference: [pic]

Standard Error of a sample mean difference: [pic], where [pic] is the standard deviation computed on the differences in the sample.

Paired t-Test

Assumptions: The sample of differences is a random sample from a larger population of differences. The model for the differences is normal, although this is less crucial if the sample size n is large.

Hypotheses: [pic] versus

[pic] or [pic] or [pic].

The significance level α to be used is determined.

Data: The sample of n differences, generically written as [pic] from which the sample mean difference [pic]and the sample standard deviation of the differences [pic] can be computed.

Observed Test Statistic: [pic] and the null distribution for the T variable is a t(n-1) distribution.

p-value: We find the p-value for the test using the t(n - 1) distribution.

The direction of extreme will depend on how the alternative hypothesis is expressed.

Decision: A p-value less than or equal to α leads to rejection of H0.

Notes:

• If we are interested in assessing if μD is equal to some hypothesized value that is not 0, we would replace 0 in the test statistic expression with this other null value.

• The test statistic is the same no matter how the alternative hypothesis is expressed.

Confidence Interval for μD

Assumptions: The sample of differences is a random sample from a larger population of differences. The model for the differences is normal, although this is less crucial if the sample size n is large.

Data: The sample of n differences, generically written as [pic] from which the sample mean difference [pic]and the sample standard deviation of the differences [pic] can be computed.

Confidence Interval: [pic]

where t* is an appropriate percentile of the

t(n - 1) distribution.

Interpretation of the Interval: The interval gives potential values for the population mean difference μD based on just one random sample of differences. The interval can be used to test hypotheses when the alternative hypothesis is two-sided.

Interpretation of

the Confidence Level: The confidence level pertains to the proportion of times the procedure produces an interval that actually contains μD when this procedure is repeated over and over using a new random sample of the same size n each time.
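As an illustration, a 95% confidence interval for the mean difference in Example 11.2 can be computed as follows. The sketch assumes the usual form, sample mean difference plus or minus t* times the standard error, and takes t* = 2.262 from the df = 9 row of a standard t table rather than computing it.

```python
from math import sqrt

def paired_ci(mean_diff, sd_diff, n, t_star):
    """Confidence interval for the population mean difference:
    mean_diff +/- t_star * sd_diff / sqrt(n)."""
    margin = t_star * sd_diff / sqrt(n)
    return mean_diff - margin, mean_diff + margin

T_STAR_95_DF9 = 2.262    # 97.5th percentile of t(9), from a standard t table

# Summary values from Example 11.2
low, high = paired_ci(mean_diff=4.5, sd_diff=5.126, n=10, t_star=T_STAR_95_DF9)
print(f"95% CI for the mean difference: ({low:.2f}, {high:.2f})")
```

The interval lies entirely above 0, so a two-sided test at the 5% level would also reject a hypothesized mean difference of 0.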


Two Independent Samples Pooled t-Test

Assumptions: The first sample is a random sample from a normal population with mean μ1 and standard deviation σ. The second sample is a random sample from a normal population with mean μ2 and the same standard deviation σ. The two samples are independent. Normality is less crucial if the sample sizes n1 and n2 are large and, preferably, n1 = n2.

Hypotheses: $H_0\colon \mu_1 - \mu_2 = 0$ versus $H_1\colon \mu_1 - \mu_2 \neq 0$ or $H_1\colon \mu_1 - \mu_2 > 0$ or $H_1\colon \mu_1 - \mu_2 < 0$.

The significance level α to be used is determined.

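Data: The two sets of data from which the two sample means $\bar{x}_1$ and $\bar{x}_2$, and the two sample standard deviations $s_1$ and $s_2$ can be computed.

Observed Test Statistic: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - 0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, where $s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$,

and the null distribution for this T variable is a t(n1 + n2 - 2) distribution.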

p-value: We find the p-value for the test using the t(n1+ n2 - 2) distribution. The direction of extreme will depend on how the alternative hypothesis is expressed.

Decision: A p-value less than or equal to α leads to rejection of H0.

Notes:

▪ If we are interested in assessing if μ1 - μ2 is equal to some hypothesized value that is not 0, we would replace 0 in the test statistic expression with this other null value.

▪ The test statistic is the same no matter how the alternative hypothesis is expressed.
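The pooled test statistic can be sketched as follows. The function names and the two small samples are hypothetical, chosen so the arithmetic is easy to check by hand; the p-value would still come from a t(n1 + n2 - 2) table or from software.

```python
import math
import statistics


def pooled_sd(sample1, sample2):
    """Pooled estimate s_p of the common population standard deviation."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1)  # s1 squared
    v2 = statistics.variance(sample2)  # s2 squared
    return math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))


def pooled_t_statistic(sample1, sample2, null_value=0.0):
    """Return (t, df) for the two independent samples pooled t-test."""
    n1, n2 = len(sample1), len(sample2)
    s_p = pooled_sd(sample1, sample2)
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    t = (diff - null_value) / (s_p * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Hypothetical samples (illustration only): means 7 and 3, both with s = 2.
group1 = [5.0, 7.0, 9.0]
group2 = [1.0, 3.0, 5.0]
t_obs, df = pooled_t_statistic(group1, group2)
print(f"s_p = {pooled_sd(group1, group2):.3f}, t = {t_obs:.3f}, df = {df}")
```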

Pooled t-Confidence Interval for μ1 - μ2

Assumptions: The first sample is a random sample from a normal population with mean μ1 and standard deviation σ. The second sample is a random sample from a normal population with mean μ2 and standard deviation σ. The two samples are independent. Normality is less crucial if the sample sizes n1 and n2 are large and preferably, n1 = n2.

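Data: The two sets of data from which the two sample means $\bar{x}_1$ and $\bar{x}_2$, and the two sample standard deviations $s_1$ and $s_2$ can be computed.

Confidence Interval: $(\bar{x}_1 - \bar{x}_2) \pm t^{*} s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$

where $s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$ and t* is an appropriate percentile of the t(n1 + n2 - 2) distribution.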

Interpretation of the Interval: The interval gives potential values for the difference in the population means μ1 - μ2 based on just one random sample from each population. The interval can be used to test hypotheses when the alternative hypothesis is two-sided.

Interpretation of the Confidence Level: The confidence level pertains to the proportion of times the procedure produces an interval that actually contains μ1 - μ2 when this procedure is repeated over and over using new random samples of the same sizes each time.
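The repeated-sampling interpretation of the confidence level can be illustrated with a small simulation. The sketch below is only a demonstration under stated assumptions: the population values (means 10 and 8, common standard deviation 3), the sample sizes, the seed, and the function names are all hypothetical, and t* = 2.228 is the table value for 95% confidence with 10 degrees of freedom, so it matches n1 = n2 = 6 only.

```python
import math
import random
import statistics


def pooled_t_interval(sample1, sample2, t_star):
    """Return (low, high) for (x1_bar - x2_bar) +/- t_star * s_p * sqrt(1/n1 + 1/n2)."""
    n1, n2 = len(sample1), len(sample2)
    v1, v2 = statistics.variance(sample1), statistics.variance(sample2)
    s_p = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    center = statistics.mean(sample1) - statistics.mean(sample2)
    margin = t_star * s_p * math.sqrt(1 / n1 + 1 / n2)
    return center - margin, center + margin


def coverage(reps=2000, n=6, mu1=10.0, mu2=8.0, sigma=3.0, t_star=2.228, seed=1):
    """Proportion of simulated 95% intervals that contain the true mu1 - mu2."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        s1 = [rng.gauss(mu1, sigma) for _ in range(n)]
        s2 = [rng.gauss(mu2, sigma) for _ in range(n)]
        low, high = pooled_t_interval(s1, s2, t_star)
        if low <= mu1 - mu2 <= high:
            hits += 1
    return hits / reps


rate = coverage()
print(f"proportion of intervals containing mu1 - mu2: {rate:.3f}")
```

The printed proportion should land close to 0.95. Any single interval either contains μ1 - μ2 or it does not; the 95% describes the procedure over many repetitions.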

Two Independent Samples z-Test for Proportions (Large Samples)

Assumptions: The first sample is a random sample of size n1 from the first population. The second sample is a random sample of size n2 from the second population. The two samples are independent. The sample sizes n1 and n2 are large. It is best to have at least five successes in each sample.

Hypotheses: $H_0\colon p_1 - p_2 = 0$ versus $H_1\colon p_1 - p_2 \neq 0$ or $H_1\colon p_1 - p_2 > 0$ or $H_1\colon p_1 - p_2 < 0$.

The significance level α to be used is determined.

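Data: The two sets of data from which the two sample proportions $\hat{p}_1$ and $\hat{p}_2$ can be computed, and the pooled estimate of the common population proportion p (under H0) can be computed as $\hat{p} = \dfrac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$.

Observed Test Statistic: $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$,

and the null distribution for this Z variable is a standard normal N(0,1) distribution.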

p-value: We find the p-value for the test using the N(0,1) distribution. The direction of extreme will depend on how the alternative hypothesis is expressed.

Decision: A p-value less than or equal to α leads to rejection of H0.
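The full z-test for two proportions, including the p-value, can be sketched with the standard library's `statistics.NormalDist`. The function name, the `alternative` labels, and the counts (60 of 100 versus 40 of 100) are hypothetical. Note that (x1 + x2)/(n1 + n2) is the same pooled estimate as the weighted average of the two sample proportions.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Return (z, p_value) for H0: p1 - p2 = 0 from success counts x1 and x2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # pooled estimate of the common p under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    std_normal = NormalDist()  # N(0, 1)
    if alternative == "greater":      # H1: p1 - p2 > 0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":       # H1: p1 - p2 < 0
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":  # H1: p1 - p2 != 0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z, p_value


# Hypothetical counts (illustration only).
z_obs, p_val = two_proportion_z_test(60, 100, 40, 100)
print(f"z = {z_obs:.3f}, two-sided p-value = {p_val:.4f}")
```

The observed z is identical for all three alternatives; only the direction of extreme used for the p-value changes.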

Confidence Interval for p1 - p2

Assumptions: The first sample is a random sample of size n1 from the first population. The second sample is a random sample of size n2 from the second population. The two samples are independent. The sample sizes n1 and n2 are large. It is best to have at least five successes in each sample.

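Data: The two sets of data from which the two sample proportions $\hat{p}_1$ and $\hat{p}_2$ can be computed.

Confidence Interval: $(\hat{p}_1 - \hat{p}_2) \pm z^{*} \sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$

where z* is an appropriate percentile of the standard normal N(0,1) distribution.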

Interpretation of the Interval: The interval gives potential values for the difference in the population proportions p1 - p2 based on independent random samples, one from each population. The interval can be used to test hypotheses when the alternative hypothesis is two-sided.

Interpretation of the Confidence Level: The confidence level pertains to the proportion of times the procedure produces an interval that actually contains p1 - p2 when this procedure is repeated over and over using new independent random samples of the same sizes each time.

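Error Margin: The margin of error E = $z^{*} \sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$

Minimum (common) sample sizes: n1 and n2 needed for a confidence interval with desired error margin E: $n_1 = n_2 = \dfrac{1}{2}\left(\dfrac{z^{*}}{E}\right)^{2}$, the conservative choice obtained by setting both sample proportions to 0.5 in the expression for E.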
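The interval, the error margin, and a sample-size calculation for two proportions can be sketched together. Everything named here is hypothetical. The sample-size function assumes the conservative choice of 0.5 for both proportions, which makes the margin as large as it can be; that is an assumption of this sketch, not a formula confirmed from the text.

```python
import math
from statistics import NormalDist


def z_star(confidence):
    """Standard normal percentile leaving (1 - confidence)/2 in each tail."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def margin_of_error(p1_hat, n1, p2_hat, n2, confidence=0.95):
    """E = z* times the unpooled standard error of p1_hat - p2_hat."""
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return z_star(confidence) * se


def proportion_difference_interval(p1_hat, n1, p2_hat, n2, confidence=0.95):
    """Return (low, high) for (p1_hat - p2_hat) +/- E."""
    e = margin_of_error(p1_hat, n1, p2_hat, n2, confidence)
    diff = p1_hat - p2_hat
    return diff - e, diff + e


def conservative_common_sample_size(error_margin, confidence=0.95):
    """Smallest common n with E <= error_margin, assuming both proportions are 0.5."""
    return math.ceil(0.5 * (z_star(confidence) / error_margin) ** 2)


# Hypothetical sample proportions (illustration only).
low, high = proportion_difference_interval(0.6, 100, 0.4, 100)
n_needed = conservative_common_sample_size(0.05)
print(f"95% interval: ({low:.4f}, {high:.4f}); common n for E = 0.05: {n_needed}")
```

Unlike the test statistic, the interval uses the unpooled standard error, because no common proportion is assumed when estimating p1 - p2.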


If the population standard deviations were not known and not assumed to be equal, then it seems natural to base our hypothesis test on the following statistic:

$$T = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

▪ The sampling distribution of this T variable, under H0, is not a t-distribution and is rather complicated. This situation is called the Behrens–Fisher problem. The results of this test are often reported in standard computer output. In Example 11.7, the results are provided in the line labeled “Unequal.” One approach to the complicated distribution is to use the t-distribution with an approximate number of degrees of freedom. The expression for the degrees of freedom, called Welch’s approximation, is

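$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^{2} + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^{2}}$$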

▪ Luckily, many software packages and calculators will determine degrees of freedom and will find the appropriate t* value for performing the approximate test or constructing the approximate confidence interval estimate. Exercises 11.31 and 11.51 comment on this version of the two independent samples t-test. A second, somewhat conservative, approach is to use the t-distribution, with degrees of freedom being the smaller of n1 - 1 and n2 - 1, as the approximate sampling distribution under H0. Then the true significance level is always smaller than the level used with this t-distribution.
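Both choices of degrees of freedom for the nonpooled test can be sketched directly from the summary statistics. The function names and the input values are hypothetical; the first function is the standard form of Welch's approximation.

```python
def welch_df(s1, n1, s2, n2):
    """Welch's approximate degrees of freedom for the nonpooled t statistic."""
    a = s1 ** 2 / n1
    b = s2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


def conservative_df(n1, n2):
    """Conservative alternative: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)


# Hypothetical summary statistics (illustration only).
print(welch_df(2.0, 10, 2.0, 10))   # equal spreads, equal sizes
print(welch_df(1.0, 5, 3.0, 20))    # unequal spreads, unequal sizes
print(conservative_df(5, 20))
```

With equal sample standard deviations and equal sample sizes, the approximation gives n1 + n2 - 2, the pooled degrees of freedom. In general it falls between the conservative value and n1 + n2 - 2 and is usually not a whole number.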


Some Notes Regarding the Assumptions of the Two Independent Samples Pooled Procedures

The two independent samples pooled t-test is a powerful technique for comparing the means of two populations, provided that the assumptions hold. What if some of the assumptions are not satisfied?

▪ If the sample sizes are large enough to apply the central limit theorem (a common rule of thumb is $n_1 \geq 30$ and $n_2 \geq 30$), then the assumption of normal populations is less crucial, since the distribution of $\bar{x}_1 - \bar{x}_2$ will be approximately normal.

▪ If the sample sizes are equal, the assumption of common population standard deviation is not so crucial. The t-test statistic will still follow a t-distribution approximately.

▪ If the population standard deviations were assumed to be known and not necessarily equal, then we could base our test on the statistic

$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

where $D_0$ is the hypothesized value of μ1 - μ2 under H0, and the Z variable would follow an N(0, 1) distribution. If the sample sizes are large, the normal approximation for this test statistic remains valid even if the population standard deviations are estimated by the sample standard deviations.


Version of Test (select one): Pooled t-test / Nonpooled t-test / Nonparametric rank sum test

Explain:



▪ If there is concern that the assumptions are not satisfied, there is another test procedure that requires fewer or less stringent assumptions about the underlying populations. Such tests are called nonparametric or distribution-free procedures. The Wilcoxon rank sum test is a nonparametric procedure that can be used for two independent samples scenarios, which is based on ranks of the data. This nonparametric procedure can be effective and is discussed in Chapter 15. In practice, the presented t-tests based on normal distributions are the most common.

