Chapter 22 – Comparing Two Proportions

366 Part V From the Data at Hand to the World at Large


1. Gender gap.

a) This is a stratified random sample, stratified by gender.

b) We would expect the difference in proportions in the sample to be the same as the difference in proportions in the population, with the percentage of respondents with a favorable impression of the candidate 6% higher among males.

c) The standard deviation of the difference in proportions is:

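\[
SD(\hat{p}_M - \hat{p}_F) = \sqrt{\frac{\hat{p}_M \hat{q}_M}{n_M} + \frac{\hat{p}_F \hat{q}_F}{n_F}} = \sqrt{\frac{(0.59)(0.41)}{300} + \frac{(0.53)(0.47)}{300}} \approx 4\%
\]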

d)

e) The campaign could certainly be misled by the poll. According to the model, a poll showing little difference could occur relatively frequently. That result is only 1.5 standard deviations below the expected difference in proportions.
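The reasoning in parts c and e can be checked numerically. The sketch below is only an illustration: the function name `sd_diff` and the variable names are assumptions, and it uses Python's standard-library `NormalDist` for the Normal model. With the unrounded standard deviation, a zero gap sits slightly less than 1.5 standard deviations below the expected difference.

```python
from math import sqrt
from statistics import NormalDist

def sd_diff(p1, n1, p2, n2):
    """Standard deviation of (p1_hat - p2_hat) for two independent samples."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Exercise 1 values: 59% of 300 men, 53% of 300 women.
sd = sd_diff(0.59, 300, 0.53, 300)
expected_gap = 0.59 - 0.53

# Normal model for the sample difference, centered at the expected gap.
model = NormalDist(mu=expected_gap, sigma=sd)
z_zero = (0 - expected_gap) / sd   # how far a zero gap lies below the mean
p_no_gap = model.cdf(0)            # chance the poll shows no gap, or a reversed one

print(f"SD = {sd:.4f}, z for a zero gap = {z_zero:.2f}, P(diff <= 0) = {p_no_gap:.3f}")
```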

2. Buy it again?

a) This is a stratified random sample, stratified by country of origin of the car.

b) We would expect the difference in proportions in the sample to be the same as the difference in proportions in the population, with the percentage of respondents who would purchase the same model again 2% higher among owners of Japanese cars than among owners of American cars.

c) The standard deviation of the difference in proportions is:

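\[
SD(\hat{p}_J - \hat{p}_A) = \sqrt{\frac{\hat{p}_J \hat{q}_J}{n_J} + \frac{\hat{p}_A \hat{q}_A}{n_A}} = \sqrt{\frac{(0.78)(0.22)}{450} + \frac{(0.76)(0.24)}{450}} \approx 2.8\%
\]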


d)

e) The magazine could certainly be misled by the poll. According to the model, a poll showing greater satisfaction among owners of American cars could occur relatively frequently. That result is less than one standard deviation below the expected difference in proportions.

3. Arthritis.

a) Randomization condition: Americans age 65 and older were selected randomly. 10% condition: 1012 men and 1062 women are less than 10% of all men and women. Independent samples condition: The sample of men and the sample of women were drawn independently of each other. Success/Failure condition: $n\hat{p}$ (men) = 411, $n\hat{q}$ (men) = 601, $n\hat{p}$ (women) = 535, and $n\hat{q}$ (women) = 527 are all greater than 10, so the samples are both large enough.

Since the conditions have been satisfied, we will find a two-proportion z-interval.

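b)
\[
(\hat{p}_F - \hat{p}_M) \pm z^* \sqrt{\frac{\hat{p}_F \hat{q}_F}{n_F} + \frac{\hat{p}_M \hat{q}_M}{n_M}} = \left(\frac{535}{1062} - \frac{411}{1012}\right) \pm 1.960 \sqrt{\frac{\left(\frac{535}{1062}\right)\left(\frac{527}{1062}\right)}{1062} + \frac{\left(\frac{411}{1012}\right)\left(\frac{601}{1012}\right)}{1012}} = (0.055, 0.140)
\]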

c) We are 95% confident that the proportion of American women age 65 and older who suffer from arthritis is between 5.5% and 14.0% higher than the proportion of American men the same age who suffer from arthritis.

d) Since the interval for the difference in proportions of arthritis sufferers does not contain 0, there is strong evidence that arthritis is more likely to afflict women than men.
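The interval in part b can be reproduced from the raw counts. This is a minimal sketch, assuming the helper name `two_prop_z_interval` and taking the critical value from the standard-library `NormalDist` rather than a table, so the result may differ from the printed one in the last decimal place.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, conf=0.95):
    """Confidence interval for p1 - p2 from success counts x1 and x2."""
    p1, p2 = x1 / n1, x2 / n2
    # Unpooled standard error: each sample uses its own proportion.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Exercise 3: 535 of 1062 women and 411 of 1012 men report arthritis.
low, high = two_prop_z_interval(535, 1062, 411, 1012)
print(f"({low:.3f}, {high:.3f})")
```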

4. Graduation.

a) Randomization condition: Assume that the samples are representative of all recent graduates. 10% condition: Although large, the samples are less than 10% of all graduates. Independent samples condition: The sample of men and the sample of women were drawn independently of each other. Success/Failure condition: The samples are very large, certainly large enough for the methods of inference to be used.

Since the conditions have been satisfied, we will find a two-proportion z-interval.

b)
\[
(\hat{p}_F - \hat{p}_M) \pm z^* \sqrt{\frac{\hat{p}_F \hat{q}_F}{n_F} + \frac{\hat{p}_M \hat{q}_M}{n_M}} = (0.881 - 0.849) \pm 1.960 \sqrt{\frac{(0.881)(0.119)}{12{,}678} + \frac{(0.849)(0.151)}{12{,}460}} = (0.024, 0.040)
\]


c) We are 95% confident that the proportion of 24-year-old American women who have graduated from high school is between 2.4% and 4.0% higher than the proportion of American men the same age who have graduated from high school.

d) Since the interval for the difference in proportions of high school graduates does not contain 0, there is strong evidence that women are more likely than men to complete high school.

5. Pets.

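a)
\[
SE(\hat{p}_{Herb} - \hat{p}_{None}) = \sqrt{\frac{\hat{p}_{Herb} \hat{q}_{Herb}}{n_{Herb}} + \frac{\hat{p}_{None} \hat{q}_{None}}{n_{None}}} = \sqrt{\frac{\left(\frac{473}{827}\right)\left(\frac{354}{827}\right)}{827} + \frac{\left(\frac{19}{130}\right)\left(\frac{111}{130}\right)}{130}} = 0.035
\]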

b) Randomization condition: Assume that the dogs studied were representative of all dogs. 10% condition: 827 dogs from homes with herbicide used regularly and 130 dogs from homes with no herbicide used are less than 10% of all dogs. Independent samples condition: The samples were drawn independently of each other. Success/Failure condition: $n\hat{p}$ (herb) = 473, $n\hat{q}$ (herb) = 354, $n\hat{p}$ (none) = 19, and $n\hat{q}$ (none) = 111 are all greater than 10, so the samples are both large enough.

Since the conditions have been satisfied, we will find a two-proportion z-interval.

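\[
(\hat{p}_{Herb} - \hat{p}_{None}) \pm z^* \sqrt{\frac{\hat{p}_{Herb} \hat{q}_{Herb}}{n_{Herb}} + \frac{\hat{p}_{None} \hat{q}_{None}}{n_{None}}} = \left(\frac{473}{827} - \frac{19}{130}\right) \pm 1.960 \sqrt{\frac{\left(\frac{473}{827}\right)\left(\frac{354}{827}\right)}{827} + \frac{\left(\frac{19}{130}\right)\left(\frac{111}{130}\right)}{130}} = (0.356, 0.495)
\]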

c) We are 95% confident that the proportion of pets with a malignant lymphoma in homes where herbicides are used is between 35.6% and 49.5% higher than the proportion of pets with lymphoma in homes where no herbicides are used.
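The Success/Failure check in part b is simple enough to automate. The following is a sketch only: the function name `success_failure_ok` and the "at least 10" threshold convention are assumptions made for illustration.

```python
def success_failure_ok(successes, n, threshold=10):
    """True when both the success count and the failure count reach the threshold."""
    failures = n - successes
    return successes >= threshold and failures >= threshold

# Exercise 5 counts: (dogs with lymphoma, dogs in the group).
groups = {"herbicide": (473, 827), "no herbicide": (19, 130)}
for name, (x, n) in groups.items():
    # Print successes, failures, and whether the condition holds for this group.
    print(name, x, n - x, success_failure_ok(x, n))
```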

6. Carpal Tunnel.

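a)
\[
SE(\hat{p}_{Surg} - \hat{p}_{Splint}) = \sqrt{\frac{\hat{p}_{Surg} \hat{q}_{Surg}}{n_{Surg}} + \frac{\hat{p}_{Splint} \hat{q}_{Splint}}{n_{Splint}}} = \sqrt{\frac{(0.80)(0.20)}{88} + \frac{(0.54)(0.46)}{88}} = 0.068
\]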

b) Randomization condition: It's not clear whether or not this study was an experiment. If so, assume that the subjects were randomly allocated to treatment groups. If not, assume that the subjects are representative of all carpal tunnel sufferers. 10% condition: 88 subjects in each group are less than 10% of all carpal tunnel sufferers. Independent samples condition: The improvement rates of the two groups are not related. Success/Failure condition: $n\hat{p}$ (surg) = (88)(0.80) = 70, $n\hat{q}$ (surg) = (88)(0.20) = 18, $n\hat{p}$ (splint) = (88)(0.54) = 48, and $n\hat{q}$ (splint) = (88)(0.46) = 40 are all greater than 10, so the samples are both large enough.

Since the conditions have been satisfied, we will find a two-proportion z-interval.

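\[
(\hat{p}_{Surg} - \hat{p}_{Splint}) \pm z^* \sqrt{\frac{\hat{p}_{Surg} \hat{q}_{Surg}}{n_{Surg}} + \frac{\hat{p}_{Splint} \hat{q}_{Splint}}{n_{Splint}}} = (0.80 - 0.54) \pm 1.960 \sqrt{\frac{(0.80)(0.20)}{88} + \frac{(0.54)(0.46)}{88}} = (0.126, 0.394)
\]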

c) We are 95% confident that the proportion of patients who show improvement in carpal tunnel syndrome with surgery is between 12.6% and 39.4% higher than the proportion who show improvement with wrist splints.

7. Prostate cancer.

a) This was an experiment. Men were randomly assigned to imposed treatments. They were assigned to either have prostate surgery or assigned to not have prostate surgery.

b) Randomization condition: The men were randomly assigned to the two treatment groups. 10% condition: 347 men who had surgery and 348 men who did not have surgery are both less than 10% of all men. Independent samples condition: The groups were assigned randomly, so the groups are not related. Success/Failure condition: $n\hat{p}$ (surg) = 16, $n\hat{q}$ (surg) = 331, $n\hat{p}$ (none) = 31, and $n\hat{q}$ (none) = 317 are all greater than 10, so the samples are both large enough.

Since the conditions have been satisfied, we will find a two-proportion z-interval.

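\[
(\hat{p}_{None} - \hat{p}_{Surg}) \pm z^* \sqrt{\frac{\hat{p}_{None} \hat{q}_{None}}{n_{None}} + \frac{\hat{p}_{Surg} \hat{q}_{Surg}}{n_{Surg}}} = \left(\frac{31}{348} - \frac{16}{347}\right) \pm 1.960 \sqrt{\frac{\left(\frac{31}{348}\right)\left(\frac{317}{348}\right)}{348} + \frac{\left(\frac{16}{347}\right)\left(\frac{331}{347}\right)}{347}} = (0.006, 0.080)
\]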

We are 95% confident that the proportion of patients who die from prostate cancer after having no surgery is between 0.6% and 8.0% higher than the proportion of patients who die after having surgery.

c) Since 0 is not contained in the interval, there is evidence that surgery may be effective in preventing death from prostate cancer.

8. Race and smoking.

a) Randomization condition: Assume that the survey was conducted randomly. 10% condition: 550 white adults and 550 black adults are both less than 10% of all adults. Independent samples condition: The samples are independent of one another. Success/Failure condition: $n\hat{p}$ (white) = (550)(0.248) = 136, $n\hat{q}$ (white) = (550)(0.752) = 414, $n\hat{p}$ (black) = (550)(0.257) = 141, and $n\hat{q}$ (black) = (550)(0.743) = 409 are all greater than 10, so the samples are both large enough.


Since the conditions have been satisfied, we will find a two-proportion z-interval.

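\[
(\hat{p}_{Black} - \hat{p}_{White}) \pm z^* \sqrt{\frac{\hat{p}_{Black} \hat{q}_{Black}}{n_{Black}} + \frac{\hat{p}_{White} \hat{q}_{White}}{n_{White}}} = (0.257 - 0.248) \pm 1.645 \sqrt{\frac{(0.257)(0.743)}{550} + \frac{(0.248)(0.752)}{550}} = (-0.034, 0.052)
\]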

We are 90% confident that the proportion of black smokers is between 3.4% lower and 5.2% higher than the proportion of white smokers.

b) $H_0$: The proportion of black smokers is the same as the proportion of white smokers. ($p_{Black} = p_{White}$ or $p_{Black} - p_{White} = 0$)

$H_A$: The proportion of black smokers is different from the proportion of white smokers. ($p_{Black} \neq p_{White}$ or $p_{Black} - p_{White} \neq 0$)

Since 0 is contained within the confidence interval, we fail to reject the null hypothesis. There is no evidence of a race-based difference in smoking percentages.
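A 90% interval that contains 0 corresponds to failing to reject a two-sided null hypothesis at the 10% significance level. The sketch below recomputes the interval from the reported percentages and checks whether 0 falls inside it. The variable names are illustrative assumptions, and the critical value comes from the standard-library `NormalDist` rather than a table.

```python
from math import sqrt
from statistics import NormalDist

# Exercise 8: reported smoking proportions, 550 adults per group.
p_black, p_white, n = 0.257, 0.248, 550
conf = 0.90

se = sqrt(p_black * (1 - p_black) / n + p_white * (1 - p_white) / n)
z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # about 1.645 for 90%
diff = p_black - p_white
low, high = diff - z_star * se, diff + z_star * se

# If 0 lies inside the interval, a two-sided test at alpha = 1 - conf fails to reject H0.
contains_zero = low <= 0 <= high
print(f"z* = {z_star:.3f}, interval = ({low:.3f}, {high:.3f}), contains 0: {contains_zero}")
```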

9. Politics.

a) The margin of error is larger for the difference in proportions of support because differences have larger standard errors than single samples.

b) The confidence interval for difference in support was 2% ± 4%, or (-2%, 6%). Since the interval contains 0, there is no evidence of a difference in the proportion of support for antiterrorist legislation.

10. War.

a) The poll estimated that the difference in support between Republicans and Democrats was 75% - 30%, or 45% higher for the Republicans.

b) The margin of error for the difference would be greater than 3%, the margin of error for the single sample. Differences have standard errors that are larger than standard errors for single samples.
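A rough worked version of the reasoning in part b, assuming the two groups are independent samples evaluated at the same confidence level. The subscripts R and D (Republicans, Democrats) are notation introduced here, and the 3% margin for each group is a hypothetical illustration, not a value reported by the poll:

\[
ME(\hat{p}_R - \hat{p}_D) = \sqrt{ME(\hat{p}_R)^2 + ME(\hat{p}_D)^2} \ge \max\{ME(\hat{p}_R),\, ME(\hat{p}_D)\}
\]

\[
\text{For example, } \sqrt{(3\%)^2 + (3\%)^2} \approx 4.2\% > 3\%.
\]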

11. Teen smoking, part I.

a) This is a prospective observational study.

b) $H_0$: The proportion of teen smokers among the group whose parents disapprove of smoking is the same as the proportion of teen smokers among the group whose parents are lenient about smoking. ($p_{Dis} = p_{Len}$ or $p_{Dis} - p_{Len} = 0$)

$H_A$: The proportion of teen smokers among the group whose parents disapprove of smoking is lower than the proportion of teen smokers among the group whose parents are lenient about smoking. ($p_{Dis} < p_{Len}$ or $p_{Dis} - p_{Len} < 0$)

c) Randomization condition: Assume that the teens surveyed are representative of all teens. 10% condition: 284 and 41 are both less than 10% of all teens. Independent samples condition: The groups were surveyed independently. Success/Failure condition: $n\hat{p}$ (disapprove) = 54, $n\hat{q}$ (disapprove) = 230, $n\hat{p}$ (lenient) = 11, and $n\hat{q}$ (lenient) = 30 are all greater than 10, so the samples are both large enough.


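Since the conditions have been satisfied, we will model the sampling distribution of the difference in proportion with a Normal model with mean 0 and standard deviation estimated by

\[
SE_{pooled}(\hat{p}_{Dis} - \hat{p}_{Len}) = \sqrt{\frac{\hat{p}_{pooled} \hat{q}_{pooled}}{n_{Dis}} + \frac{\hat{p}_{pooled} \hat{q}_{pooled}}{n_{Len}}} = \sqrt{\frac{\left(\frac{65}{325}\right)\left(\frac{260}{325}\right)}{284} + \frac{\left(\frac{65}{325}\right)\left(\frac{260}{325}\right)}{41}} = 0.0668.
\]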

d) The observed difference between the proportions is 0.190 - 0.268 = -0.078.

Since the P-value = 0.1211 is high, we fail to reject the null hypothesis. There is little evidence to suggest that parental attitudes influence teens' decisions to smoke.

e) If there is no difference in the proportions, there is about a 12% chance of seeing the observed difference or larger by natural sampling variation.

f) If teens' decisions about smoking are influenced, we have committed a Type II error.
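The pooled standard error in part c and the P-value in part d can be reproduced from the counts. This is a minimal sketch of a one-sided pooled two-proportion z-test. The function name `pooled_z_test_lower` is an assumption, and the lower-tail area is taken from the standard-library `NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

def pooled_z_test_lower(x1, n1, x2, n2):
    """One-sided test of H0: p1 = p2 against HA: p1 < p2, using the pooled SE."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)               # combined success proportion
    se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se_pool
    p_value = NormalDist().cdf(z)                # lower-tail area
    return se_pool, z, p_value

# Exercise 11: 54 of 284 teens (parents disapprove), 11 of 41 teens (parents lenient).
se_pool, z, p_value = pooled_z_test_lower(54, 284, 11, 41)
print(f"SE_pooled = {se_pool:.4f}, z = {z:.2f}, P-value = {p_value:.4f}")
```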

12. Depression.

a) This is a prospective observational study.

b) $H_0$: The proportion of cardiac patients without depression who died within the 4 years is the same as the proportion of cardiac patients with depression who died during the same time period. ($p_{None} = p_{Dep}$ or $p_{None} - p_{Dep} = 0$)

$H_A$: The proportion of cardiac patients without depression who died within the 4 years is less than the proportion of cardiac patients with depression who died during the same time period. ($p_{None} < p_{Dep}$ or $p_{None} - p_{Dep} < 0$)

c) Randomization condition: Assume that the cardiac patients followed by the study are representative of all cardiac patients. 10% condition: 361 and 89 are both less than 10% of all cardiac patients. Independent samples condition: The groups are not associated. Success/Failure condition: $n\hat{p}$ (no depression) = 67, $n\hat{q}$ (no depression) = 294, $n\hat{p}$ (depression) = 26, and $n\hat{q}$ (depression) = 63 are all greater than 10, so the samples are both large enough.

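Since the conditions have been satisfied, we will model the sampling distribution of the difference in proportion with a Normal model with mean 0 and standard deviation estimated by

\[
SE_{pooled}(\hat{p}_{None} - \hat{p}_{Dep}) = \sqrt{\frac{\hat{p}_{pooled} \hat{q}_{pooled}}{n_{None}} + \frac{\hat{p}_{pooled} \hat{q}_{pooled}}{n_{Dep}}} = \sqrt{\frac{\left(\frac{93}{450}\right)\left(\frac{357}{450}\right)}{361} + \frac{\left(\frac{93}{450}\right)\left(\frac{357}{450}\right)}{89}} \approx 0.0479.
\]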


d) The observed difference between the proportions is: 0.1856 - 0.2921 = -0.1065.

Since the P-value = 0.0131 is low, we reject the null hypothesis. There is strong evidence to suggest that the proportion of non-depressed cardiac patients who die within 4 years is less than the proportion of depressed cardiac patients who die within 4 years.

e) If there is no difference in the proportions, we will see an observed difference this large or larger only about 1.3% of the time by natural sampling variation.

f) If cardiac patients without depression don't actually have a lower proportion of deaths in 4 years than cardiac patients with depression, then we have committed a Type I error.
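The P-value in part d follows from standardizing the observed difference with the pooled standard error from part c. A sketch of that arithmetic, using the rounded values reported in those parts:

\[
z = \frac{(\hat{p}_{None} - \hat{p}_{Dep}) - 0}{SE_{pooled}(\hat{p}_{None} - \hat{p}_{Dep})} = \frac{-0.1065}{0.0479} \approx -2.22, \qquad P(z < -2.22) \approx 0.013
\]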

13. Teen smoking, part II.

a) Since the conditions have already been satisfied in Exercise 11, we will find a two-proportion z-interval.

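\[
(\hat{p}_{Dis} - \hat{p}_{Len}) \pm z^* \sqrt{\frac{\hat{p}_{Dis} \hat{q}_{Dis}}{n_{Dis}} + \frac{\hat{p}_{Len} \hat{q}_{Len}}{n_{Len}}} = \left(\frac{54}{284} - \frac{11}{41}\right) \pm 1.960 \sqrt{\frac{\left(\frac{54}{284}\right)\left(\frac{230}{284}\right)}{284} + \frac{\left(\frac{11}{41}\right)\left(\frac{30}{41}\right)}{41}} = (-0.221, 0.065)
\]

b) We are 95% confident that the proportion of teens whose parents disapprove of smoking who will eventually smoke is between 22.1% less and 6.5% more than for teens with parents who are lenient about smoking.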

c) We expect 95% of random samples of this size to produce intervals that contain the true difference between the proportions.
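The long-run meaning of "95% confident" in part c can be illustrated by simulation. The sketch below is an illustration under stated assumptions: the population proportions 0.19 and 0.27 are hypothetical values chosen near the sample results, the helper name `interval_covers` is invented for this example, and the seed is fixed only so the run is repeatable. With a group as small as 41, the observed coverage is approximate and may fall a little below 95%.

```python
import random
from math import sqrt

def interval_covers(p1, n1, p2, n2, rng, z_star=1.960):
    """Draw one pair of samples; report whether the z-interval covers p1 - p2."""
    x1 = sum(rng.random() < p1 for _ in range(n1))   # simulated successes, group 1
    x2 = sum(rng.random() < p2 for _ in range(n2))   # simulated successes, group 2
    ph1, ph2 = x1 / n1, x2 / n2
    se = sqrt(ph1 * (1 - ph1) / n1 + ph2 * (1 - ph2) / n2)
    diff = ph1 - ph2
    return diff - z_star * se <= p1 - p2 <= diff + z_star * se

rng = random.Random(0)           # fixed seed so the run is repeatable
true_p1, true_p2 = 0.19, 0.27    # hypothetical population proportions
trials = 2000
hits = sum(interval_covers(true_p1, 284, true_p2, 41, rng) for _ in range(trials))
coverage = hits / trials
print(f"coverage = {coverage:.3f}")
```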

14. Depression revisited.

a) Since the conditions have already been satisfied in Exercise 12, we will find a two-proportion z-interval.

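\[
(\hat{p}_{Dep} - \hat{p}_{None}) \pm z^* \sqrt{\frac{\hat{p}_{Dep} \hat{q}_{Dep}}{n_{Dep}} + \frac{\hat{p}_{None} \hat{q}_{None}}{n_{None}}} = \left(\frac{26}{89} - \frac{67}{361}\right) \pm 1.960 \sqrt{\frac{\left(\frac{26}{89}\right)\left(\frac{63}{89}\right)}{89} + \frac{\left(\frac{67}{361}\right)\left(\frac{294}{361}\right)}{361}} = (0.004, 0.209)
\]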

b) We are 95% confident that the proportion of cardiac disease patients who die within 4 years is between 0.4% and 20.9% higher for depressed patients than for non-depressed patients.

c) We expect 95% of random samples of this size to produce intervals that contain the true difference between the proportions.


15. Pregnancy.

a) $H_0$: The proportion of live births is the same for women under the age of 38 as it is for women 38 or older. p ................