Statistics and Probability - MSU - Department of ...



Chapter 22 Comparing two proportions

Two populations, two unknown proportions p1 and p2

Problems

• Estimate the difference p1 - p2

• Test H0: p1 = p2

Samples: Two independent, large samples of sizes n1, n2

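Sample proportions: $\hat{p}_1$, $\hat{p}_2$

• Point estimate of p1 - p2 is $\hat{p}_1 - \hat{p}_2$

• If n1, n2 are large, then $\hat{p}_1 - \hat{p}_2$ is approximately normal with mean p1 - p2

• Standard deviation of $\hat{p}_1 - \hat{p}_2$ is

$SD(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$, where q1 = 1 - p1 and q2 = 1 - p2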

Two-proportion z-interval

Assumptions

1. Random samples, each with independent observations

2. Independent samples

3. If sampling without replacement, the sample size n should be no more than 10% of the population.

4. "Large" samples (n1p1 > 10, n1q1>10, n2p2 > 10, n2q2 >10)
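The "large samples" condition can be checked mechanically. The sketch below is only an illustration: the function name `large_sample_ok` is hypothetical, and it assumes the sample proportions are plugged in for the unknown p1 and p2, which is the usual practice when the true values are not available.

```python
def large_sample_ok(n1, p1_hat, n2, p2_hat, threshold=10):
    """Return True when all four expected counts exceed the threshold.

    Assumption: sample proportions stand in for the unknown p1, p2.
    """
    counts = [
        n1 * p1_hat,        # successes in sample 1
        n1 * (1 - p1_hat),  # failures in sample 1
        n2 * p2_hat,        # successes in sample 2
        n2 * (1 - p2_hat),  # failures in sample 2
    ]
    return all(count > threshold for count in counts)


# Sample sizes and proportions from the high school graduation example.
print(large_sample_ok(12460, 0.849, 12678, 0.881))

# A hypothetical small study: only about 3 successes in the first sample.
print(large_sample_ok(30, 0.10, 40, 0.50))
```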

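Standard Error: $SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$, where $\hat{q}_1 = 1 - \hat{p}_1$ and $\hat{q}_2 = 1 - \hat{p}_2$

C% Margin of Error: $ME = z^* \times SE(\hat{p}_1 - \hat{p}_2)$

where z* is the critical value of the standard normal distribution that corresponds to the C% confidence level

A C% confidence interval for a difference p1 - p2 is

$(\hat{p}_1 - \hat{p}_2) \pm z^* \times SE(\hat{p}_1 - \hat{p}_2)$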
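The interval can be sketched in a few lines of Python using only the standard library. The function name `two_proportion_ci` and the sample values in the usage lines are hypothetical and chosen only for illustration; the critical value z* comes from `statistics.NormalDist().inv_cdf`.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(p1_hat, n1, p2_hat, n2, confidence=0.95):
    """Return (low, high) of the two-proportion z-interval for p1 - p2."""
    # z* leaves (1 - confidence) / 2 in each tail of the standard normal.
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Unpooled standard error of the difference in sample proportions.
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    margin = z_star * se
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin


# Hypothetical data: 60% of 200 versus 50% of 250.
low, high = two_proportion_ci(0.60, 200, 0.50, 250)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

If the interval excludes 0, the data suggest that the two population proportions differ at the chosen confidence level.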

Example: In 2000 researchers contacted 25,138 Americans aged 24 years to see if they had finished high school;

84.9% of the 12,460 males and

88.1% of the 12,678 females

indicated that they had a high school diploma. Create a 95% confidence interval for the difference in graduation rates between males and females.

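Data

n1 = 12,460, $\hat{p}_1 = 0.849$ (males); n2 = 12,678, $\hat{p}_2 = 0.881$ (females)

Standard Error: $SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{(0.849)(0.151)}{12460} + \frac{(0.881)(0.119)}{12678}} \approx 0.0043$

Critical value: z* = 1.96

95% Margin of Error: $ME = 1.96 \times 0.0043 \approx 0.008$

The 95% confidence interval for the difference p1 - p2 is

$(0.849 - 0.881) \pm 0.008$

Answer: -0.032 ± 0.008, or (-0.040, -0.024)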
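The arithmetic of this example can be re-checked with a short script. The variable names are illustrative; the sample sizes, proportions, and critical value are the ones given in the example.

```python
from math import sqrt

# Data from the example: males are sample 1, females are sample 2.
n1, p1_hat = 12460, 0.849
n2, p2_hat = 12678, 0.881
z_star = 1.96  # critical value for 95% confidence

# Unpooled standard error of the difference in sample proportions.
se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
margin = z_star * se
diff = p1_hat - p2_hat

print(f"SE = {se:.4f}")
print(f"ME = {margin:.3f}")
print(f"CI = ({diff - margin:.3f}, {diff + margin:.3f})")
```

The whole interval lies below 0, which is consistent with a lower graduation rate among males than among females in this sample.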

Two-proportion z-test

Assumptions

1. Random samples, each with independent observations

2. Independent samples

3. If sampling without replacement, the sample size n should be no more than 10% of the population.

4. "Large" samples (n1p1 > 10, n1q1 > 10, n2p2 > 10, n2q2 > 10)

Hypotheses:

1. Null hypothesis H0: p1 = p2, that is, H0: p1 - p2 = 0

2. Alternative hypothesis

HA: p1 > p2 or HA: p1 < p2 or HA: p1 ≠ p2 that is

HA: p1- p2 > 0 or HA: p1 - p2 < 0 or HA: p1- p2 ≠ 0

Attitude: Assume that the null hypothesis H0 is true and uphold it unless the data speak strongly against it.

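To estimate the common p = p1 = p2 we combine (pool) the two samples together

$\hat{p}_{pooled} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$

and use it to estimate the standard deviation of $\hat{p}_1 - \hat{p}_2$

Pooled standard error of $\hat{p}_1 - \hat{p}_2$: $SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\hat{p}_{pooled} \hat{q}_{pooled} \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$, where $\hat{q}_{pooled} = 1 - \hat{p}_{pooled}$

Test statistic: $z = \frac{\hat{p}_1 - \hat{p}_2}{SE_{pooled}(\hat{p}_1 - \hat{p}_2)}$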

Distribution under H0: approximately standard normal
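The pooled test statistic can be sketched as a small function. The name `pooled_z_statistic` and the counts in the usage line are hypothetical; the function takes success counts x1 and x2 (a notation not used in these notes) so that the pooled proportion is simply the total number of successes over the total sample size.

```python
from math import sqrt


def pooled_z_statistic(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, given success counts x1, x2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Under H0 both samples estimate one common p, so pool them.
    p_pooled = (x1 + x2) / (n1 + n2)
    se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se_pooled


# Hypothetical data: 45 successes out of 100 versus 30 out of 100.
z_obs = pooled_z_statistic(45, 100, 30, 100)
print(f"z = {z_obs:.2f}")
```

Swapping the two samples only changes the sign of z, so the labelling of the groups has to match the direction of HA.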

P-value: Let zo be the observed value of the test statistic. The way we compute it depends on HA

|HA |P-value |

|HA: p1 > p2 |P(z > zo) |

|HA: p1 < p2 |P(z < zo) |

|HA: p1 ≠ p2 |P(z > |zo|) + P(z < -|zo|) |
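The three cases in the table translate directly into code. In this sketch the function name `z_test_p_value` and the labels "greater", "less", and "two-sided" are hypothetical conventions, not part of the notes; the standard normal CDF comes from Python's `statistics.NormalDist`.

```python
from statistics import NormalDist


def z_test_p_value(z_obs, alternative):
    """P-value of a z-test for the given alternative hypothesis."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":    # HA: p1 > p2
        return 1 - std_normal.cdf(z_obs)
    if alternative == "less":       # HA: p1 < p2
        return std_normal.cdf(z_obs)
    if alternative == "two-sided":  # HA: p1 != p2
        # The two tails are equal by symmetry.
        return 2 * (1 - std_normal.cdf(abs(z_obs)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


for alt in ("greater", "less", "two-sided"):
    print(alt, round(z_test_p_value(1.5, alt), 4))
```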

Example.

Of 995 respondents, 37% reported they snored at least a few nights a week. Split into two age categories, 26% of the 184 people under 30 snored, compared with 39% of the 811 in the older group. Is this difference real (statistically significant) or due only to natural fluctuations? Use α = 0.05.

Assumptions

1. Random samples, each with independent observations

2. Independent samples

3. If sampling without replacement, the sample size n should be no more than 10% of the population.

4. "Large" samples (n1p1 > 10, n1q1 > 10, n2p2 > 10, n2q2 > 10)

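Data: n1 = 184, $\hat{p}_1 = 0.26$ (under 30); n2 = 811, $\hat{p}_2 = 0.39$ (older group)

Hypotheses: H0: p1 = p2 (H0: p1 - p2 = 0)

HA: p1 < p2 (HA: p1 - p2 < 0)

Estimate of the common p = p1 = p2

$\hat{p}_{pooled} = \frac{184(0.26) + 811(0.39)}{184 + 811} \approx 0.37$

Pooled standard error of $\hat{p}_1 - \hat{p}_2$: $SE_{pooled} = \sqrt{(0.37)(0.63)\left(\frac{1}{184} + \frac{1}{811}\right)} \approx 0.039$

Test statistic: $z = \frac{0.26 - 0.39}{0.039} \approx -3.3$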
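The snoring example can be worked through numerically as a check. This sketch starts from the rounded percentages quoted in the example (26% and 39%), so the results may differ slightly from a calculation based on the exact counts, which the notes do not give. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Data from the example: under 30 is sample 1, the older group is sample 2.
n1, p1_hat = 184, 0.26
n2, p2_hat = 811, 0.39

# Pooled estimate of the common proportion under H0: p1 = p2.
p_pooled = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

z_obs = (p1_hat - p2_hat) / se_pooled
# HA: p1 < p2, so the P-value is the lower tail P(z < z_obs).
p_value = NormalDist().cdf(z_obs)

print(f"pooled p = {p_pooled:.3f}")
print(f"pooled SE = {se_pooled:.4f}")
print(f"z = {z_obs:.2f}")
print(f"P-value = {p_value:.4f}")
```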

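P-value: P(z < zo) = P(z < -3.3) ≈ 0.0005

Since the P-value is below α = 0.05, we reject H0: the difference is statistically significant.

P-value for the one-sample t-test: Let to be the observed value of the test statistic. The way we compute it depends on HA

|HA |P-value |

|HA: μ > μ0 |P(t > to) |

|HA: μ < μ0 |P(t < to) |

|HA: μ ≠ μ0 |P(t > |to|) + P(t < -|to|) |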

Example - cont.

Below is the speed of vehicles recorded on Triphammer Road:

[data: recorded vehicle speeds, not reproduced]

Test whether the data provides evidence that the mean speed of vehicles on Triphammer Road exceeds 30 mph.

n = 23 (small)

The histogram is symmetric, so we assume a normal model.

We use a one-sample t-test.

Hypotheses: H0: μ = 30 vs. HA: μ > 30

Test: one-sample t-test

Standard error: $SE(\bar{y}) = s/\sqrt{n}$, where s is the sample standard deviation

Test statistic: $t = \frac{\bar{y} - 30}{SE(\bar{y})} \approx 1.13$

Degrees of freedom: df = n - 1 = 22

P-value: bigger than 0.10

TI-83: tcdf(1.13, 1E99, 22) = 0.14

Fail to reject H0 even at α = 0.10
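Without a TI-83, the same upper-tail probability can be approximated with the Python standard library. The sketch below is not the calculator's algorithm: it integrates the Student t density numerically with Simpson's rule, and the names `t_pdf` and `t_upper_tail` are hypothetical helpers.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t_obs, df, steps=2000):
    """Approximate P(T > t_obs) using Simpson's rule on [0, |t_obs|]."""
    if t_obs < 0:
        # By symmetry, P(T > -a) = 1 - P(T > a).
        return 1 - t_upper_tail(-t_obs, df, steps)
    h = t_obs / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t_obs, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # Half of the probability lies above 0; subtract the part up to t_obs.
    return 0.5 - area_0_to_t


p_value = t_upper_tail(1.13, 22)
print(f"P(t > 1.13) with df = 22: {p_value:.3f}")
```

The result lies between 0.10 and 0.15, in line with the TI-83 value quoted above, so H0 is not rejected at α = 0.10.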
