Hypothesis Testing:



Handout 8 (Chapter 9): Inferences Based on Two Samples

Population characteristics (parameter) Sample characteristics (statistics)

[pic] [pic]

[pic] [pic]

[pic]/[pic] [pic]/[pic]

1. One-sided (one-tailed) test:

A. Lower tailed (left-sided):

H0: population characteristic $\ge$ claimed constant value

Ha: population characteristic $<$ claimed constant value

B. Upper tailed (right-sided):

H0: population characteristic $\le$ claimed constant value

Ha: population characteristic $>$ claimed constant value

2. Two-sided (two-tailed) test:

H0: population characteristic $=$ claimed constant value

Ha: population characteristic $\ne$ claimed constant value

A. Independent Samples

I. Population characteristics: Difference between two population means, $\mu_1 - \mu_2$.

$\Delta_0$ is the claimed constant.

$\mu_1$ and $\mu_2$ are the population means for X's and Y's, respectively.

$\bar{x}$ and $\bar{y}$ are the sample means for X's and Y's, respectively.

m and n are the sample sizes for X's and Y's, respectively.

$\sigma_1^2$ and $\sigma_2^2$ are the population variances for X's and Y's, respectively.

$s_1^2$ and $s_2^2$ are the sample variances for X's and Y's, respectively.

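Test statistics:

• $z = \dfrac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\sigma_1^2/m + \sigma_2^2/n}}$ when both population distributions are normal and $\sigma_1^2$, $\sigma_2^2$ are known.

• $z = \dfrac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{s_1^2/m + s_2^2/n}}$ when both sample sizes are large (m > 40 and n > 40) and $\sigma_1^2$, $\sigma_2^2$ are unknown.

• $t = \dfrac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{s_1^2/m + s_2^2/n}}$ when both population distributions are normal, at least one sample size is small, and the unknown $\sigma_1^2$, $\sigma_2^2$ are assumed to be different ($\sigma_1^2 \ne \sigma_2^2$). The degrees of freedom, $v = \dfrac{(s_1^2/m + s_2^2/n)^2}{\dfrac{(s_1^2/m)^2}{m-1} + \dfrac{(s_2^2/n)^2}{n-1}}$, are used to look up the critical values. If v is not an integer, it should be rounded down. Your textbook calls this the two-sample t test.

• $t = \dfrac{\bar{x} - \bar{y} - \Delta_0}{s_p\sqrt{1/m + 1/n}}$ with $s_p^2 = \dfrac{(m-1)s_1^2 + (n-1)s_2^2}{m+n-2}$ when both population distributions are normal, at least one sample size is small, and the unknown $\sigma_1^2$, $\sigma_2^2$ are assumed to be the same ($\sigma_1^2 = \sigma_2^2$). The degrees of freedom, $v = m + n - 2$, are used to look up the critical values. Your textbook calls this the pooled t test.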
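The four statistics above differ only in which variances go into the standard error and in how the degrees of freedom are obtained. The following is a minimal Python sketch of them in their standard textbook form; the function names and the numbers in the usage lines are hypothetical and are not taken from the handout.

```python
import math

def mean_difference_statistic(xbar, ybar, var1, var2, m, n, delta0=0.0):
    """(xbar - ybar - delta0) / sqrt(var1/m + var2/n).

    Pass population variances for the known-variance z test, or sample
    variances for the large-sample z test and the two-sample t test.
    """
    return (xbar - ybar - delta0) / math.sqrt(var1 / m + var2 / n)

def welch_df(s1_sq, s2_sq, m, n):
    """Degrees of freedom for the two-sample t test, rounded down."""
    a, b = s1_sq / m, s2_sq / n
    v = (a + b) ** 2 / (a ** 2 / (m - 1) + b ** 2 / (n - 1))
    return math.floor(v)

def pooled_t_statistic(xbar, ybar, s1_sq, s2_sq, m, n, delta0=0.0):
    """Pooled t statistic and its degrees of freedom, m + n - 2."""
    sp_sq = ((m - 1) * s1_sq + (n - 1) * s2_sq) / (m + n - 2)
    t = (xbar - ybar - delta0) / math.sqrt(sp_sq * (1 / m + 1 / n))
    return t, m + n - 2

# Hypothetical summary values, for illustration only.
print(mean_difference_statistic(10.0, 8.0, 4.0, 9.0, 5, 6))
print(welch_df(4.0, 9.0, 5, 6))
print(pooled_t_statistic(10.0, 8.0, 4.0, 9.0, 5, 6))
```

Critical values for the two t statistics still have to be looked up in a t table, since the Python standard library has no t distribution.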

Decision can be made in one of the two ways in hypothesis testing:

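a. Let z* or t* be the computed test statistic value.

| |if the test statistic is z |if the test statistic is t |

|Lower tailed test |P-value = P(z < z*) |P-value = P(t < t*) |

|Upper tailed test |P-value = P(z > z*) |P-value = P(t > t*) |

|Two-tailed test |P-value = 2P(z > |z*|) = 2P(z < -|z*|) |P-value = 2P(t > |t*|) = 2P(t < -|t*|) |

In each case, you can reject H0 if P-value $\le \alpha$ and fail to reject H0 (accept H0) if P-value > $\alpha$.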

b. Rejection region for level $\alpha$ test:

| |if the test statistic is z |if the test statistic is t |

|Lower tailed test |$z \le -z_{\alpha}$ |$t \le -t_{\alpha;v}$ |

|Upper tailed test |$z \ge z_{\alpha}$ |$t \ge t_{\alpha;v}$ |

|Two-tailed test |$|z| \ge z_{\alpha/2}$ |$|t| \ge t_{\alpha/2;v}$ |
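The two decision methods always agree. A minimal Python sketch for the z case is given here, using `statistics.NormalDist` from the standard library; the function names are hypothetical. The t case would follow the same pattern with a t distribution, which the standard library does not provide.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def z_p_value(z_star, tail):
    """P-value for a computed z statistic; tail is 'lower', 'upper' or 'two'."""
    if tail == "lower":
        return STD_NORMAL.cdf(z_star)
    if tail == "upper":
        return 1.0 - STD_NORMAL.cdf(z_star)
    if tail == "two":
        return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z_star)))
    raise ValueError("tail must be 'lower', 'upper' or 'two'")

def z_rejects(z_star, tail, alpha=0.05):
    """Rejection-region decision for a level-alpha z test."""
    if tail == "lower":
        return z_star <= -STD_NORMAL.inv_cdf(1.0 - alpha)
    if tail == "upper":
        return z_star >= STD_NORMAL.inv_cdf(1.0 - alpha)
    if tail == "two":
        return abs(z_star) >= STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    raise ValueError("tail must be 'lower', 'upper' or 'two'")

# A two-tailed test with z* = 1.49 at alpha = 0.05: both methods fail to reject.
print(z_p_value(1.49, "two") <= 0.05, z_rejects(1.49, "two"))
```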

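100(1-$\alpha$)% confidence intervals with the same assumptions:

$\bar{x} - \bar{y} \pm z_{\alpha/2}\sqrt{\sigma_1^2/m + \sigma_2^2/n}$

$\bar{x} - \bar{y} \pm z_{\alpha/2}\sqrt{s_1^2/m + s_2^2/n}$

$\bar{x} - \bar{y} \pm t_{\alpha/2;v}\sqrt{s_1^2/m + s_2^2/n}$ when you assume $\sigma_1^2 \ne \sigma_2^2$ (v is the rounded-down degrees of freedom of the two-sample t test)

$\bar{x} - \bar{y} \pm t_{\alpha/2;m+n-2}\, s_p\sqrt{1/m + 1/n}$ when you assume $\sigma_1^2 = \sigma_2^2$ ($s_p^2$ is the pooled sample variance)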

Example 1: Two types of plastic are suitable for use by an electronics component manufacturer. The breaking strength of this plastic is very important. It is known that $\sigma_1 = \sigma_2 = 1$ psi. From random samples of size m = 10 and n = 12 drawn from normal distributions, we obtain $\bar{x} = 162.5$ and $\bar{y} = 155$. The company will not adopt plastic 1 unless its mean breaking strength exceeds that of plastic 2 by at least 10 psi. Based on the sample information, would they use plastic 1? Use the significance level 0.05 in reaching a decision.

H0: $\mu_1 - \mu_2 \ge 10$ (adopt plastic 1) versus Ha: $\mu_1 - \mu_2 < 10$ (do not adopt plastic 1)

Test statistic: $z = \dfrac{162.5 - 155 - 10}{\sqrt{1^2/10 + 1^2/12}} = -5.84$

Decision:

(i) Reject H0 if $z \le -z_{\alpha} = -1.645$. Since z = -5.84 < -1.645, reject H0.

(ii) P-value = P(Z < -5.84) $\approx$ 0. Since the P-value $\le \alpha = 0.05$, reject H0.

Conclusion: Do not adopt plastic 1.
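The arithmetic of Example 1 can be checked with a short Python sketch. It assumes the lower-tailed setup with claimed constant 10 psi and sample sizes 10 and 12 for plastics 1 and 2, which is the reading that reproduces the handout's z = -5.84; the variable names are illustrative.

```python
import math
from statistics import NormalDist

sigma1 = sigma2 = 1.0        # known population standard deviations (psi)
m, n = 10, 12                # sample sizes for plastic 1 and plastic 2
xbar, ybar = 162.5, 155.0    # sample mean breaking strengths
delta0 = 10.0                # claimed constant in H0
alpha = 0.05

z = (xbar - ybar - delta0) / math.sqrt(sigma1**2 / m + sigma2**2 / n)
critical = -NormalDist().inv_cdf(1 - alpha)   # lower-tail critical value
p_value = NormalDist().cdf(z)                 # P(Z < z)

print(round(z, 2), round(critical, 3), p_value <= alpha)
```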

Example 2 (Exercise 9.2): We will use the given data in this exercise.

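$\mu_i$: true average tread life for brand i of two competing brands of size P205/65R15 radial tires, i = 1, 2

Test H0: $\mu_1 - \mu_2 = 0$ versus Ha: $\mu_1 - \mu_2 \ne 0$

m = 45, $\bar{x} = 42500$, $s_1 = 2200$

n = 45, $\bar{y} = 40400$, $s_2 = 1900$

Notice that the sample sizes are large while the population variances are unknown, and that this is a two-tailed test.

Test statistic: $z = \dfrac{42500 - 40400 - 0}{\sqrt{2200^2/45 + 1900^2/45}} = 4.8462$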

Decision:

(i) Reject H0 if $z \le -z_{\alpha/2} = -1.96$ or $z \ge z_{\alpha/2} = 1.96$. Since z = 4.8462 > 1.96, reject H0.

(ii) P-value = 2P(Z > 4.8462) $\approx$ 2(0) = 0. Since the P-value $\le \alpha = 0.05$, reject H0.

Conclusion: the true average tread lives for the two competing brands of tires are different.

If you prefer to answer the question by computing a confidence interval, the 95% confidence interval is $\bar{x} - \bar{y} \pm z_{0.025}\sqrt{s_1^2/m + s_2^2/n}$ = (1250.67, 2949.33). Zero does not fall in the interval, so you would reject H0.
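A minimal Python sketch of the test statistic and the 95% interval for Example 2 follows. It assumes the Exercise 9.2 summary values $\bar{x}$ = 42500, $s_1$ = 2200, $\bar{y}$ = 40400, $s_2$ = 1900, which reproduce the z = 4.8462 and the interval quoted above; treat them as an assumption if your copy of the exercise differs.

```python
import math
from statistics import NormalDist

m = n = 45
xbar, s1 = 42500.0, 2200.0   # assumed summary values for brand 1
ybar, s2 = 40400.0, 1900.0   # assumed summary values for brand 2

se = math.sqrt(s1**2 / m + s2**2 / n)      # estimated standard error
z = (xbar - ybar - 0.0) / se               # hypothesized difference is 0
z_crit = NormalDist().inv_cdf(0.975)       # about 1.96
lower = (xbar - ybar) - z_crit * se
upper = (xbar - ybar) + z_crit * se

print(round(z, 4), round(lower, 2), round(upper, 2))
```

The endpoints differ from the handout's in the second decimal place because the sketch uses the unrounded critical value rather than 1.96.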

What would be different if this was an upper tailed test instead of two tailed test? (hypothesis, test statistics, decision, conclusion)

What would be different if this was an upper tailed test with the hypothesized value 1000 instead of two tailed test with the hypothesized value 0? (hypothesis, test statistics, decision, conclusion)

Example 3 (Exercise 9.8): The data are from tensile strength tests of two different grades of wire rod.

|Grade |Sample size |Sample mean |Population mean |Sample standard deviation |

|AISI 1064 |m = 129 |$\bar{x}$ = 107.6 |$\mu_1$ |$s_1$ = 1.3 |

|AISI 1078 |n = 129 |$\bar{y}$ = 123.6 |$\mu_2$ |$s_2$ = 2.0 |

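(a) Do the data provide compelling evidence for concluding that the true average strength for the 1078 grade exceeds that for the 1064 grade by more than 10 kg/mm²?

The sample sizes are large while the population variances are unknown. Notice that this is an upper-tailed test.

H0: $\mu_2 - \mu_1 \le 10$ versus Ha: $\mu_2 - \mu_1 > 10$, where $\mu_1$ and $\mu_2$ are the true average strengths for the 1064 and 1078 grades

Test statistic: $z = \dfrac{(123.6 - 107.6) - 10}{\sqrt{1.3^2/129 + 2.0^2/129}} = 28.57$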

Decision:

(i) Reject H0 if $z \ge z_{\alpha} = 1.645$ for $\alpha = 0.05$. Since z = 28.57 > 1.645, reject H0.

(ii) P-value = P(Z > 28.57) $\approx$ 0. Since the P-value $\le \alpha = 0.05$, reject H0.

Conclusion: the data provide compelling evidence that the true average strength for the 1078 grade exceeds that for the 1064 grade by more than 10.
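The order in which the two means are subtracted only changes the sign of the statistic and the tail of the test. A short Python sketch with the Example 3 summary values shows this; the variable names are illustrative.

```python
import math

m = n = 129
xbar, s1 = 107.6, 1.3   # AISI 1064
ybar, s2 = 123.6, 2.0   # AISI 1078

se = math.sqrt(s1**2 / m + s2**2 / n)

# Upper-tailed form: Ha says (1078 mean - 1064 mean) > 10.
z_upper = ((ybar - xbar) - 10.0) / se

# Equivalent lower-tailed form: Ha says (1064 mean - 1078 mean) < -10.
z_lower = ((xbar - ybar) - (-10.0)) / se

print(round(z_upper, 2), round(z_lower, 2))
```

Both forms lead to the same decision: z_upper is compared with 1.645 and z_lower with -1.645.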

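Or you can answer the same question using H0: $\mu_1 - \mu_2 \ge -10$ versus Ha: $\mu_1 - \mu_2 < -10$

Notice that it becomes a lower tailed test instead of an upper tailed test.

Test statistic: $z = \dfrac{(107.6 - 123.6) - (-10)}{\sqrt{1.3^2/129 + 2.0^2/129}} = -28.57$

Decision:

(i) Reject H0 if $z \le -z_{\alpha} = -1.645$ for $\alpha = 0.05$. Since z = -28.57 < -1.645, reject H0.

[...]

(ii) P-value = 2P(t > 6.166) = 2(0.0005) = 0.001. Since the P-value $\le \alpha = 0.05$, reject H0.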

Conclusion: the true average densities for two different types of brick are different.

If the population variances are assumed to be the same,

Test statistics: [pic]=[pic]=6.396 where [pic]=0.0405

Decision:

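(i) Reject H0 if $t \le -t_{\alpha/2;v} = -2.262$ or $t \ge t_{\alpha/2;v} = 2.262$, where $\alpha = 0.05$ and v = 6 + 5 - 2 = 9. Since t = 6.396 > 2.262, reject H0.

(ii) P-value = 2P(t > 6.396) [...]

[...]

|Two-tailed test |P-value = 2P(z > |z*|) = 2P(z < -|z*|) |

In each case, you can reject H0 if P-value $\le \alpha$ and fail to reject H0 (accept H0) if P-value > $\alpha$.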

b) Rejection region for level $\alpha$ test:

|Lower tailed test |$z \le -z_{\alpha}$ |

|Upper tailed test |$z \ge z_{\alpha}$ |

|Two-tailed test |$z \le -z_{\alpha/2}$ or $z \ge z_{\alpha/2}$ |

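100(1-$\alpha$)% large sample confidence interval:

$\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{m} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n}}$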

Example 5: Two different types of injection-molding machines are used to form plastic parts. A part is considered defective if it has excessive shrinkage or is discolored. Two random samples, each of size 300 are selected and 15 defective parts are found from machine 1 while 8 defective parts are found in the sample from machine 2. Is it reasonable to conclude that both machines produce the same fraction of defective parts, using the significance 0.05?

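If this analysis is done by hand:

The sample sizes are large enough to satisfy the assumptions, and it is a two-tailed test.

$\hat{p} = \dfrac{15 + 8}{300 + 300} = 0.0383$ where $\hat{p}_1 = 15/300 = 0.05$ and $\hat{p}_2 = 8/300 = 0.0267$

Test statistic: $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})(1/m + 1/n)}} = \dfrac{0.05 - 0.0267}{\sqrt{0.0383(0.9617)(1/300 + 1/300)}} = 1.49$

(i) Reject H0 if $z \le -z_{\alpha/2} = -1.96$ or $z \ge z_{\alpha/2} = 1.96$. Since -1.96 < z = 1.49 < 1.96, fail to reject H0.

(ii) P-value = 2P(Z > 1.49) = 2(0.0681) = 0.1362. Since the P-value > $\alpha = 0.05$, fail to reject H0.

Conclusion: Yes, it is reasonable to assume that both machines produce the same fraction of defective parts.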

The MINITAB output analyzing such data is

Test and CI for Two Proportions

Sample X N Sample p

1 15 300 0.050000

2 8 300 0.026667

Estimate for p(1) - p(2): 0.0233333

95% CI for p(1) - p(2): (-0.00733568, 0.0540023)

Test for p(1) - p(2) = 0 (vs not = 0): Z = 1.49 P-Value = 0.136
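The hand calculation and the interval in the output above can be reproduced with a short Python sketch. It assumes the usual conventions: the test uses the pooled proportion in the standard error, while the confidence interval uses the two separate sample proportions. Variable names are illustrative.

```python
import math
from statistics import NormalDist

x1, m = 15, 300   # defectives and sample size, machine 1
x2, n = 8, 300    # defectives and sample size, machine 2

p1_hat, p2_hat = x1 / m, x2 / n
diff = p1_hat - p2_hat

# Test statistic with the pooled proportion.
p_pooled = (x1 + x2) / (m + n)
z = diff / math.sqrt(p_pooled * (1 - p_pooled) * (1 / m + 1 / n))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# 95% confidence interval with the unpooled standard error.
se_unpooled = math.sqrt(p1_hat * (1 - p1_hat) / m + p2_hat * (1 - p2_hat) / n)
z_crit = NormalDist().inv_cdf(0.975)
ci_low, ci_high = diff - z_crit * se_unpooled, diff + z_crit * se_unpooled

print(round(z, 2), round(p_value, 3), round(ci_low, 5), round(ci_high, 5))
```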

Example 6 (Exercise 9.48(a)):

Sample sizes are large enough to satisfy the assumptions and it is a two-tailed test.

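$\hat{p} = \dfrac{63 + 75}{300 + 180} = 0.2875$ where $\hat{p}_1 = 63/300 = 0.21$ and $\hat{p}_2 = 75/180 = 0.4167$

Test statistic: $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})(1/m + 1/n)}} = \dfrac{0.21 - 0.4167}{\sqrt{0.2875(0.7125)(1/300 + 1/180)}} = -4.844$

Decision:

(i) Reject H0 if $z \le -z_{\alpha/2} = -1.96$ or $z \ge z_{\alpha/2} = 1.96$. Since |z| = 4.844 > 1.96, reject H0.

(ii) P-value = 2P(Z > 4.844) $\approx$ 2(0) = 0. Since the P-value $\le \alpha = 0.05$, reject H0.

Conclusion: the true proportions are different for the two groups of residents.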

The MINITAB output analyzing such data is

Test and CI for Two Proportions

Sample X N Sample p

1 63 300 0.210000

2 75 180 0.416667

Estimate for p(1) - p(2): -0.206667

95% CI for p(1) - p(2): (-0.292174, -0.121159)

Test for p(1) - p(2) = 0 (vs not = 0): Z = -4.74 P-Value = 0.000
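The hand calculation gives z = -4.844 while the output reports Z = -4.74. A short Python sketch suggests the gap comes from the standard error: the hand value uses the pooled proportion, and the output value is consistent with the unpooled standard error (an inference from the arithmetic, not something the output states).

```python
import math

x1, m = 63, 300
x2, n = 75, 180

p1_hat, p2_hat = x1 / m, x2 / n
diff = p1_hat - p2_hat

# Pooled standard error, as in the hand calculation.
p_pooled = (x1 + x2) / (m + n)
z_pooled = diff / math.sqrt(p_pooled * (1 - p_pooled) * (1 / m + 1 / n))

# Unpooled standard error, as used for the confidence interval.
z_unpooled = diff / math.sqrt(p1_hat * (1 - p1_hat) / m + p2_hat * (1 - p2_hat) / n)

print(round(z_pooled, 2), round(z_unpooled, 2))
```

Either version leads to the same decision here, since both values are far below -1.96.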

III. Population characteristics: Ratio of the two population variances, $\sigma_1^2/\sigma_2^2$, or standard deviations, $\sigma_1/\sigma_2$.

The X's and Y's are random samples from normal distributions.

$\sigma_1^2$ and $\sigma_2^2$ are the population variances for X's and Y's, respectively.

$s_1^2$ and $s_2^2$ are the sample variances for X's and Y's, respectively.

m and n are the sample sizes for X's and Y's, respectively.

Test statistic: $F = s_1^2/s_2^2$

Decision can be made in one of the two ways:

a) Let F* be the computed test statistic value.

|Lower tailed test |P-value = P(F < F*) |

|Upper tailed test |P-value = P(F > F*) |

|Two-tailed test |P-value = 2P(F > F*) |

In each case, you can reject H0 if P-value $\le \alpha$ and fail to reject H0 (accept H0) if P-value > $\alpha$.

b) Rejection region for level $\alpha$ test:

|Lower tailed test |$F \le F_{1-\alpha;m-1,n-1}$ |

|Upper tailed test |$F \ge F_{\alpha;m-1,n-1}$ |

|Two-tailed test |$F \le F_{1-\alpha/2;m-1,n-1}$ or $F \ge F_{\alpha/2;m-1,n-1}$ |

Notice that $F_{1-\alpha/2;m-1,n-1} = 1/F_{\alpha/2;n-1,m-1}$

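100(1-$\alpha$)% confidence interval for $\sigma_1^2/\sigma_2^2$:

$\left( \dfrac{s_1^2}{s_2^2}\cdot\dfrac{1}{F_{\alpha/2;m-1,n-1}},\ \dfrac{s_1^2}{s_2^2}\cdot F_{\alpha/2;n-1,m-1} \right)$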

Example 7 (Exercise 9.57):

(a) On the F-table, the column for 5 and the row for 8 with area 0.05 on the right give $F_{0.05;5,8} = 3.69$

(d) $F_{0.95;8,5} = 1/F_{0.05;5,8} = 1/3.69 = 0.271$

(e) The percentile is the area on the left of the value, which means the area on the right of the value is 0.01. On the F-table, look at the column for 10 and the row for 12 with area 0.01 on the right.

$P(F \le F_{0.01;10,12}) = 0.99$, so $F_{0.01;10,12} = 4.30$

(h) $P(0.177 \le F \le 4.74) = P(F_{0.99;10,5} \le F \le F_{0.05;10,5}) = 1-(0.01+0.05) = 0.94$ where $F_{0.99;10,5} = 1/F_{0.01;5,10} = 1/5.64 = 0.177$
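Parts (d) and (h) rely on the reciprocal rule for lower-tail F critical values. The arithmetic can be checked with a few lines of Python; the two table values are the ones quoted in the example, and the variable names are illustrative.

```python
# Upper-tail critical values read from the F table (quoted in the example).
f_05_5_8 = 3.69    # F_{0.05; 5, 8}
f_01_5_10 = 5.64   # F_{0.01; 5, 10}

# Lower-tail critical values: swap the degrees of freedom and take 1/F.
f_95_8_5 = 1 / f_05_5_8     # F_{0.95; 8, 5}
f_99_10_5 = 1 / f_01_5_10   # F_{0.99; 10, 5}

# Part (h): area between F_{0.99;10,5} and F_{0.05;10,5}.
prob_h = 1 - (0.01 + 0.05)

print(round(f_95_8_5, 3), round(f_99_10_5, 3), round(prob_h, 2))
```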

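Example 8: A study was performed to determine whether men and women differ in their repeatability in assembling components on printed circuit boards. Two samples of 26 men and 21 women were selected and each subject assembled the units. The two sample standard deviations of assembly time were $s_{men}$ = 0.98 min and $s_{women}$ = 1.02 min. Is there evidence to support the claim that men and women differ in repeatability for this assembly task? Use the significance level 0.02 and state any necessary assumptions about the underlying distribution of the data.

H0: $\sigma_{men}^2 = \sigma_{women}^2$

Ha: $\sigma_{men}^2 \ne \sigma_{women}^2$

$\alpha$ = 0.02, m = 26, n = 21

Test statistic: $F = s_{men}^2/s_{women}^2 = 0.98^2/1.02^2 = 0.9231$

Decision: Reject H0 if $F \le F_{1-\alpha/2;m-1,n-1} = F_{0.99;25,20} = 1/2.70 = 0.37$ or $F \ge F_{\alpha/2;m-1,n-1} = F_{0.01;25,20} = 2.84$. Since 0.37 < F = 0.9231 < 2.84, fail to reject H0.

[...]

|Two-tailed test |P-value = 2P(t > |t*|) = 2P(t < -|t*|) |

In each case, you can reject H0 if P-value $\le \alpha$ and fail to reject H0 (accept H0) if P-value > $\alpha$.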

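(b) Rejection region for level $\alpha$ test:

|Lower tailed test |$t \le -t_{\alpha;n-1}$ |

|Upper tailed test |$t \ge t_{\alpha;n-1}$ |

|Two-tailed test |$t \le -t_{\alpha/2;n-1}$ or $t \ge t_{\alpha/2;n-1}$ |

100(1-$\alpha$)% confidence interval with the same assumptions:

$\bar{d} \pm t_{\alpha/2;n-1}\, s_D/\sqrt{n}$, where $\bar{d}$ and $s_D$ are the sample mean and standard deviation of the n paired differences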

Example 10: The manager of a fleet of automobiles is testing two brands of radial tires. He assigns one tire of each brand at random to the two rear wheels of eight cars and runs the cars until the tires wear out. The descriptive statistics for the data are shown below (in kilometers). Find the 99% confidence interval on the difference in mean life. Which brand would you prefer based on this calculation? Is there an alternative method to answer this question instead of computing the confidence interval?

Variable N Mean Median StDev SE Mean Minimum Maximum Q1 Q3

Brand1 8 38479 37067 5590 1976 32100 48360 34185 43525

Brand2 8 37611 36655 5244 1854 31950 47800 33491 41214

Difference 8 868 475 1290 456.1 -805 3020 N/A N/A

|Brand1 |36925 |45300 |36240 |32100 |37210 |48360 |38200 |33500 |

|Brand2 |34318 |42280 |35500 |31950 |38015 |47800 |37810 |33215 |

|Difference |2607 |3020 |740 |150 |-805 |560 |390 |285 |

The 99% confidence interval for $\mu_D$ is $\bar{d} \pm t_{0.005;7}\, s_D/\sqrt{n} = 868 \pm 3.499(1290/\sqrt{8})$ = (-727.84, 2463.84)

The interval contains zero, so the confidence interval only tells us that the data show no significant difference in mean life between the two brands.
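The paired interval can be recomputed from the eight pairs of tire lives. The sketch below is a minimal Python version; the critical value 3.499 is the t-table value for area 0.005 with 7 degrees of freedom, entered by hand because the standard library has no t distribution.

```python
import math
from statistics import mean, stdev

brand1 = [36925, 45300, 36240, 32100, 37210, 48360, 38200, 33500]
brand2 = [34318, 42280, 35500, 31950, 38015, 47800, 37810, 33215]

diffs = [a - b for a, b in zip(brand1, brand2)]   # one difference per car
n = len(diffs)
d_bar = mean(diffs)      # sample mean of the differences
s_d = stdev(diffs)       # sample standard deviation of the differences

t_crit = 3.499           # t table: area 0.005 to the right, 7 degrees of freedom
half_width = t_crit * s_d / math.sqrt(n)
ci = (d_bar - half_width, d_bar + half_width)

print(round(d_bar, 1), round(s_d, 1), [round(v, 1) for v in ci])
```

The endpoints differ slightly from the hand calculation because the sketch does not round the mean and standard deviation before use.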

The MINITAB output analyzing such data is

N Mean StDev SE Mean

B1 8 38479 5590 1976

B2 8 37611 5244 1854

Difference 8 868 1290 456

Test for equal true means in independent samples assuming equal true variances

Difference = mu B1 - mu B2

Estimate for difference: 868

95% CI for difference: (-4944, 6681)

T-Test of difference = 0 (vs not =): T-Value = 0.32 P-Value = 0.753 DF = 14

Both use Pooled StDev = 5420

Test for equal true means in independent samples assuming unequal true variances

Difference = mu B1 - mu B2

Estimate for difference: 868

95% CI for difference: (-4986, 6723)

T-Test of difference = 0 (vs not =): T-Value = 0.32 P-Value = 0.754 DF = 13

Test for equal true means in dependent samples

95% CI for mean difference: (-211, 1948)

99% CI for mean difference: (-728, 2465)

T-Test of mean difference = 0 (vs not = 0): T-Value = 1.90 P-Value = 0.099

Test for equal variances: B1 versus B2

F-Test (normal distribution)

Test Statistic: 1.136

P-Value : 0.870
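Several numbers in the output above can be reproduced from the summary statistics alone, which makes the contrast between the independent-sample and paired analyses easier to see. This is a minimal Python sketch using the rounded values printed in the output; variable names are illustrative.

```python
import math

n1 = n2 = 8
mean_diff = 868.0                        # estimate for the difference
s1, s2, s_d = 5590.0, 5244.0, 1290.0     # StDev of B1, B2 and the differences

# Pooled analysis (independent samples, equal variances assumed).
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
t_pooled = mean_diff / (sp * math.sqrt(1 / n1 + 1 / n2))

# Two-sample t analysis (unequal variances): degrees of freedom, rounded down.
a, b = s1**2 / n1, s2**2 / n2
df_unequal = math.floor((a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1)))

# Paired analysis (dependent samples).
t_paired = mean_diff / (s_d / math.sqrt(n1))

print(round(sp), round(t_pooled, 2), df_unequal, round(t_paired, 2))
```

Because the tires were paired on the same cars, the differences vary much less than the individual tire lives, so the paired statistic is far larger than the independent-sample one.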

Example 11: An experiment to compare the yield (kg/ha) of Sundance winter wheat and Manitou spring wheat is considered. Data from nine different plots are given in the following table. Is there sufficient evidence to conclude that the true average yield for the Sundance winter wheat exceeds that of the Manitou spring wheat by more than 500 kg/ha? Check the plausibility of any assumptions needed to carry out an appropriate test of hypothesis.

|Plot |1 |2 |3 |4 |5 |6 |7 |8 |9 |

|S |3201 |3095 |3297 |3644 |3604 |2860 |3470 |2042 |3689 |

|M |2386 |2011 |2616 |3094 |3069 |2074 |2308 |1525 |2779 |

|D=S-M |815 |1084 |681 |550 |535 |786 |1162 |517 |910 |

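H0: $\mu_D \le 500$ or H0: $\mu_S - \mu_M \le 500$

Ha: $\mu_D > 500$ or Ha: $\mu_S - \mu_M > 500$

The distribution of the differences should be normal.

Use the differences to compute the sample mean, $\bar{d}$ = 782.2222, and the standard deviation, $s_D$ = 236.736.

Test statistic: $t = \dfrac{\bar{d} - 500}{s_D/\sqrt{n}} = \dfrac{782.2222 - 500}{236.736/\sqrt{9}} = 3.5764$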

Decision:

(i) Since this is an upper tailed test, reject H0 if $t \ge t_{\alpha;n-1} = t_{0.05;8} = 1.86$ or if the P-value = P(t > 3.5764) is less than $\alpha = 0.05$. Notice that the test statistic, 3.5764, is larger than 1.86.

(ii) 0.001 < P-value [...]

[...] T-Test of mean difference = 500 (vs > 500): T-Value = 3.58 P-Value = 0.004
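The paired statistic for Example 11 (3.5764 by hand, 3.58 in the output) can be recomputed from the nine plots with a short Python sketch. The critical value 1.86 is the t-table value quoted in the decision step; variable names are illustrative.

```python
import math
from statistics import mean, stdev

sundance = [3201, 3095, 3297, 3644, 3604, 2860, 3470, 2042, 3689]
manitou = [2386, 2011, 2616, 3094, 3069, 2074, 2308, 1525, 2779]

diffs = [s - m for s, m in zip(sundance, manitou)]   # D = S - M per plot
n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)

delta0 = 500.0                                   # claimed constant in H0
t = (d_bar - delta0) / (s_d / math.sqrt(n))      # paired t statistic, n - 1 df

t_crit = 1.86                                    # t table: area 0.05, 8 df
print(round(d_bar, 4), round(s_d, 3), round(t, 4), t >= t_crit)
```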

More to discuss

Two-sample T for muS vs muM

N Mean StDev SE Mean

S 9 3211 518 173

M 9 2429 518 173

Difference = muS - muM

Estimate for difference: 782

95% CI for difference: (265, 1300)

T-Test of difference = 0 (vs not =): T-Value = 3.20 P-Value = 0.006 DF = 16

Both use Pooled StDev = 518

F-Test (normal distribution)

Test Statistic: 1.002

P-Value : 0.998

Levene's Test (any continuous distribution)

Test Statistic: 0.094

P-Value : 0.763
