AP Statistics Packet 11

Inference for Distributions

Inference for the Mean of a Population

Comparing Two Means

HW #19 1 - 4, 7 - 10

11.1 Writers in some fields often summarize data by giving $\bar{x}$ and its standard error rather than $\bar{x}$ and $s$. The standard error of the mean $\bar{x}$ is often abbreviated as SEM.

(a) A medical study finds that $\bar{x} = 114.9$ and $s = 9.3$ for the seated systolic blood pressure of the 27 members of one treatment group. What is the standard error of the mean?

$\mathrm{SEM} = \dfrac{s}{\sqrt{n}} = \dfrac{9.3}{\sqrt{27}} = 1.7898$

(b) Biologists studying the levels of several compounds in shrimp embryos reported their results in a table, with the note, "Values are means $\pm$ SEM for three independent samples." The table entry for the compound ATP was $0.84 \pm 0.01$. The researchers made three measurements of ATP, which had $\bar{x} = 0.84$. What was the sample standard deviation $s$ for these measurements?

Since $\dfrac{s}{\sqrt{3}} = 0.01$, we have $s = 0.01\sqrt{3} = 0.0173$.
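
The two parts of 11.1 use the same relationship, SEM = s / sqrt(n), once forward and once solved for s. A minimal sketch of that arithmetic follows; the function names `standard_error` and `sd_from_sem` are illustrative choices, not part of the packet.

```python
import math

def standard_error(s, n):
    """Standard error of the mean: s / sqrt(n)."""
    return s / math.sqrt(n)

def sd_from_sem(sem, n):
    """Solve SEM = s / sqrt(n) for s."""
    return sem * math.sqrt(n)

sem_bp = standard_error(9.3, 27)   # 11.1(a): blood pressure group, n = 27
s_atp = sd_from_sem(0.01, 3)       # 11.1(b): ATP measurements, n = 3

print(round(sem_bp, 4), round(s_atp, 4))
```

Rounded to four decimal places, the two results should agree with the 1.7898 and 0.0173 given in the solution.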

11.2 What critical value t* from Table C satisfies each of the following conditions?

(a) The t distribution with 5 degrees of freedom has probability 0.05 to the right of t*. 2.015 (invT (.95, 5))

(b) The t distribution with 21 degrees of freedom has probability 0.99 to the left of t*. 2.518 (invT (.99, 21))

11.3 What critical value t* from Table C satisfies each of the following conditions?

(a) The one-sample t statistic from a sample of 15 observations has probability 0.025 to the right of t*. 2.145 (invT (.975, 14))

(b) The one-sample t statistic from an SRS of 20 observations has probability 0.75 to the left of t*. 0.688 (invT (.75, 19))

11.4 What critical value t* from Table C should be used for a confidence interval for the mean of the population in each of the following situations?

(a) A 90% confidence interval based on n = 12 observations. df = 11, t* = 1.796

(b) A 95% confidence interval from an SRS of 30 observations. df = 29, t* = 2.045

(c) An 80% confidence interval from a sample of size 18. df = 17, t* = 1.333
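
The critical values in 11.2 through 11.4 come from Table C or the calculator's invT. For readers without either, the sketch below approximates the same quantity using only the Python standard library: it integrates the t density with Simpson's rule and inverts the resulting CDF by bisection. The names `t_pdf`, `t_cdf`, `inv_t`, and `ci_critical` are hypothetical helpers written for this illustration, and the numerical method is an assumption of the sketch, not how the calculator works internally.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def inv_t(p, df):
    """Value t* with P(T <= t*) = p, found by bisection (same role as invT)."""
    lo, hi = -100.0, 100.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def ci_critical(conf_level, n):
    """t* for a confidence interval: upper (1 - C)/2 tail, df = n - 1."""
    return inv_t(1 - (1 - conf_level) / 2, n - 1)

print(round(inv_t(0.95, 5), 3))         # 11.2(a)
print(round(inv_t(0.99, 21), 3))        # 11.2(b)
print(round(ci_critical(0.90, 12), 3))  # 11.4, 90% interval, n = 12
```

Note that a confidence level C leaves (1 - C)/2 in each tail, which is why a 90% interval uses the 0.95 quantile.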

11.7 The one-sample t statistic for testing $H_0: \mu = 0$ versus $H_A: \mu > 0$ from a sample of n = 15 observations has the value t = 1.82.

(a) What are the degrees of freedom for this statistic? df = 14

(b) Give the two critical values t* from Table C that bracket t. What are the right-tail probabilities p for these two values? 1.82 is between 1.761 (p = 0.05) and 2.145 (p = 0.025).

(c) Between what two values does the P-value of the test fall? The P-value is between 0.025 and 0.05 (in fact, p = tcdf(1.82, $\infty$, 14) = 0.0451).

(d) Is the value t = 1.82 statistically significant at the 5% level? At the 1% level? t = 1.82 is significant at $\alpha = 0.05$ but not at $\alpha = 0.01$.

11.8 The one-sample t statistic from a sample of n = 25 observations for the two-sided test of $H_0: \mu = 64$ versus $H_A: \mu \ne 64$ has the value t = 1.12.

(a) What are the degrees of freedom for this statistic? df = 24

(b) Give the two critical values t* from Table C that bracket t. What are the right-tail probabilities p for these two values? 1.12 is between 1.059 (p = 0.15) and 1.318 (p = 0.10).

(c) Between what two values does the P-value of the test fall? (Note that $H_A$ is two-sided.) The P-value is between 0.20 and 0.30 (in fact, p = 2 tcdf(1.12, $\infty$, 24) = 0.2738).

(d) Is the value t = 1.12 statistically significant at the 10% level? At the 5% level? t = 1.12 is not significant at either $\alpha = 0.10$ or $\alpha = 0.05$.
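
The exact P-values quoted in 11.7(c) and 11.8(c) are right-tail areas under the t density, doubled for the two-sided test. The sketch below reproduces them approximately with the standard library only, again integrating the density with Simpson's rule; `t_pdf` and `right_tail` are hypothetical helper names, and the integration scheme is an assumption of this illustration rather than the calculator's tcdf algorithm.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def right_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: one half minus the Simpson's-rule area on [0, t]."""
    h = t / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

p_one_sided = right_tail(1.82, 14)       # 11.7: H_A is mu > 0
p_two_sided = 2 * right_tail(1.12, 24)   # 11.8: H_A is mu != 64, so double the tail

print(round(p_one_sided, 4), round(p_two_sided, 4))
```

Both values should land inside the Table C brackets found in part (b) of each exercise, which is a useful sanity check on the bracketing method.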

11.9 VITAMIN C CONTENT In fiscal year 1996, the U.S. Agency for International Development provided 238,300 metric tons of corn soy blend (CSB) for development programs and emergency relief in countries throughout the world. CSB is a highly nutritious, low-cost fortified food that is partially precooked and can be incorporated into different food preparations by the recipients. As part of a study to evaluate appropriate vitamin C levels in this commodity, measurements were taken on samples of CSB produced in a factory.

The following data are the amounts of vitamin C, measured in milligrams per 100 grams (mg/100 g) of blend (dry basis), for a random sample of size 8 from a production run:

26 31 23 22 11 22 14 31

(a) The specifications for the CSB state that the mixture should produce a mean ($\mu$) vitamin C content in the final product of 40 mg/100 g. Does the CSB produced in this production run conform to these specifications? Perform a significance test to answer this question.

P - name the parameter and the population
Let $\mu$ = the mean vitamin C content per 100 g.

H - state the hypotheses
$H_0: \mu = 40$, $H_A: \mu \ne 40$

A - verify all assumptions
• SRS: not given. We must assume that the 8 observations represent an SRS from the population of all possible amounts of vitamin C in samples of CSB. Since the 8 observations were taken from a production run, this seems like a reasonable assumption provided that the observations were taken at regular intervals.
• 8 < 10% of all possible observations.
• Since the sample size is small (n < 15), the distribution of the CSB vitamin C data should be close to normal. We can check this using a normal probability plot. Since there are no outliers and the normal plot is reasonably linear, the assumption of normality seems justified despite the small number of observations.

N - name the test
One-sample t test for a mean

T - calculate the test statistic (show your work)
With $\bar{x} = 22.50$, $s = 7.19$, df = 7:

$t = \dfrac{22.50 - 40}{7.19/\sqrt{8}} = -6.88$

O - obtain the p-value
p-value = $2P(t < -6.88)$ = 2 tcdf($-\infty$, -6.88, 7) = 0.00024

M - make decision
Because the p-value is so small, we reject $H_0$. We conclude that the vitamin C content for this run does not conform to specifications.

(b) Use your calculator to find a 95% confidence interval for $\mu$.

invT(0.025, 7) = -2.365, so t* = 2.365. The 95% confidence interval is $22.50 \pm (2.365)\dfrac{7.19}{\sqrt{8}} = 22.5 \pm 6.0$, or (16.5, 28.5).
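
The numbers in 11.9 can be recomputed directly from the eight observations. The following is a sketch using only the standard library; the critical value 2.365 is taken from the solution rather than computed, and the variable names are illustrative.

```python
import math
import statistics

vitamin_c = [26, 31, 23, 22, 11, 22, 14, 31]   # mg/100 g, from the exercise
mu0 = 40                                       # specified mean under H0
T_STAR = 2.365                                 # 95% critical value for df = 7, from the solution

n = len(vitamin_c)
xbar = statistics.mean(vitamin_c)
s = statistics.stdev(vitamin_c)                # sample standard deviation (n - 1 divisor)
se = s / math.sqrt(n)

t_stat = (xbar - mu0) / se                     # one-sample t statistic
margin = T_STAR * se
ci = (xbar - margin, xbar + margin)            # 95% confidence interval for mu

print(f"xbar = {xbar:.2f}, s = {s:.2f}, t = {t_stat:.2f}")
print(f"95% CI: ({ci[0]:.1f}, {ci[1]:.1f})")
```

The specified value 40 lies well outside the interval, which is consistent with rejecting $H_0$ in the two-sided test.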

11.10 HEALTHY BONES, I Here are estimates of the daily intakes of calcium (in milligrams) for 38 women between the ages of 18 and 24 years who participated in a study of women's bone health:

808 882 1062 970 909 802 374 416 784 997
651 716 438 1420 1425 948 1050 976 572 403
626 774 1253 549 1325 446 465 1269 671 696
1156 684 1933 748 1203 2433 1255 1100

(a) Display the data using a stemplot and make a normal probability plot. Describe the distribution of calcium intakes for these women.

The stemplot has stems in 1000s, split 5 ways. The data are right-skewed with a high outlier of 2433 (and possibly 1933). The normal plot shows these two outliers, but otherwise it is not strikingly different from a line.

(b) Calculate the mean, standard deviation, and the standard error. $\bar{x} = 926$, $s = 427.2$, standard error = 69.3 (all in mg).

(c) Use your calculator to find a 95% confidence interval for the mean. Use of the t procedure is justified here because the sample size is large (n = 38 > 30) and thus the distribution of $\bar{x}$ will be approximately normal by the central limit theorem. Using Table C with 30 degrees of freedom, we have t* = 2.042. The approximate 95% confidence interval is then $926 \pm (2.042)(69.3)$, or 784.5 to 1067.5 mg.

(d) Eliminate the two largest values and recompute the 95% confidence interval. Without the outliers, $\bar{x} = 856.2$, $s = 306.7$, standard error = 51.1 (all in mg). Using 30 degrees of freedom, we have $856.2 \pm (2.042)(51.1)$, or 751.9 to 960.5 mg.
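
Parts (b) through (d) of 11.10 repeat one calculation on two versions of the data, so a small helper makes the comparison easy to see. This is a sketch with the standard library; `summarize` is a hypothetical helper name, and t* = 2.042 is the Table C value for 30 df that the solution uses in both cases.

```python
import math
import statistics

calcium = [
    808, 882, 1062, 970, 909, 802, 374, 416, 784, 997,
    651, 716, 438, 1420, 1425, 948, 1050, 976, 572, 403,
    626, 774, 1253, 549, 1325, 446, 465, 1269, 671, 696,
    1156, 684, 1933, 748, 1203, 2433, 1255, 1100,
]
T_STAR = 2.042   # Table C, 30 df, 95% confidence (as in the solution)

def summarize(data, t_star):
    """Return (mean, sample sd, standard error, confidence interval)."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)
    se = s / math.sqrt(n)
    return xbar, s, se, (xbar - t_star * se, xbar + t_star * se)

full = summarize(calcium, T_STAR)
trimmed = summarize(sorted(calcium)[:-2], T_STAR)   # drop the two largest values

for label, (xbar, s, se, ci) in (("all 38", full), ("without outliers", trimmed)):
    print(f"{label}: mean = {xbar:.1f}, s = {s:.1f}, SE = {se:.1f}, "
          f"CI = {ci[0]:.1f} to {ci[1]:.1f}")
```

Small differences in the last digit relative to the solution are expected, because the solution rounds the mean and standard error before forming the interval.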

HW #20 12, 13, 15 - 17, 19

11.12 GROWING TOMATOES An agricultural field trial compares the yield of two varieties of tomatoes for commercial use. The researchers divide in half each of 10 small plots of land in different locations and plant each tomato variety on one half of each plot. After harvest, they compare the yields in pounds per plant at each location. The 10 differences (Variety A - Variety B) give $\bar{x} = 0.34$ and $s = 0.83$. Is there convincing evidence that Variety A has the higher mean yield?

(a) Describe in words what the parameter $\mu$ is in this setting. $\mu$ is the difference between the population mean yields for Variety A plants and Variety B plants; that is, $\mu = \mu_A - \mu_B$.

(b) Use your calculator to perform a hypothesis test to answer the question. $H_0: \mu = 0$, $H_A: \mu > 0$. With df = 9, we obtain t = 1.295, p = 0.1137. Because the p-value is so large, we do not have enough evidence to reject $H_0$; the observed difference could be due to chance variation.
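
When only summary statistics are given, as in 11.12, the t statistic follows directly from the mean and standard deviation of the differences. A minimal sketch follows; `t_from_summary` is a hypothetical helper, and the 5% one-sided critical value 1.833 for df = 9 is a standard t-table value that does not appear in the solution, so treat it as an added assumption.

```python
import math

def t_from_summary(xbar, mu0, s, n):
    """One-sample (or matched pairs) t statistic from summary statistics."""
    return (xbar - mu0) / (s / math.sqrt(n))

n = 10
t_stat = t_from_summary(xbar=0.34, mu0=0, s=0.83, n=n)
df = n - 1

T_CRIT_05 = 1.833   # upper 5% critical value for df = 9 (standard table value, not from the packet)
reject_at_5_percent = t_stat > T_CRIT_05

print(f"t = {t_stat:.3f}, df = {df}, reject H0 at 5%: {reject_at_5_percent}")
```

Comparing t with a critical value gives the same decision as comparing the p-value with the significance level; here neither approach rejects $H_0$.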

11.13 SPANISH TEACHERS WORKSHOP The National Endowment for the Humanities sponsors summer institutes to improve the skills of high school language teachers. One institute hosted 20 Spanish teachers for four weeks. At the beginning of the period, the teachers took the Modern Language Association's listening test of understanding of spoken Spanish. After four weeks of immersion in Spanish in and out of class, they took the listening test again. (The actual spoken Spanish in the two tests was different, so that simply taking the first test should not improve the score on the second test.) The table below gives the pretest and posttest scores. The maximum possible score on the test is 36.

(a) Perform a hypothesis test to determine if attending the institute improves listening skills.

P - name the parameter and the population
$\mu_d$ is the mean improvement in score (posttest - pretest).

H - state the hypotheses
$H_0: \mu_d = 0$, $H_A: \mu_d > 0$

A - verify all assumptions
• SRS: we can assume this.
• 20 teachers < 10% of all Spanish teachers.
• n is small, but the NPP looks fairly linear, with no outliers, so the t test should be reliable.

N - name the test
Matched pairs t test

T - calculate the test statistic (show your work)

$t = \dfrac{1.450 - 0}{3.203/\sqrt{20}} = 2.025$, df = 19

O - obtain the p-value
p-value = $P(t > 2.025)$ = tcdf(2.025, $\infty$, 19) = 0.0286

M - make decision
The p-value is small, less than 0.05, so we reject the null hypothesis. We have some evidence that scores improve, but it is not overwhelming.

(b) Can you reject $H_0$ at the 5% significance level? Yes. At the 1% significance level? No.

(c) Use your calculator to find a 90% confidence interval for the mean increase in listening score due to attending the summer institute.

t* = invT(.95, 19) = 1.729

CI = $1.45 \pm 1.238$, or 0.212 to 2.688.
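
The margin of error 1.238 in part (c) is t* times the standard error of the mean difference. The sketch below rebuilds the interval from the summary statistics used in the test statistic above (mean difference 1.45, standard deviation 3.203, n = 20) and the solution's t* = 1.729; the variable names are illustrative.

```python
import math

mean_diff = 1.45    # mean of (posttest - pretest), from the solution
sd_diff = 3.203     # standard deviation of the differences, from the solution
n = 20
T_STAR = 1.729      # 90% critical value for df = 19, from the solution

se = sd_diff / math.sqrt(n)
margin = T_STAR * se
ci = (mean_diff - margin, mean_diff + margin)

print(f"SE = {se:.4f}, margin = {margin:.3f}, 90% CI = {ci[0]:.3f} to {ci[1]:.3f}")
```

A 90% interval that lies entirely above 0 matches the one-sided test rejecting $H_0$ at the 5% level, since both leave 5% in the upper tail.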

11.15 DOES PLAYING THE PIANO MAKE YOU SMARTER? Do piano lessons improve the spatial-temporal reasoning of preschool children? Neurobiological arguments suggest that this may be true. A study designed to test this hypothesis measured the spatial-temporal reasoning of 34 preschool children before and after six months of piano lessons. (The study also included children who took computer lessons and a control group, but we are not concerned with those here.) The changes in the reasoning scores are:

2 5 7 -2 2 7 4 1 0 7 3 4 3 4 9 4 5
2 9 6 0 3 6 -1 3 4 6 7 -2 7 -3 3 4 4

(a) Display the data and summarize the distribution.

The NPP looks reasonably straight, except for the granularity of the data.

(b) Find the mean, standard deviation, and the standard error of the mean.

$\bar{x} = 3.618$, $s = 3.055$, $SE_{\bar{x}} = 0.524$

(c) Use your calculator to find a 95% confidence interval for the mean improvement in reasoning.

With 33 df, we have invT(.025, 33) = -2.035, so t* = 2.035. The CI is 2.552 to 4.684.
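
The summary statistics and interval for 11.15 can be checked from the 34 listed changes. This is a sketch with the standard library; t* = 2.035 is the rounded value from the solution, so the endpoints may differ from the calculator's interval in the third decimal place.

```python
import math
import statistics

changes = [
    2, 5, 7, -2, 2, 7, 4, 1, 0, 7, 3, 4, 3, 4, 9, 4, 5,
    2, 9, 6, 0, 3, 6, -1, 3, 4, 6, 7, -2, 7, -3, 3, 4, 4,
]
T_STAR = 2.035   # 95% critical value for df = 33, from the solution

n = len(changes)
xbar = statistics.mean(changes)
s = statistics.stdev(changes)       # sample standard deviation (n - 1 divisor)
se = s / math.sqrt(n)
ci = (xbar - T_STAR * se, xbar + T_STAR * se)

print(f"n = {n}, mean = {xbar:.3f}, s = {s:.3f}, SE = {se:.3f}")
print(f"95% CI: {ci[0]:.3f} to {ci[1]:.3f}")
```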

11.16 PIANO PLAYING, II Refer to the preceding exercise. Use your calculator to test the null hypothesis that there is no improvement versus the alternative suggested by the neurobiological arguments. What do you conclude?

$H_0: \mu = 0$, $H_A: \mu > 0$, where $\mu$ is the mean improvement in scores. With df = 33, we obtain

$t = \dfrac{3.618 - 0}{3.055/\sqrt{34}} = 6.906$, p < 0.0005.

Because the p-value is so small, we reject the null hypothesis and conclude that the scores are higher for students who study the piano.
