Tests with Fixed Significance Level



Chapter 18: Inference for a Population Proportion

Previously, we have been making inferences about the unknown population mean μ.

Now we look at situations where the parameter of interest is a population proportion, p.

Examples:

• Proportion of all SMU students favoring a switch to the quarter system,

• Proportion of all US citizens who approve of the President’s performance,

• Proportion of all Wal-Mart customers who have an excellent shopping experience at Wal-Mart, etc.

Think of our population as consisting of “successes” (having the outcome we are looking for) and “failures” (not having the outcome we are looking for).

Example: If we are interested in the proportion of SMU students favoring a switch to the quarter system, then in the population of all SMU students we can define a “success” as a student who favors the switch.

Then the parameter of interest is the proportion, p, of “successes” in our population.

Q: What is an obvious estimator of the parameter p?

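Ex: Suppose we take an SRS of 225 SMU students, of whom 90 favor a switch to the quarter system.

Parameter p = the proportion of all SMU students who favor a switch to the quarter system.

An obvious estimate of p is 90/225 = 0.40.

The statistic we will be using to estimate p is the sample proportion,

$\hat{p} = \dfrac{\text{count of successes in the sample}}{n}$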

Q: How good is the statistic $\hat{p}$ as an estimator of p?

For this we study the sampling distribution of $\hat{p}$, i.e., we repeat the process of taking a sample of size n and computing $\hat{p}$ from it many times. The histogram of the resulting values of $\hat{p}$ approximates the sampling distribution of $\hat{p}$.

As the sample size increases,

• the sampling distribution of $\hat{p}$ becomes approximately normal.

• the mean of the sampling distribution of $\hat{p}$ is p. So it is an unbiased estimator of p.

• (not so obviously!) the standard deviation of the sampling distribution of $\hat{p}$ is $\sqrt{\dfrac{p(1-p)}{n}}$.

In other words, for a large sample, $\hat{p}$ is approximately $N\left(p, \sqrt{\dfrac{p(1-p)}{n}}\right)$.
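The repeated-sampling idea described above can be tried out with a short simulation. The following is only a sketch under stated assumptions: the true proportion p = 0.40 is a made-up value (in practice p is unknown), the number of repetitions (2000) and the seed are arbitrary, and each student is modeled as an independent draw, which is reasonable when the population is much larger than the sample. It compares the simulated mean and standard deviation of $\hat{p}$ with p and $\sqrt{p(1-p)/n}$.

```python
import math
import random


def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Return the sample proportion p-hat from each of num_samples samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p_hats = []
    for _ in range(num_samples):
        # Each individual is a "success" with probability p (independent draws).
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats


p = 0.40   # hypothetical true proportion, chosen only for illustration
n = 225    # sample size from the SMU example
p_hats = simulate_sample_proportions(p, n, num_samples=2000)

# Mean and (sample) standard deviation of the simulated p-hat values.
sim_mean = sum(p_hats) / len(p_hats)
sim_sd = math.sqrt(sum((x - sim_mean) ** 2 for x in p_hats) / (len(p_hats) - 1))

# Standard deviation predicted by the formula sqrt(p(1 - p) / n).
theory_sd = math.sqrt(p * (1 - p) / n)

print(f"simulated mean of p-hat: {sim_mean:.4f}   (p = {p})")
print(f"simulated SD of p-hat:   {sim_sd:.4f}   (formula: {theory_sd:.4f})")
```

A histogram of `p_hats` should look roughly bell-shaped and centered near p; rerunning with a larger n should make it narrower.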

Ex: An experiment on the side effects of pain relievers assigned arthritis patients to one of several over-the-counter pain medications. 440 patients were given one brand of pain reliever, and the number of patients suffering some “adverse symptom” was noted.

a) If 10% of all patients suffer adverse symptoms, what would be the sampling distribution of the proportion with adverse symptoms in a sample of 440 patients?

b) Find the approx. probability that fewer than 8% of the patients in any sample of size 440 suffer adverse symptoms.

c) Find the approx. probability that between 8% and 14% of the patients in any sample of size 440 suffer adverse symptoms.
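One possible way to check the arithmetic for parts (a) through (c) is sketched below. It assumes the normal approximation applies, with mean p = 0.10 and standard deviation $\sqrt{p(1-p)/n}$, and uses `statistics.NormalDist` from the Python standard library in place of a normal table; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

p = 0.10   # assumed true proportion with adverse symptoms (given in part a)
n = 440    # sample size

# Part (a): approximate sampling distribution of the sample proportion.
sd = math.sqrt(p * (1 - p) / n)
sampling_dist = NormalDist(mu=p, sigma=sd)

# Part (b): P(p-hat < 0.08)
prob_below_8 = sampling_dist.cdf(0.08)

# Part (c): P(0.08 < p-hat < 0.14)
prob_8_to_14 = sampling_dist.cdf(0.14) - sampling_dist.cdf(0.08)

print(f"(a) approx. N({p}, {sd:.4f})")
print(f"(b) P(p-hat < 0.08)        = {prob_below_8:.4f}")
print(f"(c) P(0.08 < p-hat < 0.14) = {prob_8_to_14:.4f}")
```

Answers worked by hand with a z table may differ slightly in the last digit because z is usually rounded to two decimals there.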

Inference for a Population Proportion

Assumptions:

• The data are an SRS from the population of interest.

• The population is at least 10 times as large as the sample.

• The sample size n is large: both the count of successes $n\hat{p}$ and the count of failures $n(1-\hat{p})$ are large (common rules of thumb ask for at least 10 or 15 of each).

|Confidence Intervals for a Population Proportion |

| |

|Choose an SRS of size n from a population having unknown proportion p of successes. |

| |

|An approximate level C confidence interval for p is |

| |

|$\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$ |

| |

|Here, z* is the value on the standard normal curve with area C between -z* and z*. |

To test the hypotheses H0: p = p0 vs. Ha: p ≠ p0 at the α level:

Construct a (1-α) CI for p under the null hypothesis:

[pic]

• Reject H0 when the value p0 falls outside a level (1-α) CI for p under the null hypothesis.

• Do not reject H0 when the value p0 falls inside a level (1-α) CI for p under the null hypothesis.

Ex. Suppose we randomly surveyed 1,000 shoppers at Wal-Mart and asked them to rate their shopping experience as excellent, good or poor. Here, there are three event classes in the categorical variable called "rating". Further suppose that of the 1,000 shoppers, 272 rated the experience as excellent. We are interested in p, the population proportion of Wal-Mart shoppers who rate their shopping experience as excellent.

a) Give an estimate of p.

b) Are the three assumptions met?

c) Construct a 95% confidence interval for p.

d) Suppose we wanted to test whether the true population proportion of Wal-Mart shoppers rating their shopping experience as excellent is equal to 32% versus not equal to 32% at the 5% level. Use the C.I. in part (c) to answer this.
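The calculations in parts (a), (c) and (d) could be checked along the following lines. This is a sketch, not a prescribed solution: it assumes the usual large-sample interval $\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}$, gets z* from `statistics.NormalDist` rather than a table, and the function name `proportion_ci` is made up for illustration.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion.

    Returns (p_hat, lower, upper).
    """
    p_hat = successes / n
    # z* leaves area `confidence` between -z* and z* under the standard normal curve.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


# Part (a) and (c): 272 of 1,000 shoppers rated the experience as excellent.
p_hat, lower, upper = proportion_ci(272, 1000, confidence=0.95)
print(f"p-hat = {p_hat:.3f}, 95% CI = ({lower:.4f}, {upper:.4f})")

# Part (d): reject H0: p = 0.32 at the 5% level if 0.32 lies outside the 95% CI.
p0 = 0.32
reject_h0 = not (lower <= p0 <= upper)
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Because 0.32 lies above the upper end of the interval, this check rejects H0 at the 5% level.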

Choosing the Sample Size

Goal: Decide the number of observations needed to attain the desired confidence level and margin of error.

The margin of error in the confidence interval for p is

$z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$

Before we draw the SRS, we do not know the value of $\hat{p}$. Therefore, we consider the worst-case scenario. Which value of $\hat{p}$ will give the largest margin of error in the above formula?

Ex. If you look at polls (which usually report a 95% confidence interval), you will find that most have a sample size of n ≈ 1000 and a margin of error of ±3%. Why?
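A quick numerical check can help with both questions. The sketch below assumes the margin of error has the form $z^* \sqrt{\hat{p}(1-\hat{p})/n}$ and simply evaluates it for n = 1000 at a few trial values of $\hat{p}$; the trial values and the function name `margin_of_error` are arbitrary choices for illustration.

```python
import math
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence=0.95):
    """Margin of error of the large-sample confidence interval for a proportion."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)


# Margin of error for a poll of n = 1000 at several possible sample proportions.
margins = {p_hat: margin_of_error(p_hat, 1000) for p_hat in (0.1, 0.3, 0.5, 0.7, 0.9)}
for p_hat, m in margins.items():
    print(f"p-hat = {p_hat:.1f}: margin of error = {m:.4f}")

# The trial value with the largest margin of error is the worst case.
worst_case_p_hat = max(margins, key=margins.get)
print(f"largest margin at p-hat = {worst_case_p_hat}")
```

Among these trial values the margin is largest at $\hat{p} = 0.5$, where it is about 0.031, which is consistent with the ±3% that polls of roughly 1000 people report.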

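Let m represent the desired margin of error. In the worst case ($\hat{p} = 0.5$), the number of observations needed is

$n = \left(\dfrac{z^*}{m}\right)^2 (0.5)(1-0.5) = \left(\dfrac{z^*}{2m}\right)^2$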

*****Always round your answer up!!*****

Ex. Suppose you want to estimate p with 95% confidence and a margin of error no greater than 2%. How large a sample do you need?
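The sample-size calculation can be sketched as follows. It assumes the worst-case value 0.5 for the sample proportion unless another guess is supplied; the `p_guess` parameter and the function name are illustrative additions, not part of the notes. The result is rounded up, as the rule above requires.

```python
import math
from statistics import NormalDist


def required_sample_size(margin, confidence=0.95, p_guess=0.5):
    """Smallest n whose margin of error is at most `margin`.

    p_guess = 0.5 is the worst case (it maximizes p(1 - p)).
    """
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    return math.ceil(n)  # always round up


n_needed = required_sample_size(0.02, confidence=0.95)
print(n_needed)  # 2401

# For comparison, a 3% margin of error at 95% confidence:
n_for_3_percent = required_sample_size(0.03, confidence=0.95)
print(n_for_3_percent)  # 1068
```

Halving the margin of error roughly quadruples the required sample size, since n grows with 1/m².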
