Statistics 512 Notes 2




Confidence Intervals

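Definition: For a sample $X_1, \ldots, X_n$ from a model $\{f(x;\theta) : \theta \in \Omega\}$, a $1-\alpha$ confidence interval for a parameter $\theta$ is an interval $C_n = (a, b)$, where $a = a(X_1, \ldots, X_n)$ and $b = b(X_1, \ldots, X_n)$ are functions of the sample, such that

$$P_\theta(\theta \in C_n) \ge 1-\alpha \quad \text{for all } \theta \in \Omega.$$

In words, $C_n$ is a function of the random sample that traps the parameter $\theta$ with probability at least $1-\alpha$.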

Commonly, people use 95% confidence intervals, which corresponds to choosing $\alpha = 0.05$.

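Example: Suppose $X_1, \ldots, X_n$ are iid $N(\mu, 1)$. The interval $(\bar{X} - 2/\sqrt{n},\ \bar{X} + 2/\sqrt{n})$ is a 0.9544 confidence interval for $\mu$:

$$P_\mu\left(\bar{X} - \frac{2}{\sqrt{n}} < \mu < \bar{X} + \frac{2}{\sqrt{n}}\right) = P_\mu\left(-2 < \frac{\bar{X} - \mu}{1/\sqrt{n}} < 2\right) = P(-2 < Z < 2) = \Phi(2) - \Phi(-2) = 0.9544,$$

where $Z$ is a standard normal random variable and $\Phi$ is its CDF.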
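As a quick numerical check of the 0.9544 figure, the sketch below evaluates the probability that a standard normal variable falls within 2 units of zero, using only Python's standard library. It assumes the example's interval is the sample mean plus or minus 2 standard errors; the function name is illustrative.

```python
from statistics import NormalDist


def two_sided_normal_coverage(k: float) -> float:
    """Return P(-k < Z < k) for a standard normal random variable Z."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k) - std_normal.cdf(-k)


coverage = two_sided_normal_coverage(2.0)
print(round(coverage, 4))
```

The exact value is about 0.9545; the 0.9544 quoted in the notes is what four-decimal normal tables give (2 x 0.4772).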

Motivation for confidence intervals:

A confidence interval can be thought of as an estimate of the parameter, i.e., we estimate $\mu$ by the interval $(\bar{X} - 2/\sqrt{n},\ \bar{X} + 2/\sqrt{n})$ rather than by the point estimate $\bar{X}$.

What is gained by using the interval rather than the point estimate, given that the interval is less precise?

We gain confidence. We have the assurance that in 95.44% of repeated samples, the confidence interval will contain $\mu$.

In practice, confidence intervals are usually used along with point estimates to give a sense of the accuracy of the point estimate.

Interpretation of confidence intervals

A confidence interval is not a probability statement about $\theta$, since $\theta$ is a fixed parameter, not a random variable.

Common textbook interpretation: If we repeat the experiment over and over, a 95% confidence interval will contain the parameter 95% of the time. This is correct but not particularly useful since we rarely repeat the same experiment over and over.

More useful interpretation (Wasserman, All of Statistics):

On day 1, you collect data and construct a 95 percent confidence interval for a parameter $\theta_1$. On day 2, you collect new data and construct a 95 percent confidence interval for an unrelated parameter $\theta_2$. On day 3, you collect new data and construct a 95 percent confidence interval for an unrelated parameter $\theta_3$. You continue this way constructing 95 percent confidence intervals for a sequence of unrelated parameters $\theta_1, \theta_2, \ldots$ Then 95 percent of your intervals will trap the true parameter value.
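The "different parameter every day" interpretation can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the notes: each day's data are 25 draws from a normal distribution with known variance 1, each day's mean is picked arbitrarily from (-100, 100), and the interval is the sample mean plus or minus 1.96 standard errors.

```python
import random
from statistics import NormalDist

random.seed(0)
z = NormalDist().inv_cdf(0.975)  # upper 2.5% point of N(0, 1), about 1.96


def one_day(n: int = 25) -> bool:
    """Pick a fresh mean, sample n points from N(mu, 1), and report
    whether the 95% interval for that day's mean traps it."""
    mu = random.uniform(-100.0, 100.0)  # a new, unrelated parameter each day
    xbar = sum(random.gauss(mu, 1.0) for _ in range(n)) / n
    half_width = z / n ** 0.5
    return xbar - half_width < mu < xbar + half_width


days = 20000
hit_rate = sum(one_day() for _ in range(days)) / days
print(hit_rate)  # fraction of days on which the interval trapped the parameter
```

The long-run fraction of intervals that trap their own parameter should be close to 0.95, even though no single experiment is ever repeated.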

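A confidence interval is not a probability statement about $\theta$: The fact that a confidence interval is not a probability statement about $\theta$ can be confusing. Let $\theta$ be a fixed, unknown real number and let $X_1, X_2$ be iid random variables such that $P(X_i = 1) = P(X_i = -1) = 1/2$. Now define $Y_i = \theta + X_i$ and suppose we only observe $Y_1$ and $Y_2$. Define the following “confidence interval” which actually contains only one point:

$$C = \begin{cases} \{Y_1 - 1\} & \text{if } Y_1 = Y_2, \\ \{(Y_1 + Y_2)/2\} & \text{if } Y_1 \ne Y_2. \end{cases}$$

No matter what $\theta$ is, we have $P_\theta(\theta \in C) = 3/4$, so this is a 75 percent confidence interval. To see this, note that $Y_1 \ne Y_2$ with probability 1/2, in which case $(Y_1 + Y_2)/2 = \theta$ always, and $Y_1 = Y_2$ with probability 1/2, in which case $Y_1 - 1 = \theta$ only when $X_1 = 1$, i.e., half the time; so the total is $1/2 + 1/4 = 3/4$. Suppose we now do the experiment and we get $Y_1 = 15$ and $Y_2 = 17$. Then our 75 percent confidence interval is {16}. However, we are certain that $\theta$ is 16.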
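The 75 percent coverage claim can be checked by simulation. The sketch below assumes the construction in Wasserman's example: each noise term is +1 or -1 with probability 1/2, each observation is the parameter plus its noise term, and the one-point interval is {Y1 - 1} when the two observations agree and {(Y1 + Y2)/2} otherwise. The function names and the integer-valued parameter (chosen to avoid floating-point comparison issues) are illustrative.

```python
import random

random.seed(1)


def interval_point(y1: float, y2: float) -> float:
    """The single point making up the 'confidence interval' in the example."""
    return y1 - 1 if y1 == y2 else (y1 + y2) / 2


def coverage(theta: int, trials: int = 100000) -> float:
    """Fraction of simulated experiments in which the interval equals theta."""
    hits = 0
    for _ in range(trials):
        y1 = theta + random.choice([-1, 1])
        y2 = theta + random.choice([-1, 1])
        hits += interval_point(y1, y2) == theta
    return hits / trials


print(coverage(16))            # long-run coverage, close to 0.75
print(interval_point(15, 17))  # the observed data pin theta down exactly
```

The procedure covers the parameter about 75 percent of the time over repeated experiments, yet for the particular data (15, 17) the parameter is known with certainty. The confidence level describes the procedure, not the realized interval.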

Some common confidence intervals

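1. CI for mean of normal distribution with known variance: $X_1, \ldots, X_n$ iid $N(\mu, \sigma^2)$ where $\sigma^2$ is known.

Then $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$.

Let $z_{\alpha/2} = \Phi^{-1}(1 - \alpha/2)$, where $\Phi$ is the CDF of a standard normal random variable, e.g., $z_{0.025} = 1.96$. We have

$$1 - \alpha = P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right).$$

Thus, $\bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ is a $1-\alpha$ CI for $\mu$.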
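A minimal sketch of the known-variance interval, sample mean plus or minus the upper alpha/2 standard normal point times sigma over root n, is given below. The function name, the data values, and the choice sigma = 0.25 are made up for illustration.

```python
from statistics import NormalDist, fmean


def z_interval(sample, sigma, alpha=0.05):
    """Return the (1 - alpha) CI for the mean of a normal sample
    whose standard deviation sigma is known."""
    n = len(sample)
    xbar = fmean(sample)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    half_width = z * sigma / n ** 0.5
    return xbar - half_width, xbar + half_width


data = [4.9, 5.3, 5.1, 4.7, 5.0, 5.4, 4.8, 5.2]  # hypothetical measurements
lo, hi = z_interval(data, sigma=0.25)
print(f"95% CI for the mean: ({lo:.3f}, {hi:.3f})")
```

The interval is centered at the sample mean, and its half-width shrinks at rate one over root n as the sample grows.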

2. CI for mean of normal distribution with unknown variance.

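$X_1, \ldots, X_n$ iid $N(\mu, \sigma^2)$ where $\sigma^2$ is unknown.

Key fact: The random variable $T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$, where $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$, has a Student’s t-distribution with n-1 degrees of freedom. (Section 3.6.3, page 186)

Let $t_{\alpha/2,\,n}$ be the inverse of the CDF of the Student’s t-distribution with n degrees of freedom evaluated at $1 - \alpha/2$. Note [pic]

Following the same steps as above, we have

$$1 - \alpha = P\left(-t_{\alpha/2,\,n-1} < \frac{\bar{X} - \mu}{S/\sqrt{n}} < t_{\alpha/2,\,n-1}\right) = P\left(\bar{X} - t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}\right).$$

Thus, $\bar{X} \pm t_{\alpha/2,\,n-1}\,S/\sqrt{n}$ is a $1-\alpha$ CI for $\mu$.

Note: $t_{\alpha/2,\,n-1} > z_{\alpha/2}$, so we pay a price for not knowing the variance, but $t_{\alpha/2,\,n-1} \to z_{\alpha/2}$ as $n \to \infty$.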
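To see the price paid for estimating the variance, the sketch below approximates the 0.975 quantile of the statistic T = (sample mean - mu) / (S / root n) by simulation and compares it with the standard normal quantile 1.96. The sample size n = 5, the number of replications, and the function name are arbitrary illustrative choices; in practice one would look the quantile up in a t table or use a library routine such as `scipy.stats.t.ppf`.

```python
import random
from statistics import NormalDist

random.seed(2)


def simulated_t_quantile(n, prob, reps=100000):
    """Empirical quantile of T = xbar / (S / sqrt(n)) for N(0, 1) samples
    of size n (true mean 0), with S^2 using the n - 1 divisor."""
    t_values = []
    for _ in range(reps):
        x = [random.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(x) / n
        s = (sum((xi - xbar) ** 2 for xi in x) / (n - 1)) ** 0.5
        t_values.append(xbar / (s / n ** 0.5))
    t_values.sort()
    return t_values[int(prob * reps)]


t_crit = simulated_t_quantile(n=5, prob=0.975)
z_crit = NormalDist().inv_cdf(0.975)
print(f"simulated t quantile (4 df): {t_crit:.2f}, normal quantile: {z_crit:.2f}")
```

With 4 degrees of freedom the tabulated value is 2.776, so the t-based interval is noticeably wider than the normal-based one for small samples.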

3. CI for mean of iid sample from unknown distribution:

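Central Limit Theorem (Theorem 4.4.1): For an iid sample from a distribution that has mean $\mu$ and positive variance $\sigma^2$, the random variable $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ converges in distribution to a standard normal random variable.

Slutsky’s Theorem (Theorem 4.3.5): If $X_n \xrightarrow{D} X$, $A_n \xrightarrow{P} a$, and $B_n \xrightarrow{P} b$, then $A_n + B_n X_n \xrightarrow{D} a + bX$.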

From the weak law of large numbers, if $E(X_i^4) < \infty$, then $S^2 \xrightarrow{P} \sigma^2$, and hence $S \xrightarrow{P} \sigma$.

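Thus, combining Slutsky’s Theorem and the central limit theorem,

$$\frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{\sigma}{S} \cdot \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \xrightarrow{D} N(0, 1).$$

An approximate $1-\alpha$ CI for $\mu$ is $\bar{X} \pm z_{\alpha/2}\,S/\sqrt{n}$ because

$$P\left(\bar{X} - z_{\alpha/2}\frac{S}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{S}{\sqrt{n}}\right) = P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{S/\sqrt{n}} < z_{\alpha/2}\right) \to 1 - \alpha \quad \text{as } n \to \infty.$$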
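The large-sample nature of this interval can be seen in a simulation. The sketch below estimates the coverage of the interval (sample mean plus or minus 1.96 times S over root n) for skewed, non-normal data. The exponential distribution with mean 1, the sample sizes 10 and 200, and the replication count are illustrative assumptions, not part of the notes.

```python
import random
from statistics import NormalDist

random.seed(3)
z = NormalDist().inv_cdf(0.975)  # about 1.96


def covers(n, mu=1.0):
    """Draw n exponential observations with mean mu and report whether
    xbar +/- z * S / sqrt(n) contains mu."""
    x = [random.expovariate(1.0 / mu) for _ in range(n)]
    xbar = sum(x) / n
    s = (sum((xi - xbar) ** 2 for xi in x) / (n - 1)) ** 0.5
    half_width = z * s / n ** 0.5
    return xbar - half_width < mu < xbar + half_width


def coverage(n, reps=10000):
    """Monte Carlo estimate of the interval's coverage probability."""
    return sum(covers(n) for _ in range(reps)) / reps


small_n, large_n = coverage(10), coverage(200)
print(f"n = 10: {small_n:.3f}   n = 200: {large_n:.3f}")
```

For small n the actual coverage falls short of the nominal 95 percent, and it approaches 95 percent as n grows, which is what "approximate" means here.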

Application: A food-processing company is considering marketing a new spice mix for Creole and Cajun cooking. They interview 200 consumers and find that 37 would purchase such a product. Find an approximate 95% confidence interval for p, the true proportion of buyers.
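One way to work this application is to treat each response as a 0/1 observation, so that the sample mean is the sample proportion, and apply the approximate large-sample interval. The sketch below assumes the common plug-in standard error, the square root of p-hat (1 - p-hat) / n, in place of S over root n (the two differ only by a factor of root n/(n-1)); the function name is illustrative.

```python
from statistics import NormalDist


def proportion_interval(successes, n, alpha=0.05):
    """Approximate (1 - alpha) CI for a proportion via the normal approximation."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    std_err = (p_hat * (1 - p_hat) / n) ** 0.5  # plug-in standard error
    return p_hat - z * std_err, p_hat + z * std_err


lo, hi = proportion_interval(37, 200)
print(f"{lo:.3f}, {hi:.3f}")  # prints "0.131, 0.239"
```

With 37 buyers out of 200, the point estimate is 0.185 and the approximate 95% interval runs from about 0.131 to 0.239.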
