STA 2023 Final Exam Review Notes




Confidence Intervals

Definition: A Confidence Interval is a range of values used to estimate some population parameter with a specific confidence level.

Constructing a Confidence Interval: In order to construct a confidence interval you MUST calculate the Margin of Error, E. Below are the formulas to calculate E and when to use each formula:

Confidence Interval of the Population Proportion p

|Confidence Level |Critical Value |

|0.90 |1.645 |

|0.95 |1.96 |

|0.99 |2.575 |

$E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$, where $\hat{p}$ is the sample proportion ($\hat{p} = x/n$, and is always between 0 and 1), $\hat{q}$ is $1 - \hat{p}$, n is the sample size, and $z_{\alpha/2}$ is the Critical Value for the Confidence Level given (see the table above). Once you have E, the Confidence Interval can be represented three different ways:

$\hat{p} - E < p < \hat{p} + E$     $\hat{p} \pm E$     $(\hat{p} - E,\ \hat{p} + E)$
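As a minimal sketch of the calculation above, the following Python computes E and prints the interval in all three forms. The critical values come from the table in these notes; the sample counts (120 successes out of 400) and the function name are hypothetical, chosen only for illustration.

```python
from math import sqrt

# Critical values copied from the table in these notes.
Z_CRITICAL = {0.90: 1.645, 0.95: 1.96, 0.99: 2.575}


def proportion_ci(x, n, confidence):
    """Return (p_hat, E, lower, upper) for a population proportion."""
    p_hat = x / n                      # sample proportion
    q_hat = 1 - p_hat
    z = Z_CRITICAL[confidence]         # critical value for this level
    E = z * sqrt(p_hat * q_hat / n)    # margin of error
    return p_hat, E, p_hat - E, p_hat + E


# Hypothetical data: 120 successes in a sample of 400.
p_hat, E, lower, upper = proportion_ci(120, 400, 0.95)
print(f"{lower:.3f} < p < {upper:.3f}")
print(f"{p_hat:.3f} ± {E:.3f}")
print(f"({lower:.3f}, {upper:.3f})")
```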

Confidence Interval of the Population Mean μ

(When the population standard deviation σ is known)

|Confidence Level |Critical Value |

|0.90 |1.645 |

|0.95 |1.96 |

|0.99 |2.575 |

$E = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, where σ is the population standard deviation, n is the sample size, and $z_{\alpha/2}$ is the Critical Value for the Confidence Level given (see the table above). Once you have E, the Confidence Interval can be represented three different ways:

$\bar{x} - E < \mu < \bar{x} + E$     $\bar{x} \pm E$     $(\bar{x} - E,\ \bar{x} + E)$     The sample mean $\bar{x}$ will be given in the problem.
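The same steps can be sketched in Python for the case where σ is known. The critical values are the ones tabulated in these notes; the inputs (sample mean 50, σ = 8, n = 64) and the function name are hypothetical.

```python
from math import sqrt

# Critical values copied from the table in these notes.
Z_CRITICAL = {0.90: 1.645, 0.95: 1.96, 0.99: 2.575}


def mean_ci_sigma_known(x_bar, sigma, n, confidence):
    """Return (E, lower, upper) for a population mean when sigma is known."""
    z = Z_CRITICAL[confidence]
    E = z * sigma / sqrt(n)            # margin of error
    return E, x_bar - E, x_bar + E


# Hypothetical problem: x_bar = 50, sigma = 8, n = 64, 95% confidence.
E, lower, upper = mean_ci_sigma_known(50.0, 8.0, 64, 0.95)
print(f"E = {E:.2f}, interval = ({lower:.2f}, {upper:.2f})")
```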

Confidence Interval of the Population Mean μ

(When the population standard deviation σ is NOT known)

|Confidence Level |Area in Two Tails |

|0.90 |0.10 |

|0.95 |0.05 |

|0.99 |0.01 |

$E = t_{\alpha/2}\frac{s}{\sqrt{n}}$, where s is the sample standard deviation, n is the sample size, and $t_{\alpha/2}$ is the Critical Value for the Confidence Level given, taken from the t-table with n-1 degrees of freedom. The critical value will be in the row for n-1 Degrees of Freedom and the column for the Area in Two Tails associated with the Confidence Level given (see the table above). Once you have E, the Confidence Interval can be represented three different ways:

$\bar{x} - E < \mu < \bar{x} + E$     $\bar{x} \pm E$     $(\bar{x} - E,\ \bar{x} + E)$     The sample mean $\bar{x}$ will be given in the problem.
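A short sketch of the σ-unknown case follows. It assumes you have already looked up the critical value by hand in a t-table, so the function simply takes it as an argument. The inputs are hypothetical: n = 25 (24 degrees of freedom), sample mean 100, s = 10, and the standard t-table value 2.064 for 0.05 in two tails.

```python
from math import sqrt


def mean_ci_sigma_unknown(x_bar, s, n, t_critical):
    """Return (E, lower, upper) using a t critical value with n - 1 df.

    t_critical must be looked up in a t-table: row n - 1 degrees of
    freedom, column for the area in two tails (1 - confidence level).
    """
    E = t_critical * s / sqrt(n)       # margin of error
    return E, x_bar - E, x_bar + E


# Hypothetical problem: n = 25, so df = 24; 95% confidence means
# 0.05 in two tails, and the t-table gives 2.064 for that cell.
E, lower, upper = mean_ci_sigma_unknown(100.0, 10.0, 25, 2.064)
print(f"E = {E:.3f}, interval = ({lower:.3f}, {upper:.3f})")
```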

Miscellaneous Confidence Interval Information:

• The width of the Confidence Interval is 2 times the margin of error, E. So if you are given a Confidence Interval and told to find the margin of error, E, simply find the width by subtracting the lower bound from the upper bound and then divide by two.

• If you are given a Confidence Interval and told to find the point estimate (the sample mean $\bar{x}$ or the sample proportion $\hat{p}$), simply add the endpoints (the upper and lower bounds) of the confidence interval and divide by two. This is because the point estimate is the midpoint of the interval.
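Both bullet points can be checked with a few lines of Python. The interval endpoints below are hypothetical.

```python
def margin_and_point_estimate(lower, upper):
    """Recover the margin of error and point estimate from an interval."""
    E = (upper - lower) / 2            # half the width of the interval
    point_estimate = (upper + lower) / 2   # midpoint of the interval
    return E, point_estimate


# Hypothetical interval: 0.255 < p < 0.345
E, point_estimate = margin_and_point_estimate(0.255, 0.345)
print(f"E = {E:.3f}, point estimate = {point_estimate:.3f}")
```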

Correlation

Definition: A correlation exists between two variables when one of them is related to the other in some way.

The Linear Correlation Coefficient, r, measures the strength of the linear relationship between the paired x and y values in a sample. r is calculated using the formula below and can take on values between -1 and 1. Values of r close to 1 indicate a strong positive linear relationship. Values of r close to -1 indicate a strong negative linear relationship. Values of r close to 0 indicate a very weak linear relationship or no linear relationship.

$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$. If you don’t get a number between -1 and 1, redo the calculation.

Note: Correlation does not imply causality. Just because r is close to 1 or -1 doesn’t mean one of the variables causes the other; the relationship is only mathematical.
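As an illustration, here is a minimal Python sketch of the usual computational formula for r, built from the sums of x, y, xy, x² and y². The five data pairs and the function name are hypothetical.

```python
from math import sqrt


def linear_correlation(xs, ys):
    """Compute r from paired data using the sums-based formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (sqrt(n * sum_x2 - sum_x ** 2)
                   * sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical paired sample.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = linear_correlation(xs, ys)
print(f"r = {r:.4f}")   # must fall between -1 and 1
```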

Expected Value: This is the same as the mean. Remember the mean or expected value of the sample mean is equal to the population mean.

Example Question: A simple random sample of size 36 is taken from a normally distributed population with a mean of 92. What is the expected value of the sample means?

Answer: The expected value of the sample means is always equal to the population mean, so it is 92.

Also: If the expected value of a point estimate is equal to the parameter it estimates, then the point estimate is called unbiased. ($\bar{x}$, $s^2$ and $\hat{p}$ are unbiased estimators of μ, σ² and p; the sample standard deviation s is a slightly biased estimator of σ.)
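The claim that the sample mean is unbiased can be illustrated with a small simulation. This is only a sketch: the population mean 92 and sample size 36 come from the example question, while the population standard deviation (10), the number of repetitions (2000), and the random seed are assumptions made for the demonstration.

```python
import random
from statistics import mean

random.seed(2023)        # fixed seed so the run is repeatable

POP_MEAN = 92            # from the example question
POP_SD = 10              # assumed; the question does not give sigma
SAMPLE_SIZE = 36         # from the example question
NUM_SAMPLES = 2000       # assumed number of repeated samples

# Draw many samples and record the mean of each one.
sample_means = [
    mean(random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

# The average of the sample means should land very close to 92.
average_of_sample_means = mean(sample_means)
print(round(average_of_sample_means, 2))
```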

Hypothesis Testing

Definition: Method for testing claims made about populations; also called test of significance.

Identifying the Null and Alternative Hypotheses

You may be given a scenario and asked to identify the appropriate hypotheses. First, remember that the Null Hypothesis (H0) will always include an = sign, so you can eliminate any choice where the null hypothesis includes a > or < sign. The sign in the Alternative Hypothesis (H1) tells you the type of test:

|Sign in H1 |Type of Test |

|> |Right-Tail |

|< |Left-Tail |

|≠ |Two-Tail |

Example Question: Is the hypothesis test H0: p=0.75, vs. H1: p α.

Example Question: A simple random sample of size 52 is taken from a normally distributed population with σ = 34.2, yielding a sample mean of 113.7. Consider the hypothesis test H0: μ=100, vs H1: μ>100. What is the p-value?

Answer: First compute the test statistic. We are given n = 52, σ = 34.2, $\bar{x}$ = 113.7, and μ = 100, so the test statistic is $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{113.7 - 100}{34.2/\sqrt{52}} \approx 2.89$. The alternative hypothesis has a greater-than sign (>), so this is a right-tailed test. The p-value is the area in the tail of the standard normal curve to the right of z = 2.89, or P(z > 2.89), which is 0.0019. The p-value is 0.0019.

P-Value: The probability, assuming the null hypothesis is true, that a test statistic is at least as extreme as the one actually obtained. It is also the smallest α at which we can reject the Null Hypothesis, so if α is smaller than the p-value we CANNOT reject the Null Hypothesis.

Rules: Reject H0 if P-value ≤ α (where α is the significance level, such as 0.05).

Fail to reject H0 if P-value > α.
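The example question and the decision rules can be sketched together in Python. The numbers are the ones from the example (n = 52, σ = 34.2, sample mean 113.7, H0: μ = 100); the function names, the tail labels "right", "left" and "two", and the choice of α = 0.05 are assumptions for illustration. The standard normal areas come from `statistics.NormalDist` in the Python standard library rather than from a printed z-table, so results may differ from table lookups in the last digit.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(x_bar, mu0, sigma, n, tail):
    """Return (z, p_value) for a z test of a mean with sigma known."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()          # mean 0, standard deviation 1
    if tail == "right":                # H1 uses >
        p_value = 1 - std_normal.cdf(z)
    elif tail == "left":               # H1 uses <
        p_value = std_normal.cdf(z)
    elif tail == "two":                # H1 uses ≠
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return z, p_value


def decide(p_value, alpha):
    """Apply the rule: reject H0 when the p-value is at most alpha."""
    return "Reject H0" if p_value <= alpha else "Fail to reject H0"


z, p_value = z_test_p_value(113.7, 100, 34.2, 52, "right")
decision = decide(p_value, 0.05)       # alpha = 0.05 is assumed
print(f"z = {z:.2f}, p-value = {p_value:.4f}, {decision}")
```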

Linear Regression

Regression Equation: Algebraic equation describing the relationship among variables.

Calculating the Regression Equation: Example

A sample of size 30 produces the following sums:

[pic]

Calculate the least squares regression equation, [pic]

[pic]

[pic]

So [pic] is the least squares regression equation.
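For reference, a least squares line can be computed from sums like these with the standard formulas slope = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) and intercept = ȳ − slope·x̄. The Python sketch below uses hypothetical sums for a sample of size 5, not the sums from the example above, and the names `b0` and `b1` are assumed notation for the intercept and slope.

```python
def least_squares_from_sums(n, sum_x, sum_y, sum_xy, sum_x2):
    """Return (b0, b1): intercept and slope of the least squares line."""
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
    b0 = sum_y / n - b1 * (sum_x / n)       # intercept = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical sums (NOT the ones from the example in these notes):
# n = 5, sum of x = 15, sum of y = 20, sum of xy = 66, sum of x^2 = 55.
b0, b1 = least_squares_from_sums(5, 15, 20, 66, 55)
print(f"y-hat = {b0:.2f} + {b1:.2f}x")
```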

Working With Regression Information

The regression equation is

Height = 21.6 + 0.690 Mothers Height

|Predictor |Coef |SE Coef |T |P |

|Constant |21.560 |9.515 |2.27 |0.029 |

|Mother Height |0.6899 |0.1479 |4.66 |0.000 |

S = 2.95645 R-Sq = 36.4% R-Sq(adj) = 34.7%

Given Minitab output above, you may be asked the following questions:

Example Question: What proportion of the variation in Height can be explained by a linear relationship with Mothers Height?

Answer: The proportion of variation explained by a regression equation is called the Coefficient of Determination and is denoted r², or R-Sq in Minitab. So the answer is R-Sq = 36.4% or 0.364.

Example Question: What is the value of the correlation coefficient?

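Answer: The correlation coefficient r is the square root of r², so the answer is

r = $\sqrt{0.364} \approx 0.603$. Note: r always has the same sign as the slope of the regression line, so if the slope is negative then r is the negative square root of r². Here the slope (0.6899) is positive, so r is positive.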

Example Question: If someone’s mother is 62 inches tall, how tall does the model predict the person will be?

Answer: Just plug in 62 as the mother’s height into the equation.

Height = 21.6 + 0.690*62 = 64.38 inches. The answer is 64.38.
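The last two answers can be reproduced with a short Python sketch. The numbers are read from the Minitab output above; the variable and function names are assumptions. The prediction uses the rounded coefficients from the printed regression equation, as the notes do, so using the unrounded Coef column would give a slightly different value.

```python
from math import copysign, sqrt

r_squared = 0.364        # R-Sq = 36.4% from the Minitab output
slope = 0.6899           # Coef for Mother Height

# r is the square root of R-Sq, carrying the sign of the slope.
r = copysign(sqrt(r_squared), slope)


def predict_height(mother_height):
    """Predicted height from the printed (rounded) regression equation."""
    return 21.6 + 0.690 * mother_height


predicted = predict_height(62)
print(f"r = {r:.3f}, predicted height = {predicted:.2f} inches")
```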

Other Info:

SSE = (n-2)*S², where S is the S given in the Minitab output and n is the sample size.

Residual: The difference between an observed y value and the value of y that is predicted from a regression equation.

Example Question: You are given or have calculated a regression equation of y = 0.563x + 3.67. An observation has x = 6 and y = 7.5. What is the residual?

Answer: Put the observed value for x in the equation to find the predicted value of y: y = 0.563(6) + 3.67 = 7.048. So 7.048 is the predicted value for y. The observed value for y was 7.5, so the residual (observed minus predicted) is 7.5 - 7.048 = 0.452.
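The same residual calculation as a minimal Python sketch, using the numbers from the example; the function name is an assumption.

```python
def residual(x, y_observed, slope, intercept):
    """Residual = observed y minus the y predicted by the regression line."""
    y_predicted = slope * x + intercept
    return y_observed - y_predicted


# Regression equation y = 0.563x + 3.67, observation x = 6, y = 7.5.
res = residual(6, 7.5, 0.563, 3.67)
print(f"residual = {res:.3f}")
```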

Type I and Type II Errors:

Type I Error: The mistake of rejecting the null hypothesis when it is actually true. The symbol α (alpha) is used to represent the probability of a type I error.

Type II Error: The mistake of failing to reject the null hypothesis when it is actually false. The symbol β (beta) is used to represent the probability of a type II error.
