Stat 112 Review Notes for Chapter 3, Lecture Notes 1-5
1. Simple Linear Regression Model: The simple linear regression model for the mean of $Y$ given $X$ is
$E(Y|X) = \beta_0 + \beta_1 X$ (1.1)
where $\beta_1$ = slope = change in the mean of $Y$ for each one-unit change in $X$;
$\beta_0$ = intercept = mean of $Y$ given $X = 0$. The disturbance $e_i$ for the simple linear regression model is the difference between the actual $Y_i$ and the mean of $Y$ given $X$ for observation $i$: $e_i = Y_i - (\beta_0 + \beta_1 X_i)$. In addition to (1.1), the simple linear regression model makes the following assumptions about the disturbances $e_i$:
(i) Linearity assumption: $E(e_i) = 0$. This implies that the linear model (1.1) for the mean of $Y$ given $X$ is the correct model for the mean.
(ii) Constant variance assumption: The disturbances $e_i$ are assumed to all have the same variance $\sigma_e^2$.
(iii) Normality assumption: The disturbances $e_i$ are assumed to have a normal distribution.
(iv) Independence assumption: The disturbances $e_1, \ldots, e_n$ are assumed to be independent.
2. Least Squares Estimates of the Simple Linear Regression Model: Based on a sample $(X_1, Y_1), \ldots, (X_n, Y_n)$, we estimate the slope and intercept by the least squares principle --
we minimize the sum of squared prediction errors in the data, $\sum_{i=1}^{n} (Y_i - (b_0 + b_1 X_i))^2$. The least squares estimates of the slope and intercept are the $b_0$ and $b_1$ that minimize the sum of squared prediction errors. Some properties of the least squares estimates are:
(i) Unbiased estimators: The means of the sampling distributions of $b_0$ and $b_1$ are $\beta_0$ and $\beta_1$ respectively.
(ii) Consistent estimators: As the sample size $n$ increases, the probability that $b_0$ and $b_1$ will come close to $\beta_0$ and $\beta_1$ respectively converges to 1.
(iii) Minimum variance estimators: The least squares estimators are the best possible estimators of $\beta_0$ and $\beta_1$ in the sense of having the smallest variance among unbiased estimators.
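As a quick numerical sketch (Python, not part of the original notes; the data values are made up for illustration), the least squares estimates can be computed directly from the closed-form formulas:

```python
# Least squares estimates for simple linear regression, from the
# closed-form formulas. The data below are made up for illustration.
def least_squares(x, y):
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # slope: sum of (Xi - Xbar)(Yi - Ybar) over sum of (Xi - Xbar)^2
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar  # the fitted line passes through (Xbar, Ybar)
    return b0, b1

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = least_squares(x, y)
print(b0, b1)
```

For this toy data a hand check gives $b_1 = 6/10 = 0.6$ and $b_0 = 4 - 0.6 \cdot 3 = 2.2$.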
3. Residuals: The disturbance $e_i$ is the difference between the actual $Y_i$ and the mean of $Y$ given $X_i$: $e_i = Y_i - (\beta_0 + \beta_1 X_i)$. The residual $\hat{e}_i$ is an estimate of the disturbance: $\hat{e}_i = Y_i - (b_0 + b_1 X_i)$.
4. Using the Residuals to Check the Assumptions of the Simple Linear Regression Model: The residual plot is a scatterplot of the [pic]pairs, i.e., a plot of the [pic]variable versus the residuals. To check the linearity assumption, we check if [pic] is approximately zero for each part of the range of [pic]. To check the constant variance assumption, we check if the spread of the residuals remains constant as [pic]varies. To check the normality assumption, we check if the histogram of the residuals is approximately bell shaped. For now, we will not consider the independence assumption; we will consider it in Section 6.
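A minimal sketch of computing the residual-plot points (Python, not from the notes; the data and the fitted values $b_0 = 2.2$, $b_1 = 0.6$ are the illustrative values used above). A useful sanity check is that least squares residuals always sum to zero:

```python
# Residuals from a least squares fit. The residual plot is the set of
# (Xi, residual_i) pairs. Data and estimates are illustrative only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6  # least squares estimates for this toy data
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
pairs = list(zip(x, residuals))  # points to plot in the residual plot
print(sum(residuals))  # sums to (essentially) zero by construction
```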
5. Root Mean Square Error: The root mean square error, $RMSE = \sqrt{\frac{1}{n-2}\sum_{i=1}^{n} \hat{e}_i^2}$, is approximately the average absolute error that is made when using $\hat{Y} = b_0 + b_1 X$ to predict $Y$. The RMSE is denoted by $s_e$ in the textbook.
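A short computation of the RMSE (Python, not from the notes; same illustrative data and estimates as above):

```python
import math

# RMSE = sqrt(SSE / (n - 2)); the divisor is n - 2 because two
# parameters (slope and intercept) were estimated from the data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6  # least squares estimates for this toy data
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
rmse = math.sqrt(sse / (len(x) - 2))
print(rmse)
```

Here SSE = 2.4 with n = 5, so the RMSE is $\sqrt{0.8} \approx 0.894$.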
6. Confidence Interval for the Slope: The confidence interval for the slope is a range of plausible values for the true slope $\beta_1$ based on the sample $(X_1, Y_1), \ldots, (X_n, Y_n)$. The 95% confidence interval for the slope is $b_1 \pm t_{.025, n-2} \cdot SE(b_1)$, where $SE(b_1)$ is the standard error of the slope, $SE(b_1) = \frac{s_e}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2}}$. The 95% confidence interval for the slope is approximately $b_1 \pm 2 \cdot SE(b_1)$.
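A sketch of the approximate 95% confidence interval (Python, not from the notes; uses the rough cutoff 2 in place of the exact $t_{.025, n-2}$ value, and the same illustrative data as above):

```python
import math

# Approximate 95% CI for the slope: b1 +/- 2 * SE(b1), where
# SE(b1) = RMSE / sqrt(sum of (Xi - Xbar)^2). Data are illustrative.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6  # least squares estimates for this toy data
n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
rmse = math.sqrt(sse / (n - 2))
se_b1 = rmse / math.sqrt(sxx)
ci = (b1 - 2 * se_b1, b1 + 2 * se_b1)
print(ci)
```

With only n = 5 observations the exact $t_{.025, 3}$ cutoff (about 3.18) would give a wider interval; the factor 2 is only reasonable for larger samples.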
7. Hypothesis Testing for the Slope: To test hypotheses for the slope, we use the t-statistic $t = \frac{b_1 - \beta_1^0}{SE(b_1)}$, where $\beta_1^0$ is the hypothesized value of the slope; the rejection rules are detailed below.
(i) Two-sided test: $H_0: \beta_1 = \beta_1^0$ vs. $H_a: \beta_1 \neq \beta_1^0$. We reject $H_0$ if $t \geq t_{\alpha/2, n-2}$ or $t \leq -t_{\alpha/2, n-2}$.
(ii) One-sided test I: $H_0: \beta_1 = \beta_1^0$ vs. $H_a: \beta_1 < \beta_1^0$. We reject $H_0$ if $t \leq -t_{\alpha, n-2}$.
(iii) One-sided test II: $H_0: \beta_1 = \beta_1^0$ vs. $H_a: \beta_1 > \beta_1^0$. We reject $H_0$ if $t \geq t_{\alpha, n-2}$.
When $\beta_1^0 = 0$, we can calculate the p-values for these tests using JMP as follows:
(i) Two-sided test: the p-value is Prob>|t|.
(ii) One-sided test I: If $t$ is negative (i.e., the sign of the t-statistic is in favor of the alternative hypothesis), the p-value is (Prob>|t|)/2. If $t$ is positive (i.e., the sign of the t-statistic is in favor of the null hypothesis), the p-value is 1-(Prob>|t|)/2.
(iii) One-sided test II: If $t$ is positive (i.e., the sign of the t-statistic is in favor of the alternative hypothesis), the p-value is (Prob>|t|)/2. If $t$ is negative (i.e., the sign of the t-statistic is in favor of the null hypothesis), the p-value is 1-(Prob>|t|)/2.
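The sign rules above reduce to simple arithmetic on JMP's reported Prob>|t|; a small helper (Python, not from the notes; function name and example numbers are hypothetical) makes the logic explicit:

```python
# Convert JMP's two-sided "Prob > |t|" into a one-sided p-value,
# following the sign rules above (hypothesized slope of 0).
def one_sided_p(prob_gt_abs_t, t_stat, alternative):
    # alternative = "less"    for Ha: beta1 < 0 (one-sided test I)
    # alternative = "greater" for Ha: beta1 > 0 (one-sided test II)
    sign_favors_alt = t_stat < 0 if alternative == "less" else t_stat > 0
    if sign_favors_alt:
        return prob_gt_abs_t / 2       # halve the two-sided p-value
    return 1 - prob_gt_abs_t / 2       # sign favors the null instead

print(one_sided_p(0.04, 2.3, "greater"))  # 0.02
print(one_sided_p(0.04, 2.3, "less"))     # 0.98
```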
8. R Squared: The R squared statistic measures how much of the variability in the response the regression model explains. R squared ranges from 0 to 1, with higher R squared values meaning that the regression model is explaining more of the variability in the response.
$R^2 = \frac{\sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{n}(Y_i - \bar{Y})^2} = 1 - \frac{\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{n}(Y_i - \bar{Y})^2}$
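A short computation of R squared as 1 minus the ratio of residual to total variability (Python, not from the notes; same illustrative data and estimates as above):

```python
# R squared = 1 - SSE/SST: the fraction of the variability in Y
# explained by the regression. Data and estimates are illustrative.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6  # least squares estimates for this toy data
y_bar = sum(y) / len(y)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # residual
sst = sum((yi - y_bar) ** 2 for yi in y)                        # total
r_squared = 1 - sse / sst
print(r_squared)  # approximately 0.6 for this data
```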
9. Prediction Intervals: The best prediction for the $Y$ of a new observation with $X = X^*$ is the estimated mean of $Y$ given $X^*$: $\hat{Y} = b_0 + b_1 X^*$.
The 95% prediction interval for the $Y$ of a new observation with $X = X^*$ is an interval that will contain the value of $Y$ most (95%) of the time. The formula for the prediction interval is:
$\hat{Y} \pm t_{.025, n-2} \cdot s_{pred}$, where
$\hat{Y} = b_0 + b_1 X^*$;
$s_{pred} = s_e \sqrt{1 + \frac{1}{n} + \frac{(X^* - \bar{X})^2}{(n-1)s_X^2}}$;
$s_X^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$.
When n is large (say n>30), the 95% prediction interval is approximately equal to
$\hat{Y} \pm 2 s_e$.
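A sketch of the prediction interval computation (Python, not from the notes; same illustrative data as above, with the rough cutoff 2 in place of the exact $t_{.025, n-2}$ value, and using $(n-1)s_X^2 = \sum_i (X_i - \bar{X})^2$):

```python
import math

# 95% prediction interval for a new observation at X = x_star:
# y_hat +/- 2 * s_pred, with s_pred = RMSE * sqrt(1 + 1/n + (x*-Xbar)^2/Sxx).
# Data, estimates, and x_star are illustrative only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6  # least squares estimates for this toy data
n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)  # equals (n-1) * sample var of X
rmse = math.sqrt(sum((yi - (b0 + b1 * xi)) ** 2
                     for xi, yi in zip(x, y)) / (n - 2))

x_star = 4
y_hat = b0 + b1 * x_star  # best prediction for the new observation
s_pred = rmse * math.sqrt(1 + 1 / n + (x_star - x_bar) ** 2 / sxx)
interval = (y_hat - 2 * s_pred, y_hat + 2 * s_pred)
print(interval)
```

Note how $s_{pred}$ grows as $X^*$ moves away from $\bar{X}$: predictions far from the center of the data are less precise.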
10. Cautions in Interpreting Regression Results:
(i) The regression of $Y$ on $X$ measures the association between $Y$ and $X$. A strong association between $Y$ and $X$ does not necessarily mean that changes in $X$ cause changes in $Y$. A strong association between $Y$ and $X$ could be explained by $Y$ causing changes in $X$ or by there being a lurking variable that is related to both $X$ and $Y$.
(ii) The regression model cannot be relied on to make accurate predictions for the $Y$ of observations with $X$ outside the range of the observed $X$'s, $[\min(X_1, \ldots, X_n), \max(X_1, \ldots, X_n)]$. The prediction intervals for the $Y$ of observations with $X$ outside the range of the observed $X$'s are also not reliable. Trying to use the regression model to predict the $Y$ of observations with $X$ outside the range of the observed $X$'s is called extrapolation.