Chapter 11 – Simple linear regression
MAR 5621
Advanced Statistical Techniques
Summer 2003
Dr. Larry Winner
Types of Regression Models (Sec. 11-1)
Linear Regression: Yi = β0 + β1Xi + εi
• Yi - Outcome of Dependent Variable (response) for ith experimental/sampling unit
• Xi - Level of the Independent (predictor) variable for ith experimental/sampling unit
• β0 + β1Xi - Linear (systematic) relation between Yi and Xi (aka conditional mean)
• β0 - Mean of Y when X=0 (Y-intercept)
• β1 - Change in mean of Y when X increases by 1 (slope)
• εi - Random error term
Note that β0 and β1 are unknown parameters. We estimate them by the least squares method.
Polynomial (Nonlinear) Regression: Yi = β0 + β1Xi + β2Xi² + εi
This model allows for a curvilinear (as opposed to straight line) relation. Both linear and polynomial regression are susceptible to problems when predictions of Y are made outside the range of the X values used to fit the model. This is referred to as extrapolation.
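As an illustration (not part of the original notes), the following Python/numpy sketch fits both the straight-line and the quadratic (polynomial) models to a small made-up data set; the data values and variable names are hypothetical.

    import numpy as np

    # Hypothetical sample (made-up values for illustration only)
    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    Y = np.array([2.1, 3.9, 6.2, 7.8, 9.1, 12.3])

    # Straight-line model: np.polyfit returns coefficients highest power first
    b1, b0 = np.polyfit(X, Y, deg=1)

    # Quadratic (polynomial) model: Y = b0 + b1*X + b2*X^2
    b2q, b1q, b0q = np.polyfit(X, Y, deg=2)

    print("linear:    b0=%.3f b1=%.3f" % (b0, b1))
    print("quadratic: b0=%.3f b1=%.3f b2=%.3f" % (b0q, b1q, b2q))

    # Extrapolation caution: neither fitted model should be trusted at, say,
    # X = 20, which lies far outside the observed range (1 to 6).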
Least Squares Estimation (Sec. 11-2)
1. Obtain a sample of n pairs (X1,Y1)…(Xn,Yn).
2. Plot the Y values on the vertical (up/down) axis versus their corresponding X values on the horizontal (left/right) axis.
3. Choose the line Ŷi = b0 + b1Xi that minimizes the sum of squared vertical distances from observed values (Yi) to their fitted values (Ŷi). Note: the least squares estimates are b1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)² and b0 = Ȳ − b1X̄ (a computational sketch follows this list).
4. b0 is the Y-intercept for the estimated regression equation
5. b1 is the slope of the estimated regression equation
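The least squares estimates are a one-line computation in most software. As a minimal sketch (the data values below are hypothetical, not from the notes):

    import numpy as np

    # Hypothetical (X, Y) pairs -- replace with your own sample
    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

    Xbar, Ybar = X.mean(), Y.mean()

    # Least squares estimates of slope and intercept
    b1 = np.sum((X - Xbar) * (Y - Ybar)) / np.sum((X - Xbar) ** 2)
    b0 = Ybar - b1 * Xbar

    Yhat = b0 + b1 * X                 # fitted values
    SSE = np.sum((Y - Yhat) ** 2)      # the minimized sum of squared distances
    print(b0, b1, SSE)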
Measures of Variation (Sec. 11-3)
Sums of Squares
▪ Total sum of squares = Regression sum of squares + Error sum of squares
▪ Total variation = Explained variation + Unexplained variation
▪ Total sum of squares (Total Variation): SST = Σ(Yi − Ȳ)²
▪ Regression sum of squares (Explained Variation): SSR = Σ(Ŷi − Ȳ)²
▪ Error sum of squares (Unexplained Variation): SSE = Σ(Yi − Ŷi)²  (a numerical check of the identity follows this list)
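To make the decomposition concrete, here is a minimal Python sketch (hypothetical data) that verifies SST = SSR + SSE numerically:

    import numpy as np

    # Hypothetical sample; b0, b1 are the least squares estimates
    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    Yhat = b0 + b1 * X

    SST = np.sum((Y - Y.mean()) ** 2)     # total variation
    SSR = np.sum((Yhat - Y.mean()) ** 2)  # explained variation
    SSE = np.sum((Y - Yhat) ** 2)         # unexplained variation
    print(SST, SSR + SSE)                 # equal up to rounding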
Coefficients of Determination and Correlation
Coefficient of Determination
▪ Proportion of variation in Y “explained” by the regression on X
▪ r² = SSR / SST
Coefficient of Correlation
▪ Measure of the direction and strength of the linear association between Y and X
▪ r = +√r² if b1 > 0, r = −√r² if b1 < 0 (the sign of r matches the sign of the slope)
Standard Error of the Estimate (Residual Standard Deviation)
▪ Estimated standard deviation of the data around the regression line (estimates the standard deviation of the error term ε)
▪ SYX = √(SSE / (n−2))  (a computational sketch follows this list)
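Continuing the same hypothetical example, r², r, and SYX follow directly from the sums of squares (a sketch, not a prescribed implementation):

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
    n = len(X)
    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    Yhat = b0 + b1 * X

    SST = np.sum((Y - Y.mean()) ** 2)
    SSE = np.sum((Y - Yhat) ** 2)
    SSR = SST - SSE

    r_sq = SSR / SST                     # coefficient of determination
    r = np.sign(b1) * np.sqrt(r_sq)      # correlation; sign follows the slope
    S_YX = np.sqrt(SSE / (n - 2))        # standard error of the estimate
    print(r_sq, r, S_YX)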
Model Assumptions (Sec. 11-4)
▪ Normally distributed errors
▪ Homoscedasticity (constant error variance for Y at all levels of X)
▪ Independent errors (usually checked when data collected over time or space)
Residual Analysis (Sec. 11-5)
Residuals: ei = Yi − Ŷi
Plots (see prototype plots in book and in class):
▪ Plot of ei vs Ŷi  Can be used to check for linear relation, constant variance
• If relation is nonlinear, U-shaped pattern appears
• If error variance is non constant, funnel shaped pattern appears
• If assumptions are met, random cloud of points appears
▪ Plot of ei vs Xi Can be used to check for linear relation, constant variance
• If relation is nonlinear, U-shaped pattern appears
• If error variance is non constant, funnel shaped pattern appears
• If assumptions are met, random cloud of points appears
▪ Plot of ei vs i Can be used to check for independence when collected over time (see next section)
▪ Histogram of ei
• If distribution is normal, histogram of residuals will be mound-shaped, around 0
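The prototype plots above are easy to produce yourself. The following matplotlib sketch (hypothetical data) draws the three standard diagnostics:

    import numpy as np
    import matplotlib.pyplot as plt

    # Hypothetical sample and least squares fit
    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
    Y = np.array([2.2, 3.8, 6.1, 7.9, 9.6, 12.4, 13.9, 16.2])
    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    Yhat = b0 + b1 * X
    e = Y - Yhat                                   # residuals

    fig, axes = plt.subplots(1, 3, figsize=(12, 3))
    axes[0].scatter(Yhat, e); axes[0].axhline(0); axes[0].set_title("e vs fitted")
    axes[1].scatter(X, e);    axes[1].axhline(0); axes[1].set_title("e vs X")
    axes[2].hist(e);          axes[2].set_title("histogram of e")
    plt.tight_layout(); plt.show()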
Measuring Autocorrelation – Durbin-Watson Test (Sec. 11-6)
Plot residuals versus time order
▪ If errors are independent there will be no pattern (random cloud centered at 0)
▪ If errors are not independent (positively autocorrelated), residuals that are close together in time tend to have similar values, producing a distinct curved/cyclical pattern centered at 0.
Durbin-Watson Test
▪ H0: Errors are independent (no autocorrelation among residuals)
▪ HA: Errors are dependent (Positive autocorrelation among residuals)
▪ Test Statistic: D = Σ(i=2 to n) (ei − ei−1)² / Σ(i=1 to n) ei²
▪ Decision Rule (values of dL and dU are given in Table E.10):
▪ If D > dU then conclude H0 (independent errors), where k is the number of independent variables (k=1 for simple regression)
▪ If D < dL then conclude HA (dependent errors), where k is the number of independent variables (k=1 for simple regression)
▪ If dL ≤ D ≤ dU then we withhold judgment (possibly need a longer series)
NOTE: This is a test where you typically “want” the conclusion to be in favor of the null hypothesis. If you reject the null in this test, a more complex model needs to be fit (this will be discussed in Chapter 13).
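A minimal sketch of the calculation (the residual values below are made up; D itself is just the ratio defined above):

    import numpy as np

    # Hypothetical residuals in time order (e1, ..., en)
    e = np.array([0.5, 0.3, 0.4, -0.1, -0.4, -0.6, -0.2, 0.1, 0.4, 0.3])

    # Durbin-Watson statistic: sum of squared successive differences
    # divided by the sum of squared residuals
    D = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
    print(D)   # compare with dL and dU from Table E.10 (k = 1, n = len(e))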
Inferences Concerning the Slope (Sec. 11-7)
t-test
Test used to determine whether the population based slope parameter (β1) is equal to a pre-determined value (often, but not necessarily, 0). Tests can be one-sided (pre-determined direction) or two-sided (either direction).
2-sided t-test:
H0: β1 = β1,0    HA: β1 ≠ β1,0   (β1,0 is the pre-determined value, often 0)
Test Statistic: tobs = (b1 − β1,0) / Sb1, where Sb1 = SYX / √(Σ(Xi − X̄)²)
Decision Rule: Reject H0 if |tobs| ≥ t(α/2, n−2)
1-sided t-test (Upper-tail, reverse signs for lower tail):
H0: β1 ≤ β1,0    HA: β1 > β1,0
Test Statistic: tobs = (b1 − β1,0) / Sb1
Decision Rule: Reject H0 if tobs ≥ t(α, n−2)
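The test statistic can be computed as sketched below in Python with scipy; the data are hypothetical, and β1,0 = 0 is assumed for illustration:

    import numpy as np
    from scipy import stats

    # Hypothetical sample
    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    Y = np.array([2.1, 3.9, 6.2, 7.8, 9.1, 12.3])
    n = len(X)

    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    SSE = np.sum((Y - (b0 + b1 * X)) ** 2)
    S_YX = np.sqrt(SSE / (n - 2))
    S_b1 = S_YX / np.sqrt(np.sum((X - X.mean()) ** 2))  # standard error of b1

    beta1_0 = 0.0                               # hypothesized slope (assumed)
    t_obs = (b1 - beta1_0) / S_b1
    p_two_sided = 2 * stats.t.sf(abs(t_obs), df=n - 2)
    print(t_obs, p_two_sided)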
F-test (based on k independent variables)
A test based directly on the sums of squares that tests the specific hypothesis of whether the slope parameter is 0 (2-sided). The book describes the general case of k predictor variables; for simple linear regression, k=1.
Analysis of Variance (based on k Predictor Variables)
Source      df       Sum of Squares   Mean Square          F
Regression  k        SSR              MSR = SSR/k          Fobs = MSR/MSE
Error       n-k-1    SSE              MSE = SSE/(n-k-1)    ---
Total       n-1      SST              ---                  ---
Decision Rule: Reject H0 (all k slopes equal 0) if Fobs ≥ F(α; k, n-k-1)
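As a rough illustration of the ANOVA computation (hypothetical data, k = 1); note that for simple regression Fobs equals the square of the t statistic for the slope:

    import numpy as np
    from scipy import stats

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    Y = np.array([2.1, 3.9, 6.2, 7.8, 9.1, 12.3])
    n, k = len(X), 1                       # k = 1 predictor in simple regression

    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    Yhat = b0 + b1 * X

    SSR = np.sum((Yhat - Y.mean()) ** 2)
    SSE = np.sum((Y - Yhat) ** 2)
    MSR, MSE = SSR / k, SSE / (n - k - 1)

    F_obs = MSR / MSE
    p_value = stats.f.sf(F_obs, k, n - k - 1)
    print(F_obs, p_value)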
(1-α)100% Confidence Interval for the slope parameter, β1: b1 ± t(α/2, n−2)·Sb1
▪ If entire interval is positive, conclude β1>0 (Positive association)
▪ If interval contains 0, conclude (do not reject) β1=0 (No association)
▪ If entire interval is negative, conclude β1<0 (Negative association)
(The same decision rules apply to a slope βj when the model contains k predictor variables.)
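A minimal sketch of the interval calculation (hypothetical data; α = 0.05 assumed):

    import numpy as np
    from scipy import stats

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    Y = np.array([2.1, 3.9, 6.2, 7.8, 9.1, 12.3])
    n = len(X)

    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    SSE = np.sum((Y - (b0 + b1 * X)) ** 2)
    S_b1 = np.sqrt(SSE / (n - 2)) / np.sqrt(np.sum((X - X.mean()) ** 2))

    alpha = 0.05
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    lower, upper = b1 - t_crit * S_b1, b1 + t_crit * S_b1
    print(lower, upper)  # an entirely positive interval -> conclude beta1 > 0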