OLS Assumptions
OLS Assumptions |Parameter Estimates |Centering | |
|No measurement error (biased estimates: inflated) |α = “true” intercept |a = sample estimate of the intercept |How: subtract the mean of X from each X (same for Y) |
| | | |Run OLS with the new values |
|No specification error |β = “true” slope |b = sample estimate of the slope | |
|Non-linear relationship modeled as linear (biased estimates) |ε = “true” error |e = sample residual |Why: might make results easier to interpret |
|Relevant X’s excluded (biased estimates) |“true” regression equation: for the total population |Consequences: slope (b) no change; intercept (a) |
| | |changes |
|Irrelevant X’s included (inflated standard errors) |estimated regression equation: for sample | |
|Error term assumptions | |Variance (standard error) of Slope Estimate |
|Homoskedasticity (variance of error term is constant) |Criteria used by OLS for fitting a line |Tells you how stable the line is |
|(if not, inflated/deflated standard errors) | | |
| |Basic idea is to choose estimates b1 to bk to minimize the |High variance (points more spread around the line): |
| |sum of the squared residuals (errors) |Bad → line is less stable |
|No autocorrelation (residuals are not correlated) | | |
|(if not, inflated/deflated standard errors) | | |
| | |Low variance (points compacted around the line): |
| | |Good → line is more stable |
|Residuals average out to 0 (this is built into OLS) |Problems with the Slope (parameter) Estimate | |
|Covariance between residuals and ind var = 0 |1. Biased slope estimate: averaged over an infinite number of |As sample size increases, variance usually decreases |
|(if leave relevant var out, biased estimates) |samples, the estimate will not equal the true population | |
| |value | |
| | | |
|Error terms are normally distributed |2. Want efficient estimators: the unbiased estimator with the |Forcing the Intercept Through the Origin |
| |least variance | |
| | |If theory predicts that if X = 0, Y should = 0 |
|R2 | |Not a good idea: |
|% of the variance in Y that is “explained” by X |Standardized Estimates (Beta Weights) |1) changes the slope (strength of the relationship) |
|Measures goodness of fit |Change in Y in standard deviation units brought about by |2) can’t test H0: α = 0 |
|$R^2 = \frac{\text{Regression Sum of Squares (RSS)}}{\text{Total Sum of Squares (TSS)}} = \frac{\sum (\hat{y}_i - \bar{y})^2}{\sum (y_i - \bar{y})^2}$ |one standard deviation unit change in X | |
| | | |
| | |3) won’t work if you really have curvilinear rel |
| |Units are lost in the transformation (now in std dev units)|4) maybe it doesn’t make sense to talk about the line |
| | |at all if X = 0 |
|Problems with R2: |Hard to convey meaning to the reader (b/c std dev units) | |
|1) not a measure of magnitude of rel btwn X & Y | |5) may have a bad sample; makes it appear sig. |
|2) depends on the std dev of X & Y |Functional Transformations of Independent Variable |6) if you force the line you deny the chance to see if |
|can’t compare across samples | |there is something wrong with the model and if the |
|biased in small samples | |model actually predicts an intercept of 0 |
| |Used if there is a non-linear relationship btwn X & Y | |
| |Log (X) | | |
| |√X | | |
|3) addition of any variable will increase R2 | | |7) costs of leaving a in are minor compared to taking |
|include variables only because of theory | | |it out |
| | | | |
| | | |8) R2, slope, intercept all change; difficult to |
| | | |interpret |
|Confidence Intervals for β | | | |
|Over all samples, 95% of the computed confidence | | |Parameter Estimates & Degrees of Freedom |
|intervals will cover the true β | | | |
| |X² | |Degrees of Freedom: n-k-1 |
|$\hat{\beta}_i \pm t_{\alpha/2} \cdot \text{std error}(\hat{\beta}_i)$ | | |Parameters: k+1 (k slopes plus the intercept) |
| | | | |
|Confidence Intervals for E(Y0) | | |t-statistics and One Tailed Tests |
|Over all samples, 95% of the confidence intervals will | | |$t = \frac{b}{\text{std err of } b}$ |
|cover the true Y value | | | |
|Each Y value has its own confidence interval |To interpret the results, the new values would be plugged |One-tailed tests should be used if the researcher’s |
| |into the equation to get predicted y values |theory suggests that the relationship between the two |
| | |variables goes in a specific direction |
|Extrapolation is when you predict a value for Y with an | | |
|X outside the range actually observed in your sample | | |
| | | |
| |Standard Error of Regression (Std Dev of Residuals) | |
| |(Root Mean Sqd Error; Std error of Estimate) | |
|Adjusted R2 | |P values |
|$\text{Adj } R^2 = \left(R^2 - \frac{k}{n-1}\right)\left(\frac{n-1}{n-(k+1)}\right)$ | |A p value is the lowest significance level at which a |
| | |null hypothesis can be rejected; they are connected |
| | |with the t statistic |
|Since adding variables inflates the R2, adjusted R2 |$\sqrt{\frac{\sum e_i^2}{n-(k+1)}} = \sqrt{\text{mean squared error}}$ | |
|takes the degrees of freedom into account to fix this | | |
| | | | | | |
|Dummy Variable |Measures the goodness of fit |To test H0: β = 0 against β ≠ 0, find the probability of |
| | |obtaining a test statistic at least as extreme as the |
| | |one actually observed (the p value for the t statistic); |
| | |reject H0 if that probability is below the chosen |
| | |significance level (.05 is the usual cutoff), otherwise |
| | |fail to reject it |
|Changes the intercept |Alternative to using the R2 | |
|The two groups start at different points, but have the |Not dependent on the standard deviations of X or Y | |
|same slope | | |
| |Just one number for each equation (in Y units) | |
| |Can compare across samples | |
|Interactions | |To determine if β > 0, a one-tailed test is needed; to |
| | |do this, only one half of the area under the graph is |
| | |analyzed (1/2 of the p value) |
|Changes the slopes |Multicollinearity (MC) | |
|The difference in slopes between the groups |X1 can be predicted if the values of X2 and X3 are known | |
|Should calculate predicted Y values to see impact |Can’t tell which variable is actually having an impact | |
|Use the means to calculate the predicted values for the |It exists in degrees and the magnitude of it determines |Standard Error of Parameter Estimate |
|variables used in the interaction (4 equations) but |whether or not it is a problem | |
|include all the variables | | |
| | |$\sqrt{\frac{\sum e_i^2 / (n-(k+1))}{\sum (x_i - \bar{x})^2}}$ |
| |Inflates the standard errors: all look less significant | |
|Possible interpretations (ex: interaction of |Diagnose it using VIF (variance inflation factor) scores: | |
|gender/feeling) |a score of 4 or 5 is the usual cutoff for problems; higher | |
| |scores are more problematic | |
|Among Republicans, women like Clinton even less | |Variance Inflation Factor (VIF) |
|Among women, Republicans like Clinton even less | |$\text{VIF} = \frac{1}{1 - \text{aux}R^2}$ |
| | | |
|There is an additional effect of gender on feelings |High Aux R2 (> .75) indicates high MC: you can explain a | | |
|towards Clinton |lot of X1 with the other variables | | |
| | | |
| |What to do about it: |Finding the Standard Error |
|Outliers |Get more data; pool data (but that is problematic) |$\frac{\text{standard error of regression}}{\sqrt{\sum (x_i - \bar{x})^2}}$ |
| |Combine variables (ex: socioeconomic status combines | |
| |education, income, and occupational prestige which alone | |
| |tend to be highly correlated) | |
| | | |
| | |Miscellaneous Info |
| |Drop one X |Adjusted R2 ↑ Standard Error of Reg ↓ |
| |Don’t do this: only added it because it was theoretically |Sample Size is total df from ANOVA table + 1 |
| |important | |
| | |If std error of estimate is inflated, t will drop, p |
| | |goes up, keep H0 when it should be rejected |
| |If you drop a relevant right hand side (RHS) variable you | |
| |get biased parameter estimates | |
| | |If std error of estimate is deflated, t will go up, p |
| | |goes down, reject H0 when it should be kept |
| |It is acceptable to run 2 models: | |
| |1) with all variables |$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$ |
| |2) with some variables dropped, showing that it could be | |
| |misestimating | |
| | | |
| |You are giving full information (important) | |
| | | |
| |How to deal with outliers → Diagnose them |
| |1. DFBETA measures how much a particular coefficient |SPSS can give you all of these numbers |
| |changes when a particular case is removed; if the value is| |
| |big, the case has a large impact | |
| | | |
| | | |
| |2. Cook’s Distance (Cook’s D) measures the influence of an |Cook’s D: $D_i > \frac{4}{n-(k+1)}$ |
| |observation on the model as a whole | |
| | | |
| |3. look at abs value of standardized residuals; helps flag | |
| |cases; $\lvert e \rvert > 3$ indicates it is pretty far off the line | |
| | | |
|Missing Values |F-test (limitations listed below) |Heteroskedasticity |
|1. listwise deletion: if a value for any X is missing, |Testing to see if the variables are significant |When variance of error term is not constant; violation |
|entire case deleted (lose lots of data) | |of OLS assumptions: Var(ε) = constant |
| |Null hypothesis: none of the variables have a significant | |
| |impact (b1 = b2 = ... = bk = 0) | |
|2. pairwise deletion (not a great idea, but not evil): | |When to suspect it: |
|uses info in pairs of variables to estimate slope | | |
|coefficients | | |
| |Alt. Hypothesis: at least one variable has a significant |Pooled data |
| |impact | |
| |When H0 is false Mean Reg Sum of Sqrs > Mean Err SS |Learning curves with coding |
|Problem: get different sample size for each variable, | |Fatigue effects |
|some variables have more/less info | | |
| |Durbin Watson Scores: H0: no autocorrelation (AC) |Whenever you think some portion of the data will be better |
| | |reported/predicted |
|3. substitute mean (mean substitution) NOT A GOOD IDEA: |Decision regions on the d scale (0 to 4): | |
|will bias over time; substitute the mean value for |near 0: positive AC (reject H0); then inconclusive (?); | |
|missing ones |near 2: no AC (fail to reject H0); then inconclusive (?); | |
| |near 4: negative AC (reject H0) |Predictive power of model not consistent across cases |
| | |or variables |
|No positive value in doing this | |Consequences |
| | | |
|No guarantee that the mean of missing values will be the|Autocorrelation |Inflated standard error (conclude there is less sig) |
|same as for the ones you have values for—could be | | |
|putting the wrong value in | | |
| |Residuals are correlated; usually happens with time series |If small var(e) are located away from mean X, reported |
| |data; can inflate/deflate standard errors |std error is too large: underconfident that |
| | |b ≠ 0 |
|4. predict the value that is missing **best option**: |Diagnosing: | |
|run regression with missing var. as dep variable—use | | |
|actual info to make an educated guess | | |
| |1. scatterplot: look for pattern in residuals |Deflated standard error (conclude there is more sig) |
| |2. regress residuals on previous residuals |If large var(e) are located away from mean X, reported |
| | |std error will be too low |
| |$e_i = \rho e_{i-1} + \mu_i$ → looking for sig ρ | |
|Autocorrelation |How much bias is indicated? |How to diagnose it: |
| |abs(ρ): % bias induced |Scatter plot |
| |.0: 0% | |
| |.2: 3% | |
| |.5: 8% | |
| |.8: 19% | |
| |.9: 29% |Goldfeld-Quandt |
| |3. Durbin Watson Scores |Order observations by suspect x |
| |$\hat{d} = 2 - 2\hat{\rho}$ |Throw out middle observations |
|↑ variance |↓ variance, ↑ stable, ↓ std error |ρ = 0: d = 2 |Run 2 models using all original x’s |
| | |ρ = 1: d = 0 |$F = \frac{\text{mean residual SS}_1}{\text{mean residual SS}_2}$ |
|Heteroskedasticity | |ρ = -1: d = 4 | |
| | |Durbin Watson: save in SPSS, look up critical value |Look up value on F chart to see if it exceeds the critical |
| | | |value |
| | | | |
| | |F-test limitations |Limitation: only diagnoses at ends, not middle |
| | |Does not tell you which is significant, just that something| |
| | |is | |
| |If have insignificant F, all b insignificant |Glejser |
| | |Save residuals |
| |Logit |Run new regression with abs residual as DV and suspect |
| | |x as only IV |
| |OLS not for dichotomous DV | |
| |Predicted values can be > 1 or < 0, which are not options; actual values are |If new parameter est is sig, have hetero (can predict the |
| |only 0 & 1 |residual from x) |
| | | |
| |Induces heteroskedasticity—residuals clustered in the |Limitations: can’t check middle, only works for linear |
| |middle, but no actual values there | |
| | | |
| |Choice functions tend to be |White’s |
| | |Save residuals |
| | |Regress squared residuals on all IVs, their squares (but |
| | |not squares of dummies) & all potential interactions |
| |Can’t model probability as a straight line, if do, | |
| |misspecify model and bias parameter estimate | |
| | |$n \cdot R^2 \sim \chi^2$ |
| | |df = # of regressors |
| | |Limitations: could be overkill; doesn’t tell you where |
| | |the problem is; since have so many variables, some |
| | |could be randomly significant and you could diagnose |
| | |when not really problem |
| | | |
| | | |
| | | |
| | | |
| | |WLS |
| | |Assumes that best info in data is in the observations |
| | |with least variance in error terms |
| | | |
| | |Weighs some observations more than others |
| | |Divide all by √ hetero variable |
| | |Pure hetero should not bias parameter estimates, but |
| | |could be indication of measurement/specification |
| | |problems and correcting for it could bias parameter |
| | |estimates |
| | | |
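The hand formulas on this sheet (slope, R2, adjusted R2, standard error of regression, standard error of the slope, t) can be checked numerically. Below is a minimal sketch in Python with NumPy for the one-predictor case (k = 1). The function name `simple_ols` and the toy data are made up for illustration; they are not from the sheet.

```python
import numpy as np

def simple_ols(x, y):
    """Bivariate OLS using the sheet's hand formulas (k = 1 predictor)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = len(x), 1
    dx = x - x.mean()                      # deviations of X from its mean
    dy = y - y.mean()                      # deviations of Y from its mean
    b = np.sum(dx * dy) / np.sum(dx ** 2)  # slope
    a = y.mean() - b * x.mean()            # intercept
    y_hat = a + b * x                      # predicted values
    e = y - y_hat                          # residuals
    # R2 = regression sum of squares / total sum of squares
    r2 = np.sum((y_hat - y.mean()) ** 2) / np.sum(dy ** 2)
    # adjusted R2 in the form written on the sheet
    adj_r2 = (r2 - k / (n - 1)) * ((n - 1) / (n - (k + 1)))
    # standard error of regression (root mean squared error)
    se_reg = np.sqrt(np.sum(e ** 2) / (n - (k + 1)))
    # standard error of the slope estimate
    se_b = se_reg / np.sqrt(np.sum(dx ** 2))
    return {"a": a, "b": b, "r2": r2, "adj_r2": adj_r2,
            "se_reg": se_reg, "se_b": se_b, "t": b / se_b,
            "df": n - k - 1}

# Toy data (hypothetical)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

fit = simple_ols(x, y)
# Centering: subtract the mean of X from each X (same for Y), then rerun
centered = simple_ols(x - x.mean(), y - y.mean())

print(fit)
print(centered["a"], centered["b"])
```

On this toy data the slope is 0.6, the intercept is 2.2, and R2 is 0.6. The centered run keeps the slope at 0.6 and moves the intercept to 0, which matches the "slope no change; intercept changes" note under Centering.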
-----------------------
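The multicollinearity and autocorrelation diagnostics above (VIF = 1 / (1 - aux R2), regressing residuals on previous residuals, and the Durbin Watson approximation d = 2 - 2ρ) can be sketched the same way. This is only an illustration with NumPy on simulated data; the helper names (`aux_r2`, `vif`, `lag1_rho`, `durbin_watson`) and the simulated series are assumptions, not something the sheet specifies.

```python
import numpy as np

def aux_r2(X, j):
    """R2 from regressing column j of X on the other columns plus an intercept."""
    X = np.asarray(X, dtype=float)
    target = X[:, j]
    others = np.delete(X, j, axis=1)
    design = np.column_stack([np.ones(len(target)), others])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return 1.0 - np.sum(resid ** 2) / np.sum((target - target.mean()) ** 2)

def vif(X, j):
    """Variance inflation factor for column j: 1 / (1 - aux R2)."""
    return 1.0 / (1.0 - aux_r2(X, j))

def lag1_rho(e):
    """Slope from regressing e_i on e_(i-1), without an intercept."""
    e = np.asarray(e, dtype=float)
    return np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)

def durbin_watson(e):
    """Durbin Watson d: sum of squared successive differences over sum of squares."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(0)

# Simulated predictors: x2 is nearly a copy of x1, x3 is unrelated
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])
print("VIF x1:", vif(X, 0), "VIF x3:", vif(X, 2))

# Simulated residuals with positive autocorrelation (true rho = 0.8)
shocks = rng.normal(size=500)
e = np.zeros(500)
e[0] = shocks[0]
for t in range(1, 500):
    e[t] = 0.8 * e[t - 1] + shocks[t]

rho_hat = lag1_rho(e)
d = durbin_watson(e)
print("rho:", rho_hat, "d:", d, "2 - 2*rho:", 2 - 2 * rho_hat)
```

The collinear column should show a VIF far above the 4 or 5 cutoff while the unrelated column stays near 1, and d should land well below 2 and close to 2 - 2ρ for the positively autocorrelated series.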
[Flowchart: handling outliers. Box labels as extracted; the arrows between boxes did not survive. Are they severe? (moderate / yes: abs(e) > 3 std dev from 0); Explain as anomalies?; Delete, explain in footnote; Look harder for explanation; Explained? / Explained away; Stop; Analyze with and without; Sig difference in results? Yes: report both. No: report from whole dataset and add footnote (identified some outliers but they did not make a difference).]
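The two numeric outlier screens mentioned on the sheet (standardized residuals beyond 3, and Cook's D above 4 / (n - (k+1))) can be computed directly. The sketch below uses NumPy only. Two assumptions are worth flagging: residuals are standardized by dividing by the standard error of regression, and Cook's D uses the usual leverage-based formula, which the sheet does not write out. The function name and the toy data are hypothetical.

```python
import numpy as np

def outlier_diagnostics(X, y):
    """Return standardized residuals, Cook's D, and the 4/(n-(k+1)) cutoff."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]                     # treat a 1-D input as one predictor
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    p = k + 1                              # parameters: k slopes + intercept
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    e = y - design @ coef                  # residuals
    s2 = np.sum(e ** 2) / (n - p)          # mean squared error
    # leverage h_ii: diagonal of the hat matrix D (D'D)^-1 D'
    h = np.sum(design * (design @ np.linalg.inv(design.T @ design)), axis=1)
    z = e / np.sqrt(s2)                    # assumption: residual / std error of regression
    cooks_d = (e ** 2 / (p * s2)) * h / (1.0 - h) ** 2
    cutoff = 4.0 / (n - p)
    return z, cooks_d, cutoff

# Toy data (hypothetical): an exact line with the last case pushed far off it
x = np.arange(20, dtype=float)
y = 2.0 + 0.5 * x
y[19] += 10.0

z, cooks_d, cutoff = outlier_diagnostics(x, y)
flagged = np.where((np.abs(z) > 3) | (cooks_d > cutoff))[0]
print("cutoff:", cutoff)
print("flagged cases:", flagged)
```

Here only the shifted case is flagged by either rule. With real data the two screens can disagree, which is where the "analyze with and without" step of the flowchart comes in.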