Regression Analysis: t90 versus t50
Page 17
Correlation and Regression
Correlation and regression are used to explore the relationship between two or more variables. The correlation coefficient r is a measure of the linear relationship between two paired variables x and y. For data, it is a statistic calculated using the formula
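$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} $$
where \bar{x} and \bar{y} are the sample means of the n paired observations.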
The correlation coefficient satisfies -1 ≤ r ≤ 1. If y is a linear function of x, then r = 1 if the slope is positive and r = -1 if it is negative. We emphasize that r is a measure of linear relationship, not functional relationship.
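As a minimal sketch, the Python function below computes r as the sum of products of deviations divided by the square root of the product of the two sums of squared deviations. The name pearson_r and the small input lists are illustrative assumptions, not part of these notes. The two calls check the statement that an exact linear relationship gives r = 1 for a positive slope and r = -1 for a negative slope.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical x values; each y is an exact linear function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
r_up = pearson_r(x, [2.0 + 3.0 * a for a in x])    # positive slope
r_down = pearson_r(x, [2.0 - 3.0 * a for a in x])  # negative slope
print(round(r_up, 6), round(r_down, 6))
```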
Example. Here is a small dataset:
x y
-3.0 0.00000
-2.5 1.65831
-2.0 2.23607
-1.5 2.59808
-1.0 2.82843
-0.5 2.95804
0.0 3.00000
0.5 2.95804
1.0 2.82843
1.5 2.59808
2.0 2.23607
2.5 1.65831
3.0 0.00000
Mean of y = 2.11983
Correlations: x, y
Pearson correlation r of x and y = -0.000; P-Value = 1.000
You see the relationship, of course: x² + y² = 9. Here y is an exact (nonlinear) function of x, yet r = 0.
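The numbers in this example can be checked with a short Python sketch. It rebuilds the thirteen points from the semicircle relationship (taking the nonnegative root, as in the table) and recomputes the mean of y and the correlation. The variable names are illustrative choices.

```python
import math

# The x values from the table; y is the upper half of the circle x^2 + y^2 = 9.
x = [-3.0 + 0.5 * i for i in range(13)]
y = [math.sqrt(9.0 - a * a) for a in x]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
sxx = sum((a - x_bar) ** 2 for a in x)
syy = sum((b - y_bar) ** 2 for b in y)
r = sxy / math.sqrt(sxx * syy)

print(f"mean of y = {y_bar:.5f}")
print(f"|r| = {abs(r):.3f}")
```

Because the points are symmetric about x = 0, the positive and negative products of deviations cancel, which is why r vanishes despite the exact relationship.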
A more concrete interpretation of r will be given later when we discuss regression. For the moment, view high values of | r | as indicating that the points (x, y) lie nearly on a straight line while low values indicate that no obvious line passes close to all the points.
Page 18
Astronomy Example.
We use the data from Mukherjee, Feigelson, Babu, et al., “Three Types of Gamma-Ray Bursts” (The Astrophysical Journal, 508, pp. 314-327, 1998), in which there are 11 variables, including two measures of burst duration, t50 and t90 (the times in which 50% and 90% of the flux arrives). These data will be used to illustrate concepts in correlation and regression.
One would expect, from the context of the example, that the variables t50 and t90 ought to be strongly related. Here is output from Minitab showing the value of r:
Correlations: t50, t90
Pearson correlation of t50 and t90 = 0.868
P-Value = 0.000
Regression Analysis: t90 versus t50
The regression equation is
t90 = 10.1 + 1.65 t50
Predictor     Coef  SE Coef      T      P
Constant    10.106    1.066   9.48  0.000
t50        1.64553  0.03333  49.37  0.000
S = 26.8058 R-Sq = 75.3% R-Sq(adj) = 75.3%
Analysis of Variance
Source           DF       SS       MS        F      P
Regression        1  1751506  1751506  2437.55  0.000
Residual Error  800   574841      719
Total           801  2326347
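The summary quantities in this output are tied together by simple arithmetic, and a short Python sketch can confirm that. It uses only numbers copied from the output; the variable names are illustrative. It recomputes S as the square root of the mean square error, R-Sq as the regression sum of squares over the total, F as the ratio of mean squares, and T for the slope as the coefficient over its standard error.

```python
import math

# Values copied from the Minitab output (t90 regressed on t50).
ss_regression, df_regression = 1751506.0, 1
ss_error, df_error = 574841.0, 800
coef_t50, se_coef_t50 = 1.64553, 0.03333

ss_total = ss_regression + ss_error      # matches the Total row, 2326347
ms_error = ss_error / df_error           # mean square error
s = math.sqrt(ms_error)                  # residual standard deviation S
r_sq = ss_regression / ss_total          # R-Sq as a fraction
f_stat = (ss_regression / df_regression) / ms_error
t_stat = coef_t50 / se_coef_t50          # T for the slope

print(f"S = {s:.4f}, R-Sq = {100 * r_sq:.1f}%")
print(f"F = {f_stat:.2f}, T = {t_stat:.2f}, T^2 = {t_stat ** 2:.1f}")
print(f"sqrt(R-Sq) = {math.sqrt(r_sq):.3f}")
```

With a single predictor, the square of T for the slope agrees with F up to rounding of the printed coefficients, and the square root of R-Sq reproduces the correlation 0.868 reported earlier.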
Page 19
Unusual Observations (Partial list!)
Obs  t50      t90      Fit  SE Fit  Residual  St Resid
  2   69  208.576  123.002   2.031    85.574   3.20R
  5  306  430.016  514.242   9.768   -84.226  -3.37RX
  7   30  381.248   58.866   1.070   322.382  12.04R
 10   95  182.016  166.813   2.847    15.203   0.57 X
 20  204  292.736  345.530   6.375   -52.794  -2.03RX
 26   61  221.184  111.207   1.823   109.977   4.11R
 44   30  123.392   58.761   1.069    64.631   2.41R
 77   58  179.840  104.783   1.713    75.057   2.81R
127   19  110.464   41.911   0.959    68.553   2.56R
130  108  158.080  187.665   3.248   -29.585  -1.11 X
R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large influence.
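The columns of this table can be reproduced from one another, which helps in reading the R and X flags. The Python sketch below uses four rows copied from the table and the standard relations leverage h = (SE Fit / S)² and St Resid = Residual / (S·sqrt(1 - h)). The cutoff |St Resid| > 2 for R is consistent with the rows shown. The cutoff h > 3p/n for X is an assumption about the rule the software applies, and the variable names are illustrative.

```python
import math

S = 26.8058      # residual standard deviation from the regression output
N, P = 802, 2    # 802 observations (Total DF 801 + 1), 2 fitted coefficients

# (obs, t90, fit, se_fit) copied from the table of unusual observations
rows = [
    (2, 208.576, 123.002, 2.031),
    (5, 430.016, 514.242, 9.768),
    (7, 381.248, 58.866, 1.070),
    (10, 182.016, 166.813, 2.847),
]

results = {}
for obs, y, fit, se_fit in rows:
    residual = y - fit
    leverage = (se_fit / S) ** 2                 # h = (SE Fit / S)^2
    st_resid = residual / (S * math.sqrt(1.0 - leverage))
    flag_r = abs(st_resid) > 2.0                 # "R": large standardized residual
    flag_x = leverage > 3.0 * P / N              # "X": assumed cutoff 3p/n
    results[obs] = (st_resid, flag_r, flag_x)
    print(obs, round(residual, 3), round(leverage, 4),
          round(st_resid, 2), flag_r, flag_x)
```

Observation 10 illustrates the difference between the flags: its residual is unremarkable, but its t50 value lies far from the bulk of the data, so it has high leverage.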
Residuals vs Fits for t90
The plot shows nonconstant variance and some very unusual standardized residuals. This suggests transforming both x and y. We will discuss this later.
Transform by taking logs (base 10) of both variables.
Correlations: log(t50), log(t90)
Pearson correlation of log(t50) and log(t90) = 0.975
P-Value = 0.000
Page 20
Regression Analysis: log(t90) versus log(t50)
The regression equation is
log(t90) = 0.413 + 0.984 log(t50)
Predictor      Coef   SE Coef       T      P
Constant   0.413395  0.008372   49.38  0.000
log(t50)   0.984235  0.007863  125.17  0.000
S = 0.206054 R-Sq = 95.1% R-Sq(adj) = 95.1%
Analysis of Variance
Source           DF      SS      MS         F      P
Regression        1  665.19  665.19  15666.85  0.000
Residual Error  800   33.97    0.04
Total           801  699.15
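To read the log-log fit on the original scale, the fitted line can be back-transformed into a power law, t90 ≈ 10^0.413 · t50^0.984. The Python sketch below does this with the printed coefficients. It assumes the logs are base 10, which is inferred from observation 7 (t90 = 381.248 and log(t90) = 2.58121). The function name predict_t90 and the three input durations are hypothetical, chosen only for illustration.

```python
import math

# Coefficients from the log-log fit; logs are assumed to be base 10.
b0, b1 = 0.413395, 0.984235

def predict_t90(t50):
    """Back-transform the fitted line: t90 = 10**b0 * t50**b1."""
    return 10.0 ** (b0 + b1 * math.log10(t50))

multiplier = 10.0 ** b0            # value of the fitted curve at t50 = 1
print(f"t90 is roughly {multiplier:.2f} * t50^{b1:.3f}")
for t50 in (1.0, 10.0, 100.0):     # hypothetical durations, for illustration
    print(t50, round(predict_t90(t50), 2))
```

Since the exponent is close to 1, the fit says t90 is nearly proportional to t50, with t90 about 2.6 times t50 for durations near 1.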
Unusual Observations (Partial list)
Obs  log(t50)  log(t90)       Fit   SE Fit  Residual  St Resid
  7      1.47   2.58121   1.86195  0.01040   0.71925   3.50R
 12      0.83   1.70600   1.22772  0.00765   0.47828   2.32R
 47     -1.85  -1.46852  -1.41125  0.02008  -0.05727  -0.28 X
 56     -1.72  -1.17393  -1.28072  0.01911   0.10679   0.52 X
 60      0.70   1.52799   1.10066  0.00740   0.42733   2.08R
 95      0.76   1.79607   1.16183  0.00750   0.63425   3.08R
125     -0.89  -0.04769  -0.46532  0.01332   0.41763   2.03R
141     -0.11   0.82321   0.30056  0.00885   0.52265   2.54R
156     -1.06  -0.10018  -0.63037  0.01445   0.53019   2.58R
168      0.20   1.50775   0.61430  0.00771   0.89345   4.34R
175      0.97   2.23188   1.36569  0.00806   0.86619   4.21R
176      0.20   1.26102   0.61430  0.00771   0.64673   3.14R
210      1.02   2.08812   1.41832  0.00825   0.66980   3.25R
215      0.44   1.38710   0.84611  0.00731   0.54099   2.63R
221     -0.72  -0.71670  -0.29201  0.01219  -0.42469  -2.06R
255      0.17   1.10285   0.57866  0.00780   0.52419   2.55R
272      1.45   2.26194   1.84404  0.01030
R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large influence.
Page 21
Plot of Residuals vs. Fitted Values ‘Four in One Plot’
There are still some unusual observations, but the plots look much better.
Before leaving this example, we note the following relationship:
Correlations: log t90, FITS1
Pearson correlation r of log t90 and FITS1 = 0.975
P-Value = 0.000
If you look back to page 19, bottom, you will see that the correlation between log t90 and log t50 is also 0.975. That is, the correlation between the fitted (predicted) values and the observations y is the same as the correlation between the two variables x = log t50 and y = log t90. This holds because the fitted values are a linear function of x with positive slope, and r is unchanged by such a transformation.
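This equality is not special to the burst data, and a short Python sketch shows it on any small example. The six (x, y) pairs below are made up purely for illustration and are not taken from the burst dataset. The helper pearson_r and the variable names are likewise illustrative. The sketch fits a least-squares line, computes the fitted values, and compares the two correlations.

```python
import math

def pearson_r(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

# Hypothetical data, made up for illustration only (not the burst data).
x = [0.2, 0.5, 0.9, 1.4, 1.8, 2.3]
y = [0.7, 0.8, 1.5, 1.6, 2.4, 2.5]

# Least-squares slope and intercept for y regressed on x.
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
      / sum((a - x_bar) ** 2 for a in x))
b0 = y_bar - b1 * x_bar
fits = [b0 + b1 * a for a in x]

r_xy = pearson_r(x, y)
r_fit_y = pearson_r(fits, y)
print(round(r_xy, 6), round(r_fit_y, 6))
```

If the fitted slope were negative, the correlation between the fitted values and y would equal -r instead, so in general it equals |r|.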
Brief Overview of Forthcoming Topics:
• We will first look at general linear regression model analysis in matrix terms.
• We will discuss some model assumptions and how they can be examined.
• Then we will present an example with five predictors and some techniques for model fitting.
• We will then discuss generalized linear models, including logistic and Poisson regression.