Lecture 4: Multivariate Regression Model in Matrix Form
Takashi Yamano
Lecture Notes on Advanced Econometrics
In this lecture, we rewrite the multiple regression model in matrix form. A general multiple-regression model can be written as

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + u_i \quad \text{for } i = 1, \dots, n.$$
In matrix form, we can rewrite this model as

$$
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1k} \\
1 & x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{n1} & x_{n2} & \cdots & x_{nk}
\end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix}
+
\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}
$$

where the four arrays are $n \times 1$, $n \times (k+1)$, $(k+1) \times 1$, and $n \times 1$, respectively. In compact notation,

$$Y = X\beta + u$$

We want to estimate $\beta$.
Least Squared Residual Approach in Matrix Form (Please see Lecture Note A1 for details)
The strategy in the least squared residual approach is the same as in the bivariate linear regression model: first we calculate the sum of squared residuals, and second we find the set of estimators that minimizes that sum. Thus, the minimization problem for the sum of squared residuals in matrix form is

$$\min_{\beta}\; u'u = (Y - X\beta)'(Y - X\beta)$$

where $u'$ is $1 \times n$ and $u$ is $n \times 1$.
Notice here that $u'u$ is a scalar, i.e., a single number (such as 10,000), because $u'$ is a $1 \times n$ vector and $u$ is an $n \times 1$ vector, so the product of the two is a $1 \times 1$ matrix (thus a scalar). Then, we can take the first derivative of this objective function in matrix form. First, we simplify the expression:

$$u'u = (Y - X\beta)'(Y - X\beta) = Y'Y - \beta'X'Y - Y'X\beta + \beta'X'X\beta = Y'Y - 2\beta'X'Y + \beta'X'X\beta$$

(The two middle terms combine because $\beta'X'Y$ is a scalar and hence equals its own transpose, $Y'X\beta$.)
Then, by taking the first derivative with respect to $\beta$, we have:

$$\frac{\partial(u'u)}{\partial \beta} = -2X'Y + 2X'X\beta$$
From the first-order condition (F.O.C.), we have

$$-2X'Y + 2X'X\hat{\beta} = 0$$

$$X'X\hat{\beta} = X'Y$$

Notice that I have replaced $\beta$ with $\hat{\beta}$ because $\hat{\beta}$ satisfies the F.O.C. by definition. Multiplying both sides by the inverse matrix $(X'X)^{-1}$, we have:

$$\hat{\beta} = (X'X)^{-1}X'Y \quad (1)$$

This is the least squares estimator for the multivariate linear regression model in matrix form. We call it the Ordinary Least Squares (OLS) estimator.
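Equation (1) can be checked numerically. The following is a minimal Python sketch (standard library only, with made-up toy data; the helper functions `transpose`, `matmul`, and `solve` are ad hoc, not from any library) that forms $X'X$ and $X'Y$ and then solves the normal equations $X'X\hat{\beta} = X'Y$. Solving the linear system rather than explicitly inverting $X'X$ is the numerically preferred route to the same $\hat{\beta} = (X'X)^{-1}X'Y$.

```python
# Minimal OLS sketch: beta_hat solves the normal equations X'X beta = X'Y.
# All data below are illustrative toy values.

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i][0]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][c] * z[c] for c in range(r + 1, n))) / M[r][r]
    return [[v] for v in z]

# X: n x (k+1) with a column of ones; Y: n x 1 (toy data, k = 2, n = 5)
X = [[1.0, 2.0, 1.0],
     [1.0, 4.0, 3.0],
     [1.0, 6.0, 2.0],
     [1.0, 8.0, 5.0],
     [1.0, 10.0, 4.0]]
Y = [[3.0], [7.0], [9.0], [14.0], [16.0]]

Xt = transpose(X)
XtX = matmul(Xt, X)          # (k+1) x (k+1)
XtY = matmul(Xt, Y)          # (k+1) x 1
beta_hat = solve(XtX, XtY)   # same result as (X'X)^{-1} X'Y
print([round(b[0], 6) for b in beta_hat])
```

By the F.O.C., the fitted residuals must satisfy $X'(Y - X\hat{\beta}) = 0$, which is a handy correctness check for any OLS implementation.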
Note that the first-order conditions (4-2) can be written in matrix form as

$$X'(Y - X\hat{\beta}) = 0$$

$$
\begin{bmatrix}
1 & 1 & \cdots & 1 \\
x_{11} & x_{21} & \cdots & x_{n1} \\
\vdots & \vdots & & \vdots \\
x_{1k} & x_{2k} & \cdots & x_{nk}
\end{bmatrix}
\left(
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
-
\begin{bmatrix}
1 & x_{11} & \cdots & x_{1k} \\
1 & x_{21} & \cdots & x_{2k} \\
\vdots & \vdots & & \vdots \\
1 & x_{n1} & \cdots & x_{nk}
\end{bmatrix}
\begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \\ \vdots \\ \hat{\beta}_k \end{bmatrix}
\right)
= 0
$$

$$
\begin{bmatrix}
1 & 1 & \cdots & 1 \\
x_{11} & x_{21} & \cdots & x_{n1} \\
\vdots & \vdots & & \vdots \\
x_{1k} & x_{2k} & \cdots & x_{nk}
\end{bmatrix}
\begin{bmatrix}
y_1 - \hat{\beta}_0 - \hat{\beta}_1 x_{11} - \cdots - \hat{\beta}_k x_{1k} \\
y_2 - \hat{\beta}_0 - \hat{\beta}_1 x_{21} - \cdots - \hat{\beta}_k x_{2k} \\
\vdots \\
y_n - \hat{\beta}_0 - \hat{\beta}_1 x_{n1} - \cdots - \hat{\beta}_k x_{nk}
\end{bmatrix}
= 0
$$

where the first matrix is $(k+1) \times n$ and the residual vector is $n \times 1$.
This is the same as the first-order conditions, $k+1$ conditions in total, that we derived in the previous lecture note (on the simple regression model):

$$\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \cdots - \hat{\beta}_k x_{ik}\right) = 0$$

$$\sum_{i=1}^{n} x_{i1}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \cdots - \hat{\beta}_k x_{ik}\right) = 0$$

$$\vdots$$

$$\sum_{i=1}^{n} x_{ik}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \cdots - \hat{\beta}_k x_{ik}\right) = 0$$
Example 4-1: A bivariate linear regression (k=1) in matrix form

As an example, let's consider a bivariate model in matrix form. A bivariate model is

$$y_i = \beta_0 + \beta_1 x_{i1} + u_i \quad \text{for } i = 1, \dots, n.$$

In matrix form, this is $Y = X\beta + u$:

$$
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}
+
\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}
$$
From (1), we have

$$\hat{\beta} = (X'X)^{-1}X'Y \quad (2)$$

Let's consider each component in (2).

$$
X'X = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}
= \begin{bmatrix} n & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 \end{bmatrix}
= \begin{bmatrix} n & n\bar{x} \\ n\bar{x} & \sum_{i=1}^{n} x_i^2 \end{bmatrix}
$$

This is a $2 \times 2$ square matrix. Thus, the inverse matrix of $X'X$ is

$$
(X'X)^{-1} = \frac{1}{n\sum_{i=1}^{n} x_i^2 - (n\bar{x})^2}
\begin{bmatrix} \sum_{i=1}^{n} x_i^2 & -n\bar{x} \\ -n\bar{x} & n \end{bmatrix}
= \frac{1}{n\sum_{i=1}^{n}(x_i - \bar{x})^2}
\begin{bmatrix} \sum_{i=1}^{n} x_i^2 & -n\bar{x} \\ -n\bar{x} & n \end{bmatrix}
$$

since $n\sum_{i=1}^{n} x_i^2 - (n\bar{x})^2 = n\sum_{i=1}^{n}(x_i - \bar{x})^2$.
The second term is

$$
X'Y = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
= \begin{bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \end{bmatrix}
= \begin{bmatrix} n\bar{y} \\ \sum_{i=1}^{n} x_i y_i \end{bmatrix}
$$
Thus the OLS estimators are:

$$
\hat{\beta} = (X'X)^{-1}X'Y
= \frac{1}{n\sum_{i=1}^{n}(x_i-\bar{x})^2}
\begin{bmatrix} \sum_{i=1}^{n} x_i^2 & -n\bar{x} \\ -n\bar{x} & n \end{bmatrix}
\begin{bmatrix} n\bar{y} \\ \sum_{i=1}^{n} x_i y_i \end{bmatrix}
$$

$$
= \frac{1}{n\sum_{i=1}^{n}(x_i-\bar{x})^2}
\begin{bmatrix} n\bar{y}\sum_{i=1}^{n} x_i^2 - n\bar{x}\sum_{i=1}^{n} x_i y_i \\[4pt]
-n\bar{x}\,n\bar{y} + n\sum_{i=1}^{n} x_i y_i \end{bmatrix}
= \frac{1}{\sum_{i=1}^{n}(x_i-\bar{x})^2}
\begin{bmatrix} \bar{y}\sum_{i=1}^{n} x_i^2 - \bar{x}\sum_{i=1}^{n} x_i y_i \\[4pt]
\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} \end{bmatrix}
$$

Adding and subtracting $n\bar{y}\bar{x}^2$ in the first element, and noting that $\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} = \sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})$, we have

$$
= \frac{1}{\sum_{i=1}^{n}(x_i-\bar{x})^2}
\begin{bmatrix} \bar{y}\left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right) - \bar{x}\left(\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}\right) \\[4pt]
\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}) \end{bmatrix}
=
\begin{bmatrix} \bar{y} - \hat{\beta}_1\bar{x} \\[4pt]
\dfrac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2} \end{bmatrix}
=
\begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \end{bmatrix}
$$

since $\sum_{i=1}^{n} x_i^2 - n\bar{x}^2 = \sum_{i=1}^{n}(x_i-\bar{x})^2$.

This is what you studied in the previous lecture note.

End of Example 4-1
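As a quick numerical sanity check of Example 4-1 (with made-up data), the following Python sketch computes $\hat{\beta}_0$ and $\hat{\beta}_1$ twice: once from the familiar bivariate formulas, and once from the $2 \times 2$ matrix expression $(X'X)^{-1}X'Y$ written out with the adjugate inverse. The two must agree.

```python
# Check that the matrix formula (X'X)^{-1} X'Y reproduces the bivariate
# formulas beta1 = sum((xi - xbar)(yi - ybar)) / sum((xi - xbar)^2) and
# beta0 = ybar - beta1 * xbar. Data are illustrative.

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n

# Closed-form bivariate OLS
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

# Matrix OLS: (X'X)^{-1} = (1/det) [[sum x^2, -n*xbar], [-n*xbar, n]],
# X'Y = [n*ybar, sum x*y]'
sxx = sum(xi ** 2 for xi in x)
sxy = sum(xi * yi for xi, yi in zip(x, y))
det = n * sxx - (n * xbar) ** 2          # = n * sum((xi - xbar)^2)
beta0 = (sxx * (n * ybar) - n * xbar * sxy) / det
beta1 = (-n * xbar * (n * ybar) + n * sxy) / det

print(b0, b1, beta0, beta1)
```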
Unbiasedness of OLS

In this sub-section, we show the unbiasedness of OLS under the following assumptions.

Assumptions:
E1 (Linear in parameters): $Y = X\beta + u$
E2 (Zero conditional mean): $E(u \mid X) = 0$
E3 (No perfect collinearity): $X$ has full column rank $k+1$, so $X'X$ is invertible.

From (2), we know the OLS estimator is $\hat{\beta} = (X'X)^{-1}X'Y$.

We can replace $Y$ with the population model (E1):

$$\hat{\beta} = (X'X)^{-1}X'(X\beta + u) = (X'X)^{-1}X'X\beta + (X'X)^{-1}X'u = \beta + (X'X)^{-1}X'u$$

By taking the expectation of both sides of the equation, conditional on $X$, we have:

$$E(\hat{\beta} \mid X) = \beta + (X'X)^{-1}X'E(u \mid X)$$

From E2, we have $E(u \mid X) = 0$. Thus,

$$E(\hat{\beta}) = \beta$$

Under the assumptions E1–E3, the OLS estimators are unbiased.
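Unbiasedness is a statement about the sampling distribution of $\hat{\beta}$, so it can be illustrated by simulation. A small Monte Carlo sketch (standard library only; the true parameters, regressor values, and error distribution below are illustrative choices, not from the lecture) draws many samples with $E(u \mid X) = 0$ and averages the resulting slope estimates:

```python
# Monte Carlo sketch of unbiasedness: with fixed X and mean-zero errors,
# the average of beta1_hat over many samples should be close to the true
# beta1. All numbers here are illustrative choices.
import random

random.seed(0)

beta0, beta1 = 1.0, 2.0                # "true" parameters
x = [0.5 * i for i in range(1, 21)]    # fixed regressor, n = 20
n = len(x)
xbar = sum(x) / n
sxx = sum((xi - xbar) ** 2 for xi in x)

reps = 2000
b1_draws = []
for _ in range(reps):
    y = [beta0 + beta1 * xi + random.gauss(0.0, 1.0) for xi in x]
    ybar = sum(y) / n
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b1_draws.append(b1)

b1_mean = sum(b1_draws) / reps
print(b1_mean)   # should be close to 2.0
```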
The Variance of OLS Estimators

Next, we consider the variance of the estimators.

Assumption:
E4 (Homoskedasticity): $Var(u_i \mid X) = \sigma^2$ and $Cov(u_i, u_j \mid X) = 0$ for $i \neq j$; thus $Var(u \mid X) = \sigma^2 I$.

Because of this assumption, we have

$$
E(uu') = E\left(\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}\begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix}\right)
= \begin{bmatrix}
E(u_1u_1) & E(u_1u_2) & \cdots & E(u_1u_n) \\
E(u_2u_1) & E(u_2u_2) & \cdots & E(u_2u_n) \\
\vdots & \vdots & & \vdots \\
E(u_nu_1) & E(u_nu_2) & \cdots & E(u_nu_n)
\end{bmatrix}
= \begin{bmatrix}
\sigma^2 & 0 & \cdots & 0 \\
0 & \sigma^2 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \sigma^2
\end{bmatrix}
= \sigma^2 I
$$

where $u$ is $n \times 1$, $u'$ is $1 \times n$, and each of the resulting matrices is $n \times n$. Therefore,

$$
Var(\hat{\beta}) = Var[\beta + (X'X)^{-1}X'u] = Var[(X'X)^{-1}X'u] = E[(X'X)^{-1}X'uu'X(X'X)^{-1}]
$$

$$
= (X'X)^{-1}X'E(uu')X(X'X)^{-1} = (X'X)^{-1}X'\sigma^2 I X(X'X)^{-1} \qquad \text{(E4: Homoskedasticity)}
$$

$$Var(\hat{\beta}) = \sigma^2(X'X)^{-1} \quad (3)$$

GAUSS–MARKOV Theorem: Under assumptions E1–E4, $\hat{\beta}$ is the Best Linear Unbiased Estimator (BLUE).
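Equation (3) can also be illustrated by simulation. In the bivariate case, the $(2,2)$ element of $\sigma^2(X'X)^{-1}$ is $\sigma^2 / \sum_{i=1}^{n}(x_i - \bar{x})^2$, the variance of $\hat{\beta}_1$. The sketch below (illustrative numbers, standard library only) compares this theoretical value with the empirical variance of $\hat{\beta}_1$ across simulated samples:

```python
# Compare the theoretical Var(beta1_hat) = sigma^2 / sum((xi - xbar)^2),
# i.e. the (2,2) element of sigma^2 (X'X)^{-1} in the bivariate case,
# with the empirical variance of beta1_hat across simulated samples.
import random

random.seed(1)

sigma = 1.5
x = [float(i) for i in range(1, 31)]   # fixed regressor, n = 30
n = len(x)
xbar = sum(x) / n
sxx = sum((xi - xbar) ** 2 for xi in x)

theoretical_var = sigma ** 2 / sxx     # from Var(beta_hat) = sigma^2 (X'X)^{-1}

reps = 5000
draws = []
for _ in range(reps):
    y = [0.5 + 1.0 * xi + random.gauss(0.0, sigma) for xi in x]
    ybar = sum(y) / n
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    draws.append(b1)

mean_b1 = sum(draws) / reps
empirical_var = sum((b - mean_b1) ** 2 for b in draws) / (reps - 1)
print(theoretical_var, empirical_var)
```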
Example 4-2: Step by Step Regression Estimation by STATA

In this sub-section, I would like to show you how the matrix calculations we have studied are used in econometrics packages. Of course, in practice you do not create matrix programs: econometrics packages already have built-in programs.

The following are matrix calculations in STATA using a data set called NFIncomeUganda.dta. Here we want to estimate the following model:
$$\ln(income_i) = \beta_0 + \beta_1\, female_i + \beta_2\, edu_i + \beta_3\, edusq_i + u_i$$
All the variables are defined in Example 3-1. Descriptive statistics for the variables are shown here:

. su;

    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+--------------------------------------------------------
      female |       648    .2222222    .4160609          0          1
         edu |       648    6.476852    4.198633         -8         19
       edusq |       648    59.55093    63.28897          0        361
-------------+--------------------------------------------------------
   ln_income |       648    12.81736    1.505715   7.600903   16.88356
First, we need to define matrices. In STATA, you can load specific variables (data) into matrices with the mkmat command. Here we create a matrix called y containing the dependent variable, ln_nfincome, and a matrix called x containing the independent variables female, educ, and educsq.

. mkmat ln_nfincome, matrix(y)
. mkmat female educ educsq, matrix(x)

Then, we create some components: $X'X$, $(X'X)^{-1}$, and $X'Y$:
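Since NFIncomeUganda.dta itself is not reproduced here, the following Python sketch mimics the mkmat steps on a few made-up observations: it stacks the variables into matrices and forms the components $X'X$ and $X'Y$. Note that a column of ones has to be included in $X$ so that the model contains the intercept $\beta_0$.

```python
# Rough Python analogue of the mkmat steps: load data columns into matrices
# and form X'X and X'Y. The observations below are made up for illustration,
# since the NFIncomeUganda.dta data set is not reproduced here.

rows = [  # (female, educ, educsq, ln_nfincome) -- illustrative values only
    (0.0, 4.0, 16.0, 12.1),
    (1.0, 7.0, 49.0, 12.9),
    (0.0, 11.0, 121.0, 13.8),
    (1.0, 2.0, 4.0, 11.7),
    (0.0, 9.0, 81.0, 13.2),
]

y = [[r[3]] for r in rows]
x = [[1.0, r[0], r[1], r[2]] for r in rows]   # prepend the constant column

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

XtX = matmul(transpose(x), x)   # 4 x 4, analogous to Stata's x' * x
Xty = matmul(transpose(x), y)   # 4 x 1
print(XtX[0][0])                # the (1,1) element is n, here 5.0
```

From here, $\hat{\beta}$ is obtained exactly as in equation (1), for example by solving $X'X\hat{\beta} = X'Y$.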
8
................
................
In order to avoid copyright disputes, this page is only a partial summary.
To fulfill the demand for quickly locating and searching documents.
It is intelligent file search solution for home and business.
Related download
- extending linear regression weighted least squares
- restricted least squares hypothesis testing and
- regression estimation least squares and maximum likelihood
- lecture 4 multivariate regression model in matrix form
- week 5 simple linear regression princeton
- multiple linear regression mlr handouts
- linear least squares
- chapter 2 linear regression models ols assumptions and
- lecture 2 linear regression a model for the mean
- chapter 1 linear regression with 1 predictor
Related searches
- regression model significance
- regression model significance hypothesis
- regression model statistically significant
- regression model explanation
- simple linear regression model calculator
- regression model coefficient
- multivariate regression models
- multivariate regression interpretation
- simple linear regression model pdf
- multivariate regression results
- regression model calculator
- logistic regression model formula