Linear Algebra
Take the model with L endogenous and K exogenous variables. The jth equation for the ith observation is

y_{i1}\gamma_{1j} + \dots + y_{iL}\gamma_{Lj} + x_{i1}\beta_{1j} + \dots + x_{iK}\beta_{Kj} = \varepsilon_{ij}

We can write it (for j = 1, \dots, L) as

y_i'\gamma_j + x_i'\beta_j = \varepsilon_{ij}

where y_i = (y_{i1}, \dots, y_{iL})' and x_i = (x_{i1}, \dots, x_{iK})' are observation vectors. \Gamma = [\gamma_1 \dots \gamma_L] and B = [\beta_1 \dots \beta_L] are coefficient matrices of orders L x L and K x L.
where there are N observations, i = 1, \dots, N. Combining all observations gives

Y\Gamma + XB = E    (1)

where Y and X are of order N x L and N x K:

Y = [y_1 \dots y_N]',   X = [x_1 \dots x_N]',   E = [\varepsilon_{ij}]   (N x L)
If \Gamma is nonsingular, we can postmultiply (1) by \Gamma^{-1}:

Y = -XB\Gamma^{-1} + E\Gamma^{-1} = X\Pi + V    (2)

This is the reduced form, while (1) is the structural form.
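As a small numerical illustration (the structural coefficient matrices below are made up for the example), the reduced-form coefficients \Pi = -B\Gamma^{-1} can be computed directly:

    import numpy as np

    # hypothetical structural coefficients: L = 2 endogenous, K = 3 exogenous
    Gamma = np.array([[1.0, 0.4],
                      [-0.6, 1.0]])        # L x L, nonsingular
    B = np.array([[0.5, 0.0],
                  [1.2, -0.3],
                  [0.0, 0.8]])             # K x L

    Pi = -B @ np.linalg.inv(Gamma)         # reduced-form coefficients, K x L
    print(Pi.shape)                        # (3, 2)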
Variance Matrix
Let r be a column vector of random variables, r = (r_1, \dots, r_n)'. Its variance (covariance) matrix is

V(r) = E[(r - Er)(r - Er)']
Least Squares
y = X\beta + \varepsilon

b = (X'X)^{-1}X'y    (3)

because (y - Xb)'(y - Xb) is minimized by this choice of b.

b = \beta + (X'X)^{-1}X'\varepsilon

If E(\varepsilon) = 0 and X is nonrandom, then b is unbiased.

var(b) = \sigma^2(X'X)^{-1}

since var(\varepsilon) = \sigma^2 I. Substitution of (3) into e = y - Xb yields the projection matrix M = I - X(X'X)^{-1}X', where M is symmetric and idempotent:

e = y - Xb = My

Since MX = 0, e = M\varepsilon. Also MM = M, so the residual sum of squares is

e'e = \varepsilon'M'M\varepsilon = \varepsilon'M\varepsilon

E(e'e) = \sigma^2 tr(M), which is evaluated with the trace rules below.
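A minimal numpy sketch of these formulas on simulated data (all numbers illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    N, K = 100, 3
    X = rng.normal(size=(N, K))
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=N)

    b = np.linalg.solve(X.T @ X, X.T @ y)                # b = (X'X)^{-1}X'y
    M = np.eye(N) - X @ np.linalg.solve(X.T @ X, X.T)    # projection matrix
    e = M @ y                                            # residuals e = My

    print(np.allclose(M, M.T), np.allclose(M @ M, M))    # symmetric, idempotent
    print(np.allclose(M @ X, 0))                         # MX = 0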
Trace:
The trace of a square matrix is the sum of its diagonal elements.

tr(AB) = tr(BA) if both products exist

tr(A+B) = tr(A) + tr(B) if A and B are of the same order

Example: the expected residual sum of squares from the previous section is

E(e'e) = E[tr(\varepsilon'M\varepsilon)] = E[tr(M\varepsilon\varepsilon')] = tr[M E(\varepsilon\varepsilon')] = \sigma^2 tr(M)

tr(M) = tr(I_N) - tr[X(X'X)^{-1}X'] = N - tr[(X'X)^{-1}X'X] = N - K

so E(e'e) = \sigma^2(N - K), and s^2 = e'e/(N - K) is an unbiased estimator of \sigma^2.
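A quick numerical check of the two trace rules (random illustrative matrices):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.normal(size=(3, 5))
    B = rng.normal(size=(5, 3))
    print(np.isclose(np.trace(A @ B), np.trace(B @ A)))     # tr(AB) = tr(BA)

    N, K = 100, 3
    X = rng.normal(size=(N, K))
    M = np.eye(N) - X @ np.linalg.solve(X.T @ X, X.T)
    print(np.isclose(np.trace(M), N - K))                   # tr(M) = N - K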
Generalized Least Squares
If cov(\varepsilon) = \sigma^2 V rather than \sigma^2 I, where V is nonsingular, then

\hat\beta = (X'V^{-1}X)^{-1}X'V^{-1}y

and

var(\hat\beta) = \sigma^2(X'V^{-1}X)^{-1}

(derived later via Aitken's theorem).
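A sketch of the GLS formula in numpy, assuming V is known (the diagonal, heteroskedastic V below is chosen purely for illustration):

    import numpy as np

    rng = np.random.default_rng(2)
    N, K = 100, 3
    X = rng.normal(size=(N, K))
    V = np.diag(rng.uniform(0.5, 2.0, size=N))       # known, nonsingular
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.multivariate_normal(np.zeros(N), V)

    Vinv = np.linalg.inv(V)
    beta_gls = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
    var_gls = np.linalg.inv(X.T @ Vinv @ X)          # times sigma^2
    print(beta_gls)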
Partitioned matrices:
Take the system in (2) and rewrite it as

Y = [X_1  X_2] [\Pi_1; \Pi_2] + V = X_1\Pi_1 + X_2\Pi_2 + V

Here we partition X by sets of columns and \Pi by the corresponding sets of rows.

Rules (for conformably partitioned matrices):

[A_{11}  A_{12}; A_{21}  A_{22}] + [B_{11}  B_{12}; B_{21}  B_{22}] = [A_{11}+B_{11}  A_{12}+B_{12}; A_{21}+B_{21}  A_{22}+B_{22}]

[A_{11}  A_{12}; A_{21}  A_{22}] [B_{11}  B_{12}; B_{21}  B_{22}] = [A_{11}B_{11}+A_{12}B_{21}  A_{11}B_{12}+A_{12}B_{22}; A_{21}B_{11}+A_{22}B_{21}  A_{21}B_{12}+A_{22}B_{22}]
Inverse of a symmetric partitioned matrix:
[A  B; B'  C]^{-1} = [F   -FBC^{-1};  -C^{-1}B'F   C^{-1} + C^{-1}B'FBC^{-1}]

= [A^{-1} + A^{-1}BEB'A^{-1}   -A^{-1}BE;  -EB'A^{-1}   E]

where F = (A - BC^{-1}B')^{-1} and E = (C - B'A^{-1}B)^{-1}. We use the first form when C is nonsingular and the second when A is nonsingular.
Application: deviation from mean
Let the first column of the regressor matrix be the constant term, X = [\iota  Z], where \iota is the N-vector of ones. Then

X'X = [N   \iota'Z;  Z'\iota   Z'Z]

The bottom-right block of its inverse is, by the second form above (A = N is a nonsingular scalar),

(Z'Z - Z'\iota(\iota'\iota)^{-1}\iota'Z)^{-1} = [(Z - \iota\bar z')'(Z - \iota\bar z')]^{-1}

i.e. the inverse moment matrix of the regressors measured as deviations from their means, \bar z being the vector of column means; multiplied by \sigma^2 it is the covariance matrix of the slope estimators.
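A numerical check of this block of the inverse (simulated regressors, names illustrative):

    import numpy as np

    rng = np.random.default_rng(3)
    N = 50
    Z = rng.normal(size=(N, 2))
    X = np.column_stack([np.ones(N), Z])              # [iota  Z]

    bottom_right = np.linalg.inv(X.T @ X)[1:, 1:]     # block of (X'X)^{-1}
    Zd = Z - Z.mean(axis=0)                           # deviations from means
    print(np.allclose(bottom_right, np.linalg.inv(Zd.T @ Zd)))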
Kronecker Product
A special form of partitioning when all submatrices are scalar multiples of the same matrix.
A \otimes B = [a_{ij}B]

We refer to this as the Kronecker product of A and B. If A is m x n and B is p x q, then A \otimes B is of order mp x nq.

(A \otimes B)(C \otimes D) = AC \otimes BD    (for conformable products)

(A \otimes B)' = A' \otimes B'        (A \otimes B)^{-1} = A^{-1} \otimes B^{-1}
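These properties are easy to verify numerically (random illustrative matrices):

    import numpy as np

    rng = np.random.default_rng(4)
    A, B = rng.normal(size=(2, 2)), rng.normal(size=(3, 3))
    C, D = rng.normal(size=(2, 2)), rng.normal(size=(3, 3))

    print(np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D)))
    print(np.allclose(np.kron(A, B).T, np.kron(A.T, B.T)))
    print(np.allclose(np.linalg.inv(np.kron(A, B)),
                      np.kron(np.linalg.inv(A), np.linalg.inv(B))))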
Application: Joint GLS
y_j = X_j\beta_j + \varepsilon_j,    j = 1, \dots, L

Stacking the L equations gives y = X\beta + \varepsilon with

y = (y_1', \dots, y_L')',   X = diag(X_1, \dots, X_L),   \beta = (\beta_1', \dots, \beta_L')',   \varepsilon = (\varepsilon_1', \dots, \varepsilon_L')'

If we assume the N disturbances of equation j have equal variance and are serially uncorrelated, so that E(\varepsilon_j\varepsilon_j') = \sigma_{jj}I_N, and for j \ne l, E(\varepsilon_j\varepsilon_l') = \sigma_{jl}I_N contains only contemporaneous covariances, then the full covariance matrix is

V(\varepsilon) = \Sigma \otimes I_N,   where \Sigma = [\sigma_{jl}]   (L x L)

Assuming \Sigma is nonsingular, application of GLS results in

\hat\beta = [X'(\Sigma^{-1} \otimes I_N)X]^{-1} X'(\Sigma^{-1} \otimes I_N)y

and

var(\hat\beta) = [X'(\Sigma^{-1} \otimes I_N)X]^{-1}

\hat\beta is superior to equation-by-equation least squares unless the X_j are all identical, X_j = X_0, in which case X = I_L \otimes X_0 and

\hat\beta = [\Sigma^{-1} \otimes X_0'X_0]^{-1}(\Sigma^{-1} \otimes X_0')y

= [I_L \otimes (X_0'X_0)^{-1}X_0']y    (least squares equation by equation)

or unless \Sigma is diagonal, indicating no covariance between the disturbances of different equations.
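A sketch of joint (seemingly-unrelated-regressions) GLS with two equations; \Sigma is treated as known here purely for illustration (in practice it is estimated):

    import numpy as np

    rng = np.random.default_rng(5)
    N = 200
    X1 = np.column_stack([np.ones(N), rng.normal(size=N)])
    X2 = np.column_stack([np.ones(N), rng.normal(size=N)])
    Sigma = np.array([[1.0, 0.7],
                      [0.7, 2.0]])                      # contemporaneous covariances

    eps = rng.multivariate_normal(np.zeros(2), Sigma, size=N)
    y1 = X1 @ np.array([1.0, 2.0]) + eps[:, 0]
    y2 = X2 @ np.array([-1.0, 0.5]) + eps[:, 1]

    # stacked system y = X beta + eps with block-diagonal X
    y = np.concatenate([y1, y2])
    X = np.block([[X1, np.zeros_like(X2)],
                  [np.zeros_like(X1), X2]])
    W = np.kron(np.linalg.inv(Sigma), np.eye(N))        # (Sigma (x) I_N)^{-1}

    beta_joint = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    print(beta_joint)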
Vectorization
Sometimes we need to work with vectors rather than matrices, e.g. when finding the variance of a set of coefficients that are arranged as a matrix B. We therefore vectorize these parameters. Let A = [a_1 \dots a_q] be a p x q matrix, a_i being its i'th column. Then

vec(A) = (a_1', a_2', \dots, a_q')'

which is a pq-element column vector obtained by stacking the columns of A.

vec(A+B) = vec(A) + vec(B)

vec(AB) = (B' \otimes I)vec(A) = (I \otimes A)vec(B)
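A numerical check of the vec identities (the helper vec below is just a column-stacking reshape):

    import numpy as np

    rng = np.random.default_rng(6)
    A = rng.normal(size=(3, 4))
    B = rng.normal(size=(4, 2))

    def vec(M):
        return M.reshape(-1, order="F")                 # stack columns

    print(np.allclose(vec(A @ B), np.kron(B.T, np.eye(3)) @ vec(A)))
    print(np.allclose(vec(A @ B), np.kron(np.eye(2), A) @ vec(B)))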
Definiteness
If the quadratic form x'Ax is positive for any x \ne 0, A is said to be positive definite. The covariance matrix V(r) of any random vector r is always positive semidefinite.

Example: b being BLUE implies that var(\tilde\beta) - var(b) is positive semidefinite for any other linear unbiased estimator \tilde\beta.

Proof: \tilde\beta is linear in y, so \tilde\beta = By. Define C = B - (X'X)^{-1}X' to write \tilde\beta as

\tilde\beta = [C + (X'X)^{-1}X']y

Unbiasedness implies CX = 0.

var(\tilde\beta) = \sigma^2[C + (X'X)^{-1}X'][C + (X'X)^{-1}X']' = \sigma^2 CC' + \sigma^2(X'X)^{-1}    (using CX = 0)

The difference var(\tilde\beta) - var(b) = \sigma^2 CC' is positive semidefinite because x'CC'x is the non-negative squared length of the vector C'x (an inner product). We use definiteness in the evaluation of minima and maxima; e.g. for maximization of utility the Hessian should be negative definite. If a matrix is definite and block diagonal, its principal submatrices are also definite.
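A numerical illustration: build an arbitrary alternative linear unbiased estimator B = C + (X'X)^{-1}X' with CX = 0 and check that the variance difference is positive semidefinite (all matrices simulated):

    import numpy as np

    rng = np.random.default_rng(7)
    N, K = 60, 3
    X = rng.normal(size=(N, K))

    P = np.linalg.solve(X.T @ X, X.T)                   # (X'X)^{-1}X'
    M = np.eye(N) - X @ P
    C = rng.normal(size=(K, N)) @ M                     # any C with CX = 0
    B = C + P                                           # alternative estimator matrix

    diff = B @ B.T - P @ P.T                            # (var(beta~) - var(b))/sigma^2
    print(np.all(np.linalg.eigvalsh(diff) >= -1e-10))   # positive semidefinite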
Diagonalization
For some n x n matrix A, we seek a vector x such that Ax equals \lambda x, where \lambda is a scalar. This is trivially satisfied by x = 0, so we impose the normalization x'x = 1. The condition Ax = \lambda x implies

(A - \lambda I)x = 0    (4)

For a nonzero solution x to exist, A - \lambda I must be singular:

|A - \lambda I| = 0    (characteristic equation)

If A is diagonal, the diagonal elements are the solutions of the characteristic equation. The determinant is a polynomial of degree n in \lambda and has \lambda_1, \dots, \lambda_n as its roots, the latent roots. The product of the \lambda_i equals the determinant of A, and their sum equals the trace of A. The corresponding vectors x are called characteristic vectors (eigenvectors).
If A is symmetric, all latent roots are real. Assume \lambda_1 and \lambda_2 are two different roots with vectors x_1 and x_2.

Premultiply Ax_1 = \lambda_1 x_1 by x_2' and Ax_2 = \lambda_2 x_2 by x_1', then subtract:

x_2'Ax_1 - x_1'Ax_2 = (\lambda_1 - \lambda_2)x_2'x_1

For a symmetric matrix the left side vanishes and \lambda_1 - \lambda_2 is nonzero, implying orthogonality of the characteristic vectors: x_1'x_2 = 0.

Collect the vectors in X = [x_1 \dots x_n], where X'X = I. Then

AX = [\lambda_1 x_1 \dots \lambda_n x_n] = X\Lambda

where \Lambda is diagonal with \lambda_1, \dots, \lambda_n on the diagonal. Premultiply by X':

X'AX = X'X\Lambda = \Lambda

This double multiplication diagonalizes the symmetric matrix A. Also, postmultiplying AX = X\Lambda by X' gives

A = X\Lambda X'
Special cases: if we square A,

A^2 x = A(\lambda x) = \lambda Ax = \lambda^2 x

which shows that the roots of A^2 are the squares of the roots of A, with the same characteristic vectors. For symmetric and nonsingular A, premultiply Ax = \lambda x by \lambda^{-1}A^{-1} to obtain

A^{-1}x = \lambda^{-1}x
All latent roots of a positive definite matrix are positive.
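A short numpy check of the diagonalization and the root properties (random symmetric positive definite matrix for illustration):

    import numpy as np

    rng = np.random.default_rng(8)
    S = rng.normal(size=(4, 4))
    A = S @ S.T + 4 * np.eye(4)                         # symmetric positive definite

    lam, Xv = np.linalg.eigh(A)                         # roots and orthonormal vectors
    print(np.allclose(Xv.T @ A @ Xv, np.diag(lam)))     # X'AX = Lambda
    print(np.isclose(lam.prod(), np.linalg.det(A)))     # product of roots = |A|
    print(np.isclose(lam.sum(), np.trace(A)))           # sum of roots = tr(A)
    print(np.all(lam > 0))                              # positive definite => positive roots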
Aitken’s theorem:
Any symmetric positive definite matrix A can be written as A = QQ', where Q is some nonsingular matrix. For example, consider the covariance matrix var(\varepsilon) = \sigma^2 V. V is positive definite, so its inverse also is, and we can decompose it as V^{-1} = Q'Q. We premultiply both sides of

y = X\beta + \varepsilon

by Q:

Qy = QX\beta + Q\varepsilon,    var(Q\varepsilon) = \sigma^2 QVQ' = \sigma^2 I

Least squares applied to the transformed equation gives

\hat\beta = [(QX)'QX]^{-1}(QX)'Qy = (X'Q'QX)^{-1}X'Q'Qy

= (X'V^{-1}X)^{-1}X'V^{-1}y    (GLS)

var(\hat\beta) = \sigma^2(X'V^{-1}X)^{-1}
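A numerical sketch of this argument: decompose V^{-1} = Q'Q (here via a Cholesky factor), run least squares on the transformed data, and compare with the GLS formula (V assumed known, diagonal only for simplicity):

    import numpy as np

    rng = np.random.default_rng(9)
    N, K = 80, 2
    X = rng.normal(size=(N, K))
    V = np.diag(rng.uniform(0.5, 3.0, size=N))          # assumed known
    y = X @ np.array([1.0, -1.0]) + rng.multivariate_normal(np.zeros(N), V)

    Vinv = np.linalg.inv(V)
    Q = np.linalg.cholesky(Vinv).T                      # Q'Q = V^{-1}
    Xs, ys = Q @ X, Q @ y

    b_transformed = np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)
    b_gls = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
    print(np.allclose(b_transformed, b_gls))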
Cholesky decomposition
Rather than using an orthogonal matrix, X, in the previous diagonalization, it is also possible to use a triangular matrix. For instance, consider a diagonal D and an upper triangular C with units in the diagonal.
C = [1  c_{12}  \dots  c_{1n};  0  1  \dots  c_{2n};  \dots;  0  0  \dots  1],    D = diag(d_1, \dots, d_n)

yielding the product

C'DC

Any symmetric positive definite matrix A can be uniquely written as A = C'DC with all d_i > 0, which is referred to as the Cholesky decomposition. (Setting Q = D^{1/2}C gives A = Q'Q with Q triangular, the form used in Aitken's theorem.)
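A quick check with numpy: np.linalg.cholesky returns a lower-triangular factor L with A = LL', from which the C'DC form can be recovered (random positive definite A for illustration):

    import numpy as np

    rng = np.random.default_rng(10)
    S = rng.normal(size=(4, 4))
    A = S @ S.T + np.eye(4)                             # symmetric positive definite

    Lc = np.linalg.cholesky(A)                          # A = Lc Lc'
    d = np.diag(Lc) ** 2                                # diagonal of D
    C = (Lc / np.diag(Lc)).T                            # upper triangular, unit diagonal
    print(np.allclose(C.T @ np.diag(d) @ C, A))         # A = C'DC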
Simultaneous diagonalization of two matrices
(A - \lambda B)x = 0    (5)

where A and B are symmetric n x n matrices, B being positive definite. Then B^{-1} is also positive definite, so we can write B^{-1} = QQ' with Q nonsingular. Premultiplying (5) by Q' and inserting QQ^{-1},

Q'(A - \lambda B)QQ^{-1}x = (Q'AQ - \lambda I)Q^{-1}x = 0

This shows that (5) can be reduced to (4) when A in (4) is interpreted as Q'AQ and x as Q^{-1}x. If A is symmetric, so is Q'AQ. (5) has n solutions \lambda_1, \dots, \lambda_n, and if they are distinct, the vectors x_1, \dots, x_n are unique (up to scale).

Write the n solutions jointly as

AX = BX\Lambda,    X = [x_1 \dots x_n]

and normalize the vectors so that

X'BX = I

Premultiplication of AX = BX\Lambda by X' gives

X'AX = X'BX\Lambda = \Lambda

Therefore

X'AX = \Lambda   and   X'BX = I

which shows that the two matrices are diagonalized together, A into the latent root matrix and B into the identity matrix.
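A numpy sketch following exactly this construction (random symmetric A, random positive definite B):

    import numpy as np

    rng = np.random.default_rng(11)
    A = rng.normal(size=(4, 4)); A = A + A.T            # symmetric
    S = rng.normal(size=(4, 4))
    B = S @ S.T + np.eye(4)                             # symmetric positive definite

    Q = np.linalg.cholesky(np.linalg.inv(B))            # B^{-1} = QQ'
    lam, Y = np.linalg.eigh(Q.T @ A @ Q)                # (Q'AQ - lambda I)y = 0
    Xg = Q @ Y                                          # x = Qy

    print(np.allclose(Xg.T @ A @ Xg, np.diag(lam)))     # X'AX = Lambda
    print(np.allclose(Xg.T @ B @ Xg, np.eye(4)))        # X'BX = I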
Example: Constrained extremum
Maximize x'Ax subject to x'Bx = 1 with respect to x.

Lagrangian: x'Ax - \lambda(x'Bx - 1)

Setting the derivative with respect to x equal to zero gives 2Ax - 2\lambda Bx = 0, i.e. (A - \lambda B)x = 0,

which shows that \lambda must be a root of (5). Next, premultiplying by x' shows that x'Ax = \lambda x'Bx = \lambda, so the largest root \lambda_1 is the maximum of x'Ax subject to x'Bx = 1.
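A small numerical confirmation that the largest root of (5) attains this constrained maximum (reusing the construction above; all matrices random illustrations):

    import numpy as np

    rng = np.random.default_rng(12)
    A = rng.normal(size=(4, 4)); A = A + A.T
    S = rng.normal(size=(4, 4))
    B = S @ S.T + np.eye(4)

    Q = np.linalg.cholesky(np.linalg.inv(B))
    lam, Y = np.linalg.eigh(Q.T @ A @ Q)
    lam_max, x_max = lam[-1], (Q @ Y)[:, -1]            # largest root and its vector

    print(np.isclose(x_max @ A @ x_max, lam_max))       # the maximum is attained
    # random feasible points (x'Bx = 1) never do better
    z = rng.normal(size=(4, 1000))
    z = z / np.sqrt(np.einsum("ji,jk,ki->i", z, B, z))
    print(np.all(np.einsum("ji,jk,ki->i", z, A, z) <= lam_max + 1e-9))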
Principal components
Consider the n x K observation matrix Z. The objective is to approximate Z by a matrix of unit rank, vc', where v is an n-element vector and c is a K-element coefficient vector. We do this by minimizing the sum of squares of all nK elements of the discrepancy matrix Z - vc', imposing v'v = 1 to be able to solve for c (only the product vc' is identified). The solution, derived below, satisfies

(ZZ' - \lambda I)v = 0,    c = Z'v

So \lambda_1 is taken as the largest latent root of ZZ'. Next, to approximate Z by v_1c_1' + v_2c_2', we again minimize the discrepancies subject to v_2'v_2 = 1 and v_1'v_2 = 0. The solution again satisfies

(ZZ' - \lambda I)v = 0

now at the second largest root and its corresponding vector. This generalizes to i roots.
Derivation for 1 root
Since the sum of squares of the elements of any matrix A equals tr(A'A), the sum of squared discrepancies is

tr[(Z - vc')'(Z - vc')] = tr(Z'Z) - 2c'Z'v + (v'v)(c'c)

= tr(Z'Z) - 2c'Z'v + c'c    (using v'v = 1)

The derivative with respect to c is -2Z'v + 2c, which vanishes at c = Z'v.

Substituting this back in, we get tr(Z'Z) - v'ZZ'v, so to minimize the discrepancy we maximize v'ZZ'v subject to v'v = 1. By the constrained-extremum result above this gives

(ZZ' - \lambda I)v = 0

v'ZZ'v = \lambda v'v = \lambda

This shows that the maximum root \lambda_1 minimizes the discrepancy. The argument extends to i components.

Z'Z need not be diagonal; the observed variables can be correlated. But the principal components are all uncorrelated because v_i'v_j = 0 for i \ne j. Therefore, these components can be viewed as "uncorrelated linear combinations of correlated variables".
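A numpy sketch of the first principal component and the discrepancy it leaves (simulated, deliberately correlated columns):

    import numpy as np

    rng = np.random.default_rng(13)
    n, K = 200, 4
    Z = rng.normal(size=(n, K)) @ rng.normal(size=(K, K))   # correlated columns

    lam, V = np.linalg.eigh(Z @ Z.T)                    # roots of ZZ' (ascending)
    v1 = V[:, -1]                                       # vector of the largest root
    c1 = Z.T @ v1                                       # c = Z'v
    approx = np.outer(v1, c1)                           # rank-one approximation vc'

    # residual sum of squares equals tr(Z'Z) minus the largest root
    print(np.isclose(((Z - approx) ** 2).sum(), np.trace(Z.T @ Z) - lam[-1]))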