
Matrix Decomposition

Ming Yang Electrical and Computer Engineering

Northwestern University Evanston, IL 60208

mya671@ece.northwestern.edu

Contents

1 Overview
2 Matrix Multiplication and Definitions
  2.1 Matrix Multiplication
  2.2 Special Matrix Definition
3 Singular Value Decomposition
  3.1 SVD decomposition
  3.2 Corollary of SVD
4 LU and Cholesky Decomposition
  4.1 Elementary Operation and Gaussian Transform
  4.2 LU decomposition
  4.3 Cholesky decomposition
5 QR Decomposition
  5.1 Householder Reflections and Givens Rotations
  5.2 Gram-Schmidt orthonormalization
  5.3 QR Decomposition
  5.4 Least Square Fitting
6 Schur Decomposition and Eigenvalue Decomposition
  6.1 Schur Decomposition
  6.2 Eigenvalue Decomposition
  6.3 Hessenberg Decomposition
7 Biconjugate Decomposition
  7.1 Biconjugate Decomposition
  7.2 Summary

1 Overview

"Matrix decomposition refers to the transformation of a given matrix into a given canonical form." [1], when the given matrix is transformed to a right-hand-side product of canonical matrices the process of producing this decomposition is also called "matrix factorization". Matrix decomposition is a fundamental theme in linear algebra and applied statistics which has both scientific and engineering significance. The purposes of matrix decomposition typically involve two aspects: computational convenience and analytic simplicity. In the real world, it is not feasible for most of the matrix computations to be calculated in an optimal explicit way, such as matrix inversion, matrix determinant, solving linear system and least square fitting, thus to convert a difficult matrix computation problem into several easier tasks such as solving triangular or diagonal system will greatly facilitate the calculations. Data matrices representing some numerical observations such as proximity matrix or correlation matrix are often huge and hard to analyze, therefore to decompose the data matrices into some lower-order or lower-rank canonical forms will reveal the inherent characteristic and structure of the matrices and help to interpret their meaning readily.

This tutorial is primarily a summary of important matrix decomposition methods. We first present some basic concepts in Section 2 and then introduce several fundamental decomposition methods in the subsequent sections, e.g. SVD, LU, QR, and eigenvalue decomposition. A unified view of matrix factorization derived from the Wedderburn rank-one reduction theorem is briefly discussed in the summary Section 7.

2 Matrix Multiplication and Definitions

2.1 Matrix Multiplication

Suppose $A \in \mathbb{R}^{m \times r}$ and $B \in \mathbb{R}^{r \times n}$. The matrix multiplication $C = AB$ can be viewed from three different perspectives as follows:

Dot Product Matrix Multiply. Every element $c_{ij}$ of $C$ is the dot product of the row vector $a_i^T$ of $A$ and the column vector $b_j$ of $B$:

$$
A = \begin{pmatrix} a_1^T \\ \vdots \\ a_m^T \end{pmatrix}, \quad
B = (b_1, \ldots, b_n), \quad
C = (c_{ij}), \qquad
a_k \in \mathbb{R}^r, \; b_k \in \mathbb{R}^r, \; c_{ij} = a_i^T b_j.
$$

Column Combination Matrix Multiply. Every column $c_j$ of $C$ is a linear combination of the column vectors $a_k$ of $A$, with the entries $b_{kj}$ of column $b_j$ as the weight coefficients:


$$
A = (a_1, \ldots, a_r), \quad
B = (b_1, \ldots, b_n), \quad
C = (c_1, \ldots, c_n), \qquad
c_j = \sum_{k=1}^{r} b_{kj} a_k,
$$
$$
a_i \in \mathbb{R}^m, \; b_j \in \mathbb{R}^r, \; c_j \in \mathbb{R}^m, \quad j = 1:n.
$$

Outer Product Matrix Multiply. $C$ is the sum of $r$ matrices, each of which is the outer product of a column vector of $A$ and the corresponding row vector of $B$, i.e. a rank-one matrix:

$$
A = (a_1, \ldots, a_r), \quad
B = \begin{pmatrix} b_1^T \\ \vdots \\ b_r^T \end{pmatrix}, \quad
C = \sum_{k=1}^{r} a_k b_k^T, \qquad
a_i \in \mathbb{R}^m, \; b_k \in \mathbb{R}^n.
$$
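As a quick illustration, the following sketch (Python with NumPy; this example is an addition to the text, and the matrix sizes are arbitrary choices) builds $C = AB$ in the three ways described above and checks that they agree:

    import numpy as np

    m, r, n = 4, 3, 5                      # arbitrary sizes for illustration
    rng = np.random.default_rng(0)
    A = rng.standard_normal((m, r))
    B = rng.standard_normal((r, n))

    # 1. Dot-product view: c_ij = a_i^T b_j (row of A times column of B).
    C_dot = np.array([[A[i, :] @ B[:, j] for j in range(n)] for i in range(m)])

    # 2. Column-combination view: column c_j is sum_k b_kj * a_k.
    C_col = np.column_stack([sum(B[k, j] * A[:, k] for k in range(r)) for j in range(n)])

    # 3. Outer-product view: C is the sum of r rank-one matrices a_k b_k^T.
    C_outer = sum(np.outer(A[:, k], B[k, :]) for k in range(r))

    assert np.allclose(C_dot, A @ B)
    assert np.allclose(C_col, A @ B)
    assert np.allclose(C_outer, A @ B)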

2.2 Special Matrix Definition

Before further discussion, we first present definitions of some special matrices; here we follow the terminology in [2].

Definition 1 A real matrix $A$ is a symmetric matrix if it equals its own transpose, that is, $A = A^T$.

Definition 2 A complex matrix $A$ is a Hermitian matrix if it equals its own complex conjugate transpose, that is, $A = A^H$.

Definition 3 A real matrix $Q$ is an orthogonal matrix if the inverse of $Q$ equals the transpose of $Q$, $Q^{-1} = Q^T$, that is, $QQ^T = Q^T Q = I$.

Definition 4 A complex matrix $U$ is a unitary matrix if the inverse of $U$ equals the complex conjugate transpose of $U$, $U^{-1} = U^H$, that is, $UU^H = U^H U = I$.

Definition 5 A matrix $A \in \mathbb{R}^{n \times n}$ is positive definite if $x^T A x > 0$ for all nonzero $x \in \mathbb{R}^n$. Positive definite matrices have positive definite principal submatrices, and all their diagonal entries are positive.

Definition 6 Let $S \subseteq \mathbb{R}^n$ be a subspace with orthonormal basis $V = (v_1, \ldots, v_k)$. Then $P = VV^T \in \mathbb{R}^{n \times n}$ is the orthogonal projection matrix onto $S$, satisfying $\mathrm{range}(P) = S$, $P^2 = P$, and $P^T = P$. $P$ is unique for the subspace $S$.
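As a small numerical illustration of Definition 6 (this sketch is my addition, not part of the original text; the subspace and its dimension are arbitrary), the following code builds an orthonormal basis of a random 2-dimensional subspace of $\mathbb{R}^4$ and verifies the stated properties of $P = VV^T$:

    import numpy as np

    rng = np.random.default_rng(1)
    # Orthonormal basis V of a random 2-dimensional subspace of R^4,
    # obtained as the Q factor of a thin QR factorization.
    V, _ = np.linalg.qr(rng.standard_normal((4, 2)))

    P = V @ V.T                      # orthogonal projector onto span(V)
    assert np.allclose(P @ P, P)     # P^2 = P (idempotent)
    assert np.allclose(P.T, P)       # P^T = P (symmetric)
    assert np.allclose(P @ V, V)     # P acts as the identity on the subspace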

Hermitian and unitary matrices are the complex counterparts of symmetric and orthogonal matrices in $\mathbb{R}$. The following theorems stated over $\mathbb{R}$ can be readily transformed to the corresponding forms over $\mathbb{C}$ by replacing the transpose with the conjugate transpose and orthogonal matrices with unitary matrices. Therefore, for simplicity, we present most of the matrix decomposition results in $\mathbb{R}$.


3 Singular Value Decomposition

Suppose $A \in \mathbb{R}^{m \times n}$. The column vectors of $A$ span a subspace $\mathrm{range}(A)$ of $\mathbb{R}^m$; similarly, $\mathrm{range}(A^T)$ is a subspace of $\mathbb{R}^n$, and the two subspaces have the same dimension, equal to the rank of $A$. The SVD reveals orthonormal bases of $\mathrm{range}(A)$ and $\mathrm{range}(A^T)$ together with the respective scale factors $\sigma_i$ simultaneously.

3.1 SVD decomposition

Theorem 1 (Singular Value Decomposition, SVD) If $A \in \mathbb{R}^{m \times n}$, then there exist orthogonal matrices $U = (u_1, \ldots, u_m) \in \mathbb{R}^{m \times m}$ and $V = (v_1, \ldots, v_n) \in \mathbb{R}^{n \times n}$ and a diagonal matrix $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_p) \in \mathbb{R}^{m \times n}$, $p = \min(m, n)$, such that

$$
A = U \Sigma V^T, \quad \text{where } \sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_p \geq 0.
$$

Proof 1 Let $\sigma_1 = \|A\|_2 = \max_{\|v\|_2 = 1} \|Av\|_2$. Then there exist unit 2-norm vectors $v_1 \in \mathbb{R}^n$ and $u_1 \in \mathbb{R}^m$ such that

$$
Av_1 = \sigma_1 u_1, \qquad u_1 = \frac{Av_1}{\sigma_1}.
$$

Any orthonormal set can be extended to an orthonormal basis of the whole space, so we can find $V_1 \in \mathbb{R}^{n \times (n-1)}$ and $U_1 \in \mathbb{R}^{m \times (m-1)}$ such that $V = (v_1\; V_1) \in \mathbb{R}^{n \times n}$ and $U = (u_1\; U_1) \in \mathbb{R}^{m \times m}$ are orthogonal. Thus

$$
A_1 = U^T A V
= \begin{pmatrix} u_1^T \\ U_1^T \end{pmatrix} (Av_1 \;\; AV_1)
= \begin{pmatrix} u_1^T A v_1 & u_1^T A V_1 \\ U_1^T A v_1 & U_1^T A V_1 \end{pmatrix}
= \begin{pmatrix} \sigma_1 \|u_1\|_2^2 & u_1^T A V_1 \\ \sigma_1 U_1^T u_1 & U_1^T A V_1 \end{pmatrix}
= \begin{pmatrix} \sigma_1 & w^T \\ 0 & B \end{pmatrix},
$$

where $w^T = u_1^T A V_1$ and $B = U_1^T A V_1$. Multiplying $A_1$ by the vector $(\sigma_1\; w^T)^T \in \mathbb{R}^n$ and taking the 2-norm gives

$$
\left\| A_1 \begin{pmatrix} \sigma_1 \\ w \end{pmatrix} \right\|_2^2
= \left\| \begin{pmatrix} \sigma_1^2 + w^T w \\ Bw \end{pmatrix} \right\|_2^2
\geq (\sigma_1^2 + w^T w)^2.
$$

So the 2-norm of the matrix $A_1$ satisfies

$$
\|A_1\|_2 = \sup_{x \neq 0} \frac{\|A_1 x\|_2}{\|x\|_2}
\geq \frac{\left\| A_1 (\sigma_1\; w^T)^T \right\|_2}{\left\| (\sigma_1\; w^T)^T \right\|_2}
\geq \frac{\sigma_1^2 + w^T w}{\sqrt{\sigma_1^2 + w^T w}}
= \sqrt{\sigma_1^2 + w^T w}.
$$

Since $U$ and $V$ are orthogonal, $\|A_1\|_2 = \|A\|_2 = \sigma_1$, which forces $\sigma_1^2 \geq \sigma_1^2 + w^T w$ and hence $w = 0$. An induction argument applied to the submatrix $B$ completes the proof.

The $\sigma_i$ are the singular values of $A$, and the vectors $u_i$ and $v_i$ are the left and right singular vectors, which satisfy

$$
Av_i = \sigma_i u_i \quad \text{and} \quad A^T u_i = \sigma_i v_i.
$$
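As a quick numerical companion to Theorem 1 (an added sketch, not part of the original text; the matrix is an arbitrary random example), the following NumPy snippet computes an SVD and verifies the relations $Av_i = \sigma_i u_i$ and $A^T u_i = \sigma_i v_i$:

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((5, 3))            # arbitrary 5x3 example matrix

    # full_matrices=True returns U (5x5) and V^T (3x3); s holds the p = min(m, n) singular values.
    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    V = Vt.T

    assert np.all(s[:-1] >= s[1:]) and np.all(s >= 0)        # sigma_1 >= ... >= sigma_p >= 0
    for i in range(len(s)):
        assert np.allclose(A @ V[:, i], s[i] * U[:, i])      # A v_i = sigma_i u_i
        assert np.allclose(A.T @ U[:, i], s[i] * V[:, i])    # A^T u_i = sigma_i v_i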


3.2 Corollary of SVD

The SVD reveals many intrinsic properties of a matrix $A$ and is numerically stable in calculations.

Corollary 1 If $A = U \Sigma V^T$ is an SVD of $A$ with $\sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_r > \sigma_{r+1} = \ldots = \sigma_p = 0$, we have the following statements:

1. $\mathrm{rank}(A) = r$.

2. $\mathrm{null}(A) = \mathrm{span}\{v_{r+1}, \ldots, v_n\}$.

3. $\mathrm{range}(A) = \mathrm{span}\{u_1, \ldots, u_r\}$.

4. $A = \sum_{j=1}^{r} \sigma_j u_j v_j^T = U_r \Sigma_r V_r^T$, where $U_r = (u_1, \ldots, u_r)$, $V_r = (v_1, \ldots, v_r)$, and $\Sigma_r = \mathrm{diag}(\sigma_1, \ldots, \sigma_r)$.

5. $\|A\|_F = \sqrt{\sigma_1^2 + \ldots + \sigma_p^2}$.

6. $\|A\|_2 = \sigma_1$.

7. $\sigma_j = \sqrt{\lambda_j(A^T A)}$, $j = 1, \ldots, p$, where $\lambda_j(A^T A)$ is the $j$th largest eigenvalue of $A^T A$.

8. The $v_i$ are orthonormalized eigenvectors of $A^T A$ and the $u_i$ are orthonormalized eigenvectors of $AA^T$.
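The following short numerical check (my addition; the rank-2 test matrix is an arbitrary example) illustrates statements 1, 4, 5, and 6 of Corollary 1 with NumPy:

    import numpy as np

    rng = np.random.default_rng(3)
    # Build a 6x4 matrix of rank 2 by multiplying two rank-2 factors.
    A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))

    U, s, Vt = np.linalg.svd(A)
    r = int(np.sum(s > 1e-10))                   # numerical rank (statement 1)
    assert r == np.linalg.matrix_rank(A) == 2

    # Statement 4: A equals the sum of its r leading rank-one terms U_r Sigma_r V_r^T.
    A_r = (U[:, :r] * s[:r]) @ Vt[:r, :]
    assert np.allclose(A, A_r)

    # Statements 5 and 6: Frobenius and spectral norms from the singular values.
    assert np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2)))
    assert np.isclose(np.linalg.norm(A, 2), s[0])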

The SVD has been generalized to simultaneously diagonalize two matrices [3] and to decompositions of a matrix that employ different metrics in the normalizations [4].

4 LU and Cholesky Decomposition

The solution of the linear system $Ax = b$ is a basic problem in linear algebra. Theoretically, when $A$ is a non-singular square matrix there exists a unique solution $x = A^{-1}b$; however, the inverse of a matrix is typically not easy to compute. We therefore hope to transform $A$ into triangular systems, which are much easier to solve by forward or backward substitution; this process is referred to as Gaussian elimination [5]. It can be summarized in matrix form as the LU decomposition, together with a series of variants when the matrix $A$ has extra properties.
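As a brief illustration of this idea (an added sketch, not from the original text; the system is an arbitrary example), the following snippet uses SciPy's LU factorization to solve $Ax = b$ with one forward and one backward triangular substitution instead of forming $A^{-1}$:

    import numpy as np
    from scipy.linalg import lu, lu_factor, lu_solve, solve_triangular

    rng = np.random.default_rng(4)
    A = rng.standard_normal((4, 4))
    b = rng.standard_normal(4)

    # Factor A once; lu_factor stores L and U compactly together with the pivot indices.
    lu_piv = lu_factor(A)
    x = lu_solve(lu_piv, b)          # forward substitution with L, then backward with U
    assert np.allclose(A @ x, b)

    # The same two-step idea written out explicitly with the separate factors A = P L U:
    P, L, U = lu(A)
    y = solve_triangular(L, P.T @ b, lower=True)     # forward substitution: L y = P^T b
    x2 = solve_triangular(U, y, lower=False)         # backward substitution: U x = y
    assert np.allclose(x, x2)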

4.1 Elementary Operation and Gaussian Transform

For a square matrix $A$, the following three operations are referred to as elementary row (column) operations of type 1, 2, and 3, respectively (a small numerical sketch follows the list):

1. interchanging two rows (or columns) in A.

2. multiplying all elements of a row (or column) of A by some nonzero number.

3. adding to any row (or column) of A any other row (or column) of A multiplied by a nonzero number.
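Each elementary row operation can be carried out by left-multiplying $A$ with a suitable elementary matrix, which is how Gaussian elimination is expressed in matrix form. The sketch below (my addition; the 3x3 matrix and the chosen rows and multipliers are illustrative) shows one elementary matrix of each type acting on an example:

    import numpy as np

    A = np.array([[2., 1., 1.],
                  [4., 3., 3.],
                  [8., 7., 9.]])
    I = np.eye(3)

    # Type 1: interchange rows 0 and 1 (permutation matrix).
    E1 = I[[1, 0, 2], :]
    assert np.allclose(E1 @ A, A[[1, 0, 2], :])

    # Type 2: multiply row 2 by the nonzero number 5 (scaling matrix).
    E2 = I.copy(); E2[2, 2] = 5.0
    assert np.allclose(E2 @ A, np.vstack([A[:2], 5.0 * A[2:]]))

    # Type 3: add (-2) times row 0 to row 1, the step that zeroes out A[1, 0]
    # in Gaussian elimination.
    E3 = I.copy(); E3[1, 0] = -2.0
    assert np.isclose((E3 @ A)[1, 0], 0.0)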
