2.5 General Matrix Norms.

In section 2.3 we looked at the infinity and one norms and saw how they could be used to estimate the error in the solution of a system of linear equations y = Ax in terms of the errors in the entries of y. In this section we look at general matrix norms, where one may use some other norm for x and y. They can be used in the same way as the infinity and one norms to estimate the error in the solution of a system of linear equations.

Definition 1. Let A be a matrix and consider the mapping y = Ax. Choose a norm || x || for vectors x and choose a norm || Ax || for vectors Ax. Then the norm of A, denoted by || A ||, is the maximum value of the ratio || Ax || / || x || as x varies over all non-zero vectors x, i.e.

(1) || A || = max{ || Ax || / || x || : x ≠ 0 }

It follows from (1) that

(2) || Ax || ≤ || A || || x ||

which is a generalization of (9) and (10) in section 2.3. In particular, || A || is the maximum stretching that A does when applied to vectors x.
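The definition (1) and the bound (2) can be checked numerically. The following is a minimal sketch, assuming the infinity norm is used for both x and Ax, in which case the matrix norm is the largest absolute row sum (the standard formula, presumably the one from section 2.3). The matrix A and the helper names matvec, norm_inf and ratio are made up for illustration.

```python
import random

def matvec(A, x):
    """Return the product Ax for a matrix A given as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def norm_inf(x):
    """Infinity (sup) norm of a vector: the largest absolute entry."""
    return max(abs(xi) for xi in x)

def ratio(A, x):
    """The ratio || Ax || / || x || from (1), using the infinity norm."""
    return norm_inf(matvec(A, x)) / norm_inf(x)

# Hypothetical example matrix.
A = [[1.0, -2.0], [3.0, 4.0]]

# Largest absolute row sum: max(1 + 2, 3 + 4).
row_sum_norm = max(sum(abs(a) for a in row) for row in A)

# Sample many vectors; by (2) no ratio should exceed the matrix norm.
random.seed(0)
samples = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(1000)]
largest_sampled = max(ratio(A, x) for x in samples)

# The vector of signs of the heaviest row, here (1, 1), attains the maximum.
attained = ratio(A, [1.0, 1.0])

print(row_sum_norm, largest_sampled, attained)
```

Random sampling only gives a lower estimate of the maximum in (1); the sign vector shows that the maximum is actually reached.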

Just as when we are dealing with the sup norm and one norm for x and y, (2) turns out to be useful when we want to analyze how errors propagate when we multiply by A. The following proposition generalizes Proposition 2 of section 2.3.

Proposition 1. Let A be an m × n matrix, and let y = Ax and ya = Axa. Then

(3) || y - ya || ≤ || A || || x - xa ||

If m = n and A is invertible then

(4) || x - xa || ≤ || A^-1 || || y - ya ||

Proof. (3) follows from (2) and the fact that y - ya = A(x - xa). (4) follows by applying (3) to A^-1 in place of A, since x = A^-1 y and xa = A^-1 ya. //
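As a small numerical illustration of Proposition 1, the sketch below checks (3) and (4) for one matrix, again assuming the infinity norm and the largest-absolute-row-sum formula for the matrix norm. The matrix, the vectors x and xa, and the helper names are hypothetical; the inverse was computed by hand from the 2 × 2 formula.

```python
def matvec(A, x):
    """Return the product Ax for a matrix A given as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def norm_inf(x):
    """Infinity norm of a vector."""
    return max(abs(xi) for xi in x)

def matrix_norm_inf(A):
    """Matrix norm induced by the infinity norm: largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def sub(u, v):
    """Componentwise difference u - v."""
    return [ui - vi for ui, vi in zip(u, v)]

# Hypothetical data: det A = 5, so A^-1 = (1/5) [[3, -1], [-1, 2]].
A = [[2.0, 1.0], [1.0, 3.0]]
A_inv = [[0.6, -0.2], [-0.2, 0.4]]

x = [1.0, 2.0]
xa = [1.01, 1.98]          # an approximation to x

y = matvec(A, x)
ya = matvec(A, xa)

err_x = norm_inf(sub(x, xa))
err_y = norm_inf(sub(y, ya))

bound_3 = matrix_norm_inf(A) * err_x       # right side of (3)
bound_4 = matrix_norm_inf(A_inv) * err_y   # right side of (4)

print("(3):", err_y, "<=", bound_3)
print("(4):", err_x, "<=", bound_4)
```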

Unfortunately, formula (1) is not convenient for calculating the norm of a given matrix A. Sometimes it is convenient to restrict the vectors x that we are maximizing over in (1) to unit vectors.

Proposition 2. The norm of A is the maximum of || Au || as u varies over unit vectors, i.e.

(5) || A || = max{ || Au || : || u || = 1 }

Proof. Note that in (1) we can write

|| Ax || / || x || = (1 / || x ||) || Ax || = || (1 / || x ||) Ax || = || A (x / || x ||) || = || Au ||

where u = x / || x || and we have used the homogeneity of the norm. Note that u is a unit vector, since

|| x / || x || || = (1 / || x ||) || x || = 1

So, when we maximize over all non-zero vectors x in (1), it is the same as maximizing over all unit vectors in (5). //
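The key step in this proof, that the ratio for x equals || Au || for the unit vector u = x / || x ||, can be seen in a quick computation. This is a minimal sketch assuming the Euclidean norm; the matrix, the vector and the helper names are hypothetical.

```python
import math

def matvec(A, x):
    """Return the product Ax for a matrix A given as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def norm2(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(xi * xi for xi in x))

# Hypothetical example data.
A = [[1.0, 2.0], [0.0, 3.0]]
x = [3.0, -4.0]

# Normalize x to the unit vector u = x / || x ||.
u = [xi / norm2(x) for xi in x]

ratio_x = norm2(matvec(A, x)) / norm2(x)   # ratio appearing in (1)
stretch_u = norm2(matvec(A, u))            # quantity appearing in (5)

# Rescaling x by any non-zero number c leaves the ratio unchanged.
scaled_ratios = []
for c in (0.5, 2.0, -10.0):
    cx = [c * xi for xi in x]
    scaled_ratios.append(norm2(matvec(A, cx)) / norm2(cx))

print(ratio_x, stretch_u, scaled_ratios)
```

All the printed values agree up to rounding, which is why only the direction of x matters in (1).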

Here are some properties of the matrix norms.

Proposition 3. Let || A || be a matrix norm defined by (1). If A and B are matrices and c is a number then

(6) || A + B || ≤ || A || + || B ||

(7) || cA || = | c | || A ||

(8) || A || = 0 ⇔ A = 0

(9) || AB || ≤ || A || || B ||

(10) || I || = 1

(11) || A^-1 || ≥ 1 / || A ||

Proof. To prove (6) we use (5). With every maximum taken over unit vectors x,

|| A + B || = max || (A + B)x || = max || Ax + Bx || ≤ max (|| Ax || + || Bx ||)

≤ max || Ax || + max || Bx || = || A || + || B ||

For (7) one has, again with every maximum taken over unit vectors x,

|| cA || = max || (cA)x || = max || c(Ax) || = max | c | || Ax ||

= | c | max || Ax || = | c | || A ||

To prove (8) note that || A || = 0 and (1) imply || Ax || / || x || = 0 for all x ≠ 0, which implies || Ax || = 0 for all x, which implies Ax = 0 for all x and hence A = 0. Conversely, if A = 0 then Ax = 0 for all x, so || A || = 0.

For (9), note that (2) implies || (AB)x || = || A(Bx) || ≤ || A || || Bx || ≤ || A || || B || || x ||. Therefore || (AB)x || / || x || ≤ || A || || B || for all x ≠ 0, which implies max{ || (AB)x || / || x || : x ≠ 0 } ≤ || A || || B ||, which implies (9).

(10) follows from (1) and the fact that Ix = x for all x. For (11) note that I = AA^-1. Using (9) and (10) we get 1 = || I || ≤ || A || || A^-1 ||, from which (11) follows. //
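The properties (6), (7), (9), (10) and (11) can be spot-checked on a concrete example. The sketch below assumes the one norm for vectors, for which the matrix norm is the largest absolute column sum (the standard formula). The matrices A and B, the number c and the helper names are hypothetical, and a check on one example is of course not a proof.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def norm_one(A):
    """Matrix norm induced by the one norm: largest absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

# Hypothetical example data; A_inv is the hand-computed inverse of A.
A = [[2.0, 1.0], [1.0, 3.0]]
B = [[0.0, -1.0], [4.0, 2.0]]
I = [[1.0, 0.0], [0.0, 1.0]]
A_inv = [[0.6, -0.2], [-0.2, 0.4]]
c = -3.0

A_plus_B = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
cA = [[c * a for a in row] for row in A]

# Each entry records whether the corresponding property holds here.
checks = {
    "(6)": norm_one(A_plus_B) <= norm_one(A) + norm_one(B),
    "(7)": abs(norm_one(cA) - abs(c) * norm_one(A)) < 1e-12,
    "(9)": norm_one(matmul(A, B)) <= norm_one(A) * norm_one(B),
    "(10)": norm_one(I) == 1.0,
    "(11)": norm_one(A_inv) >= 1.0 / norm_one(A),
}

for label, ok in checks.items():
    print(label, ok)
```

For this particular pair, (9) holds with equality, which shows that the bound cannot be improved in general.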

Even the formula (5) is not convenient for finding the norm of a matrix. For given norms || x || and || Ax || it may take some work to find a formula for the corresponding matrix norm. For the Euclidean norm here are two useful formulas.

Proposition 4. Let || x || and || Ax || be the Euclidean norms and || A || be the associated matrix norm given by (1). Then

(12) || A || = max{ | λ | : λ is an eigenvalue of A} if A is symmetric

(13) || A || = max{ sqrt(λ) : λ is an eigenvalue of A^T A} in general
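Both Euclidean-norm formulas in the proposition just stated can be tried on 2 × 2 matrices, where the eigenvalues of a symmetric matrix have a closed form. The sketch below is only an illustration under that restriction: the helper names and the example matrices are hypothetical, and the sampled maximum over unit vectors u = (cos t, sin t) is an approximation from below to the true norm.

```python
import math

def matvec(A, x):
    """Return the product Ax for a matrix A given as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def norm2(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(xi * xi for xi in x))

def gram_2x2(A):
    """Return A^T A for a 2 x 2 matrix A."""
    (a, b), (c, d) = A
    return [[a * a + c * c, a * b + c * d],
            [a * b + c * d, b * b + d * d]]

def sym_eigs_2x2(S):
    """Eigenvalues of a symmetric 2 x 2 matrix [[a, b], [b, d]]."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

def euclid_norm_general(A):
    """Square root of the largest eigenvalue of A^T A."""
    return math.sqrt(max(sym_eigs_2x2(gram_2x2(A))))

def euclid_norm_symmetric(A):
    """Largest absolute eigenvalue; valid only when A is symmetric."""
    return max(abs(lam) for lam in sym_eigs_2x2(A))

def sampled_norm(A, n=3600):
    """Largest || Au || over n equally spaced unit vectors u."""
    return max(norm2(matvec(A, [math.cos(2 * math.pi * k / n),
                                math.sin(2 * math.pi * k / n)]))
               for k in range(n))

A_sym = [[2.0, 1.0], [1.0, 2.0]]   # symmetric, eigenvalues 1 and 3
A_gen = [[1.0, 2.0], [0.0, 3.0]]   # not symmetric

print(euclid_norm_symmetric(A_sym), euclid_norm_general(A_sym), sampled_norm(A_sym))
print(euclid_norm_general(A_gen), sampled_norm(A_gen))
```

For the symmetric matrix the two formulas give the same value, as they must, and in both cases the sampled maximum comes close to the formula without exceeding it.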

Proof. First consider the case where A is symmetric. Then the eigenvalues λ1, …, λn of A are all real and there is an orthonormal basis v1, …, vn of eigenvectors of A. Suppose A vi = λi vi and | λ1 | ≥ … ≥ | λn |. For a given vector x one can write x = α1 v1 + … + αn vn. Since the vi are orthonormal one has || x ||^2 = α1^2 + … + αn^2. One has Ax = α1 λ1 v1 + … + αn λn vn.


