


Computational Methods in Applied Sciences I

University of Wyoming MA 5310

Fall, 2008

Professor Craig C. Douglas



Course Description: First semester of a three-semester computational methods series. Review of iterative solutions of linear and nonlinear systems of equations, polynomial interpolation/approximation, numerical integration and differentiation, and basic ideas of Monte Carlo methods. Comparison of numerical techniques for programming time and space requirements, as well as convergence and stability.

Prerequisites: Math 3310 and COSC 1010. Identical to COSC 5310, CHE 5140, ME 5140, and CE 5140. (3 hours).

Textbook: George Em Karniadakis and Robert M. Kirby II, Parallel Scientific Computing in C++ and MPI: A Seamless Approach to Parallel Algorithms and Their Implementation, Cambridge University Press, 2003 (with a CD-ROM of software).

Outline

1. Errors

2. Parallel computing basics

3. Solution of linear systems of equations

a. Matrix algebra review

b. Gaussian elimination and factorization methods

c. Iterative methods:

i. Splitting/Relaxation methods: Sxi+1 = Txi + b, x0 given

ii. Krylov space methods

d. Sparse matrix methods

4. Nonlinear equations

5. Interpolation and approximation

a. Given {f(x0), f(x1), …, f(xN+1)}, what is f(x), x0 ≤ x ≤ xN+1 and x ≠ xi?

[pic] (2).

Finally, we prove (3). Let [pic]. So,

[pic].

From the definitions of [pic],

[pic].

L nonsingular completes the proof of (3). QED

Examples:

[pic]

and

[pic].

The correct way to solve Ax=f is to compute L and U first, then solve

[pic]
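To make the two triangular solves concrete, here is a minimal C++ sketch (in the spirit of the course's C++ textbook, not code from these notes) of forward substitution for Ly=f followed by back substitution for Ux=y; the 2x2 factors and right-hand side are made-up test data.

// Hedged sketch: solve L y = f (forward substitution), then U x = y (back
// substitution), for dense triangular factors stored as full matrices.
#include <vector>
#include <cstdio>

using Matrix = std::vector<std::vector<double>>;

std::vector<double> forward_solve(const Matrix& L, const std::vector<double>& f) {
    int n = (int)f.size();
    std::vector<double> y(n);
    for (int i = 0; i < n; ++i) {
        double s = f[i];
        for (int j = 0; j < i; ++j) s -= L[i][j] * y[j];
        y[i] = s / L[i][i];
    }
    return y;
}

std::vector<double> back_solve(const Matrix& U, const std::vector<double>& y) {
    int n = (int)y.size();
    std::vector<double> x(n);
    for (int i = n - 1; i >= 0; --i) {
        double s = y[i];
        for (int j = i + 1; j < n; ++j) s -= U[i][j] * x[j];
        x[i] = s / U[i][i];
    }
    return x;
}

int main() {
    // A = LU with L unit lower triangular, U upper triangular (a 2x2 toy case).
    Matrix L = {{1, 0}, {0.5, 1}};
    Matrix U = {{4, 2}, {0, 3}};
    std::vector<double> f = {6, 9};
    auto y = forward_solve(L, f);
    auto x = back_solve(U, y);
    std::printf("x = (%g, %g)\n", x[0], x[1]);   // solves (LU)x = f
}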

Generalized Gaussian elimination

1. Order of elimination arbitrary.

2. Set [pic].

3. Select an arbitrary [pic] as the first pivot element. We can eliminate [pic] from all but the i1-st equation. The multipliers are [pic].

4. The reduced system is now [pic].

5. Select another pivot [pic] and repeat the elimination.

6. If [pic], then the remaining equations are degenerate and we halt.

Theorem 3.2: Let A have rank r. Then we can find a sequence of distinct row and column indices (i1,j1), (i2,j2), …, (ir,jr) such that corresponding pivot elements in A(1), A(2), …, A(r) are nonzero and [pic]. Define permutation matrices (whose columns are unit vectors)

[pic] and [pic],

where {ik} and {jk} are permutations of {1,2,…,n}. Then

By=g

(where [pic]) is equivalent to Ax=f and can be reduced to triangular form by Gaussian elimination with the natural ordering.

Proof: Generalized Gaussian elimination alters [pic] by forming linear combinations of the rows. Thus, whenever no nonzero pivot can be found, the remaining rows were linearly dependent on the preceding rows. Permutations P and Q rearrange equations and unknowns such that [pic]. By the first half of the theorem, the reduced B(r) is triangular since all rows r+1, …, n vanish. QED

Operation Counts

• To compute [pic]: (n−k+1)² + (n−k+1) (do quotients only once)

• To compute [pic]: (n-k+1)

• Recall that [pic] and [pic]. Hence, there are [pic] multiplies to triangularize A and [pic] multiplies to modify f.

• Using the Ly=f and Ux=y approach, computing xi requires (n-i) multiplies plus 1 divide. Hence, only [pic] multiplies are required to solve the triangular systems.

Lemma: [pic] operations are required to solve m systems [pic], j=1, …, m by Gaussian elimination.

Note: To compute A⁻¹ requires n³ operations. In general, n² operations are required to compute [pic]. Thus, to solve m systems requires mn² operations. Hence, n³+mn² operations are necessary to solve m systems.

Thus, it is always more efficient to use Gaussian elimination instead of computing the inverse!

We can always compute A⁻¹ by solving Axi=ei, i=1,2,…,n and then the xi’s are the columns of A⁻¹.

Theorem 3.3: If A is nonsingular, ∃P such that PA=LU is possible and P is only a permutation of the rows. In fact, P may be found such that [pic] for i>k, k=1,2,…,n-1.

Theorem 3.4: Suppose A is symmetric. If A=LU is possible, then the choice of lkk=ukk implies lik=uki. Hence, U=LT.

Variants of Gaussian elimination

LDU factorization: L and U are strictly lower and upper triangular and D is diagonal.

Cholesky: A=AT, so factor A=LLT.

Fun example: [pic] is symmetric, but cannot be factored into LU form.

Definition: A is positive definite if [pic].

Theorem 3.5 (Cholesky Method): Let A be symmetric, positive definite. Then A can be factored in the form A=LLT.

Operation counts:

• To find L and g=L⁻¹f is [pic].

• To find U is [pic] operations.

• Total is [pic] operations.

Parallel LU Decomposition

There are 6 convenient ways of writing the factorization step of the n×n A in LU decomposition. The two most common are as follows:

|kij loop: A by row (daxpy) |kji loop: A by column (daxpy) |

|for k = 1, n − 1 |for k = 1, n − 1 |

| for i = k + 1, n | for p = k + 1, n |

| lik = aik /akk | lpk = apk /akk |

| for j = k + 1, n | endfor |

| aij = aij − likakj | for j = k + 1, n |

| endfor | for i = k + 1, n |

| endfor | aij = aij − likakj |

|endfor | endfor |

| | endfor |

| |endfor |

It is frequently convenient to store A by rows in the computer.

Suppose there are n processors Pi, with one row of A stored on each Pi. Using the kji access method, the factorization algorithm is

for i = 1, n-1

Send aii to processors Pk, k=i+1,…, n

In parallel on each processor Pk, k=i+1,…, n, do the daxpy update to row

k

endfor

Note that in step i, after Pi sends aii to the other processors, the first i processors are idle for the rest of the calculation. This is highly inefficient if this is the only thing the parallel computer is doing.

A column oriented version is very similar.

We can overlap communication with computing to hide some of the expenses of communication. This still does not address the processor dropout issue. We can do a lot better yet.

Improvements to aid parallel efficiency:

1. Store multiple rows (columns) on a processor. This assumes that there are p processors and that [pic]. While helpful to have mod(n,p)=0, it is unnecessary (it just complicates the implementation slightly).

2. Store multiple blocks of rows (columns) on a processor.

3. Store either 1 or 2 using a cyclic scheme (e.g., store rows 1 and 3 on P1 and rows 2 and 4 on P2 when p=2 and n=4).

Improvement 3, while extremely nasty to program (it has already been done as part of ScaLAPACK, so you do not have to reinvent the wheel if you choose not to), leads to the best use of all of the processors. No processor drops out. Figuring out how to get the right part of A to the right processors is lots of fun, too.
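As a rough illustration of improvement 3, here is a hedged MPI/C++ sketch (not ScaLAPACK, and far less careful) of LU factorization with rows distributed cyclically so that no processor drops out; the matrix size n, the diagonally dominant test matrix, and the absence of pivoting are all simplifying assumptions.

// Hedged sketch: cyclic row distribution (row i lives on processor i % p),
// pivot row broadcast at each step, daxpy updates of locally owned rows.
#include <mpi.h>
#include <vector>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    const int n = 8;                                 // small demo size (assumption)
    std::vector<std::vector<double>> myrows;         // locally owned rows of A
    std::vector<int> myidx;                          // their global row indices
    for (int i = rank; i < n; i += p) {              // cyclic map
        std::vector<double> row(n, 1.0);
        row[i] = n + 1.0;                            // made-up diagonally dominant matrix
        myrows.push_back(row);
        myidx.push_back(i);
    }

    std::vector<double> pivot(n);
    for (int k = 0; k < n - 1; ++k) {
        int owner = k % p;
        if (rank == owner)                           // owner loads its (already updated) row k
            for (std::size_t r = 0; r < myidx.size(); ++r)
                if (myidx[r] == k) pivot = myrows[r];
        MPI_Bcast(pivot.data(), n, MPI_DOUBLE, owner, MPI_COMM_WORLD);
        for (std::size_t r = 0; r < myidx.size(); ++r) {   // daxpy update of owned rows below k
            int i = myidx[r];
            if (i <= k) continue;
            double lik = myrows[r][k] / pivot[k];
            myrows[r][k] = lik;                      // keep the multiplier (the L part) in place
            for (int j = k + 1; j < n; ++j)
                myrows[r][j] -= lik * pivot[j];
        }
    }
    if (rank == 0) std::printf("cyclic LU factorization done on %d processors\n", p);
    MPI_Finalize();
    return 0;
}

With the cyclic map, every processor keeps at least one not-yet-eliminated row until very near the end of the factorization, which is exactly the point of improvement 3.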

Now that we know how to factor A = LU in parallel, we need to know how to do back substitution in parallel. This is a classic divide and conquer algorithm leading to an operation count that cannot be realized on a known computer (why?).

We can write the lower triangular matrix L in block form as

[pic],

where L1 and L2 are also lower triangular. If L is of order 2k, some k>0, then no special cases arise in continuing to factor the Li’s. In fact, we can prove that

[pic],

which is also known as a Schur complement.

Norms

Definition: A vector norm [pic] satisfies for any [pic] and any [pic],

1. [pic] and [pic] if and only if [pic].

2. [pic]

3. [pic]

In particular,

[pic].

[pic].

[pic].

Example: [pic].

Definition: A matrix norm [pic] satisfies for any [pic] and any [pic],

1. [pic]

2. [pic]

3. [pic]

4. [pic]

In particular,

[pic], which is the maximum absolute column sum.

[pic], which is the maximum absolute row sum.

[pic], which is the Euclidean matrix norm.

[pic]

Examples:

1. [pic]

2. Let [pic]. Then [pic], but [pic].

Condition number of a matrix

Definition: cond(A)=[pic].

Facts (compatible norms): [pic].

Theorem 3.6: Suppose we have an approximate solution of Ax=b by some [pic], where [pic] is nonsingular. Then for any compatible matrix and vector norms, [pic].

Proof: (rhs) [pic], where [pic] is the residual. Thus,

[pic]

Since Ax=b,

[pic].

Thus,

[pic].

(lhs) Note that since [pic],

[pic] or [pic].

Further,

[pic].

Combining the two inequalities gives us the lhs. QED

Theorem 3.7: Suppose x and δx satisfy [pic], where δx and δA are perturbations. Let A be nonsingular and δA be so small that [pic]. Then

[pic].

Note: Theorem 3.7 implies that when cond(A) is small, small relative changes in f and A cause only small relative changes in x.

Iterative Improvement

1. Solve Ax=f to an approximation [pic] (all single precision).

2. Calculate [pic] using double the precision of the data.

3. Solve Ae=r to an approximation [pic] (single precision).

4. Set [pic] (single precision [pic]) and repeat steps 2-4 with [pic].

Normally the solution method is a variant of Gaussian elimination.

Note that [pic]. Since we cannot solve Ax=f exactly, we probably cannot solve Ae=r exactly, either.

Fact: If the 1st [pic] has q digits correct, then the 2nd [pic] will have 2q digits correct (assuming that 2q is less than the number of digits representable on your computer) and the nth [pic] will have nq digits correct (under a similar assumption as before).

Parallelization is straightforward: Use a parallel Gaussian elimination code and parallelize the residual calculation based on where the data resides.
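Here is a minimal C++ sketch of steps 1-4, assuming a small dense system, no pivoting, and float standing in for "single precision" with double used only for the residual; it is illustrative, not the course's code.

// Hedged sketch of iterative improvement: factor and solve in single
// precision, accumulate r = f - A*x in double precision, then correct.
#include <cstdio>

int main() {
    const int n = 3;
    double A[n][n] = {{4, 1, 0}, {1, 4, 1}, {0, 1, 4}};
    double f[n]    = {5, 6, 5};                      // exact solution is (1, 1, 1)

    // Single precision copy and naive in-place LU factorization (no pivoting).
    float LU[n][n];
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) LU[i][j] = (float)A[i][j];
    for (int k = 0; k < n - 1; ++k)
        for (int i = k + 1; i < n; ++i) {
            LU[i][k] /= LU[k][k];
            for (int j = k + 1; j < n; ++j) LU[i][j] -= LU[i][k] * LU[k][j];
        }

    // Single precision triangular solves for LU * e = r.
    auto solve = [&](const float* r, float* e) {
        float y[n];
        for (int i = 0; i < n; ++i) {
            y[i] = r[i];
            for (int j = 0; j < i; ++j) y[i] -= LU[i][j] * y[j];
        }
        for (int i = n - 1; i >= 0; --i) {
            e[i] = y[i];
            for (int j = i + 1; j < n; ++j) e[i] -= LU[i][j] * e[j];
            e[i] /= LU[i][i];
        }
    };

    // Step 1: initial single precision solve.
    float rf[n], ef[n];
    double x[n];
    for (int i = 0; i < n; ++i) rf[i] = (float)f[i];
    solve(rf, ef);
    for (int i = 0; i < n; ++i) x[i] = ef[i];

    // Steps 2-4: residual in double precision, correction solve in single.
    for (int it = 0; it < 3; ++it) {
        for (int i = 0; i < n; ++i) {
            double r = f[i];
            for (int j = 0; j < n; ++j) r -= A[i][j] * x[j];
            rf[i] = (float)r;
        }
        solve(rf, ef);
        for (int i = 0; i < n; ++i) x[i] += ef[i];
    }
    std::printf("x = (%.15f, %.15f, %.15f)\n", x[0], x[1], x[2]);
}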

3c. Iterative Methods

3c (i) Splitting or Relaxation Methods

Let A=S−T, where S is nonsingular. Then [pic]. Then the iterative procedure is defined by

[pic]

To be useful requires that

1. [pic] be easy to compute.

2. [pic]

Example: Let A=D-L-U, D diagonal, L and U strictly lower and upper triangular, respectively. Then

a. S=D and T=L+U are both easy to compute, but many iterations are required in practice.

b. S=A and T=0 is hard to compute, but requires only 1 iteration.

Let [pic]. Then

[pic],

which proves the following:

Theorem 3.8: The iterative procedure converges or diverges at the rate of [pic].

Named relaxation (or splitting) methods:

1. [pic] (Jacobi): requires 2 vectors for xk and xk+1, which is somewhat unnatural, but parallelizes trivially and scales well.

2. [pic] (Gauss-Seidel, or Gauß-Seidel in German): requires only 1 vector for xk. The method was unknown to Gauss, but known to Seidel.

3. [pic]:

a. [pic] (Successive Over Relaxation, or SOR)

b. [pic] (Successive Under Relaxation, or SUR)

c. [pic] is just Gauss-Seidel

Example: [pic], [pic], [pic], and [pic], whereas

[pic], [pic], and [pic],

which implies that 1 Gauss-Seidel iteration equals 2 Jacobi iterations.
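For comparison, here is a small C++ sketch of one Jacobi sweep (two vectors) versus one Gauss-Seidel sweep (one vector, updated in place); the 3x3 test matrix and right-hand side are made up.

// Hedged sketch: Jacobi uses only the old iterate, Gauss-Seidel overwrites x.
#include <vector>
#include <cstdio>

using Matrix = std::vector<std::vector<double>>;

void jacobi_sweep(const Matrix& A, const std::vector<double>& b,
                  const std::vector<double>& xold, std::vector<double>& xnew) {
    int n = (int)b.size();
    for (int i = 0; i < n; ++i) {
        double s = b[i];
        for (int j = 0; j < n; ++j)
            if (j != i) s -= A[i][j] * xold[j];
        xnew[i] = s / A[i][i];
    }
}

void gauss_seidel_sweep(const Matrix& A, const std::vector<double>& b,
                        std::vector<double>& x) {
    int n = (int)b.size();
    for (int i = 0; i < n; ++i) {
        double s = b[i];
        for (int j = 0; j < n; ++j)
            if (j != i) s -= A[i][j] * x[j];   // uses already-updated x[j] for j < i
        x[i] = s / A[i][i];
    }
}

int main() {
    Matrix A = {{4, -1, 0}, {-1, 4, -1}, {0, -1, 4}};
    std::vector<double> b = {2, 4, 10};
    std::vector<double> xj(3, 0.0), xj_new(3, 0.0), xgs(3, 0.0);
    for (int k = 0; k < 25; ++k) {
        jacobi_sweep(A, b, xj, xj_new);
        xj = xj_new;
        gauss_seidel_sweep(A, b, xgs);
    }
    std::printf("Jacobi:       %f %f %f\n", xj[0], xj[1], xj[2]);
    std::printf("Gauss-Seidel: %f %f %f\n", xgs[0], xgs[1], xgs[2]);
}

One visible design difference: the Jacobi sweep reads only xold, so its i loop parallelizes trivially, while the Gauss-Seidel sweep reads entries of x it has just written.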

Special Matrix Example

Let [pic] be tridiagonal.

For this matrix, let [pic] and [pic]. The optimal ω is such that

[pic], which is part of Young’s thesis (1950), but correctly proven by Varga later. We can show that [pic] makes the spectral radius as small as possible.

Aside: If ω=1, then [pic] or [pic]. Hence, Gauss-Seidel is twice as fast as Jacobi (in either convergence or divergence).

If [pic].

|Facts: |[pic] |Jacobi |

| |[pic] |Gauss-Seidel |

| |[pic] |SOR with optimal ω |

Example: n=21 and h=1/22. Then [pic] 30 Jacobi iterations equal 1 SOR iteration with the optimal ω.

There are many other splitting methods, including Alternating Direction Implicit (ADI) methods (1950’s) and a cottage industry of splitting methods developed in the U.S.S.R. (1960’s). There are some interesting parallelization methods based on ADI and properties of tridiagonal matrices to make ADI-like methods have similar convergence properties of ADI.

Parallelization of the Iterative Procedure

For Jacobi, parallelization is utterly trivial:

1. Split up the unknowns onto processors.

2. Each processor updates all of its unknowns.

3. Each processor sends its unknowns to processors that need the updated information.

4. Continue iterating until done.

Common fallacies:

• When an element of the solution vector xk has a small enough element-wise residual, stop updating the element. This leads to utterly wrong solutions since the residuals are affected by updates of neighbors after the element stops being updated.

• Keep computing and use the last known update from neighboring processors. This leads to chattering and no element-wise convergence.

• Asynchronous algorithms exist, but eliminate the chattering through extra calculations.

Parallel Gauss-Seidel and SOR are much, much harder. In fact, by and large, they do not exist. Googling leads to an interesting set of papers that approximately parallelize Gauss-Seidel, but only for matrices with very well known structures. Even then, the algorithms are extremely complex.

Parallel Block-Jacobi is commonly used instead as an approximation. The matrix A is divided up into a number of blocks. Each block is assigned to a processor. Inside of each block, Jacobi is performed some number of iterations. Data is exchanged between processors and the iteration continues.

See the book (absolutely shameless plug),

C. C. Douglas, G. Haase, and U. Langer, A Tutorial on Elliptic PDE Solvers and Their Parallelization, SIAM Books, Philadelphia, 2003.

for how to do parallelization of iterative methods for matrices that commonly occur when solving partial differential equations (what else would you ever want to solve anyway???).

3c (ii) Krylov Space Methods

Conjugate Gradients

Let A be symmetric, positive definite, i.e.,

A=AT and [pic].

The conjugate gradient iteration method for the solution of Ax+b=0 is defined as follows with r=r(x)=Ax+b:

|x0 arbitrary |(approximate solution) |

|r0=Ax0+b |(approximate residual) |

|w0=r0 |(search direction) |

For [pic]

|[pic] |[pic] |

|[pic] | |

|[pic] |[pic] |

Lemma CG1: If [pic] and [pic], then [pic] is chosen to minimize [pic] as a function of t.

Proof:

|[pic] |= |[pic] |

| |= |[pic] |

|[pic] |= |[pic] |

|[pic] |= |[pic] |

| |= |[pic] |

| |= |[pic] |

| |= |0 |

since

|[pic] |= |[pic] |

| |= |[pic] |

| |= | [pic] |

| |= |[pic] |

| |= |0 |

Lemma CG2: The parameter [pic] is chosen so that [pic].

Lemma CG3: For [pic],

1. [pic]

2. [pic]

3. [pic]

Lemma CG4: [pic].

Lemma CG5: [pic]

Theorem 3.9 (CG): Let [pic] be symmetric, positive definite. Then the CG iteration converges to the exact solution of Ax+b=0 in not more than N iterations.
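A minimal C++ sketch of the CG loop above for a small SPD system; note that the notes state the method for Ax+b=0 with r=Ax+b, while this sketch uses the equivalent Ax=b convention (flip the sign of b), and the test matrix is made up.

// Hedged sketch of the conjugate gradient iteration.
#include <vector>
#include <cmath>
#include <cstdio>

using Vec = std::vector<double>;
using Matrix = std::vector<Vec>;

Vec matvec(const Matrix& A, const Vec& x) {
    Vec y(x.size(), 0.0);
    for (std::size_t i = 0; i < x.size(); ++i)
        for (std::size_t j = 0; j < x.size(); ++j) y[i] += A[i][j] * x[j];
    return y;
}

double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

int main() {
    Matrix A = {{4, 1, 0}, {1, 3, 1}, {0, 1, 2}};   // symmetric positive definite
    Vec b = {1, 2, 3};
    Vec x(3, 0.0);                                   // x0 arbitrary
    Vec r = b;                                       // r0 = b - A x0
    Vec w = r;                                       // w0 = r0 (search direction)
    double rho = dot(r, r);
    for (int k = 0; k < 3 && std::sqrt(rho) > 1e-14; ++k) {   // at most N iterations
        Vec Aw = matvec(A, w);
        double alpha = rho / dot(w, Aw);             // step length minimizing along w
        for (int i = 0; i < 3; ++i) { x[i] += alpha * w[i]; r[i] -= alpha * Aw[i]; }
        double rho_new = dot(r, r);
        double beta = rho_new / rho;                 // keeps the search directions A-conjugate
        for (int i = 0; i < 3; ++i) w[i] = r[i] + beta * w[i];
        rho = rho_new;
    }
    std::printf("x = (%f, %f, %f)\n", x[0], x[1], x[2]);
}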

Preconditioning

We seek a matrix M (or a set of matrices) to use in solving [pic] such that

• [pic]

• M is easy to use when solving Mx=b.

• M and A have similar properties (e.g., symmetry and definiteness)

Reducing the condition number reduces the number of iterations necessary to achieve an adequate convergence factor.

Theorem 3.9: In finite arithmetic, the preconditioned conjugate gradient method converges at a rate based on the largest and smallest eigenvalues of [pic],

[pic], where [pic].

What are some common preconditioners?

• Identity!!!

• Main diagonal (the easiest to implement in parallel and very hard to beat)

• Jacobi

• Gauss-Seidel

• Tchebyshev

• Incomplete LU, known as ILU (or modified ILU)

Most of these do not work straight out of the box since symmetry may be required. How do we symmetrize Jacobi or a SOR-like iteration?

• Do two iterations: once in the order specified and once in the opposite order. So, if the order is natural, i.e., 1→N, then the opposite is N→1.

• There are a few papers that show how to do two way iterations for less than the cost of two matrix-vector multiplies (which is the effective cost of the solves).

Preconditioned conjugate gradients

|x0 arbitrary |(approximate solution) |

|r0=Ax0+b |(approximate residual) |

|[pic] | |

|[pic] |(search direction) |

followed by for [pic] until [pic] and [pic] for a given ε:

|[pic] |[pic] |

|[pic] | |

|[pic] | |

|[pic] |[pic] |

3d. Sparse Matrix Methods

We want to solve Ax=b, where A is large, sparse, and N×N. By sparse, we mean that A is nearly all zeroes. Consider the tridiagonal matrix [pic]. If N=10,000, then A is sparse, but if N=4 it is not sparse. Typical sparse matrices are not banded (diagonal) matrices. The nonzero pattern may appear to be random at first glance.

There are a small number of common storage schemes so that (almost) no zeroes are stored for A, ideally storing only NZ(A) = number of nonzeroes in A:

• Diagonal (or band)

• Profile

• Row or column (and several variants)

• Any of the above for blocks

The schemes all work in parallel, too, for the local parts of A. Sparse matrices arise in a very large percentage of problems on large parallel computers.

Row storage scheme

• 3 vectors: IA, JA, and AM.

|Length |Description |

|N+1 |IA(j) = index in AM of 1st nonzero in row j |

|NZ(A) |JA(j) = column of jth element in AM |

|NZ(A) |AM(j) = aik, for some row i and k=JA(j) |

Row j is stored in [pic]. The order in the row may be arbitrary or ordered such that [pic] within a row. Sometimes the diagonal entry for a row comes first, then the rest of the row is ordered.

The column storage scheme is defined similarly.
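A hedged C++ sketch of the IA/JA/AM row storage scheme together with the matrix-vector product it supports; indices are 0-based here, unlike the 1-based description above, and the tridiagonal test matrix is made up.

// Hedged sketch of compressed row storage and y = A*x using it.
#include <vector>
#include <cstdio>

struct CSR {
    std::vector<int>    IA;   // IA[j] = index in AM of first nonzero of row j (size N+1)
    std::vector<int>    JA;   // JA[k] = column of the kth stored element      (size NZ)
    std::vector<double> AM;   // AM[k] = the kth stored nonzero                (size NZ)
};

std::vector<double> matvec(const CSR& A, const std::vector<double>& x) {
    int N = (int)A.IA.size() - 1;
    std::vector<double> y(N, 0.0);
    for (int i = 0; i < N; ++i)
        for (int k = A.IA[i]; k < A.IA[i + 1]; ++k)
            y[i] += A.AM[k] * x[A.JA[k]];
    return y;
}

int main() {
    // Tridiagonal test matrix with 2 on the diagonal and -1 off-diagonal, N = 4.
    CSR A;
    A.IA = {0, 2, 5, 8, 10};
    A.JA = {0, 1,  0, 1, 2,  1, 2, 3,  2, 3};
    A.AM = {2, -1, -1, 2, -1, -1, 2, -1, -1, 2};
    std::vector<double> x = {1, 1, 1, 1};
    auto y = matvec(A, x);
    std::printf("y = (%g, %g, %g, %g)\n", y[0], y[1], y[2], y[3]);  // expect (1, 0, 0, 1)
}

The inner loop touches only the stored nonzeroes of row i, so the product costs O(NZ(A)) operations.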

Modified row storage scheme

• 2 vectors: IJA, AM, each of length NZ(A)+1.

• Assume A = D + L + U, where D is diagonal and L and U are strictly lower and upper triangular, respectively. Let [pic].

Then

[pic]

[pic]

[pic] column index of jth element in AM

[pic]

[pic] is arbitrary

[pic]

The modified column storage scheme is defined similarly.

Very modified column storage scheme

• Assumes that A is either symmetric or nearly symmetric.

• Assume A = D + L + U, where D is diagonal and L and U are strictly lower and upper triangular, respectively. Let [pic] that will be stored. Let [pic].

• 2 vectors: IJA, AM with both aij and aji stored if either is nonzero.

[pic]

[pic]

[pic] row index of jth element in AM

[pic]

[pic] is arbitrary

[pic]

If [pic], then [pic]

AM contains first D, an arbitrary element, UT, and then possibly L.

Example:

[pic]

Then

| |D and column “pointers” |UT |Optional L |

| |1 |2 |

| |= |[pic] |

Theorem 3.10: [pic], if and only if

1. [pic], or

2. ∃ a sequence [pic] such that

a. k1=l1, kp=j, [pic],

b. i=kq, some 2 ≤ q ≤ p−1, and

c. [pic], 2 ≤ q ≤ p.

Computing the fillin

The cost in time will be [pic]. We need 3 vectors:

• M of length N−1

• LIST of length N

• JU of length N+1 (not technically necessary for fillin)

The fillin procedure has three major sections: initialization, computing row indices of U, and cleanup.

Procedure fillin( N, IJA, JU, M, LIST )

// Initialization of vectors

M(i)=0, 1 ≤ i ≤ N

LIST(i)=0, 1 ≤ i ≤ N

JU(1)=N+1

do i=1:N

Length=0

LIST(i)=i

// Compute row indices of U

do j=IJA(i):IJA(i+1)−1

k=IJA(j)

while LIST(k)=0

LIST(k)=LIST(i)

LIST(i)=k

Length++

if M(k)=0, then M(k)=i

k=M(k)

endwhile

enddo // j

JU(i+1)=JU(i)+Length

// Cleanup loop: we will modify this loop when computing either

// Ly=b or Ux=z (computing Dz=y is a separate simple scaling loop)

k=i

do j=1:Length+1

ksave=k

k=LIST(k)

LIST(ksave)=0

enddo // j

enddo // i

end FILLIN

Numerical factorization (A=LDU) is derived by embedding matrix operations involving U, L, and D into a FILLIN-like procedure.

The solution step replaces the Cleanup loop in FILLIN with

k=i

Sum=0

do j=JU(i):JU(i+1)−1

ksave=k

k=LIST(k)

LIST(ksave)=0

Sum+=U(j)y(k)

enddo // j

y(i)=b(i)−Sum

LIST(k)=0

The i loop ends after this substitution.

Solving Ux=z follows the same pattern, but columns are processed in the reverse order. Adding Lshift and Ushift parameters allows the same code to handle both cases A=AT and A≠AT equally easily.

See R. E. Bank and R. K. Smith, General sparse elimination requires no permanent integer storage, SIAM J. Sci. Stat. Comput., 8 (1987), pp. 574-584, and the SMMP and Madpack2 packages in the Free Software section of my home web page.

4. Solution of Nonlinear Equations

Intermediate Value Theorem: A continuous function on a closed interval takes on all values between and including its maximum and minimum.

(First) Mean Value Theorem: If f is continuous on [a,b] and is differentiable on (a,b), then there exists at least one [pic] such that [pic].

Taylor’s Theorem: Let f be a function such that [pic] is continuous on (a,b). If [pic], then

[pic],

where ∃ξ between x and y such that

[pic].

Given y=f(x), find all s such that f(s)=0.

Backtracking Schemes

Suppose [pic] and f is continuous on [a,b].

Bisection method: Let [pic]. Then either

1. [pic]: replace b by m.

2. [pic]: replace a by m.

3. [pic]: stop since m is a root.

[pic][pic][pic]

Features include

• Will always converge (usually quite slowly) to some root if one exists.

• We can obtain error estimates.

• 1 function evaluation per step.
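A minimal C++ sketch of the bisection loop just described, assuming f is continuous with f(a)f(b) < 0; the test function is made up.

// Hedged sketch of the bisection method.
#include <cmath>
#include <cstdio>
#include <functional>

double bisect(const std::function<double(double)>& f, double a, double b, double tol) {
    double fa = f(a);
    while (b - a > tol) {
        double m = a + 0.5 * (b - a);   // midpoint (written to avoid overflow in (a+b)/2)
        double fm = f(m);
        if (fm == 0.0) return m;        // m is a root: stop
        if (fa * fm < 0.0) b = m;       // root lies in [a, m]: replace b by m
        else { a = m; fa = fm; }        // root lies in [m, b]: replace a by m
    }
    return a + 0.5 * (b - a);
}

int main() {
    auto f = [](double x) { return x * x * x - 2.0; };   // root is 2^(1/3)
    double s = bisect(f, 1.0, 2.0, 1e-12);
    std::printf("root = %.12f, f(root) = %.2e\n", s, f(s));
}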

False position method: Derived from geometry.

[pic]

First we determine the secant line from [pic] to [pic]:

[pic].

The secant line crosses the x-axis when [pic], where

[pic].

Then a root lies in either [pic] or [pic] depending on the sign of [pic] as before.

Features include

• Usually converges faster than the Bisection method.

• [pic].

Fixed point methods: Construct a function g(s) such that [pic].

Example: [pic].

Constructing a good fixed point method is easy. The motivation is to look at a function y=g(x) and see when g(x) intersects y=x.

[pic]

Let [pic] and assume that g is defined on I. Then g has either zero or possibly many fixed points on I.

Theorem 4.1: If [pic] and g is continuous, then g has at least one fixed point in I.

Proof: [pic] means that [pic] and [pic]. If either [pic] or [pic], then we are done. Assume that is not the case: hence, [pic] and [pic]. For [pic], F is continuous and [pic] and [pic]. Thus, by the intermediate value theorem, there exists at least one [pic] such that [pic]. QED

Why are Theorem 4.1’s requirements reasonable?

• [pic]: s cannot equal g(s) if not [pic].

• Continuity: if g is discontinuous, the graph of g may lie partly above and below y=x without an intersection.

Theorem 4.2: If [pic] and [pic], then [pic] such that g(s)=s.

Proof: Suppose [pic] are both fixed points of g. The mean value theorem with [pic] has the property that

[pic],

which is a contradiction. QED

Note that the condition on [pic] must be continuous.

Algorithm: Let [pic] be arbitrary and set [pic]

[pic]

Note that after n steps, [pic].

Theorem 4.3: Let [pic] and [pic]. For [pic], the sequence [pic] converges to the fixed point s and the nth error [pic] satisfies

[pic].

Note that Theorem 4.3 is a nonlocal convergence theorem because s is fixed, a known interval I is assumed, and convergence is for any [pic].

Proof: (convergence) Recall that s is unique. For any n, [pic] between xn-1 and s such that

[pic].

Repeating this gives

[pic].

Since [pic],

[pic].

(error bound) Note that

|[pic] |( |[pic] |

| |( |[pic] |

|[pic] |( |[pic] |

Since [pic], [pic]. QED

Theorem 4.4: Let [pic] be continuous on some open interval containing s, where g(s)=s. If [pic], then [pic] such that the fixed point iteration is convergent whenever [pic].

Note that Theorem 4.4 is a local convergence theorem since x0 must be sufficiently close to s.

Proof: Since [pic] is continuous in an open interval containing s and [pic], then for any constant K satisfying [pic], [pic] such that if [pic], then [pic]. By the mean value theorem, given any [pic] between x and s such that [pic] and thus [pic]. Using [pic] in Theorem 4.3 completes the proof. QED

Notes:

• There is no hint what δ is.

• If [pic], then [pic] such that [pic]. So if [pic], then [pic]. Hence, only [pic] imply convergence, all others imply divergence.

Error Analysis

Let [pic], I a closed interval, and g satisfies a local theorem’s requirements on I. The Taylor series of g about x=s is

|[pic] |= |[pic] |

| |= |[pic], |

|where [pic] |= |[pic]. |

If [pic], and [pic], then

[pic], i.e., kth order convergence.

The important k’s are 1 and 2.

If we only have 1st order convergence, we can speed it up using quadratic interpolation: given [pic], fit a 2nd order polynomial p to the data such that [pic]. Use p to get the next guess. Let

[pic].

If [pic] satisfies [pic], then for n sufficiently large, [pic] is well defined and [pic], where [pic] (x* is hopefully s).

We can apply the fixed point method to the zeroes of f: Choose [pic], where [pic]. Note that [pic] and [pic] have the same zeroes, which is true also for [pic], where [pic] if y≠0 and [pic].

Chord Method

Choose [pic], m constant. So, [pic]. We want

[pic] in some [pic].

Thus, m must have the same sign as [pic]. Let [pic]. Solving for m,

[pic].

Therefore, xn+1 is the x-intercept of the line through [pic] with slope 1/m.

Properties:

• 1st order convergence

• Convergence if xn+1 can be found (always)

• Can obtain error estimates

Newton’s Method

Choose [pic]. Let s be such that [pic]. Then

|[pic] |= |[pic] |

| |= |[pic] |

If [pic] exists in [pic] and [pic], then [pic] and we get 2nd order convergence. So,

[pic].

What if [pic] exists? Then [pic], where [pic] exists. So,

|[pic] |= |[pic] |

| |= |[pic] |

| | | |

|[pic] |= |[pic] |

Thus, [pic]. Then [pic] makes the method 2nd order again.

Properties:

• 2nd order convergence

• Evaluation of both [pic] and [pic].

If [pic] is not known, it can be approximated using

[pic].
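A minimal C++ sketch of Newton's method, run once with the exact derivative and once with a forward-difference approximation in place of f'; the test function and the step h = 1e-7 are illustrative choices.

// Hedged sketch of Newton's method: x_{n+1} = x_n - f(x_n)/f'(x_n).
#include <cmath>
#include <cstdio>
#include <functional>

double newton(const std::function<double(double)>& f,
              const std::function<double(double)>& fprime,
              double x0, double tol, int maxit) {
    double x = x0;
    for (int n = 0; n < maxit; ++n) {
        double fx = f(x);
        if (std::fabs(fx) < tol) break;     // residual small enough: stop
        x -= fx / fprime(x);                // Newton update
    }
    return x;
}

int main() {
    auto f  = [](double x) { return x * x - 2.0; };        // root is sqrt(2)
    auto fp = [](double x) { return 2.0 * x; };             // exact derivative
    auto fp_fd = [&](double x) {                            // finite-difference stand-in
        double h = 1e-7;
        return (f(x + h) - f(x)) / h;
    };
    std::printf("exact f':  %.15f\n", newton(f, fp,    1.0, 1e-14, 50));
    std::printf("approx f': %.15f\n", newton(f, fp_fd, 1.0, 1e-14, 50));
}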

Secant Method

x0 is given and x1 is given by false position. Thus,

[pic] and [pic].

Properties:

• Must only evaluate [pic]

• Convergence order [pic]

N Equations

Let [pic]. Construct a fixed point function [pic] from [pic]. Replace

[pic] by

[pic]

Equivalent: for [pic],

[pic].

Thus,

|[pic] |( |[pic] |

| |( |[pic] |

| |( |[pic] |

| |( |[pic] |

Thus, [pic].

Newton’s Method

Define the Jacobian by [pic]. If [pic] for [pic], then we define

[pic], or (better)

1. Solve [pic]

2. Set [pic]

Quadratic convergence if

1. [pic] exists for [pic]

2. [pic] nonsingular

3. x0 sufficiently close to s

1D Example: [pic]. To reduce [pic],

• Bisection: [pic] 20 steps

• False position: [pic] 7 steps

• Secant: [pic] 6 steps

• Newton: [pic] 5 steps

Zeroes of Polynomials

Let [pic].

Properties:

• Computing [pic] and [pic] are easy.

• Finding zeroes when n ≤ 4 is by use of formulas, e.g., when n=2,

[pic]

When n ≥ 5, there are no such formulas.

Theorem 4.5 (Fundamental Theorem of Algebra): Given [pic] with n(1, there exists at least one r (possibly complex) such that [pic].

We can uniquely factor p using Theorem 4.5:

|[pic] |= |[pic], |[pic] degree polynomial |

| |= |[pic], |[pic] degree polynomial |

| | |[pic] | |

| |= |[pic] | |

We can prove by induction that there exist no more than n roots.

Suppose that [pic] such that [pic]. Then [pic], where [pic].

Theorem 4.6 (Division Theorem): Let [pic] and [pic] be polynomials of degree n and m, where 1 ≤ m ≤ n. Then [pic] of degree n−m and a unique polynomial [pic] of degree m−1 or less such that [pic].

Evaluating Polynomials

How do we compute [pic]? We may need to make a change of variables: [pic], which leads to

[pic].

Using Taylor’s Theorem we know that

[pic].

We use nested multiplication,

[pic],

where there are n−1 multiplies by x before the inner [pic] expression.

The cost of evaluating p is

| |multiplies |adds |

|nested multiplication |n |n+1 |

|direct evaluation |2n−1 |n+1 |
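A minimal C++ sketch of nested multiplication (Horner's rule): one multiply and one add per remaining coefficient, as in the cost table above; the example polynomial is made up.

// Hedged sketch of nested multiplication for p(x) = a_n x^n + ... + a_1 x + a_0.
#include <vector>
#include <cstdio>

double horner(const std::vector<double>& a, double x) {
    // Coefficients stored highest degree first: a[0] = a_n, ..., a.back() = a_0.
    double p = a[0];
    for (std::size_t k = 1; k < a.size(); ++k)
        p = p * x + a[k];                    // one multiply and one add per coefficient
    return p;
}

int main() {
    std::vector<double> a = {1.0, -3.0, 0.0, 2.0};   // p(x) = x^3 - 3x^2 + 2
    std::printf("p(2) = %g\n", horner(a, 2.0));      // 8 - 12 + 2 = -2
}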

Synthetic Division

To evaluate [pic]:

|[pic] | |

|[pic], |1 ≤ j ≤ n |

Then [pic] for the same cost as nested multiplication. We use this method to evaluate [pic]. Write

[pic].

Note that [pic] has degree n−1 since [pic] in the definition of [pic]. Further, its leading coefficient is also [pic]. Also, [pic] using the previous way of writing [pic]. So, we can show that

[pic].

Further, [pic]. Substituting,

| |[pic] |= |[pic] |

|or | | | |

| |[pic] |= |[pic]. |

We can continue this to get

[pic], where [pic].

Deflation

Find [pic] for [pic]. Then

[pic].

Now find [pic] for [pic]. Then

[pic].

Continue for all [pic]. A problem arises on a computer. Whatever method we use to find the roots will not usually find the exact roots, but something close. So we really compute [pic]. By the Division Algorithm Theorem,

[pic] with [pic] usually.

Now we compute [pic], which is probably wrong, too (and possibly quite wrong). A better [pic] can be computed using [pic] as the initial guess in our zero finding algorithm for [pic]. This correction strategy should be used for all [pic].

Suppose [pic] and [pic]. Then

[pic],

which implies that we should find the smaller roots first.

Descartes Rule of Signs

In order, write down the signs of nonzero coefficients of [pic]. Then count the number of sign changes and call it [pic].

Examples:

|[pic] |[pic] |([pic] |([pic] |(3 | |

| |( |( |( |( |, [pic] |

|[pic] |[pic] |([pic] |([pic] |(3 | |

| |( |( |( |( |, [pic] |

Rule: Let k be the number of positive real roots of [pic]. Then [pic] and [pic] is a nonnegative even integer.

Example: For [pic] above, [pic], [pic] or [pic], which implies that [pic] has one positive real root.

Fact: If [pic], then [pic] is a root of [pic]. Hence, we can obtain information about the number of negative real roots by looking at [pic].

Example: [pic], which implies that [pic] has 0 or 2 negative real roots.

Localization of Polynomial Zeroes

Once again, let [pic].

Theorem 4.7: Given [pic], then all of the zeroes of [pic] lie in [pic], where

[pic], [pic], and [pic], [pic].

Corollary 4.8: Given [pic] and [pic], then every zero lies in [pic].

Note that the circles [pic] have origin 0. One big root makes at least one circle large. A change of variable ([pic]) can help reduce the size of the largest circle.

Example: Let [pic]. Then

[pic].

Let [pic] and generate [pic]. We get [pic] and

[pic]. So, [pic] for [pic]. Then

[pic].

Theorem 4.9: Given any ρ such that [pic], there exists at least one zero of [pic] in [pic].

Apply Theorem 4.9 to Newton’s method. We already have [pic] and [pic] calculated for [pic]. If [pic] and [pic], then no [pic]. If [pic] for some ρ, then

[pic],

which is an upper bound on the relative error of s with respect to some zero of [pic].

5. Interpolation and Approximation

Assume we want to approximate some function [pic] by a simpler function [pic].

Example: a Taylor expansion.

Besides approximating [pic], [pic] may be used to approximate [pic] or [pic].

Polynomial interpolation

[pic]. Most of the theory relies on

• Division Algorithm Theorem

• p has at most n zeroes unless it is identically zero.

Lagrange interpolation

Given [pic], find p of degree n−1 such that [pic].

Note that if we can find polynomials [pic] of degree n−1 such that for [pic]

[pic]

then [pic] is a polynomial of degree n−1 and

[pic].

There are many solutions to the Lagrange interpolation problem.

The first one is

[pic].

[pic] has n−1 factors [pic], so [pic] is a polynomial of degree n−1. Further, it satisfies the remaining requirements.

Examples:

• n=2: [pic]

• n=3: [pic]

• n ≥ 3: very painful to convert [pic] into the form [pic].

The second solution is an algorithm: assume that [pic] has the Newton form,

[pic].

Note that

[pic],

[pic],

[pic]

[pic].

For all solutions to the Lagrange interpolation problem, we have a theorem that describes the uniqueness, no matter how it is written.

Theorem 5.1: For fixed [pic], there exists a unique Lagrange interpolating polynomial.

Proof: Suppose [pic] and [pic] are distinct Lagrange interpolating polynomials. Each has degree n−1 and [pic] is also a polynomial of degree n−1. However, [pic], which implies that r has n zeroes. We know it can have at most n−1 zeroes or must be identically zero. QED

Equally spaced [pic]’s can be disastrous, e.g., [pic]. It can be shown that [pic]. (See picture (jpeg)…)

We can write the Newton form in terms of divided differences.

1st divided difference:

[pic]

kth divided difference:

[pic]

We can prove that the Newton form coefficients are [pic].

We build a divided difference table in which coefficients are found on downward slanting diagonals.

|[pic] |[pic] | | | | | |

| | |[pic] | | | | |

|[pic] |[pic] | |[pic] | | | |

|[pic] |[pic] |[pic] | |[pic] | | |

| | |[pic] |[pic] | |[pic] | |

| | | |[pic] | | |[pic] |

|[pic] | |[pic] | | | | |

|[pic] |[pic] | |[pic] | | | |

| | |[pic] | | | | |

|[pic] |[pic] | | | | | |

This table contains a wealth of information about many interpolating polynomials for [pic]. For example, the quadratic polynomial of [pic] at [pic] is a table lookup starting at [pic].
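A hedged C++ sketch that builds the divided-difference coefficients in place and then evaluates the Newton form by nested multiplication; the sample nodes and data are made up.

// Hedged sketch: Newton form of the Lagrange interpolant via divided differences.
#include <vector>
#include <cstdio>

// Returns the Newton-form coefficients c[k] = f[x0, ..., xk].
std::vector<double> divided_differences(const std::vector<double>& x,
                                        std::vector<double> f) {
    int n = (int)x.size();
    // Overwrite f in place: after pass k, f[i] holds f[x_{i-k}, ..., x_i].
    for (int k = 1; k < n; ++k)
        for (int i = n - 1; i >= k; --i)
            f[i] = (f[i] - f[i - 1]) / (x[i] - x[i - k]);
    return f;          // f[k] is now the coefficient of (x-x0)...(x-x_{k-1})
}

// Evaluate the Newton form by nested multiplication.
double newton_eval(const std::vector<double>& x, const std::vector<double>& c, double t) {
    double p = c.back();
    for (int k = (int)c.size() - 2; k >= 0; --k)
        p = p * (t - x[k]) + c[k];
    return p;
}

int main() {
    std::vector<double> x = {0.0, 1.0, 2.0, 3.0};
    std::vector<double> f = {1.0, 2.0, 5.0, 10.0};   // samples of f(x) = x^2 + 1
    auto c = divided_differences(x, f);
    std::printf("p(1.5) = %g (exact 3.25)\n", newton_eval(x, c, 1.5));
}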

Hermite interpolation

This is a generalization of Lagrange interpolation. We assume that [pic] is available, where [pic]. We seek a [pic] of degree 2n−1 such that for [pic] two conditions are met:

1. [pic]

2. [pic]

There are two solutions. The first solution is as follows.

[pic],

where [pic], satisfies condition 1.

Also,

[pic],

where [pic], satisfies condition 2.

We must find polynomials [pic] and [pic] of degree 2n−1 satisfying these conditions. Let

[pic] and [pic].

Note that [pic] and [pic] vanish at all of the nodes except [pic] and that [pic] is a polynomial of degree 2n−2. Put

[pic]

and determine [pic] and [pic] so that [pic] and [pic]: choose

[pic].

Similarly,

[pic].

The second solution to the Hermite interpolation problem requires us to write

[pic]

Then

[pic]

[pic]

[pic]

or

[pic]

and so on…

Theorem 5.2: Given [pic], the Hermite interpolant is unique.

Just as in the Lagrange interpolation case, equally spaced nodes can cause disastrous problems.

Hermite cubics

n=2, so it is a cubic polynomial. Let [pic]. Then

[pic], [pic], and [pic].

Hermite cubics are by far the most common form of Hermite interpolation that you are likely to see in practice.

Piecewise polynomial interpolation

Piecewise linears: [pic].

----- See picture (jpeg)… -----

Piecewise quadratics: Use Lagrange interpolation of degree 2 over [pic], [pic], … This extends to Lagrange interpolation of degree k−1 over groups of k nodal points.

Piecewise Hermite: Cubics is the best known case. For [pic],

[pic]

Facts: [pic] are continuous, but [pic] is not usually continuous.

Cubic spline: We want a cubic polynomial such that [pic] are continuous. We write

[pic].

Note that [pic] must be linear on [pic]. So

[pic].

Then

|[pic] |= |[pic] |

| |= |[pic] |

and

[pic].

We know [pic] and [pic], so

[pic].

At this point, [pic] can be written by knowing [pic]. The [pic] can be eliminated by using the continuity condition on [pic]. Suppose that [pic]. Then [pic], but for [pic],

[pic]

and

[pic].

Equating both expressions for [pic] we get

[pic].

Imposing [pic] gives us n−2 equations and n−2 unknowns (plus 2 knowns) to make the system have a unique solution.

Error Analysis

Consider Lagrange interpolation with [pic]. We want to know what

[pic]

We write

[pic], where [pic] and G is to be determined.

Theorem 5.3: [pic], where ξ depends on x.

Proof: Note that G is continuous at any [pic]. Using L’Hopital’s Rule,

[pic].

Hence, [pic] is continuous at any node [pic]. Now let x be fixed and consider

[pic].

Note that [pic] since [pic]. By the definition of [pic], [pic]. Now suppose that [pic]. Then [pic] vanishes at n+1 distinct points. [pic] must vanish at some point strictly between each adjacent pair of these points (by Rolle’s Theorem), so [pic] vanishes at n distinct points. Similarly, [pic] vanishes at n−1 distinct points. We continue this until we have one point, ξ, depending on x, such that [pic]. Since [pic] is a polynomial of degree n−1,

| |0 |= |[pic] |= |[pic] |

| | | | |= |[pic] |

|or | | | | | |

| |[pic] |= |[pic]. | | |

Now suppose that [pic], some i. Then [pic] only vanishes at n distinct points. But,

[pic],

so [pic] and [pic] still vanishes at n distinct points. We use the same trick as before. QED

Consider Hermite interpolation, with [pic], [pic], [pic], [pic], and [pic] is the Hermite interpolant. Set

[pic] and [pic].

Since [pic] and [pic], we have [pic] is continuous and

[pic].

Define

[pic].

In this case,

[pic] vanishes at 2n distinct points,

[pic] vanishes at 2n−1 distinct points,

[pic]

Hence, [pic].

Note that interpolation is a linear process. Let [pic] be any interpolating function (e.g., Lagrange, Hermite, or spline) using a fixed set of nodes. Then for any functions f and g,

[pic], any [pic].

Examples:

Lagrange: [pic].

Hermite: Similar to Lagrange.

Splines: Define the Kronecker delta function, [pic]. Let [pic] be the unique spline function satisfying [pic]. If [pic], then [pic] is the interpolatory spline for [pic]. Linearity follows as before.

Let [pic] be any linear interpolatory process that is exact for polynomials of degree m, i.e., if [pic] is a polynomial of degree m, [pic]. For a given function [pic], Taylor’s Theorem says that

[pic].

Define

[pic]

so that

[pic].

Even More Error Analysis

Define [pic].

Theorem 5.4: If [pic] is a polynomial of degree n−1 that interpolates [pic] at [pic], then

[pic].

Tchebyshev Polynomials of the First Kind

Define

[pic],

where

[pic] and [pic].

Choose [pic], [pic]. Then [pic] and we get a three term recurrence:

|[pic] |= |[pic] |

| |= |[pic] |

Hence,

|[pic] |= |[pic] |

|[pic] |= |[pic] |

| |[pic] | |

We can verify inductively that [pic] is a kth order polynomial. The leading coefficient of [pic] is [pic] and [pic], when

[pic].

Finally, for [pic] we can show that [pic]. For [pic], [pic], so, in fact, [pic]. From Theorem 5.4, we can prove that [pic] is minimized when [pic].
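A minimal C++ sketch of the three-term recurrence T0(x) = 1, T1(x) = x, T_{k+1}(x) = 2x T_k(x) − T_{k−1}(x), checked against the cos(k arccos x) definition; the evaluation point is arbitrary.

// Hedged sketch of the Tchebyshev recurrence versus its trigonometric definition.
#include <cmath>
#include <cstdio>

double chebyshev_T(int k, double x) {
    double Tkm1 = 1.0, Tk = x;              // T0 and T1
    if (k == 0) return Tkm1;
    for (int j = 1; j < k; ++j) {
        double Tkp1 = 2.0 * x * Tk - Tkm1;  // three-term recurrence
        Tkm1 = Tk;
        Tk = Tkp1;
    }
    return Tk;
}

int main() {
    double x = 0.3;
    for (int k = 0; k <= 5; ++k)
        std::printf("T_%d(%.1f) = %.12f  vs  cos(k acos x) = %.12f\n",
                    k, x, chebyshev_T(k, x), std::cos(k * std::acos(x)));
}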

Translating Intervals

Suppose the problem on [pic] needs to be reformulated on [pic].

Example: Tchebyshev only works on [pic]. We use a straight line transformation: [pic]. Hence,

[pic].

Example: Tchebyshev with [pic] arbitrary. Then

[pic].

The shifted Tchebyshev polynomials are defined by

[pic].

Since

[pic],

then

[pic].

are zeroes of [pic]. Further,

|[pic] |= |1 |

|[pic] |= |[pic] |

| |[pic] | |

|[pic] |= |[pic] |

| |= |[pic], |

We can prove that the leading coefficient of [pic] is [pic]. Further, we know that [pic] from Theorem 5.4.

Tensor Product Interpolation

Given [pic] and [pic], interpolate [pic] over [pic], giving us

[pic].

The bi-Lagrangian is defined by [pic], where [pic] is the one dimensional Lagrangian along either the x or y axis ([pic] respectively) and [pic].

The bi-Hermite and bi-Spline can be defined similarly.

Orthogonal Polynomials and Least Squares Approximation

We approximate [pic] on [pic] given [pic]. Define

[pic].

Problem A: Let [pic], [pic] (weights), m>n. Find [pic] which minimizes [pic].

Problem B: Let [pic] and positive on [pic]. Find [pic] which minimizes [pic].

Properties of both: each has a unique solution and is “easy” to solve in a finite number of steps using explicit formulas (which is not true of solving the more general problem [pic]).

Define

|[pic] |= |[pic] |

|[pic] |= |[pic] |

|[pic] |= |[pic] (either inner product) |

Note that [pic] is a real norm for [pic], but is only a semi-norm for [pic].

Theorem 5.5 (Cauchy-Schwarz): Let [pic]. Then

[pic].

Proof: If [pic], then [pic]. If [pic], then

[pic].

Use [pic]. Then [pic]. QED

Definitions: p and q are orthogonal if and only if [pic]. p and q are orthonormal if and only if [pic] and [pic].

Consider [pic], the set of monomials. The elements are not orthogonal to each other under either [pic] or [pic]. Yet any [pic] is a linear combination of the elements of S. We can transform S into a different set of orthogonal polynomials using the

Gram-Schmidt Algorithm: Given S, let

|[pic] |[pic] |

|[pic] |[pic] |

Then [pic] is orthogonal and [pic] is orthonormal.

Note that for [pic], [pic].

Let [pic]. Then

[pic].

Using this expression, we can write [pic] as

[pic]

since

[pic].

Best Least Squares Approximation

Theorem 5.6: Let [pic] be either [pic] or [pic] and [pic]. If [pic], then the polynomial [pic] that minimizes [pic] over [pic] is given by

[pic],

where [pic] is the orthonormal set of polynomials generated by Gram-Schmidt.

Proof: Let [pic]. Then [pic]. Further,

|[pic] |= |[pic] |

| |= |[pic] |

| |= |[pic] |

| |= |[pic], |

which is minimized when we choose [pic]. QED

Note: The coefficients [pic] are called the generalized Fourier coefficients.

Facts:

• [pic]

• [pic]

Efficient Computation of [pic]

We can show that we have a three term recurrence:

[pic],

where

[pic] and [pic].

This gives us

|[pic] |= |[pic] |

| |= |[pic] |

So,

[pic]

is equivalent and may be less sensitive to roundoff error.

Also,

|[pic] |= |[pic] |

| |= |[pic] |

| |= |[pic] |

If we precompute [pic], then [pic] only costs 2n−1 multiplies.

6. Numerical Integration and Quadrature Rules

Assume [pic] is integrable over [pic]. Define

[pic], where [pic] is a weight function.

Frequently, [pic]. A formula that approximates [pic] is called numerical integration or a quadrature rule. In practice, if [pic] approximates [pic] well enough, then [pic].

Interpolatory Quadrature

Let [pic] be the Lagrangian interpolant of [pic] at [pic], i.e.,

[pic].

Define

|[pic] |= |[pic] |

| |= |[pic] |

| |= |[pic] |

| |= |[pic], |

where the [pic] are quadrature weights and the [pic] are the quadrature nodes.

Note that if [pic], then [pic], i.e., the quadrature is exact. If [pic] is exact for polynomials of degree [pic], then we say the quadrature rule has precision m. We will develop quadrature rules that have precision [pic] later (e.g., Gaussian quadrature).

Method of Undetermined Coefficients

If [pic] has precision n, then it is exact for the monomials [pic]. Suppose the nodes are no longer fixed. We start with n+1 equations

[pic],

for our 2n+1 unknowns [pic] and [pic]. Let [pic] so we have 2n+2 (nonlinear) equations and unknowns. If it has a solution, then it has precision 2n+1. This is what Gaussian quadrature is based on (which we will get to later).

The Trapezoidal and Simpson’s Rules are trivial examples.

Trapezoidal Rule

Let [pic]. This rule is derived by direct integration. Take [pic] and [pic]. Then

|[pic] |= |[pic] |

| | | |

|[pic] |= |[pic] |

| |= |[pic] |

| |= |[pic]. |

Simpson’s Rule

This is derived using undetermined coefficients. Let [pic], [pic], [pic], and [pic]. We force

[pic] for [pic].

Then

|[pic] |= |[pic] |= |2h |= |[pic] |

|[pic] |= |[pic] |= |0 |= |[pic] |

|[pic] |= |[pic] |= |[pic] |= |[pic] |

Solving this 3×3 system of linear equations gives us [pic] and [pic]. Note that [pic], but [pic] so that Simpson’s Rule has precision 3.

What Does Increasing n Do for You?

Theorem 6.1: For any [pic], let [pic] be an interpolatory quadrature derived by direct integration. Then there exists a constant [pic] such that

[pic].

Justification for Positive Weights

We must have [pic]. Further,

[pic].

If [pic], and we can choose a set of [pic]’s to get this, then Theorem 6.1 guarantees convergence. All positive weights are good because they reduce roundoff errors since we ought to have as many roundoffs on the high and low sides, thus canceling errors. Finally, we expect roundoff to be minimized when the [pic]’s are (nearly) equal.

Translating Intervals

We will derive a formula on a specific interval, e.g., [(1,1], and then apply it to another interval [a,b]. Suppose that

[pic] that approximates [pic]

and we want [pic]. Set [pic], [pic], and [pic]. Then [pic]. Let [pic]. Then

[pic] approximates [pic].

So,

|[pic] |= |[pic] |

| |= |[pic], so |

| | | |

|[pic] |= |[pic]. |

Newton-Cotes Formulas

Assume that the [pic]’s are equally spaced in [a,b] and that we define a quadrature rule by [pic]

The closed Newton-Cotes formulas [pic] assume that [pic] and [pic]. The open Newton-Cotes formulas [pic] assume that [pic] and [pic].

Examples:

|[pic] |2 point closed Trapezoidal Rule |

|[pic] |3 point closed Simpson Rule |

|[pic] |3 point open |

|[pic] |4 point closed |

For [pic], the weights are always of mixed signs. Higher order formulas are not necessarily convergent. Lower order formulas are extremely useful.

Suppose we have [pic], the Hermite interpolant of [pic]. We want [pic], which we can get by observing that

|[pic] |= |[pic] |

| |= |[pic] |

| |= |[pic] |

| |= |[pic] |

This is known as the Trapezoidal Rule with Endpoint Correction (a real mouthful). It has precision 3.

Error Analysis

Assuming that [pic], the error in interpolation is given by

[pic].

The error in integration is

|[pic] |= |[pic] |

| |= |[pic] |

So,

[pic].

We can simplify the last equation by applying the Second Mean Value Theorem (which states that for [pic] such that g does not change sign, [pic]) to the formula for [pic]. Hence,

|[pic] |= |[pic] |= |[pic] |= |[pic] |

|[pic] |= |[pic] |= | |= |[pic] |

|[pic] |= |[pic] | | | | |

Composite Rules

What if we wanted a highly accurate rule on [a,b]? The best approach is to divide [a,b] into subintervals, use a low order quadrature rule on each subinterval, and add them up since high order quadrature rules tend to have problems.

Let [pic]. Then

[pic].

Consider [pic]. Then for the Trapezoidal Rule,

|[pic] |= |[pic] |

| |= |[pic] |

| | | |

|[pic] |= |[pic] |

Theorem 6.2: Let [pic] be constants of the same sign. If [pic], then for some [pic],

[pic].

Hence,

[pic].

Consider Simpson’s Rule:

[pic].

So,

[pic]

and

[pic].
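A minimal C++ sketch of the composite Trapezoidal and Simpson rules on [a,b] with n equal subintervals; the integrand exp(x) is a made-up test that shows the expected h² and h⁴ error behavior.

// Hedged sketch of composite quadrature rules.
#include <cmath>
#include <cstdio>
#include <functional>

double composite_trapezoid(const std::function<double(double)>& f,
                           double a, double b, int n) {
    double h = (b - a) / n, sum = 0.5 * (f(a) + f(b));
    for (int i = 1; i < n; ++i) sum += f(a + i * h);
    return h * sum;
}

double composite_simpson(const std::function<double(double)>& f,
                         double a, double b, int n) {   // n must be even
    double h = (b - a) / n, sum = f(a) + f(b);
    for (int i = 1; i < n; ++i) sum += (i % 2 ? 4.0 : 2.0) * f(a + i * h);
    return h / 3.0 * sum;
}

int main() {
    auto f = [](double x) { return std::exp(x); };       // exact integral on [0,1] is e - 1
    double exact = std::exp(1.0) - 1.0;
    for (int n = 2; n <= 32; n *= 2)                      // errors drop roughly like h^2 and h^4
        std::printf("n=%2d  trap err %.2e   simpson err %.2e\n", n,
                    std::fabs(composite_trapezoid(f, 0, 1, n) - exact),
                    std::fabs(composite_simpson(f, 0, 1, n) - exact));
}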

Corrected Trapezoidal Rule

[pic] and

[pic]

The number of function evaluations and order of error over n points is

|[pic] |N |[pic] |

|[pic] |2N+1 |[pic] |

|[pic] |N+2 derivatives (yikes) |[pic] |

We can show that

[pic].

Adaptive Quadrature

Suppose we want [pic] to within an error tolerance of [pic] and an automatic procedure to accomplish this feat. Consider [pic].

Motivation: Suppose [pic] is badly behaved only over [pic], where [pic] is a small part of [pic]. Then [pic] over [pic] will be accurate for small n’s, but [pic] over [pic] may be a very poor approximation to [pic]. Doubling n will not necessarily increase accuracy over [pic], where it was already acceptable, and we still might not get an acceptable approximation over [pic]. Instead, we want to subdivide [pic] and work hard just there while doing minimal work in [pic]… and we do not want to have to know where [pic] is in advance!

Adaptive quadrature packages accept [pic], f, and [pic] and return EST, which supposedly satisfies

[pic].

An error sensing mechanism is used on intermediate steps to control the overall quadrature error. For instance, if [pic] and [pic], then

[pic] and [pic],

where [pic]. The critical (and sometimes erroneous) assumption is that [pic] constant over [pic]. This is true when [pic] is small in comparison to how rapidly [pic] varies in [pic]. Set

[pic].

Then

[pic],

which means that [pic] is 16 times more accurate than [pic] when [pic] is well behaved on [pic]. So,

[pic].

We need to compute both [pic] and [pic] over [pic]. Many applications require that EST be very accurate, rather than inexpensive to compute. Hence, we can use a conservative error estimator of the form,

[pic].

Algorithm apparent: Compute [pic] and [pic] over [pic].

1. If the error is acceptable, then add the estimate of [pic] into EST.

2. Otherwise, divide [pic] into two equal sized intervals and try again in both intervals. The expected error on both intervals is reduced by a factor of 32.

The real estimator must depend on the size of [pic], however. A good choice is

[pic].

Theorem 6.3: This estimator will eventually produce an interval [pic] that is acceptable.

Proof: Every time we halve the interval [pic], the quadrature error decreases by a factor of 32. Set

[pic].

If

[pic],

then

[pic].

Taking [pic]. QED

Theorem 6.4: The cost is only two extra function evaluations at each step.

Folk Theorem 6.5: Given any adaptive quadrature algorithm, there exists an infinite number of [pic]’s that will fool the Algorithm Apparent into failing. (Better algorithms work for the usual [pic]’s.)

Proof: Let [pic] be 5 equally spaced points used in computing [pic] and [pic]. Test

[pic].

• If true, then use [pic] as an estimate to [pic].

• If false, then retreat to [pic], where [pic], equally spaced. Now only evaluations at [pic] are necessary if we saved our previous function evaluations. We test [pic]. If the test succeeds, then we pass on to interval [pic], otherwise we work on a new level 3. This process is not guaranteed to succeed. Hence, we need to add an extra condition that

[pic] always.

If this fails, then we cannot produce EST. QED
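A hedged C++ sketch in the spirit of the Algorithm Apparent: a recursive adaptive Simpson routine that accepts a subinterval when the coarse and refined estimates agree to within the (subdivided) tolerance, using the factor-of-16 error reduction noted above; the peaked test integrand, the depth limit, and the exact acceptance constant are illustrative choices only.

// Hedged sketch of adaptive quadrature by recursive interval halving.
#include <cmath>
#include <cstdio>
#include <functional>

using Fn = std::function<double(double)>;

double simpson(const Fn& f, double a, double b) {
    double m = 0.5 * (a + b);
    return (b - a) / 6.0 * (f(a) + 4.0 * f(m) + f(b));
}

double adaptive(const Fn& f, double a, double b, double eps, int depth) {
    double m = 0.5 * (a + b);
    double whole  = simpson(f, a, b);                 // coarse estimate
    double halves = simpson(f, a, m) + simpson(f, m, b);   // refined estimate
    // 16-fold error reduction => |whole - halves| ~ 15 * (error of halves).
    if (depth <= 0 || std::fabs(whole - halves) <= 15.0 * eps)
        return halves;                                // accept this piece
    // Otherwise split the interval and split the tolerance with it.
    return adaptive(f, a, m, 0.5 * eps, depth - 1) +
           adaptive(f, m, b, 0.5 * eps, depth - 1);
}

int main() {
    // Made-up integrand: sharply peaked near x = 0.3, smooth elsewhere.
    Fn f = [](double x) { return 1.0 / ((x - 0.3) * (x - 0.3) + 1e-3); };
    double est = adaptive(f, 0.0, 2.0, 1e-8, 30);
    std::printf("EST = %.10f\n", est);
}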

Richardson Extrapolation

This method combines two or more estimates of something to get a better estimate. Suppose

[pic],

where [pic] is computable for any [pic]. Further, we assume that

[pic].

Finally, we assume that

[pic],

where the [pic]’s are independent of h and [pic]. Take [pic] with [pic] the most common value. We want to eliminate the [pic] term using a combination of [pic] by noting that

[pic].

We have two definitions of [pic], so we can equate them to compute [pic] from the first and second definitions, or

[pic].

Set

|[pic] |= |[pic] |

|[pic] |= |[pic] |

|[pic] |= |[pic] |

Then

[pic].

If [pic], then we can repeat this process to eliminate the [pic] term. Define

[pic]

Then

[pic] and [pic]

Applications of Richardson Extrapolation

Differentiation is a primary application. Assume that [pic].

First, try for small h, [pic]. The Taylor expansion about [pic] gives us

[pic],

where the [pic]’s are independent of h and probably unknown.

Second, try [pic]. We can prove that

[pic]

We can modify the definition of [pic] to use [pic] Then

[pic]

The next extrapolation must be of the form [pic]. So,

[pic].

Use this formula whenever

[pic]

Romberg Integration

On [pic], approximate [pic]. Define

[pic],

where [pic]. This choice of [pic] eliminates half of the [pic] function evaluations when computing [pic]. The error only contains even powers of h. Hence,

[pic] or [pic].

Continue extrapolation as long as

[pic].

Roundoff error is the typical culprit for stopping Richardson extrapolation.
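A minimal C++ sketch of Romberg integration: a trapezoid column that halves h and reuses old function values, with Richardson extrapolation across each row; the integrand 4/(1+x²) (whose integral on [0,1] is pi) is a made-up test.

// Hedged sketch of Romberg integration.
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    auto f = [](double x) { return 4.0 / (1.0 + x * x); };
    const double pi = std::acos(-1.0);
    double a = 0.0, b = 1.0;
    const int levels = 6;
    std::vector<std::vector<double>> T(levels);

    // Column 0: trapezoid values, each level halving h and adding only new midpoints.
    double h = b - a;
    double trap = 0.5 * h * (f(a) + f(b));
    T[0].push_back(trap);
    for (int j = 1; j < levels; ++j) {
        double sum = 0.0;
        int m = 1 << (j - 1);                        // number of new midpoints
        for (int i = 0; i < m; ++i) sum += f(a + (i + 0.5) * h);
        h *= 0.5;
        trap = 0.5 * trap + h * sum;
        T[j].push_back(trap);
        // Richardson extrapolation across the row: eliminate h^(2k) terms.
        for (int k = 1; k <= j; ++k)
            T[j].push_back(T[j][k - 1] +
                           (T[j][k - 1] - T[j - 1][k - 1]) / (std::pow(4.0, k) - 1.0));
        std::printf("level %d: best estimate %.12f (error %.2e)\n",
                    j, T[j][j], std::fabs(T[j][j] - pi));
    }
}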

7. Automatic Differentiation (AD)

This is a technique to numerically evaluate the derivative of a function using a computer program. There have been two standard techniques in the past:

• Symbolic differentiation

• Numerical differentiation

Symbolic differentiation is slow, frequently produces many pages of expressions instead of a compact one, and has great difficulty converting a computer program. Numerical differentiation involves finite differences, which are subject to roundoff errors in the discretization and cancellation effects. Higher order derivatives exacerbate the difficulties of both techniques.

“Automatic differentiation solves all of the mentioned problems.” Wikipedia

Throughout this section, we follow Wikipedia’s AD description and use its figures.

[pic]

The primary tool of AD is the chain rule,

[pic] for a function [pic].

There are two ways to traverse the chain rule:

• Left to right, known as forward accumulation.

• Right to left, known as backward accumulation.

Assume that any computer program that evaluates a function [pic] can be decomposed into a sequence of simpler, elementary operations, each of which is differentiated using a trivial table lookup procedure. Each elementary partial derivative is evaluated for a particular argument using the chain rule to provide derivative information about F (e.g., gradients, tangents, Jacobian matrix, etc.) that is exact numerically to some level of accuracy. Problems with symbolic mathematics are avoided by only using it for a set of very basic expressions, not complex ones.

Forward accumulation

First compute [pic] then [pic]in [pic].

Example: Find the derivative of [pic]. We have to seed the expression to distinguish between the derivative for [pic] and [pic].

|Original code statements |Added AD statements |

|[pic] |[pic] (seed) |

|[pic] |[pic] (seed) |

|[pic] |[pic] |

|[pic] |[pic] |

|[pic] |[pic] |

[pic]

Forward accumulation traverses the figure from bottom to top to accumulate the result.

In order to compute the gradient of f, we have to evaluate both [pic] and [pic], which corresponds to using seeds [pic] and [pic], respectively.

The computational complexity of forward accumulation is proportional to the complexity of the original code.

Reverse accumulation

First compute [pic] then [pic] in [pic].

Example: As before. We can produce a graph of the steps needed. Unlike forward accumulation, we only need one seed to walk through the graph (from top to bottom this time) to calculate the gradient in half the work of forward accumulation.

[pic]

Superiority condition of forward versus reverse accumulation

Forward accumulation is superior to reverse accumulation for functions [pic]. Reverse accumulation is superior to forward accumulation for functions [pic].

Jacobian computation

The Jacobian J of [pic] is a [pic] matrix. We can compute the Jacobian using either

• n sweeps of forward accumulation, where each sweep produces a column of J.

• m sweeps of backward accumulation, where each sweep produces a row of J.

Computing the Jacobian with a minimum number of arithmetic operations is known as optimal Jacobian accumulation and has been proven to be an NP-complete problem.

Dual numbers

We define a new arithmetic in which every [pic] is replaced by [pic], where [pic] and ε is nothing but a symbol such that [pic]. For regular arithmetic, we can show that

[pic],

[pic],

and similarly for subtraction and division. Polynomials can be calculated using dual numbers:

|[pic] |= |[pic] |

| |= |[pic] |

| |= |[pic], |

where [pic] represents the derivative of P with respect to its first argument and [pic] is an arbitrarily chosen seed.

The dual number based arithmetic we use consists of ordered pairs [pic] with ordinary arithmetic on the first element and first order differential arithmetic on the second element. In general for a function f, we have

[pic],

where [pic] represent the derivative of f with respect to the first and second arguments, respectively. Some common expressions are the following:

[pic] and [pic]

[pic] and [pic]

[pic] and [pic]

[pic] and [pic]

[pic] and [pic]

[pic]

The derivative of [pic] at some point [pic] in some direction [pic] is given by

[pic]

using the just defined arithmetic. We can generalize this method to higher order derivatives, but the rules become quite complicated. Truncated Taylor series arithmetic is typically used instead since the Taylor summands in a series are known coefficients and derivatives of the function in question.
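A minimal C++ sketch of forward-mode AD with dual numbers: each elementary operation propagates (value, derivative) pairs, and seeding the inputs with (·,1) or (·,0) selects which partial derivative is computed. The function f(x1,x2) = x1*x2 + sin(x1) is an illustrative choice, not necessarily the one in the notes' figures.

// Hedged sketch of dual-number (forward mode) automatic differentiation.
#include <cmath>
#include <cstdio>

struct Dual {
    double val;   // ordinary value
    double dot;   // derivative with respect to the seeded input
};

Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.dot + b.dot}; }
Dual operator-(Dual a, Dual b) { return {a.val - b.val, a.dot - b.dot}; }
Dual operator*(Dual a, Dual b) { return {a.val * b.val, a.dot * b.val + a.val * b.dot}; }
Dual operator/(Dual a, Dual b) {
    return {a.val / b.val, (a.dot * b.val - a.val * b.dot) / (b.val * b.val)};
}
Dual sin(Dual a) { return {std::sin(a.val), std::cos(a.val) * a.dot}; }
Dual exp(Dual a) { return {std::exp(a.val), std::exp(a.val) * a.dot}; }

// An illustrative function (assumption, not taken from the notes).
Dual f(Dual x1, Dual x2) { return x1 * x2 + sin(x1); }

int main() {
    double a = 1.5, b = 0.5;
    // Seed x1' = 1, x2' = 0 to get df/dx1; seed (0, 1) to get df/dx2.
    Dual d1 = f({a, 1.0}, {b, 0.0});
    Dual d2 = f({a, 0.0}, {b, 1.0});
    std::printf("f = %.6f, df/dx1 = %.6f (exact %.6f), df/dx2 = %.6f (exact %.6f)\n",
                d1.val, d1.dot, b + std::cos(a), d2.dot, a);
}

Each sweep with a different seed produces one directional derivative, which is why forward accumulation needs n sweeps to build a full gradient or Jacobian column by column.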

Implementations

Google “automatic differentiation” and just search through the interesting sites.

Oldies, but goodies:

• ADIFOR (Fortran 77)

• ADIC (C, C++)

• OpenAD (Fortran 77/95, C, C++)

• MAD (Matlab)

Typically, the transformation process is similar to the following:

[pic]

9. Monte Carlo Methods
