


CONTENTS

I. NUMERICAL METHODS FOR NONLINEAR EQUATIONS AND SYSTEMS OF EQUATIONS

I.1. NEWTON'S METHOD FOR EQUATIONS

I.2. NEWTON'S METHOD FOR SYSTEMS OF EQUATIONS

II. NEWTON SEARCH FOR A MINIMUM

III. THE GOLDEN RATIO SEARCH FOR A MINIMUM

IV. NELDER-MEAD SEARCH FOR A MINIMUM

V. NUMERICAL METHODS FOR DIFFERENTIAL EQUATIONS AND SYSTEMS OF EQUATIONS

V.1. RUNGE-KUTTA METHODS

V.2. FINITE DIFFERENCE METHODS

VI. MATHEMATICAL NOTIONS

I. NUMERICAL METHODS FOR NONLINEAR EQUATIONS AND SYSTEMS OF EQUATIONS

I.1. NEWTON'S METHOD FOR EQUATIONS

Here we try to approximate a root z of the equation [pic] by using the tangent line of the function at successive points. Starting from [pic], the left figure shows a successful trial and the right figure shows an unsuccessful one:

[pic]

So, the [pic] approximation of z can be obtained from the Taylor series of the function [pic], using only its constant and linear terms. Hence [pic] is the solution of the linear equation

[pic].

If this equation has a unique solution for [pic], then

[pic]

and it is called the formula of Newton's method for the equation [pic].
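Before the convergence theorem below, here is a minimal Python sketch of this iteration (the callables f and df standing for the function and its derivative, the starting point x0, and the tolerance are assumptions of the example, not part of the original text):

    # A minimal sketch of Newton's method for a single equation, assuming
    # f and df (its derivative) are supplied as Python callables.
    def newton(f, df, x0, tol=1e-6, max_iter=50):
        x = x0
        for _ in range(max_iter):
            x_new = x - f(x) / df(x)      # x_{k+1} = x_k - f(x_k)/f'(x_k)
            if abs(x_new - x) < tol:      # simple stopping criterion
                return x_new
            x = x_new
        raise RuntimeError("no convergence within max_iter iterations")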

THEOREM.

Let the function [pic] be twice continuously differentiable on [pic]. Assume that there is a root of the equation [pic] on the interval [pic] and [pic] [pic] [pic]. Then Newton's iteration converges (starting from [pic], if [pic] and from [pic], if [pic]) to the unique solution on [pic] and

[pic]

where

[pic], [pic] [pic].

Numerical example

Determine the root of the equation

[pic]

by Newton's method with error bound [pic].

(1) We can find a suitable interval (on which the conditions of our theorem are satisfied) graphically and/or by tabulation. If we write our equation in the form

[pic]

and sketch the functions on the left and right sides (roughly), then we get a (rough) estimate of the root:

[pic]

We can see that the equation has a unique root z, lying in the interval (1/3, 2). Moreover, by the table

[pic]

z lies in the interval (2/3, 1), too.

(2) Now let us examine the conditions concerning the convergence of the method

a) [pic] [pic] [pic]

b) Since [pic] and [pic] [pic], on a small neighbourhood of z the graph of [pic] is the following:

[pic]

Hence [pic] is the right choice.

c) Since [pic] and [pic] is strictly monotone decreasing on [pic], [pic] Furthermore, [pic] and it is strictly monotone increasing on [pic] (because of f'''(x)>0). Therefore, [pic] is strictly monotone decreasing and [pic]

(3) The computation of the iterations and the estimation of the errors:

[pic]

[pic], [pic]

[pic], [pic]

[pic], [pic]

As [pic] is fulfilled, [pic] can be considered a suitable approximation of the root. (If we used rougher bounds for m and M, then the value M/(2m) would be larger and we might need more iterations to guarantee the same accuracy.)

I.2. NEWTON'S METHOD FOR SYSTEMS OF EQUATIONS

Let the nonlinear system of equations

[pic]

be given. In shortly form:

[pic].

Starting from the initial vector [pic], we construct [pic] in such a way that the functions [pic] are made "linear" by the first (n + 1) terms of their Taylor series at [pic]. This means the surfaces are replaced by their tangent planes at [pic]. Hence, we can get [pic] from the system of linear equations

[pic]

[pic]

[pic]

if it has a unique solution. Using matrices, we can write

[pic]

Shortly

[pic],

where [pic] is called the Jacobian matrix of the function system [pic]. If there exists a unique solution (det [pic]), then we can express [pic] from this equation using the inverse matrix of [pic]:

[pic]

So, the formula of Newton's method (writing [pic] and [pic] in place of [pic] and [pic], respectively) is

[pic]

The following theorem gives a fundamental characterization of the method.

THEOREM

Suppose that each of the partial derivatives [pic] [pic] is continuous on a neighbourhood of a solution [pic] and the matrix [pic] is invertible. Then Newton's method converges when started sufficiently close to the solution. And if the partial derivatives [pic] are also continuous on the above-mentioned neighbourhood, then the error decreases at least quadratically, that is,

[pic]

where c is a positive constant.

REMARK

It is not practical to compute [pic] by the formula

[pic]

(here we would have to compute the inverse of a matrix of order n in every step), but it is practical to solve the original system of linear equations for the unknowns [pic] and then obtain the values [pic] immediately. Namely, solving a system of linear equations requires less work than computing the inverse matrix.
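Following this remark, a minimal Python sketch might look as follows (the callables F and J for the system and its Jacobian, the start vector x0, and the tolerance are assumptions of the example):

    import numpy as np

    # A sketch of Newton's method for systems: instead of inverting the
    # Jacobian, each step solves J(x_k) * delta = -F(x_k) and sets
    # x_{k+1} = x_k + delta.
    def newton_system(F, J, x0, tol=1e-8, max_iter=50):
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            delta = np.linalg.solve(J(x), -F(x))   # one linear solve per step
            x = x + delta
            if np.linalg.norm(delta) < tol:        # stopping criterion b) below
                return x
        raise RuntimeError("no convergence within max_iter iterations")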

Numerical example.

Let a circle and a paraboloid be given by the formulas

[pic] [pic]

[pic] [pic], and [pic]

[pic] [pic].

respectively. Determine the distance between the given curve and surface.

Geometrically, we have to find the shortest path starting from a point of the circle and ending at a point of the paraboloid:

[pic]

The square of the distance is

[pic]

For the extremum points it is a necessary condition that

[pic]

[pic]

[pic]

In this way we have a nonlinear system of equations. Now introduce the notations [pic], where [pic], and try to solve the nonlinear system by Newton's method starting from [pic]. We determine the vector [pic] from the linear system of equations

[pic]

again and again. For [pic] the solution of the linear system is

[pic]

and

[pic]

The next approximations are

[pic] [pic] [pic] [pic] -[pic]

In practice, the following stopping criteria are usually used:

a) the iteration is stopped if [pic], where [pic] is a given positive number;

b) the iteration is stopped if the distance between the last two approximating vectors (in some norm) is small enough.

If we use criterion a) with [pic], then [pic] is the last iteration, since

[pic], [pic], [pic].

II. NEWTON SEARCH FOR A MINIMUM

Definition (Gradient). Assume that [pic] is a function of two variables, [pic], and has partial derivatives [pic] and [pic]. The gradient of [pic], denoted by [pic], is the vector function

[pic].

Definition (Jacobian Matrix). Assume that [pic] are functions of two variables, [pic]; their Jacobian matrix [pic] is

[pic].

Definition (Hessian Matrix). Assume that [pic] is a function of two variables, [pic], and has partial derivatives up to order two. The Hessian matrix [pic] is defined as follows:

[pic].

Lemma 1. For [pic] the Hessian matrix [pic] is the Jacobian matrix for the two functions [pic], i.e.

[pic].

Lemma 2. If the second-order partial derivatives of [pic] are continuous, then the Hessian matrix [pic] is symmetric.

Outline of the Newton Method for Finding a Minimum

Start with the approximation [pic] to the minimum point [pic]. Set [pic].

(i) Evaluate the gradient vector [pic] and the Hessian matrix [pic].

(ii) Compute the next point [pic].

(iii) Perform the termination test for minimization. Set [pic].

Repeat the process.
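A minimal Python sketch of this outline (the callables grad and hess for the gradient and Hessian, the start point p0, and the tolerance are assumptions of the example):

    import numpy as np

    # A sketch of the Newton search for a minimum: each step solves
    # H(p_k) * delta = -grad(p_k), i.e. p_{k+1} = p_k - H(p_k)^{-1} grad(p_k),
    # without forming the inverse explicitly.
    def newton_minimize(grad, hess, p0, tol=1e-8, max_iter=50):
        p = np.asarray(p0, dtype=float)
        for _ in range(max_iter):
            delta = np.linalg.solve(hess(p), -grad(p))
            p = p + delta
            if np.linalg.norm(delta) < tol:   # termination test
                return p
        raise RuntimeError("no convergence within max_iter iterations")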

Example 3. Use the Newton search method to find the minimum of [pic].

Solution 3.

III. THE GOLDEN RATIO SEARCH FOR A MINIMUM

Definition (Unimodal Function). The function [pic] is unimodal on [pic] if there exists a unique number [pic] such that

[pic] is decreasing on [pic],

and

[pic] is increasing on [pic].

Golden Ratio Search

If [pic] is known to be unimodal on [pic], then it is possible to replace the interval with a subinterval on which [pic] takes on its minimum value. One approach is to select two interior points [pic]. This results in [pic]. The condition that [pic] is unimodal guarantees that the function values [pic] and [pic] are less than [pic].

If [pic], then the minimum must occur in the subinterval [pic], and we replace b with d and continue the search in the new subinterval [pic]. If [pic], then the minimum must occur in the subinterval [pic], and we replace a with c and continue the search in the new subinterval [pic]. These choices are shown in Figure 1 below.

[pic] [pic]

If [pic], then squeeze from the right and use the new interval [pic] and the four points [pic]. If [pic], then squeeze from the left and use the new interval [pic] and the four points [pic].

Figure 1. The decision process for the golden ratio search.

 

The interior points c and d of the original interval [pic] must be constructed so that the resulting intervals [pic] and [pic] are symmetrical in [pic]. This requires that [pic], and produces the two equations

(1) [pic],

and

(2) [pic],

where [pic] (to preserve the ordering [pic]).

We want the value of r to remain constant on each subinterval. If r is chosen judiciously, then only one new point e (shown in green in Figure 1) needs to be constructed for the next iteration. Additionally, one of the old interior points (either c or d) will be used as an interior point of the next subinterval, while the other interior point (d or c) will become an endpoint of the next subinterval in the iteration process. Thus, for each iteration only one new point e will have to be constructed and only one new function evaluation [pic] will have to be made. As a consequence, the value r must be chosen carefully to split the interval [pic] into subintervals which have the following ratios:

(3) [pic] and [pic],

and

(4) [pic] and [pic].

If [pic] and only one new function evaluation is to be made in the interval [pic], then we must have

        [pic].  

Use the facts in (3) and (4) to rewrite this equation and then simplify.  

        [pic],  

        [pic],  

        [pic],  

        [pic].  

Now the quadratic formula can be applied, and we get

        [pic].  

The value we seek is [pic], and it is often referred to as the "golden ratio." Similarly, if [pic], then it can be shown that [pic].
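As an illustration, a Python sketch of the search might look as follows (the function f, the interval endpoints, and the tolerance are assumptions of the example; r = (sqrt(5) - 1)/2 is the golden ratio derived above):

    import math

    # A sketch of the golden ratio search on [a, b] for a unimodal f.
    # Only one new interior point (and one new function evaluation)
    # is needed per iteration.
    def golden_search(f, a, b, tol=1e-6):
        r = (math.sqrt(5) - 1) / 2
        c, d = b - r * (b - a), a + r * (b - a)
        fc, fd = f(c), f(d)
        while b - a > tol:
            if fc < fd:              # squeeze from the right: new interval [a, d]
                b, d, fd = d, c, fc
                c = b - r * (b - a)
                fc = f(c)
            else:                    # squeeze from the left: new interval [c, b]
                a, c, fc = c, d, fd
                d = a + r * (b - a)
                fd = f(d)
        return (a + b) / 2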

 

Example 1. Find the minimum of the unimodal function [pic] on the interval [pic].

Solution 1.

 

IV. NELDER-MEAD SEARCH FOR A MINIMUM

Nelder-Mead Method

The Nelder-Mead method is a simplex method for finding a local minimum of a function of several variables. Its discovery is attributed to J. A. Nelder and R. Mead. For two variables, a simplex is a triangle, and the method is a pattern search that compares function values at the three vertices of a triangle. The worst vertex, where [pic] is largest, is rejected and replaced with a new vertex. A new triangle is formed and the search is continued. The process generates a sequence of triangles (which might have different shapes), for which the function values at the vertices get smaller and smaller. The size of the triangles is reduced and the coordinates of the minimum point are found.

The algorithm is stated using the term simplex (a generalized triangle in n dimensions) and will find the minimum of a function of n variables. It is effective and computationally compact.

Initial Triangle [pic]

Let [pic] be the function that is to be minimized. To start, we are given three vertices of a triangle: [pic], for [pic]. The function [pic] is then evaluated at each of the three points: [pic], for [pic]. The subscripts are then reordered so that [pic]. We use the notation

(1) [pic], [pic], and [pic],

to help remember that [pic] is the best vertex, [pic] is good (next to best), and [pic] is the worst vertex.

Midpoint of the Good Side

The construction process uses the midpoint [pic] of the line segment joining [pic] and [pic]. It is found by averaging the coordinates:

(2) [pic].

Reflection Using the Point [pic]

The function decreases as we move along the side of the triangle from [pic] to [pic], and it decreases as we move along the side from [pic] to [pic]. Hence it is feasible that [pic] takes on smaller values at points that lie away from [pic] on the opposite side of the line between [pic] and [pic]. We choose a test point [pic] that is obtained by "reflecting" the triangle through the side [pic]. To determine [pic], we first find the midpoint [pic] of the side [pic]. Then draw the line segment from [pic] to [pic] and call its length d. This last segment is extended a distance d through [pic] to locate the point [pic]. The vector formula for [pic] is

(3) [pic].

Expansion Using the Point [pic]

If the function value at [pic] is smaller than the function value at [pic], then we have moved in the correct direction toward the minimum. Perhaps the minimum is just a bit farther than the point [pic]. So we extend the line segment through [pic] and [pic] to the point [pic]. This forms an expanded triangle [pic]. The point [pic] is found by moving an additional distance d along the line joining [pic] and [pic]. If the function value at [pic] is less than the function value at [pic], then we have found a better vertex than [pic]. The vector formula for [pic] is

(4) [pic].

Contraction Using the Point [pic]

If the function values at [pic] and [pic] are the same, another point must be tested. Perhaps the function is smaller at [pic], but we cannot replace [pic] with [pic] because we must have a triangle. Consider the two midpoints [pic] and [pic] of the line segments [pic] and [pic], respectively. The point with the smaller function value is called [pic], and the new triangle is [pic].

Note: The choice between [pic] and [pic] might seem inappropriate for the two-dimensional case, but it is important in higher dimensions.

Shrink Toward [pic]

If the function value at [pic] is not less than the value at [pic], the points [pic] and [pic] must be shrunk toward [pic]. The point [pic] is replaced with [pic], and [pic] is replaced with [pic], which is the midpoint of the line segment joining [pic] with [pic].

Logical Decisions for Each Step

A computationally efficient algorithm should perform function evaluations only if needed. In each step, a new vertex is found, which replaces [pic]. As soon as it is found, further investigation is not needed, and the iteration step is completed. The logical details for two-dimensional cases are given in the proof.
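A compact Python sketch of one iteration of this decision logic might look as follows (the vertex names B, G, W and the points M, R, E, C follow the construction above; the concrete ordering of tests follows one common two-dimensional variant, so details may differ slightly):

    import numpy as np

    # A sketch of one Nelder-Mead step in two dimensions, with
    # B (best), G (good), W (worst) as numpy arrays and f the function:
    # M = (B + G)/2, R = 2M - W (reflection), E = 2R - M (expansion),
    # C = (W + M)/2 (contraction); the shrink replaces W and G.
    def nelder_mead_step(f, B, G, W):
        M = (B + G) / 2.0
        R = 2.0 * M - W                      # reflected point
        if f(R) < f(G):                      # case (i): reflect or extend
            if f(B) < f(R):
                W = R
            else:
                E = 2.0 * R - M              # expanded point
                W = E if f(E) < f(B) else R
        else:                                # case (ii): contract or shrink
            if f(R) < f(W):
                W = R
            C = (W + M) / 2.0                # contraction point
            if f(C) < f(W):
                W = C
            else:
                W = (W + B) / 2.0            # shrink toward B
                G = M
        B, G, W = sorted([B, G, W], key=f)   # reorder: f(B) <= f(G) <= f(W)
        return B, G, W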

Example 1. Use the Nelder-Mead method to find the minimum of [pic].

Solution 1.

 

V. NUMERICAL METHODS FOR DIFFERENTIAL EQUATIONS AND SYSTEMS OF EQUATIONS

V.1. RUNGE-KUTTA METHODS

Let the initial value problem

[pic]

be given, and suppose that we would like to determine (approximating) values of the solution function in the finite interval [pic]. (It is well known that this problem is well posed if [pic] and [pic] are continuous in a large enough neighbourhood of the point [pic].)

The basic idea of the so-called one-step methods (including Runge-Kutta methods) is the following: if we have a formula for determining a good approximating value to [pic] by using the value [pic], then starting from [pic] and using the previous (approximating) value at every step, we can give approximating values to the solution function [pic] at the points [pic].

[pic]

From the equality

[pic]

it seems reasonable to try with a formula

[pic]

which is the general form of the so-called one-step explicit formulas. (Of course, we would like to choose a function [pic] such that [pic] is sufficiently large.) The speciality of Runge-Kutta methods is that the function value [pic] is given by several values of [pic] obtained recursively.

In detail, we determine the members

[pic]

and using these values we give an approximating value in the form

[pic].

We have a concrete formula after choosing the number [pic] of members [pic] and determining the coefficients [pic]. When determining the coefficients, we harmonize the Taylor series belonging to [pic] of the accurate value [pic] on the left side and the Taylor series belonging to [pic] of the approximating value [pic] on the right side up to the largest possible power of the step length [pic].

Look at three different cases:

a) Let [pic]. Then

[pic]

that is,

[pic]

The Taylor series belonging to [pic] of the left side is:

[pic]

the Taylor series belonging to [pic] of the right side is itself, so the Taylor series are harmonized concerning [pic] to power 1, if [pic]. The obtained formula

[pic]

is called Euler's formula. The so-called local error of this formula concerning [pic] (which gives the error after one step, starting from an accurate value) is [pic] (where [pic]); in short form, this local error is [pic].
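As a sketch (assuming the right-hand side f, the initial values x0 and y0, the step length h, and the number of steps n are supplied), Euler's formula can be coded as:

    # A minimal sketch of Euler's formula.
    def euler(f, x0, y0, h, n):
        x, y = x0, y0
        values = [(x, y)]
        for _ in range(n):
            y = y + h * f(x, y)      # y_{k+1} = y_k + h * f(x_k, y_k)
            x = x + h
            values.append((x, y))
        return values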

b) Let [pic]. Then

[pic]

that is

[pic]

The Taylor series belonging to [pic] of the left side is

[pic]

where it was used that

[pic]

The Taylor series belonging to [pic] of the right side is

[pic]

The two Taylor series are harmonized concerning [pic] to power 2, if

[pic]

So, we have three equations for four unknowns, and we can get infinitely many solutions. E.g. [pic] is a possible choice, and the formula is

[pic]

which is called Heun's formula (or trapezoidal formula). The local error [pic] is characterized by [pic] here.
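A sketch of this formula, with the same assumed inputs as before and the standard choice of stages for Heun's formula:

    # A minimal sketch of Heun's (trapezoidal) formula:
    # y_{k+1} = y_k + (h/2) * (k1 + k2).
    def heun(f, x0, y0, h, n):
        x, y = x0, y0
        values = [(x, y)]
        for _ in range(n):
            k1 = f(x, y)
            k2 = f(x + h, y + h * k1)
            y = y + h * (k1 + k2) / 2.0
            x = x + h
            values.append((x, y))
        return values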

c) Let [pic]. Then we can derive 11 equations for the 13 unknowns by harmonizing the Taylor series concerning [pic] to power 4. (For harmonizing to power 5, the 13 parameters are not enough.) A possible choice is

[pic]

[pic]

and the formula is

[pic]

which is called the classical Runge-Kutta formula. The local error [pic] can be characterized by [pic] now.
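A sketch with the same assumed inputs, using the standard fourth-order coefficients of the classical Runge-Kutta formula:

    # A minimal sketch of the classical Runge-Kutta formula:
    # y_{k+1} = y_k + (h/6) * (k1 + 2*k2 + 2*k3 + k4).
    def rk4(f, x0, y0, h, n):
        x, y = x0, y0
        values = [(x, y)]
        for _ in range(n):
            k1 = f(x, y)
            k2 = f(x + h / 2, y + h * k1 / 2)
            k3 = f(x + h / 2, y + h * k2 / 2)
            k4 = f(x + h, y + h * k3)
            y = y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
            x = x + h
            values.append((x, y))
        return values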

We have already mentioned that each Runge-Kutta type formula has the form

[pic]

after substituting the expressions [pic] into the last "equation" (see the general form of one-step explicit methods). Hence, using such a formula, first we determine the approximating value

[pic]

concerning [pic] starting from the accurate initial couple [pic]. Then we determine the approximating value

[pic]

concerning [pic] starting from the approximating couple [pic], etc.

So, we can give the process by the formula

[pic].

DEFINITION

The quantity

[pic]

where [pic], is called the local error belonging to the place [pic]. The quantity

[pic]

(where [pic] is obtained by the above-mentioned formula) is called the accumulated error belonging to the place [pic].

THEOREM

If [pic] is smooth enough in a suitable neighbourhood of the point [pic] in the case of the problem [pic], then for the Euler, Heun, and Runge-Kutta methods (which have [pic] [pic] local errors, respectively) the accumulated error can be given by [pic] in the finite interval [pic].

Numerical example

1. Given the differential equation

[pic]

Determine the approximating values of the particular solution which passes through the point [pic], by the classical Runge-Kutta formula, on the interval [pic], if [pic].

The results:

For [pic]

[pic]

[pic]

[pic]

[pic]

[pic]

[pic]

[pic]

[pic]

V.2. FINITE DIFFERENCE METHODS

The basic idea of these methods is the following: we replace the derivatives in the (ordinary or partial) differential equation and in the initial and/or boundary conditions by expressions from approximating differentiation formulas. In this way we can get an "approximating system of equations" for the required function values. We illustrate the possibilities on two well-known problems.

a) Let the boundary value problem

[pic]

be given, where [pic], [pic] and [pic] are at least continuous on [pic]. Take an equidistant partition on the interval [pic]

[pic]

by the mesh-points [pic].

If the solution function [pic] is four times continuously differentiable on [pic] then by using the values [pic] belonging to the places [pic] we can get

[pic]

where [pic]. Introducing the notation [pic], substituting the previous expression into the differential equation for [pic] and arranging the equations with respect to [pic], we can obtain the system of linear equations

[pic]

for the required function values [pic]. Introducing the notations

[pic]

[pic]

our linear system can be written in the form

[pic].

During the practical computation we cannot determine the elements of the vector [pic]; therefore we omit this term (the linear system is truncated) and determine the solution vector [pic] of the linear system

[pic],

hoping that [pic] lies near enough to [pic]. (Both linear systems have a unique solution because the matrix [pic] is diagonally dominant due to [pic].)
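Before the theorem below, here is a Python sketch of this construction for the model problem y'' = p(x) y' + q(x) y + r(x), y(a) = alpha, y(b) = beta (an assumed standard form, since the concrete coefficients above are not reproduced here):

    import numpy as np

    # Central differences for y'' and y' give, at each inner mesh-point,
    # (1 + h*p/2) y_{i-1} - (2 + h^2*q) y_i + (1 - h*p/2) y_{i+1} = h^2 * r,
    # i.e. a tridiagonal linear system for the inner values y_1, ..., y_{n-1}.
    def fd_bvp(p, q, r, a, b, alpha, beta, n):
        h = (b - a) / n
        x = np.linspace(a, b, n + 1)
        A = np.zeros((n - 1, n - 1))
        rhs = np.zeros(n - 1)
        for i in range(1, n):
            lo = 1 + h * p(x[i]) / 2
            di = -(2 + h**2 * q(x[i]))
            up = 1 - h * p(x[i]) / 2
            j = i - 1
            A[j, j] = di
            if j > 0:
                A[j, j - 1] = lo
            else:
                rhs[j] -= lo * alpha     # known boundary value y(a) = alpha
            if j < n - 2:
                A[j, j + 1] = up
            else:
                rhs[j] -= up * beta      # known boundary value y(b) = beta
            rhs[j] += h**2 * r(x[i])
        y_inner = np.linalg.solve(A, rhs)
        return x, np.concatenate(([alpha], y_inner, [beta]))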

THEOREM

If [pic], then [pic].

Numerical examples

1. Give approximating values to the solution function of the boundary value problem

If we use the former approximating formula concerning [pic] at the inner meshpoints [pic] [pic] [pic], then

[pic]

Taking into account the boundary values [pic] [pic] and arranging the equations, we get the linear system

[pic]

This has the unique solution

[pic]

So, the solution is:

[pic]

Since the solution function is [pic], the approximating formula used is exact, i.e. [pic] = [pic] in this problem.

 

VI. MATHEMATICAL NOTIONS

1. DEFINITION

The set [pic] is called a linear vector space if addition and multiplication by a real number are defined on the set [pic] and these operations have the following properties:

[pic]

The elements of such a set [pic] are called vectors.

Examples

a) Let

[pic]

where [pic] are real numbers, that is, [pic] is the set of such "objects" which have [pic] "coordinates". If addition and multiplication by a number are defined by the traditional rules, that is,

[pic]

then the mentioned properties can be proved easily. This linear space is usually denoted by [pic].

b) Let the interval [pic] be given and denote by [pic] the set of continuous functions on [pic]. If the addition [pic] and the multiplication by a number [pic] are defined by the traditional rules, then the required properties can be proved easily. This linear space is usually denoted by [pic].

2. DEFINITION

The elements [pic] (where L is a vector space and the number of the given vectors is finite) form a linearly independent system if

[pic]

Examples

a) In the vector space [pic] the vectors [pic] and [pic] form a linearly independent system, since the zero element [pic] can only be obtained in the form [pic].

b) In the vector space [pic] the vectors [pic] form a linearly independent system, since the zero element (the identically zero function) can only be obtained in the form [pic].

3. DEFINITION

The linearly independent system formed by the vectors [pic] is a basis of [pic] if an arbitrary element of [pic] can be obtained as a linear combination of the elements [pic], that is, in the form

[pic].

Examples

a) In the linear space [pic] the vectors [pic] form a well-known basis. Another possible basis is given e.g. by vectors [pic].

b) Denote by [pic] the set of polynomials of degree at most 2. The traditional addition and multiplication by a number do not lead out of the set, so [pic] is a vector space.

The elements (functions, vectors) [pic] form a basis of [pic], because they form a linearly independent system and an arbitrary polynomial of degree at most 2 can be obtained in the form

[pic].

Another basis is given e.g. by elements [pic].

4. DEFINITION

If we can assign real numbers [pic] to all elements [pic] of the linear space [pic], respectively, in such a way that

[pic]

then the vector space [pic] is called a normed space and the real number [pic] is called the norm of the vector [pic].

Examples

a) Define the quantities

[pic]

for each element [pic] of the linear space [pic]. These quantities define three different norms which are called norm number 1, norm number 2 (or Euclidean norm) and norm number infinity.

b) Denote by [pic] the set of real (square) matrices of order [pic]. [pic] forms a linear space with the traditional addition and multiplication by a number. The quantities

[pic]

[pic]

where [pic] is the largest eigenvalue of the matrix [pic] and

[pic]

define three different norms. The norms belonging to the same notation (the same name) in a) and b) are called compatible, because such pairs also have the properties

[pic]

for any [pic], besides the properties appearing in the definition. By the latter property, we could have defined the above-mentioned norms of the matrix [pic] as the least upper bound of the corresponding vector norm [pic] under the condition [pic]. In these lecture notes we assume compatible norms whenever [pic] is used. (A small numerical illustration follows the examples below.)

c) In the vector space [pic] we can define norms e.g. by equalities

[pic]

Here we mention that the distance between elements [pic] of a normed space (just as in the traditional case) can be measured by the quantity

[pic].
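As a small illustration of these norms (using numpy's built-in vector and matrix norms; the example data x, A, and v are arbitrary assumptions):

    import numpy as np

    # Vector norms number 1, 2 and infinity, and the corresponding
    # matrix norms (for a matrix, ord=2 is the spectral norm).
    x = np.array([1.0, -2.0, 3.0])
    A = np.array([[2.0, -1.0], [0.5, 3.0]])
    for p in (1, 2, np.inf):
        print("vector norm", p, "=", np.linalg.norm(x, ord=p))
        print("matrix norm", p, "=", np.linalg.norm(A, ord=p))

    # Compatibility: ||A v|| <= ||A|| * ||v|| for norms with the same name.
    v = np.array([1.0, -1.0])
    print(np.linalg.norm(A @ v, 2) <= np.linalg.norm(A, 2) * np.linalg.norm(v, 2))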

5. DEFINITION

If we can assign a real number c to any two elements [pic] of the linear space [pic] in such a way that

[pic]

then the vector space [pic] is called a Euclidean space and the real number [pic] is called the scalar product of the elements (vectors) [pic] and [pic].

Examples

a) Let [pic] and [pic] be two arbitrary elements of [pic]. Then the quantity

[pic]

defines the traditional scalar product on [pic].

b) If [pic] and [pic] are two arbitrary elements of [pic], then the quantity

[pic]

defines a scalar product.

6. DEFINITION

The matrix [pic] of order [pic] is called a positive definite matrix if

[pic].

7. DEFINITION

The matrix [pic] of order [pic] is called a negative definite matrix if

[pic].

8. DEFINITION

The matrix [pic] of order [pic] is called a positive semidefinite matrix if

[pic].

9. DEFINITION

The matrix [pic] of order [pic] is called a negative semidefinite matrix if

[pic].
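For symmetric matrices these properties can be checked through the eigenvalues; a small sketch (the test matrix is an arbitrary assumption):

    import numpy as np

    # For a symmetric matrix, definiteness is decided by the signs of
    # the eigenvalues: all > 0 means positive definite, all >= 0 means
    # positive semidefinite, and analogously for the negative cases.
    def definiteness(A):
        eig = np.linalg.eigvalsh(A)
        if np.all(eig > 0):
            return "positive definite"
        if np.all(eig >= 0):
            return "positive semidefinite"
        if np.all(eig < 0):
            return "negative definite"
        if np.all(eig <= 0):
            return "negative semidefinite"
        return "indefinite"

    print(definiteness(np.array([[2.0, 0.0], [0.0, 1.0]])))   # positive definite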

10. DEFINITION

Denote by [pic] the limit of the convergent sequence of real numbers [pic]. If there exist numbers [pic] such that for arbitrary [pic]

[pic]

where [pic], then we say: the sequence [pic] converges to [pic] in order [pic]. If only the left (right) side inequality is fulfilled, then we say: the order of convergence is at most (at least) [pic].

Examples

a) For [pic] (quadratic convergence), which often appears in numerical methods, we illustrate the rapidity of convergence on a simple example. Suppose that [pic] and the distance between [pic] and [pic] is 0.1. Then (by using the right-side inequality):

[pic], which means a surprising rapidity compared with traditional sequences. (A small numerical illustration is given after example b) below.)

b) If we use the vector sequence [pic] instead of the number sequence [pic] and a norm instead of the absolute value, then the previous definition can be repeated with the inequality

[pic]
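To illustrate the quadratic case in example a), here is a small sketch (Newton's method for x^2 - 2 = 0; the starting point 1.5 is an arbitrary assumption):

    import math

    # Quadratic convergence in practice: the error is roughly squared in
    # every step, so the number of correct digits roughly doubles.
    x, z = 1.5, math.sqrt(2)
    for k in range(5):
        print(k, abs(x - z))
        x = x - (x * x - 2) / (2 * x)   # Newton step for f(x) = x^2 - 2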

11. DEFINITION

Let [pic] and [pic]. Then the equality

[pic]

(where [pic] is read as "big O") means that there exists a positive constant [pic] such that

[pic], [pic].

Examples

a) Let [pic] be the limit of the sequence [pic]. Then the equality

[pic], where [pic],

means that the error of [pic], where [pic] is independent of [pic]. In other words, it means that "the error is proportional to [pic]". This type of error decrease can be seen e.g. at the power method for eigenvalues.

b) Let [pic] be the previous sequence. Then the equality (supposing it is true)

[pic]

(where [pic] is a constant and [pic] if [pic]) means that "the error is proportional to [pic]". This type of error decrease can be seen e.g. at Runge-Kutta methods.

c) Sometimes we use the sign [pic] for the truncation of a polynomial. E.g. the equality

[pic], where [pic],

means that

[pic],

where [pic] is independent of [pic]. Such an application of the sign [pic] can be found e.g. at elimination methods.
