CONTENTS

I. NUMERICAL METHODS FOR NONLINEAR EQUATIONS AND SYSTEMS OF EQUATIONS

I.1. NEWTON'S METHOD FOR EQUATIONS

I.2. NEWTON'S METHOD FOR SYSTEMS OF EQUATIONS

II. NEWTON SEARCH FOR A MINIMUM

III. THE GOLDEN RATIO SEARCH FOR A MINIMUM

IV. NELDER-MEAD SEARCH FOR A MINIMUM

V. NUMERICAL METHODS FOR DIFFERENTIAL EQUATIONS AND SYSTEMS OF EQUATIONS

V.1 RUNGE-KUTTA METHODS

V.2 FINITE DIFFERENCE METHODS

I. NUMERICAL METHODS FOR NONLINEAR EQUATIONS AND SYSTEMS OF EQUATIONS

Only rarely can we solve a nonlinear equation or system of equations in the classical way (by algebraic transformations). For example, all of the equations

[pic]

and both of the systems of equations

|[pic] |[pic] |

|[pic] |[pic] |

are "hopeless cases" on the classical way, although they are very simple problems compared with practical problems.

In numerical analysis, the nonlinear equation [pic] ([pic]) or the nonlinear system of equations [pic][pic] (where [pic]) is sometimes said to be solved if we can give all of the solutions in a given one- or n-dimensional interval, and at other times it is said to be solved if we can give one solution in a given interval.

We most often need to solve nonlinear equations or systems of equations in this sense:

a) in the cases of other mathematical problems (e.g. at determining extremal points of functions; at solving differential equations by finite difference or finite element methods);

b) at solving engineering problems (e.g. for avoiding iterative geometrical constructions in design).

The two methods discussed in this chapter have a common feature: they are so-called iterative methods, namely we can come closer and closer to a solution by solving simple problems again and again. Therefore a solution z of an equation, or a solution vector z of a system of equations, is given as the limit of a convergent sequence of numbers or vectors.

It often happens that we can solve a nonlinear equation or system of equations only in two steps. First we have to use a safe (but slow) method (a "starting method") to get a good enough initial approximation for a quick method (a "finishing method"), and then we can determine the solution with high accuracy in a few iterations by the finishing method.

I.1. NEWTON'S METHOD FOR EQUATIONS

Here we try to approximate a root z of the equation [pic] by using the tangent line of f at successive points. Starting from [pic], the left figure shows a successful trial and the right figure shows an unsuccessful one:

[pic]

So the approximation [pic] of z is obtained from the Taylor series of the function [pic], keeping only its constant and linear terms. Hence [pic] is the solution of the linear equation

[pic].

If this equation has a unique solution for [pic], then

x_{k+1} = x_k - f(x_k) / f'(x_k)

and this is called the formula of Newton's method for the equation [pic].
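To make the iteration concrete, here is a minimal Python sketch of the formula (the function names, tolerance, and the sample equation are illustrative assumptions, not taken from the text):

```python
# A minimal sketch of Newton's method for f(x) = 0, assuming f is
# differentiable and the starting point satisfies the conditions of
# the theorem below.  All names and values here are illustrative.

def newton(f, df, x0, tol=1e-10, max_iter=50):
    """Iterate x_{k+1} = x_k - f(x_k)/f'(x_k) until |f(x_k)| < tol."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        dfx = df(x)
        if dfx == 0:
            raise ZeroDivisionError("f'(x) vanished; the Newton step is undefined")
        x = x - fx / dfx
    return x  # may not have reached the requested tolerance

# Example: x^3 - x - 2 = 0 has a root near 1.5214 in [1, 2]
root = newton(lambda x: x**3 - x - 2, lambda x: 3 * x**2 - 1, x0=2.0)
print(root)
```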

THEOREM.

Let the function [pic] be twice continuously differentiable on [pic]. Assume that there is a root of the equation [pic] on the interval [pic] and [pic] [pic] [pic]. Then Newton's iteration converges (starting from [pic], if [pic] and from [pic], if [pic]) to the unique solution on [pic] and

[pic]

where

[pic], [pic] [pic].

Numerical example

Determine the root of the equation

[pic]

by Newton's method and with error bound [pic].

(1) We can find a suitable interval (on which the conditions of our theorem are satisfied) graphically and/or by tabulation. If we write our equation in the form

[pic]

and sketch the functions on the left and right sides (roughly), then we get a (rough) estimate of the root:

[pic]

We can see that the equation has a unique root z and that it lies in the interval (1/3, 2). Moreover, by the table

[pic]

z is on the interval (2/3, 1), too.

(2) Now let us examine the conditions concerning the convergence of the method

a) [pic] [pic] [pic]

b) Since [pic] and [pic] [pic], so on a small neighbourhood of z the graph of [pic] is the following:

[pic]

Hence [pic] is the right choice.

c) Since [pic] and [pic] is strictly monotone decreasing on [pic], [pic] Furthermore, [pic] and it is strictly monotone increasing on [pic] (because of f'''(x)>0). Therefore, [pic] is strictly monotone decreasing and [pic]

(3) The computation of the iterations and the estimation of the errors:

[pic]

[pic], [pic]

[pic], [pic]

[pic], [pic]

As [pic] is fulfilled, [pic] can be considered a suitable approximation of the root. (If we used rougher bounds for m and M, then the value M/2m would be larger and we might need more iterations to guarantee the same accuracy.)

Problems

1.) Show that the interval [1,2] is suitable for the equation [pic] using Newton's method. How many iterations are needed to obtain the root with accuracy [pic]?

2.) Give a suitable interval and initial value for using Newton's method in case of equation [pic].

3.) We would like to compute the smaller positive root of the equation [pic].

Is the interval [0.5,1.5] suitable for using Newton's method? If it is not suitable, give a correct interval. Compute two iterations and give an error bound for the second iteration.

4.) Determine the roots of the equation [pic] with accuracy [pic].

5.) How is it possible to compute the value [pic] by Newton's method? (Work with C = 29.)

a) Which nonlinear equation do we have to solve?

b) Show that [pic] is a correct choice.

c) Is it sufficient to compute 3 iterations if the error bound is [pic]?

I.2. NEWTON'S METHOD FOR SYSTEMS OF EQUATIONS

Let the nonlinear system of equations

[pic]

be given. In short form:

[pic].

Starting from the initial vector [pic], we construct [pic] in such a way that the functions [pic] are made "linear" by the first (n + 1) terms of their Taylor series at [pic]. This means that the surfaces are replaced by their tangent planes at [pic]. Hence we can get [pic] from the system of linear equations

[pic]

[pic]

[pic]

if it has a unique solution. Using matrices we can write

[pic]

Shortly

[pic],

where [pic] is called the Jacobian matrix of the function system [pic]. If there exists a unique solution (det [pic]), then we can express [pic] from this equation by using the inverse matrix of [pic]:

[pic]

So the formula of Newton's method (writing [pic] and [pic] in place of [pic] and [pic], respectively) is

[pic]

The following theorem gives a fundamental characterization of the method.

THEOREM

Suppose that each of the partial derivatives [pic] [pic] is continuous on a neighbourhood of a solution [pic] and that the matrix [pic] is invertible. Then Newton's method converges when started sufficiently close to [pic]. If the partial derivatives [pic] are also continuous on the above-mentioned neighbourhood, then the error decreases at least quadratically, that is,

[pic]

where c is a positive constant.

REMARK

It is not practical to compute [pic] by the formula

[pic]

(here we would have to compute the inverse of a matrix of order n in every step); instead, it is practical to solve the original system of linear equations for the unknowns [pic] and then obtain the values [pic] immediately. Solving a system of linear equations requires less work than computing the inverse matrix.
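As a Python sketch of this remark (using NumPy; the example system, the starting vector, and all names are illustrative assumptions):

```python
import numpy as np

# A minimal sketch of Newton's method for a system F(x) = 0: at each
# step we solve J(x_k) * delta = -F(x_k) for the correction delta
# instead of inverting J, as the remark above recommends.

def newton_system(F, J, x0, tol=1e-10, max_iter=50):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx, ord=np.inf) < tol:
            break
        delta = np.linalg.solve(J(x), -Fx)  # one linear solve per step
        x = x + delta
    return x

# Illustrative system: x^2 + y^2 = 4, x*y = 1
F = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[0] * v[1] - 1.0])
J = lambda v: np.array([[2 * v[0], 2 * v[1]],
                        [v[1], v[0]]])
print(newton_system(F, J, [2.0, 0.5]))  # converges to about (1.9319, 0.5176)
```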

Numerical example.

Let a circle and a paraboloid be given by the formulas

[pic] [pic]

[pic] [pic], and [pic]

[pic] [pic].

respectively. Determine the distance between the given curve and surface.

Geometrically we have to find the shortest way starting from a circle point ending at a paraboloid point:

[pic]

The square of the distance

[pic]

For the extremum points it is a necessary condition that

[pic]

[pic]

[pic]

This way we have a nonlinear system of equations. Now introduce the notations [pic], where [pic] and try to solve the nonlinear system by Newton's method starting from [pic]. We determine the vector [pic] from the linear system of equations

[pic]

again and again. For [pic] the solution of the linear system is

[pic]

and

[pic]

The next approximations are

[pic] [pic] [pic] [pic] -[pic]

In practice the following stopping criteria are usually used:

a) the iteration is stopped if [pic], where [pic] is a given positive number;

b) the iteration is stopped if the distance between the last two approximating vectors (in some norm) is small enough.

If we use criterion a) with [pic], then [pic] is the last iteration, since

[pic], [pic], [pic].

Problems.

1. a) Give the suitable linear system belonging to the problem

[pic]

b) Find a root of the nonlinear system, compute three iterations starting from [pic].

2. a) Determine the first two approximating vectors starting from [pic] if

[pic] [pic].

(Solution:[pic])

b)

[pic]

(Solution:[pic])

II. NEWTON SEARCH FOR A MINIMUM

Newton's Method

The quadratic approximation method for finding a minimum of a function of one variable generated a sequence of second-degree Lagrange polynomials, and used them to approximate where the minimum is located. It was implicitly assumed that near the minimum, the shape of the quadratics approximated the shape of the objective function [pic]. The resulting sequence of minima of the quadratics produced a sequence converging to the minimum of the objective function [pic]. Newton's search method extends this process to functions of n independent variables: [pic]. Starting at an initial point [pic], a sequence of second-degree polynomials in n variables can be constructed recursively. If the objective function is well-behaved and the initial point is near the actual minimum, then the sequence of minima of the quadratics will converge to the minimum of the objective function. The process will use both the first- and second-order partial derivatives of the objective function. Recall that the gradient method used only the first partial derivatives. It is to be expected that Newton's method will be more efficient than the gradient method.

Background

Now we turn to the minimization of a function [pic] of n variables, where [pic] and the partial derivatives of [pic] are accessible. Although the Newton search method will turn out to have a familiar form, for illustration purposes we emphasize the two-dimensional case when [pic]. The extension to n dimensions is discussed in the hyperlink.

Definition (Gradient). Assume that [pic] is a function of two variables, [pic], and has partial derivatives [pic] and [pic]. The gradient of [pic], denoted by [pic], is the vector function

[pic].

Definition (Jacobian Matrix). Assume that [pic] are functions of two variables, [pic]. Their Jacobian matrix [pic] is

[pic].

Definition (Hessian Matrix). Assume that [pic] is a function of two variables, [pic], and has partial derivatives up to order two. The Hessian matrix [pic] is defined as follows:

[pic].

Lemma 1. For [pic] the Hessian matrix [pic] is the Jacobian matrix for the two functions [pic], i.e.

[pic].

Lemma 2. If the second-order partial derivatives of [pic] are continuous, then the Hessian matrix [pic] is symmetric.

Example 1. Find the gradient vector and Hessian matrix at the point [pic]for the function [pic].

Solution 1.

Taylor Polynomial for f(x,y)

Assume that [pic] is a function of two variables, [pic], and has partial derivatives up to order two. There are two ways to write the quadratic approximation to f(x,y), based on series and on matrices, respectively. Assume that the point of expansion is [pic] and use the notation [pic] and [pic]; then

[pic].

The Taylor polynomial using series notation is

[pic]

The Taylor polynomial using vector and matrix notation is

[pic]

The latter can be written in the form

[pic].

Using the vector notations [pic], [pic] and [pic], it looks like

[pic].

The above formula is also the expansion of [pic]centered at the point [pic]with [pic].

Example 2. Calculate the second-degree Taylor polynomial of [pic]centered at the point [pic].

Solution 2.

Newton's Method for Finding a Minimum

Now we turn to the minimization of a function [pic] of n variables, where [pic] and the partial derivatives of [pic] are accessible. Assume that the first and second partial derivatives of [pic] exist and are continuous in a region containing the point [pic], and that there is a minimum at the point [pic]. The quadratic polynomial approximation to [pic] is

[pic]

A minimum of [pic]occurs where [pic].

Using the notation [pic]and [pic]and the symmetry of [pic], we write

[pic]

The gradient [pic]is given by the calculation

[pic]

Thus the expression for [pic]can be written as

[pic].

If [pic] is close to the point [pic] (where a minimum of f occurs), then [pic] is invertible, the above equation can be solved for [pic], and we have

[pic].

This value of [pic]can be used as the next approximation to [pic]and is the first step in Newton's method for finding a minimum

[pic].

Lemma. If column vectors are preferred to row vectors, then [pic] is given by the computation

[pic].

Remark. This is equivalent to a Newton-Raphson root finding problem: Given the vector function [pic]find the root of the equation [pic]. For this problem the Newton-Raphson formula is known to be

[pic],

where our previous work with Newton-Raphson method used column vectors [pic]and [pic]. Here we use [pic]and Lemma 1 gives the relationship [pic].

Outline of the Newton Method for Finding a Minimum

Start with the approximation [pic]to the minimum point [pic]. Set [pic].

(i) Evaluate the gradient vector [pic]and Hessian matrix [pic]

(ii) Compute the next point [pic].

(iii) Perform the termination test for minimization. Set [pic].

Repeat the process.

In equation (ii) the inverse of the Hessian matrix is used to solve for [pic]. It would be better to solve the system of linear equations represented by equation (ii) with a linear system solver rather than a matrix inversion. The reader should realize that the inverse is primarily a theoretical tool and the computation and use of inverses is inherently inefficient.
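A minimal Python sketch of the outline above, carrying out step (ii) with a linear solve rather than a matrix inversion, as just recommended (the objective, its gradient and Hessian, and the starting point are illustrative assumptions):

```python
import numpy as np

def newton_minimize(grad, hess, p0, tol=1e-10, max_iter=50):
    """Newton search for a minimum: solve H(p_k) * step = grad f(p_k)
    and set p_{k+1} = p_k - step; stop when the gradient is small."""
    p = np.asarray(p0, dtype=float)
    for _ in range(max_iter):
        g = grad(p)
        if np.linalg.norm(g) < tol:           # termination test (iii)
            break
        p = p - np.linalg.solve(hess(p), g)   # steps (i)-(ii)
    return p

# Illustrative objective: f(x, y) = (x - 1)**2 + 2*(y + 2)**2
grad = lambda p: np.array([2.0 * (p[0] - 1.0), 4.0 * (p[1] + 2.0)])
hess = lambda p: np.array([[2.0, 0.0],
                           [0.0, 4.0]])
print(newton_minimize(grad, hess, [0.0, 0.0]))  # about [1, -2]
```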

Example 3. Use the Newton search method to find the minimum of [pic].

Solution 3.

Example 4. Use the Newton method to find the minimum of [pic]

Rosenbrock's parabolic valley, circa 1960.

Solution 4.

Example 5. Use the Newton method to find the minimum of [pic].

Looking at your graphs, estimate the location of the local minima.

Solution 5.

Example 6. Given the initial point [pic], and [pic]. Use the Newton search method to find [pic]and [pic].

Solution 6.

III. THE GOLDEN RATIO SEARCH FOR A MINIMUM

Bracketing Search Methods

An approach for finding the minimum of [pic] in a given interval is to evaluate the function many times and search for a local minimum. To reduce the number of function evaluations it is important to have a good strategy for determining where [pic] is to be evaluated. Two efficient bracketing methods are the golden ratio and Fibonacci searches. To use either bracketing method for finding the minimum of [pic], a special condition must be met to ensure that there is a proper minimum in the given interval.

Definition (Unimodal Function). The function [pic] is unimodal on [pic] if there exists a unique number [pic] such that

[pic] is decreasing on [pic]

and

[pic] is increasing on [pic].

Golden Ratio Search

If [pic] is known to be unimodal on [pic], then it is possible to replace the interval with a subinterval on which [pic] takes on its minimum value. One approach is to select two interior points [pic]. This results in [pic]. The condition that [pic] is unimodal guarantees that the function values [pic] and [pic] are less than [pic].

If [pic], then the minimum must occur in the subinterval [pic], and we replace b with d and continue the search in the new subinterval [pic]. If [pic], then the minimum must occur in the subinterval [pic], and we replace a with c and continue the search in the new subinterval [pic]. These choices are shown in Figure 1 below.

[pic]    [pic]

If [pic], then squeeze from the right and use the new interval [pic] and the four points [pic]. If [pic], then squeeze from the left and use the new interval [pic] and the four points [pic].

Figure 1. The decision process for the golden ratio search.

 

The interior points c and d of the original interval [pic] must be constructed so that the resulting intervals [pic] and [pic] are symmetrical in [pic]. This requires that [pic], and produces the two equations

(1)        [pic],  

    and

(2)        [pic],  

    where    [pic]   (to preserve the ordering  [pic]).

We want the value of r to remain constant on each subinterval. If r is chosen judiciously, then only one new point e (shown in green in Figure 1) needs to be constructed for the next iteration. Additionally, one of the old interior points (either c or d) will be used as an interior point of the next subinterval, while the other interior point (d or c) will become an endpoint of the next subinterval in the iteration process. Thus, for each iteration only one new point e will have to be constructed and only one new function evaluation [pic] will have to be made. As a consequence, the value r must be chosen carefully to split the interval [pic] into subintervals which have the following ratios:

(3)        [pic]    and    [pic],  

    and

(4)        [pic]    and    [pic].  

If [pic] and only one new function evaluation is to be made in the interval [pic], then we must have

        [pic].  

Use the facts in (3) and (4) to rewrite this equation and then simplify.  

        [pic],  

        [pic],  

        [pic],  

        [pic].  

Now the quadratic formula can be applied and we get

        [pic].  

The value we seek is r = (√5 - 1)/2 ≈ 0.618034, and it is often referred to as the "golden ratio." Similarly, if [pic], then it can be shown that [pic].

  

 

Algorithm (Golden Ratio Search for a Minimum). To numerically approximate the minimum of [pic] on the interval [pic] by using a golden ratio search: proceed with the method only if [pic] is a unimodal function on the interval [pic], and terminate the process after max iterations have been completed.
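A minimal Python sketch of this algorithm, assuming [pic] is unimodal on the given interval (the tolerance-based stopping test and the sample function are illustrative assumptions):

```python
import math

def golden_search(f, a, b, tol=1e-8, max_iter=200):
    """Golden ratio search for the minimum of a unimodal f on [a, b].
    After the start-up, each iteration needs only one new evaluation of f."""
    r = (math.sqrt(5.0) - 1.0) / 2.0     # the golden ratio, about 0.618034
    c, d = b - r * (b - a), a + r * (b - a)
    fc, fd = f(c), f(d)
    for _ in range(max_iter):
        if b - a < tol:
            break
        if fc < fd:                      # squeeze from the right: keep [a, d]
            b, d, fd = d, c, fc
            c = b - r * (b - a)
            fc = f(c)
        else:                            # squeeze from the left: keep [c, b]
            a, c, fc = c, d, fd
            d = a + r * (b - a)
            fd = f(d)
    return (a + b) / 2.0

print(golden_search(lambda x: (x - 2.0)**2 + 1.0, 0.0, 5.0))  # about 2.0
```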

Example 1. Find the minimum of the unimodal function  [pic]  on the interval  [pic].  

Solution 1.

 

Example 2.  Find the local minimum of the function  [pic]  over  [pic].

Solution 2.

 

Example 3.  Find the local maximum of the function  [pic]  in the interval  [pic].

Solution 3.

 

Exercise 4.  Find the maximum of the function  [pic]  for  [pic].  Use the Golden Ratio Search.  

Solution 4.

 

Exercise 5. The voltage input to an electrical component is [pic] for [pic]. The manufacturer states that the voltage is not to exceed [pic], or else the component will "burn out." Can we expect that the component will be OK, or will it "burn out"?

Solution 5.

 

Example 6. Find the minimum of  [pic]  on the interval  [pic].  

Solution 6.

IV. NELDER-MEAD SEARCH FOR A MINIMUM

Nelder-Mead Method

The Nelder-Mead method is a simplex method for finding a local minimum of a function of several variables. Its discovery is attributed to J. A. Nelder and R. Mead. For two variables, a simplex is a triangle, and the method is a pattern search that compares function values at the three vertices of a triangle. The worst vertex, where [pic] is largest, is rejected and replaced with a new vertex. A new triangle is formed and the search is continued. The process generates a sequence of triangles (which might have different shapes), for which the function values at the vertices get smaller and smaller. The size of the triangles is reduced and the coordinates of the minimum point are found.

The algorithm is stated using the term simplex (a generalized triangle in n dimensions) and will find the minimum of a function of n variables. It is effective and computationally compact.

Initial Triangle [pic]

Let [pic] be the function that is to be minimized. To start, we are given three vertices of a triangle: [pic], for [pic]. The function [pic] is then evaluated at each of the three points: [pic], for [pic]. The subscripts are then reordered so that [pic]. We use the notation

(1) [pic], [pic], and [pic]

to help remember that [pic] is the best vertex, [pic] is good (next to best), and [pic] is the worst vertex.

Midpoint of the Good Side

The construction process uses the midpoint [pic]of the line segment joining [pic]and [pic]. It is found by averaging the coordinates:

(2) [pic].

Reflection Using the Point [pic]

The function decreases as we move along the side of the triangle from [pic] to [pic], and it decreases as we move along the side from [pic] to [pic]. Hence it is feasible that [pic] takes on smaller values at points that lie away from [pic] on the opposite side of the line between [pic] and [pic]. We choose a test point [pic] that is obtained by "reflecting" the triangle through the side [pic]. To determine [pic], we first find the midpoint [pic] of the side [pic]. Then draw the line segment from [pic] to [pic] and call its length d. This last segment is extended a distance d through [pic] to locate the point [pic]. The vector formula for [pic] is

(3) [pic].

Expansion Using the Point [pic]

If the function value at [pic] is smaller than the function value at [pic], then we have moved in the correct direction toward the minimum. Perhaps the minimum is just a bit farther than the point [pic]. So we extend the line segment through [pic] and [pic] to the point [pic]. This forms an expanded triangle [pic]. The point [pic] is found by moving an additional distance d along the line joining [pic] and [pic]. If the function value at [pic] is less than the function value at [pic], then we have found a better vertex than [pic]. The vector formula for [pic] is

(4) [pic].

Contraction Using the Point [pic]

If the function values at [pic] and [pic] are the same, another point must be tested. Perhaps the function is smaller at [pic], but we cannot replace [pic] with [pic] because we must have a triangle. Consider the two midpoints [pic] and [pic] of the line segments [pic] and [pic], respectively. The point with the smaller function value is called [pic], and the new triangle is [pic].

Note: The choice between [pic] and [pic] might seem inappropriate for the two-dimensional case, but it is important in higher dimensions.

Shrink Toward [pic]

If the function value at [pic] is not less than the value at [pic], the points [pic] and [pic] must be shrunk toward [pic]. The point [pic] is replaced with [pic], and [pic] is replaced with [pic], which is the midpoint of the line segment joining [pic] with [pic].

Logical Decisions for Each Step

A computationally efficient algorithm should perform function evaluations only if needed. In each step, a new vertex is found, which replaces [pic]. As soon as it is found, further investigation is not needed, and the iteration step is completed. The logical details for two-dimensional cases are given in the proof.
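For experiments, the method is also available in standard libraries; here is a short sketch using SciPy's implementation (the objective and starting point are illustrative, chosen to match the Rosenbrock example below):

```python
import numpy as np
from scipy.optimize import minimize

# A sketch using SciPy's built-in Nelder-Mead simplex search.
# The Rosenbrock-type objective and the starting point are illustrative.
f = lambda p: 100.0 * (p[1] - p[0]**2)**2 + (1.0 - p[0])**2

result = minimize(f, x0=np.array([-1.2, 1.0]), method="Nelder-Mead",
                  options={"xatol": 1e-8, "fatol": 1e-8})
print(result.x)  # converges to approximately [1, 1]
```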

Example 1. Use the Nelder-Mead method to find the minimum of [pic].

Solution 1.

A more efficient subroutine

The following subroutine stores the function values in the variables [pic], and the convergence criterion [pic] is used to terminate the algorithm when

[pic].

Example 2. Use the Nelder-Mead method to find the minimum of [pic].

This example is referred to as Rosenbrock's parabolic valley, circa 1960.

Solution 2.

 

 

 

V. NUMERICAL METHODS FOR DIFFERENTIAL EQUATIONS AND SYSTEMS OF EQUATIONS

In this chapter two numerical methods concerning differential equations are presented. These methods are so-called discrete methods: approximating values of the solution function are computed only at some (discrete) points. We always suppose that our problem is a so-called correctly posed problem. This means that there is only one function which satisfies our differential equation and initial or (and) boundary conditions (existence and uniqueness); furthermore, the solution depends continuously on the parameters of the initial or (and) boundary conditions (a small change in the parameter values yields only a small change in the result).

V. 1 RUNGE-KUTTA METHODS

Let the initial value problem

[pic]

be given, and suppose that we would like to determine (approximating) values of the solution function in the finite interval [pic]. (It is well known that this problem is correctly posed if [pic] and [pic] are continuous in a large enough neighbourhood of the point [pic].)

The basic idea of the so-called one-step methods (including Runge-Kutta methods) is the following: if we have a formula for determining a good approximating value to [pic] by using the value [pic], then starting from [pic] and using the previous (approximating) value at every step, we can give approximating values to the solution function [pic] at the points [pic].

[pic]

From the equality

[pic]

it seems reasonable to try with a formula

[pic]

which is the general form of the so-called one-step explicit formulas. (Of course we would like to choose a function [pic] for which [pic] is sufficiently large.) The speciality of Runge-Kutta methods is that the function value [pic] is built from several values of [pic] computed recursively.

In detail, we determine the members

[pic]

and using these values give an approximating value in form

[pic].

We obtain a concrete formula after choosing the number [pic] of members [pic] and determining the coefficients [pic]. When determining the coefficients, we match the Taylor series in [pic] of the exact value [pic] on the left side with the Taylor series in [pic] of the approximating value [pic] on the right side, up to the largest possible power of the step length [pic].

Look at three different cases:

a) Let [pic]. Then

[pic]

that is,

[pic]

The Taylor series in [pic] of the left side is:

[pic]

the Taylor series in [pic] of the right side is itself, so the two Taylor series match in [pic] up to power 1 if [pic]. The obtained formula

[pic]

is called Euler's formula. The so-called local error at [pic] of this formula (which gives the error after one step starting from an exact value) is [pic] (where [pic]), or in short form this local error is [pic].
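A minimal Python sketch of Euler's formula (the step size and the test problem are illustrative assumptions):

```python
def euler(f, x0, y0, h, n):
    """Euler's formula: y_{k+1} = y_k + h * f(x_k, y_k), for n steps."""
    xs, ys = [x0], [y0]
    for _ in range(n):
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        xs.append(xs[-1] + h)
    return xs, ys

# Illustrative test problem: y' = y, y(0) = 1, whose solution is e**x
xs, ys = euler(lambda x, y: y, 0.0, 1.0, h=0.1, n=10)
print(ys[-1])  # about 2.5937, versus the exact e = 2.7183 (an O(h) method)
```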

b) Let [pic]. Then

[pic]

that is

[pic]

The Taylor series in [pic] of the left side is

[pic]

where it was used that

[pic]

The Taylor series in [pic] of the right side is

[pic]

The two Taylor series match in [pic] up to power 2 if

[pic]

So we have three equations for four unknowns, and we can get infinitely many solutions. E.g. [pic] is a possible choice, and the formula is

[pic]

which is called Heun's formula (or trapezoidal formula). The local error [pic] is characterized by [pic] here.

c) Let [pic]. Then we can derive 11 equations for the 13 unknowns by matching the Taylor series in [pic] up to power 4. (For matching up to power 5 the 13 parameters are not enough.) A possible choice is

[pic]

[pic]

and the formula is

[pic]

which is called the classical Runge-Kutta formula. The local error [pic] can be characterized by [pic] now.
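A minimal Python sketch of one step of the classical Runge-Kutta formula, using the standard fourth-order coefficients (the test problem is an illustrative assumption):

```python
def rk4_step(f, x, y, h):
    """One step of the classical (fourth-order) Runge-Kutta formula."""
    k1 = f(x, y)
    k2 = f(x + h / 2.0, y + h / 2.0 * k1)
    k3 = f(x + h / 2.0, y + h / 2.0 * k2)
    k4 = f(x + h, y + h * k3)
    return y + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

# Illustrative test: y' = y, y(0) = 1 on [0, 1] with h = 0.1
x, y = 0.0, 1.0
for _ in range(10):
    y = rk4_step(lambda t, u: u, x, y, 0.1)
    x += 0.1
print(y)  # about 2.718280, very close to e = 2.718282 (an O(h^4) method)
```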

We have already mentioned that each Runge-Kutta type formula has the form

[pic]

after substituting the expressions [pic] into the last "equation" (see the general form of one-step explicit methods). Hence, using such a formula, first we determine the approximating value

[pic]

concerning [pic] starting from the accurate initial couple [pic]. Then we determine the approximating value

[pic]

concerning [pic] starting from the approximating couple [pic], etc.

So, we can give the process by the formula

[pic].

DEFINITION

The quantity

[pic]

where [pic], is called the local error belonging to the place [pic]. The quantity

[pic]

(where [pic] is obtained by the above-mentioned formula) is called the accumulated error belonging to the place [pic].

So, we can see that the real error appears as a local error only in the first step and then we get accumulated error because we use approximating values again and again.

[pic]

It is a natural question how we can characterize the accumulated error. The following theorem gives information about the accumulated errors of the three Runge-Kutta type methods mentioned above.

THEOREM

If [pic] is smooth enough in a suitable neighbourhood of the point [pic] in the case of the problem [pic], then for the Euler, Heun, and Runge-Kutta methods (which have [pic] [pic] local errors, respectively) the accumulated error can be given by [pic] in the finite interval [pic].

REMARK

The discussed methods can be generalized for the initial value problem

[pic]

concerning a system of differential equations. Introducing the notations

[pic]

by using the suitable vectors (matrices), the generalized formula of the classical Runge-Kutta method is:

[pic]

[pic].

Since the initial value problem

[pic]

concerning a differential equation of order [pic] can be transformed into the form

[pic]

by introducing the functions [pic], [pic], we can use the generalized Runge-Kutta formulas again.
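As a Python sketch of this generalization, take the illustrative second-order problem y'' = -y, y(0) = 0, y'(0) = 1 (whose solution is sin x), rewrite it as a first-order system and advance it with the vector form of the classical Runge-Kutta formula:

```python
import numpy as np

def rk4_system_step(f, x, y, h):
    """One classical Runge-Kutta step for a system y' = f(x, y), y a vector."""
    k1 = f(x, y)
    k2 = f(x + h / 2.0, y + h / 2.0 * k1)
    k3 = f(x + h / 2.0, y + h / 2.0 * k2)
    k4 = f(x + h, y + h * k3)
    return y + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

# y'' = -y becomes the system y1' = y2, y2' = -y1 with y1 = y, y2 = y'
f = lambda x, y: np.array([y[1], -y[0]])

x, y, h = 0.0, np.array([0.0, 1.0]), 0.1   # y(0) = 0, y'(0) = 1
for _ in range(10):
    y = rk4_system_step(f, x, y, h)
    x += h
print(y[0])  # about 0.841471 = sin(1)
```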

Numerical example

1. Given the differential equation

[pic]

Determine the approximating values, by the classical Runge-Kutta formula, of the particular solution which passes through the point [pic], on the interval [pic], if [pic].

The results:

For [pic]

[pic]

[pic]

[pic]

[pic]

[pic]

[pic]

[pic]

[pic]

2. Give an approximation to [pic] by Runge-Kutta formula, if

[pic].

After substitution [pic], the new problem is:

[pic] [pic].

where

[pic]

Hence

[pic]

[pic]

[pic]

[pic]

and

[pic].

The required value is [pic]

Problems

1. Determine approximating values using Euler's formula and Heun's formula, if [pic] and

a) [pic]

[pic]

2. Give an approximating value to [pic] by the classical Runge-Kutta method if

a) [pic]

[pic]

b) [pic].

([pic])

c) [pic]

([pic])

3. Use the generalized Runge-Kutta formula for the following problems.

a) Determine value of [pic] if

[pic] [pic]

[pic]

b) Determine values of [pic] and [pic] if

[pic] [pic].

[pic]

c) Determine values of [pic] and [pic] if

|[pic] |[pic] |

|[pic] |[pic] |

([pic])

d) Determine values of [pic] and[pic] if

|[pic] |[pic] |

|[pic] |[pic] |

([pic])

e) Determine values of [pic], [pic] and [pic] if

|[pic] |[pic] |

|[pic] |[pic] |

|[pic] |[pic] |

([pic])

f) Determine values of [pic], [pic] and [pic] if

|[pic] |[pic] |

|[pic] |[pic] |

|[pic] |[pic] |

([pic])

g) Determine values of [pic], [pic] and [pic] if

|[pic] |[pic] |

|[pic] |[pic] |

|[pic] |[pic] |

([pic])

V. 2 FINITE DIFFERENCE METHODS

The basic idea of these methods is the following: we substitute the derivatives in the (ordinary or partial) differential equation and in the initial or (and) boundary conditions by expressions from approximating derivative formulas. In this way we get an "approximating system of equations" for the required function values. We illustrate the possibilities with two well-known problems.

a) Let the boundary value problem

[pic]

be given, where [pic], [pic] and [pic] are at least continuous on [pic]. Take an equidistant partition of the interval [pic]

[pic]

by the mesh-points [pic].

[pic]

If the solution function [pic] is four times continuously differentiable on [pic] then by using the values [pic] belonging to the places [pic] we can get

[pic]

where [pic]. Introducing the notation [pic], substituting the previous expression into the differential equation for [pic] and arranging the equations with respect to [pic], we can obtain the system of linear equations

[pic]

for the required function values [pic]. Introducing the notations

[pic]

[pic]

our linear system can be written in form

[pic].

During the practical computation we cannot determine the elements of the vector [pic], and therefore we omit this term (the linear system is truncated) and we determine the solution vector [pic] of the linear system

[pic],

hoping that [pic] lies near enough to [pic]. (Both linear systems have a unique solution because the matrix [pic] is diagonally dominant due to [pic].)
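As a Python sketch of this construction (the concrete boundary value problem in the text is hidden behind the figures, so the model problem -y''(x) + q(x)y(x) = f(x), y(a) = y(b) = 0 with q >= 0, and all names and test data, are illustrative assumptions):

```python
import numpy as np

def fd_bvp(q, f, a, b, n):
    """Finite difference sketch for -y'' + q(x) y = f(x), y(a) = y(b) = 0.
    y'' is replaced by (y_{i-1} - 2 y_i + y_{i+1}) / h**2 at n inner points."""
    h = (b - a) / (n + 1)
    x = a + h * np.arange(1, n + 1)          # inner mesh points
    A = np.zeros((n, n))
    for i in range(n):
        A[i, i] = 2.0 / h**2 + q(x[i])       # q >= 0 gives diagonal dominance
        if i > 0:
            A[i, i - 1] = -1.0 / h**2
        if i < n - 1:
            A[i, i + 1] = -1.0 / h**2
    return x, np.linalg.solve(A, f(x))       # the truncated linear system

# Illustrative check: -y'' = pi**2 * sin(pi x), y(0) = y(1) = 0, y = sin(pi x)
x, y = fd_bvp(lambda t: 0.0 * t, lambda t: np.pi**2 * np.sin(np.pi * t),
              0.0, 1.0, n=9)
print(np.max(np.abs(y - np.sin(np.pi * x))))  # small, decreasing as O(h^2)
```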

[pic]

It is of basic importance what we can say about the error that appears because of the truncation of the original linear system.

THEOREM

If [pic] then [pic]

b) Denote by [pic] the following rectangular region:

[pic]

and denote by [pic] the set of its boundary points. Let the boundary value problem (Poisson equation with homogeneous boundary condition)

[pic]

be given, where [pic] is at least continuous on the region [pic]. The solution function [pic] can be illustrated by the figure:

[pic]

(The general case, when the solution function [pic] is not 0 on the boundary, can be reduced to this case, too.) Now partition the region [pic] into square subregions (supposing that the ratio of the sides is suitable):

[pic]

If the solution function is four times continuously differentiable on the region [pic] concerning both variables, then by using the values [pic] [pic] belonging to the places [pic] [pic] we can get

[pic]

where [pic], [pic] and by using the values [pic] [pic] belonging to the places [pic] [pic] we can get that

[pic]

where [pic], [pic]

[pic]

Introducing the notation [pic], substituting the previous expressions into the differential equation at all inner mesh-points [pic] [pic] [pic] and arranging the equations, we can obtain the system of linear equations

[pic]

[pic]

for the required function values [pic], [pic] [pic]. (In our equations [pic] because of the boundary condition, if [pic] or [pic], or if [pic] or [pic].) In practice we omit the terms coming from the error terms of the approximating derivative formulas (the linear system is truncated again) and we determine the "solution set" [pic], [pic] [pic] of the linear system

[pic]

hoping that [pic] lies near to [pic].

[pic]

The examination of the error appearing because of the truncation of the original linear system is similar to that which we saw in the sub-section above.

REMARKS

1. The theorem concerning the first problem gives a sufficient condition for convergence and also gives information about the order of convergence: decreasing the step-length [pic], the error decreases proportionally to [pic] at a fixed [pic].

2. The stability in the general sense (the influence of the inaccuracy of the boundary values and of the rounding errors) can be characterized by the number [pic] for the first problem. If we use the norm [pic], then

[pic]

[pic]

where [pic] [pic]. Roughly, this means that a relative error in the coefficients during the computation could yield a [pic]-times larger relative error in the solution.

3. The linear system belonging to the first problem is a so-called tridiagonal system, which can be solved well both by Gauss elimination (there are very few non-zero coefficients) and by Seidel iteration (the discussed sufficient condition is fulfilled). The linear system belonging to the second problem contains between 3 and 5 unknowns in each equation. For this linear system the Seidel iteration (or an over-relaxation version of it) can be applied more conveniently, since ordering the unknowns into a sequence is not necessary. Finally, we mention that there are many special methods for the discussed (large-sized) linear systems.
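As a Python sketch of the second problem solved in the way this remark suggests, here is a Seidel (Gauss-Seidel) iteration for the five-point scheme of the Poisson problem -Δu = f on the unit square with u = 0 on the boundary (the sign convention, mesh size, right-hand side, and sweep count are illustrative assumptions):

```python
import numpy as np

def poisson_seidel(f, n, sweeps=500):
    """Seidel iteration for the five-point scheme of -Laplace(u) = f on the
    unit square, u = 0 on the boundary; h = 1/(n+1), n*n inner mesh points."""
    h = 1.0 / (n + 1)
    u = np.zeros((n + 2, n + 2))             # the border stays 0 (boundary)
    x = np.linspace(0.0, 1.0, n + 2)
    for _ in range(sweeps):
        for i in range(1, n + 1):
            for j in range(1, n + 1):        # newest neighbour values are used
                u[i, j] = 0.25 * (u[i - 1, j] + u[i + 1, j] +
                                  u[i, j - 1] + u[i, j + 1] +
                                  h**2 * f(x[i], x[j]))
    return u

# Illustrative right-hand side: f = 1 (a uniformly loaded square membrane)
u = poisson_seidel(lambda x, y: 1.0, n=19)
print(u[10, 10])   # value at the centre, about 0.0737
```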

Numerical examples

1. Give approximating values to the solution function of the boundary value problem

If we use the former approximating formula concerning [pic] at the inner meshpoints [pic] [pic] [pic], then

[pic]

Respecting the boundary values [pic] [pic] and arranging the equations, we get the linear system

[pic]

This has the unique solution

[pic]

So, the solution:

[pic]

Since the solution function is [pic], the approximating formula used is exact, i.e. [pic]=[pic] in this problem.

2. Give approximating values to the solution function of the boundary value problem

[pic]

[pic] [pic]

[pic]

if [pic].

Use the following pairs of indices for the mesh points:

[pic]

The values of the solution function [pic]at the mesh points on the boundary are

[pic]

from the given boundary conditions. If we use the former approximating formula concerning [pic] [pic] at the inner mesh points, then

[pic]

Respecting the boundary values and arranging the equations, we get the linear system

[pic]

which has the unique solution

[pic]

The approximating solution:

[pic]

Problems

1. Give approximating values to the solution function of the boundary value problem, if the given interval is partitioned into four congruent subintervals.

a) [pic]

[pic]

b) [pic]

[pic]

2. Give approximating values to [pic] [pic] by the finite difference method if

[pic]

and

|[pic] |[pic] |[pic] |

|[pic] |[pic] |[pic]. |

[pic]

Mathematical notions

DEFINITION 1

The set [pic] is called a linear vector space if the addition and the multiplication by a real number are defined on the set [pic] and these operations have the following properties:

[pic]

The elements of such a set [pic] are called vectors.

Examples

a) Let

[pic]

where [pic] are real numbers, that is, [pic] is the set of such "objects" which have [pic] "coordinates". If the addition and the multiplication by a number are defined by the traditional rules, that is,

[pic]

then the mentioned properties can be proved easily. This linear space is usually denoted by [pic].

b) Let the interval [pic] be given and denote [pic] the set of the continuous functions on [pic]. If the addition [pic] and the multiplication by number [pic] are defined by the traditional rules, then the required properties can be proved easily. This linear space is usually denoted by [pic].

DEFINITION 2

The elements [pic] (where L is a vector space and the number of the given vectors is finite) form a linearly independent system if

[pic]

Examples

a) In the vector space [pic] the vectors [pic] and [pic] form a linearly independent system, since the zero element [pic] can only be obtained in the form [pic].

b) In the vector space [pic] the functions [pic] form a linearly independent system, since the zero element (the identically zero function) can only be obtained in the form [pic].

DEFINITION 3

The linearly independent system formed by the vectors [pic] is a basis of [pic] if an arbitrary element of [pic] can be obtained as a linear combination of the elements [pic], that is, in the form

[pic].

Examples

a) In the linear space [pic] the vectors [pic] form a well-known basis. Another possible basis is given e.g. by vectors [pic].

b) Denote by [pic] the set of the polynomials of degree at most 2. The traditional addition and multiplication by a number do not lead out of the set, so [pic] is a vector space.

The elements (functions, vectors) [pic] form a basis of [pic], because they form a linearly independent system and an arbitrary polynomial of degree at most 2 can be obtained in the form

[pic].

Another basis is given e.g. by elements [pic].

DEFINITION 4

If we can assign real numbers [pic] to all elements [pic] of the linear space [pic], respectively, in such a way that

[pic]

then the vector space [pic] is called normed space and the real number [pic] is called the norm of the vector [pic].

Examples

a) Define the quantities

[pic]

for each element [pic] of the linear space [pic]. These quantities define three different norms, which are called norm number 1, norm number 2 (or the Euclidean norm) and norm number infinity. (A short computational sketch is given after these examples.)

b) Denote by [pic] the set of the real (square) matrices of order [pic]. [pic] forms a linear space with the traditional addition and multiplication by a number. The quantities

[pic]

[pic]

where [pic] is the largest eigenvalue of the matrix [pic] and

[pic]

define three different norms. The norms belonging to the same notation (the same name) in a) and b) are called compatible, because such pairs have also the properties

[pic]

for any [pic], besides the properties appearing in the definition. By the latter property we could have defined the above-mentioned norms of the matrix [pic] as the smallest upper bound of the suitable vector norm [pic] under the condition [pic]. In these lecture notes we always assume compatible norms when using [pic].

c) In the vector space [pic] we can define norms e.g. by equalities

[pic]

Here we mention that the distance of the elements [pic] of a normed space (just like in the traditional case) can be measured by the quantity

[pic].
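The vector norms in example a) are easy to compute directly; a small Python sketch with NumPy (the sample vector is illustrative):

```python
import numpy as np

x = np.array([3.0, -4.0, 1.0])       # an illustrative vector

norm1 = np.sum(np.abs(x))            # norm number 1: sum of |x_i|
norm2 = np.sqrt(np.sum(x**2))        # norm number 2 (Euclidean norm)
norm_inf = np.max(np.abs(x))         # norm number infinity: max |x_i|

print(norm1, norm2, norm_inf)        # 8.0  5.0990...  4.0
# The same values via the library routines:
print(np.linalg.norm(x, 1), np.linalg.norm(x, 2), np.linalg.norm(x, np.inf))
```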

DEFINITION 5

If we can assign a real number [pic] to any two elements [pic] of the linear space [pic] in such a way that

[pic]

then the vector space [pic] is called Euclidean space and the real number [pic] is called the scalar product of the elements (vectors) [pic] and [pic].

Examples

a) Let [pic] and [pic] be two arbitrary elements of [pic] Then the quantity

[pic]

defines the traditional scalar product on [pic].

b) If [pic] and [pic] are two arbitrary elements of [pic], then the quantity

[pic]

defines a scalar product.

DEFINITION 6

The matrix [pic] of order [pic] is called a positive definite matrix if

[pic].

DEFINITION 7

The matrix [pic] of order [pic] is called a negative definite matrix if

[pic].

DEFINITION 8

The matrix [pic] of order [pic] is called a positive semidefinite matrix if

[pic].

DEFINITION 9

The matrix [pic] of order [pic] is called a negative semidefinite matrix if

[pic].

DEFINITION 10

Denote by [pic] the limit of the convergent sequence of real numbers [pic]. If there exist numbers [pic] such that for arbitrary [pic]

[pic]

where [pic], then we say that the sequence [pic] converges to [pic] with order [pic]. If only the left (right) side inequality is fulfilled, then we say that the order of convergence is at most (at least) [pic].

Examples

a) For [pic] (quadratic convergence), which often appears in numerical methods, we illustrate the rapidity of convergence with a simple example. Suppose that [pic] and the distance between [pic] and [pic] is 0.1. Then (by using the right-side inequality):

[pic], which means a surprising rapidity compared to traditional sequences.

b) If we use the vector sequence [pic] instead of the number sequence [pic], and a norm instead of the absolute value, then the previous definition can be repeated with the inequality

[pic].

DEFINITION 11

Let [pic] and [pic]. Then the equality

[pic]

(where [pic] is read as "capital ordo") means that there exists a positive constant [pic] such that

[pic], [pic].

Examples

a) Let [pic] be the limit of the sequence [pic]. Then the equality

[pic], where [pic],

means that the error of [pic], where [pic] is independent of [pic]. In other words, it means that "the error is proportional to [pic]".

b) Let [pic]be the previous sequence. Then the equality (supposed it is true)

[pic]

(where [pic] is a constant and [pic] if [pic]) means that "the error is proportional to [pic]".

c) Sometimes we use the sign [pic] for truncation of a polynomial. E.g. the equality

[pic], where [pic],

means that

[pic],

where [pic] is independent of [pic]. Such an application of the sign [pic] can be found e.g. in elimination methods.
