Turing intro



Chapter 3

The Rules of Programming

This chapter formalizes some things you already know intuitively, but may not understand fully: the meaning of basic programming constructs. The information we present is a description of the effects of executing typical program statements, or the semantics of these statements. When you first learned to program, you developed an intuitive understanding of the effect of executing an assignment statement and the way in which control constructs such as the if statement control execution. These are the topics of this chapter, but instead of describing the semantics of program commands intuitively, we will describe them precisely.

The semantic descriptions we give here provide a link between program behavior and documentation. The previous chapter provided the tools for describing what a program is intended to do. The formal descriptions of program constructs contained in this chapter are the basis for any argument that a program does (or fails to do) what is claimed.

Our real concern, however, is not for arguing or proving programs correct, but for increasing your understanding of why programs "work", and thereby increasing your skill at writing clear and correct programs.

The preceding chapter introduced two concepts that we will use in this chapter to describe carefully the meaning of program commands and constructs, and thereby programs themselves. The two concepts are the state of a program, and program correctness with respect to a precondition and a postcondition.

Recall that we defined the (complete) state of a program at any point during program execution as the following collection of information:

1. The value of the program counter, which specifies which instruction of the program will be executed next.

2. The values of all the program variables.

3. The input values that are available to be read by the program.[1]

While the third item, the input values not yet read, is an important part of the state, in fact we will seldom refer to them because of the complexity they add to a description. For that reason, we have chosen to refer to the first two items, the value of the program counter and the value of program variables, as "the program state", and, when necessary, refer to all the information as the "complete program state".

Assertions about a program state will appear in our programs as comments or as calls to the assert method at the beginning or end of a program, and between lines of a program.

The concept of program correctness will be applied to programs, program segments, and methods. The important definition is the following:

Definition: If P and Q are assertions about the state of a program C, then the program C is correct with respect to precondition P and postcondition Q if, whenever condition P holds prior to execution of program C, and C terminates, then condition Q will (always!) hold after C has finished execution.

Pre and postconditions will be our way of specifying a program's behavior. A program specification in the form of a pre and postcondition pair is a contract between the programmer and the user of a program; the user should be able to assume that if the program is executed when the precondition holds, then the postcondition will hold when the program ends. We say that the program meets the specification of the precondition and the postcondition.

The concept of program behavior being specified by a pre and postcondition pair is sufficiently important that there is a special notation for it:

Definition: If P and Q are assertions and C is a program, then

{P} C {Q}

means "If the program C is executed beginning in a state that satisfies the precondition P, and the program halts, then after termination the state of the program will satisfy the postcondition Q."

We also say that {P} C {Q} means that program C is correct with respect to the precondition P and postcondition Q, or that C meets the specifications of precondition P and postcondition Q.

Note that {P} C {Q} is a predicate with three variables. When values are given for P, Q and C, the assertion {P} C {Q} is either true or false, although it can sometimes be difficult or, for all practical purposes, impossible to tell which. One way of defining the task of a programmer is the following:

The Programmer's Charge: Given a precondition P and a postcondition Q, write a program C for which the truth of {P} C {Q} can be argued convincingly.

This chapter provides the basic tools for arguing (or, if done formally, proving) that a program meets a specification.

We're now ready to describe the meaning of some of the statements of Java. In principle we could do so for each Java statement, but that would be too great a task. We'll treat only the assignment statement and several control statements to gain an understanding of how programming languages can be defined, and how program correctness can be argued.

Formal Systems

A logical framework for reasoning about a mathematical system is often presented with axioms and rules of inference. An axiom is an assertion taken to be true. A rule of inference is a rule that, on the basis of some assertions known to be true, guarantees that a new assertion is true. Later we will describe a single axiom – the one that describes how the assignment statement works. We'll begin by listing some rules of inference for programs.

A rule of inference is a rule for reasoning. Rules of inference always have the form "If you know some things to be true, then you can conclude something else is true." A well-known rule of inference is known as modus ponens[2]; this rule can be stated as follows:

"If p is true, and p => q, then q is true."

The correctness of this rule of inference follows from the truth table that defines the operation =>; the only time that both p and p => q are true is when q is true.

A convenient schematic representation of a rule of inference lists all hypotheses (the assertions that you must know to be true for the rule to apply) above a line and the conclusion (the new truth) below the same line. Thus, modus ponens is given as follows:

p

p => q

_ __________________________________

q

What do the rules of inference for programming say? All the rules are based on the notation {P} C {Q}, which is an assertion that "If P is true before the program C is executed, and C halts, then after C has halted, Q will be true." Our rules of inference are all of the form "If we know the specifications of some (one or more) programs, then we can give specifications for new programs, or new specifications for the given programs." Another characterization might be that rules of inference assert "If we know some programs work as advertised, then we can conclude that other (possibly different) programs will work as we say." As you read these first rules of inference, bear in mind that none of them should be a surprise — these are simply statements of program properties that we commonly use when programming.

The Specification Rules of Inference

1 Rules for Changing Specifications

The first two rules of inference describe how a program specification can be changed safely.

1 The Rule of the Weakened Postcondition

Recall that in our discussion of weak versus strong assertions in the preceding chapter, we argued that if {P} C {Q} and {P} D {T} and Q => T, then C might be considered a 'better' program than D because execution of C results in more information than execution of D. This observation is the substance of this first rule of inference, which states that a postcondition can be replaced in a specification by a weaker postcondition. The formal statement is the following:

{P} C {Q}

Q => T 

________________________________________________

{P} C {T}

The correctness of the rule is clear: if C, executed when precondition P holds, results in Q being true, and Q => T, then C, executed when precondition P holds, results in condition T being true. This rule, called the rule of the weakened postcondition, is often useful because it enables us to change the specification of programs so that the postcondition ignores information that is not of interest.

Example

The following swap program meets its specification. Assume x and y are integers and have been initialized.

int old_x = x;

int old_y = y;

assert(x == old_x && y == old_y); // Precondition

int temp = x;

x = y;

y = temp;

assert(x == old_y && y == old_x && temp == old_x); // Postcondition

But the value of the auxiliary variable temp is of no interest. Because

(x==old_y && y==old_x && temp==old_x) => (x==old_y && y==old_x)

we can use the Rule of Weakened Postcondition to change the postcondition to the simpler and more appropriate one.

End of Example

The next rule describes the condition under which we can change a precondition.

2 The Rule of the Strengthened Precondition

In our discussion of weak versus strong assertions in the preceding chapter, we argued that if {P} C {Q} and {R} D {Q} and R => P, then C might be considered a 'better' program than D because the requirements for execution of C are weaker than those for D, and they both produce the same final state. This observation is the substance of the second rule of inference, which states that a precondition can be replaced in a specification by a stronger assertion. The formal statement is the following:

R => P

{P} C {Q}

{R} C {Q}

This rule asserts that if C, executed when precondition P holds, results in Q being true, and R => P, then C, executed when precondition R holds, results in condition Q being true. This rule, called the rule of the strengthened precondition, enables us to change a program specification when the correctness of a program segment does not depend on all the information available, such as when the values of some program variables do not affect the segment.

Finally, we note that because p => p for every assertion p, the two preceding rules can be combined into a single rule:

R =>P

{P} C {Q}

Q => T

{R} C {T}

This rule asserts that a correct program is not made incorrect by strengthening the precondition, or weakening the postcondition, or doing both.

2 The Rules of Control Structures

The remaining rules of inference describe the effects of some of the control structures of the language. The control structures of a programming language determine, at each point in program execution, which statement of a program is to be executed next. The default control structure in most imperative languages, including Java, is sequential execution – unless an explicit control structure dictates otherwise, the statements of a program are executed in the order in which they appear. The explicit control structures of these languages include selection statements (e.g., if statements), iteration (loops), and subroutines.

2 The Rule of Sequential Execution, or The Rule of Program Composition

Sequential execution causes statements to be executed sequentially, in the order in which they appear in the program, unless some control structure explicitly causes a different order. The first rule of inference concerning control structures characterizes the effect of executing programs (including individual statements) sequentially. This rule, which we'll refer to as the rule of program composition, or simply the rule of composition, asserts that if program C changes a state P into a state Q, and program D changes a state Q into a state R, then executing C followed by D will change state P into state R:

 {P} C {Q}

{Q} D {R}

{P} C D {R}

This rule is called the 'rule of composition' because it is based on the composition of functions. A program can be viewed as an implementation of a function that maps any state in which the precondition holds to a state in which the postcondition holds. The rule of functional composition states that if f(x) == y, and g(y) == z, then g(f(x)) == z. Viewing the code as a state-transforming function, the rule of inference states that if C(P) == Q and D(Q) == R, then D(C(P)) == R.

The rules of strengthened precondition and weakened postcondition are often used together with the rule of composition to write the specifications for adjacent program segments. This use is formalized by the following more general rule of program composition:

 {P} C {Q1}

 Q1 => Q2

{Q2} D {R}

{P} C D {R}

3 Selection Statements (Conditional Execution)

Selection statements determine which of a collection of code segments (if any) will be executed. These structures generally include if statements and switch statements. We will discuss only two.

1 The if(b){C} structure

The flowchart of the if(b){C} structure with precondition P and postcondition Q is the following. (Edges are annotated with upper case letters denoting assertions; diamonds are program tests, and rectangles are blocks of code.)

[pic]

Our problem is to determine what must be true to guarantee that {P} S {Q} when S is the code if(b){C}. The flowchart makes it clear that there are two paths through the code; one path is taken when the value of the expression b is true and the other path is taken when b is false. The code S is correct if both paths work correctly. The path taken when b is true and C is executed will give the correct result if {P && b}C{Q}. The path taken when b is false involves no code execution, so this path will give the correct result only if P && !b => Q. These two conditions cover all possible paths through the structure, so the rule of inference is the following:

{P && b} C {Q}

P && !b => Q

{P} if (b){C} {Q}

Example

Write a program segment to set maxval equal[3] to the larger of a and b. The assertions in the following code segment characterize the intended state at each point.

assert(true);

maxval = a;

assert(maxval == a);

if (b > a) maxval = b;

assert((maxval == a || maxval == b) && (maxval>=a && maxval>=b));

// maxval is the larger of a and b.

To show that this code works we must show that each of the two possible execution paths produces the correct result. The first path we consider is the one taken when the test succeeds (b > a evaluates to true). This case corresponds to the first hypothesis of the rule of inference. Written in terms of this program, the first hypothesis states that

{maxval == a && b > a}

maxval = b;

{maxval == b && b > a hence maxval is the larger of a and b}

The correctness of this code is obvious, but we'll be able to prove this one-line program meets its specifications after studying the Axiom of Assignment later in this chapter. The second hypothesis of the rule of inference is the assertion

(maxval == a && !(b > a)) => (maxval is the larger of a and b)

This assertion is true on the basis of algebra and the definition of the boolean operator =>.

End of Example

Example

The following program compares the values of a and b and, if a is greater than b, swaps them.

int old_a = a;

int old_b = b;

assert(a==old_a && b==old_b); // Precondition

if (a > b)

{

// Swap a and b.

int temp = a;

a = b;

b = temp;

}

assert(((a==old_a && b==old_b) || (a==old_b && b==old_a))

&& (a b}

This precondition implies that old_a > old_b. Thus we can expand the swap precondition to

{a==old_a && b==old_b && a>b && old_a>old_b}

Recall that old_a and old_b are mathematical variables, and their values do not change. Thus, after the swap, according to the specifications of the swap code,

{b==old_a && a==old_b && old_a>old_b}

which implies the postcondition of

{b==old_a && a==old_b && a < b}

which can be weakened to obtain the desired postcondition of

{b==old_a && a==old_b && a b)}

This is equivalent to

{a==old_a && b==old_b && a= 0

{x >= 0}

absval = x;

{((absval == x || absval == -x) && absval >=0)}

Later in this chapter we will study the Axiom of Assignment, which will give us the tools to show that these one-line program segments meet their specifications.

End of Example

We will not treat additional conditional control structures explicitly, but the rules of inference for variations such as

if (b1) {C} else if (b2) {E} else {F};

and the switch statement are easily constructed if you understand the rules given above. To show these statements correct, we decompose the statement into its constituent paths and show that no matter which path is taken, the postcondition holds.

4 Iteration Statements

Iteration and recursion are the muscle of programming. Without them, no section of program code could be executed more than once during program execution, and consequently, every program would terminate within a bounded time; that is, a program could not perform an unbounded amount of computation.

In other words, without iteration or recursion, programs would be of little use. But the power of iteration brings complexity: loops and recursion are where programming gets hard, and where programmers stumble. We will treat recursion later; here we will treat only the rules of inference for some basic forms of iteration. These rules reflect the power of loops. They are more complex (and more difficult to understand), largely as a consequence of the importance of an auxiliary assertion called the loop invariant. A loop invariant is an assertion that is placed within a loop body and is true each time the assertion is encountered. A loop invariant is usually an assertion whose meaning changes with each execution of the loop. Consider, for example, the following code to initialize the first 10 entries of an array B to 0:

assert (true);

int i=0;

while (true)

{

B[i] = 0;

// Invariant: (Aj: 0 {Q}

{P} while (true) {C if (b) break; D} {Q}

The three hypotheses of this rule of inference address loop initialization (to establish the invariant initially), preservation (of the loop invariant), and termination.

•Initialization: {P} C {INV} asserts that the invariant holds when it is encountered the first time through the loop if the precondition P was satisfied prior to entering the loop. In other words, if P is true before entering the loop, then after C is executed, the invariant will be true when the if (b) break; statement is first reached.

•Preservation: {INV && !b} D C {INV} asserts that when the loop body is traversed, starting and ending at the test, the invariant is re-established by execution of the loop body. This does not mean that the invariant remains true throughout the loop body. Rather, if the invariant starts out true and the loop body DC is executed, then INV is again true when we arrive back at the test. Note that the loop body is traversed only when b is false, so !b is a conjunct in the precondition for the code D C.

•Termination: {INV && b} => Q asserts that the invariant and the truth of the exit condition b together guarantee that the postcondition Q will be true.

Note that we define a loop invariant to be more than simply an assertion in a loop that is true each time it is encountered. That requirement would be satisfied by the trivial assertion true. An assertion is a loop invariant only if it makes all the hypotheses of the rule of inference true. Thus the loop invariant is a function not only of the code, but also the loop pre and postconditions.

Example 1:

Finding the sum of entries preceding the first 0 entry of an array B[0...n].

Suppose we know that some entry B[k] of B[0...n] is equal to 0, and we are to assign a variable named sum the sum of the entries of the subarray B[0..k-1]. If B[0] == 0, the value of the sum is to be 0.

If i is the loop index variable, the loop is naturally preceded by initialization code

int i = 0;

int sum = 0;

Then the precondition for the loop is

assert (sum == 0 && i == 0);

// and (E j: 0 Q

{P} do {C} while(b) {Q}

As always, the invariant goes right before the exit test. In a generalized loop that uses if(b)break; as the exit test, it is obvious where the invariant should go. However it is neither obvious nor easy to place the invariant properly in a while(b) or do loop. For the while(b) form there are two choices: we can put the invariant in twice, once before the while and once at the bottom of the loop. Or we can make the invariant part of the boolean expression that controls loop exit. The two forms are shown below. We assume that the invariant has been made into a method inv() that returns a boolean.

assert(inv()); // Invariant

while(b)

{

D

assert(inv()); // Invariant

}

while (inv() && b) // SC eval

{

D

}

The second form is more concise, although it is essential that the invariant be tested before the boolean expression b. Exactly why this is necessary is left as an exercise.

For the do loop with the test at the end, the invariant can be placed either at the end of the loop body or as part of the exit expression.

do

{

C

assert(inv());

}

while (b)

do

{

C

}

while (inv() && b) // SC eval

4 The for loop Structure

The rule of inference for the for loop is somewhat more complicated because of the loop index. We desire a rule that enables us to conclude the correctness of the claim

{P} for(int i = low; i = 0.

• Each execution of the loop body causes the value of t (when evaluated immediately prior to the execution of the loop exit statement) to decrease by at least 1.

If all these properties hold, then the value of t (evaluated immediately prior to the exit statement of a loop) will be an upper bound on the number of times the loop body will be executed; for this reason, a loop variant is sometimes called the bound function of a loop. But note that the bound need not be a 'tight' one; it need only have the properties described above to guarantee termination of a loop.

These rules can be integrated into the rule of inference for the loop, giving a rule that assures total correctness. We denote the loop variant expression by t.

{P} C {INV && t>=0}

{INV && !b and t == old_t} D C {INV && t < old_t}

{INV && t>=0}

{INV && b} => Q

{P} while (true) {C if (b)break; D} {Q}

Note that because t is integer-valued, the condition that t < old_t (in the second hypothesis) is equivalent to t = 0 && i == 0); // loop precondition

while(true)

{

// Invariant

assert(0 ................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download