Chapter Three - Texas Tech University



Chapter Three

Language Semantics Design

Introduction.

The syntax of a target language defines the valid ways in which one can combine elements of the language vocabulary to form grammatically correct sentences. Semantic definitions of a target language define the way in which one determines the meaning of a grammatically correct sentence. In chapter two a metalanguage is described that can precisely define the syntax of a language. Introduced in the present chapter are two metalanguages, which are often used for providing precise definitions of the semantics of target languages.

Just as there is an abstract definition for the syntax of a language, there are abstract definitions giving the semantics of a language. Recall that the abstract syntax of a language is a set of rules which, with reference to the language vocabulary, define how one produces valid sentences in the language. Likewise, the reader will see that the semantic definitions of a language reference its abstract syntax rules to define precisely the meaning of the syntactic structure.

Precise syntactic and semantic definitions are an important attribute of a language. Languages defined according to precise, mathematical methods of syntactic and semantic description are called formal languages. A language that does not have precise semantics is not formal.

In the absence of a precise definition a language is likely to be syntactically and semantically ambiguous. Natural (or spoken) languages are not defined precisely. A good example of the ambiguity that results is found in the English word "nurse." The word is syntactically ambiguous because the word can be a noun or a verb. Naturally the meaning of the word is also ambiguous in the absence of clues from the context. For example, a statement "nurse!" could be viewed as a command to nurse (given an implied subject such as the word "you") or as someone calling out for a nurse's attention (assuming an implied verb such as the word "come").

The purpose of a programming language is to provide instructions to a computer. Computers work as automatons. If a programmer tells the computer to perform task A, the computer does not have the wherewithal to stop and ask the programmer what is meant when he says perform task A. An automaton must carry out its work without clarification. Therefore, the meaning of the computer instructions must be absolutely clear. Precise semantics are required.

Once the syntax of a target language is defined formally, one can develop the target language's semantic description. In this chapter, two approaches to semantic definition are presented. The denotational approach is presented first and is based upon Lattice Theory and a Lambda Calculus-based metalanguage. The axiomatic approach to semantic definition is based upon a logic-like language.

Denotational Semantics.

The denotational approach to language definition is based upon Lattice Theory. A Lattice is a set of elements closed under two binary operations which are commutative, associative, idempotent, and which satisfy the absorption laws. The denotational semantics were developed by Scott-Strachey. [Stoy]

A Lattice is a partially ordered set S, which has a least upper bound called the supremum of S and a greatest lower bound called the infimum of S. The supremum and infimum of S are where elements of S join and meet, respectively. Consider the following example which depicts in a Hasse Diagram the partial ordering of the subset relation on the power set of {a,b,c}, i.e., P{a,b,c}:

[pic]

Assume that the node containing {a,b,c} is at level 0 of the diagram and that the null set {} is at level 3. One can see that the subsets of a level i set are found in level i+1. The edges of the diagram connect a set with its corresponding subsets - the subset relation is the basis of the ordering. The join of the P{a,b,c} is the least upper bound, I.E., the set {a,b,c} while the elements meet in the greatest lower bound of P{a,b,c}, I.E., the null set.

The notion of containment is essential in the Lattice. Whenever information is ordered by the subset relation, the ordering implies the amount of information contained in an element. For example, since {a,b,c} ( {b,c} the set {a,b,c} contains more information. The integers are ordered in the following way:

[pic]

The integers are ordered as a "flat" set. The numerical ordering relation ( does not convey the notion of containment, i.e., ⊇. For example, the integer 3 does not "contain" the information contained in the integer 2. The special elements top and bottom, T and ⊥, respectively, indicate all values (a contradictory situation) and no values, respectively. These elements form the least upper bound and greatest lower bound  of the Integer Lattice.

The goal of the denotational approach is to identify functions which map from a syntactic domain to a semantic domain (where domains may be infinite or finite). The semantic domains are based upon the mathematical notion of the Lattice – I.E., semantic domains are not sets, they are Lattices. Before mappings can be defined, we must first build up the domains involved in the mappings. We must also develop an understanding of another metalanguage, which precisely defines how the functions perform their mappings.

To demonstrate, we will first develop a small example language LE for arithmetic expressions:

E::= E + E | E - E | E * E | E / E | (E) | C

C ::= DC | D

D ::= 0 | 1 | 2 | ... | 9

This grammatical definition corresponds with the BNF definitions from the previous chapter. Notice that there are no variables in this language; assume only integer constants. The syntactic domains of this language are:

SYNTACTIC DOMAINS:

C: Cons

E: Exp

D:Digits

The syntactic domains correspond to the LHS's of the productions. This is always the case. For every LHS of a production there is a syntactic domain. It seems clear that the mapping from one of LE’s syntactic objects (either C, E, or D), yields a semantic object from the domain of integers:

ν : N = {...,-2,-1,0,1,2,....}

The complete semantic domain is referred to by N while ν is a metavariable that can take on any possible integer value. Semantic domains require two additional elements to denote the possibility of erroneous and undefined values. These values are represented by two special symbols - the top and bottom of a Lattice, respectively:

Top: T

Bottom: ⊥

It is now possible to denote the semantic domain:

SEMANTIC DOMAIN:

ν : N = {T,...,-2,-1,0,1,2,....,⊥}

This latter definition often appears as {...,-2,-1,0,1,2,...}° where the degree symbol ° denotes inclusion of ⊥ and T.

The next step involves the declaration of a function. The general form of a function declaration is:

( : LHS → Semantic_domain

where ( is a semantic function name, LHS is a LHS of a BNF rule, and Semantic_domain is a previously defined semantic domain.

Below, the semantic functions which will map from the syntactic domain of Exp to the semantic domain of N, and from Cons to N are declared:

SEMANTIC FUNCTION DECLARATION:

ε : Exp → N

ι : Cons → N

δ : Digits → N

Given an expression or a constant based upon the BNF above, a meaning is construed in terms of integers:

ε : Exp → N

The expression above declares which syntactic domain is mapped to which semantic domain. The expression indicates that there is a function ε which maps from the syntactic Exp to the semantic N. It is now necessary to describe how the mapping is to take place. Semantic equations are developed for the semantic function describing how the mapping will take place.

Recall that for every LHS of the BNF there is a syntactic domain. It is also true that for every RHS option of the BNF there is a semantic equation. Semantic equations are of the general form:

f « RHS » = semantic_definition

The symbol f is a semantic function previously declared. The RHS is a syntactic object as defined by a RHS option for the BNF. The = sign is a replacement operator. In determining the meaning of an instruction, if one can match f«RHS» precisely, he can replace f «RHS» with the corresponding semantic_definition. The relationships between BNF, domains, semantic function declaration, and semantic equations are given in figure 1.

A semantic equation states precisely how the mapping is to be performed. The operators «» are employed to enclose and isolate syntactic objects on the LHS of the semantic equations. The equations should be read as: the syntactic object in «»'s on the LHS of the "=" is defined and replaced by the expression on the RHS of the "=."

SEMANTIC FUNCTION EQUATIONS:

a. ε « E1+E2 » = ε « E1 » + ε « E2 »

b. ε « E1-E2 » = ε « E1 » - ε « E2 »

c. ε « E1*E2 » = ε « E1 » * ε « E2 »

d. ε « E1/E2 » = ε « E1 » / ε « E2 »

e. ε « (E) » = ε « E »

f. ε « C » = ι « C »

g. ι « CD » = ((10 × ι « C » ) + δ « D » )

h. ι « D » = δ « D »

i. δ « 0 » = 0

j. δ « 1 » = 1

k. δ« 2 » = 2

: :

r. δ « 9 » = 9

The semantic equations define how to map from a syntactic specification of integer arithmetic to integers and integer arithmetic. For example, the equations i-r map from a syntactic integer digit to an integer. Equations f-h define how to map from multi-digit syntactic integers to multi-digit semantic integers. Equations a-f map from a specification of integer operations to the actual integer operations.

[pic]

Figure 1. Relationships between Syntax and Semantics.

Consider the derivation of an integer. The mapping below is from a syntactic representation of a number to a semantic representation. Intuitively, the mapping can be viewed as analogous to a program - you may have written yourself – that converts an ASCII character representation of an integer to an integer representation – I.E., a representation upon which you can perform numerical computations.

In the first step, one wishes to know the meaning of the syntactic characters 2345.

ι « 2345 » = ((10 × ι « 234 » ) + δ « 5 » )

The function reference matches the LHS of semantic equation g. The reference ι «2345 » is replaced by an appropriate instantiation of g's RHS seen on the right, above. The resulting expression includes two references to semantic functions, ι «234 » and δ«5 ». These references form the LHS of equations g and n, respectively. The references result in the following substitutions:

= ((10 × ((10 × ι « 23 » ) + δ « 4 » )) + 5 )

The references ι « 23 » and δ «4» result in appropriate replacements from equations g and m:

= ((10 × ((10 × ((10 × ι « 2 » ) + δ « 3 » )) + 4 )) + 5 )

The references to ι« 2» and δ «3 » result in the following replacements according to equations h and l, respectively.

= ((10 × ((10 × ((10 × δ « 2 » ) + 3 )) + 4 )) + 5 )

Now the final substitution is made according to equation k. Since there are no further references to semantic functions, the expressions are evaluated according to the standard definitions of integer arithmetic:

= ((10 × ((10 × ((10 × 2 ) + 3 )) + 4 )) + 5 )

= ((10 × ((10 × (20 + 3 )) + 4 )) + 5 )

= ((10 × ((10 × 23) + 4 )) + 5 )

= ((10 × (230 + 4 )) + 5 )

= ((10 × 234) + 5 )

= (2340 + 5 )

= 2345

Now consider the meaning of an arithmetic expression. Above the equals sign (i.e., =) is the letter of the semantic equation used in a translation step. In a string of equalities:

e1 =a e2 =b e3 =c ... =d en-1 =e en

The expression e1 matches the LHS of equation a and is replaced by e2 which matches the LHS of equation b  and is replaced by e3 which matches the LHS of equation c, etc. For brevity, when equations may apply in parallel, an &-sign is used - when the same equation may apply to several terms, a factor is assigned the equation:

ε «(3+2) / (4-2) » =d

ε «(3+2)» / ε «(4-2)» =2( e

ε «3+2» / ε «4-2» =a&b

ε«3»+ ε«2» / ε«4»- ε«2» =4( f

ι«3»+ ι«2» / ι«4»- ι«2» =4( h

δ«3»+ δ«2» / δ«4»- δ«2» =l,k,m,k

3+2 / 4-2

Even with the formal definitions there is ambiguity in the result: 3+2 / 4-2. Precise definitions can lead to ambiguity if not developed properly. The question arises: is 2.5 (or 2 since we are dealing with integers) the answer, or is it 1.5 (or 1)? This ambiguity flies in the face of our goal, which is to obtain a precise definition of infix integer arithmetic.

To solve this problem of ambiguity, we will alter the BNF for expressions, the semantic equations, and we will also adopt two conventions for evaluating the equations. Instead of a breadth-first application of semantic equations, let us do depth-first and adopt the additional convention that integer expressions and subexpressions are to be evaluated at the earliest possible point in the derivation.

Since the semantics of integer data is now well understood, future examples will refer to integers simply as cons – and we will assume the syntactic construction and semantic derivation of integers. Specifically, this abbreviation is meant to summarize the earlier syntax, which describes how to syntactically construct an integer. Likewise, the semantics will summarize the equations g-r which state how to interpret the syntactic representation of the integer. Consider, the BNF as altered to separate additive and multiplicative operations:

E::= T | E + T | E - T

T::= F | T * F | T / F

F::=(E) | cons

This set of equations results in the following domains:

Syntactic Domains:

E: Exp

T: Ter

F: Fac

C: Cons

We will continue to map to the semantic domain of integers:

SEMANTIC DOMAIN:

ν : N = {⊥,...,-2,-1,0,1,2,....,T}

Due to the fact that there now exist three syntactic categories from the BNF LHS's, we need to declare three semantic functions:

SEMANTIC FUNCTION DECLARATION:

ε : Exp → N

α : Ter → N

β : Fac → N

Next we elaborate the semantic equations for all RHS's for a given BNF syntax rule. In equation g the RHS is altered by placing parentheses around the semantic expression, thus increasing its precedence for evaluation. In equation h the integer corresponding to the syntactic integer is selected - summarizing informally equations g-q in the original semantics.

SEMANTIC EQUATIONS:

a. ε «E + T» = ε « E » + ε « T »

b. ε « E - T » = ε « E » - ε « T »

c. ε « T » = α « T »

d. α « T * F» = α « T » * α « F »

e. α « T / F » = α « T » / α « F »

f. α « F » = β « F »

g. β «(E) » = ( ε « E » )

h. β « cons » = element of N corresponding to cons

Now consider the evaluation of an expression, employing the two conventions mentioned earlier and the altered semantic equations. The equation letter over the = sign indicates the pertinent LHS/RHS relationships.

ε « (3+2) / (4-2) » =c

α « (3+2) / (4-2) » =e

α « (3+2) » / α « (4-2) » =f

α « (3+2) » / β « (4-2) » =g

α « (3+2) » / ( ε «4-2 » ) =b

α « (3+2) » / ( ε «4 »- ε«2» ) =2( c

α « (3+2) » / ( α«4 »- α«2» ) =2( f

α « (3+2) » / ( β«4 »- β«2» ) =2( h

α « (3+2) » / ( 4 - 2 ) =math

α « (3+2) » / 2 =f

β « (3+2) » / 2 =g

( ε «3+2 » ) / 2 =a

( ε «3 »+ ε «2 » ) / 2 =2( c

( α«3 »+ α«2 » ) / 2 =2( f

( β«3 »+ β«2 » ) / 2 =2( h

( 3 + 2 ) / 2 =math

5/2 =math

2

In the next section we develop the metalanguage employed for more complicated semantic equations.

The Denotational Semantic Metalanguage

In this section, much of the metalanguage employed in denotational semantic defintions is presented. In the previous section we observed simple syntactic and semantic domains. It is possible to construct more complicated domains from existing domains with the product and sum operators: × and +, respectively.

Product Domains:

Given domains D1,D2,...,Dn it is possible to form a new domain N = D1× D2×...× Dn which gives a set of ordered n-tuples: where for each i: di∈Di . For example,

Given, B = {t,f,⊥,T} and V = {a,b,c}°

B × V = {,, ,,, }°

The down-arrow is an operator which permits access to the components of a tuple formed by a product operation, t ↓ i. In this operation, t is a tuple of a product domain and i is an integer. Given N = D1× D2×...× Dk , n ∈ N, and di ∈ Di then n ↓ i = di .

There is a special form of a product domain denoted by Dn where, Dn = D1× D2×...× Dn iff D1’ D2’...’ Dn . This set is to be distinguished from the * as a superscript. The set D* is the set of all finite strings one may form using the elements of D. For example, assume D = {t,f}°:

D* = {t,f,tt,tf,ff,ft,...}°

Dn = {,,,,,,,}°

where n = 3

Therefore, D* = D0 ∪ D1 ∪ D2 ∪ ... ∪ Di ∪ ...

where D0 = ∈ - the empty string

Consider the following example where B ={t,f}°, L = {a,b,c}°, and P = B × L :

Sum Domains:

Given domains D1,D2,...,Dn it is possible to form a new domain S = D1+ D2+...+ Dn which gives a set of elements, where for any element x ∈ S, there is some i such that x ∈ Di. For example, let I be the set of integers, C the set of alphabetic characters, and R the set of real numbers. One possible use of these sets and the metalanguage operators is to specify the output domain of a program language. The output specification (I + C*+ R)* would indicate a language capable of programs that can produce strings of integer, real, and character strings as output.

The example domain (I + C*+ R)* warrants inspection. Inside the parentheses one may observe that an output can be an integer (I), a character string (C*), or a real number (R). The quantity thus formed may itself allow for strings, i.e., we can obtain any string one can form from the sum domain itself. Thus, the output of a program can be any conceivable combination of integer, character, or real strings. Included in the output domain are the following:

5

10.43

a

the value of r is 5 and the value of s is 8.95

etc

Consider the following example where B ={t,f}°, L = {a,b,c}°, and P = B + L :

Functions:

Since the goal of the semantic definitions is to map from a syntactic to a semantic domain, function definition is central to our concern. Functions are declared according to the general form:

F = D1 → D2

This definition is akin to a type declaration. In the declaration one is stating that there is a class of functions, that given an element of D1, the function will map to an element of D2. It is possible to identify a specific function f, which is an element of a function class:

f : F = D1 → D2

So that f(d1) = d2 assuming d1 ∈ D1 and d2 ∈ D2 and that there is a mapping between d1 and d2. Expressions for domains can be complicated:

D1 × D2 ×...× Dn → Dn+1 + Dn+2

The precedence relationships in order of importance are ×, +, and →. The mapping symbol, → is right associative.

Function Domains.

Frequently it is necessary to define function domains, which map domain elements to functions:

D1 → (D2 → D3)

Assume that C represents the syntactic representation of an integer and that N represents the semantic representation, I.E., the set of integers. Operators is a set of arithmetic operators {*, /, +, -}. Now consider the following function domain definition:

S=Operators → (C × C → N)

s: S=Operators → (C × C → N)

So that s [+ (4,3) ] selects the addition function which, in turn, maps the pair (4,3) to 7.

List Operations.

Recall that Dn denotes all lists of length n that can be formed using the elements of D. Furthermore, recall that D* = {nil} + D1+ D2+ ... represents all possible finite lists of a domain, D. Special operators exist to manage lists serving as operands of the denotational semantic definitions:

hd(L) D*→ D

tl(L) D*→ D*

append(L,d) D* × D → D*

prefix(d,L) D × D* → D*

Given L=nil then both the hd and the tl function return top, T, an erroneous condition. Notice that tl([20]) evaluates to nil, but one will never see a nil from the hd operation. Suppose D=(N+C*) where N is the domain of integers and C is the domain of characters. D* could thus represent an output file O=D*. The process of writing to an output file could be represented by the append. An input file I=(CN + CC*)* (where CN and CC are syntactic representations of integers and characters, respectively) could be represented by two operations: the input value obtained with hd, and the subsequent input file obtained by the tl:

append(O,d) For output;

hd(I) For an input value;

tl(I) For the input file resulting from an input operation.

Consider the following examples where I=[7,8,9]:

append(I,10) = [7,8,9,10]

prefix(I,6) = [6,7,8,9]

prefix(append(I,10),6) = [6,7,8,9,10]

hd(I) = 7

tl(I) = [8,9]

Domain Operations.

It is common to have identifiers which map to locations (i.e., memory locations) or to procedures (i.e., subprograms). In order to represent identifiers as a domain consider the following definition:

Id ’ L + P where L is a set of locations and P is a set of procedures.

Given an identifier it is possible to map to locations or to procedures. Special operators exist to allow one to manipulate the Sum domains exemplified above. For these definitions assume the general sum domain S=D1+...+Dn , P is a set of named programmed segments, L is a set of named locations, and Id = P + L:

Operator d in S.

If d is an element of Di then d in S provides an element of S corresponding to d. This operation allows one to "inject" elements of the smaller domain into the larger, e.g., during declarations of procedures and variables. Therefore, when a procedure, p, is declared, one would use the expression p in Id. The in operator is not to be confused with the membership relation: instead of testing for membership, in is forcing membership. From a practical standpoint, this operation defines the use of an identifier once its use is declared.

Operator s | Di .

If s is an element of S then s | Di provides the element of Di corresponding to s. This operation allows one to "project" elements of the smaller domain from the larger, e.g., during references. When a variable id is referenced, one would use the expression id in L in order to obtain the location of id. Projection on the name of a procedure would obtain an appropriate segment of code, in the case of a procedure name, or a memory location, in the case of a variable.

Operator s E Di .

If s is an element of S then s E Di iff s is squarely in s E Di . This operation is used to guard actions of injection and projection, to make certain that an object is where one thinks.

As in the preceding, assume Id = P + L. Consider the following example:

if id E P then id in P + L

In this example, imagine that one is capturing the meaning of a procedure declaration. In the next example, one is obtaining a named segment of code in the presence of a reference to a procedure:

if id E P then id|P

Replacement Operation.

There is an operator to allow for the replacement of information in the context of a function.

f[a/b] denotes function f’ which is identical to f except in f’, b maps to a.

Given f[a/b], one subsequently finds that f[b]=a. In the denotational semantics one can define a mapping with f[a/b] and then reference it with f[b].

Semantic Equations.

The goal of the semantic definitions is to map from a syntactic object to a semantic object. There are two important operators found in a semantic equation. First, there is a special set of brackets used only to encapsulate (and thus isolate) syntactic objects:

« syntactic object » .

One can state conditional semantic equations using the operator if-then. Consider the following semantic equation, which obtains the value of a variable:

μ « Ξ »σ = if Ξ E L then σ[Ξ|L] else T

This equation states that the meaning of the variable evaluates to one of two things: if the variable is in the domain of locations L, then the meaning of the variable is projected from the locations domain with respect to the current state (i.e., one will obtain the value currently in the location to which the variable is mapped); otherwise an erroneous condition exists. Notice that the semantic equation is similar to a lambda function. The fundamental notion is one of replacement. When the LHS of an equation is matched, e.g., μ «Ξ»σ, it is replaced by the RHS, e.g., if Ξ E L then σ[Ξ|L] else T. The arguments of the function are Ξ and σ. The general form of a semantic equation is:

LHS = RHS OR

LHS = if CONDITION then RHS(true) else RHS(false)

More Interesting Examples

A semantic metavariable, E.G., Ξ, can take on the value of any legal variable in the target language (legal based upon the BNF - syntax definitions). Suppose the following BNF rule exists:

Ξ ::= aΞ | bΞ |...| zΞ | a | b |...| z

Let’s apply the knowledge we have gained so far by refining our language of arithmetic expressions to allow for variables. To do this, we will need a function that allows us to map from the syntactic variable domain to a semantic domain containing values to which the variable may be assigned. This mapping must be done with respect to a state σ: σ[Ξ] yields the value to which Ξ is assigned in state σ. Augmenting the original syntax for arithmetic expressions, presented at the beginning of this chapter, one obtains arithmetic expressions, which provide for the use of variables:

New Syntax:

S ::= Ξ := E

E::= T | E + T | E - T

T::= F | T * F | T / F

F::= (E) | cons | Ξ

Since cons and Ξ should both be viewed as syntactic categories, it can be claimed that each syntactic category is a Syntactic Domain, i.e., a class of syntactic objects from which one will map to semantic objects. However, we will not declare them as syntactic domains, since the syntactic construction of variables and constants is well understood.

New Syntactic Domain:

S:Stmt

E:Exp

T:Ter

F:Fac

From the syntactic domain, one maps to semantic domains. Below, the semantic domains are presented:

New Semantic Domain:

σ : St = Ξ → N

ν : N = {...,-2,-1,0,1,2,....}°

For each LHS in the BNF, we declare a semantic function. In order to understand the semantic function declarations, let us consider the example from below: Σ : Stmt → σ → σ . The semantic function symbol ( is declared to take as input a statement from the syntactic domain and, based upon the statement, map from an input to some an state. Each function declared maps from a syntactic domain to a semantic domain:

Semantic Functions:

Σ : Stmt → St → St

ε : Exp → St → N

α : Ter → St → N

β : Fac → St → N

Considering the declarations, one can better understand the corresponding semantic equations. For example, given the function declaration Σ : Stmt → σ → σ , there is one equation (j.) corresponding to the single RHS option for a statement in the syntax. The LHS of the semantic equation contains the syntactic RHS option used to select the equation, and the additional argument, σ, to be used as the current state. The RHS of the semantic equation describes how the mapping is to be accomplished and is used to replace the LHS of the semantic equation with, in the case of equation (j.), a new state:

Semantic Equations:

a. ε « E + T »σ = ε « E »σ + ε « T »σ

b. ε « E - T »σ = ε « E »σ - ε « T »σ

c. ε « T »σ = α « T »σ

d. α « T * F»σ = α « T »σ * α « F »σ

e. α « T / F »σ = α « T »σ / α « F »σ

f. α « F »σ = β « F »σ

g. β « (E) »σ = ( ε « E »σ)

h. β « cons »σ = element of N corresponding to cons

i. β « Ξ »σ = σ[ Ξ ]

Equation (i) obtains the value associated with a variable. Given the full set of equations here, if a variable is undefined a null value is obtained – not undefined, but null.

j. Σ « Ξ := E »σ = σ[ε « E » σ / Ξ ]

This equation (j) assigns the meaning of the expression E (on the RHS of the assignment operator in the syntactic structure) to the variable (on the LHS).

ε « (a+10)/d »σ ’c

α « (a+10)/d»σ ’e

α «(a+10) »σ / α «d »σ ’f

β « (a+10) »σ / α «d »σ ’f

β « (a+10) »σ / β «d »σ ’i

β « (a+10) » σ / 6 ’g

( ε «a+10»σ ) / 6 ’a

( ε «a»σ + ε «10 »σ ) / 6 ’2( c

( α «a»σ + α «10 »σ ) / 6 ’2( f

( β «a»σ + β «10 »σ ) / 6 ’i

( 14 + β «10»σ ) / 6 ’h

( 14 + 10 ) / 6 ’math

24 / 6 ’math

4

In two steps translations were carried out concurrently, marked by 2( c and 2( f. This marking is meant to denote that equation c was applied to two references: ε«a»σ and ε «10»σ, respectively; and equation f was applied to α«a»σ and α«10»σ , respectively.

More Interesting Examples

In this section, the grammar and semantics above are extended to specify the syntax and meaning of the while-language, which was originally introduced in the previous chapter.

Syntax:

P::= S

S ::= Ξ := E | read(Ξ) | write(Ξ) | while C do S od | S;S

C ::= E1 < E2 | E1 > E2 | E1 = E2 | E1 = E2 | E1(E2

E::= T | E + T | E - T

T::= F | T * F | T / F

F::=(E) | cons | Ξ

Syntactic Domains:

P:Prog

C:Cond

S:Stmt

E: Exp

T: Ter

F:Fac

Semantic Domains:

τ : B={true,false}°

φ : Fi = N*

γ : CF = St × Fi × Fi

σ : St = Ξ → N

ν : N = {...,-2,-1,0,1,2,....}°

The most complicated semantic domain is the configuration γ : CF = St × Fi × Fi. The configuration is a 3-tuple where the first component is the state of the computation, the second is the state of the input file, and the third is the state of the output file. The three components can be obtained by γ ↓ 1 ’ σ, γ ↓ 2 ’ φ1, and γ ↓ 3 ’ φ2, respectively.

Semantic Function Domain:

Μ : Prog → Fi → Fi

Σ : Stmt → CF → CF

ζ : Cond → St → B

ε : Exp → St → N

α : Ter → St → N

β : Fac → St → N

Semantic Equations:

a. ε « E + T »σ = ε « E »σ + ε « T »σ

b. ε « E - T »σ = ε « E »σ - ε « T »σ

c. ε « T »σ = α « T »σ

d. α «T * F»σ = α « T »σ * α « F »σ

e. α «T / F »σ = α « T »σ / α « F »σ

f. α « F »σ = β « F »σ

g. β « (E) »σ = ( ε «E »σ)

h. β « cons »σ = element of N corresponding to cons

i. β « Ξ »σ = σ[ Ξ ]

j. ζ «E1 < E2»σ = if ε « E1 »σ < ε « E2 »σ

then true else false

k. ζ «E1 > E2»σ = if ε « E1 »σ > ε « E2»σ

then true else false

l. ζ «E1 = E2»σ = if ε « E1 »σ = ε « E2 »σ

then true else false

m. ζ «E1 = ε « E2 »σ

then true else false

o. ζ «E1 ( E2»σ = if ε « E1 »σ ( ε « E2 »σ

then true else false

Since ε « E1 »σ ( ε « E2 »σ (where ( can be =,,>=,

The assignment is a fairly sophisticated statement. The state of the machine is modified by assigning to the variable Ξ on the LHS of the assignment operator, the result of determining the meaning of the expression on the RHS. The expression ε «E»γ↓1 invokes the evaluation of the arithmetic expression, which makes use of the state (i.e., γ↓1), and not the input or output files. Clearly, the assignment has no effect on the input or output files, since these components are copied from the input to the output configuration.

q. Σ « read(Ξ) »γ = if φ1 ( nil

then

else

T

Where γ ↓ 1 ’ σ & γ ↓ 2 ’ φ1 & γ ↓ 3 ’ φ2

As you will see, the assignment and the read statements have quite a bit in common. They are the only statements that modify the contents of the state, i.e., replace the value of a variable. The difference between the statements is twofold. The value to be assigned to a variable is obtained through the evaluation of an arithmetic instruction – see equation p. The value to be read into a variable is obtained from the second component of the configuration – the input file. The second difference is that the assignment statement affects only the state of the machine. The read statement affects the state of the machine and the input file. The head of the input file replaces the value of the referenced program variable and the tail of the input file provides the new state obtained by the evaluation of the read statement.

r. Σ « write(Ξ) »γ = if σ[Ξ] ( ⊥

then

else

T

Where γ ↓ 1 ’ σ & γ ↓ 2 ’ φ1 & γ ↓ 3 ’ φ2

s. Σ «while C do S od »γ = if ζ «C »γ ↓ 1 then

Σ «while C do S od » ° Σ «S »γ else γ

The while statement’s meaning is conditional. If the condition C evaluates to false, which it should at some point, the input configuration is passed through (if ... else γ). When the condition evaluates to true, the meaning of the statement(s) S, comprising the body of the loop, are evaluated. The resulting configuration is appended (by °) to a recursive reference to the while statement that served as the input.

t. Σ « S1;S2 »γ = Σ «S2» ° Σ «S1» γ

Equation (t) reverses the order of the statements so that they can be evaluated in the proper sequence. Normally we would expect the evaluation to proceed from left to right. However, the equations are evaluated like lambda expressions – from right to left – where the evaluation of the rightmost expression provides the arguments for the next-to-rightmost operation. An interesting fact about equation (t) is that any number of statements can be reversed by repeated applications. For example, consider three statements a, b, and c:

Σ « a,b,c »γ =t Σ « c » ( Σ «a,b»γ =t Σ « c » ( Σ «b» ( Σ «a»γ

There is another possible order of applications, but the end result is the same:

Σ « a,b,c »γ =t Σ « b,c » ( Σ «a»γ =t Σ « c » ( Σ «b» ( Σ «a»γ

Equation (u) provides for the initial state of the machine:

u. Μ « P »φ1 = (Σ «P » ) ↓ 3

This equation sets the output file to nil, copies the input file provided by the user to second component of the configuration, and – through the use of a lambda function- sets all variables appearing in program P to undefined. Recall from in our earlier language, that the variables were null.

Review of the equations.

Let’s review the equations just presented. Notice, in particular, the equations for the read, write, and the while. The read checks to see if the input file is not nil. If not nil, then the head of the file is assigned to the variable referenced in the read, σ[ hd(φ1)/Ξ]. Next the remainder of the list is left for the input file, tl(φ1). The read statement is much like an assignment in terms of its affect on the state of a computation. From the equations one can clearly see that the only statements altering the state of the machine are the read and the assignment statements.

The write appends the meaning of the referenced variable to the end of the output file, append(φ2,σ[Ξ]). As such it has no impact on the state of the machine.

The while statement is a complicated definition. It first checks the condition referenced in the while statement:

Σ «while C do S od » γ = ζ «C »γ ↓ 1

Since ζ requires only the state, the first component of the configuration γ is obtained with γ ↓ 1. If the condition is satisfied, the meaning of the statements of the while (i.e., S in while C do S od) is obtained:

Σ « S »γ

The evaluated body of the while forms a configuration to be used when one reconsiders the while loop. The totality of the semantic is denoted by:

Σ «while C do S od » ° Σ « S » γ

When the condition C is not satisfied, the while statements are bypassed. The bypass is effected by replacing the while statement reference Σ «while C do S od » with the input configuration, γ. Consider the derivation of the following program:

M« read(n); i:=1; while i ................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download