Principles of Programming Languages Version 1.0 - Johns Hopkins University

Principles of Programming Languages Version 1.0.3

Mike Grant Zachary Palmer

Scott Smith



i Copyright ? 2002-2020 Scott F. Smith. This work is licensed under the Creative Commons Attribution-Share Alike 3.0 United States License. To view a copy of this license, visit or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

This document was last compiled on April 22, 2021.

Contents

1 Introduction

1

2 Operational Semantics

3

2.1 A First Look at Operational Semantics . . . . . . . . . . . . . . . . . . . . 3

2.2 BNF grammars and Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . 4

2.2.1 Operational Semantics for Logic Expressions . . . . . . . . . . . . 5

2.2.2 Abstract Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

2.2.3 Operational Semantics and Interpreters . . . . . . . . . . . . . . . 12

2.3 The F Programming Language . . . . . . . . . . . . . . . . . . . . . . . . 14

2.3.1 F Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.3.2 Variable Substitution . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.3.3 Operational Semantics for F . . . . . . . . . . . . . . . . . . . . . 20

2.3.4 The Expressiveness of F . . . . . . . . . . . . . . . . . . . . . . . 28

2.3.5 Russell's Paradox and Encoding Recursion . . . . . . . . . . . . . 32

2.3.6 Call-By-Name Parameter Passing . . . . . . . . . . . . . . . . . . . 38

2.3.7 F Abstract Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . 39

2.4 Operational Equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

2.4.1 Defining Operational Equivalence . . . . . . . . . . . . . . . . . . . 43

2.4.2 Properties of Operational Equivalence . . . . . . . . . . . . . . . . 45

2.4.3 Examples of Operational Equivalence . . . . . . . . . . . . . . . . 46

2.4.4 The -Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

3 Tuples, Records, and Variants

49

3.1 Tuples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

3.2 Records . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

3.2.1 Record Polymorphism . . . . . . . . . . . . . . . . . . . . . . . . . 51

3.2.2 The F R Language . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

3.3 Variants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

3.3.1 Variant Polymorphism . . . . . . . . . . . . . . . . . . . . . . . . . 54

3.3.2 The F V Language . . . . . . . . . . . . . . . . . . . . . . . . . . 54

4 Side Effects: State and Exceptions

57

4.1 State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

4.1.1 The F S Language . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

4.1.2 Cyclical Stores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

4.1.3 The "Normal" Kind of State . . . . . . . . . . . . . . . . . . . . . 65

ii

CONTENTS

iii

4.1.4 Automatic Garbage Collection . . . . . . . . . . . . . . . . . . . . 65 4.2 Environment-Based Interpreters . . . . . . . . . . . . . . . . . . . . . . . . 66 4.3 The F SR Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

4.3.1 Multiplication and Factorial . . . . . . . . . . . . . . . . . . . . . . 68 4.3.2 Merge Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 4.4 Exceptions and Other Control Operations . . . . . . . . . . . . . . . . . . 72 4.4.1 Interpreting Return . . . . . . . . . . . . . . . . . . . . . . . . . . 73 4.4.2 The F X Language . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 4.4.3 Implementing the F X Interpreter . . . . . . . . . . . . . . . . . . 76

5 Object-Oriented Language Features

78

5.1 Encoding Objects in F SR . . . . . . . . . . . . . . . . . . . . . . . . . . 81

5.1.1 Simple Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

5.1.2 Object Polymorphism . . . . . . . . . . . . . . . . . . . . . . . . . 83

5.1.3 Information Hiding . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

5.1.4 Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

5.1.5 Inheritance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

5.1.6 Dynamic Dispatch . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

5.1.7 Static Fields and Methods . . . . . . . . . . . . . . . . . . . . . . . 90

5.2 The F OB Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

5.2.1 Concrete Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

5.2.2 A Direct Interpreter . . . . . . . . . . . . . . . . . . . . . . . . . . 93

5.2.3 Translating F OB to F SR . . . . . . . . . . . . . . . . . . . . . . 95

6 Type Systems

98

6.1 An Overview of Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

6.2 TF : A Typed F Variation . . . . . . . . . . . . . . . . . . . . . . . . . . 102

6.2.1 Design Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

6.2.2 The TF Language . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

6.3 Type Checking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

6.4 Types for an Advanced Language: TF SRX . . . . . . . . . . . . . . . . 108

6.5 Subtyping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

6.5.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

6.5.2 The STF R Type System: TF with Records and Subtyping . . . 113

6.5.3 Implementing an STF R Type Checker . . . . . . . . . . . . . . . 114

6.5.4 Subtyping in Other Languages . . . . . . . . . . . . . . . . . . . . 114

6.6 Type Inference and Polymorphism . . . . . . . . . . . . . . . . . . . . . . 115

6.6.1 Type Inference and Polymorphism . . . . . . . . . . . . . . . . . . 115

6.6.2 An Equational Type System: EF . . . . . . . . . . . . . . . . . . 115

6.6.3 PEF : EF with Let Polymorphism . . . . . . . . . . . . . . . . . 120

6.7 Constrained Type Inference . . . . . . . . . . . . . . . . . . . . . . . . . . 123

7 Concurrency

126

7.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

7.1.1 The Java Concurrency Model . . . . . . . . . . . . . . . . . . . . . 127

7.2 The Actor Model and AF V . . . . . . . . . . . . . . . . . . . . . . . . . 128

CONTENTS

iv

7.2.1 7.2.2 7.2.3 7.2.4 7.2.5 7.2.6

Syntax of AF V . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 Operational Semantics of Actors . . . . . . . . . . . . . . . . . . . 130 The Local Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 The Global Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 The Atomicity of Actors . . . . . . . . . . . . . . . . . . . . . . . . 132

8 Compilation by Program Transformation

133

8.1 Closure Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

8.1.1 The Official Closure Conversion . . . . . . . . . . . . . . . . . . . . 137

8.2 A-Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

8.2.1 The Official A-Translation . . . . . . . . . . . . . . . . . . . . . . . 140

8.3 Function Hoisting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

8.4 Translation to C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143

8.4.1 Memory Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

8.4.2 The toC translation . . . . . . . . . . . . . . . . . . . . . . . . . . 150

8.4.3 Compilation to Assembly code . . . . . . . . . . . . . . . . . . . . 153

8.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

8.6 Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

8.7 Garbage Collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

Bibliography

155

Index

157

Chapter 1

Introduction

In this book, our goal is to study the fundamental concepts in programming languages, as opposed to learning a range of specific languages. Languages are easy to learn, it is the concepts behind them that are difficult. The basic features we study in turn include higher-order functions, data structures in the form of records and variants, mutable state, exceptions, objects and classes, and types. We also study language implementations, both through language interpreters and language compilers. Throughout the book we write small interpreters for toy languages, and in Chapter 8 we write a principled compiler. We define type checkers to define which programs are well-typed and which are not. We also take a more precise, mathematical view of interpreters and type checkers, via the concepts of operational semantics and type systems. These last two concepts have historically evolved from the logician's view of programming.

The material has evolved from lecture notes used in a programming languages course for juniors, seniors, and graduate students at Johns Hopkins University [21].

While the book uses formal mathematical techniques such as operational semantics and type systems, it does not emphasize proofs of properties of these systems. We will nonetheless sketch the intuitions of some proofs.

The OCaml Language

The OCaml programming language [15] is used throughout the book, and assignments related to the book should be written in OCaml. OCaml is a modern dialect of ML which has the advantages of being reliable, fast, free, and available on just about any platform through .

The FbDK

Complementing the book is the F Development Kit, FbDK. It is a set of OCaml utilities and interpreters for designing and experimenting with the toy F and F SR languages defined in the book. It is available from the book homepage at pl/book.

1

CHAPTER 1. INTRODUCTION

2

Background Needed

The book assumes familiarity with the basics of OCaml, including the module system (but not the objects, the "O" in OCaml). Beyond that there is no absolute prerequisite, but knowledge of C, C++, and Java is helpful because many of the topics in this book are implemented in these languages. The compiler presented in chapter 8 produces C code as its target, and so a basic knowledge of C will be needed to implement the compiler. More nebulously, a certain "mathematical maturity" greatly helps in understanding the concepts, some of which are deep. for this reason, previous study of mathematics, formal logic and other foundational topics in Computer Science such as automata theory, grammars, and algorithms will be a great help.

Now, make sure your seat belts are buckled, sit back, relax, and enjoy the ride. . .

Chapter 2

Operational Semantics

2.1 A First Look at Operational Semantics

The syntax of a programming language is the set of rules governing the formation of expressions in the language. The semantics of a programming language is the meaning of those expressions.

There are several forms of language semantics. Axiomatic semantics is a set of axiomatic truths in a programming language. Denotational semantics involves modeling programs as static mathematical objects, namely as set-theoretic functions with specific properties. We, however, will focus on a form of semantics called operational semantics.

An operational semantics is a mathematical model of programming language execution. It is, in essence, an interpreter defined mathematically. However, an operational semantics is more precise than an interpreter because it is defined mathematically, and not based on the meaning of the programming language in which the interpreter is written. This might sound sound like a pedantic distinction, but interpreters interpret e.g. a language's if statements with the if statement of the language the interpreter is written in. This is in some sense a circular definition of if. Formally, we can define operational semantics as follows.

Definition 2.1 (Operational Semantics). An operational semantics for a programming language is a mathematical definition of its computation relation, e v, where e is a program in the language.

e v is mathematically a 2-place relation between expressions of the language, e, and values of the language, v. Integers and booleans are values. Functions are also values because they don't compute to anything. e and v are metavariables, meaning they denote an arbitrary expression or value, and should not be confused with the (regular) variables that are part of programs.

An operational semantics for a programming language is a means for understanding in precise detail the meaning of an expression in the language. It is the formal specification of the language that is used when writing compilers and interpreters, and it allows us to rigorously verify things about the language.

3

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download