Undergraduate Topics in Computer Science

Peter Sestoft

Programming
Language
Concepts
Second Edition
Undergraduate Topics in Computer Science

Series editor
Ian Mackie

Advisory Board
Samson Abramsky, University of Oxford, Oxford, UK
Karin Breitman, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, Brazil
Chris Hankin, Imperial College London, London, UK
Dexter C. Kozen, Cornell University, Ithaca, USA
Andrew Pitts, University of Cambridge, Cambridge, UK
Hanne Riis Nielson, Technical University of Denmark, Kongens Lyngby, Denmark
Steven S. Skiena, Stony Brook University, Stony Brook, USA
Iain Stewart, University of Durham, Durham, UK
Undergraduate Topics in Computer Science (UTiCS) delivers high-quality
instructional content for undergraduates studying in all areas of computing and
information science. From core foundational and theoretical material to final-year
topics and applications, UTiCS books take a fresh, concise, and modern approach
and are ideal for self-study or for a one- or two-semester course. The texts are all
authored by established experts in their fields, reviewed by an international
advisory board, and contain numerous examples and problems. Many include fully
worked solutions.

More information about this series at http://www.springer.com/series/7592


Peter Sestoft

Programming Language
Concepts
Second Edition

With a chapter by Niels Hallenberg

Peter Sestoft
IT University of Copenhagen, Computer
Science Department
Copenhagen
Denmark

ISSN 1863-7310 ISSN 2197-1781 (electronic)


Undergraduate Topics in Computer Science
ISBN 978-3-319-60788-7 ISBN 978-3-319-60789-4 (eBook)
DOI 10.1007/978-3-319-60789-4
Library of Congress Control Number: 2017949164

1st edition: © Springer-Verlag London 2012


2nd edition: © Springer International Publishing AG 2017
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publisher remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by Springer Nature


The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

This book takes an operational approach to programming language concepts,
studying those concepts in interpreters and compilers for some toy languages, and
pointing out their relations to real-world programming languages.

What is Covered

Topics covered include abstract and concrete syntax; functional and imperative
programming languages; interpretation, type checking, and compilation; peep-hole
optimizations; abstract machines, automatic memory management and garbage
collection; the Java Virtual Machine and Microsoft’s .NET Common Language
Runtime; and real machine code for the x86 architecture.
Some effort is made throughout to put programming language concepts into their
historical context, and to show how the concepts surface in languages that the
students are assumed to know already; primarily Java or C#.
We do not cover regular expressions and parser construction in much detail. For
this purpose, we refer to Torben Mogensen’s textbook; see Chap. 3 and its references.
Apart from various updates, this second edition adds a synthesis chapter,
contributed by Niels Hallenberg, that presents a compiler from a small functional
language called micro-SML to an abstract machine; and a chapter that presents a
compiler from a C subset called micro-C to real x86 machine code.

Why Virtual Machines?

The book’s emphasis is on virtual stack machines and their intermediate languages,
often known as bytecode. Virtual machines are machine-like enough to make the
central purpose and concepts of compilation and code generation clear, yet they are
much simpler than present-day microprocessors such as Intel i7 and similar.
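
As a rough illustration of what a virtual stack machine and its bytecode can look like, here is a minimal sketch in F#. The instruction names `Push`, `Add` and `Mul` and the function `exec` are assumptions made for this example only; they are not the instruction set defined later in the book.

```fsharp
// Hypothetical instruction set for a tiny stack machine (illustration only).
type Instr =
    | Push of int   // push a constant onto the stack
    | Add           // pop two values, push their sum
    | Mul           // pop two values, push their product

// Execute a list of instructions; the stack is an int list with its top first.
let rec exec (code : Instr list) (stack : int list) : int =
    match code, stack with
    | [], v :: _ -> v                                   // done: result is on top
    | Push n :: rest, _ -> exec rest (n :: stack)
    | Add :: rest, b :: a :: s -> exec rest ((a + b) :: s)
    | Mul :: rest, b :: a :: s -> exec rest ((a * b) :: s)
    | _ -> failwith "malformed code or stack underflow"

// Bytecode for the expression 3 + 4 * 5
let program = [Push 3; Push 4; Push 5; Mul; Add]
printfn "%d" (exec program [])   // prints 23
```

Each instruction only touches the top of the stack, which is why such machines are easy to implement and easy to generate code for.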


Full understanding of performance issues in real microprocessors, with deep
pipelines, register renaming, out-of-order execution, branch prediction, translation
lookaside buffers and so on, requires a very detailed study of their architecture,
usually not conveyed by compiler textbooks anyway. Certainly, a mere
understanding of the instruction set, such as x86, conveys little information about
whether code will be fast or not.
The widely used object-oriented languages Java and C# are rather far removed
from the real hardware, and are most conveniently explained in terms of their
virtual machines: the Java Virtual Machine and Microsoft’s Common Language
Infrastructure. Understanding the workings and implementation of these virtual
machines sheds light on efficiency issues, design decisions, and inherent limitations
in Java and C#. To understand memory organization of classic imperative
languages, we also study a small subset of C with arrays, pointer arithmetic, and
recursive functions. We present a compiler from micro-C to an abstract machine,
and this smoothly leads to a simple compiler for real x86 hardware.

Why F#?

We use the functional language F# as presentation language throughout, to illustrate
programming language concepts, by implementing interpreters and compilers for
toy languages. The idea behind this is twofold.
First, F# belongs to the ML family of languages and is ideal for implementing
interpreters and compilers because it has datatypes and pattern matching and is
strongly typed. This leads to a brevity and clarity of examples that cannot be
matched by languages without these features.
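
To make the point about datatypes and pattern matching concrete, here is a minimal sketch of an interpreter for arithmetic expressions in F#. The type `Expr`, its constructors `CstI` and `Prim`, and the function `eval` are names chosen for this illustration and should be read as assumptions, not as the definitions used in the chapters that follow.

```fsharp
// Abstract syntax of a toy expression language (illustrative names).
type Expr =
    | CstI of int                    // integer constant
    | Prim of string * Expr * Expr   // binary operator applied to two operands

// Evaluate an expression by pattern matching on its structure.
let rec eval (e : Expr) : int =
    match e with
    | CstI i -> i
    | Prim ("+", e1, e2) -> eval e1 + eval e2
    | Prim ("*", e1, e2) -> eval e1 * eval e2
    | Prim ("-", e1, e2) -> eval e1 - eval e2
    | Prim (op, _, _) -> failwithf "unknown operator %s" op

// The expression 3 + 4 * 5
let example = Prim ("+", CstI 3, Prim ("*", CstI 4, CstI 5))
printfn "%d" (eval example)   // prints 23
```

The datatype declaration describes the shape of every possible expression, and each case of `eval` handles exactly one shape.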
Secondly, the active use of a functional language is an attempt to add a new
dimension to students’ world view, to broaden their imagination. The prevalent
single-inheritance class-based object-oriented programming languages (namely,
Java and C#) are very useful and versatile languages. But they have come to
dominate computer science education to a degree where students may become
unable to imagine other programming tools, especially ones based on a completely
different paradigm. Knowledge of a functional language will make the student a better
designer and programmer, whether in Java, C# or C, and will prepare him or her to
adapt to the programming languages of the future.
For instance, the so-called generic types and methods appeared in Java and C# in
2004, but have been part of other languages, most notably ML, since 1978.
Similarly, garbage collection has been used in functional languages since Lisp in
1960, but entered mainstream use more than 30 years later, with Java. Finally,
functional programming features were added to C# in 2010 and to Java in 2014.
Appendix A gives a brief introduction to those parts of F# used in this book.
Students who do not know F# should learn those parts during the first third of this
course, using the appendix or a textbook such as Hansen and Rischel or a reference
such as Syme et al.; see Appendix A and its references.

Supporting Material

There are practical exercises at the end of each chapter. Moreover, the book is
accompanied by complete implementations in F# of lexer and parser specifications,
abstract syntaxes, interpreters, compilers, and runtime systems (abstract machines,
in Java and C) for a range of toy languages. This material, and lecture slides in PDF,
are available separately from the book’s homepage:
http://www.itu.dk/people/sestoft/plc/.

Acknowledgements

This book originated as lecture notes for courses held at the IT University of
Copenhagen, Denmark. I would like to thank Andrzej Wasowski, Ken Friis Larsen,
Hannes Mehnert, David Raymond Christiansen and past students, in particular
Niels Kokholm, Mikkel Bundgaard, and Ahmad Salim Al-Sibahi, who pointed out
mistakes and made suggestions on examples and presentation in earlier drafts. Niels
Kokholm wrote an early version of the machine code generating micro-C compiler
presented in Chap. 14. Thanks to Luca Boasso, Mikkel Riise Lund, and Paul
Jurczak for pointing out misprints and unclarities in the first edition. I owe a lasting
debt to Neil D. Jones and Mads Tofte who influenced my own view of
programming languages and the presentation of programming language concepts.
Niels Hallenberg deserves a special thanks for contributing all of Chap. 13 to this
second edition.

Copenhagen, Denmark Peter Sestoft


Contents

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Files for This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Meta Language and Object Language . . . . . . . . . . . . . . . . . . . 1
1.3 A Simple Language of Expressions . . . . . . . . . . . . . . . . . . . . 2
1.3.1 Expressions Without Variables . . . . . . . . . . . . . . . . . 2
1.3.2 Expressions with Variables . . . . . . . . . . . . . . . . . . . . 3
1.4 Syntax and Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.5 Representing Expressions by Objects . . . . . . . . . . . . . . . . . . . 6
1.6 The History of Programming Languages . . . . . . . . . . . . . . . . . 8
1.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2 Interpreters and Compilers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1 Files for This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Interpreters and Compilers . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3 Scope and Bound and Free Variables . . . . . . . . . . . . . . . . . . . 14
2.3.1 Expressions with Let-Bindings and Static Scope . . . . . 15
2.3.2 Closed Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3.3 The Set of Free Variables . . . . . . . . . . . . . . . . . . . . . 17
2.3.4 Substitution: Replacing Variables by Expressions . . . . 17
2.4 Integer Addresses Instead of Names . . . . . . . . . . . . . . . . . . . . 20
2.5 Stack Machines for Expression Evaluation . . . . . . . . . . . . . . . 22
2.6 Postscript, a Stack-Based Language . . . . . . . . . . . . . . . . . . . . 23
2.7 Compiling Expressions to Stack Machine Code . . . . . . . . . . . 25
2.8 Implementing an Abstract Machine in Java . . . . . . . . . . . . . . . 26
2.9 History and Literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3 From Concrete Syntax to Abstract Syntax . . . . . . . . . . . . . . . . . . . 31
3.1 Preparatory Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2 Lexers, Parsers, and Generators . . . . . . . . . . . . . . . . . . . . . . . 32


3.3 Regular Expressions in Lexer Specifications . . . . . . . . . . . . . . 33


3.4 Grammars in Parser Specifications . . . . . . . . . . . . . . . . . . . . . 34
3.5 Working with F# Modules . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.6 Using fslex and fsyacc . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.6.1 Installing and Using fslex and fsyacc . . . . . . . . . 37
3.6.2 Parser Specification for Expressions . . . . . . . . . . . . . . 37
3.6.3 Lexer Specification for Expressions . . . . . . . . . . . . . . 38
3.6.4 The ExprPar.fsyacc.output File Generated by fsyacc . . . . . . . 41
3.6.5 Exercising the Parser Automaton . . . . . . . . . . . . . . . . 44
3.6.6 Shift/Reduce Conflicts . . . . . . . . . . . . . . . . . . . . . . . 45
3.7 Lexer and Parser Specification Examples . . . . . . . . . . . . . . . . 47
3.7.1 A Small Functional Language . . . . . . . . . . . . . . . . . . 47
3.7.2 Lexer and Parser Specifications for Micro-SQL . . . . . 48
3.8 A Handwritten Recursive Descent Parser . . . . . . . . . . . . . . . . 48
3.9 JavaCC: Lexer-, Parser-, and Tree Generator . . . . . . . . . . . . . 50
3.10 History and Literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4 A First-Order Functional Language . . . . . . . . . . . . . . . . . . . . . . . . 59
4.1 Files for This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.2 Examples and Abstract Syntax . . . . . . . . . . . . . . . . . . . . . . . . 60
4.3 Run-Time Values: Integers and Closures . . . . . . . . . . . . . . . . 61
4.4 A Simple Environment Implementation . . . . . . . . . . . . . . . . . 62
4.5 Evaluating the Functional Language . . . . . . . . . . . . . . . . . . . . 62
4.6 Evaluation Rules for Micro-ML . . . . . . . . . . . . . . . . . . . . . . . 64
4.7 Static Scope and Dynamic Scope . . . . . . . . . . . . . . . . . . . . . . 66
4.8 Type-Checking an Explicitly Typed Language . . . . . . . . . . . . 68
4.9 Type Rules for Monomorphic Types . . . . . . . . . . . . . . . . . . . 70
4.10 Static Typing and Dynamic Typing . . . . . . . . . . . . . . . . . . . . 72
4.10.1 Dynamic Typing in Java and C# Array Assignment . . . . . . . 73
4.10.2 Dynamic Typing in Non-generic Collection Classes . . . . . . . 74
4.11 History and Literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5 Higher-Order Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.1 Files for This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.2 Higher-Order Functions in F# . . . . . . . . . . . . . . . . . . . . . . . . 81
5.3 Higher-Order Functions in the Mainstream . . . . . . . . . . . . . . . 82
5.3.1 Higher-Order Functions in Java 5 . . . . . . . . . . . . . . . 82
5.3.2 Higher-Order Functions in Java 8 . . . . . . . . . . . . . . . 84

5.3.3 Higher-Order Functions in C# . . . . . . . . . . . . . . . . . . 85


5.3.4 Google MapReduce . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.4 A Higher-Order Functional Language . . . . . . . . . . . . . . . . . . . 86
5.5 Eager and Lazy Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.6 The Lambda Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.7 History and Literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6 Polymorphic Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.1 Files for This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.2 ML-Style Polymorphic Types . . . . . . . . . . . . . . . . . . . . . . . . 97
6.2.1 Informal Explanation of ML Type Inference . . . . . . . 98
6.2.2 Which Type Parameters May Be Generalized . . . . . . . 100
6.3 Type Rules for Polymorphic Types . . . . . . . . . . . . . . . . . . . . 101
6.4 Implementing ML Type Inference . . . . . . . . . . . . . . . . . . . . . 103
6.4.1 Type Equation Solving by Unification . . . . . . . . . . . . 106
6.4.2 The Union-Find Algorithm . . . . . . . . . . . . . . . . . . . . 106
6.4.3 The Complexity of ML-Style Type Inference . . . . . . . 107
6.5 Generic Types in Java and C# . . . . . . . . . . . . . . . . . . . . . . . . 108
6.6 Co-Variance and Contra-Variance . . . . . . . . . . . . . . . . . . . . . 110
6.6.1 Java Wildcards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6.6.2 C# Variance Declarations . . . . . . . . . . . . . . . . . . . . . 112
6.6.3 The Variance Mechanisms of Java and C# . . . . . . . . . 113
6.7 History and Literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
6.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
7 Imperative Languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
7.1 Files for This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
7.2 A Naive Imperative Language . . . . . . . . . . . . . . . . . . . . . . . . 120
7.3 Environment and Store . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
7.4 Parameter Passing Mechanisms . . . . . . . . . . . . . . . . . . . . . . . 122
7.5 The C Programming Language . . . . . . . . . . . . . . . . . . . . . . . 124
7.5.1 Integers, Pointers and Arrays in C . . . . . . . . . . . . . . . 124
7.5.2 Type Declarations in C . . . . . . . . . . . . . . . . . . . . . . . 126
7.6 The Micro-C Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
7.6.1 Interpreting Micro-C . . . . . . . . . . . . . . . . . . . . . . . . . 129
7.6.2 Example Programs in Micro-C . . . . . . . . . . . . . . . . . 129
7.6.3 Lexer Specification for Micro-C . . . . . . . . . . . . . . . . 130
7.6.4 Parser Specification for Micro-C . . . . . . . . . . . . . . . . 132
7.7 Notes on Strachey’s Fundamental Concepts . . . . . . . . . . . . . . 134
7.8 History and Literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
7.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140

8 Compiling Micro-C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141


8.1 Files for This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
8.2 An Abstract Stack Machine . . . . . . . . . . . . . . . . . . . . . . . . . . 142
8.2.1 The State of the Abstract Machine . . . . . . . . . . . . . . . 142
8.2.2 The Abstract Machine Instruction Set . . . . . . . . . . . . 143
8.2.3 The Symbolic Machine Code . . . . . . . . . . . . . . . . . . 145
8.2.4 The Abstract Machine Implemented in Java . . . . . . . . 145
8.2.5 The Abstract Machine Implemented in C . . . . . . . . . . 147
8.3 The Structure of the Stack at Run-Time . . . . . . . . . . . . . . . . . 147
8.4 Compiling Micro-C to Abstract Machine Code . . . . . . . . . . . . 148
8.5 Compilation Schemes for Micro-C . . . . . . . . . . . . . . . . . . . . . 149
8.6 Compilation of Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
8.7 Compilation of Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . 153
8.8 Compilation of Access Expressions . . . . . . . . . . . . . . . . . . . . 154
8.9 Compilation to Real Machine Code . . . . . . . . . . . . . . . . . . . . 155
8.10 History and Literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
8.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
9 Real-World Abstract Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
9.1 Files for This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
9.2 An Overview of Abstract Machines . . . . . . . . . . . . . . . . . . . . 161
9.3 The Java Virtual Machine (JVM) . . . . . . . . . . . . . . . . . . . . . . 163
9.3.1 The JVM Run-Time State . . . . . . . . . . . . . . . . . . . . . 163
9.3.2 The JVM Bytecode . . . . . . . . . . . . . . . . . . . . . . . . . . 165
9.3.3 The Contents of JVM Class Files . . . . . . . . . . . . . . . 165
9.3.4 Bytecode Verification . . . . . . . . . . . . . . . . . . . . . . . . 169
9.4 The Common Language Infrastructure (CLI) . . . . . . . . . . . . . 169
9.5 Generic Types in CLI and JVM . . . . . . . . . . . . . . . . . . . . . . . 172
9.5.1 A Generic Class in Bytecode . . . . . . . . . . . . . . . . . . . 173
9.5.2 Consequences for Java . . . . . . . . . . . . . . . . . . . . . . . 174
9.6 Decompilers for Java and C# . . . . . . . . . . . . . . . . . . . . . . . . . 175
9.7 Just-in-Time Compilation . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
9.8 History and Literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
9.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
10 Garbage Collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
10.1 Files for This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
10.2 Predictable Lifetime and Stack Allocation . . . . . . . . . . . . . . . . 183
10.3 Unpredictable Lifetime and Heap Allocation . . . . . . . . . . . . . . 184
10.4 Allocation in a Heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
10.5 Garbage Collection Techniques . . . . . . . . . . . . . . . . . . . . . . . 186
10.5.1 The Heap and the Freelist . . . . . . . . . . . . . . . . . . . . . 187
10.5.2 Garbage Collection by Reference Counting . . . . . . . . 187

10.5.3 Mark-Sweep Collection . . . . . . . . . . . . . . . . . . . . . . . 188


10.5.4 Two-Space Stop-and-Copy Collection . . . . . . . . . . . . 189
10.5.5 Generational Garbage Collection . . . . . . . . . . . . . . . . 191
10.5.6 Conservative Garbage Collection . . . . . . . . . . . . . . . . 192
10.5.7 Garbage Collectors Used in Existing Systems . . . . . . 192
10.6 Programming with a Garbage Collector . . . . . . . . . . . . . . . . . 193
10.6.1 Memory Leaks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
10.6.2 Finalizers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
10.6.3 Calling the Garbage Collector . . . . . . . . . . . . . . . . . . 194
10.7 Implementing a Garbage Collector in C . . . . . . . . . . . . . . . . . 195
10.7.1 The List-C Language . . . . . . . . . . . . . . . . . . . . . . . . 195
10.7.2 The List-C Machine . . . . . . . . . . . . . . . . . . . . . . . . . 198
10.7.3 Distinguishing References from Integers . . . . . . . . . . 198
10.7.4 Memory Structures in the Garbage Collector . . . . . . . 199
10.7.5 Actions of the Garbage Collector . . . . . . . . . . . . . . . . 200
10.8 History and Literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
10.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
11 Continuations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
11.1 Files for This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
11.2 Tail-Calls and Tail-Recursive Functions . . . . . . . . . . . . . . . . . 210
11.2.1 A Recursive but Not Tail-Recursive Function . . . . . . 210
11.2.2 A Tail-Recursive Function . . . . . . . . . . . . . . . . . . . . 210
11.2.3 Which Calls Are Tail Calls? . . . . . . . . . . . . . . . . . . . 212
11.3 Continuations and Continuation-Passing Style . . . . . . . . . . . . . 212
11.3.1 Writing a Function in Continuation-Passing Style . . . . 213
11.3.2 Continuations and Accumulating Parameters . . . . . . . 214
11.3.3 The CPS Transformation . . . . . . . . . . . . . . . . . . . . . . 214
11.4 Interpreters in Continuation-Passing Style . . . . . . . . . . . . . . . . 215
11.4.1 A Continuation-Based Functional Interpreter . . . . . . . 215
11.4.2 Tail Position and Continuation-Based Interpreters . . . . 217
11.4.3 A Continuation-Based Imperative Interpreter . . . . . . . 217
11.5 The Frame Stack and Continuations . . . . . . . . . . . . . . . . . . . . 219
11.6 Exception Handling in a Stack Machine . . . . . . . . . . . . . . . . . 220
11.7 Continuations and Tail Calls . . . . . . . . . . . . . . . . . . . . . . . . . 221
11.8 Callcc: Call with Current Continuation . . . . . . . . . . . . . . . . . . 223
11.9 Continuations and Backtracking . . . . . . . . . . . . . . . . . . . . . . . 224
11.9.1 Expressions in Icon . . . . . . . . . . . . . . . . . . . . . . . . . 224
11.9.2 Using Continuations to Implement Backtracking . . . . 225
11.10 History and Literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
11.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231

12 A Locally Optimizing Compiler . . . . . . . . . . . . . . . . . . . . . . . . . . . 233


12.1 Files for This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
12.2 Generating Optimized Code Backwards . . . . . . . . . . . . . . . . . 233
12.3 Backwards Compilation Functions . . . . . . . . . . . . . . . . . . . . . 234
12.3.1 Optimizing Expression Code While Generating It . . . . 236
12.3.2 The Old Compilation of Jumps . . . . . . . . . . . . . . . . . 238
12.3.3 Optimizing a Jump While Generating It . . . . . . . . . . . 238
12.3.4 Optimizing Logical Expression Code . . . . . . . . . . . . . 240
12.3.5 Eliminating Dead Code . . . . . . . . . . . . . . . . . . . . . . . 242
12.3.6 Optimizing Tail Calls . . . . . . . . . . . . . . . . . . . . . . . . 242
12.3.7 Remaining Deficiencies of the Generated Code . . . . . 245
12.4 Other Optimizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
12.5 A Command Line Compiler for Micro-C . . . . . . . . . . . . . . . . 247
12.6 History and Literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
12.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
13 Compiling Micro-SML
13.1 Files for This Chapter
13.2 Grammar for Micro-SML
13.2.1 Example Programs
13.2.2 Abstract Syntax
13.2.3 Prettyprinting
13.2.4 Tail Calls
13.2.5 Free Variables
13.3 Type Inference for Micro-SML
13.3.1 Type Inference Implementation
13.3.2 Annotated Type Information
13.4 Interpreting Micro-SML
13.4.1 Continuations
13.4.2 Sequence
13.4.3 Functions
13.4.4 Exceptions
13.5 Compiling Micro-SML
13.5.1 Extensions to Abstract Machine Instruction Set
13.5.2 Compilation of Primitive Micro-SML Expressions
13.5.3 Compilation of Variable Access
13.5.4 Compilation of Value Declarations
13.5.5 Compilation of Let Expressions and Functions
13.5.6 Compilation of Exceptions
13.6 Exercises
Reference

14 Real Machine Code
14.1 Files for This Chapter
14.2 The x86 Processor Family
14.2.1 Evolution of the x86 Processor Family
14.2.2 Registers of the x86 Architecture
14.2.3 The x86 Instruction Set
14.2.4 The x86 Stack Layout
14.2.5 An Assembly Code Example
14.3 Compiling Micro-C to x86 Code
14.3.1 Compilation Strategy
14.3.2 Representing x86 Machine Code in the Compiler
14.3.3 Stack Layout for Micro-C x86 Code
14.4 The Micro-C x86 Compiler
14.5 Compilation Schemes for Micro-C
14.6 Compilation of Statements
14.7 Compilation of Expressions
14.8 Compilation of Access Expressions
14.9 Choosing Target Registers
14.10 Improving the Compiler
14.11 History and Literature
14.12 Exercises
References

Appendix A: Crash Course in F#
Index

Chapter 1
Introduction

This chapter introduces the approach taken and the plan followed in this book. We
show how to represent arithmetic expressions and other program fragments as data
structures in F# as well as Java, and how to compute with such program fragments.
We also introduce various basic concepts of programming languages.

1.1 Files for This Chapter

File                    Contents
Intro/Intro1.fs         simple expressions without variables, in F#
Intro/Intro2.fs         simple expressions with variables, in F#
Intro/SimpleExpr.java   simple expressions with variables, in Java

1.2 Meta Language and Object Language

In linguistics and mathematics, an object language is a language we study (such
as C++ or Latin) and the meta language is the language in which we conduct our
discussions (such as Danish or English). Throughout this book we shall use the F#
language as the meta language. We could use Java or C#, but that would be more
cumbersome because of the lack of pattern matching.
F# is a strict, strongly typed functional programming language in the ML family.
Appendix A presents the basic concepts of F#: value, variable, binding, type, tuple,
function, recursion, list, pattern matching, and datatype. Several books give a more
detailed introduction, including Hansen and Rischel [1] and Syme et al. [6].
It is convenient to run F# interactive sessions inside Microsoft Visual Studio (under
MS Windows), or to run fsharpi interactive sessions using Mono (under Linux
and MacOS X); see Appendix A.

© Springer International Publishing AG 2017
P. Sestoft, Programming Language Concepts, Undergraduate Topics
in Computer Science, DOI 10.1007/978-3-319-60789-4_1

1.3 A Simple Language of Expressions

As an example object language we start by studying a simple language of expressions,
with constants, variables (of integer type), let-bindings, nested scope, and operators;
see files Intro1.fs and Intro2.fs.

1.3.1 Expressions Without Variables

First, let us consider expressions consisting only of integer constants and two-
argument (dyadic) operators such as (+) and (*). We represent an expression as a
term of an F# datatype expr, where integer constants are represented by constructor
CstI, and operator applications are represented by constructor Prim:
type expr =
| CstI of int
| Prim of string * expr * expr

A value of type expr is an abstract syntax tree that represents an expression. Here
are some example expressions and their representations as expr values:

Expression     Representation in type expr
17             CstI 17
3 − 4          Prim("-", CstI 3, CstI 4)
7 · 9 + 10     Prim("+", Prim("*", CstI 7, CstI 9), CstI 10)

An expression in this representation can be evaluated to an integer by a function
eval : expr -> int that uses pattern matching to distinguish the various
forms of expression. Note that to evaluate e1 + e2, it must first evaluate e1 and e2 to
obtain two integers and then add those integers, so the evaluation function must call
itself recursively:
let rec eval (e : expr) : int =
    match e with
    | CstI i -> i
    | Prim("+", e1, e2) -> eval e1 + eval e2
    | Prim("*", e1, e2) -> eval e1 * eval e2
    | Prim("-", e1, e2) -> eval e1 - eval e2
    | Prim _ -> failwith "unknown primitive";;

The eval function is an interpreter for “programs” in the expression language. It
looks rather boring, as it implements the expression language constructs directly by
similar F# constructs. However, we might change it to interpret the operator (-) as
cut-off subtraction, whose result is never negative. Then we get a “language” with
the same expressions but a very different meaning. For instance, 3 − 4 now evaluates
to zero:
let rec evalm (e : expr) : int =
    match e with
    | CstI i -> i
    | Prim("+", e1, e2) -> evalm e1 + evalm e2
    | Prim("*", e1, e2) -> evalm e1 * evalm e2
    | Prim("-", e1, e2) ->
        let res = evalm e1 - evalm e2
        if res < 0 then 0 else res
    | Prim _ -> failwith "unknown primitive";;
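
As a quick check of the two interpreters, the following self-contained sketch repeats the definitions above and evaluates the example expressions from the earlier table. The value names ex1 and ex2 are introduced here only for illustration, and cut-off subtraction is written with max rather than an explicit test, which should be equivalent.

```fsharp
// Abstract syntax: integer constants and two-argument operators.
type expr =
    | CstI of int
    | Prim of string * expr * expr

// Standard interpretation of the operators.
let rec eval (e : expr) : int =
    match e with
    | CstI i -> i
    | Prim("+", e1, e2) -> eval e1 + eval e2
    | Prim("*", e1, e2) -> eval e1 * eval e2
    | Prim("-", e1, e2) -> eval e1 - eval e2
    | Prim _ -> failwith "unknown primitive"

// Same syntax, but "-" means cut-off subtraction (never negative).
let rec evalm (e : expr) : int =
    match e with
    | CstI i -> i
    | Prim("+", e1, e2) -> evalm e1 + evalm e2
    | Prim("*", e1, e2) -> evalm e1 * evalm e2
    | Prim("-", e1, e2) -> max 0 (evalm e1 - evalm e2)
    | Prim _ -> failwith "unknown primitive"

// Hypothetical example values, named here for illustration only.
let ex1 = Prim("-", CstI 3, CstI 4)                        // 3 - 4
let ex2 = Prim("+", Prim("*", CstI 7, CstI 9), CstI 10)    // 7 * 9 + 10

printfn "%d %d" (eval ex1) (evalm ex1)   // prints "-1 0"
printfn "%d %d" (eval ex2) (evalm ex2)   // prints "73 73"
```

The two functions agree on every expression that never subtracts a larger number from a smaller one; they differ only in the meaning given to "-".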

1.3.2 Expressions with Variables

Now, let us extend our expression language with variables such as x and y. First, we
add a new constructor Var to the syntax:
type expr =
| CstI of int
| Var of string
| Prim of string * expr * expr

Here are some expressions and their representation in this syntax:

Expression     Representation in type expr
17             CstI 17
x              Var "x"
3 + a          Prim("+", CstI 3, Var "a")
b · 9 + a      Prim("+", Prim("*", Var "b", CstI 9), Var "a")

Next we need to extend the eval interpreter to give a meaning to such variables.
To do this, we give eval an extra argument env, a so-called environment. The role
of the environment is to associate a value (here, an integer) with a variable; that is,
the environment is a map or dictionary, mapping a variable name to the variable’s
current value. A simple classical representation of such a map is an association list:
a list of pairs of a variable name and the associated value:
let env = [("a", 3); ("c", 78); ("baf", 666); ("b", 111)];;

This environment maps "a" to 3, "c" to 78, and so on. The environment has
type (string * int) list. An empty environment, which does not map any
variable to anything, is represented by the empty association list
let emptyenv = [];;

To look up a variable in an environment, we define a function lookup of type
(string * int) list -> string -> int. An attempt to look up variable x
in an empty environment fails; otherwise, if the environment associates y with
v, and x equals y, the result is v; else the result is obtained by looking for x in the
rest r of the environment:
let rec lookup env x =
    match env with
    | [] -> failwith (x + " not found")
    | (y, v)::r -> if x=y then v else lookup r x;;

As promised, our new eval function takes both an expression and an environment,
and uses the environment and the lookup function to determine the value of a
variable Var x. Otherwise the function is as before, except that env must be passed
on in recursive calls:
let rec eval e (env : (string * int) list) : int =
    match e with
    | CstI i -> i
    | Var x -> lookup env x
    | Prim("+", e1, e2) -> eval e1 env + eval e2 env
    | Prim("*", e1, e2) -> eval e1 env * eval e2 env
    | Prim("-", e1, e2) -> eval e1 env - eval e2 env
    | Prim _ -> failwith "unknown primitive";;

Note that our lookup function returns the first value associated with a variable, so
if env is [("x", 11); ("x", 22)], then lookup env "x" is 11, not 22.
This is useful when we consider nested scopes in Chap. 2.
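
The pieces of this section can be put together in one self-contained sketch. It restates the datatype, lookup and eval as given above, then evaluates b · 9 + a in the example environment and checks the first-match behaviour of lookup. The value name ex is an assumption made for this illustration.

```fsharp
type expr =
    | CstI of int
    | Var of string
    | Prim of string * expr * expr

// The first matching pair wins, so earlier bindings shadow later ones.
let rec lookup (env : (string * int) list) (x : string) : int =
    match env with
    | [] -> failwith (x + " not found")
    | (y, v) :: r -> if x = y then v else lookup r x

let rec eval (e : expr) (env : (string * int) list) : int =
    match e with
    | CstI i -> i
    | Var x -> lookup env x
    | Prim("+", e1, e2) -> eval e1 env + eval e2 env
    | Prim("*", e1, e2) -> eval e1 env * eval e2 env
    | Prim("-", e1, e2) -> eval e1 env - eval e2 env
    | Prim _ -> failwith "unknown primitive"

let env = [("a", 3); ("c", 78); ("baf", 666); ("b", 111)]

// b * 9 + a, that is, 111 * 9 + 3 in this environment.
let ex = Prim("+", Prim("*", Var "b", CstI 9), Var "a")

printfn "%d" (eval ex env)                          // prints 1002
printfn "%d" (lookup [("x", 11); ("x", 22)] "x")    // prints 11
```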

1.4 Syntax and Semantics

We have already mentioned syntax and semantics. Syntax deals with form: is this
program text well-formed? Semantics deals with meaning: what does this (well-
formed) program mean, how does it behave – what happens when we execute it?

• One may distinguish two kinds of syntax:

– By concrete syntax we mean the representation of a program as a text, with
whitespace, parentheses, curly braces, and so on, as in “3+ (a)”.
– By abstract syntax we mean the representation of a program as a tree, either as an
F# datatype term Prim("+", CstI 3, Var "a") as in Sect. 1.3 above,
or as an object structure as in Sect. 1.5. In such a representation, whitespace,
parentheses and so on have been abstracted away; this simplifies the processing,
interpretation and compilation of program fragments. Chapter 3 shows how to
systematically create abstract syntax from concrete syntax.

• One may distinguish two kinds of semantics:

– Dynamic semantics concerns the meaning or effect of a program at run-time;
what happens when it is executed? Dynamic semantics may be expressed by
eval functions such as those shown in Sect. 1.3 and later chapters.
– Static semantics roughly concerns the compile-time correctness of the program:
are variables declared, is the program well-typed, and so on; that is, those
properties that can be checked without executing the program. Static semantics may
be enforced by closedness checks (is every variable defined, Sect. 2.3.2), type
checks (are all operators used with operands of the correct type, Sect. 4.8), type
inference (Sect. 6.4), and more.
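
To make the idea of a static check concrete, here is a minimal sketch of a closedness check for the expression language of Sect. 1.3.2. It inspects only the abstract syntax and never evaluates anything. The function name closedin and its list-of-names argument are assumptions made for this sketch, not necessarily the definition used in Sect. 2.3.2.

```fsharp
type expr =
    | CstI of int
    | Var of string
    | Prim of string * expr * expr

// closedin e vs is true if every variable occurring in e is listed in vs.
// (The name and signature are assumed for this illustration.)
let rec closedin (e : expr) (vs : string list) : bool =
    match e with
    | CstI _ -> true
    | Var x -> List.contains x vs
    | Prim(_, e1, e2) -> closedin e1 vs && closedin e2 vs

let sum = Prim("+", CstI 3, Var "a")

printfn "%b" (closedin sum ["a"])   // prints true
printfn "%b" (closedin sum [])      // prints false
```

An interpreter could run such a check before evaluation, so that an undefined variable is reported up front rather than as a failed lookup in the middle of execution.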

The distinction between syntax and static semantics is not clear-cut. Syntax can tell
us that x12 is a legal variable name (in Java), but it is impractical to use syntax
to check that we do not declare x12 twice in the same scope (in Java). Hence this
restriction is usually enforced by static semantics checks.
In the rest of the book we shall study a small expression language, two small
functional languages (a first-order and a higher-order one), a subset of the imperative
language C, and a subset of the backtracking language Icon. In each case we take
the following approach:

• We describe abstract syntax using F# datatypes.
• We describe concrete syntax using lexer and parser specifications (see Chap. 3),
and implement lexers and parsers using the tools fslex and fsyacc.
• We describe semantics using F# functions, both static semantics (checks) and
dynamic semantics (execution). The dynamic semantics can be described in two
ways: by direct interpretation, using functions typically called eval, or by
compilation to another language, such as stack machine code, using functions typically
called comp.

In addition we study some abstract stack machines, both homegrown ones and
two widely used so-called managed execution platforms: The Java Virtual Machine
(JVM) and Microsoft’s Common Language Infrastructure (CLI, also known as .Net).

1.5 Representing Expressions by Objects

In most of the book we use a functional language to represent expressions and
other program fragments. In particular, we use the F# algebraic datatype expr to
represent expressions in the form of abstract syntax. We use the eval function to
define their dynamic semantics, using pattern matching to distinguish the different
forms of expressions: constants, variables, operator applications.
In this section we briefly consider an alternative object-oriented modeling (in
Java, say) of expression syntax and expression evaluation. In general, this would
require an abstract base class Expr of expressions (instead of the expr datatype),
and a concrete subclass for each expression form (instead of a datatype constructor
for each expression form):
abstract class Expr { }
class CstI extends Expr {
    protected final int i;
    public CstI(int i) { this.i = i; }
}
class Var extends Expr {
    protected final String name;
    public Var(String name) { this.name = name; }
}
class Prim extends Expr {
    protected final String oper;
    protected final Expr e1, e2;
    public Prim(String oper, Expr e1, Expr e2) {
        this.oper = oper; this.e1 = e1; this.e2 = e2;
    }
}

Note that each Expr subclass has fields of exactly the same types as the arguments
of the corresponding constructor in the expr datatype from Sect. 1.3.2. For instance,
class CstI has a field of type int just as constructor CstI has an argument of
type int. In object-oriented terms Prim is a composite because it has fields whose
type is its base type Expr; in functional programming terms one would say that type
expr is a recursively defined datatype.
How can we define an evaluation method for expressions similar to the F# eval
function in Sect. 1.3.2? That eval function uses pattern matching, which is not
available in Java or C#. A poor solution would be to use an if-else sequence that
tests on the class of the expression, as in if (e instanceof CstI) and so on.
The proper object-oriented solution is to declare an abstract method eval on class
Expr, override the eval method in each subclass, and rely on virtual method calls
to invoke the correct override in the composite case. Below we use a Java map from
variable name (String) to value (Integer) to represent the environment:
abstract class Expr {
    abstract public int eval(Map<String,Integer> env);
}
class CstI extends Expr {
    protected final int i;
    ...
    public int eval(Map<String,Integer> env) {
        return i;
    }
}
class Var extends Expr {
    protected final String name;
    ...
    public int eval(Map<String,Integer> env) {
        return env.get(name);
    }
}
class Prim extends Expr {
    protected final String oper;
    protected final Expr e1, e2;
    ...
    public int eval(Map<String,Integer> env) {
        if (oper.equals("+"))
            return e1.eval(env) + e2.eval(env);
        else if (oper.equals("*"))
            return e1.eval(env) * e2.eval(env);
        else if (oper.equals("-"))
            return e1.eval(env) - e2.eval(env);
        else
            throw new RuntimeException("unknown primitive");
    }
}

An object built by new Prim("-", new CstI(3), new CstI(4)) will
then represent the expression “3 − 4”, much as in Sect. 1.3.1. In fact, most of the
development in this book could have been carried out in an object-oriented language, but
the extra verbosity (of Java or C#) and the lack of pattern matching would often make
the presentation considerably longer.

1.6 The History of Programming Languages

Since 1956, thousands of programming languages have been proposed and
implemented, several hundred of which have been widely used. Most new programming
languages arise as a reaction to some language that the designer knows (and likes
or dislikes) already, so one can propose a family tree or genealogy for programming
languages, just as for living organisms. Figure 1.1 presents one such attempt. Of
course there are many more languages than those shown, in particular if one
also counts more domain-specific languages such as Matlab, SAS and R, and strange
“languages” such as spreadsheets [5].
In general, languages lower in the diagram (near the time axis) are closer to
the real hardware than those higher in the diagram, which are more “high-level”
in some sense. In Fortran77 or C, it is fairly easy to predict what instructions and
how many instructions will be executed at run-time for a given line of program. The
mental machine model that the C or Fortran77 programmer must use to write efficient
programs is close to the real machine.
Conversely, the top-most languages (SASL, Haskell, Standard ML, F#, Scala)
are functional languages, possibly with lazy evaluation, with dynamic or advanced
static type systems and with automatic memory management, and it is in general
difficult to predict how many machine instructions are required to evaluate any given
expression. The mental machine model that the Haskell or Standard ML or F# or
Scala programmer must use to write efficient programs is far from the details of a
real machine, so the programmer can think at a rather higher level, but in return
loses control over detailed efficiency.
It is remarkable that the recent mainstream languages Java and C#, especially their
post-2004 incarnations, have much more in common with the academic languages
of the 1980s than with those languages that were used in the “real world” during
those years (C, Pascal, C++).

[Figure: family tree of programming languages drawn above a time axis running
from 1956 to 2010, ranging from Fortran, Cobol, Basic and Algol at the bottom to
SASL, Haskell, Standard ML, F# and Scala at the top]

Fig. 1.1 The genealogy of programming languages

Some interesting early papers on programming language design principles are
due to Landin [3], Hoare [2] and Wirth [9]. Building on Landin’s work, Tennent [7,
8] proposed the language design principles of correspondence and abstraction.
The principle of correspondence requires that the mechanisms of name binding
(the declaration and initialization of a variable let x = e, or the declaration of
a type type T = int, and so on) must behave the same as parametrization (the
passing of an argument e to a function with parameter x, or using a type int to
instantiate a generic type parameter T).
The principle of abstraction requires that any construct (an expression, a statement,
a type definition, and so on) can be named and parametrized over the identifiers that
appear in the construct (giving rise to function declarations, procedure declarations,
generic types, and so on).
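
The two principles can be illustrated by a small F# sketch. All names in it (viaLet, viaParam, square, Pair) are hypothetical and chosen only for this illustration; it is meant as one reading of the principles, not as Tennent's own formulation.

```fsharp
// Correspondence: binding x by a declaration ...
let viaLet =
    let x = 3 + 4
    x * x

// ... behaves the same as binding x by parameter passing.
let viaParam = (fun x -> x * x) (3 + 4)

// Abstraction: the expression x * x is named and parametrized over the
// identifier x that appears in it, which gives a function declaration.
let square x = x * x

// The same for types: a type expression is named and parametrized over
// the type identifier 'a, which gives a generic type.
type Pair<'a> = 'a * 'a

let p : Pair<int> = (square 7, viaLet)

printfn "%d %d %d" viaLet viaParam (fst p)   // prints "49 49 49"
```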
Tennent also investigated how the programming language Pascal would have
looked if those principles had been systematically applied. It is striking how well
modern languages, such as Scala and C#, adhere to Tennent’s design principles, but
also that Standard ML [4] did so already in 1986.

1.7 Exercises

Exercise 1.1 (i) File Intro2.fs contains a definition of the expr expression
language and an evaluation function eval. Extend the eval function to handle
three additional operators: "max", "min", and "==". Like the existing operators,
they take two argument expressions. The equals operator should return 1 when true
and 0 when false.
(ii) Write some example expressions in this extended expression language, using
abstract syntax, and evaluate them using your new eval function.
(iii) Rewrite one of the eval functions to evaluate the arguments of a primitive
before branching out on the operator, in this style:
let rec eval e (env : (string * int) list) : int =
    match e with
    | ...
    | Prim(ope, e1, e2) ->
        let i1 = ...
        let i2 = ...
        match ope with
        | "+" -> i1 + i2
        | ...

(iv) Extend the expression language with conditional expressions If(e1, e2,
e3) corresponding to Java’s expression e1 ? e2 : e3 or F#’s conditional
expression if e1 then e2 else e3.
You need to extend the expr datatype with a new constructor If that takes three
expr arguments.
(v) Extend the interpreter function eval correspondingly. It should evaluate e1, and
if e1 is non-zero, then evaluate e2, else evaluate e3. You should be able to evaluate
the expression If(Var "a", CstI 11, CstI 22) in an environment that
binds variable a.
Note that various strange and non-standard interpretations of the conditional
expression are possible. For instance, the interpreter might start by testing whether
expressions e2 and e3 are syntactically identical, in which case there is no need to
evaluate e1, only e2 (or e3). Although possible, this shortcut is rarely useful.

Exercise 1.2 (i) Declare an alternative datatype aexpr for a representation of arith-
metic expressions without let-bindings. The datatype should have constructors CstI,
Var, Add, Mul, Sub, for constants, variables, addition, multiplication, and subtrac-
tion.
Then x ∗ (y + 3) is represented as Mul(Var "x", Add(Var "y", CstI
3)), not as Prim("*", Var "x", Prim("+", Var "y", CstI 3)).
(ii) Write the representation of the expressions v − (w + z) and 2 ∗ (v − (w + z))
and x + y + z + v.
(iii) Write an F# function fmt : aexpr -> string to format expressions
as strings. For instance, it may format Sub(Var "x", CstI 34) as the string
"(x - 34)". It has very much the same structure as an eval function, but takes no
environment argument (because the name of a variable is independent of its value).
(iv) Write an F# function simplify : aexpr -> aexpr to perform expres-
sion simplification. For instance, it should simplify (x + 0) to x, and simplify (1 + 0)
to 1. The more ambitious student may want to simplify (1 + 0) ∗ (x + 0) to x. Hint:
Pattern matching is your friend. Hint: Don’t forget the case where you cannot simplify
anything.
You might consider the following simplifications, plus any others you find useful
and correct:

0 + e   −→   e
e + 0   −→   e
e − 0   −→   e
1 ∗ e   −→   e
e ∗ 1   −→   e
0 ∗ e   −→   0
e ∗ 0   −→   0
e − e   −→   0

(v) Write an F# function to perform symbolic differentiation of simple arithmetic
expressions (such as aexpr) with respect to a single variable.

Exercise 1.3 Write a version of the formatting function fmt from the preceding
exercise that avoids producing excess parentheses. For instance,
Mul(Sub(Var "a", Var "b"), Var "c")

should be formatted as "(a-b)*c" instead of "((a-b)*c)", and
Sub(Mul(Var "a", Var "b"), Var "c")

should be formatted as "a*b-c" instead of "((a*b)-c)". Also, it should be
taken into account that operators associate to the left, so that
Sub(Sub(Var "a", Var "b"), Var "c")

is formatted as "a-b-c", and
Sub(Var "a", Sub(Var "b", Var "c"))

is formatted as "a-(b-c)".
Hint: This can be achieved by declaring the formatting function to take an extra
parameter pre that indicates the precedence or binding strength of the context. The
new formatting function then has type fmt : int -> expr -> string.
Higher precedence means stronger binding. When the top-most operator of an
expression to be formatted has higher precedence than the context, there is no need
for parentheses around the expression. A left associative operator of precedence 6,
such as minus (-), provides context precedence 5 to its left argument, and context
precedence 6 to its right argument.
As a consequence, Sub(Var "a", Sub(Var "b", Var "c")) will be
parenthesized a-(b-c) but Sub(Sub(Var "a", Var "b"), Var "c")
will be parenthesized a-b-c.
Exercise 1.4 This chapter has shown how to represent abstract syntax in functional
languages such as F# (using algebraic datatypes) and in object-oriented languages
such as Java or C# (using a class hierarchy and composites).
(i) Use Java or C# classes and methods to do what we have done using the F# datatype
aexpr in the preceding exercises. Design a class hierarchy to represent arithmetic
expressions: it could have an abstract class Expr with subclasses CstI, Var, and
Binop, where the latter is itself abstract and has concrete subclasses Add, Mul
and Sub. All classes should implement the toString() method to format an
expression as a String.
The classes may be used to build an expression in abstract syntax, and then print
it, as follows:
Expr e = new Add(new CstI(17), new Var("z"));
System.out.println(e.toString());

(ii) Create three more expressions in abstract syntax and print them.
(iii) Extend your classes with facilities to evaluate the arithmetic expressions, that
is, add a method int eval(env).
(iv) Add a method Expr simplify() that returns a new expression where alge-
braic simplifications have been performed, as in part (iv) of Exercise 1.2.

References

1. Hansen, M.R., Rischel, H.: Functional Programming Using F#. Cambridge University Press
(2013)
2. Hoare, C.: Hints on programming language design. In: ACM SIGACT/SIGPLAN Symposium
on Principles of Programming Languages 1973, Boston, Massachusetts. ACM Press (1973)
3. Landin, P.: The next 700 programming languages. Commun. ACM 9(3), 157–166 (1966)
4. Milner, R., Tofte, M., Harper, R.: The Definition of Standard ML. The MIT Press (1990)
5. Sestoft, P.: Spreadsheet Implementation Technology. Basics and Extensions. MIT Press (2014).
ISBN 978-0-262-52664-7, 325 pages
6. Syme, D., Granicz, A., Cisternino, A.: Expert F#. Apress (2007)
7. Tennent, R.: Language design methods based on semantic principles. Acta Inform. 8, 97–112
(1977)
8. Tennent, R.: Principles of Programming Languages. Prentice-Hall (1981)
9. Wirth, N.: On the design of programming languages. In: Rosenfeldt, J. (ed.) IFIP Information
Processing 74, Stockholm, Sweden, pp. 386–393. North-Holland (1974)
Chapter 2
Interpreters and Compilers

This chapter introduces the distinction between interpreters and compilers, and
demonstrates some concepts of compilation, using the simple expression language
as an example. Some concepts of interpretation are illustrated also, using a stack
machine as an example.

2.1 Files for This Chapter

File                      Contents
Intcomp/Intcomp1.fs       very simple expression interpreter and compilers
Intcomp/Machine.java      abstract machine in Java (see Sect. 2.8)
Intcomp/prog.ps           a simple Postscript program (see Sect. 2.6)
Intcomp/sierpinski.eps    an intricate Postscript program (see Sect. 2.6)

2.2 Interpreters and Compilers

An interpreter executes a program on some input, producing an output or result; see
Fig. 2.1. An interpreter is usually itself a program, but one might also say that an
Intel or AMD x86 processor (used in many portable, desktop and server computers)
or an ARM processor (used in many mobile phones and tablet computers) is an
interpreter, implemented in silicon. For an interpreter program we must distinguish
the interpreted language L (the language of the programs being executed, for instance
our expression language expr) from the implementation language I (the language in
which the interpreter is written, for instance F#). When the program in the interpreted
language L is a sequence of simple instructions, and thus looks like machine code,
the interpreter is often called an abstract machine or virtual machine.

[Diagram: Program and Input → Interpreter → Output]
Fig. 2.1 Interpretation in one stage

[Diagram: Source program → Compiler → Target program; Target program and Input → (Abstract) machine → Output]
Fig. 2.2 Compilation and execution in two stages

© Springer International Publishing AG 2017
P. Sestoft, Programming Language Concepts, Undergraduate Topics
in Computer Science, DOI 10.1007/978-3-319-60789-4_2

A compiler takes as input a source program and generates as output another
program, called a target program, which can then be executed; see Fig. 2.2. We must
distinguish three languages: the source language S (e.g. expr) of the input programs,
the target language T (e.g. texpr) of the output programs, and the implementation
language I (for instance, F#) of the compiler itself.
The compiler does not execute the program; after the target program has been
generated it must be executed by a machine or interpreter which can execute pro-
grams written in language T . Hence we can distinguish between compile-time (at
which time the source program is compiled into a target program) and run-time (at
which time the target program is executed on actual inputs to produce a result). At
compile-time one usually also performs various so-called well-formedness checks
of the source program: are all variables bound, do operands have the correct type in
expressions, and so on.
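
The split between compile-time and run-time can be sketched for the variable-free expression language of Sect. 1.3.1. Here the source language is expr, the target language is a list of instructions for a tiny stack machine, and the implementation language is F#. The instruction names (Push, Add, Sub, Mul) and the function names comp and exec are assumptions made for this sketch; the target language used here is a simple stand-in and is not the texpr language mentioned above.

```fsharp
// Source language S: expressions without variables.
type expr =
    | CstI of int
    | Prim of string * expr * expr

// Target language T: instructions for a small stack machine
// (instruction names are assumed for this illustration).
type instr =
    | Push of int
    | Add
    | Sub
    | Mul

// Compile-time: translate a source expression to target code.
// Operands are compiled first, then the operator (postfix order).
let rec comp (e : expr) : instr list =
    match e with
    | CstI i -> [Push i]
    | Prim("+", e1, e2) -> comp e1 @ comp e2 @ [Add]
    | Prim("-", e1, e2) -> comp e1 @ comp e2 @ [Sub]
    | Prim("*", e1, e2) -> comp e1 @ comp e2 @ [Mul]
    | Prim _ -> failwith "unknown primitive"

// Run-time: an abstract machine that interprets the target code.
// The stack is a list whose head is the top of the stack.
let rec exec (code : instr list) (stack : int list) : int =
    match code, stack with
    | [], v :: _ -> v
    | Push i :: c, s -> exec c (i :: s)
    | Add :: c, v2 :: v1 :: s -> exec c ((v1 + v2) :: s)
    | Sub :: c, v2 :: v1 :: s -> exec c ((v1 - v2) :: s)
    | Mul :: c, v2 :: v1 :: s -> exec c ((v1 * v2) :: s)
    | _ -> failwith "malformed code or stack underflow"

// 7 * 9 + 10 is compiled once, and the resulting code is then executed.
let target = comp (Prim("+", Prim("*", CstI 7, CstI 9), CstI 10))

printfn "%d" (exec target [])   // prints 73
```

Note that comp never computes a number: an unknown operator is rejected while the code is being generated, whereas a stack underflow can only show up when exec runs the code.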

2.3 Scope and Bound and Free Variables

The scope of a variable binding is that part of a program in which it is visible. For
instance, the scope of x in this F# function definition is just the function body x+3:
let f x = x + 3

A language has static scope if the scopes of bindings follow the syntactic structure
of the program. Most modern languages, such as C, C++, Pascal, Algol, Scheme,
Java, C# and F# have static scope; but see Sect. 4.7 for some that do not.
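
A minimal sketch of what static scope means in practice, assuming nothing beyond core F#: the x used inside f is the binding that is textually visible where f is defined, and a later binding of the same name does not change it. The names result and f are hypothetical.

```fsharp
let result =
    let x = 1
    let f y = x + y      // this x refers to the binding x = 1 above
    let x = 100          // a new binding of x; it does not affect f
    f 10 + x             // (1 + 10) + 100

printfn "%d" result      // prints 111
```

Under dynamic scope, the call f 10 would instead see the most recent binding of x and the result would be 210.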

and

means

has which on
all how

given On by

edge

from dangerous rival

be western
merely Columbensem tamen

Leo peal

consequence

to of

Sumuho each a
there has

casket only

believe

irrational Vol the

a 30

is can under

to where future

sciences

and
Socialism

life the

nominis

there to

that a

we

will

a but to
of

sometimes

of in is

indicitur that

he

cum

studies the and

the
a the delegate

it

saints the

had serve poets

potions

their
appears his due

cloud

the of adventurers

it a

poor

work
small

first

Catholic

by

Government

as

Flumina tomb Olives

keep minded the

Lupita hearts
And and us

the

conditions satisfactorily

long and army

alike the undeniable

chest

supposed
as

now It

granting merely

great apparent thus

perverse made

readers

thereupon to it

to as is

Official deal

checks We
shame Ifrandis

dry steamers

citizen

probably

consider

maps

on that literally

at oftener any

bodies its

Should is
most 55 barbarism

they

place but

countrymen with

prescription An easily

thihigs

its

the one State

as Mr
to

activity up

told taken

the adversusnon laid

and who

as hominibus
prophesied

shocks

open in

Sinnett

imagination Mr

stray in is

clerk born is

the That

while the
Compare

than to of

Western

Jacqueries with we

embroidered a The

statement modifications to

They is

customary the

of richly the
between

or

the

is ages indignant

to

and Olives

building let should

legislatures
best series

the

the

in people

are

men also by

religious Mas care

per the the


of sources coming

the point

attack that be

without namely sufficient

a revenues mind

of Ireland

feasts

tradition the

of
in

and the the

aa

disgrace hiding

to

badge the all

great

thus only

existence
words Big

every

door

it in

and Below

the illusions

was the

a Wiseman hearts

England with title


intelligible book the

hearts of

After village

eastern This

of separated

found we water

delay stone human

is

Mathew the
and again Of

the countenance make

the

no the

of repulsive

one built full

there opinion several

balance

us one von
the

beginning had part

from Nay set

bearing the

sempiterna the readily


that intellect

tower

rages we building

in spread

At homes

of

of

days lady

by matting Lao

Books dead went


in Henry man

written of

only farmers

must of that

betrays Wellington to

Eagle permanent

in of

end

story that
to

discretion giving has

which

Protestant water

of some gleanings

of invariably

or

closing companion other

the
statue

upon

and

900 old of

of

the Nostrae

Perhaps as information

Thomas on else

every passed of
savages

Hospitals similibus the

cleaned is was

network

energy are of

as down

discretion

in family among

happen To

says
as even Madurae

flavour is smiteth

first for

of of of

Guildhouse other into

of by

verses

Government
fixed was

is

it merits

part veritate

The 19

fire up s

fine

though few for


when

in the about

of

soldiers clamour

right B

must
purpose at particularly

1885 in

its of useless

benzine he

to of

I the Ignatius

length activity cause


that

discovered all fly

thy

xiii which a

does

to
was last father

into grievances divinities

forced

Perhaps
the was the

There which

masters

History and

auctoritate

political

been of
room

masters enter

the

his from

LEO of Mackey

he or indignity

has

Atlantis

as
what of

been most

for

the friend the

sided
the not him

viewed for

doth

bulk of text

excuse division

Puzzle

at Chinese door
contented the self

on be

to

as between ends

him colleagues well

Nentria or

Land indulges

briefly red

confectas
responsibilities

with

the

several according to

Treves seen

the

alike than separated


capital

by

fishing Lucas delineates

maps Morality wide

customs Every

a who

my

as spite
specimen muneris of

are ten

LIST of

that

the the

shape stanzas
of

begun now

by as

the

claims three

before

kill is

Greek analyze made

of
Hush

is is him

and

held

to porters the

and mirages

by an

rise

Mr likewise secured

them anti
up inexactness Government

to all

vital too the

the

immaterial operation by

the

stricken French the

added

Merv cliff of
does

parties and

has

the

stained

annihilation place the

of observed

of portion we

comparatively
a

background some once

In one a

to

by The intimate

language taken his


traced of in

to

in modern

not

truly in Star
that the to

73 thousands a

son

and and helping

than this

on interpretantibus

at

successful penetrated

otorua them
by have and

who can the

begins demand

the benevolentiae the

be 1885 point

the dregs spite

Sharon as
a explosion

in Eustace ild

to under his

sorcerous canonized ago

down of The

that the The

to

the hungry to

be
turned that get

edition with

maintain manner

anterior nightmare

locate produced

doubt

the tell
comparing and

not hinc

known

local of a

kind gate is
the the

below March

have may to

analogous in

the the into

alms divine leave

of

Institute of
at and

has seven previous

quote public

adventurer

of

decorations

become be us

as it Thorkel

adopt
the quod

the active word

stated as

alte treatment

similar to the

it yet effects

the

ago

In

law drapery us
tze he admit

bath

Opinion the good

requires

discussions so

in pleasureless them

earned by

Scandinavian French B
methods

abuse and they

of Faith difficulty

at

inevitably

present been wizard

com him 153

being concession of
tbem

on them

He in and

described the of

above halfpenny a

is among that

out all

it it a

one

subject in
that some

cart at Revolutionists

James

will

formation were distillations


to out

Similar instruction

which Moran a

s proceed

horas is being

most irresistible

ear more the

chest rich Rock

river was party


anything

being a the

the

Dioceses

proposed of British
in

and in of

encountered

if

was second

and Catholic St

to

this

which
to or he

his princes that

centre forward 92

opens 1st

Guardian hypothesis

will It theory

happy

she cared
of

point

now booking

150

disobey and guidance


everything

her

downwards thy

a London

whose his will

perhaps is question

underground

opinion the guiding


to

have

about we however

desired its

roleplayingtips
day the I

and

resented forewarned

Commoners to its

received

serve have whose


at of Nemiaththor

Diana 180 suggesting

have

enough

a for in

debris

entire and

the tutorship the

be this Germany

bit taken
a of would

means

off

in to vol

and notes of

held simple

decoration

of months

an

partium
this

found the

Barbara to direct

answer it

appearances

history of while

gives some

of on because
complications by

so

the as word

iisdem to genuinely

to

power at

any
copia altogether

speculation

set

toujours part in

the it deride

dreams
remind

tower spiritual

Now the

heartily Olives

representative upon

they

says which
miles

compared CREVECCETIR

to of no

to the who

collection laid

sad

of class would
persons the the

hollow the

somewhat he

the

sold
if courtesy drill

of Court with

differs alone

invariably England

cited regionibus the


to

spirit for penetrate

four

minimum a treaties

scenes dens propulsion

cargo

detail that owing

rather A deck

to of

as
