College of Computer Science & Technology
Compiler Construction Principles &
Implementation Techniques
Dr. Ying JIN
Associate Professor
College of CST, Jilin University
Sept. 2007
Compiler Construction Principles & Implementation Techniques -1-
College of Computer Science & Technology
Questionaire
• How much you know about a compiler?
• Which compilers have you used before?
• What kind of compiling errors have you find in
your programs before?
– Undefined identifier;
– Missing ……;
Compiler Construction Principles & Implementation Techniques -2-
College of Computer Science & Technology
Outline
1. Introduction to Compiler
1.1 Programming Languages
1.2 Compiler and Interpreter
1.3 Programs related to Compiler
1.4 Design and Implementation of a Compiler
1.5 Functional Decomposition and Architecture of a Compiler
1.6 General Working Process of a Compiler for a Toy Language
Compiler Construction Principles & Implementation Techniques -3-
College of Computer Science & Technology
Objectives
• To know
– Different programming languages and their features
– Different ways to implement a programming language
– The process to handling programming languages
– The functional components of a compiler
– The working process of a general simple compiler with
an example
Compiler Construction Principles & Implementation Techniques -4-
College of Computer Science & Technology
1.1 Programming Languages
Compiler Construction Principles & Implementation Techniques -5-
College of Computer Science & Technology
1.1 Programming Languages
• History
– 1800 , First Programmer
( Jacquard loom; Analytical engine; Ada Augusta )
– 1950 , First Programming Language
(FORTRAN; COBOL; Algol60; LISP)
– 1960 , emerging hundreds of programming languages
(special-purpose languages; universal language)
– 1970 , simplifying, abstraction (PASCAL; C; )
– 1980 , Object Oriented (Ada; Modular; Smalltalk; C++ )
– 1990 , Internet (Java) , Libraries , Script
(Scripting; Perl; Javascript)
– 2000 , new specification language
Compiler Construction Principles & Implementation Techniques -6-
College of Computer Science & Technology
1.1 Programming Languages
• Classifications
– Functions
• Scientific computation; business; table handling; forms; strings;
multi-functional;
– Abstraction level
• Low level
– Machine language; assembly language
• High Level (different paradigms 范例 )
– Procedural programming languages: FORTRAN, PASCAL, C
– Functional programming languages: LISP, HASKELL, ML
– Logical programming languages: PROLOG
– Object-oriented programming languages: Smalltalk, Java, C++
Compiler Construction Principles & Implementation Techniques -7-
College of Computer Science & Technology
1.1 Programming Languages
• Definition of a programming language includes
– Lexeme
• Allowed set of characters
• Lexical structure
– Syntax
• Program structure
– Semantics
• Meaning of different structures
Compiler Construction Principles & Implementation Techniques -8-
College of Computer Science & Technology
1.2 Compiler and Interpreter
Compiler Construction Principles & Implementation Techniques -9-
College of Computer Science & Technology
1.2 Compiler and Interpreter
• Implementation of Programming Languages
– Interpreter :
Program
Interpreter Output
Input
– Translator : Language1 Language2
• Assembler: Assembly Languages Machine Code
• Compiler : High-level Languages Low-level Languages
Compiler Construction Principles & Implementation Techniques -10-
College of Computer Science & Technology
1.2 Compiler and Interpreter
• Compiler: a program that reads a program written in one
language (source language) and translate it into an
equivalent program in another language (target language).
Source Target
Compiler input
Program Program
• The status of compiler
– System software (Operating System) output
– Meta software system( 元级软件系统 )
Compiler Construction Principles & Implementation Techniques -11-
College of Computer Science & Technology
1.2 Compiler and Interpreter
• Comparing Compiler with Interpreter
– Similarity
• Using same implementation techniques
– Difference
• Mechanism: Translation vs. Interpretation
• Execution efficiency: high vs. low
• Storage cost: less vs. more
• Interpreter has some advantages over Compiler
– Portability: Java
– General
– Intermediate code generation is not necessary
Compiler Construction Principles & Implementation Techniques -12-
College of Computer Science & Technology
1.3 Programs related to Compiler
Compiler Construction Principles & Implementation Techniques -13-
College of Computer Science & Technology
1.3 Programs Related to Compiler
Editor
• Editor Absolute Machine Code
• Preprocessor Skeletal Source Program Loader/Linker
• Compiler Preprocessor Relocatable Machine Code
• Assembler
Source Program Assembler
• Loader
• Linker Compiler Target Assembly
Program
Compiler Construction Principles & Implementation Techniques -14-
College of Computer Science & Technology
1.4 Design and Implementation of Compiler
Compiler Construction Principles & Implementation Techniques -15-
1.4 Design and Implementation of a Compiler
College of Computer Science & Technology
• There are 3 languages associated with a compiler
– Source language Ls: Input
– Target language Lt: Output
– Implementation language Li: the language for developing the
compiler
• A compiler is a program written in Li, whose function is
translating a program written in Ls into equivalent program
in Lt.
Ls Lt
Li
Compiler Construction Principles & Implementation Techniques -16-
1.4 Design and Implementation of a Compiler
College of Computer Science & Technology
• For different programming language paradigms, different
techniques will be applied for developing their compilers;
• In this course, focus on general compiler construction
principles and techniques on procedural programming
languages;
Compiler Construction Principles & Implementation Techniques -17-
1.4 Design and Implementation of a Compiler
College of Computer Science & Technology
• There are no existing compilers
– Manually programming machine code ( 手工编写机器代码 )
• Inefficient, hard to maintain
– Self extending ( 自展法 )
• There are compilers available
– Preprocessing ( 预处理方法 )
– Porting ( 移植法 )
– Tools ( 工具法 )
• Automatic generator ( 自动生成工具 )
Compiler Construction Principles & Implementation Techniques -18-
1.4 Design and Implementation of a Compiler
College of Computer Science & Technology
• Self extending
– Problem: if there is no any compiler available, we want
to develop a compiler for a programming language L;
- Solution:
- Define L0 as a sub-language of L;
- Manually write a compiler for L0;
- Make some extensions to L0, which is called L1,
- Develop L1’s compiler with L0;
- ……
- Develop Ln(=L)’s compiler with Ln-1;
Compiler Construction Principles & Implementation Techniques -19-
1.4 Design and Implementation of a Compiler
College of Computer Science & Technology
• Preprocessing
– Problem: if we have a programming Language L and its
compiler, we want to develop a compiler for a
programming languageL1 which makes some extensions
to L;
- Solution:
- Develop a preprocessor: Translating L1 into L
- Use L’s compiler: from L to Target code
- For example: C++ C
Compiler Construction Principles & Implementation Techniques -20-
1.4 Design and Implementation of a Compiler
College of Computer Science & Technology
• Porting
– Problems:
• source language L
• L’s compiler for machine M1
• we wan to develop another compiler of L for machine M2;
– Same source language, Different target languages
– Two ways
• Develop a program for translating from machine code for M1 to
machine code for M2;
• Rebuild the back-end of the compiler
Compiler Construction Principles & Implementation Techniques -21-
College of Computer Science & Technology
1.5 Functional Decomposition &
Architecture of a Compiler
Compiler Construction Principles & Implementation Techniques -22-
1.5 Functional Decomposition &
College of Computer Science & Technology Architecture of a Compiler
• Programming Problem
– Develop a compiler for a programming language
• Need to make clear
– What we already know?
– What we are going to do?
• Input & Output
• Data structure + algorithm
Compiler Construction Principles & Implementation Techniques -23-
1.5 Functional Decomposition &
College of Computer Science & Technology Architecture of a Compiler
• What we already know?
– Definition of the source language (notation, structure,
semantics, rules)
– Definition of the target language (notation, structure,
semantics, rules)
– The language that we are going to use to develop the
compiler
Compiler Construction Principles & Implementation Techniques -24-
1.5 Functional Decomposition &
College of Computer Science & Technology Architecture of a Compiler
• Functional description of a compiler
– Input: programs written in source language (source programs)
= sequence of characters
– Output: programs written in target language (target programs/code)
= sequence of instructions
– Algorithm?
• A general process of translating each source program into corresponding
target program;
Compiler Construction Principles & Implementation Techniques -25-
1.5 Functional Decomposition &
College of Computer Science & Technology Architecture of a Compiler
• Think about “natural language translation”
– From English to Chinese
You can put your dream into reality through your efforts!
你 能够 通过你的努力 实现你的梦
想!
General process of translation: analysis
Recognize Grammatical
meaningful translation
words Analysis
Compiler Construction Principles & Implementation Techniques -26-
1.5 Functional Decomposition &
College of Computer Science & Technology Architecture of a Compiler
• To summarize
– Grasp source language and target language
• Words, syntax, the meaning
– The process of translation one sentence includes
• Analyzing the sentence to make sure that it is correct
– Spell, including recognizing words and their attributes
– Build syntactic structure with respect to the grammar of source
language;
– Make sure it is meaningful;
I eat sky in dog.
• Translating the sentence into target language
– Translating each syntactic parts
– Composing them into a meaningful sentence in target language
Compiler Construction Principles & Implementation Techniques -27-
1.5 Functional Decomposition &
College of Computer Science & Technology Architecture of a Compiler
Compiler Construction Principles & Implementation Techniques -28-
1.5.1 Functional Decomposition &
College of Computer Science & Technology Architecture of a Compiler
Table Processing
中 中 目
Source 词 语 语 间 间 标 Target
program 法 法 义 代 代 代 Program
分 分 分 码 码 码
析 析 析 生 优 生
成 化 成
Error Handling
Compiler Construction Principles & Implementation Techniques -29-
1.5 Functional Decomposition &
College of Computer Science & Technology Architecture of a Compiler
Lexical Analysis Target Code
scanning Generation
Intermediate Code
Syntax Analysis Optimization
Parsing
Intermediate Code
Semantic Analysis Generation
analysis/front end synthesis/back end
Compiler Construction Principles & Implementation Techniques -30-
1.5 Functional Decomposition &
College of Computer Science & Technology Architecture of a Compiler
• Lexical Analysis
– Reading the source program, which is actually in the form of a
stream of characters;
– Collects sequences of characters into meaningful unit, called
tokens;
• Syntax Analysis
– Reading the sequences of tokens;
– Determining the syntactical structure of the program;
– The results of parsing are represented as a parse tree or a syntax
tree;
• Semantic Analysis
– Static semantics checking, such as type checking
– Symbol table (attributes of identifiers)
Compiler Construction Principles & Implementation Techniques -31-
1.5 Functional Decomposition &
College of Computer Science & Technology Architecture of a Compiler
• Code Generation
– Intermediate Code generation
• Intermediate representation
• portability
– Target Code generation
• Code Optimization
– Efficiency of target program
• Intermediate code optimization
• Target code optimization
Compiler Construction Principles & Implementation Techniques -32-
College of Computer Science & Technology
1.6 A General Working Process of a Compiler
Compiler Construction Principles & Implementation Techniques -33-
1.6 A General Working Process of a
College of Computer Science & Technology
Compiler
• Source language : a toy programming language
ToyL
• Target language : assembly language AL
• Demonstrate with an example on how a compiler is
translating a program in ToyL into assembly codes;
Compiler Construction Principles & Implementation Techniques -34-
College of Computer Science & Technology
Toy language
• General definition
– Lexical structure
• Allowed set of characters {a-z, A-Z,0-9}
• Tokens:
– keywords : var, if, then, else, while, read, write, int, real, bool
– Identifiers: sequence of limited number of characters starting with
letters
– Numbers: integer or real
– Operators: +, -, *, /, >, <, =
– Delimiters: : , ; , ( , ), {, }
Compiler Construction Principles & Implementation Techniques -35-
College of Computer Science & Technology
Toy language
var
int x, y;
• General definition ……
– Syntax (structure of program)
x := a+b;
if x>0 then …
variable declaration else … ;
while x<10 {…};
read(x);
{
write(x+y) ;
sequence of
statements
} +, -, *, /, ( , )
Compiler Construction Principles & Implementation Techniques -36-
College of Computer Science & Technology
Toy language
• General definition
– Semantics
• Static semantics
– One identifier should not be declared more than once;
– Use identifiers after they are declared;
– Type equivalence in assignment and expressions;
– The result type of conditional expression in if or while statement
should be boolean;
• Dynamic semantics
Compiler Construction Principles & Implementation Techniques -37-
College of Computer Science & Technology
• Sample program
var
int x, y;
{
read(x);
read(y);
if x>y then write(1) else write(0);
}
Compiler Construction Principles & Implementation Techniques -38-
College of Computer Science & Technology
v a r i n t △ x
, y ; {
r e a d ( x ) ; r e a d ( y
)
; i f △ x > y △ t h e n △
w r i t e ( 1 ) △ e l s e △ w
r i t e ( 0 ) ; }
Compiler Construction Principles & Implementation Techniques -39-
Lexical analysis
College of Computer Science & Technology
var , int, k x, ide , y, ide ;
k
{ read, k ( x, ide ) ; read, k
( y, ide ) ; if, k x, ide >
y , ide then, k write,k ( 1, num ) ;
else, k write,k ( 0, num ) ; }
Compiler Construction Principles & Implementation Techniques -40-
College of Computer Science & Technology
Syntax Analysis
program
variable declaration statements
type variables read-statement statements
x read-statement if-statement
int x, y
write-statement
y expression
expression
x 〉 y write-statement
0
expression
1
Compiler Construction Principles & Implementation Techniques -41-
College of Computer Science & Technology
Semantic Analysis
x varKind int ……
y varKind int ……
Compiler Construction Principles & Implementation Techniques -42-
College of Computer Science & Technology
Code Generation
INP(x)
INP(y)
LOAD x , R1
GT R1, y
JUMP1 R1, L1
OUT(1)
JUMP L2
L1: OUT(0)
L2: EXIT
Compiler Construction Principles & Implementation Techniques -43-
College of Computer Science & Technology
Summary
• Different classification of programming languages
• Definition of a programming language
• Definitions and differences of compiler and interpreter;
• Programs related to processing programming languages;
• Design and implementation of a compiler;
• Functional components of a compiler;
• General working process of a compiler;
Compiler Construction Principles & Implementation Techniques -44-
College of Computer Science & Technology
Summary
• The problem that we are going to solve in this course
– How to develop a compiler for a programming language?
• Source language: a high-level programming language
• Target language: assembly language or machine language
• Develop a program, whose function is to translating a program
written in source language
• Principle
– Divide and conquer ( 分而治之 )
– Problem programming task solution general
principles and methods
Compiler Construction Principles & Implementation Techniques -45-
College of Computer Science & Technology
Compiler Construction Principles & Implementation Techniques -46-
College of Computer Science & Technology
Reading Assignment
• Topic: How to develop a scanner( 词法分析器 ) ?
• Objectives
– Get to know
• What is a scanner? (input, output, functions)
• Lexical rules for C & Java?
• Originally how you want to develop a scanner?
• From textbook, how a scanner can be built?
• References
– Optional textbooks
• Hand in a report either in English or in Chinese, and one group will be
asked to give a presentation at the beginning of next class;
• Tips:
– Collect more information from textbooks and internet;
– Establish your own opinion;
Compiler Construction Principles & Implementation Techniques -47-