CS 420: Advanced Programming Languages
Dr. Mary Pourebadi
San Diego State University
Lecture 15
Describing Syntax and Semantics
Book: Chapter 3
ISBN 0-321-49362-1
Chapter 3 Topics
• Introduction
• The General Problem of Describing Syntax
• Formal Methods of Describing Syntax
• Attribute Grammars
• Describing the Meanings of Programs:
Dynamic Semantics
Copyright © 2018 Pearson. All rights reserved. 1-3
Introduction
• Syntax: the form or structure of the
expressions, statements, and program
units
• Semantics: the meaning of the expressions,
statements, and program units
• Syntax and semantics provide a language’s
definition
– Users of a language definition
• Other language designers
• Implementers
• Programmers (the users of the language)
The General Problem of Describing
Syntax: Terminology
• A sentence is a string of characters over
some alphabet
• A language is a set of sentences
• A lexeme is the lowest-level syntactic unit
of a language (e.g., *, sum, begin)
• A token is a category of lexemes (e.g.,
identifier)
Formal Definition of Languages
• Recognizers
– A recognition device reads input strings over the alphabet
of the language and decides whether the input strings
belong to the language
– Example: syntax analysis part of a compiler
- Detailed discussion of syntax analysis appears in
Chapter 4
• Generators
– A device that generates sentences of a language
– One can determine whether a particular sentence is
syntactically correct by comparing it to the structure of
the generator
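The difference between the two devices can be sketched in a few lines of Python. The language used here, strings of n a's followed by n b's, and the function names `generate` and `recognize` are assumptions chosen only for illustration; they do not come from the textbook.

```python
from itertools import islice

def generate():
    """Generator: yields the sentences of {a^n b^n | n >= 1}, shortest first."""
    n = 1
    while True:
        yield "a" * n + "b" * n
        n += 1

def recognize(s):
    """Recognizer: decides whether the string s belongs to the same language."""
    half, rem = divmod(len(s), 2)
    return rem == 0 and half >= 1 and s == "a" * half + "b" * half

# The generator produces sentences; the recognizer accepts or rejects input.
first_three = list(islice(generate(), 3))
```

Every sentence the generator produces should be accepted by the recognizer, while a string such as "abab" should be rejected.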
Generators & Recognizers
Basic Compiler
• Lexical analysis - scanner
– Scanning the source statement, recognizing and
classifying the various tokens
• Syntactic / syntax analysis - parser
– Recognizing the statement as some language
construct
– Constructing a parse tree (syntax tree)
• Semantic analysis
– Ensuring logical correctness
– Supporting optimization
• Code generation - code generator
– Generating assembly language code
– Generating machine code (object code)
Ref: C.T. Yang, system programming course materials;
Ref: L. Beck, Syst. Software - Intro to system programming.
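The lexical analysis step can be illustrated with a minimal scanner sketch in Python. The token names (`identifier`, `int_literal`, `assign_op`, `mult_op`, `add_op`) and the function name `scan` are assumptions made for this example; a real compiler would define its own token set.

```python
import re

# Each token (category) is paired with a pattern describing its lexemes.
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier",  r"[A-Za-z_]\w*"),
    ("assign_op",   r"="),
    ("mult_op",     r"\*"),
    ("add_op",      r"\+"),
    ("skip",        r"\s+"),   # whitespace separates lexemes and is discarded
    ("mismatch",    r"."),     # any other character is an error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def scan(source):
    """Scan a source statement and return a list of (token, lexeme) pairs."""
    tokens = []
    for m in MASTER.finditer(source):
        kind = m.lastgroup          # name of the pattern that matched
        if kind == "skip":
            continue
        if kind == "mismatch":
            raise SyntaxError(f"unexpected character {m.group()!r}")
        tokens.append((kind, m.group()))
    return tokens

result = scan("sum = sum * 2")
```

Here `sum`, `=`, `*`, and `2` are lexemes, and each one is classified under a token such as `identifier` or `mult_op`. The resulting list is the kind of input a parser would consume.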
Lexical Analysis
• Tokens, Patterns, Lexemes
• Specification of Tokens
– Regular Expressions
– Notational Shorthand
• Token Recognizer - Finite Automata
– Nondeterministic Finite Automata (NFA)
– Deterministic Finite Automata (DFA)
– From a Regular Expression to an NFA
– Conversion of an NFA into a DFA
Interaction of Lexical Analyzer with
Parser
Parser and Context-Free Grammars
• Context-Free Grammars
– Developed by Noam Chomsky in the mid-1950s
– Language generators, meant to describe the
syntax of natural languages
– Define a class of languages called context-free
languages
• BNF: Backus-Naur Form (1959)
– Invented by John Backus to describe the syntax
of Algol 58
– BNF is equivalent to context-free grammars
BNF Fundamentals
• In BNF, abstractions are used to represent classes
of syntactic structures; they act like syntactic
variables (also called nonterminal symbols, or just
nonterminals)
• Terminals are lexemes or tokens
• A rule has a left-hand side (LHS), which is a
nonterminal, and a right-hand side (RHS), which is
a string of terminals and/or nonterminals
BNF Fundamentals (continued)
• Nonterminals are often enclosed in angle brackets
– Examples of BNF rules:
<ident_list> → identifier | identifier, <ident_list>
<if_stmt> → if <logic_expr> then <stmt>
• Grammar: a finite non-empty set of rules
• A start symbol is a special element of the
nonterminals of a grammar
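As a rough sketch, a grammar of this kind can be written down as plain data and used to generate a sentence by repeatedly applying rules, beginning with the start symbol. The names `GRAMMAR`, `START`, and `derive` are hypothetical, and only the `<ident_list>` rule from the slide is encoded.

```python
# Each nonterminal (LHS) maps to its alternative right-hand sides (RHS).
# A RHS is a list of terminals and/or nonterminals.
GRAMMAR = {
    "<ident_list>": [
        ["identifier"],
        ["identifier", ",", "<ident_list>"],
    ],
}
START = "<ident_list>"

def derive(choices):
    """Apply rules starting from START. At each step, the leftmost
    nonterminal is replaced by the alternative whose index comes next
    in `choices`. Returns every intermediate string of symbols."""
    form = [START]
    steps = [" ".join(form)]
    for choice in choices:
        i = next(k for k, sym in enumerate(form) if sym in GRAMMAR)
        form[i:i + 1] = GRAMMAR[form[i]][choice]
        steps.append(" ".join(form))
    return steps

steps = derive([1, 1, 0])
```

Choosing the recursive alternative twice and then the non-recursive one yields a list of three identifiers, with the last entry of `steps` containing only terminals.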
BNF Fundamentals (continued)
• Simple, yet powerful enough to describe nearly all
of the syntax of programming languages:
– lists of similar constructs
– the order in which different constructs
must appear
– nested structures to any depth
– operator precedence (enforced or implied)
– operator associativity (implied)
BNF Rules
• An abstraction (or nonterminal symbol)
can have more than one RHS
<stmt> → <single_stmt>
| begin <stmt_list> end
Example: Describing “Lists” in BNF
• Syntactic lists are described using
recursion
<ident_list> → ident
| ident, <ident_list>
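The recursive rule maps naturally onto a recursive function. The following is a minimal sketch, assuming the input has already been split into a list of token strings in which `"ident"` stands for any identifier; the function name `is_ident_list` is made up for this example.

```python
def is_ident_list(tokens):
    """Return True if tokens match <ident_list> -> ident | ident, <ident_list>."""
    if not tokens or tokens[0] != "ident":
        return False                 # both alternatives begin with ident
    if len(tokens) == 1:
        return True                  # first alternative: ident
    # second alternative: ident , <ident_list> (the recursive case)
    return tokens[1] == "," and is_ident_list(tokens[2:])

ok = is_ident_list(["ident", ",", "ident", ",", "ident"])
trailing_comma = is_ident_list(["ident", ","])
```

Because the rule refers to itself, lists of any length are accepted, while a trailing comma or a missing comma is rejected.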
An Example Grammar in BNF