Compiler Design
CH4 – Syntax Directed Analysis
Outline
Introduction
Syntax Directed Translation
Syntax Directed Definition
Synthesized vs. Inherited Attributes
Semantic Rules
Evaluation Order of Semantic Rules
Dependency graph
Some Classes of Non-circular Attributed Grammars
S-Attributed Grammars
L-Attributed Grammars
Type Checking
The parser uses a context-free grammar (CFG) to validate the input string and to produce
output for the next phase of the compiler. The output can be either a parse tree or an
abstract syntax tree.
To interleave semantic analysis with the syntax analysis phase of the compiler, we
use Syntax Directed Translation (SDT).
SDT augments the grammar with rules that facilitate semantic analysis.
SDT passes information bottom-up and/or top-down through the parse tree in the form
of attributes attached to the nodes.
Each nonterminal may have zero, one, or several attributes, depending on the
information it has to carry.
The values of these attributes are computed by the semantic rules associated with the
production rules.
The general approach to Syntax-Directed Translation is to construct a parse tree or
syntax tree and compute the values of attributes at the nodes of the tree by visiting
them in some order.
Conceptually, with both syntax-directed definitions and translation schemes, we:
1. Parse the input token stream,
2. Build the parse tree, and
3. Traverse the tree as needed to evaluate the semantic rules at the parse-tree nodes.
• Both notations are used for
  – Specifying semantic checking, particularly the determination of types (type checking), and
  – Generating intermediate code.
In syntax directed translation, we associate some informal notations with the
grammar; these notations are called semantic rules.
There are two notations for associating semantic rules with productions:
1. Syntax directed definitions, and
2. Translation schemes.
Syntax-directed definitions are high-level specifications for translations.
They hide many implementation details and free the user from having to specify
explicitly the order in which translation takes place.
Translation schemes indicate the order in which semantic rules are to be evaluated.
So they allow some implementation details to be shown.
Syntax-Directed Definitions
A syntax directed definition is a generalization of a CFG in which each grammar
symbol has an associated set of attributes (synthesized and inherited).
An attribute can represent anything we choose:
a string, a number, a type, a memory location, etc.
The value of an attribute at a parse-tree node is defined by a semantic rule associated
with the production used at that node.
The value of a synthesized attribute is computed from the values of attributes at the
children of that node in the parse tree.
The value of an inherited attribute is computed from the values of attributes at the
siblings and parent of that node in the parse tree.
Semantic Rules
Semantic rules calculate the values of attributes.
Hence they set up dependencies between attributes, which are represented by a
dependency graph.
The dependency graph makes it possible to find an evaluation order for the semantic rules.
A parse tree showing the values of the attributes is called an annotated or decorated
parse tree.
Evaluation of the semantic rules
may generate code,
save information in the symbol table,
issue error messages, or
perform any other activities.
The translation of the token stream is the result obtained by evaluating the semantic
rules.
Formal Definition
In a syntax directed definition, each grammar production A → α has associated with it a set
of semantic rules of the form:
b := f(c1, c2, ..., ck)
where
f is a function and
c1, ..., ck are attributes of A and the symbols at the right side of the production, and
either b is a synthesized attribute of A and the values c1, ..., ck are attributes of
the grammar symbols of α or A,
or b is an inherited attribute of a grammar symbol of α and the values c1, ..., ck
are attributes of the grammar symbols of α or A.
In either case, we say that attribute b depends on attributes c1, ..., ck.
Example 1: Consider example semantic rules for binary-to-decimal conversion.
[Table: Syntax Rules | Semantic Rules]
How many attributes are there?
Which are synthesized?
Which are inherited?
In this example, everything is calculated from the leaves to the root, so all
attributes are synthesized.
Exercise: Draw the decorated (annotated) parse tree for the input 101.
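The slide's rule table is not reproduced above, so the following is only a minimal sketch of how a purely synthesized attribute could compute the decimal value. The grammar in the comments, the attribute name `val`, and the function names are assumptions made for illustration, not the rules from the slide.

```python
# Hypothetical grammar (an assumption, not the table from the slide):
#   N -> N1 B    N.val = 2 * N1.val + B.val
#   N -> B       N.val = B.val
#   B -> 0 | 1   B.val = 0 or 1

def b_val(bit: str) -> int:
    """Synthesized attribute of a single bit (leaf of the parse tree)."""
    if bit not in ("0", "1"):
        raise ValueError(f"not a binary digit: {bit!r}")
    return int(bit)

def n_val(bits: str) -> int:
    """Synthesized attribute N.val, computed from the children's values."""
    if not bits:
        raise ValueError("empty input")
    if len(bits) == 1:                               # N -> B
        return b_val(bits)
    return 2 * n_val(bits[:-1]) + b_val(bits[-1])    # N -> N1 B

print(n_val("101"))  # 5
```

Each call corresponds to one interior node of the parse tree, and its return value is the annotation that would be written next to that node in the decorated tree.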
Example 2: Consider another set of semantic rules for binary-to-decimal conversion.
[Table: Syntax Rules | Semantic Rules]
How many attributes are there?
Which are synthesized?
Which are inherited?
Exercise: Draw the decorated parse tree for the input 1011.01.
Example 3: Consider the following syntax-directed definition for a desk calculator program:
[Table: PRODUCTION | SEMANTIC RULES]
This attributed grammar calculates the value of the expression.
Note:
Each nonterminal has a synthesized attribute that gives its value.
The terminal token has a synthesized attribute whose value is assumed to be
supplied by the lexical analyzer.
Synthesized Attributes
A syntax-directed definition that uses synthesized attributes exclusively is said
to be an S-attributed definition.
A parse-tree for an S-attributed definition can always be annotated by
evaluating the semantic rules for the attributes at each node bottom up,
from the leaves to the root.
Example 3 (cont.): [Figure: annotated parse tree for a sample input]
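Bottom-up annotation of an S-attributed definition can be sketched as a post-order walk. The sketch below works on a small syntax tree rather than a full parse tree, and the `Node` class, the attribute names `val` and `lexval`, and the sample input 3*5+4 are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    symbol: str                                   # "+", "*" or "digit"
    children: list = field(default_factory=list)
    lexval: int = 0                               # supplied by the lexer for digit leaves
    val: int = 0                                  # synthesized attribute, filled bottom up

def annotate(node: Node) -> int:
    """Visit the children first, then apply the semantic rule of this node."""
    for child in node.children:
        annotate(child)
    if node.symbol == "digit":
        node.val = node.lexval
    elif node.symbol == "+":
        node.val = node.children[0].val + node.children[1].val
    elif node.symbol == "*":
        node.val = node.children[0].val * node.children[1].val
    else:
        raise ValueError(f"unknown symbol {node.symbol!r}")
    return node.val

def digit(n: int) -> Node:
    return Node("digit", lexval=n)

# Syntax tree for 3 * 5 + 4
tree = Node("+", [Node("*", [digit(3), digit(5)]), digit(4)])
print(annotate(tree))  # 19
```

Because every rule reads only the children's `val`, a single post-order pass is enough; no dependency graph has to be built.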
Inherited Attributes
An inherited attribute is one whose value at a node in a
parse tree is defined in terms of attributes at the parent
and/or siblings of that node.
Inherited attributes are convenient for expressing the
dependence of a programming language construct on
the context in which it appears.
Example 4: Consider an example that uses an inherited
attribute in that distributes type information to the various identifiers
in a declaration.
PRODUCTION        SEMANTIC RULES
D → T L           L.in := T.type
T → int           T.type := integer
T → real          T.type := real
L → L1 , id       L1.in := L.in
                  addtype(id.entry, L.in)
L → id            addtype(id.entry, L.in)
Rules associated with the productions for L call procedure addtype
to add the type of each identifier to its entry in the
symbol table (pointed to by attribute entry).
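The rules of Example 4 can be sketched as two mutually supporting functions, with the inherited attribute passed down as a parameter. The dictionary `symbol_table`, the Python function names, and the sample identifiers are assumptions for illustration; only `addtype` and the attribute names come from the table.

```python
symbol_table = {}   # stands in for the compiler's symbol table

def addtype(entry: str, type_: str) -> None:
    """Record the type of an identifier in its symbol-table entry."""
    symbol_table[entry] = type_

def id_list(ids: list, l_in: str) -> None:
    """L -> L1 , id | id, where l_in is the inherited attribute L.in."""
    if len(ids) > 1:
        id_list(ids[:-1], l_in)      # L1.in := L.in
    addtype(ids[-1], l_in)           # addtype(id.entry, L.in)

def decl(type_keyword: str, ids: list) -> None:
    """D -> T L: T.type is synthesized, then handed to L as L.in."""
    t_type = {"int": "integer", "real": "real"}[type_keyword]
    id_list(ids, t_type)             # L.in := T.type

decl("real", ["id1", "id2", "id3"])
print(symbol_table)
```

The type flows from T up to D and then down the chain of L nodes, which is why it cannot be expressed with synthesized attributes alone for this grammar.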
Example 4 (cont.): the annotated parse tree for a sample declaration.
[Figure: parse tree with the inherited attribute in at each node labeled L]
Example 5: Syntax-directed definition with inherited attribute
Evaluation Order of Semantic Rules
The attributes must be evaluated in a particular order because they depend on one another.
The dependencies among the attributes are represented by a dependency graph.
Dependency Graph
Has a node for each attribute, and an edge to the node for b from the node for c if
attribute b depends on attribute c.
That is, there is an edge c → b if and only if there exists a semantic rule of the
form b := f(..., c, ...).
Example 6: Dependency graph for the parse tree in example 4.
A topological sort of a directed acyclic graph is any ordering of the nodes of the
graph such that edges go from nodes earlier in the ordering to later nodes.
That is, if mi → mj is an edge from mi to mj, then mi appears before mj in the ordering.
Any topological sort of a dependency graph gives a valid order in which the
semantic rules associated with the nodes in a parse tree can be evaluated.
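As a rough illustration, an evaluation order can be obtained with the standard-library `graphlib` module (Python 3.9 or later). The node names below are invented for a two-identifier declaration and are not the numbered nodes of the slide's figure; `evaluation_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter, CycleError

def evaluation_order(deps: dict) -> list:
    """deps maps each attribute to the set of attributes it depends on.
    Returns one valid evaluation order, or raises CycleError if the
    dependency graph is circular."""
    return list(TopologicalSorter(deps).static_order())

# Hypothetical dependency graph for a declaration with two identifiers.
deps = {
    "L.in": {"T.type"},
    "L1.in": {"L.in"},
    "addtype(id2)": {"L.in"},
    "addtype(id1)": {"L1.in"},
}

order = evaluation_order(deps)
print(order)   # every attribute appears after the attributes it depends on
```

A circular definition such as `{"a": {"b"}, "b": {"a"}}` makes `evaluation_order` raise `CycleError`, which corresponds to the case where no evaluation order exists.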
Example 7: One topological sort of the dependency graph in Example 6 is the
order of the node numbers, from which we obtain the following program.
We write a_n for the attribute associated with the node numbered n in the
dependency graph.
Evaluating these semantic rules stores the type real in the symbol-table entry
for each identifier.
Evaluation Order
Several methods have been proposed for evaluating semantic rules:
1. Parse-tree based methods: for each input, the compiler finds an evaluation
order.
• These methods fail only if the dependency graph for that particular parse tree has a
cycle.
2. Rule based methods: the order in which the attributes associated with a
production are evaluated is predetermined at compiler-construction time.
• For this method, the dependency graph need not be constructed.
3. Oblivious methods: the evaluation order is chosen without considering
the semantic rules.
• This restricts the class of syntax directed definitions that can be used.
Some Classes of Non-circular Attributed Grammars
S-Attributed Grammars
An attributed grammar is S-attributed when all of its attributes are synthesized, i.e. it has no
inherited attributes.
Synthesized attributes can be evaluated by a bottom-up parser as the input is being parsed.
A separate value stack is maintained to store the values of the attributes, as in the example below.
Example: consider a reduction by a production A → X Y Z.
We assume that the synthesized attributes are evaluated just before each reduction.
• Before the reduction, the attribute of Z is in val[top] and the attributes of Y and X are in
val[top-1] and val[top-2] respectively.
• After the reduction, A is put at the top of the state stack and its attribute values are put at the
top of the value stack.
• The semantic actions that reference the attributes of the grammar symbols are in fact translated by
the compiler generator (such as Yacc) into code that references the value stack.
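The value-stack mechanism can be sketched without a full LR parser. In the sketch below the list `actions` is a hand-written sequence of shift and reduce steps for the input 3*5+4, standing in for what a table-driven parser would produce; the grammar, the action encoding, and the function name `run` are assumptions for illustration.

```python
def run(actions: list) -> list:
    """Maintain a value stack in parallel with the (omitted) state stack."""
    val = []
    for kind, arg in actions:
        if kind == "shift":
            val.append(arg)                  # lexical value, None for operators
        elif arg == "E -> E + T":
            t = val.pop(); val.pop(); e = val.pop()
            val.append(e + t)                # E.val := E1.val + T.val
        elif arg == "T -> T * F":
            f = val.pop(); val.pop(); t = val.pop()
            val.append(t * f)                # T.val := T1.val * F.val
        elif arg in ("E -> T", "T -> F", "F -> digit"):
            pass                             # unit production: val[top] is unchanged
        else:
            raise ValueError(f"unknown production {arg!r}")
    return val

actions = [
    ("shift", 3), ("reduce", "F -> digit"), ("reduce", "T -> F"),
    ("shift", None), ("shift", 5), ("reduce", "F -> digit"),
    ("reduce", "T -> T * F"), ("reduce", "E -> T"),
    ("shift", None), ("shift", 4), ("reduce", "F -> digit"),
    ("reduce", "T -> F"), ("reduce", "E -> E + T"),
]
print(run(actions))  # [19]
```

Each reduction pops as many values as there are symbols on the right side and pushes the single value of the left side, which mirrors what generated parser code does with positional references to the value stack.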
L-Attributed grammars
It is difficult to carry out all the tasks of the compiler using synthesized attributes alone.
The L-attributed class of grammars allows a limited kind of inherited attribute.
Definition: A grammar is L-attributed if and only if,
for each rule A → X1 X2 ... Xn, all inherited attributes of Xj (1 ≤ j ≤ n) depend only on:
• Attributes of X1, ..., Xj-1
• Inherited attributes of A
Of course, all S-attributed grammars are L-attributed.
Example 1: A → L M { L.h = f1(A.h)
                     M.h = f2(L.s)
                     A.s = f3(M.s) }
• Does this production contradict the rules?
• No. The corresponding grammar may be L-attributed if all of the other productions follow
the rule of L-attributed grammars.
Example 2: [Production and semantic rules]
• Does this production contradict the rules?
• Yes, since an inherited attribute depends on an attribute that the definition does not allow.
• The grammar containing this production is not L-attributed.
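The definition can be checked mechanically for one production at a time. The sketch below assumes that all right-side symbols are distinct and that attributes are written as (symbol, name) pairs; the function name and the second production are hypothetical and only illustrate a violation.

```python
def is_l_attributed(lhs: str, rhs: list, rules: list, inherited: set) -> bool:
    """rules: list of (target, sources), each attribute a (symbol, name) pair.
    inherited: the set of attributes that are inherited.
    Assumes the symbols in rhs are pairwise distinct and differ from lhs."""
    for target, sources in rules:
        sym, _ = target
        if sym == lhs or target not in inherited:
            continue                      # synthesized attributes are not restricted
        left = set(rhs[:rhs.index(sym)])  # symbols to the left of Xj
        for src in sources:
            src_sym, _ = src
            allowed = src_sym in left or (src_sym == lhs and src in inherited)
            if not allowed:
                return False
    return True

inh = {("A", "h"), ("L", "h"), ("M", "h"), ("Q", "h"), ("R", "h")}

# Example 1 from the text: A -> L M
ex1 = [(("L", "h"), [("A", "h")]),
       (("M", "h"), [("L", "s")]),
       (("A", "s"), [("M", "s")])]
print(is_l_attributed("A", ["L", "M"], ex1, inh))   # True

# Hypothetical production A -> Q R where Q.h depends on R.s (a symbol to its right)
ex2 = [(("R", "h"), [("A", "h")]),
       (("Q", "h"), [("R", "s")]),
       (("A", "s"), [("Q", "s")])]
print(is_l_attributed("A", ["Q", "R"], ex2, inh))   # False
```

A grammar passes only if every one of its productions passes, so a single failing production is enough to rule it out.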
Type Checking
Introduction
Type Systems
Type Conversions
Introduction – Static Checking
● The compiler must check if the source program follows semantic conventions of
the source language.
● This is called static checking (to distinguish it from dynamic checking executed
during execution of the program).
● Static checking ensures that certain kinds of errors are detected and reported.
The following are examples of static checks:
– Type checking: incompatible operands.
– Flow-of-control checks:
  ● A break statement in C that is not inside an enclosing loop or switch statement,
  ● A return in Pascal that is not in a function body.
– Uniqueness checks: a redefined variable.
– Name-related checks:
  ● In Ada, loops can have names.
  ● However, the same name must be used to start and end the loop.
Type Checking
● A type checker verifies that the type of a construct matches that expected by its context.
– For example, a type checker should verify that the type of the value assigned to a
variable is compatible with the type of the variable.
– For instance, an operator such as mod expects two integer operands.
– If any errors are found, they are reported by the type checker. Type
information produced by the type checker may be needed when the code is
generated.
● In almost all languages, types are either basic or constructed. In C++, for
example, ... are basic types and ... are constructed types.
Type Systems
● A type system is a collection of rules implemented by the type checker
for assigning type expressions to the various parts of a program.
● Different type systems may be used by different compilers for the same
language.
– For example, some compilers implement stricter rules than others.
– Lint, for instance, has a much more detailed type system than the C compiler itself.
● Errors: at the very least, the compiler must report the nature and
location of errors.
● Error recovery:
– It is also desirable that the type checker recovers from errors and
continues processing the rest of the input.
● A type expression is either a basic type or is formed by applying an operator called a
type constructor to other type expressions.
● Type expressions are defined as follows:
– A basic type is a type expression, e.g., integer, boolean, char, ...
– A type name is a type expression.
– A type constructor applied to type expressions is a type expression.
● The type checker uses two more basic types:
– void indicates the absence of a type error.
– type_error indicates the presence of a type error.
The following are type constructors:
1. Arrays: if I is an index set and T is a type expression, then array(I, T) is a type expression.
• For example, array(1..10, integer) is a type expression.
2. Products: if T1 and T2 are type expressions, the Cartesian product T1 × T2 is a type expression.
• × is left associative.
3. Records: a record type is formed by applying the record constructor to the names and
types of its fields.
4. Pointers: if T is a type expression, then pointer(T) is a type expression.
» For example, pointer(integer).
5. Functions: the type expression of a function has the form D → R where
• D is the type expression of the parameters and
• R is the type expression of the returned value.
» For example, is constructed from
DAG(Directed Acyclic Graph) Representation:
● A convenient way to represent a type expression is to use a graph (tree or DAG).
● For example, the type expression corresponding to the above function declaration
is shown below:
Specification of a Simple Type Checker
Declarations
The purpose of the semantic actions is to determine the type
expression of a variable and add that type expression to the symbol
table.
Expressions
E → literal     { E.type := char }
E → num         { E.type := integer }
E → id          { E.type := lookup(id.lexeme) }
E → E1 mod E2   { E.type := if E1.type = integer and E2.type = integer then integer
                            else type_error }
E → E1 [ E2 ]   { E.type := if E2.type = integer and E1.type = array(s, t) then t
                            else type_error }
E → ^E1         { E.type := if E1.type = pointer(t) then t
                            else type_error }
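The expression rules can be sketched as a recursive function over a small tuple-based syntax tree. The tree encoding, the representation of constructed types as tuples, and the choice to return type_error for an undeclared identifier are assumptions made for this sketch.

```python
def check(e: tuple, symtab: dict):
    """Return the type of expression e, or "type_error"."""
    kind = e[0]
    if kind == "literal":
        return "char"
    if kind == "num":
        return "integer"
    if kind == "id":
        # lookup(id.lexeme); undeclared names give type_error (an assumption)
        return symtab.get(e[1], "type_error")
    if kind == "mod":
        t1, t2 = check(e[1], symtab), check(e[2], symtab)
        return "integer" if t1 == "integer" and t2 == "integer" else "type_error"
    if kind == "index":                       # E1 [ E2 ]
        t1, t2 = check(e[1], symtab), check(e[2], symtab)
        if t2 == "integer" and isinstance(t1, tuple) and t1[0] == "array":
            return t1[2]                      # array(s, t) yields t
        return "type_error"
    if kind == "deref":                       # ^E1
        t1 = check(e[1], symtab)
        if isinstance(t1, tuple) and t1[0] == "pointer":
            return t1[1]                      # pointer(t) yields t
        return "type_error"
    raise ValueError(f"unknown expression kind {kind!r}")

symtab = {"a": ("array", "1..10", "integer"),
          "p": ("pointer", "char"),
          "c": "char"}

print(check(("mod", ("index", ("id", "a"), ("num",)), ("num",)), symtab))  # integer
print(check(("mod", ("id", "c"), ("num",)), symtab))                       # type_error
```

Since type_error is never equal to integer, an error in a subexpression propagates upward through the same comparisons without special handling.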
Statements
S → id := E         { S.type := if id.type = E.type then void else type_error }
S → if E then S1    { S.type := if E.type = boolean then S1.type else type_error }
S → while E do S1   { S.type := if E.type = boolean then S1.type else type_error }
S → S1 ; S2         { S.type := if S1.type = void and S2.type = void then void
                                else type_error }
The following example gives a type checking rule for function calls:
E → E1 ( E2 )       { E.type := if E2.type = s and E1.type = s → t then t
                                else type_error }
Equivalence of types
So far we have compared type expressions using the equality operator.
However, such an operator is not defined except perhaps between basic types.
We should instead use an equivalence operator, which is more
appropriate.
A natural notion of equivalence is structural equivalence:
Two type expressions are structurally equivalent if and only if they are the
same basic type or are formed by applying the same constructor to
structurally equivalent types.
For example,
• integer is equivalent only to integer
• pointer (integer) is structurally equivalent to pointer (integer).
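Structural equivalence translates directly into a recursive comparison. In this sketch basic types are strings and constructed types are tuples whose first element is the constructor name; that encoding and the function name `sequiv` are assumptions for illustration.

```python
def sequiv(s, t) -> bool:
    """True if type expressions s and t are structurally equivalent."""
    if isinstance(s, str) and isinstance(t, str):
        return s == t                         # same basic type
    if isinstance(s, tuple) and isinstance(t, tuple):
        return (len(s) == len(t)
                and s[0] == t[0]              # same constructor
                and all(sequiv(a, b) for a, b in zip(s[1:], t[1:])))
    return False                              # basic type versus constructed type

print(sequiv(("pointer", "integer"), ("pointer", "integer")))  # True
print(sequiv(("pointer", "integer"), ("pointer", "char")))     # False
```

Array index sets are compared here as plain strings, so relaxed rules such as ignoring array bounds would need an extra case.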
Some relaxing rules are very often added to this notion of equivalence.
For example, when arrays are passed as parameters, the array bounds of
the actual parameters may not be the same as those of the formal
parameters.
In some languages, types may be given names.
For example, in Pascal we can define:
Type:
• Do the variables have the same type?
• Surprisingly, the answer varies from implementation to implementation.
When names are allowed in type expressions, a new notion of equivalence is introduced:
name equivalence.
Two type expressions are name equivalent if and only if they are
identical.
Under structural equivalence, names are replaced by the type expressions they define,
• so two types are structurally equivalent if they represent two structurally
equivalent type expressions when all names have been substituted.
For example, ptr and pointer(integer) are not name equivalent but they are
structurally equivalent.
Note:
Confusion arises from the fact that many implementations associate an implicit type
name with each declared identifier.
Thus, some variables of the above example may not be name equivalent.
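The difference between the two notions can be sketched by expanding type names before comparing. The table `type_names`, the tuple encoding of constructed types, and the function names are assumptions, and the sketch assumes that no type name is defined recursively.

```python
# Hypothetical table of named types: ptr is declared as pointer(integer).
type_names = {"ptr": ("pointer", "integer")}

def expand(t):
    """Replace every type name by the type expression it defines."""
    if isinstance(t, str):
        return expand(type_names[t]) if t in type_names else t
    return (t[0],) + tuple(expand(arg) for arg in t[1:])

def name_equiv(s, t) -> bool:
    """Name equivalence: the expressions must be identical as written."""
    return s == t

def struct_equiv(s, t) -> bool:
    """Structural equivalence: compare after substituting all names."""
    return expand(s) == expand(t)

print(name_equiv("ptr", ("pointer", "integer")))    # False
print(struct_equiv("ptr", ("pointer", "integer")))  # True
```

Recursive type names, such as a list cell that points to itself, would make `expand` loop forever and need a cycle-aware comparison instead.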
Type Conversions
Consider expressions like x + i, where x is of type real and i is of type integer.
The machine cannot execute this operation directly, as it involves different types of
values.
However, most languages accept such expressions; the compiler is in
charge of converting one of the operands into the type of the other.
The type checker can be used to insert these conversion operations into the intermediate
representation of the source program.
For example, a conversion operator may be inserted whenever an operand needs to be implicitly
converted.
Type conversions may be implicit or explicit:
Explicit – done by the programmer
• For example, through a cast written in the source program.
Implicit – done by the compiler
• For example, when an integer operand is combined with a real operand.
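How a type checker might insert an implicit conversion can be sketched for a single addition. The operator names `inttoreal`, `int+` and `real+`, the textual form of the generated code, and the function name `coerce_add` are assumptions chosen for illustration.

```python
def coerce_add(left: tuple, right: tuple) -> tuple:
    """left and right are (code, type) pairs; returns (code, type) for left + right,
    inserting a hypothetical inttoreal operator where the types differ."""
    (lc, lt), (rc, rt) = left, right
    if lt == "integer" and rt == "integer":
        return (f"({lc} int+ {rc})", "integer")
    if lt == "real" and rt == "real":
        return (f"({lc} real+ {rc})", "real")
    if lt == "real" and rt == "integer":
        return (f"({lc} real+ inttoreal({rc}))", "real")
    if lt == "integer" and rt == "real":
        return (f"(inttoreal({lc}) real+ {rc})", "real")
    return ("", "type_error")

print(coerce_add(("x", "real"), ("i", "integer")))
# ('(x real+ inttoreal(i))', 'real')
```

The conversion is decided entirely from the operand types, so it fits into the same bottom-up pass that computes the type attribute.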