CD
2 marks
a) Regular Expression (RE) for strings where the leftmost symbol differs from the
rightmost symbol over {a, b}
We want strings such that:
First character ≠ last character.
Over alphabet Σ = {a, b}
Possible cases:
Starts with a and ends with b
Starts with b and ends with a
Regular Expression:
a(a|b)*b | b(a|b)*a
This expression covers all strings:
That start with a and end with b
Or start with b and end with a
With any number of as and bs in between.
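A quick sanity check on the RE: the language is exactly the set of strings over {a, b} whose first and last characters differ. A minimal C sketch (illustrative, not part of the standard answer) that tests membership directly:
#include <stdio.h>
#include <string.h>

/* Returns 1 iff s is over {a,b} and its first and last symbols differ,
   i.e. iff s matches a(a|b)*b | b(a|b)*a. */
int in_language(const char *s) {
    size_t n = strlen(s);
    if (n < 2) return 0;                 /* need two distinct positions */
    for (size_t i = 0; i < n; i++)
        if (s[i] != 'a' && s[i] != 'b') return 0;
    return s[0] != s[n - 1];
}

int main(void) {
    printf("%d %d %d\n", in_language("ab"), in_language("babba"), in_language("aa"));
    return 0;   /* prints: 1 1 0 */
}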
b) Why is buffering used in lexical analysis? What are the commonly used buffering
methods?
Why buffering is used:
Lexical analysis involves reading characters from the source file.
To improve efficiency, characters are read in blocks instead of one-by-one.
Buffering reduces the number of I/O operations, which are expensive.
Common buffering methods:
Single Buffering:
One buffer holds a block of input.
Not efficient for backtracking, as you may need to reread input.
Double Buffering:
Uses two halves of a buffer.
When one half is exhausted, the other is loaded while scanning continues.
Reduces latency and supports lookahead and backtracking.
Sentinel Method (used with double buffering):
Each half of buffer ends with a sentinel (usually EOF).
Simplifies end-of-buffer detection without needing explicit checks.
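A minimal C sketch of the double-buffer-with-sentinel scheme, assuming a '\0' sentinel and an illustrative file name; real scanners use the EOF character as the sentinel and also maintain a lexemeBegin pointer alongside forward:
#include <stdio.h>

#define N 4096            /* size of each buffer half (illustrative) */
#define SENTINEL '\0'     /* stands in for eof; assumes no NUL bytes in input */

static char buf[2 * N + 2];   /* two halves; buf[N] and buf[2N+1] hold sentinels */
static char *forward;         /* scanning pointer */
static FILE *src;

static void load(char *half) {            /* refill one half, add the sentinel */
    size_t got = fread(half, 1, N, src);
    half[got] = SENTINEL;
}

static int next_char(void) {
    int c = (unsigned char)*forward++;
    if (c == SENTINEL) {
        if (forward == buf + N + 1) {            /* end of first half: reload second */
            load(buf + N + 1);
            c = (unsigned char)*forward++;
        } else if (forward == buf + 2 * N + 2) { /* end of second half: wrap around */
            forward = buf;
            load(buf);
            c = (unsigned char)*forward++;
        } else {
            return EOF;                          /* sentinel mid-half: real end of input */
        }
        if (c == SENTINEL) return EOF;           /* freshly loaded half was empty */
    }
    return c;
}

int main(void) {
    src = fopen("input.txt", "r");   /* hypothetical input file */
    if (!src) return 1;
    forward = buf;
    load(buf);
    for (int c; (c = next_char()) != EOF; )
        putchar(c);                  /* one comparison per character fetched */
    fclose(src);
    return 0;
}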
c) What is 'Handle Pruning' in bottom-up parsing?
Definition:
Handle pruning is the reduction process in bottom-up parsing where:
A handle (a substring matching the right-hand side of a production, whose reduction corresponds to one step of a rightmost derivation in reverse) of the sentential form is identified.
It is replaced with its corresponding non-terminal (reverse of production).
Example:
If we have production: E → E + T, and stack contains E + T, then:
E + T is a handle.
Replacing it with E is handle pruning.
This is the essence of Shift-Reduce Parsing.
d) Conflicts in Shift-Reduce Parser
Shift-Reduce Conflict:
Parser is unsure whether to shift the next symbol or reduce the stack content.
Common in ambiguous grammars.
Example:
S → if E then S else S | if E then S
While parsing if E then S else S, after seeing if E then S the parser cannot tell whether to shift else or to reduce.
Reduce-Reduce Conflict:
Two different productions could be reduced from the current input.
Parser cannot decide which one to apply.
Example for Reduce-Reduce:
Given:
A → α
B → α
If α is on the stack, the parser is confused whether to reduce it to A or B.
e) What is Three Address Code (TAC)? Mention the various representations.
Definition:
Three Address Code is an intermediate representation (IR) used in compilers.
Each instruction has:
At most three operands
Looks like: x = y op z
Representations of TAC:
Quadruples:
(operator, arg1, arg2, result)
Example: x = y + z → (+, y, z, x)
Triples:
(operator, arg1, arg2)
Result is implied by the index of the instruction.
Indirect Triples:
Similar to triples but uses a pointer table to the list of triples.
Facilitates easy code reordering.
Static Single Assignment (SSA):
Each variable is assigned only once.
Introduces φ-functions for merging control paths.
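A minimal C sketch of the quadruple form from the list above (the field names are illustrative, not a fixed compiler API):
#include <stdio.h>

/* A quadruple: (operator, arg1, arg2, result). */
typedef struct {
    char op;                           /* '+', '*', '=' ... */
    const char *arg1, *arg2, *result;
} Quad;

int main(void) {
    /* x = y + z as a single quadruple */
    Quad q = { '+', "y", "z", "x" };
    printf("(%c, %s, %s, %s)\n", q.op, q.arg1, q.arg2, q.result);
    return 0;
}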
f) What is type checking? When is it done?
Definition:
Type checking is the process of ensuring semantic correctness of a program:
Operands in an expression must be of compatible types.
Example: You cannot add a string to an integer.
When is type checking done?
Static Type Checking:
Done at compile-time.
Most common in statically-typed languages like C, Java.
Dynamic Type Checking:
Done at run-time.
Used in dynamically-typed languages like Python, JavaScript.
g) Machine Independent Code Optimization Techniques
These are optimizations applied before target machine code generation.
Common techniques:
Constant Folding:
Evaluate constant expressions at compile time.
Example: x = 2 + 3 → x = 5
Constant Propagation:
Replace variables known to have constant values.
Example: If x = 5, then a later y = x + 1 becomes y = 5 + 1, which folds to y = 6.
Dead Code Elimination:
Remove code that has no effect on program output.
Example: Unused assignments.
Common Subexpression Elimination (CSE):
Avoid recomputing expressions whose results are already known.
Example: Reuse result of a + b if already computed.
Loop Invariant Code Motion:
Move calculations that do not change inside loops to outside.
Strength Reduction:
Replace expensive operations with cheaper ones.
Example: x = i * 2 → x = i + i
Copy Propagation:
Replace variables that are simply copies of others.
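As an illustration of the first technique above, a minimal C sketch of constant folding over a toy expression node (the node layout is an assumption of this example):
#include <stdio.h>

/* A tiny expression node; op == 0 marks a constant leaf. */
typedef struct Expr {
    char op;                  /* 0 for a constant leaf, else '+', '-', '*' */
    int value;                /* meaningful only for leaves */
    struct Expr *lhs, *rhs;
} Expr;

/* Fold a binary node whose operands are both constants into a leaf. */
void fold(Expr *e) {
    if (e->op && e->lhs->op == 0 && e->rhs->op == 0) {
        switch (e->op) {
        case '+': e->value = e->lhs->value + e->rhs->value; break;
        case '-': e->value = e->lhs->value - e->rhs->value; break;
        case '*': e->value = e->lhs->value * e->rhs->value; break;
        }
        e->op = 0;            /* the node is now a constant leaf */
    }
}

int main(void) {
    Expr two = {0, 2}, three = {0, 3};
    Expr sum = {'+', 0, &two, &three};
    fold(&sum);               /* x = 2 + 3 becomes x = 5 */
    printf("%d\n", sum.value);
    return 0;
}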
h) What is a DAG? Applications in Code Optimization
DAG: Directed Acyclic Graph
A DAG is a graph with:
Directed edges
No cycle
In compiler design, DAG is used to represent basic blocks.
Applications of DAG:
Common Subexpression Elimination
Subexpressions with the same operands and operators share a node.
Dead Code Elimination
Nodes with no effect or usage can be pruned.
Code Generation
Generate optimized instruction sequences from DAG.
Expression Simplification
Identify algebraic identities or redundancies.
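A minimal C sketch of DAG construction with node reuse (value numbering): before creating a node we look for an identical existing one, which is exactly how the DAG exposes common subexpressions. Names and sizes are illustrative:
#include <stdio.h>
#include <string.h>

typedef struct { char op[8]; int left, right; } Node;   /* -1 = no child */
static Node nodes[64];
static int nnodes = 0;

static int find_or_add(const char *op, int l, int r) {
    for (int i = 0; i < nnodes; i++)
        if (!strcmp(nodes[i].op, op) && nodes[i].left == l && nodes[i].right == r)
            return i;                       /* common subexpression: reuse node */
    strcpy(nodes[nnodes].op, op);
    nodes[nnodes].left = l;
    nodes[nnodes].right = r;
    return nnodes++;
}

static int leaf(const char *name) { return find_or_add(name, -1, -1); }

int main(void) {
    /* a + b built twice maps to the same node */
    int n1 = find_or_add("+", leaf("a"), leaf("b"));
    int n2 = find_or_add("+", leaf("a"), leaf("b"));
    printf("same node: %s\n", n1 == n2 ? "yes" : "no");   /* yes */
    return 0;
}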
i) Evaluate E.value for the expression: 2#3&5#6&4 using given grammar
Grammar & Semantic Rules:
E → E1 # T { E.value = E1.value * T.value }
|T { E.value = T.value }
T → T1 & F { T.value = T1.value + F.value }
|F { T.value = F.value }
F → num { F.value = num.value }
Now compute step-by-step:
Expression: 2 # 3 & 5 # 6 & 4
Group by precedence: & (the T level) binds tighter than # (the E level), and the left-recursive rule E → E1 # T makes # left associative.
So group as:
((2 # (3 & 5)) # (6 & 4))
Compute:
3 & 5 = 3 + 5 = 8
6 & 4 = 6 + 4 = 10
2 # 8 = 2 * 8 = 16
16 # 10 = 16 * 10 = 160
Answer: E.value = 160
j) Various Data Structures for Symbol Table
Symbol table stores info about identifiers (variables, functions, etc.).
Data Structures used:
Linear List:
Simple list/array
Slow lookup: O(n)
Hash Table:
Most commonly used.
Fast average lookup time: O(1)
Binary Search Tree (BST):
Balanced trees like AVL, Red-Black Tree.
Lookup time: O(log n)
Trie (Prefix Tree):
Efficient for strings.
Used in compilers for fast prefix lookup.
Self-Organizing List:
Frequently used items are moved to the front.
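A minimal C sketch of the hash-table variant, using chaining for collisions (table size and hash function are illustrative choices):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUCKETS 211           /* a common table size; the choice is illustrative */

typedef struct Symbol {
    char *name;
    char *type;
    struct Symbol *next;      /* chaining resolves collisions */
} Symbol;

static Symbol *table[BUCKETS];

static unsigned hash(const char *s) {      /* simple string hash */
    unsigned h = 0;
    while (*s) h = h * 31 + (unsigned char)*s++;
    return h % BUCKETS;
}

Symbol *lookup(const char *name) {
    for (Symbol *p = table[hash(name)]; p; p = p->next)
        if (strcmp(p->name, name) == 0) return p;
    return NULL;
}

Symbol *insert(const char *name, const char *type) {
    unsigned h = hash(name);
    Symbol *p = malloc(sizeof *p);
    p->name = strdup(name);
    p->type = strdup(type);
    p->next = table[h];       /* O(1) average insert and lookup */
    table[h] = p;
    return p;
}

int main(void) {
    insert("count", "int");
    Symbol *s = lookup("count");
    printf("%s : %s\n", s->name, s->type);
    return 0;
}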
a) Differentiate DFA and NFA
Feature | DFA (Deterministic Finite Automaton) | NFA (Nondeterministic Finite Automaton)
Transition | One transition per input symbol from a state | Can have multiple transitions for the same input symbol
Epsilon (ε) transitions | Not allowed | Allowed (can move without an input symbol)
Determinism | Fully deterministic (one path) | May have multiple paths or none
State transition | Exactly one transition | Zero, one, or more transitions
Implementation | Easier and more efficient | Simpler to construct but harder to simulate
Equivalence | Equivalent to NFA | Equivalent to DFA (can be converted)
b) Mention the Job of Lexical Analysis
Lexical analysis is the first phase of a compiler.
Jobs:
Tokenization: Convert input characters into tokens (like keywords, identifiers,
numbers).
Remove whitespace and comments
Recognize lexemes
Detect lexical errors (invalid tokens)
Pass tokens to the syntax analyzer
Communicate with the symbol table
Buffer input for efficiency
c) Explain the Specifications of LEX Programming
LEX is a lexical analyzer generator used to create scanners in C.
Specifications of a LEX Program:
A LEX file has three sections separated by %%:
1. Definition Section
2. Rules Section
3. User Code Section
Example:
%{
#include <stdio.h>
%}
%%
[0-9]+ { printf("NUMBER\n"); }
[a-zA-Z]+ { printf("IDENTIFIER\n"); }
%%
int main() {
yylex();
return 0;
}
Definition Section: Includes headers or global declarations.
Rules Section: Regex + corresponding action in C code.
User Code Section: Main function, calling yylex().
d) What Do You Mean by LL(1)?
LL(1) is a type of top-down parser.
First L: Scans input from Left to right.
Second L: Constructs Leftmost derivation.
1: Uses 1 lookahead symbol
Key Points:
Simple and fast parsing.
Requires grammar to be non-left-recursive and factored.
Predictive parsing table used.
e) What is Handle Pruning?
Handle pruning is part of bottom-up parsing.
Definition:
A handle is a substring that matches the right-hand side of a production.
Pruning replaces this handle with the corresponding non-terminal.
Repeated until the start symbol is produced.
Example:
For E → E + T, if stack has E + T, this is a handle, and it's replaced with E.
f) What is Backpatching?
Backpatching is a technique in intermediate code generation for managing jump targets
that are not known initially.
Why it's used:
For control flow statements like if, while, where jump addresses are determined later.
Example:
if (a < b)
x = x + 1;
While generating code for the if, the jump target isn't known.
Backpatching fills the jump target after parsing the block.
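A minimal C sketch of the idea, assuming a quad array indexed by instruction number; full backpatching keeps whole lists of unpatched jumps (truelist/falselist), while this sketch patches a single jump:
#include <stdio.h>

/* Emitted quads, reduced to just their jump targets; -1 means "not yet patched". */
static int target[100];
static int nextquad = 0;

static int emit_jump(void) {        /* emit "goto _" and return its index */
    target[nextquad] = -1;
    return nextquad++;
}

static void backpatch(int quad, int label) {
    target[quad] = label;           /* fill in the target once it is known */
}

int main(void) {
    /* translating: if (a < b) x = x + 1; */
    int jfalse = emit_jump();        /* jump past the body when the test fails */
    nextquad += 1;                   /* stand-in for the quad of "x = x + 1" */
    backpatch(jfalse, nextquad);     /* the end of the if is now known */
    printf("quad %d jumps to quad %d\n", jfalse, target[jfalse]);
    return 0;
}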
g) Explain Loop Invariant with an Example
A loop invariant is a piece of code inside a loop that yields the same result on every
iteration.
Purpose:
Move such code outside the loop for optimization.
Example:
for (int i = 0; i < n; i++) {
y = x + 2;
a[i] = y * i;
}
y = x + 2 is loop invariant.
Move it outside:
y = x + 2;
for (int i = 0; i < n; i++) {
a[i] = y * i;
}
h) What is S-Attributed Grammar?
An S-attributed grammar is a syntax-directed definition where:
All attributes are synthesized.
Attributes are computed from child nodes only
Used in:
Bottom-up parsing
Easy to evaluate in post-order traversal.
i) Differentiate Static and Dynamic Storage Allocation
Feature Static Allocation Dynamic Allocation
When allocated Compile-time Run-time
Flexibility Inflexible Flexible
Memory usage Fixed size Varies as needed
Examples Global/static variables Heap memory, dynamic arrays
Efficiency Fast access Slower due to allocation
6 marks
a) Define Regular Expression. Explain the properties of Regular Expressions.
Construct an FA equivalent to the regular expression (0+1)(00+11)(0+1).
Regular Expression (RE):
A regular expression is a formal notation used to describe patterns in strings over a given
alphabet. It defines a set of strings that belong to a particular language and is instrumental in
lexical analysis for pattern matching.
Properties of Regular Expressions:
Union ( + ): Represents the choice between expressions. For example, a + b denotes
either 'a' or 'b'.
Concatenation: Sequential arrangement of expressions. ab denotes 'a' followed by 'b'.
Kleene Star ( * ): Denotes zero or more repetitions of the preceding expression. a*
denotes '', 'a', 'aa', 'aaa', etc.
Precedence: Kleene Star has the highest precedence, followed by concatenation, and
then union.
Parentheses: Used to group expressions and override default precedence.
Constructing Finite Automaton (FA) for (0+1)(00+11)(0+1):
To construct an FA for the given regular expression, we can follow these steps:
Breakdown the Expression:
First part: (0+1) – Accepts either '0' or '1'.
Second part: (00+11) – Accepts either '00' or '11'.
Third part: (0+1) – Accepts either '0' or '1'.
Construct FA for Each Part:
(0+1): A simple FA with a start state transitioning to an accept state on input '0'
or '1'.
(00+11): An FA that accepts either '00' or '11'. This requires branching paths:
One path for '00': start → '0' → '0' → accept.
Another path for '11': start → '1' → '1' → accept.
(0+1): Similar to the first part.
Combine the FAs:
Concatenate the three FAs by connecting the accept state of the first to the start
state of the second, and similarly for the second to the third.
Ensure that the transitions are properly labeled and that the final accept state is
clearly defined.
The resulting FA will accept strings where the first and last symbols are either '0' or '1', and
the middle part is either '00' or '11'. Examples of accepted strings include '0000', '0110',
'1001', and '1111'.
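A minimal C sketch of the resulting automaton as a transition table (the state numbering is an assumption of this example; state 6 is a dead state):
#include <stdio.h>

/* DFA for (0+1)(00+11)(0+1): states 0..5, state 6 is dead.
   delta[state][symbol] with symbol 0 or 1. */
static const int delta[7][2] = {
    {1, 1},   /* 0: read the first symbol */
    {2, 3},   /* 1: branch on '00' vs '11' */
    {4, 6},   /* 2: saw '0', need another '0' */
    {6, 4},   /* 3: saw '1', need another '1' */
    {5, 5},   /* 4: middle done, read the last symbol */
    {6, 6},   /* 5: accepting; any further input dies */
    {6, 6},   /* 6: dead */
};

int accepts(const char *w) {
    int state = 0;
    for (; *w; w++) {
        if (*w != '0' && *w != '1') return 0;
        state = delta[state][*w - '0'];
    }
    return state == 5;
}

int main(void) {
    printf("%d %d %d %d\n",
           accepts("0000"), accepts("0110"), accepts("1001"), accepts("010"));
    return 0;   /* prints: 1 1 1 0 */
}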
b) Discuss the issues associated with grammars in top-down parsing.
Top-down parsing constructs a parse tree from the start symbol and proceeds by expanding
productions. However, certain grammar structures pose challenges:
Left Recursion:
Grammars with left-recursive rules (e.g., A → Aα) cause infinite recursion in
top-down parsers.
Such grammars need to be transformed to eliminate left recursion.
Ambiguity:
Ambiguous grammars have multiple parse trees for the same string, making it
difficult for parsers to decide which production to use.
Ambiguity must be resolved for deterministic parsing.
Backtracking:
Without predictive capabilities, parsers may need to backtrack to try alternative
productions, leading to inefficiency.
Left Factoring:
When multiple productions for a non-terminal share a common prefix, the
parser cannot decide which production to use based on the next input symbol.
Left factoring rewrites the grammar to defer the decision until enough input is
read.
Non-Determinism:
Grammars that require lookahead of more than one symbol to make parsing
decisions are non-deterministic and complicate top-down parsing.
Addressing these issues often involves transforming the grammar to a suitable form for top-
down parsing, such as eliminating left recursion and performing left factoring.
c) Construct the CLR parser for the following grammar:
S → (L) | a
L → L,S | S
Steps to Construct a CLR (Canonical LR) Parser:
Augment the Grammar:
Add a new start symbol: S' → S.
Compute the LR(1) Items:
Generate the collection of LR(1) items, which are sets of items with lookahead
symbols.
Each item is of the form [A → α·β, a], where '·' indicates the position in the
production, and 'a' is the lookahead symbol.
Construct the Canonical Collection of Sets of LR(1) Items:
Begin with the closure of the initial item [S' → ·S, $].
Use the GOTO function to compute transitions between item sets.
Build the Parsing Table:
For each state (item set), determine the ACTION and GOTO entries based on
the items:
If [A → α·aβ, b] is in the state, and 'a' is a terminal, then ACTION[state, a]
= shift to the state corresponding to GOTO(state, a).
If [A → α·, a] is in the state, then ACTION[state, a] = reduce by A → α.
If [S' → S·, $] is in the state, then ACTION[state, $] = accept.
GOTO[state, A] is defined for non-terminals A.
Due to the complexity and length of the parsing table, it's advisable to refer to detailed
examples or use parser generation tools for practical implementation.
d) Consider the following grammar:
S → AS | b
A → a
Construct the SLR parse table for the grammar.
Steps:
Augment the Grammar:
Add S' → S.
Compute the LR(0) Items:
Generate the canonical collection of LR(0) items.
Compute the FOLLOW Sets:
FOLLOW(S) = { $ }
FOLLOW(A) = { a, b } (A is always followed by S, and FIRST(S) = { a, b })
Construct the Parsing Table:
For each state, determine the ACTION and GOTO entries based on the items
and FOLLOW sets.
Use the standard SLR parsing table construction method, where reductions are
applied based on FOLLOW sets.
The resulting SLR parsing table will guide the parser in making shift and reduce decisions
based on the current state and input symbol.
e) Draw the annotated parse tree for:
i) int a, b, c
ii) float w, x, y, z
Given Syntax-Directed Definitions (SDD):
D → T L with semantic rule: L.inh = T.type
T → int with semantic rule: T.type = integer
T → float with semantic rule: T.type = float
L → L1 , id with semantic rule: L1.inh = L.inh; addType(id.entry, L.inh)
L → id with semantic rule: addType(id.entry, L.inh)
Annotated Parse Tree for int a, b, c:
Parse T → int, so T.type = integer.
Apply D → T L, setting L.inh = T.type = integer.
Parse L → L1 , id:
First, L1 → L2 , id:
L2 → id:
Apply addType(a.entry, integer).
Apply addType(b.entry, integer).
Apply addType(c.entry, integer).
Annotated Parse Tree for float w, x, y, z:
Parse T → float, so T.type = float.
Apply D → T L, setting L.inh = T.type = float.
Parse L → L1 , id:
Continue recursively for each identifier:
Apply addType(w.entry, float).
Apply addType(x.entry, float).
Apply addType(y.entry, float).
Apply addType(z.entry, float).
Each addType function associates the identifier with its type in the symbol table.
f) Explain:
(i) Common Subexpression Elimination (CSE):
CSE identifies expressions that are computed multiple times with the same operands
and eliminates redundant computations by computing the expression once and reusing
the result.
Example:
a = b + c;
d = b + c;
Optimized:
t = b + c;
a = t;
d = t;
(ii) Code Motion:
Code motion moves computations outside loops when the result does not change
across iterations, reducing redundant calculations.
Example:
for (i = 0; i < n; i++) {
y = a + b;
z[i] = y * i;
}
Optimized:
y = a + b;
for (i = 0; i < n; i++) {
z[i] = y * i;
}
These optimizations enhance performance by reducing unnecessary computations.
j) Explain the characteristics of peephole optimization.
Peephole optimization is a local optimization technique that examines a small set of
instructions (the "peephole") to identify and replace inefficient sequences with more efficient
ones.
Characteristics:
Local Scope: Operates on a small window of instructions, typically within a basic
block.
Pattern Matching: Identifies specific patterns that can be optimized.
Simplification: Replaces complex instruction sequences with simpler ones.
Redundancy Elimination: Removes unnecessary instructions, such as redundant
loads and stores.
Strength Reduction: Replaces expensive operations with cheaper ones (e.g.,
replacing multiplication with addition).
Example:
MOV R1, R2
MOV R2, R1
Optimized:
; Removed redundant instructions
Peephole optimization improves code efficiency and reduces code size.
k) Describe S-attributed and L-attributed grammars with suitable example
S-attributed Grammar:
An S-attributed grammar is a syntax-directed definition where only synthesized
attributes are used.
Synthesized attributes are those which are computed from the attribute values of
the children nodes in a parse tree (i.e., bottom-up).
It is typically used in bottom-up parsing (like LR parsers).
✅ Example:
Consider the grammar for arithmetic expressions:
E → E1 + T { E.val = E1.val + T.val }
E→T { E.val = T.val }
T → num { T.val = num.val }
Assume:
num.val is taken from the lexical value (say 3 for num).
The attribute val is synthesized and propagated upward.
Parse Tree:
For input 2 + 3, the attribute val is computed bottom-up:
T → num (val = 2)
E1 → T (val = 2)
T → num (val = 3)
E → E1 + T (val = 2 + 3 = 5)
L-attributed Grammar:
An L-attributed grammar may use:
Synthesized attributes, and
Inherited attributes — which are passed from parent or left sibling nodes to the
current node.
Used mostly in top-down parsing (like LL parsers).
✅ Example:
D→TL
T → int { T.type = "int" }
T → float { T.type = "float" }
L → L1 , id { L1.inh = L.inh; addType(id.entry, L.inh) }
L → id { addType(id.entry, L.inh) }
Here, L.inh is an inherited attribute passed from T to L.
The attribute type is inherited from T and passed to each id in L.
Input:
For int a, b, attributes are passed as:
T → int sets T.type = int
L.inh = T.type = int
L → L1 , id, both L1 and id get the type "int"
Key Differences:
Feature | S-attributed Grammar | L-attributed Grammar
Attribute types | Only synthesized | Both synthesized and inherited
Parsing style | Bottom-up (LR) | Top-down (LL)
Dependency | From children to parent | From parent and siblings to child
Use case | Postfix expressions, semantic actions in LR | Variable declarations, type propagation
L) Explain various storage allocation strategies with examples
✅ 1. Static Allocation:
Memory for all variables is allocated at compile time.
The memory location of each variable does not change during runtime.
Used in global/static variables and constants.
Example:
int x = 5;
Here, x gets a fixed location in data segment.
✅ Advantages:
No overhead of allocation/deallocation.
Efficient access.
❌ Disadvantages:
Inflexible; cannot handle recursive procedures or dynamic structures.
✅ 2. Stack Allocation
Used for local variables inside functions or blocks.
Memory is allocated/deallocated in LIFO (Last-In-First-Out) order.
Stack grows with function calls and shrinks on return.
Example:
void func() {
int a = 10; // stored in stack
}
Stack Frame: Includes local variables, return address, etc.
✅ Advantages:
Efficient for nested function calls.
Automatic deallocation.
❌ Disadvantages:
Lifetime tied to function calls.
No support for dynamic memory.
✅ 3. Heap Allocation:
Memory is allocated at runtime using functions like malloc(), new, etc.
It is suitable for dynamic data structures like linked lists, trees.
Example:
int* p = (int*)malloc(sizeof(int));
Here, memory for an integer is dynamically allocated from the heap.
✅ Advantages:
Flexible, can grow or shrink as needed.
Variables can outlive function calls.
❌ Disadvantages:
Slower than stack/static allocation.
Memory leaks if not freed properly.
Comparison:
Feature | Static | Stack | Heap
Allocation Time | Compile-time | Run-time | Run-time
Lifetime | Entire program | Duration of function | Until explicitly freed
Flexibility | Rigid | Moderate | High
Speed | Fastest | Fast | Slower
Use case | Global/static variables | Local variables | Dynamic data structures
a) Write the output of each phase of compilation for the statement:
a = (b + c) * (b + c) * 2;
Phase | Output
Lexical Analysis | Tokens: id(a), =, (, id(b), +, id(c), ), *, (, id(b), +, id(c), ), *, num(2), ;
Syntax Analysis | Parse tree showing the structure of the expression with correct operator precedence
Semantic Analysis | Type checking and validation of identifiers a, b, c, and constant 2
Intermediate Code | t1 = b + c ; t2 = t1 * t1 ; t3 = t2 * 2 ; a = t3
Optimization | Eliminates the repeated computation of (b + c)
Code Generation | Machine or assembly instructions generated for the final expression
b) Describe the structure of a LEX program. Write a LEX specification to remove
comments (both single line and multiple line) from C source code.
Answer:
Structure of a LEX Program:
%{
/* C declarations */
%}
%%
/* Rules: regex patterns and actions */
%%
/* User code: main function, utilities */
LEX Specification to Remove Comments:
%{
#include <stdio.h>
%}
%%
"//".* ; // Remove single-line comments
"/\\*"([^*]|\\*+[^*/])*"\\*/" ; // Remove multi-line comments
.|\n { ECHO; } // Print everything else
%%
int main() {
yylex();
return 0;
}
c) Define token, lexeme, and pattern. For the following program, identify lexemes,
tokens, and patterns:
Program:
int main()
{
int a, b;
printf("Enter two integers to swap\n");
scanf("%d%d", &a, &b);
a = a + b;
b = a - b;
a = a - b;
printf("a = %d\nb = %d\n", a, b);
return 0;
}
Answer:
Lexeme | Token | Pattern
int | Keyword | int
main | Identifier | [a-zA-Z_][a-zA-Z0-9_]*
(, ), {, } | Punctuation | Literal characters
a, b | Identifier | [a-zA-Z_][a-zA-Z0-9_]*
,, ; | Punctuation | Literal characters
printf, scanf | Identifier | [a-zA-Z_][a-zA-Z0-9_]*
"Enter two integers to swap\n" | String Literal | "[^"]*"
"%d%d" | String Literal | "[^"]*"
&a, &b | Operator + ID | &[a-zA-Z_][a-zA-Z0-9_]*
=, +, - | Operator | Literal characters
return | Keyword | return
0 | Numeric Literal | [0-9]+
d) Show that the following grammar is not SLR(1) but is CLR(1):
Grammar:
S → AaAb | BbBa
A → ε
B → ε
Answer:
SLR(1) Analysis:
FIRST(A) = {ε}, FIRST(B) = {ε}
FOLLOW(A) = {a, b} (in AaAb, the first A is followed by a and the second by b)
FOLLOW(B) = {a, b} (in BbBa, the first B is followed by b and the second by a)
In the initial state both A → ε and B → ε are candidate reductions, and FOLLOW(A) ∩ FOLLOW(B) = {a, b} ⇒ Reduce-Reduce conflict in the SLR(1) table.
CLR(1) Analysis:
CLR(1) uses lookahead in items, not FOLLOW sets.
Each ε-production is associated with its specific context:
A → ε , lookahead a in AaAb
B → ε , lookahead b in BbBa
No conflict due to distinct lookahead ⇒ Grammar is CLR(1)
e) Write algorithm to compute FIRST() and FOLLOW() for the following grammar:
Grammar:
S → ACB | CbB | Ba
A → da | BC
B → g | ε
C → h | ε
Algorithm for FIRST(X):
If X is a terminal, FIRST(X) = {X}
If X → ε, then ε ∈ FIRST(X)
If X → Y₁Y₂...Yₙ:
Add FIRST(Y₁) excluding ε to FIRST(X)
If ε ∈ FIRST(Y₁), add FIRST(Y₂), and so on
If ε ∈ all FIRST(Yᵢ), then ε ∈ FIRST(X)
Algorithm for FOLLOW(X):
Place $ in FOLLOW(start symbol)
For each production A → αBβ:
Add FIRST(β) (except ε) to FOLLOW(B)
If ε ∈ FIRST(β), add FOLLOW(A) to FOLLOW(B)
For each production A → αB:
Add FOLLOW(A) to FOLLOW(B)
FIRST Sets:
B → g | ε ⇒ FIRST(B) = {g, ε}
C → h | ε ⇒ FIRST(C) = {h, ε}
A → da | BC:
da contributes d; BC contributes FIRST(B) ∪ FIRST(C) = {g, h}, and since both B and C are nullable, ε ∈ FIRST(A)
⇒ FIRST(A) = {d, g, h, ε}
S → ACB | CbB | Ba:
FIRST(ACB) = {d, g, h}; A, C, B are all nullable, so ε is included as well
FIRST(CbB): C gives {h}; C is nullable, so add b ⇒ {h, b}
FIRST(Ba): B gives {g}; B is nullable, so add a ⇒ {g, a}
⇒ FIRST(S) = {d, g, h, b, a, ε}
FOLLOW Sets:
FOLLOW(S) = { $ }
A appears in S → ACB followed by CB: FIRST(CB) = {h, g, ε}; CB is nullable, so add FOLLOW(S) = {$}
⇒ FOLLOW(A) = {h, g, $}
B appears at the end of ACB and of CbB (add FOLLOW(S) = {$}), before a in Ba (add a), and before C in A → BC (add FIRST(C) = {h}; C is nullable, so add FOLLOW(A) = {h, g, $})
⇒ FOLLOW(B) = {a, g, h, $}
C appears before B in ACB (add FIRST(B) = {g}; B is nullable, so add FOLLOW(S) = {$}), before b in CbB (add b), and at the end of A → BC (add FOLLOW(A) = {h, g, $})
⇒ FOLLOW(C) = {g, b, h, $}
f) Describe the issues associated with grammars in top-down parsing with suitable
example.
Answer:
Top-down parsers, like recursive descent parsers, have two main issues with grammars:
Left Recursion
A grammar is left-recursive if a non-terminal appears on the leftmost side of its
own production.
Example (Problematic)
E → E + T | T
T → T * F | F
F → (E) | id
Issue: Recursive descent parser enters infinite recursion.
Solution: Convert left-recursion to right recursion:
E → T E'
E' → + T E' | ε
Left Factoring
When two or more productions for a non-terminal begin with the same prefix,
the parser cannot decide which one to choose.
Example:
S → if E then S else S
| if E then S
Issue: Parser gets confused after if E then S.
Solution (Left Factoring):
S → if E then S S'
S' → else S | ε
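After these transformations, the grammar can be parsed by recursive descent without backtracking. A minimal C sketch for the transformed expression grammar E → T E', E' → + T E' | ε, T → id, with the single letter 'i' standing in for an id token:
#include <stdio.h>

static const char *p;        /* input cursor */
static int T(void);
static int Eprime(void);

static int E(void) { return T() && Eprime(); }

static int Eprime(void) {
    if (*p == '+') { p++; return T() && Eprime(); }
    return 1;                /* epsilon: no backtracking needed */
}

static int T(void) {
    if (*p == 'i') { p++; return 1; }
    return 0;
}

int main(void) {
    p = "i+i+i";
    printf("%s\n", (E() && *p == '\0') ? "accepted" : "rejected");
    return 0;
}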
g) Compare local optimization with global optimization with suitable example.
Feature | Local Optimization | Global Optimization
Scope | Within a basic block | Across multiple basic blocks
Speed | Fast and simpler | Slower due to analysis of control flow
Example | Eliminate dead code inside one block | Move invariant code out of loops
Techniques | Constant folding, dead code elimination | Loop-invariant code motion, common subexpression elimination
Example:
Local Optimization:
a = 4 * 5; // constant folding → a = 20;
Global Optimization:
for(i = 0; i < 100; i++) {
x = y + z; // loop-invariant: move outside loop
a[i] = x * i;
}
h) Compare static, stack, and heap allocations.
Feature | Static Allocation | Stack Allocation | Heap Allocation
Lifetime | Entire program run | During function execution | Until manually deallocated
Storage Location | Data Segment | Stack Memory | Heap Memory
Speed | Fast | Faster | Slow
Flexibility | Fixed-size | Function scope | Dynamic, variable-size
Example | Global variables | Local variables | Dynamic memory (malloc/new)
i) Construct the DAG for the following basic block:
e := a + b
a := e - d
c := b * c
Answer:
Directed Acyclic Graph (DAG):
n1: [+] (a0, b0) with attached label e
n2: [-] (n1, d0) with attached label a
n3: [*] (b0, c0) with attached label c

        a: [-]
           /  \
      e: [+]   d0
         /  \
       a0    b0
              \
            c: [*]
                \
                 c0

(the leaf b0 has two parents: the [+] node and the [*] node)
Nodes represent operations (+, -, *)
Leaves are the initial values a0, b0, c0, d0
The computed value e is reused directly by the [-] node, so nothing is recomputed
j) Explain loop jamming and loop unrolling with example.
Loop Jamming:
Definition: Combining multiple loops that iterate over the same range into a single
loop.
Example:
for(i=0;i<n;i++) {
a[i] = b[i] + c[i];
}
for(i=0;i<n;i++) {
d[i] = e[i] * f[i];
}
After Loop Jamming
for(i=0;i<n;i++) {
a[i] = b[i] + c[i];
d[i] = e[i] * f[i];
}
Loop Unrolling:
Definition: Reducing the overhead of loop control by executing multiple iterations per
loop.
Example:
for(i=0;i<4;i++) {
a[i] = a[i] + 1;
}
After Unrolling:
a[0] = a[0] + 1;
a[1] = a[1] + 1;
a[2] = a[2] + 1;
a[3] = a[3] + 1;
k) Describe Peephole Optimization.
Answer:
Definition: A form of local optimization that examines and replaces short sequences
of instructions (a small "peephole") to improve performance or reduce code size.
Techniques:
Redundant instruction elimination
Example:
MOV R1, R2
MOV R2, R1 → Remove second
Strength reduction
Example:
x = y * 2 → x = y + y
Algebraic simplification
Example:
x = x + 0 → x = x
Jump optimization
Eliminate unnecessary GOTO or combine jumps.
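A toy C sketch of such a pass over a list of MOV instructions, collapsing the redundant pair shown above (the instruction representation is an assumption of this example):
#include <stdio.h>
#include <string.h>

typedef struct { char dst[4], src[4]; } Mov;

/* Collapse "MOV x, y ; MOV y, x" to the first instruction alone. */
int peephole(Mov *code, int n) {
    int out = 0;
    for (int i = 0; i < n; i++) {
        code[out++] = code[i];
        if (i + 1 < n &&
            strcmp(code[i].dst, code[i + 1].src) == 0 &&
            strcmp(code[i].src, code[i + 1].dst) == 0)
            i++;                    /* skip the redundant reverse copy */
    }
    return out;                     /* new instruction count */
}

int main(void) {
    Mov code[] = { {"R1", "R2"}, {"R2", "R1"}, {"R3", "R1"} };
    int n = peephole(code, 3);
    for (int i = 0; i < n; i++)
        printf("MOV %s, %s\n", code[i].dst, code[i].src);
    return 0;   /* MOV R1, R2 / MOV R3, R1 */
}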
l) How is scope information of variables stored in a symbol table? Explain.
Answer:
Symbol Table stores information about identifiers: name, type, scope, memory
location, etc.
Scope Handling Mechanisms:
Using Stack of Symbol Tables:
A new table is pushed when entering a block (function, loop).
Popped when exiting the block.
Ensures variables are visible only within their scope.
Linked List of Tables:
Each table points to its parent (enclosing) scope.
Lookup starts at the current scope and moves outward.
Hash Tables with Scope Information:
Each entry holds a scope level.
Helps handle variable shadowing
Example:
int x = 5;   // global scope
void func() {
    int x = 10; // local scope shadows global
}
The symbol table for func contains x=10, linked to the outer table where x=5.
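A minimal C sketch of the stack-of-tables mechanism: a linked list of scopes searched innermost-first, which is exactly how shadowing is resolved (names and sizes are illustrative):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct Entry { char name[32]; int value; struct Entry *next; } Entry;
typedef struct Scope { Entry *entries; struct Scope *parent; } Scope;

static Scope *current = NULL;

void enter_scope(void) {                 /* push on block entry */
    Scope *s = calloc(1, sizeof *s);
    s->parent = current;
    current = s;
}

void exit_scope(void) {                  /* pop on block exit */
    current = current->parent;           /* (entries leaked in this sketch) */
}

void define(const char *name, int value) {
    Entry *e = calloc(1, sizeof *e);
    strncpy(e->name, name, 31);
    e->value = value;
    e->next = current->entries;
    current->entries = e;
}

Entry *lookup(const char *name) {        /* walk outward through enclosing scopes */
    for (Scope *s = current; s; s = s->parent)
        for (Entry *e = s->entries; e; e = e->next)
            if (strcmp(e->name, name) == 0) return e;
    return NULL;
}

int main(void) {
    enter_scope(); define("x", 5);       /* global x = 5 */
    enter_scope(); define("x", 10);      /* func's x shadows it */
    printf("%d\n", lookup("x")->value);  /* 10 */
    exit_scope();
    printf("%d\n", lookup("x")->value);  /* 5 */
    return 0;
}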
16 marks
Q3. Construct an SLR parsing table for the following grammar: R → R + R | R R | (R) | a | b
Operator Precedence and Associativity:
() > Concatenation > +
All operators are left associative.
Step 1: Augmented Grammar
Let us add an augmented start symbol S′:
S' → R
R→R+R
R→RR
R → (R)
R→a
R→b
Step 2: Compute FIRST and FOLLOW sets
FIRST(R) = { (, a, b }
FOLLOW(R) = { ), $, +, (, a, b }
Step 3: Construct LR(0) items and DFA (Canonical Collection of LR(0) Items)
I0:
S' → .R
R → .R + R
R → .R R
R → .(R)
R → .a
R → .b
Transitions:
on a → I3
on b → I4
on ( → I5
on R → I1
(Similarly define I1 through I10 as per the DFA transitions)
Step 4: Build the SLR Parsing Table
We consider ACTION and GOTO tables based on the canonical LR(0) collection and
FOLLOW sets. For conflict resolution:
'+' has lowest precedence, is left associative.
Concatenation (RR) has higher precedence than '+', also left associative.
() has highest precedence.
Conflicts are resolved using precedence and associativity rules.
The final parsing table will reflect this precedence hierarchy and associativity to parse
expressions like:
a + b a → (a + (b a)), since concatenation binds tighter than +
Q4. Type Checking, Type Expression, and Type Conversion
Type Checking
Verifies that operations are semantically valid by ensuring operand types match.
Example:
int a = 5;
float b = 3.14;
a + b; // valid (type coercion may happen)
a + "hello"; // error (type mismatch)
Done at compile time (static) or run time (dynamic).
Type Expression:
A compact representation of types using constructors like arrays, records, pointers, etc.
Examples:
int[] → array(int)
pointer to int → ptr(int)
function taking float, returning int → float → int
Type Conversion:
Implicit Conversion (Coercion): Automatic type change.
int a = 5;
float b = 3.0;
float c = a + b; // a is coerced to float
Explicit Conversion (Casting): Manual conversion.
float f = 5.6;
int i = (int)f; // i becomes 5
Q5. Code Optimization Techniques
a) Copy Propagation:
Replaces occurrences of variables with known values.
Example:
a = b;
c = a + d; // becomes c = b + d
b) Dead Code Elimination:
Removes code that doesn’t affect program results.
Example:
a = 5;
a = 6; // 'a = 5' is dead
c) Code Motion:
Moves code outside loops if it does not change within loop.
Example:
for(int i=0;i<10;i++) {
x = y + z; // move out if y and z unchanged
}
d) Loop Invariant Code Motion:
A special case of code motion where expressions invariant within loop are moved out.
Example:
for(int i=0;i<n;i++) {
t = a * b;
arr[i] = t + i;
}
// move 't = a * b;' outside loop
Q6. What is an Activation Record? Draw a diagram of general activation record and
explain the fields.
Activation Record (AR):
It is a runtime data structure used to manage function calls. It stores all necessary
information for function execution.
General Activation Record Structure:
+---------------------+
| Actual Parameters |
+---------------------+
| Return Address |
+---------------------+
| Control Link (static/dynamic link) |
+---------------------+
| Access Link |
+---------------------+
| Saved Registers |
+---------------------+
| Local Variables |
+---------------------+
| Temporary Values |
+---------------------+
Fields Explained:
Actual Parameters: Values passed to the function.
Return Address: Location to return after function completes.
Control Link: Pointer to caller's activation record (dynamic link).
Access Link: Pointer for non-local variable access (static link).
Saved Registers: Caller-saved register values.
Local Variables: Variables declared in the function.
Temporary Values: Intermediate results during expression evaluation.
Activation records are maintained in the runtime stack, and they grow/shrink as function
calls and returns happen.
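A minimal C sketch of the record as a struct; the field names mirror the diagram, but actual sizes and layout are dictated by the target ABI, so everything here is illustrative:
#include <stdio.h>

struct activation_record {
    int   actual_params[4];     /* arguments passed by the caller */
    void *return_address;       /* where to resume in the caller */
    struct activation_record *control_link;  /* caller's AR (dynamic link) */
    struct activation_record *access_link;   /* enclosing scope (static link) */
    long  saved_registers[8];   /* registers preserved across the call */
    int   locals[8];            /* the callee's local variables */
    int   temporaries[4];       /* intermediate expression results */
};

int main(void) {
    printf("AR size: %zu bytes\n", sizeof(struct activation_record));
    return 0;
}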
Q3: Consider the following grammar
D → Type Tlist ;
Type → int | float
Tlist → Tlist, id | id
a) Find the SLR parser for the above grammar
To construct the SLR (Simple LR) parser, we first need to compute the LR(0) items and
then build the SLR parsing table.
Find the Canonical Collection of LR(0) Items
Let's start by constructing the LR(0) items for each step:
I0 (initial item set, closure of D' → • D):
D' → • D
D → • Type Tlist ;
Type → • int
Type → • float
I1 = GOTO(I0, D):
D' → D •
I2 = GOTO(I0, Type):
D → Type • Tlist ;
Tlist → • Tlist , id
Tlist → • id
I3 = GOTO(I0, int):
Type → int •
I4 = GOTO(I0, float):
Type → float •
I5 = GOTO(I2, Tlist):
D → Type Tlist • ;
Tlist → Tlist • , id
I6 = GOTO(I2, id):
Tlist → id •
I7 = GOTO(I5, ;):
D → Type Tlist ; •
I8 = GOTO(I5, ,):
Tlist → Tlist , • id
I9 = GOTO(I8, id):
Tlist → Tlist , id •
Construct the SLR Parsing Table
Using the canonical collection of items, we can build the SLR parsing table.
Action Table:
The action table tells whether to shift or reduce or whether to accept based on
the terminal symbol and current state. It is populated by checking the item sets.
Goto Table:
The GOTO table indicates which state to move to when a non-terminal is
encountered.
After constructing these tables, we check for conflicts. For this grammar the item sets produce no shift-reduce or reduce-reduce conflicts, so the grammar is SLR(1). Reductions are entered in the ACTION table under the symbols in FOLLOW of the left-hand side:
FOLLOW(D) = { $ }
FOLLOW(Type) = { id }
FOLLOW(Tlist) = { ',' , ';' }
b) Show the parsing of the string "int id, id, id;" using the parsing table constructed
above.
Step-by-step parsing of "int id , id , id ;":
1. Shift int; the lookahead is id ∈ FOLLOW(Type), so reduce by Type → int and take GOTO on Type.
2. Shift id; the lookahead is , ∈ FOLLOW(Tlist), so reduce by Tlist → id and take GOTO on Tlist.
3. Shift , and shift id; the lookahead is , again, so reduce by Tlist → Tlist , id.
4. Shift , and shift id; the lookahead is ;, so reduce by Tlist → Tlist , id.
5. Shift ;; the lookahead is $ ∈ FOLLOW(D), so reduce by D → Type Tlist ;.
6. The stack now holds the start symbol D with $ as input: accept.
Q4: List the commonly used intermediate representations. Write the following
expression in all types of intermediate representations you know:
(a-b) * (c + d) - (a + b)
Commonly Used Intermediate Representations (IR)
Abstract Syntax Tree (AST):
A tree structure that captures the hierarchical structure of the source code. It
abstracts away the syntactical details and represents the program's logical
structure.
Three-Address Code (TAC):
A form of intermediate code in which each instruction has at most three
addresses. These addresses can represent variables or temporary values.
Example for (a - b) * (c + d) - (a + b) in TAC:
t1 = a - b
t2 = c + d
t3 = t1 * t2
t4 = a + b
result = t3 - t4
Quadruples:
A type of intermediate representation where each instruction is represented as a
tuple with four fields: operator, operand1, operand2, and result.
Example for (a - b) * (c + d) - (a + b) in quadruples:
(−, a, b, t1)
(+, c, d, t2)
(*, t1, t2, t3)
(+, a, b, t4)
(−, t3, t4, result)
Triples:
Similar to quadruples, but they use references to results instead of explicitly
naming the result variables.
Example for (a - b) * (c + d) - (a + b) in triples, where operands refer to the index of an earlier triple:
(0) (−, a, b)
(1) (+, c, d)
(2) (*, (0), (1))
(3) (+, a, b)
(4) (−, (2), (3))
Static Single Assignment (SSA):
A form of IR where each variable is assigned exactly once, making it easier for
optimization algorithms to analyze the program.
Example in SSA for (a - b) * (c + d) - (a + b):
t1 = a - b
t2 = c + d
t3 = t1 * t2
t4 = a + b
result = t3 - t4
Q5
i) Explain the simple code generator with a suitable example.
A simple code generator translates intermediate representations such as three-address code
(TAC) into machine or assembly code. It works by generating instructions that the target
architecture can understand.
For example, consider the expression (a - b) * (c + d) - (a + b):
Intermediate Code (TAC):
t1 = a - b
t2 = c + d
t3 = t1 * t2
t4 = a + b
result = t3 - t4
Assembly Code:
SUB t1, a, b ; t1 = a - b
ADD t2, c, d ; t2 = c + d
MUL t3, t1, t2 ; t3 = t1 * t2
ADD t4, a, b ; t4 = a + b
SUB result, t3, t4 ; result = t3 - t4
The simple code generator would translate each intermediate operation into the
corresponding assembly instruction.
ii) Write detailed notes on basic blocks and flow graphs.
Basic Block: A basic block is a sequence of consecutive statements with no branches
(except at the end). Control enters the block at the beginning and exits at the end.
There is no ambiguity about the order of execution of statements within a basic block.
It forms the building blocks of control flow analysis.
Example:
a = b + c
d = e * f
Flow Graph (Control Flow Graph): A control flow graph (CFG) represents the flow
of control in a program. Each basic block is represented as a node, and edges between
nodes represent control flow. It is helpful for analyzing how a program behaves during
execution and is crucial for optimization.
Example:
Start → Basic Block 1 → Basic Block 2 → End
              ↘ Basic Block 3 ↗
The flow graph helps visualize loops, branches, and paths of execution.
Q6: Obtain the translation scheme for obtaining the three-address code for the
following grammar
S → id := E
E → E1 + E2 | E1 * E2 | -E1 | (E1) | id
Translation Scheme:
For the given grammar, the standard scheme gives each E a place attribute (the name that holds its value), uses newtemp() to create fresh temporaries, and emits instructions with gen():
S → id := E { gen(id.place ':=' E.place) }
E → E1 + E2 { E.place = newtemp(); gen(E.place ':=' E1.place '+' E2.place) }
E → E1 * E2 { E.place = newtemp(); gen(E.place ':=' E1.place '*' E2.place) }
E → -E1 { E.place = newtemp(); gen(E.place ':=' 'uminus' E1.place) }
E → (E1) { E.place = E1.place }
E → id { E.place = id.place }
Example for the input x := a + b:
E1.place = a, E2.place = b
The + rule creates t1 = newtemp() and emits: t1 := a + b
The S rule then emits: x := t1
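A minimal C sketch of the two helpers the scheme relies on, newtemp() and gen(), with an illustrative hand-driven run for id := a + b * c (the helper signatures are assumptions of this example):
#include <stdio.h>

static int temp_count = 0;

/* Return a fresh temporary name; the static buffer is reused, so callers copy it. */
static const char *newtemp(void) {
    static char name[16];
    snprintf(name, sizeof name, "t%d", ++temp_count);
    return name;
}

/* Emit one three-address instruction. */
static void gen(const char *dst, const char *a, char op, const char *b) {
    printf("%s = %s %c %s\n", dst, a, op, b);
}

int main(void) {
    /* id := a + b * c, evaluated bottom-up: the * node first, then + */
    char t1[16], t2[16];
    snprintf(t1, sizeof t1, "%s", newtemp());
    gen(t1, "b", '*', "c");          /* t1 = b * c */
    snprintf(t2, sizeof t2, "%s", newtemp());
    gen(t2, "a", '+', t1);           /* t2 = a + t1 */
    printf("id = %s\n", t2);         /* id := t2 */
    return 0;
}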