
CHANDIGARH COLLEGE OF ENGINEERING AND TECHNOLOGY

(DEGREE WING), SECTOR 26, CHANDIGARH

COMPUTER SCIENCE AND ENGINEERING

ASSIGNMENT - 1

TOPIC: Lexical Analysis, Intermediate Code Generation

COMPILER DESIGN

SUBJECT CODE: CS-604

SUBMITTED BY:
JANVI SHARMA (MCO22387)
KIRTI LAXMI (MCO22388)
CSE 6th sem, 3rd year

SUBMITTED TO:
DR. GULSHAN GOYAL
Asst. Professor
Dept. of Computer Science


COMPILER DESIGN (CS-604)

TOPIC: Lexical Analysis, Intermediate Code Generation

Lexical Analysis:

The role of the lexical analyzer, Tokens, Patterns, Lexemes, Input buffering,
Specifications of a token, Recognition of tokens, Design of a lexical analyzer
generator.

Intermediate code generation:

Intermediate languages, Graphical representation, Three-address code,
Implementation of three-address statements (Quadruples, Triples, Indirect triples)

BLOOM’S TAXONOMY

COURSE OUTCOMES (COs)

CO1: Understand the functioning of the different phases of a compiler.

CO2: Understand the implementation details and concepts behind each phase of the compiler, with emphasis on syntax analysis and the different parsing techniques.

CO3: Understand the need for intermediate code generation, code optimization, and actual machine code generation techniques.
INDEX
QUESTION NO. | QUESTION TITLE | BLOOM'S TAXONOMY LEVEL | CO MAPPING
1 | Explain the primary role of a lexical analyzer in the process of compilation, and how does it interact with other phases of a compiler? | UNDERSTAND | CO1
2 | Explain the difference between a token, a lexeme, and a pattern in the context of lexical analysis. Provide an example for each. | UNDERSTAND | CO1
3 | Using the syntax-directed definition from the grammar given in the question bank below, construct an annotated parse tree for the expression (2 + 3) * (4 + 1) n and compute the final value of the expression. | APPLY | CO2
4 | Describe the purpose and structure of a finite automaton in the design of a lexical analyzer. How does a deterministic finite automaton (DFA) differ from a nondeterministic finite automaton (NFA)? | UNDERSTAND | CO1
5 | Describe the significance of regular expressions in lexical analysis. How are they used to define the lexical structure of a programming language? | UNDERSTAND | CO1
6 | Differentiate between a linker and a loader. | ANALYZE | CO1
7 | Assess the common forms of intermediate representations used in compilers. Provide examples for each. | EVALUATE | CO3
8 | What are the main components of a three-address code (TAC) instruction, and how are they used in intermediate code? | EVALUATE | CO3
9 | How does a compiler handle type checking during intermediate code generation? | EVALUATE | CO3
10 | Mention the advantages of using virtual registers in intermediate code generation. | UNDERSTAND | CO3
QUESTION BANK
1. Explain the primary role of a lexical analyzer in the process of compilation, and how does it interact
with other phases of a compiler?

2. Explain the difference between a token, a lexeme, and a pattern in the context of lexical analysis.
Provide an example for each.

3. Using the syntax-directed definition from the grammar below, construct an annotated parse tree for
the expression:
(2 + 3) * (4 + 1) n
and compute the final value of the expression.
Grammar and Semantic Rules:
1) L -> E n L.val = E.val
2) E -> E1 + T E.val = E1.val + T.val
3) E -> T E.val = T.val
4) T -> T1 * F T.val = T1.val * F.val
5) T -> F T.val = F.val
6) F -> ( E ) F.val = E.val
7) F -> digit F.val = digit.lexval

4. Describe the purpose and structure of a finite automaton in the design of a lexical analyzer. How does
a deterministic finite automaton (DFA) differ from a nondeterministic finite automaton (NFA)?

5. Describe the significance of regular expressions in lexical analysis. How are they used to define the lexical structure of a programming language?

6. Differentiate between a linker and a loader.

7. Assess the common forms of intermediate representations used in compilers. Provide examples for each.

8. What are the main components of a three-address code (TAC) instruction, and how are they used in
intermediate code?

9. How does a compiler handle type checking during intermediate code generation?

10. Mention the advantages of using virtual registers in intermediate code generation.
ASSIGNMENT-1: QUESTION BANK
(BY: JANVI SHARMA, MCO22387 & KIRTI LAXMI, MCO22388)
TOPICS: Lexical Analysis, Intermediate Code Generation

Q1: Explain the primary role of a lexical analyzer in the process of compilation, and how
does it interact with other phases of a compiler?
Answer: The primary role of a lexical analyzer is to read the source code as a stream of
characters and group them into meaningful sequences called tokens, which are passed to the
syntax analyzer (parser). It performs the following functions:
1. Tokenization: Converts the raw source code into tokens such as keywords, identifiers, operators, literals, and punctuation marks.
2. Error Detection: Identifies and reports lexical errors, such as invalid or unexpected characters.
3. Elimination of Noise: Removes whitespace and comments from the source code, as they are irrelevant to the later compilation stages.
Interaction with other phases:
A. Input: The lexical analyzer receives the raw source code from the editor or preprocessor.
B. Output: It passes tokens to the syntax analyzer, which uses them to construct the
syntactic structure (e.g., parse trees).
C. Feedback Loop: If the syntax analyzer encounters errors, it may request more tokens or
notify the lexical analyzer of the issue.
Q2: Explain the difference between a token, a lexeme, and a pattern in the context of lexical
analysis. Provide an example for each.
Answer:
1. Token: A token is a category or type of lexeme recognized by the lexical analyzer.
Example: int is a token representing a keyword.
2. Lexeme: A lexeme is the actual sequence of characters in the source code that matches a token's pattern.
Example: In int a = 5;, int is the lexeme for the keyword token.
3. Pattern: A pattern is a rule or regular expression that specifies the structure of the lexemes for a token.
Example: The pattern for an identifier token could be [a-zA-Z_][a-zA-Z0-9_]*.
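
To see the three terms together, here is a minimal Python sketch (illustrative only, with our own variable names) in which the pattern classifies a lexeme into a token:

import re

# Pattern: the rule describing identifiers, written as a regular expression.
identifier_pattern = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")

lexeme = "count1"  # Lexeme: the actual character sequence found in the source code.
if identifier_pattern.fullmatch(lexeme):
    print(("ID", lexeme))  # Token: the category (ID) together with the matched lexeme.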
Q3: Using the syntax-directed definition from the grammar below, construct an annotated
parse tree for the expression:
(2 + 3) * (4 + 1) n
and compute the final value of the expression.
Grammar and Semantic Rules:
1) L -> E n L.val = E.val
2) E -> E1 + T E.val = E1.val + T.val
3) E -> T E.val = T.val
4) T -> T1 * F T.val = T1.val * F.val
5) T -> F T.val = F.val
6) F -> ( E ) F.val = E.val
7) F -> digit F.val = digit.lexval
Solution:
Token Breakdown:

● (2 + 3) is the first sub-expression


● (4 + 1) is the second sub-expression
● Combined via *, ending with n

Parse tree structure: (annotated parse tree figure)

Evaluation:

● digit.lexval → 2, 3, 4, 1

● Left sub-expression (2 + 3): E1.val = 2, T.val = 3 → E.val = 5, so F.val = 5

● Right sub-expression (4 + 1): E1.val = 4, T.val = 1 → E.val = 5, so F.val = 5

● Product (T -> T1 * F): T1.val = 5, F.val = 5 → T.val = 5 * 5 = 25

● L.val = E.val = 25

The value is computed bottom-up using synthesized attributes: (2 + 3) = 5, (4 + 1) = 5, and 5 * 5 = 25.

Final Answer: 25
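
As a cross-check, a small Python sketch (not required by the question) that mirrors the synthesized-attribute rules and evaluates the expression bottom-up:

# F.val = digit.lexval for each digit
f2, f3, f4, f1 = 2, 3, 4, 1
# Left sub-expression (2 + 3): E -> E1 + T, then F -> ( E )
e_left = f2 + f3        # E.val = 5
# Right sub-expression (4 + 1): E -> E1 + T, then F -> ( E )
e_right = f4 + f1       # E.val = 5
# T -> T1 * F
t = e_left * e_right    # T.val = 25
# L -> E n
l = t                   # L.val = 25
print(l)                # 25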
Q4: Describe the purpose and structure of a finite automaton in the design of a lexical
analyzer. How does a deterministic finite automaton (DFA) differ from a nondeterministic
finite automaton (NFA)?

Answer: A finite automaton is used to recognize patterns in the input source code efficiently. It
helps the lexical analyzer decide whether a sequence of characters forms a valid token.
Structure:
1. States: Represent different stages of pattern recognition
2. Transitions: Arrows between states triggered by input characters.
3. Start State: Where pattern matching begins.
4. Accept States: Indicate successful recognition of a token.

Differences:
DFA (Deterministic Finite Automaton):
1. At each step, there is at most one possible state to transition to for a given input.
2. Easier to implement and faster since no backtracking is required.
3. Requires more states in some cases.

NFA (Nondeterministic Finite Automaton):
1. Allows multiple possible transitions for a given input.
2. Simpler to construct but requires conversion to a DFA for efficient execution.
3. Uses backtracking or parallel processing to handle multiple paths.
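
A minimal Python sketch (our own simplified transition table, for illustration) of a DFA that recognizes identifiers; note that each (state, input class) pair has at most one next state, which is exactly what makes it deterministic:

# DFA for identifiers: state 0 = start, state 1 = accepting.
def char_class(c):
    if c.isalpha() or c == "_":
        return "letter"
    if c.isdigit():
        return "digit"
    return "other"

TRANSITIONS = {(0, "letter"): 1, (1, "letter"): 1, (1, "digit"): 1}

def is_identifier(text):
    state = 0
    for c in text:
        state = TRANSITIONS.get((state, char_class(c)))
        if state is None:       # no transition defined: reject immediately, no backtracking
            return False
    return state == 1           # accept only if we finish in the accepting state

print(is_identifier("count1"))  # True
print(is_identifier("1count"))  # False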

Q5: Describe the significance of regular expressions in lexical analysis. How are they used to define the lexical structure of a programming language?

Answer: Regular expressions provide a concise and precise way to describe patterns for tokens.
They are the basis for defining the rules of lexical analysis, allowing systematic recognition of
valid tokens.

Usage:
1. Defining Patterns: Regular expressions are written to describe tokens such as keywords, operators, identifiers, and numbers.
Example: [a-zA-Z_][a-zA-Z0-9_]* for identifiers.
2. Automaton Generation: Regular expressions are used to construct finite automata, which efficiently recognize tokens in the input.
3. Flexibility: They enable the lexical analyzer to adapt to different languages by altering the set of expressions.
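
As an illustration (the token set and names here are our own assumptions, not part of the syllabus), a tiny regex-driven tokenizer in Python showing how a handful of patterns defines the lexical structure:

import re

# Each token class is defined by a regular expression.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[a-zA-Z_][a-zA-Z0-9_]*"),
    ("OP",     r"[+\-*/=]"),
    ("SKIP",   r"\s+"),          # whitespace is noise and is discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(code):
    for match in MASTER.finditer(code):
        if match.lastgroup != "SKIP":
            yield (match.lastgroup, match.group())

print(list(tokenize("int a = 5")))
# [('ID', 'int'), ('ID', 'a'), ('OP', '='), ('NUMBER', '5')]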

Q6: Describe Linker and Loader. Differentiate between a linker and a loader.

Answer:

A linker merges all the object files generated by the compiler/assembler, together with other pieces of code, to produce an executable file (for example, one with a .exe extension on Windows).

A loader is a special program that takes the executable file produced by the linker, loads it into main memory, and prepares the code for execution by the computer (the loader allocates memory space to the program).

LINKER vs LOADER:

1. A linker's main function is to generate executable files; a loader's main function is to load executable files into main memory.
2. A linker takes as input the object code generated by the compiler/assembler; a loader takes as input the executable file generated by the linker.
3. Linking is the process of combining various pieces of object code and source code to obtain executable code; loading is the process of bringing executable code into main memory for execution.
4. Linkers are of two types: (i) linkage editor and (ii) dynamic linker; loaders are of four types: absolute, relocating, direct linking, and bootstrap.
5. The linker is used during the compilation process to link object files; the loader is used at run time to load the files into memory and prepare them for execution.

Q7: Assess the common forms of intermediate representations used in compilers. Provide examples for each.

Answer:
Common forms of intermediate representations include:

1. Abstract Syntax Trees (ASTs): Represent the hierarchical structure of the source code.
Example: For a + b * c, the AST has + as the root, a as its left child, and * as its right child, with b and c as the children of *.
2. Three-Address Code (TAC): A linear representation using instructions with at most three operands.
Example:
t1 = b * c
t2 = a + t1
3. Control Flow Graphs (CFGs): Represent the flow of control in a program as a graph, with nodes as basic blocks and edges as control flow.
4. Postfix Notation (Reverse Polish Notation): Eliminates the need for parentheses in expressions.
Example: a + b * c is represented as a b c * +.
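
To tie these representations together, a short Python sketch (node layout and helper names are our own) that builds an AST for a + b * c and walks it to emit both the postfix form and TAC with temporaries:

# AST node: (operator, left, right); leaves are plain variable names.
ast = ("+", "a", ("*", "b", "c"))

def to_postfix(node):
    if isinstance(node, str):
        return node
    op, left, right = node
    return f"{to_postfix(left)} {to_postfix(right)} {op}"

tac = []
def to_tac(node):
    if isinstance(node, str):
        return node
    op, left, right = node
    left_name = to_tac(left)
    right_name = to_tac(right)
    temp = f"t{len(tac) + 1}"            # fresh temporary for this sub-expression
    tac.append(f"{temp} = {left_name} {op} {right_name}")
    return temp

print(to_postfix(ast))   # a b c * +
to_tac(ast)
print("\n".join(tac))    # t1 = b * c, then t2 = a + t1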

Q8: What are the main components of a three-address code (TAC) instruction, and
how are they used in intermediate code?
Answer:
The main components of a TAC instruction are:

1. Operator: Specifies the operation (e.g., +, -, *, /, =, etc.).

2. Operands: Can be constants, variables, or temporary variables.

3. Result: A temporary variable or a location to store the result.

Example: For the expression a + b * c, the TAC instructions are:

1. t1 = b * c (multiplication is computed and stored in t1).

2. t2 = a + t1 (addition is computed and stored in t2).
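
To make these three components concrete, here is a hedged sketch (the field names are our own) that stores each TAC instruction as a quadruple of operator, two operands, and a result:

from collections import namedtuple

# One TAC instruction: operator, up to two operands, and the result location.
Quad = namedtuple("Quad", ["op", "arg1", "arg2", "result"])

# a + b * c as two quadruples:
code = [
    Quad("*", "b", "c", "t1"),   # t1 = b * c
    Quad("+", "a", "t1", "t2"),  # t2 = a + t1
]
for q in code:
    print(f"{q.result} = {q.arg1} {q.op} {q.arg2}")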

Q9: How does a compiler handle type checking during intermediate code
generation?
Answer: During intermediate code generation, the compiler performs type checking to ensure that operations are valid for the given operands. The process involves:
1. Checking Compatibility: Ensuring that the types of the operands are compatible with the operation.
Example: Adding an integer to a string is invalid.
2. Type Conversion: Applying implicit type conversions (type coercion) where necessary.
Example: Converting the integer to a float in 3 + 2.5.
3. Error Reporting: Generating errors if type mismatches cannot be resolved.
The symbol table and the semantic analysis phase provide the type information required for this checking.
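
A minimal Python sketch (the type names and rules are simplified assumptions) of the compatibility check and implicit int-to-float coercion described above:

def result_type_of_add(left_type, right_type):
    # Identical numeric types are compatible as they are.
    if left_type == right_type and left_type in ("int", "float"):
        return left_type
    # Implicit coercion: int + float is widened to float.
    if {left_type, right_type} == {"int", "float"}:
        return "float"
    # Otherwise the mismatch cannot be resolved, so report an error.
    raise TypeError(f"cannot add {left_type} and {right_type}")

print(result_type_of_add("int", "float"))   # float, as in 3 + 2.5
try:
    result_type_of_add("int", "string")
except TypeError as error:
    print(error)                            # cannot add int and string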

Q10: Mention the advantages of using virtual registers in intermediate code generation.
Answer: Virtual registers in intermediate code generation provide the following advantages:
1. Simplified Code: They eliminate the need to worry about the actual physical registers during the intermediate phase.
2. Optimizations: They allow easier application of register allocation and optimization techniques.
3. Abstraction: The IR remains machine-independent, since virtual registers can later be mapped to physical registers.
4. Scalability: Virtual registers allow an unlimited number of registers in the IR, accommodating complex expressions without immediate resource constraints.
Example in TAC:
t1 = a + b
t2 = t1 * c
Here, t1 and t2 are virtual registers.
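
A tiny sketch (illustrative only) of how a code generator can hand out an unlimited supply of virtual registers, leaving the mapping to physical registers to a later phase:

class TempPool:
    """Hands out virtual registers t1, t2, ... without any physical limit."""
    def __init__(self):
        self.count = 0

    def new_temp(self):
        self.count += 1
        return f"t{self.count}"

pool = TempPool()
t1 = pool.new_temp()
t2 = pool.new_temp()
print(f"{t1} = a + b")
print(f"{t2} = {t1} * c")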
