Compiler Design Notes

Compiler design concerns the translation of a program written in a high-level programming
language into machine code or intermediate code. A compiler performs several stages of
translation, optimization, and error checking to generate an efficient executable.

1. Introduction to Compiler Design

 Definition: A compiler is a program that translates source code written in a high-level
language into machine code or an intermediate representation. The output is typically
an executable or bytecode.

 Purpose of Compiler:

o Translation: Convert high-level code into machine code.

o Optimization: Enhance performance of the program.

o Error Checking: Identify syntax and semantic errors.

2. Phases of a Compiler

A compiler is divided into several phases, each responsible for a specific task.

2.1. Lexical Analysis (Scanner)

 Function: Converts the raw source code (sequence of characters) into a sequence of
tokens (meaningful chunks).

 Tokens: Basic units of syntax (keywords, identifiers, literals, operators).

 Components:

o Lexer: The program that performs lexical analysis.

o Regular Expressions: Used to define token patterns.

o Finite Automata: Used to implement lexers; token patterns are typically converted to an
NFA and then to a DFA.

 Output: Token stream.
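
The idea of defining tokens with regular expressions can be sketched as follows. This is a minimal illustration, not a production lexer: the token names, the keyword list, and the tokenize function are assumptions made for a toy language and do not come from the notes.

```python
import re

# Hypothetical token patterns for a toy language. Order matters: earlier
# alternatives win, so KEYWORD must come before IDENT.
TOKEN_SPEC = [
    ("NUMBER",  r"\d+"),
    ("KEYWORD", r"\b(?:if|else|while|return)\b"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("OP",      r"==|[+\-*/=<>]"),
    ("LPAREN",  r"\("),
    ("RPAREN",  r"\)"),
    ("SKIP",    r"\s+"),   # whitespace, discarded
    ("ERROR",   r"."),     # any other character is a lexical error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Turn a string of characters into a list of (kind, text) tokens."""
    tokens = []
    for m in MASTER.finditer(source):
        kind = m.lastgroup
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(
                f"unexpected character {m.group()!r} at offset {m.start()}"
            )
        tokens.append((kind, m.group()))
    return tokens
```

For example, tokenize("if x == 42") yields the token stream [("KEYWORD", "if"), ("IDENT", "x"), ("OP", "=="), ("NUMBER", "42")]. Python's re module uses a backtracking matcher rather than an explicit DFA, so this sketch shows the token-pattern idea only, not the automaton construction.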

2.2. Syntax Analysis (Parser)

 Function: Analyzes the token stream to ensure that the code adheres to the grammar of
the language. It constructs a parse tree (or syntax tree).

 Grammar Types: Context-free grammar (CFG) is most commonly used.

 Components:

o Parser: A program that performs syntax analysis.

o Parse Tree: A tree structure that represents the syntactic structure of the code.

o Context-Free Grammar: Defines the syntax rules for a language (productions).

 Parsing Techniques:

o Top-Down Parsing: Recursive descent parser, LL parser.

o Bottom-Up Parsing: Shift-reduce, LR parser.
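
A recursive descent parser, the simplest top-down technique, can be sketched with one method per grammar rule. The grammar below, the tuple-based tree format, and the names Parser and parse are assumptions chosen for illustration; the notes do not specify a grammar.

```python
import re

# Hypothetical grammar (an assumption for illustration):
#   expr   -> term (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUMBER | '(' expr ')'


class Parser:
    def __init__(self, text):
        # Simplified tokenizer: characters outside the grammar are skipped.
        self.tokens = re.findall(r"\d+|[-+*/()]", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        tok = self.advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok.isdigit():
            return ("num", int(tok))
        if tok == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(text):
    """Parse an arithmetic expression into a nested-tuple syntax tree."""
    parser = Parser(text)
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

Here parse("1 + 2 * 3") returns ("+", ("num", 1), ("*", ("num", 2), ("num", 3))), which shows operator precedence being encoded by the nesting of grammar rules. A bottom-up (LR) parser would accept the same language but is usually generated by a tool rather than written by hand.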

2.3. Semantic Analysis

 Function: Ensures that the program has a meaningful structure by checking for semantic
errors (e.g., type mismatches, undeclared variables).

 Tasks:

o Symbol Table Construction: Stores information about variables, functions, and objects.

o Type Checking: Verifies that operations are applied to compatible types.

o Scope Checking: Ensures that variables are declared before use and are in the
correct scope.
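
The two checks named above, undeclared variables and type mismatches, can be illustrated on a tiny expression tree. The node shapes ("int", "str", "var", "add"), the type names, and the check function are hypothetical and exist only for this sketch; the symbol table is reduced to a plain dictionary from names to types.

```python
def check(node, symbols):
    """Return the type of an expression node, or raise on a semantic error.

    `symbols` maps declared variable names to their types (an assumption:
    a real compiler would use a scoped symbol table instead of one dict).
    """
    kind = node[0]
    if kind == "int":
        return "int"
    if kind == "str":
        return "string"
    if kind == "var":
        name = node[1]
        if name not in symbols:
            raise NameError(f"undeclared variable {name!r}")
        return symbols[name]
    if kind == "add":
        left = check(node[1], symbols)
        right = check(node[2], symbols)
        if left != right:
            raise TypeError(f"cannot add {left} and {right}")
        return left
    raise ValueError(f"unknown node kind {kind!r}")
```

With symbols = {"x": "int"}, checking ("add", ("var", "x"), ("int", 1)) returns "int", while ("add", ("var", "x"), ("str", "a")) raises a TypeError and ("var", "y") raises a NameError. Both failing inputs are syntactically valid, which is why these checks belong to a phase after parsing.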

2.4. Intermediate Code Generation

 Function: Translates the source code into an intermediate form, which is easier to
manipulate than machine code and more abstract than source code.

 Intermediate Code (IC):

o Three-Address Code (TAC): Each instruction has at most three addresses, typically two
operands and one result (e.g., x = y + z).

o Abstract Syntax Tree (AST): Intermediate representation of code that retains the
structure.

o Benefits: Easier optimization, portability across target machines.
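
As a rough sketch of how a syntax tree might be lowered to three-address code, the function below walks a nested-tuple tree and emits one instruction per operator, storing each intermediate result in a fresh temporary. The tree format, the temporary names t1, t2, ..., and the to_tac function are assumptions for illustration.

```python
import itertools


def to_tac(tree):
    """Lower a tree like ("+", "a", ("*", "b", "c")) to three-address code.

    Leaves are variable names (str) or integer constants. Returns the list
    of instructions and the name holding the final result.
    """
    code = []
    temps = itertools.count(1)

    def emit(node):
        if isinstance(node, (int, str)):
            return str(node)              # a leaf is already an address
        op, left, right = node
        a = emit(left)
        b = emit(right)
        t = f"t{next(temps)}"             # fresh temporary for this result
        code.append(f"{t} = {a} {op} {b}")
        return t

    result = emit(tree)
    return code, result
```

For the tree ("+", "a", ("*", "b", "c")), to_tac returns (["t1 = b * c", "t2 = a + t1"], "t2"). Each instruction names at most three addresses, which is what makes the form convenient for later optimization passes.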

2.5. Code Optimization

 Function: Improves the intermediate code to make the final output more efficient in
terms of execution time, memory usage, etc.

 Types of Optimization:

o Loop Optimization: Unrolling loops, reducing redundant calculations.

o Constant Folding: Precomputing constant expressions.

o Dead Code Elimination: Removing code that never executes or whose results are never used.

o Inlining: Replacing function calls with the function’s body.

 Machine-Independent Optimizations: Performed on the intermediate code.

 Machine-Dependent Optimizations: Performed on target-specific code, using knowledge of
the machine's registers and instruction set.
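
Constant folding, one of the machine-independent optimizations listed above, can be sketched as a bottom-up rewrite of an expression tree. The nested-tuple tree format and the fold function are assumptions for illustration; only integer addition, subtraction, and multiplication are folded here.

```python
import operator

# Operators this sketch knows how to evaluate at compile time.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node):
    """Replace operator nodes whose operands are both constants by their value."""
    if not isinstance(node, tuple):
        return node                       # leaf: constant or variable name
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int) and op in OPS:
        return OPS[op](left, right)       # precompute the constant expression
    return (op, left, right)
```

For example, fold(("+", "x", ("*", 2, 3))) returns ("+", "x", 6): the constant subexpression is computed once by the compiler, while the part that depends on the variable x is left for runtime.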

2.6. Code Generation

 Function: Converts the optimized intermediate code into target machine code (or
bytecode for virtual machines).

 Tasks:

o Instruction Selection: Mapping intermediate operations to machine-level instructions.

o Register Allocation: Assigning variables to machine registers.

o Code Emission: Generating the final machine code (or bytecode).
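
To make code emission concrete, the sketch below targets a hypothetical stack-based virtual machine rather than a real processor, so it covers instruction selection and emission but sidesteps register allocation entirely. The instruction names (PUSH, LOAD, ADD, SUB, MUL), the gen function, and the small run interpreter are all assumptions for illustration.

```python
# Map tree operators to instructions of the hypothetical stack machine.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def gen(node, out):
    """Emit stack-machine instructions for an expression tree (post-order)."""
    if isinstance(node, int):
        out.append(("PUSH", node))        # constant operand
    elif isinstance(node, str):
        out.append(("LOAD", node))        # variable operand
    else:
        op, left, right = node
        gen(left, out)
        gen(right, out)
        out.append((OPCODES[op],))        # instruction selection
    return out


def run(program, env):
    """Tiny interpreter for the emitted code, used only to check the output."""
    stack = []
    for instr in program:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        elif instr[0] == "LOAD":
            stack.append(env[instr[1]])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append({"ADD": a + b, "SUB": a - b, "MUL": a * b}[instr[0]])
    return stack.pop()
```

For the tree ("+", "x", ("*", 2, 3)), gen emits LOAD x, PUSH 2, PUSH 3, MUL, ADD, and running that program with x = 4 gives 10. A generator for a register machine would additionally have to decide which values live in registers, which is the register allocation task listed above.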

2.7. Code Linking and Assembly

 Linking: Combines object files into a single executable, resolving references between
modules.

 Assembly: An assembler translates the low-level assembly code emitted by the compiler
into binary machine code (object files), which the linker then combines.

3. Components of a Compiler

 Lexical Analyzer (Lexer): Converts source code into tokens.

 Syntax Analyzer (Parser): Validates the structure and builds a parse tree.

 Semantic Analyzer: Performs type and scope checks.

 Intermediate Code Generator: Converts syntax tree into intermediate code.

 Optimizer: Enhances intermediate code for better performance.

 Code Generator: Converts optimized intermediate code into machine code.

 Error Handler: Detects and reports errors during various phases of compilation.

4. Symbol Table

 Definition: A data structure used by the compiler to store information about variables,
functions, objects, types, scopes, and more.

 Attributes in Symbol Table:

o Name: Identifier name (variable, function, etc.).

o Type: Data type of the variable or function.

o Scope: The region of the program where the symbol is valid.

o Address/Location: Memory location of the symbol (in case of variables).

 Operations:

o Insert: Add symbols to the table.

o Lookup: Retrieve symbol information during analysis.

o Delete: Remove symbols that go out of scope.
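
One common way to support these operations is a stack of dictionaries, one per nested scope. The sketch below follows that approach; the class name, method names, and the stored fields are assumptions for illustration, and deletion is modeled by discarding a whole scope on exit rather than removing symbols one by one.

```python
class SymbolTable:
    """Scoped symbol table implemented as a stack of dictionaries."""

    def __init__(self):
        self.scopes = [{}]                # index 0 is the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        # "Delete": every symbol declared in the innermost scope goes away.
        if len(self.scopes) == 1:
            raise RuntimeError("cannot exit the global scope")
        self.scopes.pop()

    def insert(self, name, type_, address=None):
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"{name!r} already declared in this scope")
        scope[name] = {
            "type": type_,
            "address": address,
            "depth": len(self.scopes) - 1,
        }

    def lookup(self, name):
        # Search from the innermost scope outward so inner names shadow outer ones.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None
```

A short usage example: after insert("x", "int") in the global scope, enter_scope() and insert("x", "float"), a lookup of "x" reports type "float"; after exit_scope() the same lookup reports "int" again, and a name that was never declared returns None.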

5. Types of Errors

 Lexical Errors: Invalid tokens or malformed strings.

 Syntax Errors: Incorrect grammar or structure.

 Semantic Errors: Mismatched types or undefined symbols.

 Runtime Errors: Errors that occur during execution (e.g., division by zero).

 Logical Errors: Incorrect program logic.

6. Examples of Compiler Tools

 Lex (Lexical Analyzer Generator): Generates lexical analyzers from regular expressions.

 Yacc/Bison (Parser Generator): Generates parsers from context-free grammar specifications.

 LLVM: A modular compiler framework that allows for optimization and code generation.

 GCC (GNU Compiler Collection): A widely used open-source collection of compilers for C,
C++, Fortran, and other languages.

 Java Compiler (javac): Compiles Java code into bytecode.

7. Advanced Topics in Compiler Design
 Just-In-Time (JIT) Compilation: A technique used by runtime environments (e.g., the JVM,
the .NET CLR) to compile bytecode to machine code at runtime rather than ahead of time.

 Garbage Collection: Automatic memory management and reclamation of unused memory.

 Multi-pass Compilation: The compiler may use multiple passes over the code to
generate the final output (e.g., first pass for lexical and syntax analysis, second pass for
code generation).

 Compiler Construction Tools:

o ANTLR: A powerful parser generator for reading, processing, and executing structured text.

o Flex: A tool for generating lexical analyzers.
