Module 6

Module 6 covers code generation in compilers, detailing the code generator's role in producing a target program from an intermediate representation. It discusses key tasks such as instruction selection, register allocation, and memory management, as well as the structure of activation records in runtime organization. The document also highlights the importance of next-use information for optimizing register allocation and the different types of runtime environments.
Code Generation
• The final phase of a compiler is the code generator.

• It receives an intermediate representation (IR) together with supplementary information from the symbol table.

• It produces a semantically equivalent target program.

• The main tasks of the code generator are:

• Instruction selection

• Register allocation and assignment

• Instruction ordering
Register and Address Descriptors

• A register descriptor keeps track of which variable is currently stored in each register.

• Initially, the register descriptors show that all registers are empty.

• An address descriptor keeps track of the location where the current value of a variable is stored. The location may be a register, a memory address or a stack location.
Code-generation algorithm
• The algorithm takes a sequence of three-address statements as input. For each three-address statement of the form x := y op z it performs the following actions:

1. Invoke a function getreg to determine the location L where the result of the computation y op z should be stored.

2. Consult the address descriptor for y to determine y', the current location of y. If the value of y is currently both in memory and in a register, prefer the register for y'. If the value of y is not already in L, generate the instruction MOV y', L to place a copy of y in L.

3. Generate the instruction OP z', L, where z' is the current location of z. If z is in both a register and memory, prefer the register. Update the address descriptor of x to indicate that x is in location L. If L is a register, update its descriptor to indicate that it contains the value of x, and remove x from all other descriptors.

4. If the current values of y or z have no next uses, are not live on exit from the block, and are in registers, alter the register descriptor to indicate that, after execution of x := y op z, those registers no longer contain y or z.
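The four steps can be sketched in Python. This is a minimal illustration and not a definitive implementation: the names `gen_block`, `needed_after`, `best_loc` and `release` are hypothetical, statements are assumed to be tuples `(x, y, op, z)`, only `+` and `-` are supported, and `getreg` uses a deliberately simple policy (reuse the register of y if y is dead, else take an empty register, else evict the first register).

```python
OPS = {"+": "ADD", "-": "SUB"}

def needed_after(block, live_on_exit):
    """For each statement, the set of names still needed once it has executed."""
    needed, result = set(live_on_exit), [None] * len(block)
    for i in range(len(block) - 1, -1, -1):
        x, y, _, z = block[i]
        result[i] = set(needed)
        needed.discard(x)
        needed.update((y, z))
    return result

def gen_block(block, live_on_exit, num_regs=2):
    reg_desc = {f"R{i}": None for i in range(num_regs)}  # register -> name held
    addr_desc = {}                                       # name -> set of locations
    code = []

    def locations(name):
        # A name not yet computed in this block lives only in its memory cell.
        return addr_desc.setdefault(name, {name})

    def best_loc(name):
        # Prefer a register over the memory location.
        return next((r for r in reg_desc if r in locations(name)), name)

    def release(reg):
        if reg_desc[reg] is not None:
            locations(reg_desc[reg]).discard(reg)
        reg_desc[reg] = None

    def getreg(y, z, needed):
        reg = best_loc(y)
        if reg in reg_desc and y not in needed:
            return reg                         # y is dead afterwards: reuse its register
        for reg, held in reg_desc.items():
            if held is None:
                return reg                     # an empty register is available
        reg = next(iter(reg_desc))             # otherwise evict the first register
        held = reg_desc[reg]
        if held in needed | {y, z} and held not in locations(held):
            code.append(f"MOV {reg}, {held}")  # save the value before eviction
            locations(held).add(held)
        release(reg)
        return reg

    for (x, y, op, z), needed in zip(block, needed_after(block, live_on_exit)):
        L = getreg(y, z, needed)                       # step 1
        if best_loc(y) != L:                           # step 2
            code.append(f"MOV {best_loc(y)}, {L}")
        code.append(f"{OPS[op]} {best_loc(z)}, {L}")   # step 3
        release(L)
        for reg in reg_desc:
            if reg_desc[reg] == x:
                release(reg)
        reg_desc[L], addr_desc[x] = x, {L}
        for reg in reg_desc:                           # step 4
            if reg_desc[reg] in (y, z) and reg_desc[reg] not in needed:
                release(reg)
    for name in sorted(live_on_exit):                  # store live results at block end
        if name not in locations(name):
            code.append(f"MOV {best_loc(name)}, {name}")
            locations(name).add(name)
    return code

block = [("t", "a", "-", "b"), ("u", "a", "-", "c"),
         ("v", "t", "+", "u"), ("d", "v", "+", "u")]
code = gen_block(block, live_on_exit={"d"})
```

With two registers and only d live on exit, the sketch emits MOV a, R0; SUB b, R0; MOV a, R1; SUB c, R1; ADD R1, R0; ADD R1, R0; MOV R0, d for the statement d := (a-b) + (a-c) + (a-c).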
Generating Code for Assignment
Statements
• The assignment statement d := (a-b) + (a-c) + (a-c) can be translated into the following sequence of three-address code:

t := a - b

u := a - c

v := t + u

d := v + u
Statement     Code generated    Register descriptor    Address descriptor
                                registers empty
t := a - b    MOV a, R0         R0 contains t          t in R0
              SUB b, R0
u := a - c    MOV a, R1         R0 contains t          t in R0
              SUB c, R1         R1 contains u          u in R1
v := t + u    ADD R1, R0        R0 contains v          u in R1
                                R1 contains u          v in R0
d := v + u    ADD R1, R0        R0 contains d          d in R0
              MOV R0, d                                d in R0 and memory
Generate code for the following three-address statements, assuming all variables are stored in memory locations.

1. x = 1
2. x = a
3. x = a + 1
4. x = a + b

Answer

1. LD R1, #1
   ST x, R1

2. LD R1, a
   ST x, R1

3. LD R1, a
   ADD R1, R1, #1
   ST x, R1

4. LD R1, a
   LD R2, b
   ADD R1, R1, R2
   ST x, R1
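The pattern behind these answers (load the operands, apply the operation, store the result) can be sketched as a naive per-statement translator. The function name `naive_codegen` and the string format of the statements are assumptions made for illustration; the sketch always uses R1 and R2 and makes no attempt to reuse registers across statements.

```python
def naive_codegen(stmt):
    """Translate 'x = 1', 'x = a' or 'x = a + b' into LD/op/ST instructions."""
    target, expr = [part.strip() for part in stmt.split("=")]
    tokens = expr.split()

    def operand(tok):
        # Integer constants are written as literals (#c); names stay as they are.
        return f"#{tok}" if tok.isdigit() else tok

    code = [f"LD R1, {operand(tokens[0])}"]
    if len(tokens) == 3:
        op = {"+": "ADD", "-": "SUB", "*": "MUL"}[tokens[1]]
        if tokens[2].isdigit():
            code.append(f"{op} R1, R1, #{tokens[2]}")
        else:
            code.append(f"LD R2, {tokens[2]}")
            code.append(f"{op} R1, R1, R2")
    code.append(f"ST {target}, R1")
    return code

for statement in ("x = 1", "x = a", "x = a + 1", "x = a + b"):
    print(statement, "->", "; ".join(naive_codegen(statement)))
```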
Generating Code for Assignment Statements
• The assignment d = (a-b) + (a-c) + (a-c) might be translated into the following three-address code sequence:

t = a - b
u = a - c
v = t + u
d = v + u

• Code sequence for the example is:
• The two statements
x = b * c
y = a + x

Answer
LD R1, b
LD R2, c
MUL R1, R1, R2
LD R3, a
ADD R3, R3, R1
ST y, R3
• The three-statement sequence
x = a[i]
y = b[i]
z = x * y

Answer
LD R1, i
MUL R1, R1, #4
LD R2, a(R1)
LD R1, b(R1)
MUL R1, R2, R1
ST z, R1
Issues in the Design of Code Generation
• Input to the code generator
• Target program
• Memory management
• Instruction selection
• Register allocation
• Evaluation order
Input to the code generator
• The input to the code generator consists of the intermediate representation of the source program, produced by the front end, together with the information in the symbol table.

• There are several choices for the intermediate representation:

a) Postfix notation

b) Syntax tree

c) Three-address code

• We assume the front end produces a low-level intermediate representation, i.e. the values of the names in it can be directly manipulated by machine instructions.

• The code generation phase requires complete, error-free intermediate code as its input.
Target Program

• The target program is the output of the code generator. The output can be:

a) Assembly language: It makes the process of code generation easier.

b) Relocatable machine language: It allows subprograms to be compiled separately.

c) Absolute machine language: It can be placed in a fixed location in memory and executed immediately.


Memory Management

• During the code generation process, the symbol table entries have to be mapped to actual addresses.

• Mapping names in the source program to addresses of data objects is done cooperatively by the front end and the code generator.

• Local variables are stack-allocated in the activation record, while global variables are placed in the static area.
Instruction Selection

• The instruction set of the target machine should be complete and uniform.

• When the efficiency of the target program matters, instruction speeds and machine idioms are important factors.

• The quality of the generated code is determined by its speed and size.
Register Allocation

• Registers can be accessed faster than memory. Instructions whose operands are in registers are shorter and faster than those whose operands are in memory.

• The use of registers is divided into two subproblems:

1. Register allocation: we select the set of variables that will reside in registers.

2. Register assignment: we pick the specific register in which each variable will reside.
Evaluation order

• The efficiency of the target code can be affected by the order in which the computations are performed.

• Some computation orders need fewer registers to hold intermediate results than others.
Target Machine
• The target computer is a byte-addressable machine with 4 bytes to a word.

• The target machine has n general-purpose registers, R0, R1, ..., Rn-1. It has two-address instructions of the form: op source, destination

where op is the op-code and source and destination are data fields.

• It has the following op-codes:

ADD (add source to destination)

SUB (subtract source from destination)

MOV (move source to destination)

• The source and destination of an instruction are specified by combining registers and memory locations with address modes.


MODE                 FORM     ADDRESS                      EXAMPLE

absolute             M        M                            ADD temp, R1
register             R        R                            ADD R0, R1
indexed              c(R)     c + contents(R)              ADD 100(R2), R1
indirect register    *R       contents(R)                  ADD *R2, R1
indirect indexed     *c(R)    contents(c + contents(R))    ADD *100(R2), R1
literal              #c       c                            ADD #3, R1
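The ADDRESS column can be made concrete with a small operand resolver. This is only a sketch: the function `resolve`, the dictionaries `regs` and `mem`, and the use of numeric absolute addresses (instead of symbolic names such as temp) are assumptions made for illustration.

```python
import re

def resolve(operand, regs, mem):
    """Return (kind, value): ('reg', name), ('mem', address) or ('const', c)."""
    if operand.startswith("#"):                       # literal: #c
        return ("const", int(operand[1:]))
    if operand.startswith("*"):                       # indirect: *R or *c(R)
        kind, value = resolve(operand[1:], regs, mem)
        # The contents of the inner location is the address actually used.
        contents = regs[value] if kind == "reg" else mem[value]
        return ("mem", contents)
    match = re.fullmatch(r"(\d+)\((R\d+)\)", operand)
    if match:                                         # indexed: c(R)
        return ("mem", int(match.group(1)) + regs[match.group(2)])
    if operand in regs:                               # register: R
        return ("reg", operand)
    return ("mem", int(operand))                      # absolute: M (numeric here)

regs = {"R1": 0, "R2": 8}
mem = {108: 500}
for text in ("200", "R1", "100(R2)", "*R2", "*100(R2)", "#3"):
    print(text, "->", resolve(text, regs, mem))
```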
Next-Use Information
• In compiler design, next-use information is the result of a data-flow analysis that can be used to optimize the allocation of CPU registers.

• The goal of next-use analysis is to determine which variables in a program are needed in the immediate future and should therefore be kept in a register for faster access, rather than in main memory.

• Example:
x = y + z;
a = x + b;
c = x + d;
Here x is used again in both of the following statements, so it is worth keeping in a register.

• To perform next-use analysis, the compiler examines each instruction in the program and determines the next point at which each variable is used. If a variable is not used again until much later in the program, it may not be worth keeping in a register and can be stored in main memory instead. On the other hand, if a variable is used several times in quick succession, it may be more efficient to keep it in a register and avoid the overhead of repeatedly loading it from and storing it to main memory.

• Next-use analysis can be combined with other techniques, such as register allocation and live-range analysis, to further improve the performance of a compiled program.
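Within a single basic block, next-use information is usually computed with one backward scan. The sketch below assumes statements are tuples `(result, arg1, arg2)` with 0-based indices; the function name `next_use_info` is hypothetical, and names that are live on exit are given the sentinel index `len(block)`, meaning "used after the block".

```python
def next_use_info(block, live_on_exit=()):
    """Return, per statement, {name: index of its next use, or None if there is none}."""
    table = {name: len(block) for name in live_on_exit}
    info = [None] * len(block)
    for i in range(len(block) - 1, -1, -1):      # scan the block backwards
        x, y, z = block[i]
        # Record what is known about x, y and z just after statement i.
        info[i] = {name: table.get(name) for name in (x, y, z)}
        table.pop(x, None)                       # x is defined here: no earlier next use
        table[y] = i                             # y and z are used by statement i
        table[z] = i
    return info

block = [("x", "y", "z"),    # x = y + z
         ("a", "x", "b"),    # a = x + b
         ("c", "x", "d")]    # c = x + d
info = next_use_info(block, live_on_exit=("a", "c"))
for i, entry in enumerate(info):
    print(i, entry)
```

For the example above, x has its next use at statement 1 after statement 0, at statement 2 after statement 1, and no next use after statement 2, so its register can be released at that point.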


Register Allocation and Assignment
• Local register allocation

• Register allocation is performed only within a basic block and follows a top-down approach.

• Assign registers to the most heavily used variables:

• Traverse the block

• Use the usage count as a priority function

• Assign registers to higher-priority variables first
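A usage-count allocator for one basic block can be sketched in a few lines. The function name `local_allocate`, the tuple representation of statements, and the alphabetical tie-breaking between equally used variables are assumptions made for this illustration.

```python
from collections import Counter

def local_allocate(block, num_regs):
    """Give the num_regs most frequently occurring names in the block a register."""
    counts = Counter(name for stmt in block for name in stmt)
    # Higher count means higher priority; ties are broken alphabetically.
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    return {name: f"R{i}" for i, name in enumerate(ranked[:num_regs])}

block = [("x", "y", "z"),    # x = y + z
         ("a", "x", "b"),    # a = x + b
         ("c", "x", "d")]    # c = x + d
allocation = local_allocate(block, num_regs=2)
print(allocation)
```

Variables that do not receive a register stay in memory and are loaded and stored around each use.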


Need for Global Register Allocation

• Local allocation does not take into account that some instructions (e.g. those in loops) execute more frequently. It also forces stores and loads at basic block boundaries, since each block has no knowledge of the context of the others.

• Global allocation is needed to find the live range(s) of each variable and the area(s) where the variable is used or defined. The cost of spilling depends on the frequencies and locations of uses.

• Register allocation depends on:

• Size of the live range

• Number of uses/definitions

• Frequency of execution

• Number of loads/stores needed


Global Register Allocation

• Global register allocation can be seen as a graph coloring problem.

• Basic idea:

1. Identify the live range of each variable.

2. Build a register interference graph (RIG) that represents conflicts between live ranges (two nodes are connected if the variables they represent are live at the same moment).

3. Try to color the nodes of the graph with at most as many colors as there are registers, so that no two neighbors have the same color.
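The three steps can be sketched as follows. The input is assumed to be the set of live variables at each program point (step 1 already done), and the coloring is a simple greedy heuristic in order of decreasing degree, not an optimal or production algorithm; the names `build_rig` and `color` are hypothetical.

```python
from itertools import combinations

def build_rig(live_sets):
    """Connect every pair of variables that are live at the same program point."""
    graph = {}
    for live in live_sets:
        for var in live:
            graph.setdefault(var, set())
        for u, v in combinations(sorted(live), 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph

def color(graph, num_regs):
    """Greedily assign register numbers; variables that get none are spilled."""
    assignment, spilled = {}, []
    for node in sorted(graph, key=lambda n: (-len(graph[n]), n)):
        used = {assignment[n] for n in graph[node] if n in assignment}
        free = [reg for reg in range(num_regs) if reg not in used]
        if free:
            assignment[node] = free[0]
        else:
            spilled.append(node)      # no color left: keep this variable in memory
    return assignment, spilled

live_sets = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
rig = build_rig(live_sets)
assignment, spilled = color(rig, num_regs=2)
print(assignment, spilled)
```

In this example a and c never interfere, so they can share a register, and two registers are enough for four variables.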
Run-Time Organization

• The run-time environment is the structure of the target computer's registers and memory that serves to manage memory and maintain the information needed to guide a program's execution.
1. Fully Static
• A fully static run-time environment is suitable for languages that support neither pointers or dynamic allocation nor recursive function calls.

• Every procedure has exactly one activation record, which is allocated before execution.

• Variables are accessed directly via fixed addresses.


2. Stack Based
• In a stack-based environment, an activation record is allocated (pushed) whenever a function call is made.

• The necessary memory is taken from the stack portion of the program.

• When execution returns from the function, the memory used by the activation record is deallocated (the activation record is popped). Thus, the stack grows and shrinks with the chain of function calls.
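The push and pop behaviour can be observed with an explicit stack. In this sketch the activation records are plain dictionaries and the `trace` list records the stack depth at each call; both are assumptions made for illustration and not how a real run-time system stores its frames.

```python
def factorial(n, stack, trace):
    stack.append({"proc": "factorial", "n": n})   # push a record on call
    trace.append(len(stack))                      # remember the depth reached
    result = 1 if n <= 1 else n * factorial(n - 1, stack, trace)
    stack.pop()                                   # pop the record on return
    return result

stack, trace = [], []
value = factorial(4, stack, trace)
print(value, trace, stack)
```

The stack grows by one record per nested call and is empty again once the outermost call has returned.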
3. Fully Dynamic
• Functional languages use this style of call stack management.

• An activation record is deallocated only when all references to it have disappeared, which requires activation records to be freed dynamically at arbitrary times during execution.

• A memory manager (garbage collector) is needed.

• The data structure that handles such management is the heap, and this is also called heap management.
Activation Records
• The information needed by a single execution of a procedure is managed using a contiguous block of storage called an "activation record".

• An activation record is allocated when a procedure is entered and deallocated when that procedure exits.

• It contains temporary data, local data, the saved machine status, an optional access link, an optional control link, the actual parameters and the returned value.
Contents of Activation Records
• Return Value: It is used by the called procedure to return a value to the calling procedure.

• Actual Parameters: They are used by the calling procedure to supply parameters to the called procedure.

• Control Link: It points to the activation record of the caller.

• Access Link: It is used to refer to non-local data held in other activation records.

• Saved Machine Status: It holds information about the state of the machine just before the procedure is called.

• Local Data: It holds the data that is local to the execution of the procedure.

• Temporaries: They store the values that arise in the evaluation of an expression.
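The fields listed above can be modelled as a record type. The class `ActivationRecord`, its field names and the helper `lookup` are hypothetical names chosen for this sketch; a real compiler lays these fields out at fixed offsets in a stack frame rather than in an object.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

@dataclass
class ActivationRecord:
    procedure: str
    actual_parameters: list
    control_link: Optional["ActivationRecord"] = None   # record of the caller
    access_link: Optional["ActivationRecord"] = None    # record holding non-local data
    saved_machine_status: dict = field(default_factory=dict)
    local_data: dict = field(default_factory=dict)
    temporaries: dict = field(default_factory=dict)
    return_value: Any = None

def lookup(record, name):
    """Find a name in the local data, following access links for non-locals."""
    while record is not None:
        if name in record.local_data:
            return record.local_data[name]
        record = record.access_link
    raise NameError(name)

main = ActivationRecord("main", [], local_data={"x": 10})
inner = ActivationRecord("inner", [3], control_link=main, access_link=main,
                         local_data={"y": 1})
print(lookup(inner, "y"), lookup(inner, "x"))
```

The control link is what the run-time system follows on return, while the access link is only consulted when a procedure refers to data that is not local to it.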
