Module 4

The document outlines the course BCSSS602 on System Software and Compiler Design, detailing prerequisites and learning objectives related to compiler phases, code generation, and system software concepts. It covers various topics including code generation issues, memory management, instruction selection, register allocation, and evaluation order, along with examples of machine instructions and addressing modes. Additionally, it discusses program costs and the algorithm for partitioning intermediate code into basic blocks and flow graphs for efficient code generation.

Uploaded by

vvce22cse0028

System Software and Compiler Design
BCSSS602
3:0:2
Prerequisites
• Computer Organization
• Any programming language
• Data Structures
• Automata Theory
Course Learning Objectives (CLO)
• Understand the phases of the compiler
• Generate parse table, Intermediate Code, and Target Code
• Learn the concepts of System software – Assemblers and Loaders
MODULE 4
• Code Generation:
• The Target Language,
• Basic blocks and flow graph,
• Optimization of basic blocks,
• The Code-Generation Algorithm
• Introduction to Assembler
• Machine Architecture of SIC and SIC/XE,
• Basic assembler functions
Issues in Design of code generator
• The following are the design issues:
1. Input to the code generator
2. The target program
3. Instruction selection
4. Register Allocation
5. Evaluation order
Input to Code Generator
• The input to the code generator is the intermediate code produced by the compiler's front end.
• This intermediate code is a machine-independent representation of the program, such as triples, quadruples, or abstract syntax trees.
• Along with the intermediate code, the code generator uses information from the symbol table, which holds the run-time addresses of variables and other data objects.
• The code generator assumes that its input is free of syntactic and semantic errors, because type checking and other error checks have already been handled by the front end.
• Handling the input correctly is crucial for generating correct target code.
Target Program
The target program is the final output of the code generator, which can be
in the form of absolute machine language, relocatable machine language,
or assembly language. Each type of output has its own set of challenges:
• Absolute Machine Language is easy to execute but lacks flexibility because it is bound to specific memory locations.
• Relocatable Machine Language allows parts of the program to be moved around in memory, making it suitable for linking multiple modules, but it requires a linking loader.
• Assembly Language is symbolic and needs an additional step (an assembler) to convert it into machine code, but it makes the code generation process easier.
• Choosing the appropriate form for the target program depends on factors such as the program's needs, the execution environment, and whether the program will be linked with other modules.
Memory Management
• Memory management in the code generation phase involves
mapping variable names to their corresponding memory
locations.
• The code generator works closely with the front-end to access
the symbol table, where memory addresses for variables are
stored.
• A major challenge is ensuring that the code generator uses
memory efficiently, avoids memory conflicts, and correctly
handles dynamic memory allocation.
• This requires careful handling of variable storage, particularly
for dynamically allocated objects or large data structures, such
as arrays or objects in object-oriented languages.
Instruction Selection
• Instruction selection is the process of choosing the most
suitable machine instructions to translate intermediate code
into executable code.
• The goal is to optimize the generated code by selecting
instructions that are efficient and appropriate for the target
machine.
• If the right instructions are not selected, the resulting code
can be inefficient and slow.
• A code generator might need to decide between different ways
of implementing the same operation, such as using different
addressing modes or optimizing for processor-specific
features.
For example, a statement-by-statement translation of the following three-address statements produces a redundant load, which better instruction selection avoids:
• Three Address Code:
P = Q + R
S = P + T
• Statement-by-statement translation:
MOV R0, Q
ADD R0, R
MOV P, R0
MOV R0, P --- Redundant code (R0 already holds P)
ADD R0, T
MOV S, R0
• Improved code:
MOV R0, Q
ADD R0, R
MOV P, R0
ADD R0, T
MOV S, R0
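The redundant load above can be removed mechanically. The following is a minimal Python sketch of such a peephole pass, not the method of any particular compiler. The tuple encoding `(op, dst, src)` and the function name `drop_redundant_loads` are assumptions made for illustration, and the pass assumes the second instruction of each pair carries no label (it is not a jump target).

```python
def drop_redundant_loads(code):
    """Remove 'MOV reg, x' when the previous instruction was 'MOV x, reg'.

    code is a list of (op, dst, src) tuples (a hypothetical encoding).
    Assumes no instruction in the list is the target of a jump.
    """
    out = []
    for op, dst, src in code:
        if out:
            p_op, p_dst, p_src = out[-1]
            # After 'MOV x, reg' the register still holds x, so reloading it is useless.
            if op == "MOV" and p_op == "MOV" and dst == p_src and src == p_dst:
                continue
        out.append((op, dst, src))
    return out


naive = [
    ("MOV", "R0", "Q"), ("ADD", "R0", "R"), ("MOV", "P", "R0"),
    ("MOV", "R0", "P"), ("ADD", "R0", "T"), ("MOV", "S", "R0"),
]
improved = drop_redundant_loads(naive)
for ins in improved:
    print("{} {}, {}".format(*ins))
```

The pass only looks at adjacent pairs, which is enough for this example; a real code generator would usually avoid emitting the load in the first place by tracking register contents.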
Register Allocation Issues
• Efficient use of registers is important because registers are faster than
memory, and utilizing them effectively can significantly improve
program performance.
• The challenge lies in selecting the right variables to store in registers at
different points in the program.
Register allocation involves two stages:
1. Register Allocation: selecting which variables will reside in registers at each point in the program.
2. Register Assignment: assigning specific registers to the variables selected during register allocation.
• The difficulty arises in managing which variables are allocated to registers, especially when the number of available registers is limited.
• Poor register allocation can lead to spills, where data is temporarily stored in memory, causing slower performance.
Evaluation Order
• The evaluation order refers to the sequence in which
expressions are evaluated in the generated code.
• This order can significantly affect the efficiency of the program.
• For example, evaluating certain expressions first might require
fewer registers or fewer instructions.
• The challenge is to determine the optimal order in which to
execute operations so that the program requires fewer
resources (like memory or registers) and runs more efficiently.
• This is often a complex problem, as finding the best evaluation
order can be computationally expensive, and in some cases, it
may require sophisticated algorithms to find the optimal
solution.
Code Generation
• The Target Language – Assembly code
• A Simple Target Machine Model - Our target computer models a three-address machine with load and store operations, computation operations, jump operations, and conditional jumps.
• The underlying computer is a byte-addressable machine with n general-purpose registers, R0, R1, . . . , Rn-1.
• We use a very limited set of instructions and assume that all operands are integers.
• Most instructions consist of an operator, followed by a target, followed by a list of source operands.
• A label may precede an instruction.
Instructions for code generation
• Load operations: Move data from memory to a register.
• The instruction LD dst, addr loads the value in location addr into location dst.
• LD r, x loads the value in location x into register r, and LD r1, r2 loads the contents of register r2 into register r1.
• Store operations: Move data from a register to memory.
• The instruction ST x, r stores the value in register r into the location x.
• Computation operations have the form OP dst, src1, src2, where OP is an operator like ADD or SUB, and dst, src1, and src2 are locations, not necessarily distinct.
• For example, SUB r1, r2, r3 computes r1 = r2 - r3.
• Any value formerly stored in r1 is lost.
• Unary operators that take only one operand do not have a src2.
• Unconditional jumps: The instruction BR L causes control to branch to the machine instruction with label L. (BR stands for branch.)
• Conditional jumps have the form Bcond r, L, where r is a register, L is a label, and cond stands for any of the common tests on values in the register r.
• For example, BLTZ r, L causes a jump to label L if the value in register r is less than zero, and allows control to pass to the next machine instruction if not.
Addressing Modes
• Direct addressing - In instructions, a location can be a variable name x referring to the memory location that is reserved for x (that is, the l-value of x).
• Indexed addressing - A location can also be an indexed address of the form a(r), where a is a variable and r is a register. The memory location denoted by a(r) is computed by taking the l-value of a and adding to it the value in register r.
• For example, the instruction LD R1, a(R2) has the effect of setting R1 = contents(a + contents(R2)), where contents(x) denotes the contents of the register or memory location represented by x.
• This addressing mode is useful for accessing arrays, where a is the base address of the array (that is, the address of the first element), and r holds the number of bytes past that address needed to reach one of the elements of array a.
• Integer Indexed Addressing - A memory location can be an integer indexed by a register.
• For example, LD R1, 100(R2) has the effect of setting R1 = contents(100 + contents(R2)), that is, of loading into R1 the value in the memory location obtained by adding 100 to the contents of register R2.
• Indirect Addressing - Two indirect addressing modes exist: *r means the memory location found in the location represented by the contents of register r, and *100(r) means the memory location found in the location obtained by adding 100 to the contents of r.
• For example, LD R1, *100(R2) has the effect of setting R1 = contents(contents(100 + contents(R2))), that is, of loading into R1 the value in the memory location stored in the memory location obtained by adding 100 to the contents of register R2.
• Immediate Addressing - The immediate constant addressing mode. The constant is prefixed by #.
• The instruction LD R1, #100 loads the integer 100 into register R1, and ADD R1, R1, #100 adds the integer 100 to register R1.
Example
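• The three-address statement x = y - z can be implemented by the machine instructions:
LD R1, y // R1 = y
LD R2, z // R2 = z
SUB R1, R1, R2 // R1 = R1 - R2
ST x, R1 // x = R1
• The machine instructions for b = a[i], where a is an array whose elements are 8-byte values, assuming the elements of a are indexed starting at 0:
LD R1, i // R1 = i
MUL R1, R1, 8 // R1 = R1 * 8
LD R2, a(R1) // R2 = contents(a + contents(R1))
ST b, R2 // b = R2
• a[j] = c
LD R1, c // R1 = c
LD R2, j // R2 = j
MUL R2, R2, 8 // R2 = R2 * 8
ST a(R2), R1 // contents(a + contents(R2)) = R1
• x = *p
LD R1, p // R1 = p
LD R2, 0(R1) // R2 = contents(0 + contents(R1))
ST x, R2 // x = R2
• *p = y
LD R1, p // R1 = p
LD R2, y // R2 = y
ST 0(R1), R2 // contents(0 + contents(R1)) = R2
• if x < y goto L
LD R1, x // R1 = x
LD R2, y // R2 = y
SUB R1, R1, R2 // R1 = R1 - R2
BLTZ R1, M // if R1 < 0 jump to M
Here M is the label of the first machine instruction generated from the three-address instruction labeled L.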
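To check that a short instruction sequence really computes what a three-address statement asks for, it can help to execute it on a toy model of the target machine. The sketch below is a hypothetical Python simulator, not part of the course material. It assumes registers are named R followed by digits, supports only LD, ST, ADD, SUB and MUL, and handles only register, direct and immediate operands. It runs one plausible sequence for x = y - z.

```python
import re


def is_register(operand):
    # Assumption: registers are written R0, R1, ...; everything else is a memory name.
    return re.fullmatch(r"R\d+", operand) is not None


def run(program, memory):
    """Execute (op, operand, ...) tuples; memory maps variable names to integers."""
    regs = {}

    def value(operand):
        if operand.startswith("#"):          # immediate constant, e.g. #100
            return int(operand[1:])
        if is_register(operand):             # register operand
            return regs[operand]
        return memory[operand]               # direct addressing: variable name

    arith = {
        "ADD": lambda a, b: a + b,
        "SUB": lambda a, b: a - b,
        "MUL": lambda a, b: a * b,
    }
    for op, *args in program:
        if op == "LD":
            dst, src = args
            regs[dst] = value(src)
        elif op == "ST":
            dst, src = args
            memory[dst] = regs[src]
        elif op in arith:
            dst, src1, src2 = args
            regs[dst] = arith[op](value(src1), value(src2))
        else:
            raise ValueError("unsupported instruction: " + op)
    return memory


memory = {"y": 10, "z": 3}
program = [
    ("LD", "R1", "y"),
    ("LD", "R2", "z"),
    ("SUB", "R1", "R1", "R2"),
    ("ST", "x", "R1"),
]
run(program, memory)
print(memory["x"])
```

Indexed and indirect addressing are left out to keep the sketch short; adding them would mean modelling memory as addresses rather than names.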
Program and Instruction Costs
• The cost of a program is associated with compiling and running it.
• Which cost is measured depends on the aspect of the program that needs to be optimized.
• Some common cost measures are the length of compilation time and the size,
running time and power consumption of the target program.
• Determining the actual cost of compiling and running a program is a complex
problem.
• Finding an optimal target program for a given source program is an undecidable
problem in general, and many of the subproblems involved are NP-hard.
Program and Instruction Costs
• Each target-language instruction has an associated cost. We take the cost of an instruction to be one plus the costs associated with the addressing modes of its operands.
• This cost corresponds to the length in words of the instruction.
• Addressing modes involving only registers have zero additional cost.
• Addressing modes involving a memory location or a constant have an additional cost of one, because such operands have to be stored in the words following the instruction.
Examples
• The instruction LD R0, R1 copies the contents of register R1 into register R0. This instruction has a cost of one because no additional memory words are required. Register addressing mode: 1 + 0 = 1
• The instruction LD R0, M loads the contents of memory location M into register R0. The cost is two since the address of memory location M is in the word following the instruction. Direct memory addressing mode: 1 + 1 = 2
• The instruction LD R1, *100(R2) loads into register R1 the value given by contents(contents(100 + contents(R2))). The cost is two: one for the instruction itself plus one because the constant 100 is stored in the word following the instruction, while the registers R1 and R2 add nothing. Indirect indexed addressing mode: 1 + 1 = 2
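The cost rule (one per instruction, plus one for each operand that carries a memory address or a constant) is easy to apply by machine. The following Python sketch is illustrative only; the function names `operand_cost`, `instruction_cost` and `program_cost` and the textual instruction format are assumptions, and registers are assumed to be written R followed by digits.

```python
import re


def operand_cost(operand):
    """0 if the operand names only a register, 1 if it carries an address or constant."""
    bare = operand.lstrip("*")               # indirection through a register adds no word
    return 0 if re.fullmatch(r"R\d+", bare) else 1


def instruction_cost(instruction):
    """Cost = 1 + the cost of each operand's addressing mode."""
    _, _, rest = instruction.partition(" ")
    operands = [o.strip() for o in rest.split(",")]
    return 1 + sum(operand_cost(o) for o in operands)


def program_cost(instructions):
    return sum(instruction_cost(i) for i in instructions)


for ins in ["LD R0, R1", "LD R0, M", "ADD R1, R1, #1", "ST x, R1"]:
    print(ins, "->", instruction_cost(ins))

# One possible sequence for x = y - z.
print(program_cost(["LD R1, y", "LD R2, z", "SUB R1, R1, R2", "ST x, R1"]))
```

Summing instruction costs gives a simple static measure of program size; it says nothing about running time on a machine where instructions take different numbers of cycles.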
Exercise problems
• x = 1
LD R1, #1
ST x, R1
• x = a
LD R1, a
ST x, R1
• x = a + 1
LD R1, a
ADD R1, R1, #1
ST x, R1
• x = a + b
LD R1, a
LD R2, b
ADD R1, R1, R2
ST x, R1
• x = b * c
y = b + x
LD R1, b
LD R2, c
MUL R2, R1, R2
ST x, R2
ADD R1, R1, R2
ST y, R1
Generate code for the following three-address statements assuming a and b are arrays whose elements are 4-byte values.
• x = a[i]
y = b[j]
a[i] = y
b[j] = x
LD R1, i
MUL R1, R1, 4
LD R2, a(R1)
ST x, R2
LD R3, j
MUL R3, R3, 4
LD R4, b(R3)
ST y, R4
ST a(R1), R4
ST b(R3), R2
• x = a[i]
y = b[i]
z = x * y
LD R1, i
MUL R1, R1, 4
LD R2, a(R1)
ST x, R2
LD R3, b(R1)
ST y, R3
MUL R3, R2, R3
ST z, R3
• x = a[i]
y = b[x]
a[i] = y
LD R1, i
MUL R1, R1, 4
LD R2, a(R1)
ST x, R2
MUL R2, R2, 4
LD R3, b(R2)
ST y, R3
ST a(R1), R3
Generate code for the following three-address sequences assuming that i, n, p, q, x and y are in memory locations:
• y = *q
q = q + 4
*p = y
p = p + 4
LD R1, q
LD R2, 0(R1)
ST y, R2
ADD R1, R1, #4
ST q, R1
LD R3, y
LD R4, p
ST 0(R4), R3
ADD R4, R4, #4
ST p, R4
• if (x < y) x = y; else x = z;
Three-address code:
if x < y goto L1
x = z
goto L2
L1: x = y
L2:
Target code:
LD R1, x
LD R2, y
SUB R1, R1, R2
BLTZ R1, L1
LD R3, z
ST x, R3
BR L2
L1: ST x, R2
L2:
• while (i < n) { i = i + 1; }
Three-address code:
L1: if i >= n goto L2
i = i + 1
goto L1
L2:
Target code (BGEZ is the Bcond form that branches when the register is greater than or equal to zero):
L1: LD R1, i
LD R2, n
SUB R2, R1, R2
BGEZ R2, L2
ADD R1, R1, #1
ST i, R1
BR L1
L2:
Basic Blocks and Flow Graphs
• A flow graph is a graph representation of intermediate code that is helpful for discussing code generation, even if the graph is not constructed explicitly by a code-generation algorithm.
• The representation is constructed as follows:
1. Partition the intermediate code into basic blocks, which are maximal
sequences of consecutive three-address instructions with the properties
that
• (a) The flow of control can only enter the basic block through the first instruction in the
block. That is, there are no jumps into the middle of the block.
• (b) Control will leave the block without halting or branching, except possibly at the last
instruction in the block.
2. The basic blocks become the nodes of a flow graph, whose edges
indicate which blocks can follow which other blocks.
Algorithm: Partitioning three-address instructions into
basic blocks
INPUT: A sequence of three-address instructions.
OUTPUT: A list of the basic blocks for that sequence in which each instruction is assigned to
exactly one basic block.
METHOD: First, determine those instructions in the intermediate code that are leaders, that is,
the first instructions in some basic block.
• The instruction just past the end of the intermediate program is not included as a leader.
• The rules for finding leaders are:
1. The first three-address instruction in the intermediate code is a leader.
2. Any instruction that is the target of a conditional or unconditional jump is a leader.
3. Any instruction that immediately follows a conditional or unconditional jump is a leader.
• Then, for each leader, its basic block consists of itself and all instructions up to but not
including the next leader or the end of the intermediate program.
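The three leader rules translate almost directly into code. Below is a minimal Python sketch under stated assumptions: instructions are plain strings numbered from 1, and every jump is written with a numeric target in the form goto(n). The function names and the six-instruction sample program are hypothetical.

```python
import re


def find_leaders(code):
    """Return the sorted instruction numbers (1-based) that start a basic block."""
    if not code:
        return []
    leaders = {1}                                    # rule 1: first instruction
    for k, ins in enumerate(code, start=1):
        m = re.search(r"goto\((\d+)\)", ins)
        if m:
            target = int(m.group(1))
            if target <= len(code):
                leaders.add(target)                  # rule 2: target of a jump
            if k < len(code):
                leaders.add(k + 1)                   # rule 3: instruction after a jump
    return sorted(leaders)


def basic_blocks(code):
    """Each block runs from a leader up to, but not including, the next leader."""
    leaders = find_leaders(code)
    bounds = leaders + [len(code) + 1]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(len(leaders))]


code = [
    "i = 0",                 # 1
    "if i >= n goto(6)",     # 2
    "t1 = i * 8",            # 3
    "i = i + 1",             # 4
    "goto(2)",               # 5
    "x = i",                 # 6
]
print(find_leaders(code))
print(basic_blocks(code))
```

A jump whose target lies past the end of the program contributes no leader, matching the remark that the instruction just past the end of the intermediate program is not a leader.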
Example
The leaders are instructions 1, 2, 3, 10, 12, and 13.
Flow Graphs
• Once an intermediate-code program is partitioned into basic blocks, the flow of
control is represented between them by a flow graph.
• The nodes of the flow graph are the basic blocks.
• There is an edge from block B to block C if and only if it is possible for the first
instruction in block C to immediately follow the last instruction in block B.
• There are two ways that such an edge could be justified:
• There is a conditional or unconditional jump from the end of B to the beginning
of C.
• C immediately follows B in the original order of the three-address instructions,
and B does not end in an unconditional jump.
• We say that B is a predecessor of C, and C is a successor of B.
• Add two nodes, the entry and exit, that do not correspond to executable
intermediate instructions.
• There is an edge from the entry to the first executable node of the flow graph,
that is, to the basic block that comes from the first instruction of the
intermediate code.
• There is an edge to the exit from any basic block that contains an instruction
that could be the last executed instruction of the program.
• If the final instruction of the program is not an unconditional jump, then the
block containing the final instruction of the program is one predecessor of the
exit, but so is any basic block that has a jump to code that is not part of the
program.
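The edge rules above can be sketched in Python as follows. This is an illustrative sketch, not a reference implementation: it assumes instructions are strings numbered from 1, jumps are written goto(n), an unconditional jump is an instruction that starts with "goto", and the basic blocks are already given as lists of instruction numbers. The names `flow_graph`, "ENTRY" and "EXIT" are hypothetical.

```python
import re


def flow_graph(code, blocks):
    """Return the set of edges (B, C) between block indices, plus ENTRY and EXIT."""
    block_starting_at = {blk[0]: b for b, blk in enumerate(blocks)}
    edges = {("ENTRY", 0)}                       # entry goes to the first block
    for b, blk in enumerate(blocks):
        last = code[blk[-1] - 1]                 # last instruction of block b
        m = re.search(r"goto\((\d+)\)", last)
        if m:
            # Jump edge; a target outside the program leads to EXIT.
            target = int(m.group(1))
            edges.add((b, block_starting_at.get(target, "EXIT")))
        if not last.startswith("goto"):
            # Fall-through edge: the block does not end in an unconditional jump.
            edges.add((b, b + 1 if b + 1 < len(blocks) else "EXIT"))
    return edges


code = [
    "i = 0",                 # 1
    "if i >= n goto(6)",     # 2
    "t1 = i * 8",            # 3
    "i = i + 1",             # 4
    "goto(2)",               # 5
    "x = i",                 # 6
]
blocks = [[1], [2], [3, 4, 5], [6]]
for edge in sorted(flow_graph(code, blocks), key=str):
    print(edge)
```

In the sample, block 1 (the loop test) has two successors, and block 2 (the loop body) has an edge back to block 1, which is how a loop shows up in a flow graph.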
Translate the program into three-address statements
Construct the flow graph
B1    1) i = 0
B2    2) if i >= n goto(13)
B3    3) j = 0
B4    4) if j >= n goto(11)
B5    5) t1 = n * i
      6) t2 = t1 + j
      7) t3 = t2 * 8
      8) c[t3] = 0.0
      9) j = j + 1
     10) goto(4)
B6   11) i = i + 1
     12) goto(2)
B7   13) i = 0
B8   14) if i >= n goto(40)
B9   15) j = 0
B10  16) if j >= n goto(38)
B11  17) k = 0
B12  18) if k >= n goto(36)
B13  19) t4 = n * i
     20) t5 = t4 + j
     21) t6 = t5 * 8
     22) t7 = c[t6]
     23) t8 = n * i
     24) t9 = t8 + k
     25) t10 = t9 * 8
     26) t11 = a[t10]
     27) t12 = n * k
     28) t13 = t12 + j
     29) t14 = t13 * 8
     30) t15 = b[t14]
     31) t16 = t11 * t15
     32) t17 = t7 + t16
     33) c[t6] = t17
     34) k = k + 1
     35) goto(18)
B14  36) j = j + 1
     37) goto(16)
B15  38) i = i + 1
     39) goto(14)
Flow graph
Next use information
• Knowing when the value of a variable will be used next is essential for
generating good code.
• If the value of a variable that is currently in a register will never be referenced
subsequently, then that register can be assigned to another variable.
Algorithm: Determining the liveness and next-use
information for each statement in a basic block.
• INPUT: A basic block B of three-address statements.
• OUTPUT: At each statement i: x = y + z in B, we attach to i the liveness and
next-use information of x, y, and z.
• METHOD: We start at the last statement in B and scan backwards to the
beginning of B.
• At each statement i: x = y + z in B, we do the following:
1. Attach to statement i the information currently found in the symbol table
regarding the next use and liveness of x, y, and z.
2. In the symbol table, set x to "not live" and "no next use."
3. In the symbol table, set y and z to "live" and the next uses of y and z to i.
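The backward scan can be sketched in a few lines of Python. This is a minimal illustration under assumptions: each statement is a triple (x, y, z) standing for x = y op z, statements are numbered from 1, and the set of variables live on exit from the block is supplied by the caller. The function name `next_use_info` and the sample block are hypothetical.

```python
def next_use_info(block, live_on_exit):
    """Return, for each statement, {name: (live, next_use)} as seen at that statement.

    block is a list of (x, y, z) triples for statements x = y op z.
    next_use is a statement number, or None for "no next use".
    """
    names = {n for stmt in block for n in stmt}
    table = {n: (n in live_on_exit, None) for n in names}
    attached = [None] * len(block)
    for i in range(len(block), 0, -1):           # scan from the last statement backwards
        x, y, z = block[i - 1]
        attached[i - 1] = {n: table[n] for n in (x, y, z)}   # step 1
        table[x] = (False, None)                              # step 2
        table[y] = (True, i)                                  # step 3
        table[z] = (True, i)
    return attached


block = [
    ("t", "a", "b"),   # 1) t = a - b
    ("u", "a", "c"),   # 2) u = a - c
    ("v", "t", "u"),   # 3) v = t + u
]
info = next_use_info(block, live_on_exit={"a", "b", "c", "v"})
for number, entry in enumerate(info, start=1):
    print(number, entry)
```

Steps 2 and 3 must be done in that order: for a statement such as x = x + 1, x is both defined and used, and the use has to win.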
Optimization of Basic Blocks
• The DAG Representation of Basic Blocks
• Many important techniques for local optimization begin by transforming a basic block
into a DAG (directed acyclic graph).
• Construction of a DAG for a basic block is as follows:
1. There is a node in the DAG for each of the initial values of the variables appearing in
the basic block.
2. There is a node N associated with each statement s within the block. The children of
N are those nodes corresponding to statements that are the last definitions, prior to s,
of the operands used by s.
3. Node N is labeled by the operator applied at s, and also attached to N is the list of
variables for which it is the last definition within the block.
4. Certain nodes are designated output nodes. These are the nodes whose variables are live on exit from the block; that is, their values may be used later, in another block of the flow graph. Calculation of these "live variables" is a matter for global flow analysis.
Advantages of DAG
• The DAG representation of a basic block lets us perform several code
improving transformations on the code represented by the block.
1. eliminate local common subexpressions, that is, instructions that compute a
value that has already been computed.
2. eliminate dead code, that is, instructions that compute a value that is never
used.
3. reorder statements that do not depend on one another; which may reduce the
time a temporary value needs to be preserved in a register.
4. apply algebraic laws to reorder operands of three-address instructions, and sometimes thereby simplify the computation.
Finding Local Common Subexpressions
• Common subexpressions can be detected by noticing, as a new node M is about
to be added, whether there is an existing node N with the same children, in the
same order, and with the same operator.
• If so, N computes the same value as M and may be used in its place.
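The check "is there already a node with the same operator and the same children, in the same order" is naturally done with a dictionary keyed on (operator, left child, right child). The Python sketch below is one possible way to do it, with hypothetical names (`build_dag`, the tuple encoding of statements) and a four-statement sample block chosen for illustration.

```python
def build_dag(block):
    """Build a DAG for a block of (dst, op, left, right) statements.

    Returns (nodes, current): nodes[i] is ('leaf', name) or (op, left_id, right_id),
    and current maps each variable to the node holding its latest value.
    """
    nodes, index, current = [], {}, {}

    def node_for(key):
        # Reuse an existing node with the same operator and children, if any.
        if key not in index:
            index[key] = len(nodes)
            nodes.append(key)
        return index[key]

    def operand(name):
        # A variable not yet defined in the block refers to its initial value.
        if name not in current:
            current[name] = node_for(("leaf", name))
        return current[name]

    for dst, op, left, right in block:
        left_id, right_id = operand(left), operand(right)
        current[dst] = node_for((op, left_id, right_id))
    return nodes, current


block = [
    ("a", "+", "b", "c"),
    ("b", "-", "a", "d"),
    ("c", "+", "b", "c"),
    ("d", "-", "a", "d"),
]
nodes, current = build_dag(block)
print(len(nodes), current)
```

In the sample, the second and fourth statements both compute a - d from the same values, so b and d end up on one node. The first and third statements look alike textually (b + c), but b has been redefined in between, so they get different nodes.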
Dead Code Elimination
Example of dead code at the source level:
void example1()
{
int x = 10;
return; // Program exits here
x = 20; // Dead code
}
• The operation on DAGs that corresponds to dead-code elimination can be implemented as follows:
• Delete from a DAG any root (node with no ancestors) that has no live variables attached.
• Repeated application of this transformation will remove all nodes from the DAG that correspond to dead code.
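The root-deletion rule can be sketched as a small fixed-point loop. The Python below is illustrative only: the dictionary encoding of the DAG, the node names, and the choice to keep leaves (initial values) untouched are assumptions. The sample DAG corresponds to the block a = b + c; b = b - d; c = c + d; e = b + c, with a and b live on exit.

```python
def eliminate_dead(nodes, live):
    """Repeatedly delete interior roots with no live variable attached (in place).

    nodes maps a node id to {'children': [...], 'vars': set_of_variable_names}.
    Leaves (no children) stand for initial values and are left alone.
    """
    changed = True
    while changed:
        changed = False
        has_parent = {c for n in nodes.values() for c in n["children"]}
        for node_id in list(nodes):
            node = nodes[node_id]
            is_root = node_id not in has_parent
            if is_root and node["children"] and not (node["vars"] & live):
                del nodes[node_id]               # may expose new roots for the next pass
                changed = True
    return nodes


dag = {
    "b0": {"children": [], "vars": set()},
    "c0": {"children": [], "vars": set()},
    "d0": {"children": [], "vars": set()},
    "n_a": {"children": ["b0", "c0"], "vars": {"a"}},
    "n_b": {"children": ["b0", "d0"], "vars": {"b"}},
    "n_c": {"children": ["c0", "d0"], "vars": {"c"}},
    "n_e": {"children": ["n_b", "n_c"], "vars": {"e"}},
}
eliminate_dead(dag, live={"a", "b"})
print(sorted(dag))
```

The node for e is removed first; that turns the node for c into a root with no live variable, so it is removed on the next pass.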
The Use of Algebraic Identities
• Algebraic identities represent another important class of optimizations on
basic blocks.
• For example, use identities such as x + 0 = 0 + x = x, x - 0 = x, x * 1 = 1 * x = x, and x / 1 = x to eliminate computations from a basic block.
• Another class of algebraic optimizations includes replacing a more expensive operator by a cheaper one, as in x^2 = x * x, 2 * x = x + x, and x / 2 = x * 0.5.
• A third class of related optimizations is constant folding. Here we evaluate constant expressions at compile time and replace the constant expressions by their values. Thus the expression 2 * 3.14 would be replaced by 6.28.
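The three classes of algebraic optimization can be combined in one small rewriting function. The Python sketch below is a hypothetical illustration: operands are either numbers (constants) or strings (variable names), only +, - and * are handled, and the function name `simplify` is an assumption.

```python
FOLD = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def is_const(v):
    return isinstance(v, (int, float))


def simplify(op, left, right):
    """Return a constant, a variable name, or a (possibly cheaper) (op, left, right)."""
    if op in FOLD and is_const(left) and is_const(right):
        return FOLD[op](left, right)              # constant folding
    if op == "+" and right == 0:
        return left                               # x + 0 = x
    if op == "+" and left == 0:
        return right                              # 0 + x = x
    if op == "-" and right == 0:
        return left                               # x - 0 = x
    if op == "*" and right == 1:
        return left                               # x * 1 = x
    if op == "*" and left == 1:
        return right                              # 1 * x = x
    if op == "*" and left == 2:
        return ("+", right, right)                # 2 * x = x + x
    if op == "*" and right == 2:
        return ("+", left, left)                  # x * 2 = x + x
    return (op, left, right)


print(simplify("*", 2, 3.14))
print(simplify("+", "x", 0))
print(simplify("*", 2, "x"))
print(simplify("-", "x", "y"))
```

Division is deliberately left out of the folding table so that the sketch never divides by zero at compile time; a real compiler also has to respect the language's rules for overflow and floating-point rounding before applying such rewrites.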
Representation of Array References
• Consider for instance the sequence of three address statements:
• The proper way to represent array accesses in a DAG is as follows.
• 1. An assignment from an array, like x = a[i], is represented by creating a node with operator =[ ] and two children representing the initial value of the array, a0 in this case, and the index i. Variable x becomes a label of this new node.
• 2. An assignment to an array, like a[j] = y, is represented by a new node with operator []= and three children representing a0, j and y. There is no variable labeling this node.
A Simple Code Generator
• One of the primary issues during code generation is deciding how to use
registers to best advantage.
• There are four principal uses of registers:
1. In most machine architectures, some or all of the operands of an operation
must be in registers in order to perform the operation.
2. Registers make good temporaries - places to hold the result of a
subexpression while a larger expression is being evaluated, or more
generally, a place to hold a variable that is used only within a single basic
block.
3. Registers are used to hold (global) values that are computed in one basic
block and used in other blocks, for example, a loop index that is
incremented going around the loop and is used several times within the
loop.
4. Registers are often used to help with run-time storage management, for
example, to manage the run-time stack, including the maintenance of
stack pointers and possibly the top elements of the stack itself.
Translate the basic block consisting of the three-address statements
• Assume that t, u, and v are temporaries, local to the block, while a, b, c, and d are variables that are live on exit from the block.
• Restrict the use of registers based on reusability; only 3 registers are then sufficient.
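x = a / (b + c) - d * e
Register descriptors (R1 to R3) and address descriptors (a to t3) after each instruction; "-" means empty.

Statement / code      R1   R2   R3  |  a      b      c      d      e      x      t1   t2   t3
(initially)           -    -    -   |  a      b      c      d      e      -      -    -    -
t1 = b + c
  LD  R1, b           b    -    -   |  a      b,R1   c      d      e      -      -    -    -
  LD  R2, c           b    c    -   |  a      b,R1   c,R2   d      e      -      -    -    -
  ADD R1, R1, R2      t1   c    -   |  a      b      c,R2   d      e      -      R1   -    -
t2 = a / t1
  LD  R2, a           t1   a    -   |  a,R2   b      c      d      e      -      R1   -    -
  DIV R1, R2, R1      t2   a    -   |  a,R2   b      c      d      e      -      -    R1   -
t3 = d * e
  LD  R2, d           t2   d    -   |  a      b      c      d,R2   e      -      -    R1   -
  LD  R3, e           t2   d    e   |  a      b      c      d,R2   e,R3   -      -    R1   -
  MUL R2, R2, R3      t2   t3   e   |  a      b      c      d      e,R3   -      -    R1   R2
x = t2 - t3
  SUB R2, R1, R2      t2   x    e   |  a      b      c      d      e,R3   R2     -    R1   -
  ST  x, R2           t2   x    e   |  a      b      c      d      e,R3   x,R2   -    R1   -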
Introduction to System Software
• System software consists of a variety of programs that support the
operation of a computer
• Ex., - Text editor, compiler, loader or linker, debugger, macro processors,
operating system, database management systems, software engineering
tools, ….
• System software differs from application software in machine dependency
• System programs are intended to support the operation and use of the
computer itself, rather than any particular application.
Simplified Instructional Computer (SIC)
• SIC is a computer that includes the hardware features most often found on
real machines, while avoiding unusual or irrelevant complexities
• Like many other products, SIC comes in two versions
• The standard model (SIC)
• SIC/XE version‒ “extra equipment”, “extra expensive”
• The two versions have been designed to be upward compatible
SIC Machine Architecture
• Memory – Memory consists of 8-bit bytes with 15-bit addresses; any 3 consecutive bytes form a word (24 bits)
• Total of 32768 (2^15) bytes in the computer memory
• Registers – Five registers, each 24 bits in length
SIC
• Data Formats -
• 24-bit integer representation in 2's complement
• 8-bit ASCII code for characters
• No floating-point on standard version of SIC
• Instruction formats
• Standard version of SIC: 24-bit instructions consisting of an 8-bit opcode, a 1-bit flag x, and a 15-bit address
• The flag bit x is used to indicate indexed addressing mode
• Addressing modes - Two addressing modes, indicated by the x bit in the instruction
• Direct: x = 0, target address TA = address
• Indexed: x = 1, target address TA = address + (X)
• (X): the contents of register X
SIC -Instruction set: (Textbook-Appendix A, Page 495)
• Load/store registers: LDA, LDX, STA, STX
• Integer arithmetic: ADD, SUB, MUL, DIV
• All involve register A and a word in memory, result stored in register A
• COMP - Compare value in register A with a word in memory
• Set a condition code CC (<, =, >)
• Conditional jump instructions - JLT, JEQ, JGT: test CC and jump
• Subroutine linkage - JSUB, RSUB: return address in register L
SIC -Input and output
• Performed by transferring 1 byte at a time to or from the rightmost 8 bits of
register A
• Each device is assigned a unique 8-bit code, as an operand of I/O
instructions
• Test Device (TD): < (ready), = (not ready)
• Read Data (RD), Write Data (WD)
SIC – program example - Data movement
        LDA     FIVE     load 5 into register A
        STA     ALPHA    store in ALPHA from register A
        LDCH    CHARZ    load character 'Z' into register A
        STCH    C1       store in C1
        .
        .
        .
ALPHA   RESW    1        reserve one word of space for variable ALPHA
FIVE    WORD    5        one-word constant holding 5
CHARZ   BYTE    C'Z'     one-byte constant
C1      RESB    1        one-byte variable
Arithmetic operations: BETA = ALPHA + INCR - 1
DELTA = GAMMA + INCR -1
Looping and indexing: copy one string to another
SIC/XE Machine Architecture
• Memory - Maximum memory available on a SIC/XE system is 1 megabyte (2^20 bytes)
• An address (20 bits) cannot be fitted into a 15-bit field as in SIC Standard
• Must change instruction formats and addressing modes
• Registers - Additional registers are provided by SIC/XE
SIC/XE
• Data formats - There is a 48-bit floating-point data type
• fraction is a value between 0 and 1
• exponent is an unsigned binary number between 0 and 2047
• zero is represented as all 0
• With fraction f and exponent e, the absolute value of the number represented is f * 2^(e-1024)
• Instruction formats - Formats 1 and 2 do not reference memory
• Bit e distinguishes between format 3 and 4
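To make the 48-bit floating-point format concrete, the Python sketch below decodes a bit pattern into a number using the rule value = f * 2^(e-1024). The field layout is an assumption taken from the usual textbook description of SIC/XE (1 sign bit, then an 11-bit exponent, then a 36-bit fraction); the function name `decode_sicxe_float` is hypothetical.

```python
def decode_sicxe_float(bits):
    """Decode a 48-bit SIC/XE floating-point pattern given as an integer.

    Assumed layout, most significant bit first:
    1 sign bit, 11-bit exponent (0..2047), 36-bit fraction (a value in [0, 1)).
    """
    if bits == 0:
        return 0.0                                   # zero is all 0 bits
    sign = (bits >> 47) & 1
    exponent = (bits >> 36) & 0x7FF
    fraction = (bits & ((1 << 36) - 1)) / (1 << 36)
    value = fraction * 2.0 ** (exponent - 1024)
    return -value if sign else value


# 1.0 = 0.5 * 2^1: fraction 0.5 (top fraction bit set), exponent 1024 + 1.
one = (1025 << 36) | (1 << 35)
print(decode_sicxe_float(one))
print(decode_sicxe_float(one | (1 << 47)))
```

The exponent range 0 to 2047 in the text is exactly what 11 bits can hold, which is consistent with the assumed split of 1 + 11 + 36 = 48 bits.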
SIC/XE Program
SIC/XE example program
BASIC ASSEMBLER FUNCTIONS
• Fundamental functions of an assembler:
• Translating mnemonic operation codes to their machine language equivalents.
• Assigning machine addresses to symbolic labels used by the programmer
• Along with the instructions the following assembler directives are used:
• START: Specify name and starting address for the program.
• END : Indicate the end of the source program and specify the first executable
instruction in the program.
• BYTE: Generate character or hexadecimal constant, occupying as many bytes as
needed to represent the constant.
• WORD: Generate one- word integer constant.
• RESB: Reserve the indicated number of bytes for a data area.
• RESW: Reserve the indicated number of words for a data area
• The program contains a main routine that reads records from an input device (code F1) and copies them to an output device (code 05).
• The main routine calls subroutines:
• RDREC – To read a record into a buffer.
• WRREC – To write the record from the buffer to the output device.
• The end of each record is marked with a null character (hexadecimal 00).
A Simple SIC Assembler
• The translation of source program to object code requires the following
functions:
1. Convert mnemonic operation codes to their machine language equivalents. Eg: Translate STL to 14 (line 10).
2. Convert symbolic operands to their equivalent machine addresses. Eg: Translate RETADR to 1033 (line 10).
3. Build the machine instructions in the proper format.
4. Convert the data constants specified in the source program into their internal machine representations. Eg: Translate EOF to 454F46 (line 80).
5. Write the object program and the assembly listing.
• All functions except function 2 can be accomplished by sequential processing of the source program, one line at a time.
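• Consider the statement
10 1000 FIRST STL RETADR 141033
(line 10, location 1000, label FIRST, operation STL, operand RETADR, object code 141033)
• This instruction contains a forward reference, i.e., a reference to a label (RETADR) that is defined later in the program.
• If the program is translated line by line, the assembler cannot process this line because the address that will be assigned to RETADR is not yet known.
• Hence most assemblers make two passes over the source program: the first pass scans the source for label definitions and assigns addresses, and the second pass does the actual translation.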
• The assembler must also process statements called assembler directives or
pseudo instructions which are not translated into machine instructions.
• Instead they provide instructions to the assembler itself.
• Examples: RESB and RESW instruct the assembler to reserve memory locations
without generating data values.
• The assembler must write the generated object code onto some output device.
• This object program will later be loaded into memory for execution.
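The way two passes resolve a forward reference can be shown with a very small sketch. The Python below is not a full SIC assembler: it assumes every instruction is 3 bytes, supports only the STL opcode (14, as in the text) and the RESW and RESB directives, and uses a hypothetical filler statement (BUFFER RESW 16) so that RETADR lands at address 1033. The tuple format and the function name `assemble` are also assumptions.

```python
OPTAB = {"STL": 0x14}            # mnemonic -> opcode (only what the example needs)


def assemble(lines, start):
    """lines is a list of (label, operation, operand); returns (symtab, object_codes)."""
    # Pass 1: walk the program once, assigning an address to every label.
    symtab, loc = {}, start
    for label, op, operand in lines:
        if label:
            symtab[label] = loc
        if op == "RESW":
            loc += 3 * int(operand)      # reserve words (3 bytes each)
        elif op == "RESB":
            loc += int(operand)          # reserve bytes
        else:
            loc += 3                     # every SIC instruction is 3 bytes
    # Pass 2: translate; every symbol, including forward references, is now known.
    object_codes = []
    for label, op, operand in lines:
        if op in OPTAB:
            object_codes.append("{:02X}{:04X}".format(OPTAB[op], symtab[operand]))
    return symtab, object_codes


source = [
    ("FIRST", "STL", "RETADR"),      # forward reference to RETADR
    ("BUFFER", "RESW", "16"),        # hypothetical filler: 48 (hex 30) bytes
    ("RETADR", "RESW", "1"),
]
symtab, object_codes = assemble(source, start=0x1000)
print(hex(symtab["RETADR"]), object_codes)
```

Directives such as RESW change the location counter in pass 1 but produce no object code in pass 2, which is the behaviour described for assembler directives above.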
Object Program
• Object program format contains three types of records:
• Header record: Contains the program name, starting address and length.
• Text record: Contains the machine code and data of the program.
• End record: Marks the end of the object program and specifies the address
in the program where execution is to begin.
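The three record types can be produced with simple string formatting. The sketch below assumes the fixed-column layout commonly used in textbook presentations of SIC object programs (Header: H, 6-character name, 6 hex digits of start address, 6 hex digits of length; Text: T, 6 hex digits of start address, 2 hex digits of byte count, then the object code; End: E, 6 hex digits of the first executable address). That layout, the function name `object_program`, and the sample values are assumptions for illustration.

```python
def object_program(name, start, length, text_chunks, first_exec):
    """Return the Header, Text and End records as a list of strings.

    text_chunks is a list of (address, hex_string) pairs; each hex string
    holds the object code for one Text record (two hex digits per byte).
    """
    header = "H{:<6}{:06X}{:06X}".format(name[:6], start, length)
    texts = [
        "T{:06X}{:02X}{}".format(addr, len(code) // 2, code)
        for addr, code in text_chunks
    ]
    end = "E{:06X}".format(first_exec)
    return [header, *texts, end]


records = object_program(
    name="COPY",                     # illustrative program name
    start=0x1000,
    length=0x107A,                   # illustrative length in bytes
    text_chunks=[(0x1000, "141033")],
    first_exec=0x1000,
)
for record in records:
    print(record)
```

A loader reading these records would use the Header record to find where the program goes, copy each Text record's bytes to the stated address, and finally jump to the address in the End record.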
THANK YOU
