CHAPTER 2
ASSEMBLER
Role of Assembler
Source Program -> Assembler -> Object Code
General Design Procedure
In general, the following procedure is used to design
an assembler:
1. Define the problem statement
2. Define the data structures
3. Define the format of the data structures
4. Define the algorithm
5. Check for modularity
6. Repeat steps 1 to 5 for all modules.
Forward Reference Problem
In an assembly language program we can use symbols,
which are names associated with data or
instructions.
A symbol may be referenced before
it is defined. This is called a forward reference.
One approach to solving this problem is to make two
passes over the source program. The first pass
defines the symbols, and the second pass uses the
symbol addresses to generate the code.
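The two-pass idea can be sketched in a few lines of Python. This is only a toy model, not the IBM 360 design: every statement is assumed to occupy one word, and the function name `assemble_two_pass` and the tuple layout `(label, opcode, operand)` are made up for illustration.

```python
def assemble_two_pass(lines):
    """Toy two-pass assembly: every statement occupies one word (assumption)."""
    # Pass 1: record the address of every label.
    symbols = {}
    for address, (label, _opcode, _operand) in enumerate(lines):
        if label:
            symbols[label] = address

    # Pass 2: all symbols are known now, so every operand can be resolved.
    resolved = []
    for _label, opcode, operand in lines:
        target = symbols.get(operand, operand)
        resolved.append((opcode, target))
    return symbols, resolved


program = [
    (None, "JMP", "DONE"),   # DONE is referenced here ...
    (None, "ADD", "X"),
    ("DONE", "HLT", None),   # ... but defined only here
    ("X", "DC", 5),
]
symbols, code = assemble_two_pass(program)
```

Because Pass 1 has already seen every label, the forward reference to DONE in the first statement resolves in Pass 2 without any special handling.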
Design of two pass assembler
IBM 360/370 System
Elements of Assembly language
Mnemonic operation codes: A mnemonic is the symbolic name given to
each machine instruction. It eliminates the need to
memorize the numeric op-codes.
Pseudo-op: These are instructions to the assembler that are
acted on during the assembly of the program.
Machine-op: These are the actual machine instructions.
The general format of an assembly language statement is:
[label] <op-code> operand(s) ;
Symbols: These are names associated with data
or instructions. These names can be used as
operands in a program.
Literal: It is an operand with the syntax
=<value>.
The assembler creates a data area for literals that contains
the constant values.
E.g. =F'10'
Location Counter: Holds the address of the
instruction currently being assembled.
Meaning of some pseudo-ops
START: It indicates the start of the source program.
END: It indicates the end of the source program.
EQU: It associates a symbol with an address specification.
<symbol> EQU <address spec>
USING: It tells the assembler which register is used as the
base register and what its contents are.
DROP: It makes the base register unavailable.
LTORG: It tells the assembler to place all literals collected so far
at this point in the program instead of at the end.
DC: Define constant
DS: Define storage
Addressing scheme
A 1,250(0,15)
offset = 250, index register = 0, base register = 15
This is an add instruction which adds a number to the
contents of register 1. The location of the number is
calculated as:
location = offset + contents of index register
+ contents of base register
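The address calculation above can be tried out with a small Python sketch. The function name `effective_address` and the register contents are assumptions made for illustration. The sketch also reflects the IBM 360 convention that register number 0 in the index or base position means "no register", so it contributes nothing to the sum.

```python
def effective_address(offset, index_reg, base_reg, registers):
    """location = offset + contents of index register + contents of base register."""
    # Register number 0 in the index or base position means "no register".
    index_part = registers[index_reg] if index_reg != 0 else 0
    base_part = registers[base_reg] if base_reg != 0 else 0
    return offset + index_part + base_part


registers = [0] * 16
registers[15] = 1000   # assumed contents of base register 15

# A 1,250(0,15): offset 250, no index register, base register 15
addr = effective_address(250, 0, 15, registers)
```

With the assumed contents of register 15, the operand of the A instruction is taken from location 1250.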
5 Instruction Formats
RR (register-register)
RX (register-indexed)
RS (register-storage)
SI (storage-immediate)
SS (storage-storage)
Two passes
Pass 1: Define symbols and literals
    Find the length of machine instructions
    Maintain the location counter
    Remember the values of symbols until Pass 2
    Process some pseudo-ops
    Remember literals
Pass 2: Generate the object program
    Look up the values of symbols
    Generate instructions
    Generate data
    Process pseudo-ops
Pass 1 requires the following databases:
Source program
Location counter (LC), which stores the location of each
instruction
Machine Operation Table (MOT): This table gives the
symbolic mnemonic for each instruction and its length.
Pseudo Operation Table (POT): This table gives the
symbolic mnemonic and the action to be taken for each pseudo-op in
Pass 1.
Symbol Table (ST), which stores each label along with its
value.
Literal Table (LT), which stores each literal and its
corresponding address.
A copy of the input, which will be used by Pass 2.
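A minimal Python sketch of how Pass 1 might use these databases is given below. It is a simplification under stated assumptions, not the actual IBM 360 assembler: the MOT holds only a few mnemonics with their lengths, every DC, DS and literal is assumed to occupy one fullword, alignment is ignored, and the name `pass1` and the statement layout `(label, op, operand)` are hypothetical.

```python
# Hypothetical MOT subset: mnemonic -> instruction length in bytes.
MOT = {"L": 4, "A": 4, "ST": 4, "AR": 2}


def pass1(statements):
    """Build the Symbol Table and Literal Table while maintaining the LC."""
    lc = 0
    symtab, littab = {}, {}
    pending = []  # literals seen but not yet assigned an address
    for label, op, operand in statements:
        if op == "START":
            lc = int(operand or 0)
        if label:
            symtab[label] = lc
        if op in ("LTORG", "END"):
            # Place every pending literal at the current location.
            for literal in pending:
                littab[literal] = lc
                lc += 4  # assumption: every literal is one fullword
            pending.clear()
        elif op in MOT:
            if operand and "=" in operand:
                literal = operand[operand.index("="):]
                if literal not in littab and literal not in pending:
                    pending.append(literal)
            lc += MOT[op]
        elif op in ("DC", "DS"):
            lc += 4  # assumption: one fullword per DC/DS
    return symtab, littab


program = [
    ("PROG", "START", "0"),
    (None, "L", "1,FOUR"),
    (None, "A", "1,=F'10'"),
    (None, "ST", "1,TEMP"),
    ("FOUR", "DC", "F'4'"),
    ("TEMP", "DS", "1F"),
    (None, "END", None),
]
symtab, littab = pass1(program)
```

Here the three machine instructions advance the LC to 12, so FOUR is defined at 12 and TEMP at 16, and the literal =F'10' is placed at 20 when END is reached.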
Format of databases
Machine-op Table (MOT)
Pseudo-op Table (POT)
Symbol Table (ST)
Literal Table (LT)
Flowchart
Example
Pass 2 requires the following databases:
Copy of the source program
Location counter (LC)
Machine Operation Table (MOT)
Pseudo Operation Table (POT)
Symbol Table (ST) generated by Pass 1
Base Table (BT), which indicates which register is used as the
base register and what its contents are.
A work space (INST), which holds each instruction as its
various parts are assembled.
A work space (PRINTLINE), which is used to produce a printed listing.
A work space (PUNCH CARD), which is used to convert the
assembled instructions into the format needed by the loader before output.
An output deck of assembled instructions in the format needed
by the loader.
Base Table
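As a rough illustration of how Pass 2 could consult the Base Table, the Python sketch below turns a symbol's address into a (base register, displacement) pair. The dictionary layout and the name `base_displacement` are assumptions. The 0 to 4095 range reflects the 12-bit displacement field of IBM 360 instructions, and picking the smallest displacement is just one reasonable policy.

```python
def base_displacement(address, base_table):
    """Pick an available base register that gives a displacement in 0..4095."""
    best = None
    for register, contents in base_table.items():
        displacement = address - contents
        if 0 <= displacement <= 4095:
            if best is None or displacement < best[1]:
                best = (register, displacement)
    if best is None:
        raise ValueError("address is not addressable from any base register")
    return best


# Assumed Base Table: register number -> contents promised by USING.
base_table = {15: 0, 12: 4096}

near = base_displacement(100, base_table)
far = base_displacement(5000, base_table)
```

Address 100 is reached through register 15, while address 5000 is out of range for register 15 and is reached through register 12 with displacement 904.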
Flowchart
Design of one pass assembler
IBM PC
Intel 8088
One pass processing
Analysis Phase
Isolate the label, mnemonic opcode and operand fields
If a label is present, enter (symbol, LC contents) in the
Symbol Table
Perform LC processing
Synthesis Phase
Obtain machine opcode
Obtain address from symbol table
Addressing
Segment based addressing scheme is used.
Code segment(CS)
Data segment(DS)
Stack segment(SS)
Extra Segment(ES)
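On the 8088, a segment register value and a 16-bit offset are combined into a 20-bit physical address by shifting the segment value left by four bits and adding the offset. The short Python sketch below shows the arithmetic; the function name `physical_address` is made up for illustration.

```python
def physical_address(segment, offset):
    """8088 real-mode address: segment * 16 + offset, kept to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


addr = physical_address(0x1234, 0x0010)      # 0x12340 + 0x0010
wrapped = physical_address(0xFFFF, 0x0010)   # exceeds 20 bits and wraps around
```

The mask models the 20 address lines of the 8088, so a sum beyond 0xFFFFF wraps around to low memory.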
Assembler directives
1. EQU: It associates a symbol with an address
specification.
<symbol> EQU <address spec>
2. ORG: It sets the location counter to the specified address.
ORG <address spec>
3. ASSUME: This directive tells the assembler which
segment register contains the segment base.
ASSUME <register>:<segment name>
4. SEGMENT: It indicates the start of a segment.
5. ENDS: It indicates the end of a segment.
Databases required
1. Source program
2. Mnemonic Operation Table (MOT), which gives
the symbolic mnemonic for each instruction.
3. Symbol Table (ST), which stores each label along with its
relevant information.
4. Segment Register Table (SRTAB), which stores
information about segment names and segment registers.
5. Forward Reference Table (FRT), which stores information
about forward references.
6. Cross Reference Table (CRT), which lists all references to
a symbol in ascending order of statement number.
Mnemonic Table (MOT)
Mnemonic op-code (6)   Machine op-code (2)   Alignment/format information (1)   Routine id (4) binary
JNE                    75H                   00H                                R2
Symbol Table
Segment Register Table Array (SRTAB)
Forward Reference Table (FRT)
Pointer (2)   SRTAB # (1)   Instruction address (2)   Usage code (1)   Source statement # (2)
Cross reference table (CRT)
Pointer to next entry (2)
Source statement # (2)
The stepwise processing is as
follows:
Initialize the parameters: LC=0, size=0, srtab_no=1,
SYMTAB_segment_entry=0;
clear ERRTAB and SRTAB_ARRAY.
Read a statement from the source program.
Examine the op-code field to check whether it is a pseudo-op or a machine-op.
If it is a machine-op, search the MOT to find a match for the op-code and call
the appropriate routine.
Each type of statement requires different processing. The statements are processed
in the following way.
If it is an EQU pseudo-op then
    Evaluate the expression in the operand field
    Make an entry for the label in SYMTAB
    set offset = value of operand
    Enter stmt_no in the CRT list of the label in the operand field
    Process forward references to the label
    size=0
If it is an ASSUME statement then
    Create a new SRTAB and make an entry pairing the segment
    register with the SYMTAB_segment_entry of the
    segment name in the operand field
    srtab_no = srtab_no+1
    size=0
If it is a SEGMENT statement then
    Make an entry for the label in SYMTAB with
    segment_name=true
    size=0
    LC=0
    SYMTAB_segment_entry = entry number in SYMTAB
If it is an ENDS statement then SYMTAB_segment_entry=0
If it is a DC statement then
    Align LC according to the specification in the operand field
    Assemble the constant, if any
    size = size of memory required
If it is an imperative statement then
    If the operand is a symbol then make an entry in the CRT
    If the operand symbol is already defined then check its alignment
    and addressability and generate the address specification for the
    symbol using its SYMTAB entry
    else make an appropriate entry for the symbol in SYMTAB
    Assemble the instruction in machine_code_buffer
    size = size of instruction
If size != 0 then
    If a label is present then make an appropriate entry for
    the symbol in SYMTAB with the current LC
    Move the contents of machine_code_buffer to address
    code_area_address
    code_area_address = code_area_address + size
    Process forward references for the symbol
    Enter errors in ERRTAB
    List the statements with the errors contained in ERRTAB
    Clear ERRTAB
If it is an END statement then
    Report undefined symbols from SYMTAB
    Produce the cross reference listing
    Write code_area into the output file
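The core of the one-pass scheme, recording forward references and patching them once the symbol is defined, can be sketched in Python as follows. This is a toy model under stated assumptions, not the 8088 assembler itself: each statement occupies one slot, segments, the SRTAB and the CRT are left out, and the name `assemble_one_pass` and the statement layout `(label, opcode, operand)` are hypothetical.

```python
def assemble_one_pass(statements):
    """Toy one-pass assembly with a forward reference table (FRT)."""
    symtab = {}   # symbol -> address
    frt = {}      # symbol -> code positions still waiting for its address
    code = []     # assumption: each statement occupies one slot
    for label, opcode, operand in statements:
        lc = len(code)
        if label:
            symtab[label] = lc
            # Process forward references to the newly defined label.
            for position in frt.pop(label, []):
                code[position] = (code[position][0], lc)
        if isinstance(operand, str):
            if operand in symtab:
                operand = symtab[operand]
            else:
                # Not defined yet: remember where the address must go.
                frt.setdefault(operand, []).append(lc)
                operand = None
        code.append((opcode, operand))
    undefined = sorted(frt)   # reported when END is reached
    return code, symtab, undefined


program = [
    (None, "JNE", "SKIP"),      # forward reference to SKIP
    (None, "MOV", "COUNT"),     # forward reference to COUNT
    ("SKIP", "HLT", None),
    ("COUNT", "DB", 0),
    (None, "JMP", "NOWHERE"),   # never defined
]
code, symtab, undefined = assemble_one_pass(program)
```

The operand slots of JNE and MOV are filled in as soon as SKIP and COUNT are defined, while NOWHERE stays in the FRT and would be reported as an undefined symbol at END.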