Compiler Design and Construction
Code Generation
Pop Quiz/Review
What options do we have for generating code?
If we choose IR, what options do we have for IR?
2 Code Generation April, 2011
Intermediate Code Generation
A OR B AND NOT C
t1 = NOT C
t2 = B AND t1
t3 = A OR t2
Syntax tree: OR(A, AND(B, NOT(C)))
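The tuples above fall out of a post-order walk of the syntax tree. The following is a minimal Python sketch of that idea, not the course's generator: the nested-tuple tree encoding and the name `gen_tac` are assumptions made for illustration.

```python
from itertools import count

def gen_tac(node, code, temps):
    """Post-order walk: emit code for the operands first, then the operator."""
    if isinstance(node, str):              # leaf: a variable name
        return node
    op, *kids = node
    args = [gen_tac(k, code, temps) for k in kids]
    t = f"t{next(temps)}"                  # fresh temporary for this result
    if len(args) == 1:                     # unary operator such as NOT
        code.append(f"{t} = {op} {args[0]}")
    else:                                  # binary operator such as AND / OR
        code.append(f"{t} = {args[0]} {op} {args[1]}")
    return t

# A OR B AND NOT C, with NOT binding tightest and OR loosest
tree = ("OR", "A", ("AND", "B", ("NOT", "C")))
code = []
gen_tac(tree, code, count(1))
print("\n".join(code))
```

Operands are visited before their operator, so the deepest subtree (NOT C) gets the first temporary.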
Intermediate Code Generation
IF A < B AND C < D
A = C * 10 + D
Syntax tree: IF(AND(<(A, B), <(C, D)), =(A, +(*(C, 10), D)))
t1 = A < B
t2 = C < D
t3 = t1 AND t2
if t3 goto true1
goto endif1
true1: t4 = C * 10
t5 = t4 + D
A = t5
endif1:
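The layout of the conditional (jump to the then-part when the test holds, otherwise jump past it) can be produced by one small helper. This is a sketch under assumptions: `gen_if` is a hypothetical name, the conditional jump is written `if t goto L`, and labels sit on their own lines.

```python
def gen_if(cond_temp, then_code, n):
    """Lay out an IF with no ELSE: jump to the then-part if the test holds,
    otherwise jump over it. n makes the labels unique."""
    return [
        f"if {cond_temp} goto true{n}",
        f"goto endif{n}",
        f"true{n}:",
        *then_code,
        f"endif{n}:",
    ]

cond = ["t1 = A < B", "t2 = C < D", "t3 = t1 AND t2"]   # evaluates the test into t3
body = ["t4 = C * 10", "t5 = t4 + D", "A = t5"]          # A = C * 10 + D
ir = cond + gen_if("t3", body, 1)
print("\n".join(ir))
```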
Intermediate Code Generation
while a < b do
if c < d then x = y + 2
else x = y - 2
Syntax tree: while(<(a, b), if(<(c, d), =(x, +(y, 2)), =(x, -(y, 2))))
0 t1 = a < b
1 if t1 goto 3
2 goto 11
3 t2 = c < d
4 if t2 goto 8
5 t3 = y - 2
6 x = t3
7 goto 10
8 t4 = y + 2
9 x = t4
10 goto 0
Generating Code via Macro Expansion
Macroexpand each IR tuple or subtree
(ADDI,Addr(b),Addr(c),t1)
lw $t0, b
lw $t1, c
add $t2, $t0, $t1
Macro expansion gives poor-quality code if each tuple is expanded separately
Each expansion ignores state (values already loaded in registers)
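A table-driven expander makes the weakness easy to see. The sketch below is illustrative only: the `TEMPLATES` table, the `MULI` opcode name, and the fixed use of `$t0`-`$t2` are assumptions, and each template also stores its result back to memory.

```python
# One fixed instruction template per IR opcode (MULI is a hypothetical opcode name).
TEMPLATES = {
    "ADDI": ["lw $t0, {a}", "lw $t1, {b}", "add $t2, $t0, $t1", "sw $t2, {d}"],
    "MULI": ["lw $t0, {a}", "lw $t1, {b}", "mul $t2, $t0, $t1", "sw $t2, {d}"],
}

def expand(tuples):
    """Expand each (op, src1, src2, dest) tuple on its own, with no memory of
    what earlier expansions left in registers."""
    out = []
    for op, a, b, d in tuples:
        out += [line.format(a=a, b=b, d=d) for line in TEMPLATES[op]]
    return out

# A := B + C;  D := A * C;
asm = expand([("ADDI", "B", "C", "A"), ("MULI", "A", "C", "D")])
print("\n".join(asm))
```

The second expansion reloads A and C even though both values are still sitting in registers, which is exactly the "ignoring state" problem.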
Generating Code via Macro Expansion
A := B+C;
D := A * C;
lw $t0, B
lw $t1, C
add $t2, $t0, $t1
sw $t2, A
lw $t0, A
lw $t1, C
mul $t2, $t0, $t1
sw $t2, D
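One way to recover from the redundant loads is a pass that remembers which register currently holds each variable. The following is a rough sketch, assuming straight-line code, plain labels as memory operands (no `offset($reg)` forms), and the destination as the first operand of every instruction except `sw`; `reuse_loaded_values` is a hypothetical name.

```python
def reuse_loaded_values(asm):
    """Drop or shorten loads of variables whose value is already in a register."""
    holds = {}                                  # variable -> register holding its value
    out = []
    for line in asm:
        op, rest = line.split(None, 1)
        args = [a.strip() for a in rest.split(",")]
        if op == "lw" and args[1] in holds:
            src = holds[args[1]]
            if src == args[0]:
                continue                        # value is already in that register
            line, op, args = f"move {args[0]}, {src}", "move", [args[0], src]
        if op == "sw":
            holds[args[1]] = args[0]            # the stored register mirrors the variable
        else:
            dest = args[0]
            # dest is overwritten, so forget anything we believed it held
            holds = {v: r for v, r in holds.items() if r != dest}
            if op == "lw":
                holds[args[1]] = dest
        out.append(line)
    return out

naive = [
    "lw $t0, B", "lw $t1, C", "add $t2, $t0, $t1", "sw $t2, A",
    "lw $t0, A", "lw $t1, C", "mul $t2, $t0, $t1", "sw $t2, D",
]
better = reuse_loaded_values(naive)
print("\n".join(better))
```

On this input the reload of A becomes a register move and the reload of C disappears, leaving seven instructions instead of eight.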
Generating Code via Macro Expansion
D := (B+C)*C;
t1 = B + C
    lw $t0, B
    lw $t1, C
    add $t2, $t0, $t1
    sw $t2, t1
t2 = t1 * C
    lw $t0, t1
    lw $t1, C
    mul $t2, $t0, $t1
    sw $t2, t2
D = t2
    lw $t0, t2
    sw $t0, D
Generating Code via Macro Expansion
Macro expansion gives poor-quality code if each tuple is expanded separately
Each expansion ignores state (values already loaded in registers)
What if more than one tuple can be replaced with a single instruction?
Powerful addressing modes
Powerful instructions
e.g., a loop construct that decrements, tests, and jumps if necessary
Register and Temporary Management
Efficient use of registers
Values used often remain in registers
Temp values reused if possible
Define register classes
  Allocatable: explicitly allocated and freed
  Reserved: have a fixed function
  Volatile: can be used at any time; good for temp values (A := B)
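For the allocatable class, "explicitly allocated and freed" can be as simple as a free list. This is a minimal sketch, assuming `$t0`-`$t7` form the allocatable pool; the class name `RegisterPool` and its methods are hypothetical.

```python
class RegisterPool:
    """Free-list manager for the allocatable register class."""

    def __init__(self, names):
        self.free = list(names)
        self.in_use = set()

    def alloc(self):
        if not self.free:
            # A real generator would spill a temporary to memory here.
            raise RuntimeError("out of registers: a temporary must be spilled")
        reg = self.free.pop(0)
        self.in_use.add(reg)
        return reg

    def release(self, reg):
        self.in_use.remove(reg)
        self.free.insert(0, reg)      # reuse the most recently freed register first

pool = RegisterPool([f"$t{i}" for i in range(8)])
a = pool.alloc()
b = pool.alloc()
pool.release(a)
c = pool.alloc()
print(a, b, c)
```

Releasing a register as soon as its temporary is dead is what lets later temporaries reuse it.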
Temporaries
Usually in registers (for quick access)
Storage temporaries
Reside in memory
Save registers or large data objects
Pseudoregisters
Load into volatile, then save back out
Generates poor code
Moving a temporary from a register to a pseudoregister is called spilling
Code Generation
A separate generator for each tuple
Modularized
Simpler
Harder to generate good code
Easy to add to yacc!
A single generator
More complex
Code Generation
Instruction selection
Addressing modes, intermediates
R-R, R-M, M-M, RI...
Address-mode selection
Remember all the types!
Register allocation
These are tightly coupled
Address-mode affects instruction
Instruction can affect register
See handout for a “+” code generator
Doesn't handle 0 or the same operand twice
Expressions in YACC
expression : operand mathop operand {
        if (CheckType($2, $1, $3)) yyerror("operand mismatch");
        emit($2, $1, $3);
    }
    ;
operand : INTCONST { $<e.type>$ = TY_INT; }
    | FLCONST { $<e.type>$ = TY_FLT; }
    | ID { SYMTAB *p = symLookUP($1);
           if (p) {emit(lw)??}
           else yyerror2("error: %s undefined", $1);
         }
    ;
Code Generation
IF A < B THEN thenPart ELSE elsePart END IF;
blt $t0, $t1, _then
j _else
_then: thenPart
j _endif
_else: elsePart
_endif:
Code Generation
IF A < B THEN thenPart ELSE elsePart END IF;
Inverting the test removes one jump:
bge $t0, $t1, _else
_then: thenPart
j _endif
_else: elsePart
_endif:
Compare with the direct translation:
blt $t0, $t1, _then
j _else
_then: thenPart
j _endif
_else: elsePart
_endif:
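A generator for the inverted-test form needs fresh label numbers so that nested or repeated IFs do not collide. The following sketch assumes the operands are already in registers and that the condition is always `<`; `gen_if_else` and the counter starting at 24 are illustrative choices.

```python
from itertools import count

def gen_if_else(labels, r1, r2, then_code, else_code):
    """Emit MIPS for: IF r1 < r2 THEN then_code ELSE else_code END IF.
    The test is inverted (bge) so the then-part can simply fall through."""
    n = next(labels)                  # unique suffix for this IF's labels
    return [
        f"bge {r1}, {r2}, _else{n}",
        *then_code,
        f"j _endif{n}",
        f"_else{n}:",
        *else_code,
        f"_endif{n}:",
    ]

labels = count(24)
code = gen_if_else(labels, "$t0", "$t1", ["# thenPart"], ["# elsePart"])
print("\n".join(code))
```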
Code Generation
IF A < B THEN A := C * 10;
lw $t0, A
lw $t1, B
bge $t0, $t1, _else24
lw $t2, C
li $t3, 10
mul $t4, $t2, $t3
sw $t4, A
_else24:
Code Generation
IF A < B THEN A := C * 10; ELSE A := C*9; END IF;
lw $t0, A
lw $t1, B
bge $t0, $t1, _else24
lw $t2, C
li $t3, 10
mul $t4, $t2, $t3
sw $t4, A
j _endif24
_else24: lw $t5, C
li $t6, 9
mul $t7, $t5, $t6
sw $t7, A
_endif24:
Code Generation: Declarations
%type <type> type
%token <strval> VARIABLE
%%
decls : type varlist SEMI | type varlist SEMI decls | ;
varlist : VARIABLE {
        /* $0 is the value of the symbol just before varlist, i.e. the type */
        addST($1, $<type>0);
        printf("%s: %s\n", $1, ($<type>0 == TY_FLT) ? ".float 0.0" : ".word 0");
    }
    | varlist COMMA VARIABLE {
        addST($3, $<type>0);
        printf("%s: %s\n", $3, ($<type>0 == TY_FLT) ? ".float 0.0" : ".word 0");
    }
    ;
type : FLOAT { $$ = TY_FLT; }
    | INTEGER { $$ = TY_INT; }
    ;
A Complete Example
Source program:
PROGRAM
INTEGER B[15];
BEGIN
B[8] := 19;
END
Generated MIPS code:
.data
B: .word 0:15
.text
.globl main
main:
li $t0, 8
li $t1, 19
la $t2, B # array base address
mul $t3, $t0, 4 # offset to element
add $t2, $t2, $t3 # address of element
sw $t1, 0($t2) # save rhs in lhs array
li $v0, 10
syscall # exit
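When the index is a compile-time constant, as in B[8], the generator can fold the offset computation itself and skip the `mul` and `add`. The following is a sketch of that variation, assuming 4-byte elements; `gen_array_store` is a hypothetical helper, not the generator that produced the code above.

```python
def gen_array_store(array, index, value, elem_size=4):
    """Emit MIPS for array[index] := value when index and value are constants."""
    offset = index * elem_size        # folded at compile time: 8 * 4 = 32
    return [
        f"li $t1, {value}",
        f"la $t2, {array}",           # array base address
        f"sw $t1, {offset}($t2)",     # base + constant offset addressing
    ]

code = gen_array_store("B", 8, 19)
print("\n".join(code))
```

For a variable index the offset is only known at run time, so the multiply and add shown in the slide are still needed.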
A Digression into MIPS32
Registers (MIPS)
32 registers provided (but not all 32 are freely usable!)
R0 .. R31
Register R0 is hard-wired to zero
Register R1 is reserved for assembler
Arithmetic instruction operands must be registers
Figure: register file r0 (= 0), r1, ..., r31, plus PC, lo, and hi
23 Xiaoyu Zhang, CSUSM CS 331
MIPS: Software conventions for Registers
Registers all have two names, e.g., $3 and $v1
Although you can do what you want, you should follow
these conventions:
Number   Name     Use
0        zero     constant 0
1        at       reserved for assembler
2-3      v0-v1    expression evaluation & function results
4-7      a0-a3    arguments
8-15     t0-t7    temporary: caller saves (callee can clobber)
16-23    s0-s7    local variables (callee must save)
24-25    t8-t9    temporary (cont'd)
26-27    k0-k1    reserved for OS kernel
28       gp       pointer to global area
29       sp       stack pointer
30       fp       frame pointer
31       ra       return address (HW)
Addressing Objects:
Endianness and Alignment
Big Endian: address of most significant byte = word address (xx00 = Big End of word)
IBM 360/370, Motorola 68k, MIPS, Sparc, HP PA
Little Endian: address of least significant byte = word address (xx00 = Little End of word)
Intel 80x86, DEC Vax, DEC Alpha (Windows NT)
Figure: byte numbering within a word, from msb to lsb: little endian 3 2 1 0, big endian 0 1 2 3
Alignment: require that objects fall on an address that is a multiple of their size
Figure: aligned vs. not aligned objects in memory
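Both definitions are easy to check by laying a word out as bytes. A short Python illustration follows; the sample value `0x0A0B0C0D` and the helper `is_aligned` are arbitrary choices for the demonstration.

```python
word = 0x0A0B0C0D                        # most significant byte is 0x0A

big = word.to_bytes(4, "big")            # byte 0 (the word address) holds the msb
little = word.to_bytes(4, "little")      # byte 0 (the word address) holds the lsb

def is_aligned(addr, size):
    """An object is aligned if its address is a multiple of its size."""
    return addr % size == 0

print(big.hex(), little.hex())
print(is_aligned(8, 4), is_aligned(6, 4))
```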
Memory Instructions
MIPS is RISC (a load/store architecture), so only load and store instructions access memory
lw $t1, offset($t0)
sw $t1, offset($t0)
Example:
C code: A[8] = h + A[8];
assume h in $s2 and base address of the array A in $s3
MIPS code: lw $t0, 32($s3)
add $t0, $s2, $t0
sw $t0, 32($s3)
Store word has destination last
Remember arithmetic operands are registers, not memory!
I/O Services
Service                        $v0      Argument(s)                                Results
Print integer                  1        $a0 = number to be printed
Print float                    2        $f12 = number to be printed
Print double                   3        $f12 = number to be printed
Print string                   4        $a0 = address of string in memory
Read integer                   5                                                   number returned in $v0
Read float                     6                                                   number returned in $f0
Read double                    7                                                   number returned in $f0
Read string                    8        $a0 = address of input buffer in memory,
                                        $a1 = length of buffer (n)
Sbrk                           9        $a0 = amount                               address in $v0
Exit                           10
Print character                11       $a0 = character to print
Read character                 12                                                  character read in $v0
File I/O operations            13 - 16
Exit2 (terminate with value)   17       $a0 = termination result
Print a string:
li $v0, 4 # system call print_str
la $a0, str # addr of string to print
syscall # print the string
Print an integer:
li $v0, 1 # system call print_int
li $a0, 5 # integer to print
syscall # print it
Hello World Assembly Program
.data # data segment
str: .asciiz "Hello world!!!\n"
.text # text segment (code)
.globl main # exports symbol "main"
main:
la $a0,str # put string address into a0
li $v0,4 # system call to print
syscall # out a string
li $v0,10
syscall # exit with no result
Labels: str, main. Directives: .data, .asciiz, .text, .globl
Some Useful MIPS Commands
Some register-register math commands
Note the operand order: CMD Destination, Operand1, Operand2
add $t0, $t1, $t2 # $t0 = $t1+$t2
sub $t0, $t1, $t2 # $t0 = $t1-$t2
mul $t0, $t1, $t2 # $t0 = $t1*$t2 note there could be overflow!
div $t0, $t1, $t2 # $t0 = $t1/$t2 note there could be overflow!
RISC machine, so memory can only be accessed with load/store commands
lw $t1, a_addr # $t1 = Mem[a_addr]
lw $s1, 8($s0) # $s1 = Mem[$s0+8]
sw $t1, a_addr # Mem[a_addr] = $t1
Sometimes you need an address
la $a0, addr # put address addr into $a0
Some Useful MIPS Commands
Those pesky immediates (constants)
li $a0, 12 # put immediate value of 12 into register $a0
mfhi $t0 # move contents from hi into $t0
Some Useful MIPS Commands
Branching
beqz $s0, label # if $s0 == 0 goto label
bnez $s0, label # if $s0 != 0 goto label
bgez $s0, label # if $s0 >= 0 goto label
bge $t0, $t1, label # if $t0 >= $t1 goto label pseudoinstruction
bgt $t0, $t1, label # if $t0 > $t1 goto label pseudoinstruction
ble $t0, $t1, label # if $t0 <= $t1 goto label pseudoinstruction
blt $t0, $t1, label # if $t0 < $t1 goto label pseudoinstruction
beq $t0, $t1, label # if $t0 == $t1 goto label
Spim, xspim, QtSpim
Use Appendix A as a Reference
Use of Registers
Example:
a = ( b + c) - ( d + e) ; // C statement
# $s0 - $s4 : a - e
add $t0, $s1, $s2
add $t1, $s3, $s4
sub $s0, $t0, $t1
a = b + A[4]; // add an array element to a var
// $s3 has the address of A; here a, b are in $s1, $s2
lw $t0, 16($s3)
add $s1, $s2, $t0
load and store
Ex:
a = b + A[i]; // A is in $s3, a,b, i in
// $s1, $s2, $s4
add $t1, $s4, $s4 # $t1 = 2 * i
add $t1, $t1, $t1 # $t1 = 4 * i
add $t1, $t1, $s3 # $t1 = addr. of A[i] ($s3+(4*i))
lw $t0, 0($t1) # $t0 = A[i]
add $s1, $s2, $t0 # a = b + A[i]
Making Decisions
Example
if (a != b) goto L1; // x,y,z,a,b mapped to $s0-$s4
x = y + z;
L1: x = x - a;
bne $s3, $s4, L1 # goto L1 if a != b
add $s0, $s1, $s2 # x = y + z (skipped if a != b)
L1: sub $s0, $s0, $s3 # x = x - a (always executed)
Reminder
Variables in C code: $s0 ... $s7 = $16 ... $23
Temporary variables: $t0 ... $t7 = $8 ... $15
Register $zero is always 0
if-then-else
Example:
if (a == b) x = y + z;
else x = y - z;
bne $s3, $s4, Else # goto Else if a!=b
add $s0, $s1, $s2 # x = y + z
j Exit # goto Exit
Else: sub $s0, $s1, $s2 # x = y - z
Exit:
Example: Loop with array index
Loop: g = g + A [i];
i = i + j;
if (i != h) goto Loop
....
$s1, $s2, $s3, $s4 = g, h, i, j, array A base = $s5
LOOP: add $t1, $s3, $s3 #$t1 = 2 * i
add $t1, $t1, $t1 #$t1 = 4 * i
add $t1, $t1, $s5 #$t1 = addr. of A[i]
lw $t0, 0($t1) #load A[i]
add $s1, $s1, $t0 #g = g + A[i]
add $s3, $s3, $s4 #i = i + j
bne $s3, $s2, LOOP
Loops
Example :
while ( A[i] == k ) // i,j,k in $s3, $s4, $s5
i = i + j; // A is in $s6
Loop: sll $t1, $s3, 2 # $t1 = 4 * i
add $t1, $t1, $s6 # $t1 = addr. Of A[i]
lw $t0, 0($t1) # $t0 = A[i]
bne $t0, $s5, Exit # goto Exit if A[i]!=k
add $s3, $s3, $s4 # i = i + j
j Loop # goto Loop
Exit:
Other decisions
Set R1 on R2 less than R3
slt R1, R2, R3
Compares two registers, R2 and R3
R1 = 1 if R2 < R3 else
R1 = 0 if R2 >= R3
Example
slt $t1, $s1, $s2
Branch less than
Example: if(A < B) goto LESS
slt $t1, $s1, $s2 #t1 = 1 if A < B
bne $t1, $0, LESS
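This slt/bne pair is also how an assembler can expand the `blt` pseudoinstruction, using the reserved register `$at` as the scratch destination. The following is a small Python model of both pieces; the function names `slt` and `expand_blt` are illustrative.

```python
def slt(r2, r3):
    """Model of slt: the result is 1 if r2 < r3, otherwise 0."""
    return 1 if r2 < r3 else 0

def expand_blt(rs, rt, label):
    """Expand 'blt rs, rt, label' into two real instructions.
    $at is the register reserved for the assembler."""
    return [
        f"slt $at, {rs}, {rt}",
        f"bne $at, $zero, {label}",
    ]

print(slt(3, 5), slt(5, 3), slt(4, 4))
print("\n".join(expand_blt("$s1", "$s2", "LESS")))
```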
Switch statement
switch(k) {
case 0 : f = i + j; break;
case 1 : f = g + h; break;
case 2 : f = g - h; break;
case 3 : f = i - j; break;
}
f-k in $s0-$s5 and $t2 contains 4 (the number of cases; valid k is 0 to 3)
The switch statement can be converted into a big chain of if-then-else statements.
A more efficient method is to use a jump table holding the addresses of the
alternative instruction sequences, together with the jr instruction.
Assume the table base address is in $t4
Switch cont.
slt $t3, $s5, $zero # is k < 0?
bne $t3, $zero, Exit # if k < 0, goto Exit
slt $t3, $s5, $t2 # is k < 4? (here $t2 = 4)
beq $t3, $zero, Exit # if k >= 4, goto Exit
sll $t1, $s5, 2 # $t1 = 4 * k
add $t1, $t1, $t4 # $t1 = addr. of $t4[k]
lw $t0, 0($t1) # $t0 = $t4[k]
jr $t0 # jump to addr. in $t0
# $t4[0]=&L0, $t4[1]=&L1, …,
L0: add $s0, $s3, $s4 # f = i + j
j Exit
L1: add $s0, $s1, $s2 # f = g + h
j Exit
L2: sub $s0, $s1, $s2 # f = g - h
j Exit
L3: sub $s0, $s3, $s4 # f = i - j
Exit:
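The same control structure (range check, then an indexed jump) can be modeled with a table of callables. This Python sketch mirrors the MIPS code rather than replacing it; the function name `switch` and passing the old value of f as a parameter are choices made for illustration.

```python
def switch(k, i, j, g, h, f):
    """Jump-table dispatch: bounds-check k, then index a table of alternatives."""
    table = [
        lambda: i + j,    # case 0 (L0)
        lambda: g + h,    # case 1 (L1)
        lambda: g - h,    # case 2 (L2)
        lambda: i - j,    # case 3 (L3)
    ]
    if k < 0 or k >= len(table):
        return f          # out of range: go straight to Exit, f unchanged
    return table[k]()     # one indexed lookup instead of a chain of compares

print(switch(3, i=5, j=2, g=10, h=4, f=0))
```

The cost of the dispatch stays constant as cases are added, whereas an if-then-else chain grows with the number of cases.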
Complex Arithmetic Example
z=(a*b)+(c/d)-(e+f*g);
lw $s0,a
lw $s1,b
mult $s0,$s1
mflo $t0
lw $s0,c
lw $s1,d
div $s0,$s1
mflo $t1
add $t0,$t0,$t1
lw $s0,e
lw $s1,f
lw $s2,g
mult $s1,$s2
mflo $t1
add $t1,$s0,$t1
sub $t0,$t0,$t1
sw $t0,z
CSCE 212 46
If-Statement
if ((a>b)&&(c==d)) e=0; else e=f;
lw $s0,a
lw $s1,b
bgt $s0,$s1,next0
b nope
next0: lw $s0,c
lw $s1,d
beq $s0,$s1,yup
nope: lw $s0,f
sw $s0,e
b out
yup: xor $s0,$s0,$s0
sw $s0,e
out: …
For Loop
for (i=0;i<a;i++) b[i]=i;
lw $s0,a
li $s1,0
loop0: blt $s1,$s0,loop1
b out
loop1: sll $s2,$s1,2
sw $s1,b($s2)
addi $s1,$s1,1
b loop0
out: …
Pre-Test While Loop
while (a<b) {
a++;
}
lw $s0,a
lw $s1,b
loop0: blt $s0,$s1,loop1
b out
loop1: addi $s0,$s0,1
sw $s0,a
b loop0
out: …
Post-Test While Loop
do {
a++;
} while (a<b);
lw $s0,a
lw $s1,b
loop0: addi $s0,$s0,1
sw $s0,a
blt $s0,$s1,loop0
…
Complex Loop
for (i=0;i<n;i++) a[i]=b[i]+10;
li $2,0 # zero out index register (i)
lw $3,n # load iteration limit
sll $3,$3,2 # multiply by 4 (words)
la $4,a # get address of a (assume < 2^16)
la $5,b # get address of b (assume < 2^16)
j test
loop: add $6,$5,$2 # compute address of b[i]
lw $7,0($6) # load b[i]
addi $7,$7,10 # compute b[i]+10
add $6,$4,$2 # compute address of a[i]
sw $7,0($6) # store into a[i]
addi $2,$2,4 # increment i (by 4 bytes, i.e. one word)
test: blt $2,$3,loop # loop if test succeeds