CSCI 2406 Computer Org. & Architecture
Machine Language
Assembly is Convenient
● Assembly language is convenient for humans to read.
● But digital circuits understand only 1’s and 0’s.
● A program written in assembly is translated to machine
language.
● ARM32 uses 32-bit instructions.
● ARM defines three main instruction formats:
○ Data-processing instructions
○ Memory instructions
○ Branch instructions
How to encode instructions?
● In ARM32, all instructions are 32 bits wide
● For design simplicity, a single instruction format is
preferable, but
○ Different instructions have different needs
○ Multiple instruction formats allow flexibility
■ ADD, SUB: use 3 register operands
■ LDR, STR: use 2 register operands and a constant
● The number of instruction formats is kept small
○ Smaller is faster
Interpreting Machine Code
● All three formats start with a 4-bit condition field and a
2-bit op.
● The best place to begin is to look at the op.
○ 00: a data-processing instruction.
○ 01: a memory instruction.
○ 10: a branch instruction.
● Based on that, the rest of the fields can be interpreted
correctly.
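A minimal Python sketch of this first decoding step, assuming cond sits in bits 31:28 and op in bits 27:26 of the 32-bit word. The function name classify is hypothetical; the three test words are the machine codes worked out later in these slides.

```python
def classify(word: int) -> tuple[int, str]:
    """Return (cond, format name) for a 32-bit ARM instruction word."""
    cond = (word >> 28) & 0xF   # bits 31:28, the condition field
    op = (word >> 26) & 0x3     # bits 27:26, the op field
    names = {0b00: "data-processing", 0b01: "memory", 0b10: "branch"}
    return cond, names.get(op, "other")


print(classify(0xE0865007))  # (14, 'data-processing')
print(classify(0xE405B01A))  # (14, 'memory')
print(classify(0xBA000003))  # (11, 'branch')
```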
Data-processing Instructions
Mathematical and Logical Operations
DP instructions
Examples include (each with an optional S suffix, written {S})
● ADD{S}
● SUB{S}
● MUL{S}
● etc.
● These operations have different formats:
○ add r0, r1, r2 or add r0, r1
○ add r0, #256 or add r0, r1, #256
○ but add r0, #48, #256 is not possible
Register naming in instructions
Consider add r0, r1, r2
In the instruction format these are referred to as:
● R0: Rd
● R1: Rn
● R2: Rm
Note that Rm is optional; when it is present, it is stored
in the Src2 field
Data Processing Instructions
● Data-processing instructions have
○ a destination register (Rd)
○ a 1st source register (Rn)
○ a 2nd source that is either an immediate or a register,
possibly shifted (Src2 )
● 6 fields: cond, op, funct, Rn, Rd, Src2
● op = 00 for data-processing instructions
Detail
cond – conditional execution based on flags
e.g., cond = 1110 for unconditional instructions
funct – function code
Rn – the 1st source register
Rd – the destination register
funct and Src2 Interpretation
Some of the Operations
● A subset of the shift operations is also shown
● Note that multiply is missing.
● The implementation of mul requires a different approach,
discussed briefly later
S and Src2
● S = 1 sets the condition flags
● Src2 can be
○ an immediate
○ a register Rm optionally shifted by a constant shamt5
○ a register Rm shifted by another register Rs
sh field encoding in src2
Immediate Values in Src2
Representation of immediates
Data-processing instructions have an unusual immediate representation
involving an 8-bit unsigned immediate, imm8, and a 4-bit rotation,
rot. imm8 is rotated right by 2 × rot to create a 32-bit constant.
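The rotation rule can be sketched in Python as follows. The helper names ror32 and decode_imm are hypothetical; the only assumption is the rule stated above (rotate imm8 right by 2 × rot within a 32-bit word).

```python
def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def decode_imm(rot: int, imm8: int) -> int:
    """Expand a 4-bit rot and 8-bit imm8 into the 32-bit constant."""
    return ror32(imm8 & 0xFF, 2 * (rot & 0xF))


print(decode_imm(0b1101, 0b01111101))  # 8000
print(decode_imm(0b1111, 0b11100001))  # 900
```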
Representation of immediates
This can lead to anomalies in representing immediates in
data-processing instructions
add r0, r0, #257 // this is illegal
add r0, r0, #8000 // this is fine
● #257 fails because we cannot represent it in 8 bits and
we cannot apply any ROR operation to get to it
● So why is #8000 perfectly fine?
#8000 Representation
● #8000 cannot be represented in 8 bits, but it can be produced by
rotating an 8-bit value within a 32-bit word
● We start with 01111101 (125)
● Then ror by rot = 1101
○ Which is really a rotation by 13 x 2 = 26 bits
○ What do we get?
00000000000000000000000001111101 (125)
00000000000000000001111101000000 (8000)
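Going the other way, an assembler has to find a (rot, imm8) pair for a given constant. The brute-force search below is only a sketch of that idea, not how any particular assembler is implemented; rol32 and encode_imm are hypothetical names. It tries each of the 16 rotations and returns None when no 8-bit value fits, which is what happens for #257.

```python
def rol32(value: int, amount: int) -> int:
    """Rotate a 32-bit value left by `amount` bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def encode_imm(value: int):
    """Return (rot, imm8) such that imm8 ror (2*rot) == value, or None."""
    value &= 0xFFFFFFFF
    for rot in range(16):
        imm8 = rol32(value, 2 * rot)  # undo a rotate-right by 2*rot
        if imm8 < 256:
            return rot, imm8
    return None


print(encode_imm(8000))  # (13, 125)
print(encode_imm(257))   # None
```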
Some Immediate Examples
add r0, r1, #8000
=> 1110 00 1 0100 0 0001 0000 1101 01111101
=> e2810d7d
sub r4, r7, #900
=> e2474fe1
=> 1110 00 1 0010 0 0111 0100 1111 11100001
=> 00000000 00000000 00000000 11100001 (imm8 = 225)
=> 00000000 00000000 00000011 10000100 (225 ror 30 = 900)
Example: add r5, r6, r7
● Consider the instruction add r5, r6, r7
● cond = 1110 (14) for unconditional execution
● op = 00 (0) for data-processing instructions
● cmd = 0100 (4) for ADD
● Src2 is a register so I=0
● Rd = 0101 , Rn = 0110 , Rm = 0111
● shamt = 0, sh = 0
Is this correct?
add r5, r6, r7 is 0xe0865007 according to CPUlator,
which is the same as our binary number:
1110 00 0 0100 0 0110 0101 00000 00 0 0111
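The field packing can be checked with a short Python sketch. It assumes the layout used in the binary above (cond, op, I, cmd, S, Rn, Rd, shamt5, sh, a 0 bit, Rm); the function name encode_dp_reg is hypothetical.

```python
def encode_dp_reg(cmd, rd, rn, rm, shamt5=0, sh=0, s=0, cond=0b1110):
    """Pack a data-processing instruction whose Src2 is Rm shifted by a constant."""
    return ((cond << 28) | (0b00 << 26)   # cond, op = 00
            | (0 << 25)                   # I = 0: Src2 is a register
            | (cmd << 21) | (s << 20)
            | (rn << 16) | (rd << 12)
            | (shamt5 << 7) | (sh << 5)
            | (0 << 4)                    # bit 4 = 0: shift amount is a constant
            | rm)


# add r5, r6, r7
print(hex(encode_dp_reg(cmd=0b0100, rd=5, rn=6, rm=7)))  # 0xe0865007
```

The same function covers shifts by a constant, since they only differ in cmd, shamt5 and sh.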
Another Example: ROR R3, R5, #21
● Rn and Rm are the first and second source operands,
respectively
● cond = 1110 (14) for unconditional execution
● op = 00 (0) for data-processing instructions
● cmd = 1101 (13), shared by MOV and all shifts (LSL, LSR, ASR, and ROR)
● Src2 is an immediate-shifted register so I=0
● Rd = 0011, Rn = 0000, Rm = 0101 (Rn is not used and should be 0)
● shamt5 = 10101 (21), sh = 11 (ROR)
Is this correct?
CPUlator confirms ror r3, r5, #21 is assembled to
0xe1a03ae5 which is the same as our binary number
1110 00 0 1101 0 0000 0011 10101 11 0 0101
Register shifted Register
For example, asr r5, r1, r12
Src2 is configured:
Rs = 1100, sh = 10, Rm = 0001 (Rn is not used and should be 0)
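A hedged Python sketch of this Src2 variant. It assumes Rs occupies bits 11:8, sh bits 6:5, bit 4 is 1, and the usual sh codes (00 LSL, 01 LSR, 10 ASR, 11 ROR); the names SH and encode_dp_reg_shift_reg are hypothetical.

```python
SH = {"lsl": 0b00, "lsr": 0b01, "asr": 0b10, "ror": 0b11}


def encode_dp_reg_shift_reg(cmd, rd, rn, rm, rs, sh, s=0, cond=0b1110):
    """Pack a data-processing instruction whose Src2 is Rm shifted by register Rs."""
    return ((cond << 28) | (0b00 << 26) | (0 << 25)
            | (cmd << 21) | (s << 20)
            | (rn << 16) | (rd << 12)
            | (rs << 8)        # Rs in bits 11:8, bit 7 stays 0
            | (sh << 5)
            | (1 << 4)         # bit 4 = 1: shift amount comes from a register
            | rm)


# asr r5, r1, r12 (cmd = 1101, Rn unused)
word = encode_dp_reg_shift_reg(cmd=0b1101, rd=5, rn=0, rm=1, rs=12, sh=SH["asr"])
print(hex(word))
```

The printed word can be compared against what CPUlator assembles for the same instruction.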
Src2 bit 4
The ARMv7 ISA uses the following convention for
bit 4 in Src2:
● Shift operations using immediates: 0
● Shift operations using registers: 1
● Other data operations using registers: 0
(these are encoded as an immediate shift with shamt5 = 0)
○ add r5, r6, r7
○ sub r8, r9, r10
mul operations
Multiply instructions use the encoding shown here. The
3-bit cmd field specifies the type of multiply
mul example (cmd = 000)
mul r0, r1, r2
1110 00 000000 0000 0000 0010 1001 0001
Rd=0000
Ra=0000
Rm=0010
Rn=0001
mla example (cmd = 001)
mla r0, r1, r2, r3
1110 00 000010 0000 0011 0010 1001 0001
Rd=0000
Ra=0011
Rm=0010
Rn=0001
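Both multiply examples can be reproduced with a small Python sketch. It assumes the field order shown in the binary above (cond, op = 00, two 0 bits, the 3-bit cmd, S, Rd, Ra, Rm, the fixed pattern 1001, Rn); encode_mul is a hypothetical name.

```python
def encode_mul(cmd, rd, rn, rm, ra=0, s=0, cond=0b1110):
    """Pack a multiply instruction; cmd = 000 for mul, 001 for mla."""
    return ((cond << 28) | (0b00 << 26)
            | (cmd << 21) | (s << 20)   # bits 25:24 stay 00
            | (rd << 16) | (ra << 12)
            | (rm << 8)
            | (0b1001 << 4)             # fixed marker for multiply
            | rn)


print(hex(encode_mul(0b000, rd=0, rn=1, rm=2)))        # mul r0, r1, r2
print(hex(encode_mul(0b001, rd=0, rn=1, rm=2, ra=3)))  # mla r0, r1, r2, r3
```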
mul does not allow immediates
With mul these types of instructions are not possible:
mul r0, r0, #8000
mul r0, r6, #256
Q: how can we get around this limitation?
Question 2 - Binary instruction
What is the binary instruction for
add r0, r0, r3, LSL r2
?
Question 1 - LSL and Reg-Reg Shift
Consider the following code:
mov r0, #10
mov r2, #4
mov r3, #10
add r0, r0, r3, LSL r2
What is in r0 when add finishes?
Question 3 - Rotate vs Shift
Why does ARM use rotate operations and not shift for
representing immediates?
Memory Instructions
LDRx / STRx
Memory Instructions
Memory instructions have three operands:
● a register that is the destination on an LDR and a
source on an STR
● a base register
● an offset that is either an immediate or an optionally
shifted register
Memory Instructions
● Memory instructions have the same six overall fields:
○ cond, op, funct, Rn, Rd, Src2
○ Rn – the base register
○ Src2 – the offset
○ Rd – the destination register in a load or the source
register in a store
● op is 01 for memory instructions
Memory Instructions
The offset is either a 12-bit unsigned immediate imm12 or a
register Rm that is optionally shifted by a constant shamt5.
e.g., ldr r1, [ r0, r2, lsl #4 ]
funct control bits
● funct is composed of six control bits: I-bar, P, U, B, W,
and L.
● The I-bar (immediate) and U (add) bits determine
whether the offset is an immediate or a register,
and whether it should be added or subtracted.
● Worked example:
○ str r11, [r5], #-26
str r11, [r5], #-26
● str is a memory instruction, so it has an op of 01
● According to the funct table, L = 0 and B = 0 for str (a word store).
● The instruction uses post-indexing, so P = 0 and W = 0.
● The offset is an immediate that is subtracted from the base, so
I-bar = 0 and U = 0
Quick check
CPUlator agrees with our calculations, and the machine
code is 0xe405b01a
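The worked example can be reproduced with a Python sketch for the immediate-offset case. It assumes funct is ordered I-bar, P, U, B, W, L from bit 25 down to bit 20, and that the offset is stored as a magnitude with U carrying the sign; encode_mem_imm and its parameter names are hypothetical.

```python
def encode_mem_imm(load, rd, rn, offset, pre_index, writeback,
                   byte=False, cond=0b1110):
    """Pack ldr/str with an immediate offset (I-bar = 0)."""
    imm12 = abs(offset)
    if imm12 > 0xFFF:
        raise ValueError("offset magnitude does not fit in imm12")
    i_bar = 0                          # 0: Src2 is an immediate
    p = 1 if pre_index else 0          # 0 for post-indexing
    u = 1 if offset >= 0 else 0        # 1: add offset, 0: subtract
    b = 1 if byte else 0
    w = 1 if writeback else 0
    l = 1 if load else 0
    funct = (i_bar << 5) | (p << 4) | (u << 3) | (b << 2) | (w << 1) | l
    return ((cond << 28) | (0b01 << 26) | (funct << 20)
            | (rn << 16) | (rd << 12) | imm12)


# str r11, [r5], #-26  (post-indexed, so P = 0 and W = 0)
print(hex(encode_mem_imm(load=False, rd=11, rn=5, offset=-26,
                         pre_index=False, writeback=False)))  # 0xe405b01a
```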
Negative numbers
● Immediates in DP/Memory instructions are not stored
in two's complement
● For example, -26 is stored as +26 (11010)
● With the U bit = 0, the offset is subtracted from the base
● For add operations
○ such as add r5, r6, #-25
○ the assembler converts this to sub r5, r6, #25
Question 4
What is the binary instruction for
ldr r1, [ r0, r2, lsl #4 ]
Check your answers in CPUlator
Branch Instructions
Branch instructions use a single 24-bit signed immediate
operand, imm24
● op is 10 for branch instructions
● The funct field is 2 bits.
● The upper bit of funct is always 1 for branches. The
lower bit, L, is 1 for BL and 0 for B
● The remaining 24-bit two’s complement imm24 field is
used to specify an instruction address relative to PC+8
Branch Example
Calculate the immediate field and show the machine code for the
branch instruction in the following assembly program.
0x80a0 blt there
0x80a4 add r0, r1, r2
0x80a8 sub r0, r0, r9
0x80ac add sp, sp, #8
0x80b0 mov pc, lr
there:
0x80b4 sub r0, r0, #1
0x80b8 add r3, r3, #0x5
BTA - Branch Target Address
The processor calculates the BTA from the instruction by
sign-extending the 24-bit immediate, shifting it left by 2 (to
convert words to bytes), and adding it to PC + 8.
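The three steps (sign-extend, shift left by 2, add PC + 8) can be sketched in Python. The function name branch_target is hypothetical; the input word 0xBA000003 is the machine code for the blt in this example, placed at address 0x80a0.

```python
def branch_target(branch_addr: int, word: int) -> int:
    """Compute the BTA for a branch instruction located at branch_addr."""
    imm24 = word & 0xFFFFFF
    if imm24 & 0x800000:          # sign bit set: extend to a negative value
        imm24 -= 1 << 24
    return branch_addr + 8 + (imm24 << 2)   # words -> bytes, relative to PC + 8


print(hex(branch_target(0x80A0, 0xBA000003)))  # 0x80b4
```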
Validate with cpulator
The results from CPUlator agree with our binary
/ hexadecimal code 0xBA000003
Another Example
Calculate the immediate field and show the machine code for the
branch instruction in the following assembly program.
test:
0x8040 ldrb r5, [r0, r3]
0x8044 strb r5, [r1, r3]
0x8048 add r3, r3, #1
0x804c mov pc, lr
0x8050 bl test
0x8054 ldr r3, [r1], #4
0x8058 sub r4, r3, #9
Moving the PC back
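bl is at 0x8050, so PC + 8 = 0x8058; the target 0x8040 is
0x18 = 24 bytes (6 words) back.
We need an immediate which is -6 (the BTA will be
computed from that)
Where -6 as a 24-bit two's complement value is
1111 1111 1111 1111 1111 1010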
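A Python sketch of the encoding direction, covering both the forward blt and the backward bl. It assumes the branch layout described earlier (cond, op = 10, funct = 1 followed by L, then imm24); encode_branch and the condition constants are hypothetical names.

```python
COND_LT = 0b1011   # condition code used by blt
COND_AL = 0b1110   # unconditional


def encode_branch(cond: int, link: int, branch_addr: int, target_addr: int) -> int:
    """Pack B/BL given the address of the branch and of its target."""
    offset_bytes = target_addr - (branch_addr + 8)   # relative to PC + 8
    imm24 = (offset_bytes >> 2) & 0xFFFFFF           # words, 24-bit two's complement
    return (cond << 28) | (0b10 << 26) | (1 << 25) | (link << 24) | imm24


print(hex(encode_branch(COND_LT, 0, 0x80A0, 0x80B4)))  # 0xba000003
print(hex(encode_branch(COND_AL, 1, 0x8050, 0x8040)))  # 0xebfffffa
```

Masking with 0xFFFFFF is what turns -6 into the 24-bit pattern shown above.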
Summary
We have looked at the generation of binary instructions for:
● data processing
● memory
● branch
From the binary we can easily generate the hexadecimal
code and from there validate with our assembler
Reverse Engineering
Of course, we can proceed in the other direction too
For example, we can parse:
● 1110 01 000000 0101 1011 000000011010
● to str r11, [r5], #-26
with a little bit of practice
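The parsing above can be sketched in Python for the simplest case: an unconditional ldr/str with an immediate offset. This is a deliberately partial decoder under those assumptions, not a full disassembler; decode_mem_imm is a hypothetical name.

```python
def decode_mem_imm(word: int) -> str:
    """Turn an unconditional immediate-offset ldr/str word back into assembly."""
    if (word >> 28) & 0xF != 0b1110:
        raise ValueError("this sketch only handles cond = 1110")
    if (word >> 26) & 0x3 != 0b01:
        raise ValueError("not a memory instruction")
    i_bar = (word >> 25) & 1
    p = (word >> 24) & 1
    u = (word >> 23) & 1
    b = (word >> 22) & 1
    w = (word >> 21) & 1
    l = (word >> 20) & 1
    if i_bar:
        raise ValueError("register offsets are not handled in this sketch")
    rn = (word >> 16) & 0xF
    rd = (word >> 12) & 0xF
    imm12 = word & 0xFFF
    name = ("ldr" if l else "str") + ("b" if b else "")
    sign = "" if u else "-"
    if p == 0:                                   # post-indexed
        return f"{name} r{rd}, [r{rn}], #{sign}{imm12}"
    if w:                                        # pre-indexed with writeback
        return f"{name} r{rd}, [r{rn}, #{sign}{imm12}]!"
    return f"{name} r{rd}, [r{rn}, #{sign}{imm12}]"


print(decode_mem_imm(0b1110_01_000000_0101_1011_000000011010))  # str r11, [r5], #-26
```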
Large Constants
A look at some exceptions to how src2 is arranged
mov r7, #257
Consider mov r7, #257
● rd = 0111
● rn = 0000 (not used, does not mean r0)
● 1110 00 1 1000 0 0000 0111 000100000001
The bit pattern for 257 can be accommodated in the 12
bits available in src2 (note - no ror here)
This is an exception to our understanding of how src2 is
arranged
mov r8, #65535
Now consider a large 16-bit constant: mov r8, #65535
● rd = 1000
● rn = 1111
○ used, but does not mean r15
○ the bit pattern 1111 is combined with src2
○ this gives us 16 bits, yielding 65535 (0xFFFF, which
would be -1 as a 16-bit two's complement value)
● 1110 00 1 1000 0 1111 1000 111111111111
This is an exception to our understanding of how src2 is
arranged
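Both large-constant examples follow the same pattern, which can be sketched in Python: the upper 4 bits of the 16-bit constant go in the Rn position and the lower 12 bits in src2. As far as I know this corresponds to the encoding ARM documents as MOVW, but treat that name as an assumption here; encode_mov16 is a hypothetical function name.

```python
def encode_mov16(rd: int, imm16: int, cond: int = 0b1110) -> int:
    """Pack mov Rd, #imm16 using the 16-bit constant layout shown above."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("constant does not fit in 16 bits")
    imm4 = (imm16 >> 12) & 0xF    # upper 4 bits, stored where Rn normally sits
    imm12 = imm16 & 0xFFF         # lower 12 bits, stored in src2 (no rotation)
    return ((cond << 28) | (0b00 << 26) | (1 << 25)
            | (0b1000 << 21) | (0 << 20)
            | (imm4 << 16) | (rd << 12) | imm12)


print(hex(encode_mov16(7, 257)))    # 0xe3007101
print(hex(encode_mov16(8, 65535)))  # 0xe30f8fff
```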
Question 5
● What is the assembly instruction for the following binary
code:
○ 10100101101100000011000000011000
○ 11100101001100000001000000010100
● What are the binary instructions for:
○ ldr r1, [ r4, #256 ]!
○ ldrne r4, [ r5, #-256 ]