Outline
1) Future Evolution of Information Technology
2) System-on-a-Chip Design
3) Design and Application of Cores
4) Analog and Mixed Signal Design (Prof. Berroth)
5) Test of Systems-on-a-Chip
Design and Application of Cores
Microprocessor cores (ARM)
On-chip buses
A dedicated core (multimedia?)
Major Microprocessor Core Vendors
Hard cores: ARM (Cambridge, UK) (32 bit)
Firm cores: MIPS (Mountain View, CA) (64 bit)
Soft cores: ARC (London, UK)
Tensilica (Santa Clara, CA)
SUN Microsystems (Mountain View, CA): picoJava core as synthesizable RTL
ARM System Design
History of ARM
ARM Instruction Set
Thumb Instruction Set
ARM Cores
ARM Cache
Modeling ARM CPUs
ARM Coprocessors
System Development (optional)
Reference: Steve Furber. ARM System Architecture. Addison-Wesley, 1996.
Acorn - a Computer Manufacturer
1983: Acorn Limited holds a dominant position in the UK personal computer market with the Rockwell 6502 (8-bit) CPU.
1983: The available 16-bit CISC CPUs are slower than standard memory parts and have long interrupt latencies.
1983-85: Acorn designs the first commercial RISC CPU: the Acorn RISC Machine (ARM).
1990: Advanced RISC Machines is formed to broaden the market beyond Acorn's product range.
ARM - Advanced RISC Machine
1990: Startup with 12 engineers and 1 CEO
No patents, no customers, very little money
Mid-1990s: T.I. licensed ARM7
Incorporated into a chip for mobile phones
IPO Spring 1998
13 millionaires
More Than CPU Core Development
"Design a circuit, license it and make millions" does not work! Also needed:
Support
Training
Marketing
Development tools
Design consulting
Architectural Inheritance from Berkeley RISC I
Used:
Load-store architecture
Fixed-length 32-bit instructions
3-address format
Rejected:
Register windows
Delayed branches
Single-cycle execution of all instructions
Result: RISC with a few CISC features
ARM Assembly Language Programming
Agenda:
the ARM programmer's model
the ARM instruction set
writing simple programs
examples
ARM software development tools
hands-on: writing simple ARM assembly programs
The ARM programmer's model
ARM is a Reduced Instruction Set Computer (RISC); it has:
a large, regular register file
any register can be used for any purpose
a load-store architecture
instructions which reference memory just move data; they do no processing
processing uses values in registers only
fixed-length 32-bit instructions
ARM register organization
r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15 (PC) and the CPSR are usable in user mode; the banked registers below are usable in system modes only.

Mode            Banked registers                                        Saved status register
fiq mode        r8_fiq r9_fiq r10_fiq r11_fiq r12_fiq r13_fiq r14_fiq   SPSR_fiq
svc mode        r13_svc r14_svc                                         SPSR_svc
abort mode      r13_abt r14_abt                                         SPSR_abt
irq mode        r13_irq r14_irq                                         SPSR_irq
undefined mode  r13_und r14_und                                         SPSR_und
ARM CPSR format
[31:28] N Z C V
[27:8]  unused
[7:6]   I F
[4:0]   mode
In user programs only the top 4 bits of the CPSR are significant:
N - the result was negative
Z - the result was zero
C - the result produced a carry out
V - the result generated an arithmetic overflow
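The flag definitions above can be made concrete with a small model. This is a minimal sketch in Python, not ARM code; the function name `add_with_flags` is hypothetical and the overflow rule used is the standard two's complement one (both operands share a sign that the result does not).

```python
MASK32 = 0xFFFFFFFF

def add_with_flags(a, b, carry_in=0):
    """Model a 32-bit add and return (result, flags) with N, Z, C, V as 0 or 1."""
    a &= MASK32
    b &= MASK32
    full = a + b + carry_in            # unbounded Python integer sum
    result = full & MASK32             # what fits in a 32-bit register
    n = result >> 31                   # N: bit 31 of the result
    z = int(result == 0)               # Z: result is zero
    c = int(full > MASK32)             # C: carry out of bit 31
    # V: operands have the same sign and the result has the opposite sign
    v = (((a ^ result) & (b ^ result)) >> 31) & 1
    return result, {"N": n, "Z": z, "C": c, "V": v}

print(add_with_flags(0x7FFFFFFF, 1))
print(add_with_flags(0xFFFFFFFF, 1))
```

The first call overflows as a signed add (N and V set), the second wraps to zero as an unsigned add (Z and C set).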
ARM memory organization
Memory is a linear array of 2^32 byte locations. ARM can address:
individual bytes
32-bit words on 4-byte boundaries
some ARM chips can address 16-bit half-words on 2-byte boundaries
[Figure: byte addresses 0 to 23, four per row, holding byte0 to byte3 at addresses 0 to 3, half-word4, byte6, word8, half-word12, half-word14 and word16]
The ARM instruction set
data processing instructions
data transfer instructions
control flow instructions
conditional execution
special instructions
memory faults
operating modes and exceptions
ARM architecture variants
Data processing instructions
All operands are 32 bits wide and either:
come from registers, or
are literals (immediate values) specified in the instruction
The result, if any, is 32 bits wide and goes into a register
except long multiplies, which generate 64-bit results
All operand and result registers are independently specified
Data processing instructions
Example:
ADD r0, r1, r2 ; r0 := r1 + r2
Note: everything after the ; is a comment
it is there solely for the programmer's convenience
the result register (r0) is listed first
Data processing instructions
Arithmetic operations:
ADD r0, r1, r2 ; r0 := r1 + r2
ADC r0, r1, r2 ; r0 := r1 + r2 + C
SUB r0, r1, r2 ; r0 := r1 - r2
SBC r0, r1, r2 ; r0 := r1 - r2 + C - 1
RSB r0, r1, r2 ; r0 := r2 - r1
RSC r0, r1, r2 ; r0 := r2 - r1 + C - 1
C is the C bit in the CPSR
the operation may be viewed as unsigned or 2's complement signed
Data processing instructions
Bit-wise logical operations:
AND r0, r1, r2 ; r0 := r1 and r2
ORR r0, r1, r2 ; r0 := r1 or r2
EOR r0, r1, r2 ; r0 := r1 xor r2
BIC r0, r1, r2 ; r0 := r1 and not r2
the specified Boolean logic operation is performed on each bit from 0 to 31
BIC stands for bit clear
each 1 in r2 clears the corresponding bit in r1
Data processing instructions
Register movement operations:
MOV r0, r2 ; r0 := r2
MVN r0, r2 ; r0 := not r2
MVN stands for move negated
there is no first operand (r1) specified as these are unary operations
Data processing instructions
Comparison operations:
CMP r1, r2 ; set cc on r1 - r2
CMN r1, r2 ; set cc on r1 + r2
TST r1, r2 ; set cc on r1 and r2
TEQ r1, r2 ; set cc on r1 xor r2
these instructions just affect the condition codes (N, Z, C, V) in the CPSR
there is no result register (r0)
Data processing instructions
Immediate operands
the 2nd source operand (r2) may be replaced by a constant:
ADD r3, r3, #1 ; r3 := r3 + 1
AND r8, r7, #&ff ; r8 := r7[7:0]
# indicates an immediate value
& indicates hexadecimal notation
allowed immediate values are (in general): (0 => 255) x 2^(2n)
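The rule for allowed immediates can be checked mechanically. The sketch below is a Python model, not assembler output; it assumes the usual ARM encoding in which an 8-bit value is rotated right by twice a 4-bit rotate field, and the helper names `ror32` and `encode_immediate` are hypothetical.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF

def encode_immediate(value):
    """Return (rot, imm8) with value == ror32(imm8, 2 * rot), or None if not encodable."""
    for rot in range(16):
        # rotating left by 2*rot undoes the rotate right applied by the hardware
        imm8 = ror32(value, 32 - 2 * rot)
        if imm8 <= 0xFF:
            return rot, imm8
    return None

print(encode_immediate(0xFF))
print(encode_immediate(0xFF000000))
print(encode_immediate(0x101))
```

A value such as &101 needs nine significant bits, so no rotation fits it into the 8-bit field and an assembler would have to build it with more than one instruction.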
Data processing instructions
Shifted register operands
the 2nd source operand may be shifted
by a constant number of bit positions:
ADD r3, r2, r1, LSL #3 ; r3 := r2 + 8 x r1
or by a register-specified number of bits:
ADD r5, r5, r3, LSL r2 ; r5 := r5 + 2^r2 x r3
LSL, LSR mean logical shift left, right
ASL, ASR mean arithmetic shift left, right
ROR means rotate right
RRX means rotate right extended
ARM shift operations
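[Figure: the ARM shift operations on a 32-bit operand, bits 31 to 0]
LSL #5: bits move left, the five vacated low bits are filled with 00000
LSR #5: bits move right, the five vacated high bits are filled with 00000
ASR #5, positive operand: bits move right, the vacated high bits are filled with 00000
ASR #5, negative operand: bits move right, the vacated high bits are filled with 11111
ROR #5: the five bits shifted out at the low end re-enter at the high end
RRX: rotate right by one bit through the carry flag C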
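The shift operations in the figure can be modelled in a few lines. This is an illustrative Python sketch for shift amounts of 1 to 31; the function names are hypothetical and the carry-out behaviour is only modelled for RRX.

```python
MASK32 = 0xFFFFFFFF

def lsl(x, n):
    """Logical shift left: zeros enter at the low end."""
    return (x << n) & MASK32

def lsr(x, n):
    """Logical shift right: zeros enter at the high end."""
    return (x & MASK32) >> n

def asr(x, n):
    """Arithmetic shift right: copies of bit 31 enter at the high end."""
    x &= MASK32
    if x & 0x80000000:
        return ((x >> n) | (MASK32 << (32 - n))) & MASK32
    return x >> n

def ror(x, n):
    """Rotate right: bits leaving the low end re-enter at the high end."""
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32

def rrx(x, carry):
    """Rotate right extended: 33-bit rotate through C; returns (result, new C)."""
    x &= MASK32
    return ((carry << 31) | (x >> 1)) & MASK32, x & 1

print(hex(asr(0x80000000, 5)), hex(ror(0x1F, 5)), rrx(1, 1))
```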
Data processing instructions
Setting the condition codes
all data processing instructions may set the condition codes.
the comparison operations always do so
For example, here is code for a 64-bit add:
ADDS r2, r2, r0 ; 32-bit carry-out -> C
ADC r3, r3, r1 ; added into top 32 bits
S means Set condition codes
the primary use of the condition codes is in control flow (see later)
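The 64-bit add works because the carry from the low words is fed into the high words. A minimal Python model of that two-instruction sequence follows; the function name `add64_via_32` and the argument order are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF

def add64_via_32(lo_a, hi_a, lo_b, hi_b):
    """Add two 64-bit values held as (low, high) 32-bit halves; returns (low, high)."""
    lo_sum = lo_a + lo_b
    carry = lo_sum >> 32                    # ADDS: carry-out of the low words -> C
    lo = lo_sum & MASK32
    hi = (hi_a + hi_b + carry) & MASK32     # ADC: high words plus C
    return lo, hi

print(add64_via_32(0xFFFFFFFF, 0, 1, 0))
```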
Data processing instructions
Multiplication
ARM has special multiply instructions
MUL r4, r3, r2 ; r4 := (r3 x r2)[31:0]
only the bottom 32 bits are returned
immediate operands are not supported
multiplication by a constant is usually best done with a short series of adds and subtracts with shifts
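As a sketch of what "adds and subtracts with shifts" means, the Python model below multiplies by 10 and by 7 without a multiply. The function names are hypothetical, and the ARM instructions in the comments are one possible sequence, not the only one.

```python
MASK32 = 0xFFFFFFFF

def times10(x):
    """x * 10 as two shifted operations, truncated to 32 bits."""
    x5 = (x + (x << 2)) & MASK32    # ADD r0, r0, r0, LSL #2 ; r0 := 5 x r0
    return (x5 << 1) & MASK32       # MOV r0, r0, LSL #1     ; r0 := 10 x r0

def times7(x):
    """x * 7 as one reverse subtract with a shifted operand."""
    return ((x << 3) - x) & MASK32  # RSB r0, r0, r0, LSL #3 ; r0 := 8 x r0 - r0

print(times10(7), times7(6))
```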
there is also a multiply-accumulate form:
MLA r4, r3, r2, r1 ; r4 := (r3 x r2 + r1)[31:0]
some ARMs support 64-bit result forms too
Data processing instructions
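Binary encoding:
[31:28] cond
[27:26] 00
[25]    #
[24:21] opcode: arithmetic/logic function
[20]    S: set condition codes
[19:16] Rn: first operand register
[15:12] Rd: destination register
[11:0]  operand 2
Operand 2 when bit 25 = 1 (immediate):
[11:8] #rot: immediate alignment
[7:0]  8-bit immediate
Operand 2 when bit 25 = 0 (immediate shift length):
[11:7] #shift: immediate shift length
[6:5]  Sh: shift type
[4]    0
[3:0]  Rm: second operand register
Operand 2 when bit 25 = 0 (register shift length):
[11:8] Rs: register shift length
[7]    0
[6:5]  Sh: shift type
[4]    1
[3:0]  Rm: second operand register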
Data processing instructions
Opcode [24:21]  Mnemonic  Meaning                       Effect
0000            AND       Logical bit-wise AND          Rd := Rn AND Op2
0001            EOR       Logical bit-wise exclusive OR Rd := Rn EOR Op2
0010            SUB       Subtract                      Rd := Rn - Op2
0011            RSB       Reverse subtract              Rd := Op2 - Rn
0100            ADD       Add                           Rd := Rn + Op2
0101            ADC       Add with carry                Rd := Rn + Op2 + C
0110            SBC       Subtract with carry           Rd := Rn - Op2 + C - 1
0111            RSC       Reverse subtract with carry   Rd := Op2 - Rn + C - 1
1000            TST       Test                          Scc on Rn AND Op2
1001            TEQ       Test equivalence              Scc on Rn EOR Op2
1010            CMP       Compare                       Scc on Rn - Op2
1011            CMN       Compare negated               Scc on Rn + Op2
1100            ORR       Logical bit-wise OR           Rd := Rn OR Op2
1101            MOV       Move                          Rd := Op2
1110            BIC       Bit clear                     Rd := Rn AND NOT Op2
1111            MVN       Move negated                  Rd := NOT Op2
Data processing instructions
Assembler format:
<op>{<cond>}{S} Rd, Rn, #<32-bit imm.>
<op>{<cond>}{S} Rd, Rn, Rm {, <shift>}
where <shift> = LSL, LSR, ASL, ASR or ROR followed by #<5-bit imm.> or Rs, or just RRX
monadic instructions omit Rn
comparison instructions omit Rd
32-bit immediates are rotated 8-bit values
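To connect the assembler format with the binary encoding, the following Python sketch splits a data processing instruction word into its fields. The function name `decode_data_processing` and the dictionary keys are hypothetical; operand 2 is returned raw rather than expanded into shift or rotate fields.

```python
OPCODES = ["AND", "EOR", "SUB", "RSB", "ADD", "ADC", "SBC", "RSC",
           "TST", "TEQ", "CMP", "CMN", "ORR", "MOV", "BIC", "MVN"]

def decode_data_processing(word):
    """Split a 32-bit data processing instruction into its encoding fields."""
    if (word >> 26) & 0x3 != 0:
        raise ValueError("bits [27:26] must be 00 for data processing")
    return {
        "cond": (word >> 28) & 0xF,
        "immediate": bool((word >> 25) & 1),    # the # bit
        "op": OPCODES[(word >> 21) & 0xF],
        "S": (word >> 20) & 1,
        "Rn": (word >> 16) & 0xF,
        "Rd": (word >> 12) & 0xF,
        "operand2": word & 0xFFF,
    }

# 0xE0810002 is ADD r0, r1, r2 with the "always" condition
print(decode_data_processing(0xE0810002))
```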
Multiply instructions
Binary encoding:
[31:28] cond
[27:24] 0000
[23:21] mul
[20]    S
[19:16] Rd / RdHi
[15:12] Rn / RdLo
[11:8]  Rs
[7:4]   1001
[3:0]   Rm
Assembler formats:
MUL{<cond>}{S} Rd, Rm, Rs
MLA{<cond>}{S} Rd, Rm, Rs, Rn
<mul>{<cond>}{S} RdHi, RdLo, Rm, Rs

Opcode [23:21]  Mnemonic  Meaning                              Effect
000             MUL       Multiply (32-bit result)             Rd := (Rm*Rs)[31:0]
001             MLA       Multiply-accumulate (32-bit result)  Rd := (Rm*Rs+Rn)[31:0]
100             UMULL     Unsigned multiply long               RdHi:RdLo := Rm*Rs
101             UMLAL     Unsigned multiply-accumulate long    RdHi:RdLo += Rm*Rs
110             SMULL     Signed multiply long                 RdHi:RdLo := Rm*Rs
111             SMLAL     Signed multiply-accumulate long      RdHi:RdLo += Rm*Rs
Data transfer instructions
The ARM has 3 types of data transfer instruction:
single register loads and stores
flexible byte or word (or possibly half-word) transfers
multiple register loads and stores
less flexible, multiple words, higher transfer rate
single register-memory swap
mainly for system use, so ignore for now
Data transfer instructions
Addressing memory
all ARM data transfer instructions use register indirect addressing.
Example of load and store instructions:
LDR r0, [r1] ; r0 := mem [r1]
STR r0, [r1] ; mem [r1] := r0
therefore before any data transfer is possible:
a register must be initialized with an address close to the target
Data transfer instructions
Initializing an address pointer
any register can be used for an address
any ARM instruction may be used to compute an address
the assembler also has special pseudo instructions to make this easier:
        ADR r1, TABLE1 ; r1 points to TABLE1
        ..
TABLE1                 ; LABEL
Data transfer instructions
Single register loads and stores
the simplest form is just register indirect:
LDR r0, [r1] ; r0 := mem [r1]
this is a special form of base plus offset:
LDR r0, [r1, #4] ; r0 := mem [r1+4]
the offset is within +/- 4 Kbytes
auto-indexing is also possible:
LDR r0, [r1, #4]! ; r0 := mem [r1+4]
                  ; r1 := r1 + 4
Data transfer instructions
Single register loads and stores (..ctd)
another form uses post-indexing:
LDR r0, [r1], #4 ; r0 := mem [r1]
                 ; r1 := r1 + 4
finally, any of these can load a byte rather than a word:
LDRB r0, [r1] ; r0 := mem8 [r1]
stores (STR) have the same forms
some ARMs also support half-word and signed byte transfer
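The three indexing forms differ only in which address is used and whether the base register is updated. The Python sketch below models them with a dictionary as memory; `load_word` and its parameters are hypothetical names, and alignment and the 4 Kbyte offset limit are not checked.

```python
def load_word(mem, regs, rd, rn, offset=0, pre=True, writeback=False):
    """Model LDR rd, [rn, #offset]{!} (pre=True) or LDR rd, [rn], #offset (pre=False)."""
    base = regs[rn]
    address = base + offset if pre else base   # post-indexing uses the unmodified base
    regs[rd] = mem[address]
    if writeback or not pre:                   # post-indexing always updates the base
        regs[rn] = base + offset

mem = {0x100: 11, 0x104: 22}

regs = {"r0": 0, "r1": 0x100}
load_word(mem, regs, "r0", "r1", 4)                   # LDR r0, [r1, #4]
print(regs)

regs = {"r0": 0, "r1": 0x100}
load_word(mem, regs, "r0", "r1", 4, writeback=True)   # LDR r0, [r1, #4]!
print(regs)

regs = {"r0": 0, "r1": 0x100}
load_word(mem, regs, "r0", "r1", 4, pre=False)        # LDR r0, [r1], #4
print(regs)
```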
Data transfer instructions
Multiple register loads and stores
ARM also supports instructions which transfer several registers:
LDMIA r1, {r0, r2, r5} ; r0 := mem [r1]
                       ; r2 := mem [r1+4]
                       ; r5 := mem [r1+8]
the {..} list may contain any or all of r0 - r15
including r15 (the PC!) will cause a branch
the lowest register always uses the lowest address
Data transfer instructions
Multiple register loads and stores (..ctd)
stack addressing:
stacks can Ascend or Descend memory
stacks can be Full or Empty
ARM multiple register transfers support all forms of stack
block copy addressing
addresses can Increment or Decrement Before or After each transfer
Multiple register transfer addressing modes
[Figure: memory from 0x1000 to 0x1018 before and after each store multiple instruction. r9 starts at 0x100c; afterwards r9' is 0x1018 for the incrementing forms and 0x1000 for the decrementing forms, with r0, r1 and r5 stored in ascending address order.]
STMIA r9!, {r0, r1, r5}
STMIB r9!, {r0, r1, r5}
STMDA r9!, {r0, r1, r5}
STMDB r9!, {r0, r1, r5}
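The four addressing modes can be summarised by the addresses they touch. The following Python sketch computes them for the figure's case (base 0x100c, three registers); the function name `stm_addresses` is hypothetical, and the lowest-numbered register is assumed to go to the lowest address, as stated for the multiple register transfers.

```python
def stm_addresses(base, count, mode):
    """Return (addresses in ascending order, base after write-back) for IA, IB, DA or DB."""
    size = 4 * count
    if mode == "IA":       # increment after
        start, new_base = base, base + size
    elif mode == "IB":     # increment before
        start, new_base = base + 4, base + size
    elif mode == "DA":     # decrement after
        start, new_base = base - size + 4, base - size
    elif mode == "DB":     # decrement before
        start, new_base = base - size, base - size
    else:
        raise ValueError("mode must be IA, IB, DA or DB")
    return [start + 4 * i for i in range(count)], new_base

for mode in ("IA", "IB", "DA", "DB"):
    addresses, new_base = stm_addresses(0x100C, 3, mode)
    print(mode, [hex(a) for a in addresses], hex(new_base))
```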
Stack and block copy views of the load and store multiple instructions
                    Ascending                  Descending
                    Full         Empty         Full         Empty
Increment  Before   STMIB STMFA                             LDMIB LDMED
           After                 STMIA STMEA   LDMIA LDMFD
Decrement  Before                LDMDB LDMEA   STMDB STMFD
           After    LDMDA LDMFA                             STMDA STMED
Single word and unsigned byte data transfer instructions
Binary encoding:
[31:28] cond
[27:26] 01
[25]    #
[24]    P: pre-/post-index
[23]    U: up/down
[22]    B: unsigned byte/word
[21]    W: write-back (auto-index)
[20]    L: load/store
[19:16] Rn: base register
[15:12] Rd: source/destination register
[11:0]  offset
Offset when bit 25 = 0:
[11:0] 12-bit immediate
Offset when bit 25 = 1:
[11:7] #shift: immediate shift length
[6:5]  Sh: shift type
[4]    0
[3:0]  Rm: offset register
Half-word and signed byte data transfer instructions
Binary encoding:
[31:28] cond
[27:25] 000
[24]    P: pre-/post-index
[23]    U: up/down
[22]    #
[21]    W: write-back (auto-index)
[20]    L: load/store
[19:16] Rn: base register
[15:12] Rd: source/destination register
[11:8]  offsetH
[7]     1
[6]     S
[5]     H
[4]     1
[3:0]   offsetL
Offset when bit 22 = 1:
[11:8] Imm[7:4]
[3:0]  Imm[3:0]
Offset when bit 22 = 0:
[11:8] 0000
[3:0]  Rm: offset register
Single data transfer instructions
Assembler format:
LDR|STR{<cond>}{B|SB|H|SH} Rd, [Rn, <off>]{!}
LDR|STR{<cond>}{B|SB|H|SH} Rd, [Rn], <off>
LDR|STR{<cond>}{B|SB|H|SH} Rd, LABEL
<off> is +/-Rm or a +/- 12-bit (byte, word) or 8-bit (signed or half-word) immediate
Data type encoding:
S  H  Data type
1  0  Signed byte
0  1  Unsigned half-word
1  1  Signed half-word
Multiple register data transfers
Binary encoding:
[31:28] cond
[27:25] 100
[24]    P: pre-/post-index
[23]    U: up/down
[22]    S: restore PSR and force user bit
[21]    W: write-back (auto-index)
[20]    L: load/store
[19:16] Rn: base register
[15:0]  register list
Assembler format:
LDM|STM{<cond>}<add> Rn{!}, <regs>
<add> = IA etc, <regs> = {rn,..rm}
Swap memory and register instructions
Binary encoding:
[31:28] cond
[27:23] 00010
[22]    B: unsigned byte/word
[21:20] 00
[19:16] Rn: base register
[15:12] Rd: destination register
[11:4]  00001001
[3:0]   Rm: source register
Assembler format:
SWP{<cond>}{B} Rd, Rm, [Rn]
Control flow instructions
Control flow instructions just switch execution around the program:
        B LABEL
        ..          ; these instructions are skipped
LABEL   ..
normal execution is sequential
branches are used to change this
to move forwards or backwards
Control flow instructions
Conditional branches
sometimes whether or not a branch is taken depends on the condition codes:
        MOV r0, #0      ; initialize counter
LOOP    ..
        ADD r0, r0, #1  ; increment counter
        CMP r0, #10     ; compare with limit
        BNE LOOP        ; repeat if not equal
        ..              ; else continue
here the branch depends on how CMP sets Z
Branch conditions
Branch  Interpretation    Normal uses
B       Unconditional     Always take this branch
BAL     Always            Always take this branch
BEQ     Equal             Comparison equal or zero result
BNE     Not equal         Comparison not equal or non-zero result
BPL     Plus              Result positive or zero
BMI     Minus             Result minus or negative
BCC     Carry clear       Arithmetic operation did not give carry-out
BLO     Lower             Unsigned comparison gave lower
BCS     Carry set         Arithmetic operation gave carry-out
BHS     Higher or same    Unsigned comparison gave higher or same
BVC     Overflow clear    Signed integer operation; no overflow occurred
BVS     Overflow set      Signed integer operation; overflow occurred
BGT     Greater than      Signed integer comparison gave greater than
BGE     Greater or equal  Signed integer comparison gave greater or equal
BLT     Less than         Signed integer comparison gave less than
BLE     Less or equal     Signed integer comparison gave less than or equal
BHI     Higher            Unsigned comparison gave higher
BLS     Lower or same     Unsigned comparison gave lower or same
Control flow instructions
Conditional execution
an unusual ARM feature is that all instructions may be conditional:
CMP r0, #5        ; if (r0 != 5) {
ADDNE r1, r1, r0  ;   r1 := r1 + r0 - r2
SUBNE r1, r1, r2  ; }
this removes the need for some short branches
improving performance and code density
Control flow instructions
Branch and link
ARM's subroutine call mechanism saves the return address in r14
        BL SUBR      ; branch to SUBR
        ..           ; return to here
SUBR    ..           ; subroutine entry point
        MOV pc, r14  ; return
note the use of a data processing instruction for return
Control flow instructions
Nested subroutines
r14 must be saved before the next BL
        BL SUB1                     ; branch to SUB1
        ..
SUB1    STMFA r13!, {r0-r2, r14}    ; save regs
        BL SUB2
        ..
        LDMFA r13!, {r0-r2, pc}     ; return
SUB2    ..
        MOV pc, r14                 ; return
Control flow instructions
Supervisor calls
these are calls to operating system functions such as input and output:
SWI SWI_WriteC ; output character in r0
SWI SWI_Exit   ; return to monitor
the range of available calls is system dependent
Branch and Branch with Link
Binary encoding:
[31:28] cond
[27:25] 101
[24]    L
[23:0]  24-bit signed word offset
the L bit selects Branch with Link
the address of the instruction after the branch is placed into r14
the offset is scaled to words
giving a range of +/-32 Mbytes
Assembler format:
B{L} {<cond>} <target address>
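The word-scaled offset can be illustrated with a small encoder. This Python sketch assumes the standard ARM behaviour that the offset is taken relative to the branch address plus 8 (the pipeline has already advanced the PC); the function names `encode_branch` and `branch_target` are hypothetical.

```python
def encode_branch(address, target, link=False, cond=0xE):
    """Encode B or BL at `address` branching to `target`; cond 0xE means always."""
    offset = target - (address + 8)        # assumption: PC reads as address + 8
    if offset % 4:
        raise ValueError("target must be word aligned")
    words = offset >> 2                    # the offset is scaled to words
    if not -(1 << 23) <= words < (1 << 23):
        raise ValueError("target outside the +/-32 Mbyte range")
    return (cond << 28) | (0b101 << 25) | (int(link) << 24) | (words & 0xFFFFFF)

def branch_target(address, word):
    """Recover the target address from an encoded branch at `address`."""
    words = word & 0xFFFFFF
    if words & 0x800000:                   # sign-extend the 24-bit field
        words -= 1 << 24
    return address + 8 + (words << 2)

print(hex(encode_branch(0x8000, 0x8000)))              # branch to itself
print(hex(encode_branch(0x8000, 0x8008, link=True)))   # BL two instructions ahead
```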
Branch and exchange
Binary encoding:
[31:28] cond
[27:4]  0001 0010 1111 1111 1111 0001
[3:0]   Rm
only available on recent ARM chips
used to switch execution to the Thumb instruction set
if Rm[0] = 1
causes a branch to the address in Rm
Assembler format:
BX{<cond>} Rm
Software interrupt
Binary encoding:
[31:28] cond
[27:24] 1111
[23:0]  24-bit (interpreted) immediate
this instruction is the normal way to access operating system facilities; it:
puts the processor into supervisor mode
saves the CPSR in SPSR_svc
sets the PC to 0x8
Assembler format:
SWI{<cond>} <24-bit immediate>
The ARM condition code field
Binary encoding:
[31:28] cond
[27:0]  rest of the instruction
every ARM instruction may have a condition added
the instruction will only be executed if the condition is passed
the conditions test the values of the N, Z, C and V flags in the CPSR
if no condition is specified, AL (always) is assumed
ARM condition codes
Opcode [31:28]  Mnemonic extension  Interpretation                       Status flag state for execution
0000            EQ                  Equal / equals zero                  Z set
0001            NE                  Not equal                            Z clear
0010            CS/HS               Carry set / unsigned higher or same  C set
0011            CC/LO               Carry clear / unsigned lower         C clear
0100            MI                  Minus / negative                     N set
0101            PL                  Plus / positive or zero              N clear
0110            VS                  Overflow                             V set
0111            VC                  No overflow                          V clear
1000            HI                  Unsigned higher                      C set and Z clear
1001            LS                  Unsigned lower or same               C clear or Z set
1010            GE                  Signed greater than or equal         N equals V
1011            LT                  Signed less than                     N is not equal to V
1100            GT                  Signed greater than                  Z clear and N equals V
1101            LE                  Signed less than or equal            Z set or N is not equal to V
1110            AL                  Always                               any
1111            NV                  Never (do not use!)                  none
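The flag tests in the condition code table translate directly into Boolean expressions. The Python sketch below is one way to write them down; `condition_passed` is a hypothetical name and the flags are passed as 0 or 1.

```python
def condition_passed(cond, n, z, c, v):
    """Return True if condition mnemonic `cond` passes for the given N, Z, C, V flags."""
    table = {
        "EQ": z, "NE": not z,
        "CS": c, "HS": c, "CC": not c, "LO": not c,
        "MI": n, "PL": not n,
        "VS": v, "VC": not v,
        "HI": c and not z, "LS": (not c) or z,
        "GE": n == v, "LT": n != v,
        "GT": (not z) and n == v, "LE": z or n != v,
        "AL": True, "NV": False,
    }
    return bool(table[cond])

# flags after comparing 3 with 5: negative result, borrow (C clear), no overflow
flags = dict(n=1, z=0, c=0, v=0)
print([name for name in ("LT", "GE", "LO", "HI", "NE") if condition_passed(name, **flags)])
```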
Status register to general register transfers
Binary encoding:
[31:28] cond
[27:23] 00010
[22]    R: SPSR/CPSR
[21:16] 001111
[15:12] Rd: destination register
[11:0]  000000000000
Assembler format:
MRS{<cond>} Rd, CPSR|SPSR
and the reverse (see next slide):
MSR{<cond>} CPSR|SPSR, #<32-bit immediate>|Rm
(with a few details about fields omitted)
Transfer to status register
Binary encoding:
[31:28] cond
[27:26] 00
[25]    #
[24:23] 10
[22]    R: SPSR/CPSR
[21:20] 10
[19:16] field: field mask
[15:12] 1111
[11:0]  operand
Operand when bit 25 = 1:
[11:8] #rot: immediate alignment
[7:0]  8-bit immediate
Operand when bit 25 = 0:
[11:4] 00000000
[3:0]  Rm: operand register
Coprocessor instructions
Coprocessor data processing instructions
Binary encoding:
[31:28] cond
[27:24] 1110
[23:20] Cop1
[19:16] CRn
[15:12] CRd
[11:8]  CP#
[7:5]   Cop2
[4]     0
[3:0]   CRm
CP# specifies the coprocessor number:
it performs the operation specified by Cop1 and Cop2 on data in CRn and CRm, putting the result in CRd
other interpretations are possible!
Coprocessor instructions
Coprocessor data transfer instructions
Binary encoding:
[31:28] cond
[27:25] 110
[24]    P: pre-/post-index
[23]    U: up/down
[22]    N: data size (coprocessor dependent)
[21]    W: write-back (auto-index)
[20]    L: load/store
[19:16] Rn: base register
[15:12] CRd: source/destination register
[11:8]  CP#
[7:0]   8-bit offset
Coprocessor instructions
Coprocessor register transfer instructions
Binary encoding:
[31:28] cond
[27:24] 1110
[23:21] Cop1
[20]    L: load from coprocessor/store to coprocessor
[19:16] CRn
[15:12] Rd
[11:8]  CP#
[7:5]   Cop2
[4]     1
[3:0]   CRm
move a 32-bit value between the coprocessor and ARM (including CPSR)
examples: floating-point FIX, FLOAT and compare
Memory faults
ARM has full support for memory faults. Accesses may fail because of:
virtual memory page faults
memory protection violations
soft memory errors
Prefetch aborts are faults on instruction fetch
Data aborts are faults on data transfers
both are recoverable with a little work
Operating modes and exceptions
ARM has privileged operating modes:
SVC mode for software interrupts
IRQ mode for (normal) interrupts
FIQ mode for fast interrupts
Abort mode for handling memory faults
Undef mode for undefined instruction traps
System mode for privileged operating system tasks
Operating modes and exceptions
Each privileged mode has:
some private registers
its own r14 for a return address
its own r13, normally for a private stack pointer
FIQ mode has additional private registers to speed its operation
its own Saved Program Status Register (SPSR)
to preserve the CPSR so it can be restored upon return
Operating modes and exceptions
r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15 (PC) and the CPSR are usable in user mode; the banked registers below are usable in system modes only.

Mode            Banked registers                                        Saved status register
fiq mode        r8_fiq r9_fiq r10_fiq r11_fiq r12_fiq r13_fiq r14_fiq   SPSR_fiq
svc mode        r13_svc r14_svc                                         SPSR_svc
abort mode      r13_abt r14_abt                                         SPSR_abt
irq mode        r13_irq r14_irq                                         SPSR_irq
undefined mode  r13_und r14_und                                         SPSR_und
Operating modes and exceptions
[31:28] N Z C V
[27:8]  unused
[7:6]   I F
[5]     T
[4:0]   mode
The CPSR and SPSR format:
bits 0 to 4 define the operating mode
bit 5 controls the instruction set
ARM (T=0) or Thumb (T=1)
bit 6 disables FIQ when set
bit 7 disables IRQ when set
Operating modes and exceptions
Register use:
CPSR [4:0]  Mode    Use                                         Registers
10000       User    Normal user code                            user
10001       FIQ     Processing fast interrupts                  _fiq
10010       IRQ     Processing standard interrupts              _irq
10011       SVC     Processing software interrupts (SWIs)       _svc
10111       Abort   Processing memory faults                    _abt
11011       Undef   Handling undefined instruction traps        _und
11111       System  Running privileged operating system tasks   user
Operating modes and exceptions
Exception entry sequence:
change to the appropriate operating mode
save the return address in r14_exc
save the old CPSR in SPSR_exc
disable IRQ
on FIQ entry, disable FIQ
force the PC to the appropriate exception vector address
Operating modes and exceptions
Exception vector addresses:
Exception                                        Mode   Vector address
Reset                                            SVC    0x00000000
Undefined instruction                            UND    0x00000004
Software interrupt (SWI)                         SVC    0x00000008
Prefetch abort (instruction fetch memory fault)  Abort  0x0000000C
Data abort (data access memory fault)            Abort  0x00000010
IRQ (normal interrupt)                           IRQ    0x00000018
FIQ (fast interrupt)                             FIQ    0x0000001C
Operating modes and exceptions
Exception handling
the vector address normally contains a branch to the exception handling code
the FIQ handler can start at 0x1C
r13_exc usually points to a private stack
save work registers for use by the handler
FIQ usually has enough private registers
process exception
restore work registers and return
Operating modes and exceptions
Return from exception
from a SWI or undefined instruction:
MOVS pc, r14
this is a special form with S and pc
it restores the CPSR from SPSR_exc as well
from an IRQ, FIQ or prefetch abort:
SUBS pc, r14, #4
from a data abort to retry the data transfer:
SUBS pc, r14, #8
Writing simple programs
Even experienced programmers approach a new environment by first getting a simple program to run
often a "Hello World" program
This requires some basic tools:
a text editor, to enter the program
an assembler, to produce binary code
a system or emulator, to test the code
Writing simple programs
Assembler details to note:
AREA - declaration of code area
EQU - initializing constants (1 word)
used here to define SWI numbers
ENTRY - code entry point
= - a way to initialize memory (per byte)
END - the end of the program source
labels are aligned left
opcodes are indented
Examples
Hello World assembly program:
            AREA    HelloW, CODE, READONLY ; declare area
SWI_WriteC  EQU     &0                     ; output character in r0
SWI_Exit    EQU     &11                    ; finish program
            ENTRY                          ; code entry point
START       ADR     r1, TEXT               ; r1 -> "Hello World"
LOOP        LDRB    r0, [r1], #1           ; get the next byte
            CMP     r0, #0                 ; check for text end
            SWINE   SWI_WriteC             ; if not end print ..
            BNE     LOOP                   ; .. and loop back
            SWI     SWI_Exit               ; end of execution
TEXT        =       "Hello World", &0a, &0d, 0
            END                            ; end of program source
Examples
Subroutine to print r1 in hexadecimal
HexOut  MOV     r2, #8            ; nibble count = 8
LOOP    MOV     r0, r1, LSR #28   ; get top nibble
        CMP     r0, #9            ; 0-9 or A-F?
        ADDGT   r0, r0, #"A"-10   ; ASCII alphabetic
        ADDLE   r0, r0, #"0"      ; ASCII numeric
        SWI     SWI_WriteC        ; print character
        MOV     r1, r1, LSL #4    ; shift left one nibble
        SUBS    r2, r2, #1        ; decrement nibble count
        BNE     LOOP              ; if more do next nibble
        MOV     pc, r14           ; ... else return
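For readers who want to check the HexOut logic without an ARM toolchain, here is a Python model of the same loop. It is a sketch only: `hex_out` is a hypothetical name, and the characters are collected into a string instead of being printed through SWI_WriteC.

```python
def hex_out(r1):
    """Return the 8-digit hexadecimal text that HexOut would print for r1."""
    out = []
    for _ in range(8):                        # nibble count = 8
        nibble = (r1 >> 28) & 0xF             # get top nibble
        if nibble > 9:
            code = nibble + ord("A") - 10     # ASCII alphabetic
        else:
            code = nibble + ord("0")          # ASCII numeric
        out.append(chr(code))                 # stands in for SWI SWI_WriteC
        r1 = (r1 << 4) & 0xFFFFFFFF           # shift left one nibble
    return "".join(out)

print(hex_out(0x1234ABCD))
```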
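The structure of the ARM cross-development toolkit
[Figure: C source and C libraries feed the C compiler, and asm source feeds the assembler; both produce .aof files, which the linker combines into an .aif image; the debugger ARMsd runs the image on a system model (ARMulator) or on a PIE card]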
The Thumb instruction set
Agenda:
the Thumb programmer's model
Thumb instructions
Thumb implementation
Thumb applications
hands-on: writing simple Thumb assembly programs
The Thumb programmer's model
What is Thumb?
a compressed, 16-bit representation of the ARM instruction set
primarily to increase code density
also increases performance in some cases
It is not a complete architecture
all Thumb-aware cores also support the ARM instruction set
therefore the Thumb architecture need only support common functions
The Thumb programmer's model
[31:28] N Z C V
[27:8]  unused
[7:6]   I F
[5]     T
[4:0]   mode
The T bit in the CPSR controls the interpretation of the instruction stream
switch from ARM to Thumb (and back) by executing a BX instruction
exceptions also cause a switch to ARM code
return symmetrically to ARM or Thumb code
The Thumb programmer's model
[Figure: Thumb-accessible registers]
Lo registers: r0 r1 r2 r3 r4 r5 r6 r7
Hi registers: r8 r9 r10 r11 r12 SP (r13) LR (r14) PC (r15)
CPSR
Shaded registers in the figure have restricted access
The Thumb programmer's model
Thumb register use:
r0-r7 are general purpose registers
r13 is used implicitly as a stack pointer
in ARM code this is a software convention
r14 is used as the link register
implicitly, as in the ARM instruction set
a few instructions can access r8-r15
the CPSR flags are set by data processing instructions and control conditional branches
The Thumb programmer's model
Thumb-ARM similarities:
load-store architecture
with data processing, data transfer and control flow instructions
support for 8-bit byte, 16-bit half-word and 32-bit data types
half-words are aligned on 2-byte boundaries
words are aligned on 4-byte boundaries
32-bit unsegmented memory
The Thumb programmer's model
Thumb-ARM differences:
most Thumb instructions are unconditional
all ARM instructions are conditional
many Thumb instructions use a 2-address format
most ARM instructions use 3-address format
Thumb instruction formats are less regular
a result of the denser encoding
Thumb branch instructions
(1) B<cond> <LABEL>
[15:12] 1101
[11:8]  cond
[7:0]   8-bit offset
(2) B <LABEL>
[15:11] 11100
[10:0]  11-bit offset
(3) BL <LABEL>
[15:12] 1111
[11]    H
[10:0]  11-bit offset
(4) BX Rm
[15:7]  010001110
[6]     H
[5:3]   Rm
[2:0]   000
Thumb branch instructions
These are similar to the ARM instructions except:
offsets are scaled to half-words, not words
range is reduced to fit into 16 bits
BL works in two stages:
H=0: LR := PC + (offset << 12)
H=1: PC := LR + (offset << 1)
     LR := oldPC + 3
the assembler generates both halves
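The two-stage BL can be traced with a short model. The Python sketch below makes several assumptions that the slide does not state: the PC read by the first half is its own address plus 4, the high offset is sign-extended from 11 bits, and oldPC is the address of the second half, so that LR ends up as the return address with bit 0 set. All function names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def encode_thumb_bl(address, target):
    """Split the branch distance into the 11-bit offsets of the H=0 and H=1 halves."""
    offset = target - (address + 4)              # assumption: PC reads as address + 4
    if offset % 2 or not -(1 << 22) <= offset < (1 << 22):
        raise ValueError("target misaligned or out of range")
    return (offset >> 12) & 0x7FF, (offset >> 1) & 0x7FF

def execute_thumb_bl(address, high, low):
    """Run both halves of a BL located at `address`; returns (new PC, new LR)."""
    lr = (address + 4) + (sign_extend(high, 11) << 12)   # H=0: LR := PC + (offset << 12)
    pc = lr + (low << 1)                                 # H=1: PC := LR + (offset << 1)
    lr = (address + 2) + 3                               # LR := oldPC + 3
    return pc, lr

high, low = encode_thumb_bl(0x8000, 0x7000)
print(hex(high), hex(low), [hex(x) for x in execute_thumb_bl(0x8000, high, low)])
```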
Thumb software interrupts
[15:8] 11011111
[7:0]  8-bit immediate
The Thumb SWI operates exactly like the ARM SWI
the (interpreted) immediate is restricted to 8 bits
the SWI handler is entered in ARM code
the return automatically selects ARM or Thumb
Thumb data processing instructions
(1) ADD|SUB Rd, Rn, Rm
[15:10] 000110
[9]     A
[8:6]   Rm
[5:3]   Rn
[2:0]   Rd
(2) ADD|SUB Rd, Rn, #imm3
[15:10] 000111
[9]     A
[8:6]   #imm3
[5:3]   Rn
[2:0]   Rd
(3) <Op> Rd/Rn, #imm8
[15:13] 001
[12:11] Op
[10:8]  Rd/Rn
[7:0]   #imm8
(4) LSL|LSR|ASR Rd, Rn, #shift
[15:13] 000
[12:11] Op
[10:6]  #sh
[5:3]   Rn
[2:0]   Rd
Thumb data processing instructions
(5) <Op> Rd/Rn, Rm/Rs
[15:10] 010000
[9:6]   Op
[5:3]   Rm/Rs
[2:0]   Rd/Rn
(6) ADD|CMP|MOV Rd/Rn, Rm
[15:10] 010001
[9:8]   Op
[7]     D
[6]     M
[5:3]   Rm
[2:0]   Rd/Rn
(7) ADD Rd, SP|PC, #imm8
[15:12] 1010
[11]    R
[10:8]  Rd
[7:0]   #imm8
(8) ADD|SUB SP, SP, #imm7
[15:8]  10110000
[7]     A
[6:0]   #imm7
Thumb data processing instructions
Notes:
in Thumb code shift operations are separate from general ALU functions
in ARM code a shift can be combined with an ALU function in a single instruction
all data processing operations on the Lo registers set the condition codes
those on the Hi registers do not, apart from CMP which only changes the condition codes
Thumb single register data transfers
(1) LDR|STR{B} Rd, [Rn, #off5]
[15:13] 011
[12]    B
[11]    L
[10:6]  #off5
[5:3]   Rn
[2:0]   Rd
(2) LDRH|STRH Rd, [Rn, #off5]
[15:12] 1000
[11]    L
[10:6]  #off5
[5:3]   Rn
[2:0]   Rd
(3) LDR|STR{S}{H|B} Rd, [Rn, Rm]
[15:12] 0101
[11:9]  Op
[8:6]   Rm
[5:3]   Rn
[2:0]   Rd
(4) LDR Rd, [PC, #off8]
[15:11] 01001
[10:8]  Rd
[7:0]   #off8
(5) LDR|STR Rd, [SP, #off8]
[15:12] 1001
[11]    L
[10:8]  Rd
[7:0]   #off8
Thumb multiple register data transfers
(1) LDMIA|STMIA Rn!, {<reg list>}
[15:12] 1100
[11]    L
[10:8]  Rn
[7:0]   reg list
(2) POP|PUSH {<reg list>{, R}}
[15:12] 1011
[11]    L
[10:9]  10
[8]     R
[7:0]   reg list
These map directly onto the ARM forms:
POP: LDMFD SP!, {<regs>{, pc}}
PUSH: STMFD SP!, {<regs>{, lr}}
note restrictions on available addressing modes compared with ARM code
Thumb instruction decompressor
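[Figure: Thumb instruction decompressor in the instruction pipeline. Data in from memory passes through a mux that selects the high or low half-word and then through the Thumb decompressor; a second mux selects the ARM or Thumb stream for the ARM instruction decoder. Immediate fields from data in go to the B operand bus.]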
Thumb - ARM instruction mapping
Thumb instruction (16 bits):
[15:13] 001: major opcode, format 3: MOV/CMP/ADD/SUB with immediate
[12:11] 10: minor opcode denoting ADD and set CC
[10:8]  Rd: destination and source register
[7:0]   #imm8: immediate value
Equivalent ARM instruction (32 bits):
[31:28] 1110: always condition
[27:25] 001
[24:21] 0100
[20]    1
[19:16] 0 Rd
[15:12] 0 Rd
[11:8]  0000: zero shift
[7:0]   #imm8
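The mapping in this figure is simple enough to express as bit manipulation. The Python sketch below expands the one Thumb format shown (ADD Rd, #imm8) into its ARM equivalent; `decompress_add_imm8` is a hypothetical name and the other Thumb formats are deliberately not handled.

```python
def decompress_add_imm8(thumb):
    """Map Thumb 'ADD Rd, #imm8' (bits 15:11 = 00110) to ARM 'ADDS Rd, Rd, #imm8'."""
    if (thumb >> 11) != 0b00110:
        raise ValueError("not a Thumb ADD Rd, #imm8 instruction")
    rd = (thumb >> 8) & 0x7
    imm8 = thumb & 0xFF
    return ((0b1110 << 28)     # always condition
            | (0b001 << 25)    # data processing with immediate operand
            | (0b0100 << 21)   # ADD opcode
            | (1 << 20)        # S: set condition codes
            | (rd << 16)       # Rn = Rd (3-bit Thumb register, zero-extended)
            | (rd << 12)       # Rd
            | imm8)            # rotate field 0000, 8-bit immediate

print(hex(decompress_add_imm8(0x3305)))   # Thumb ADD r3, #5
```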
Thumb applications
Thumb code properties:
70% of the size of ARM code
30% less external memory power
40% more instructions
With 32-bit memory: ARM code is 40% faster than Thumb code
With 16-bit memory: Thumb code is 45% faster than ARM code
Thumb applications
For the best performance:
use 32-bit memory and ARM code
For best cost and power-efficiency:
use 16-bit memory and Thumb code
In a typical embedded system:
use ARM code in 32-bit on-chip memory for small speed-critical routines
use Thumb code in 16-bit off-chip memory for large non-critical control routines
Hands-on: writing simple Thumb assembly programs
Explore further the ARM software development tools
Write simple Thumb assembly programs
Check that they work as expected
Follow the hands-on instructions