Microprocessor Systems
Overview
Memory System and Addressing
Thumb Instruction Set
Memory Maps For Cortex M0+ and MCU
KL25Z128VLK4
128 KB Flash: 0x0000_0000 to 0x0001_FFFF
16 KB SRAM: 0x1FFF_F000 to 0x2000_2FFF
SRAM_L (1/4): 0x1FFF_F000 to 0x1FFF_FFFF
SRAM_U (3/4): 0x2000_0000 to 0x2000_2FFF
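The address ranges above can be checked in code. The following is a minimal sketch, assuming only the flash and SRAM ranges listed on this slide; the type and function names are hypothetical and do not come from any vendor header.

```c
#include <stdint.h>

/* Hypothetical region labels for this sketch. */
typedef enum { REGION_FLASH, REGION_SRAM_L, REGION_SRAM_U, REGION_OTHER } region_t;

/* Classify an address against the ranges listed on the memory-map slide. */
region_t classify_address(uint32_t addr)
{
    if (addr <= 0x0001FFFFu)
        return REGION_FLASH;                 /* 128 KB flash */
    if (addr >= 0x1FFFF000u && addr <= 0x1FFFFFFFu)
        return REGION_SRAM_L;                /* lower quarter of SRAM */
    if (addr >= 0x20000000u && addr <= 0x20002FFFu)
        return REGION_SRAM_U;                /* upper three quarters of SRAM */
    return REGION_OTHER;                     /* anything not listed on the slide */
}
```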
Endianness
For a multi-byte value, in what order are the bytes stored?
Little-Endian: Start with least-significant byte (stored at the lowest address)
Big-Endian: Start with most-significant byte (stored at the lowest address)
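A short sketch of both byte orders, written with explicit shifts so the result does not depend on the machine that runs it. The function names are illustrative only.

```c
#include <stdint.h>

/* Store a 32-bit value least-significant byte first (little-endian). */
void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v);
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Store a 32-bit value most-significant byte first (big-endian). */
void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)(v);
}
```

Storing 0x12345678 gives the byte sequence 78 56 34 12 in little-endian order and 12 34 56 78 in big-endian order.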
ARMv6-M Endianness
Instructions are always little-endian
Loads and stores to Private Peripheral Bus are
always little-endian
Data: Depends on implementation, or from
reset configuration
Kinetis processors are little-endian
ARM, Thumb and Thumb-2 Instructions
ARM instructions optimized for resource-rich high-performance
computing systems
Deeply pipelined processor, high clock rate, wide (e.g. 32-bit) memory
bus
Low-end embedded computing systems are different
Slower clock rates, shallow pipelines
Different cost factors – e.g. code size matters much more, bit and byte
operations critical
Modifications to ARM ISA to fit low-end embedded computing
1995: Thumb instruction set
16-bit instructions
Reduces memory requirements (and performance slightly)
2003: Thumb-2 instruction set
Adds some 32 bit instructions
Improves speed with little memory overhead
CPU decodes instructions based on whether in Thumb state or ARM
state - controlled by T bit
Instruction Set
Cortex-M0+ core implements ARMv6-M Thumb instructions
Only uses Thumb instructions, always in Thumb state
Most instructions are 16 bits long, some are 32 bits
Most 16-bit instructions can only access low registers (R0-R7), but
some can access high registers (R8-R15)
Thumb state is recorded in the T bit; addresses loaded into the PC by
BX, BLX or POP must be odd (LSB = 1) to keep the core in Thumb state
Branching to an even address this way will cause an exception, since switching
back to ARM state is not allowed
Conditional execution only supported for 16-bit branch
32 bit address space
Half-word aligned instructions
See ARMv6-M Architecture Reference Manual for specifics per
instruction (Section A.6.7)
Assembler Instruction Format
<operation> <operand1> <operand2> <operand3>
There may be fewer operands
First operand is typically destination (<Rd>)
Other operands are sources (<Rn>, <Rm>)
Examples
ADDS <Rd>, <Rn>, <Rm>
Add registers: <Rd> = <Rn> + <Rm>
ANDS <Rdn>, <Rm>
Bitwise and: <Rdn> = <Rdn> & <Rm>
CMP <Rn>, <Rm>
Compare: Set condition flags based on result of computing
<Rn> - <Rm>
Where Can the Operands Be Located?
In a general-purpose register R
Destination: Rd
Source: Rm, Rn
Both source and destination: Rdn
Target: Rt
Source for shift amount: Rs
An immediate value encoded in instruction word
In a condition code flag
In memory
Only for load, store, push and pop instructions
Update Condition Codes in APSR?
“S” suffix indicates the instruction updates APSR
ADD vs. ADDS
ADC vs. ADCS
SUB vs. SUBS
MOV vs. MOVS
Instruction Set Summary
Instruction Type         Instructions
Move                     MOV, MOVS, MRS, MSR
Load/Store               LDR, LDRB, LDRH, LDRSH, LDRSB, LDM, STR, STRB, STRH, STM
Add, Subtract, Multiply  ADD, ADDS, ADCS, ADR, SUB, SUBS, SBCS, RSBS, MULS
Compare                  CMP, CMN, TST
Logical                  ANDS, EORS, ORRS, BICS, MVNS, TST
Shift and Rotate         LSLS, LSRS, ASRS, RORS
Stack                    PUSH, POP
Branch                   B, BL, B{cond}, BX, BLX
Extend                   SXTH, SXTB, UXTH, UXTB
Reverse                  REV, REV16, REVSH
Processor State          SVC, CPSID, CPSIE, BKPT
No Operation             NOP
Hint                     SEV, WFE, WFI, YIELD
Load/Store Register
ARM is a load/store architecture, so must process
data in registers (not memory)
LDR: load register with word (32 bits) from memory
LDR <Rt>, source address
STR: store register contents (32 bits) to memory
STR <Rt>, destination address
Modes for Addressing Memory
Offset Addressing mode: [<Rn>, <offset>] accesses address
<Rn>+<offset>
Base Register <Rn>
Can be register R0-R7, SP or PC
<offset> is added or subtracted from base register to create
effective address
Can be an immediate constant
Can be another register, used as index <Rm>
Auto-update: Can write effective address back to base
register (in ARMv6-M only LDM, STM, PUSH and POP update the base;
LDR/STR use offset addressing only)
Pre-indexing: use effective address to access memory,
then update base register with that effective address
Post-indexing: use base register to access memory, then
update base register with effective address
Memory Addressing
The assembly code syntax for addressing memory is a
bracketed expression.
[R0] indicates the memory location starting at the address
in register R0.
[R0, #22] indicates the memory location starting at the
address which is the sum of the values in R0 and 22.
[R0, R3] indicates the memory location starting at the
address which is the sum of the values in R0 and R3.
Memory Addressing
LDR R0, [R4, #8] will load register R0 with the contents of
the memory word (4 bytes) starting at location R4 + 8.
STR R1, [R4, R5] will store register R1 to the memory word
(4 bytes) starting at location R4 + R5.
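The effective-address calculation can be modelled on a host machine. This is a minimal sketch, assuming memory is a plain byte array holding little-endian data and that addresses are indices into that array; the function names are hypothetical and no alignment or bounds checking is done.

```c
#include <stdint.h>

/* Model of LDR <Rt>, [<Rn>, <offset>]: read a little-endian word. */
uint32_t ldr_word(const uint8_t *mem, uint32_t base, uint32_t offset)
{
    uint32_t a = base + offset;              /* effective address */
    return (uint32_t)mem[a]
         | ((uint32_t)mem[a + 1] << 8)
         | ((uint32_t)mem[a + 2] << 16)
         | ((uint32_t)mem[a + 3] << 24);
}

/* Model of STR <Rt>, [<Rn>, <offset>]: write a little-endian word. */
void str_word(uint8_t *mem, uint32_t base, uint32_t offset, uint32_t value)
{
    uint32_t a = base + offset;              /* effective address */
    mem[a]     = (uint8_t)(value);
    mem[a + 1] = (uint8_t)(value >> 8);
    mem[a + 2] = (uint8_t)(value >> 16);
    mem[a + 3] = (uint8_t)(value >> 24);
}
```

With R4 = 4, `str_word(mem, 4, 8, v)` writes bytes 12 to 15 of the array, which matches the R4 + 8 location used in the LDR example.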
Loading/Storing Smaller Data Sizes
Some load and store instructions can handle half-word (16
bits) and byte (8 bits)
Store just writes to half-word or byte
STRH, STRB
Loading a byte or half-word requires padding or extension:
What do we put in the upper bits of the register?
Example: How do we extend 0x80 into a full word?
Unsigned? Then 0x80 = 128, so zero-pad to extend to word
0x0000_0080 = 128
Signed? Then 0x80 = -128, so sign-extend to word 0xFFFF_FF80
= -128
            Signed   Unsigned
Byte        LDRSB    LDRB
Half-word   LDRSH    LDRH
In-Register Size Extension
Can also extend byte or half-word already in a register
Signed or unsigned (zero-pad)
How do we extend 0x80 into a full word?
Unsigned? Then 0x80 = 128, so zero-pad to extend to word
0x0000_0080 = 128
Signed? Then 0x80 = -128, so sign-extend to word
0xFFFF_FF80 = -128
            Signed   Unsigned
Byte        SXTB     UXTB
Half-word   SXTH     UXTH
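The four extension operations can be sketched in portable C. The function names mirror the mnemonics but are ordinary C helpers written for this illustration; the sign extension uses unsigned arithmetic only, so it does not rely on implementation-defined casts.

```c
#include <stdint.h>

/* Zero-extend the low byte or half-word of a register value. */
uint32_t uxtb(uint32_t r) { return r & 0xFFu; }
uint32_t uxth(uint32_t r) { return r & 0xFFFFu; }

/* Sign-extend: flip the sign bit, then subtract it again (wraps modulo 2^32). */
uint32_t sxtb(uint32_t r) { return ((r & 0xFFu) ^ 0x80u) - 0x80u; }
uint32_t sxth(uint32_t r) { return ((r & 0xFFFFu) ^ 0x8000u) - 0x8000u; }
```

For the 0x80 example, `uxtb(0x80)` gives 0x0000_0080 and `sxtb(0x80)` gives 0xFFFF_FF80.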
Load/Store Multiple
LDM/LDMIA: load multiple registers starting from
[base register], update base register afterwards
LDM <Rn>!,<registers>
LDM <Rn>,<registers>
STM/STMIA: store multiple registers starting at [base
register], update base register after
STM <Rn>!, <registers>
LDMIA and STMIA are pseudo-instructions, translated
by assembler
An example:
LDM R4!, {R0, R1, R2, R3}
Pseudo-code (R4 treated as a pointer to 32-bit words):
R0=R4[0]; R1=R4[1]; R2=R4[2]; R3=R4[3]; then R4 advances by 16 bytes
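A small C model of load-multiple with base update may make the pseudo-code concrete. It is a sketch only: the function name and the array standing in for the register list are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy `count` consecutive words starting at `base` into regs[0..count-1],
 * lowest register first, and return the updated base (as LDM <Rn>! would
 * leave it).
 */
const uint32_t *ldmia(const uint32_t *base, uint32_t *regs, size_t count)
{
    for (size_t i = 0; i < count; i++)
        regs[i] = base[i];
    return base + count;                     /* advanced by 4 * count bytes */
}
```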
Load Literal Value into Register
Assembly pseudo-instruction: LDR <rd>, =value
Assembler generates code to load <rd> with value
Assembler selects best approach depending on value
Load immediate
MOV instruction provides 8-bit unsigned immediate operand (0-255)
Load and shift immediate values
Can use MOV, shift, rotate, sign extend instructions
Load from literal pool
1. Place value as a 32-bit literal in the program’s literal pool (table of literal
values to be loaded into registers)
2. Use instruction LDR <rd>, [pc, #offset] where offset indicates position of
literal relative to program counter value
Example formats for literal values (depends on compiler
and toolchain used)
Decimal: 3909
Hexadecimal: 0xa7ee
Character: 'A'
String: "44??"
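As an illustration of the "load and shift immediate values" approach, the sketch below builds 0xa7ee from two 8-bit immediates. The instruction sequence in the comments is one plausible choice, not necessarily what a given assembler emits for `LDR <rd>, =0xa7ee`; the function names are hypothetical.

```c
#include <stdint.h>

/* True if a value fits the 8-bit unsigned immediate of a single MOVS. */
int fits_movs_imm8(uint32_t value)
{
    return value <= 255u;
}

/* Build 0xA7EE step by step, one C statement per assumed instruction. */
uint32_t build_0xa7ee(void)
{
    uint32_t r0 = 0xA7u;   /* MOVS r0, #0xA7    : 8-bit immediate   */
    r0 <<= 8;              /* LSLS r0, r0, #8   : shift into place  */
    r0 += 0xEEu;           /* ADDS r0, #0xEE    : add low byte      */
    return r0;
}
```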
Move (Pseudo-)Instructions
Copy data from one register to another without updating condition flags
MOV <Rd>, <Rm>
Copy data from one register to another and update condition flags
MOVS <Rd>, <Rm>
Copy immediate literal value (0-255) into register and update condition flags
MOVS <Rd>, #<imm8>
Assembler translates pseudo-instructions into equivalent instructions (shifts, rotates)
Stack Operations
Push some or all of registers (R0-R7, LR) to stack
PUSH {<registers>}
Decrements SP by 4 bytes for each register saved
Pushing LR saves return address
PUSH {r1, r2, LR}
Always pushes registers in same order
Pop some or all of registers (R0-R7, PC) from stack
POP {<registers>}
Increments SP by 4 bytes for each register restored
If PC is popped, then execution will branch to new PC value
after this POP instruction (e.g. return address)
POP {r5, r6, r7}
Always pops registers in same order (opposite of pushing)
Stack Operations
• For the ARM Cortex-M, all pushes and pops use 32-bit data items; no other size is possible.
• Since all possible stack pointer values are multiples of four, the hardware is designed so that the
two least significant bits of the stack pointer are always zeros.
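The push and pop behaviour can be sketched with a small host-side model. It assumes a descending stack in which SP is decremented before each store, consistent with the "decrements SP by 4 bytes" description; the struct and function names are hypothetical, and bounds checks are omitted for brevity.

```c
#include <stdint.h>

#define STACK_WORDS 16

typedef struct {
    uint32_t mem[STACK_WORDS];   /* stack memory, one word per slot      */
    uint32_t sp;                 /* byte offset into mem, multiple of 4  */
} stack_model_t;

/* Start with an empty stack: SP points just past the top of memory. */
void stack_init(stack_model_t *s) { s->sp = STACK_WORDS * 4u; }

/* Push one 32-bit item: decrement SP by 4, then store. */
void push_word(stack_model_t *s, uint32_t v)
{
    s->sp -= 4u;
    s->mem[s->sp / 4u] = v;
}

/* Pop one 32-bit item: load, then increment SP by 4. */
uint32_t pop_word(stack_model_t *s)
{
    uint32_t v = s->mem[s->sp / 4u];
    s->sp += 4u;
    return v;
}
```

Because SP only ever changes in steps of 4, its two least significant bits stay zero throughout.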
Add Instructions
Add registers, update condition flags
ADDS <Rd>,<Rn>,<Rm>
Add registers and carry bit, update condition
flags
ADCS <Rdn>,<Rm>
Add registers
ADD <Rdn>,<Rm>
Add immediate value to register
ADDS <Rd>,<Rn>,#<imm3>
ADDS <Rdn>,#<imm8>
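A common use of the carry bit is multi-word arithmetic. The sketch below models a 64-bit addition as an ADDS on the low words followed by an ADCS on the high words; the struct and function names are assumptions for this example.

```c
#include <stdint.h>

typedef struct { uint32_t lo, hi; } u64_pair_t;

/* Add two 64-bit values held as pairs of 32-bit words. */
u64_pair_t add64(u64_pair_t a, u64_pair_t b)
{
    u64_pair_t r;
    uint32_t carry;

    r.lo = a.lo + b.lo;                  /* ADDS: low words, produces carry */
    carry = (r.lo < a.lo) ? 1u : 0u;     /* carry out of bit 31             */
    r.hi = a.hi + b.hi + carry;          /* ADCS: high words plus carry     */
    return r;
}
```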
Add Instructions with Stack Pointer
Add SP and immediate value
ADD <Rd>,SP,#<imm8>
Add SP value to register
ADD <Rdm>, SP, <Rdm>
ADD SP,<Rm>
Address to Register Pseudo-Instruction
Generate a PC-relative address in the
destination register, for a label located near the
current instruction.
ADR <Rd>,<label>
How is this used?
First, load register R2 with address of const_data
ADR R2, const_data
Second, load the word stored at const_data into R2
LDR R2, [R2]
Subtract
Subtract immediate from register, update condition
flags
SUBS <Rd>,<Rn>,#<imm3>
SUBS <Rdn>,#<imm8>
Subtract registers, update condition flags
SUBS <Rd>,<Rn>,<Rm>
Subtract registers with carry, update condition flags
SBCS <Rdn>,<Rm>
Subtract immediate from SP
SUB SP,SP,#<imm7>
Multiply
Multiply source registers, save lower word of
result in destination register, update condition
flags
MULS <Rdm>, <Rn>, <Rdm>
<Rdm> = <Rdm> * <Rn>
Works for signed and unsigned operands (the lower word of the product is the same)
Note: upper word of result is truncated
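The truncation can be shown with a 64-bit intermediate product. This is a host-side sketch with a hypothetical function name; it models only the result value, not the condition flags.

```c
#include <stdint.h>

/* Multiply two 32-bit values and keep only the lower 32 bits of the product. */
uint32_t muls_low_word(uint32_t rn, uint32_t rdm)
{
    uint64_t full = (uint64_t)rn * (uint64_t)rdm;   /* full 64-bit product */
    return (uint32_t)full;                          /* upper word dropped  */
}
```

For example, 0x10000 * 0x10000 is 2^32, so the kept word is 0, and 0xFFFFFFFF * 2 keeps 0xFFFFFFFE, which is also the bit pattern of -1 * 2 = -2.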
Logical Operations
Bitwise AND registers, update condition flags
ANDS <Rdn>,<Rm>
Bitwise OR registers, update condition flags
ORRS <Rdn>,<Rm>
Bitwise Exclusive OR registers, update condition flags
EORS <Rdn>,<Rm>
Bitwise AND register and complement of second register,
update condition flags
BICS <Rdn>,<Rm>
Move inverse of register value to destination, update
condition flags
MVNS <Rd>,<Rm>
Update condition flags by ANDing two registers, discarding
result
TST <Rn>, <Rm>
Compare
Compare - subtracts second value from first,
discards result, updates APSR
CMP <Rn>,#<imm8>
CMP <Rn>,<Rm>
Compare negative - adds two values, updates
APSR, discards result
CMN <Rn>,<Rm>
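The flag updates performed by CMP can be sketched as follows. The struct and function names are hypothetical; the flag definitions follow the standard meaning of N, Z, C and V for a subtraction, with carry acting as not-borrow.

```c
#include <stdint.h>

typedef struct { int n, z, c, v; } flags_t;

/* Compute the flags CMP <Rn>, <Rm> would set; the difference is discarded. */
flags_t cmp_flags(uint32_t rn, uint32_t rm)
{
    uint32_t result = rn - rm;
    flags_t f;

    f.n = (int)(result >> 31);                        /* result negative      */
    f.z = (result == 0u);                             /* result zero          */
    f.c = (rn >= rm);                                 /* no borrow occurred   */
    f.v = (int)(((rn ^ rm) & (rn ^ result)) >> 31);   /* signed overflow      */
    return f;
}
```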
Shift and Rotate
Common features
All of these instructions update APSR condition flags
Shift/rotate amount (in number of bits) specified by last
operand
Logical shift left - shifts in zeroes on right
LSLS <Rd>,<Rm>,#<imm5>
LSLS <Rdn>,<Rm>
Logical shift right - shifts in zeroes on left
LSRS <Rd>,<Rm>,#<imm5>
LSRS <Rdn>,<Rm>
Arithmetic shift right - shifts in copies of sign bit on left (to
maintain arithmetic sign)
ASRS <Rd>,<Rm>,#<imm5>
Rotate right
RORS <Rdn>,<Rm>
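The difference between the shift types is easiest to see on a value with the sign bit set. The helpers below are a sketch that assumes a shift amount between 1 and 31 and models only the result, not the flag updates; the function names are illustrative.

```c
#include <stdint.h>

/* All helpers assume 1 <= n <= 31. */

/* Logical shift left: zeroes enter on the right. */
uint32_t lsl(uint32_t x, unsigned n) { return x << n; }

/* Logical shift right: zeroes enter on the left. */
uint32_t lsr(uint32_t x, unsigned n) { return x >> n; }

/* Arithmetic shift right: copies of the sign bit enter on the left. */
uint32_t asr(uint32_t x, unsigned n)
{
    uint32_t shifted = x >> n;
    if (x & 0x80000000u)
        shifted |= ~(0xFFFFFFFFu >> n);      /* fill vacated bits with ones */
    return shifted;
}

/* Rotate right: bits leaving on the right re-enter on the left. */
uint32_t ror(uint32_t x, unsigned n) { return (x >> n) | (x << (32u - n)); }
```

Shifting 0x8000_0000 right by 4 gives 0x0800_0000 with LSR but 0xF800_0000 with ASR.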
Reversing Bytes
REV - reverse all bytes in word
REV <Rd>,<Rm>
REV16 - reverse bytes in both half-words
REV16 <Rd>,<Rm>
REVSH - reverse bytes in low half-word (signed) and sign-extend
REVSH <Rd>,<Rm>
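The three byte-reversal operations can be written out with shifts and masks. The function names mirror the mnemonics but are plain C helpers written for this sketch.

```c
#include <stdint.h>

/* Reverse all four bytes of a word. */
uint32_t rev(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}

/* Reverse the two bytes inside each half-word. */
uint32_t rev16(uint32_t x)
{
    return ((x >> 8) & 0x00FF00FFu) | ((x << 8) & 0xFF00FF00u);
}

/* Reverse the bytes of the low half-word, then sign-extend to 32 bits. */
uint32_t revsh(uint32_t x)
{
    uint32_t swapped = ((x >> 8) & 0xFFu) | ((x & 0xFFu) << 8);
    return (swapped ^ 0x8000u) - 0x8000u;
}
```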
Changing Program Flow - Branches
Unconditional Branches
B <label>
Target address must be within 2 KB of branch instruction
(-2048 B to +2046 B)
Conditional Branches
B<cond> <label>
<cond> is condition - see next page
B<cond> target address must be within 256 B of branch
instruction (-256 B to +254 B)
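The two range limits can be expressed as simple checks. This sketch assumes the offset is the signed byte distance quoted on the slide; in the hardware encoding the offset is taken relative to the PC value, which reads as the branch instruction's address plus 4. The function names are hypothetical.

```c
#include <stdint.h>

/* Unconditional B: even offset in the range -2048 to +2046 bytes. */
int fits_b_uncond(int32_t offset)
{
    return offset >= -2048 && offset <= 2046 && (offset % 2) == 0;
}

/* Conditional B<cond>: even offset in the range -256 to +254 bytes. */
int fits_b_cond(int32_t offset)
{
    return offset >= -256 && offset <= 254 && (offset % 2) == 0;
}
```

A target outside these ranges has to be reached another way, for example with BL or with BX through a register.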
Condition Codes
Append to branch instruction (B) to make a conditional branch
Full ARM instructions (not Thumb or Thumb-2) support conditional execution of arbitrary instructions
Note: Carry bit = not-borrow for compares and subtractions
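The way a condition suffix is evaluated from the APSR flags can be sketched as a lookup. The condition definitions below are the standard ARM ones (for example HI means C set and Z clear); the enum, struct and function names are hypothetical and written only for this example.

```c
typedef struct { int n, z, c, v; } flags_t;

typedef enum {
    COND_EQ, COND_NE, COND_CS, COND_CC, COND_MI, COND_PL, COND_VS,
    COND_VC, COND_HI, COND_LS, COND_GE, COND_LT, COND_GT, COND_LE
} cond_t;

/* Return non-zero if a B<cond> with these flags would be taken. */
int cond_passed(cond_t cond, flags_t f)
{
    switch (cond) {
    case COND_EQ: return f.z;                      /* equal                  */
    case COND_NE: return !f.z;                     /* not equal              */
    case COND_CS: return f.c;                      /* unsigned >= (HS)       */
    case COND_CC: return !f.c;                     /* unsigned <  (LO)       */
    case COND_MI: return f.n;                      /* negative               */
    case COND_PL: return !f.n;                     /* positive or zero       */
    case COND_VS: return f.v;                      /* overflow               */
    case COND_VC: return !f.v;                     /* no overflow            */
    case COND_HI: return f.c && !f.z;              /* unsigned >             */
    case COND_LS: return !f.c || f.z;              /* unsigned <=            */
    case COND_GE: return f.n == f.v;               /* signed >=              */
    case COND_LT: return f.n != f.v;               /* signed <               */
    case COND_GT: return !f.z && (f.n == f.v);     /* signed >               */
    case COND_LE: return f.z || (f.n != f.v);      /* signed <=              */
    }
    return 0;
}
```

After comparing 3 with 5 the flags are N=1, Z=0, C=0, V=0, so LT and CC pass while GE and HI do not.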
Changing Program Flow - Subroutines
Call
BL <label> - branch with link
Call subroutine at <label>
PC-relative, range limited to PC+/-16MB
Save return address in LR
BLX <Rd> - branch with link and exchange
Call subroutine at address in register Rd ("exchange" refers to the instruction set state, taken from bit 0 of Rd)
Supports full 4GB address range
LSB of target address must be set to 1 to ensure continued execution in Thumb
state
Save return address in LR
Return
BX <Rd> branch and exchange
Branch to address specified by <Rd>
LSB of target address must be set to 1 to ensure continued execution in
Thumb state
Supports full 4 GB address space
BX LR - Return from subroutine
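The role of the least significant bit in a BX or BLX target can be sketched with three small helpers. The names are hypothetical; the sketch only models the bit manipulation, not the exception raised for an invalid target.

```c
#include <stdint.h>

/* Form a BX/BLX target from a half-word aligned code address. */
uint32_t thumb_target(uint32_t code_addr) { return code_addr | 1u; }

/* A target keeps the core in Thumb state only if its LSB is 1. */
int keeps_thumb_state(uint32_t target) { return (target & 1u) != 0u; }

/* The address actually fetched from has the LSB cleared. */
uint32_t fetch_address(uint32_t target) { return target & ~1u; }
```

A function pointer to Thumb code therefore holds an odd value even though the first instruction sits at an even address.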
Special Register Instructions
Move to Register from Special Register
MRS <Rd>, <spec_reg>
Move to Special Register from Register
MSR <spec_reg>, <Rd>
Change Processor State -
Modify PRIMASK register
CPSIE i - Interrupt enable (clears PRIMASK)
CPSID i - Interrupt disable (sets PRIMASK)
Other
No Operation - does nothing!
NOP
Breakpoint - causes hard fault or debug halt - used
to implement software breakpoints
BKPT #<imm8>
Wait for interrupt - Pause program, enter low-power
state until a WFI wake-up event occurs (e.g. an
interrupt)
WFI
Supervisor call generates SVC exception (#11),
same as software interrupt
SVC #<imm>