Microcontrollers: Module 2: Introduction to the ARM Instruction Set
MODULE 2
INTRODUCTION TO THE ARM INSTRUCTION SET
Different ARM architecture revisions support different instructions. However, new revisions usually add instructions and remain backwardly compatible. Code you write for architecture ARMv4T should execute on an ARMv5TE processor.
The following Table provides a complete list of ARM instructions available in the ARMv5E instruction set architecture (ISA). This ISA includes all the core ARM instructions as well as some of the newer features in the ARM instruction set.
Table: ARM Instruction Set
Mnemonics        ARM ISA      Description
ADC              v1           add two 32-bit values and carry
ADD              v1           add two 32-bit values
AND              v1           logical bitwise AND of two 32-bit values
B                v1           branch relative +/- 32 MB
BIC              v1           logical bit clear (AND NOT) of two 32-bit values
BKPT             v5           breakpoint instructions
BL               v1           relative branch with link
BLX              v5           branch with link and exchange
BX               v4T          branch with exchange
CDP CDP2         v2 v5        coprocessor data processing operation
CLZ              v5           count leading zeros
CMN              v1           compare negative two 32-bit values
CMP              v1           compare two 32-bit values
EOR              v1           logical exclusive OR of two 32-bit values
LDC LDC2         v2 v5        load to coprocessor single or multiple 32-bit values
LDM              v1           load multiple 32-bit words from memory to ARM registers
LDR              v1 v4 v5E    load a single value from a virtual address in memory
MCR MCR2 MCRR    v2 v5 v5E    move to coprocessor from an ARM register or registers
MLA              v2           multiply and accumulate 32-bit values
MOV              v1           move a 32-bit value into a register
MRC MRC2 MRRC    v2 v5 v5E    move to ARM register or registers from a coprocessor
MRS              v3           move to ARM register from a status register (cpsr or spsr)
MSR              v3           move to a status register (cpsr or spsr) from an ARM register
MUL              v2           multiply two 32-bit values
MVN              v1           move the logical NOT of 32-bit value into a register
ORR              v1           logical bitwise OR of two 32-bit values
PLD              v5E          preload hint instruction
QADD             v5E          signed saturated 32-bit add
QDADD            v5E          signed saturated double and 32-bit add
QDSUB            v5E          signed saturated double and 32-bit subtract
QSUB             v5E          signed saturated 32-bit subtract
RSB              v1           reverse subtract of two 32-bit values
RSC              v1           reverse subtract with carry of two 32-bit integers
SBC              v1           subtract with carry of two 32-bit values
SMLAxy           v5E          signed multiply accumulate instructions ((16 x 16) + 32 = 32-bit)
SMLAL            v3M          signed multiply accumulate long ((32 x 32) + 64 = 64-bit)
SMLALxy          v5E          signed multiply accumulate long ((16 x 16) + 64 = 64-bit)
SMLAWy           v5E          signed multiply accumulate instruction (((32 x 16) >> 16) + 32 = 32-bit)
SMULL            v3M          signed multiply long (32 x 32 = 64-bit)
SMULxy           v5E          signed multiply instructions (16 x 16 = 32-bit)
SMULWy           v5E          signed multiply instruction ((32 x 16) >> 16 = 32-bit)
STC STC2         v2 v5        store to memory single or multiple 32-bit values from coprocessor
STM              v1           store multiple 32-bit registers to memory
STR              v1 v4 v5E    store register to a virtual address in memory
SUB              v1           subtract two 32-bit values
SWI              v1           software interrupt
SWP              v2a          swap a word/byte in memory with a register, without interruption
TEQ              v1           test for equality of two 32-bit values
TST              v1           test for bits in a 32-bit value
UMLAL            v3M          unsigned multiply accumulate long ((32 x 32) + 64 = 64-bit)
UMULL            v3M          unsigned multiply long (32 x 32 = 64-bit)
Lakshmi R, Chethan Ghatage, Dr. Girijamma H A, Dr. Devaraju D B, RNSIT-CSE
In the following sections, the hexadecimal numbers are represented with the prefix 0x and binary numbers with the prefix 0b. The examples follow this format:
PRE  <pre-conditions>
     <instruction/s>
POST <post-conditions>
In the pre- and post-conditions, memory is denoted as mem<data_size>[address].
This refers to data_size bits of memory starting at the given byte address. For example, mem32[1024] is the 32-bit value starting at address 1 KB.
ARM instructions process data held in registers, and memory is accessed only with load and store instructions.
ARM instructions commonly take two or three operands. For instance, the ADD instruction below adds the two values stored in registers r1 and r2 (the source registers). It writes the result to register r3 (the destination register).
Instruction syntax    Destination register (Rd)    Source register 1 (Rn)    Source register 2 (Rm)
ADD r3, r1, r2        r3                           r1                        r2
ARM instructions are classified as data processing instructions, branch instructions, load-store instructions, software interrupt instruction, and program status register instructions.
DATA PROCESSING INSTRUCTIONS:
The data processing instructions manipulate data within registers. They are move instructions, arithmetic instructions, logical instructions, comparison instructions, and multiply instructions.
Most data processing instructions can process one of their operands using the barrel shifter.
If you use the S suffix on a data processing instruction, then it updates the flags in the cpsr.
Move and logical operations update the carry flag C, negative flag N, and zero flag Z.
- The C flag is set from the result of the barrel shift as the last bit shifted out.
- The N flag is set to bit 31 of the result.
- The Z flag is set if the result is zero.
Move Instructions:
The move instruction copies N into a destination register Rd, where N is a register or immediate value. This instruction is useful for setting initial values and transferring data between registers.
Syntax: <instruction>{<cond>}{S} Rd, N
MOV    move a 32-bit value into a register                Rd = N
MVN    move the NOT of the 32-bit value into a register   Rd = ~N
Example: This example shows a simple move instruction. The MOV instruction takes the contents of register r5 and copies them into register r7, in this case, taking the value 5, and overwriting the value 8 in register r7.
PRE  r5 = 5
     r7 = 8
     MOV r7, r5 ; let r7 = r5
POST r5 = 5
     r7 = 5
Barrel Shifter:
In the above Example, we showed a MOV instruction where N is a simple register. But N can be more than just a register or immediate value; it can also be a register Rm that has been preprocessed by the barrel shifter prior to being used by a data processing instruction.
- Data processing instructions are processed within the arithmetic logic unit (ALU).
- A unique and powerful feature of the ARM processor is the ability to shift the 32-bit binary pattern in one of the source registers left or right by a specific number of positions before it enters the ALU.
- Pre-processing or shift occurs within the cycle time of the instruction.
- This shift increases the power and flexibility of many data processing operations. It is particularly useful for loading constants into a register and achieving fast multiplication or division by a power of 2.
- There are data processing instructions that do not use the barrel shift, for example, the MUL (multiply), CLZ (count leading zeros), and QADD (signed saturated 32-bit add) instructions.
Figure: Barrel Shifter and ALU (Rn enters the ALU with no pre-processing; Rm passes through the barrel shifter, whose result N enters the ALU)
- The Figure shows the data flow between the ALU and the barrel shifter.
- Register Rn enters the ALU without any pre-processing of registers.
- We apply a logical shift left (LSL) to register Rm before moving it to the destination register. This is the same as applying the standard C language shift operator << to the register.
- The MOV instruction copies the shift operator result N into register Rd. N represents the result of the LSL operation described in the following Table.
Table: Barrel Shifter Operations
Mnemonic   Description             Shift    Result                                     Shift amount y
LSL        logical shift left      xLSLy    x << y                                     #0-31 or Rs
LSR        logical shift right     xLSRy    (unsigned)x >> y                           #1-32 or Rs
ASR        arithmetic right shift  xASRy    (signed)x >> y                             #1-32 or Rs
ROR        rotate right            xRORy    ((unsigned)x >> y) | (x << (32 - y))       #1-31 or Rs
RRX        rotate right extended   xRRX     (c flag << 31) | ((unsigned)x >> 1)        none
Note: x represents the register being shifted and y represents the shift amount.
The five different shift operations that you can use within the barrel shifter are summarized in the above Table.
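The five shift operations in the Table can be modelled in a few lines of Python. This is only an illustrative sketch: the function names and the 32-bit masking constant are assumptions made for the example, not part of the ARM toolchain, and Python integers are unbounded, so every result is masked to 32 bits by hand.

```python
MASK32 = 0xFFFFFFFF  # assumed helper: keeps values within a 32-bit register

def lsl(x, y):
    """Logical shift left: x << y, truncated to 32 bits."""
    return (x << y) & MASK32

def lsr(x, y):
    """Logical shift right: zeros are shifted in from the left."""
    return (x & MASK32) >> y

def asr(x, y):
    """Arithmetic shift right: bit 31 (the sign bit) is replicated."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32          # reinterpret the pattern as a negative number
    return (x >> y) & MASK32

def ror(x, y):
    """Rotate right: bits leaving bit 0 re-enter at bit 31."""
    x &= MASK32
    return ((x >> y) | (x << (32 - y))) & MASK32

def rrx(x, c_flag):
    """Rotate right extended: the old C flag enters at bit 31."""
    return ((c_flag << 31) | ((x & MASK32) >> 1)) & MASK32

print(hex(lsl(5, 2)), hex(asr(0x80000004, 1)), hex(ror(1, 1)))
```

For instance, lsl(5, 2) corresponds to the operand "r5, LSL #2" when r5 holds 5.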
PRE  r5 = 5
     r7 = 8
     MOV r7, r5, LSL #2 ; let r7 = r5*4 = (r5 << 2)
POST r5 = 5
     r7 = 20
- The above example multiplies register r5 by four and then places the result into register r7.
- The following Figure illustrates a logical shift left by one.
Figure: Logical Shift Left by One (condition flags updated when S is present)
For example, the contents of bit 0 are shifted to bit 1. Bit 0 is cleared. The C flag is updated with the last bit shifted out of the register. This is bit (32 - y) of the original value, where y is the shift amount. When y is greater than one, then a shift by y positions is the same as a shift by one position executed y times.
Example: This example of a MOVS instruction shifts register r1 left by one bit. This multiplies register r1 by a value 2^1. As you can see, the C flag is updated in the cpsr because the S suffix is present in the instruction mnemonic.
PRE  cpsr = nzcvqiFt_USER
     r0 = 0x00000000
     r1 = 0x80000004
     MOVS r0, r1, LSL #1
POST cpsr = nzCvqiFt_USER
     r0 = 0x00000008
     r1 = 0x80000004
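The flag behaviour in this MOVS example can be checked with a small Python model. It is a sketch under stated assumptions: the function name movs_lsl is hypothetical, only immediate shift amounts from 1 to 31 are handled, and the flags are returned as plain integers rather than as a cpsr value.

```python
def movs_lsl(rm, shift):
    """Model MOVS Rd, Rm, LSL #shift for 1 <= shift <= 31.

    Returns (result, n, z, c), where c is the last bit shifted out,
    that is, bit (32 - shift) of the original value.
    """
    rm &= 0xFFFFFFFF
    result = (rm << shift) & 0xFFFFFFFF
    c = (rm >> (32 - shift)) & 1   # carry out of the barrel shifter
    n = (result >> 31) & 1         # bit 31 of the result
    z = int(result == 0)           # 1 when the result is zero
    return result, n, z, c

result, n, z, c = movs_lsl(0x80000004, 1)
print(hex(result), n, z, c)
```

With r1 = 0x80000004 the model gives 0x00000008 with only C set, which matches the nzCvqiFt_USER post-condition above.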
The following Table lists the syntax for the different barrel shift operations available on data processing instructions. The second operand N can be an immediate constant preceded by #, a register value Rm, or the value of Rm processed by a shift.
Table: Barrel Shifter Operation Syntax for Data Processing Instructions
N shift operations                    Syntax
Immediate                             #immediate
Register                              Rm
Logical shift left by immediate       Rm, LSL #shift_imm
Logical shift left by register        Rm, LSL Rs
Logical shift right by immediate      Rm, LSR #shift_imm
Logical shift right by register       Rm, LSR Rs
Arithmetic shift right by immediate   Rm, ASR #shift_imm
Arithmetic shift right by register    Rm, ASR Rs
Rotate right by immediate             Rm, ROR #shift_imm
Rotate right by register              Rm, ROR Rs
Rotate right with extend              Rm, RRX
Arithmetic Instructions:
The arithmetic instructions implement addition and subtraction of 32-bit signed and unsigned values.
Syntax: <instruction>{<cond>}{S} Rd, Rn, N
ADC    add two 32-bit values and carry                     Rd = Rn + N + carry
ADD    add two 32-bit values                               Rd = Rn + N
RSB    reverse subtract of two 32-bit values               Rd = N - Rn
RSC    reverse subtract with carry of two 32-bit values    Rd = N - Rn - !(carry flag)
SBC    subtract with carry of two 32-bit values            Rd = Rn - N - !(carry flag)
SUB    subtract two 32-bit values                          Rd = Rn - N
N is the result of the shifter operation.
Example: The following simple subtract instruction subtracts a value stored in register r2 from a value stored in register r1. The result is stored in register r0.
PRE  r0 = 0x00000000
     r1 = 0x00000002
     r2 = 0x00000001
     SUB r0, r1, r2
POST r0 = 0x00000001
Example: The following reverse subtract instruction (RSB) subtracts r1 from the constant value #0, writing the result to r0. You can use this instruction to negate numbers.
PRE  r0 = 0x00000000
     r1 = 0x00000077
     RSB r0, r1, #0 ; Rd = 0x0 - r1
POST r0 = -r1 = 0xffffff89
Example: The SUBS instruction is useful for decrementing loop counters. In this example, we subtract the immediate value one from the value one stored in register r1. The result value zero is written to register r1. The cpsr is updated with the Z and C flags being set.
PRE  cpsr = nzcvqiFt_USER
     r1 = 0x00000001
     SUBS r1, r1, #1
POST cpsr = nZCvqiFt_USER
     r1 = 0x00000000
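The three subtract examples above can be reproduced with a short Python model. This is a sketch, not a description of the hardware: subs and rsb are hypothetical helper names, the flags are returned in a dictionary, and C is modelled the ARM way for subtraction, that is, set when no borrow occurs.

```python
MASK32 = 0xFFFFFFFF  # assumed helper constant for 32-bit wraparound

def subs(rn, n):
    """Model SUBS Rd, Rn, N; returns (result, flags)."""
    rn &= MASK32
    n &= MASK32
    result = (rn - n) & MASK32
    flags = {
        "N": (result >> 31) & 1,                      # bit 31 of the result
        "Z": int(result == 0),                        # result is zero
        "C": int(rn >= n),                            # set when no borrow occurs
        "V": (((rn ^ n) & (rn ^ result)) >> 31) & 1,  # signed overflow
    }
    return result, flags

def rsb(rn, n):
    """Model RSB Rd, Rn, N: Rd = N - Rn, wrapped to 32 bits."""
    return (n - rn) & MASK32

print(subs(1, 1))         # the loop-counter example: Z and C set
print(hex(rsb(0x77, 0)))  # negating 0x77
```

The zero result of subs(1, 1) sets Z, and the absence of a borrow sets C, which is why the post-condition reads nZCvqiFt_USER.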
Using the Barrel Shifter with Arithmetic Instructions:
The wide range of second operand shifts available on arithmetic and logical instructions is a very powerful feature of the ARM instruction set. The following Example illustrates the use of the inline barrel shifter with an arithmetic instruction. The instruction multiplies the value stored in register r1 by three.
Example: Register r1 is first shifted one location to the left to give the value of twice r1. The ADD instruction then adds the result of the barrel shift operation to register r1. The final result transferred into register r0 is equal to three times the value stored in register r1.
PRE  r0 = 0x00000000
     r1 = 0x00000005
     ADD r0, r1, r1, LSL #1
POST r0 = 0x0000000f
     r1 = 0x00000005
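The arithmetic behind this example is r0 = r1 + (r1 << 1) = 3 * r1. A minimal Python sketch of the same computation follows; add_with_lsl is a hypothetical name chosen for the illustration, and the masking stands in for the 32-bit register width.

```python
def add_with_lsl(rn, rm, shift):
    """Model ADD Rd, Rn, Rm, LSL #shift: Rd = Rn + (Rm << shift), 32-bit."""
    shifted = (rm << shift) & 0xFFFFFFFF   # barrel shifter output N
    return (rn + shifted) & 0xFFFFFFFF     # ALU addition

r1 = 0x00000005
r0 = add_with_lsl(r1, r1, 1)   # r1 + 2*r1
print(hex(r0))
```

The same pattern gives other small constant multiples in one instruction, for example a shift of 2 yields five times the register value.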
Logical Instructions:
Logical instructions perform bitwise logical operations on the two source registers.
Syntax: <instruction>{<cond>}{S} Rd, Rn, N
AND    logical bitwise AND of two 32-bit values     Rd = Rn & N
ORR    logical bitwise OR of two 32-bit values      Rd = Rn | N
EOR    logical exclusive OR of two 32-bit values    Rd = Rn ^ N
BIC    logical bit clear (AND NOT)                  Rd = Rn & ~N
Example: This example shows a logical OR operation between registers r1 and r2. Register r0 holds the result.
PRE  r0 = 0x00000000
     r1 = 0x02040608
     r2 = 0x10305070
     ORR r0, r1, r2
POST r0 = 0x12345678
Example: This example shows a more complicated logical instruction called BIC, which carries out a logical bit clear.
PRE  r1 = 0b1111
     r2 = 0b0101
     BIC r0, r1, r2
POST r0 = 0b1010
This is equivalent to Rd = Rn AND NOT(N).
In this example, register r2 contains a binary pattern where every binary 1 in r2 clears a corresponding bit location in register r1.
This instruction is particularly useful when clearing status bits and is frequently used to change interrupt masks in the cpsr.
NOTE: The logical instructions update the cpsr flags only if the S suffix is present. These instructions can use barrel-shifted second operands in the same way as the arithmetic instructions.
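The ORR and BIC examples map directly onto Python's bitwise operators. The sketch below uses hypothetical function names; the only subtlety is that Python's ~ produces a negative number, so the result is masked back to 32 bits.

```python
MASK32 = 0xFFFFFFFF  # assumed helper constant

def orr(rn, n):
    """Model ORR Rd, Rn, N: Rd = Rn | N."""
    return (rn | n) & MASK32

def bic(rn, n):
    """Model BIC Rd, Rn, N: every 1 bit in N clears that bit of Rn."""
    return rn & ~n & MASK32

print(hex(orr(0x02040608, 0x10305070)))  # the ORR example
print(bin(bic(0b1111, 0b0101)))          # the BIC example
```

As an illustration of the interrupt-mask use mentioned above, bic(value, 0x80) clears bit 7 of a status value, which is the position of the I (IRQ disable) bit in the cpsr.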
Comparison Instructions:
The comparison instructions are used to compare or test a register with a 32-bit value. They update the cpsr flag bits according to the result, but do not affect other registers. After the bits have been set, the information can then be used to change program flow by using conditional execution.
- It is not required to apply the S suffix for comparison instructions to update the flags.
Syntax: <instruction>{<cond>} Rn, N
CMN    compare negated                          flags set as a result of Rn + N
CMP    compare                                  flags set as a result of Rn - N
TEQ    test for equality of two 32-bit values   flags set as a result of Rn ^ N
TST    test bits of a 32-bit value              flags set as a result of Rn & N
N is the result of the shifter operation.
Example: This example shows a CMP comparison instruction. You can see that both registers, r0 and r9, are equal before executing the instruction. The value of the Z flag prior to execution is 0 and is represented by a lowercase z. After execution the Z flag changes to 1 or an uppercase Z. This change indicates equality. The C flag is also set, because the subtraction r0 - r9 produces no borrow.
PRE  cpsr = nzcvqiFt_USER
     r0 = 4
     r9 = 4
     CMP r0, r9
POST cpsr = nZCvqiFt_USER
The CMP is effectively a subtract instruction with the result discarded; similarly the TST instruction is a logical AND operation, and TEQ is a logical exclusive OR operation. For each, the results are discarded but the condition bits are updated in the cpsr.
It is important to understand that comparison instructions only modify the condition flags of the cpsr and do not affect the registers being compared.
Multiply Instructions:
The multiply instructions multiply the contents of a pair of registers and, depending upon the instruction, accumulate the results in with another register.
The long multiplies accumulate onto a pair of registers representing a 64-bit value. The final result is placed in a destination register or a pair of registers.
Syntax: MLA{<cond>}{S} Rd, Rm, Rs, Rn
        MUL{<cond>}{S} Rd, Rm, Rs
MLA    multiply and accumulate    Rd = (Rm * Rs) + Rn
MUL    multiply                   Rd = Rm * Rs
Syntax: <instruction>{<cond>}{S} RdLo, RdHi, Rm, Rs
SMLAL    signed multiply accumulate long      [RdHi, RdLo] = [RdHi, RdLo] + (Rm * Rs)
SMULL    signed multiply long                 [RdHi, RdLo] = Rm * Rs
UMLAL    unsigned multiply accumulate long    [RdHi, RdLo] = [RdHi, RdLo] + (Rm * Rs)
UMULL    unsigned multiply long               [RdHi, RdLo] = Rm * Rs
The number of cycles taken to execute a multiply instruction depends on the processor implementation. For some implementations the cycle timing also depends on the value in Rs.
Example: This example shows a simple multiply instruction that multiplies registers r1 and r2 together and places the result into register r0. In this example, register r1 is equal to the value 2, and r2 is equal to 2. The result, 4, is then placed into register r0.
PRE  r0 = 0x00000000
     r1 = 0x00000002
     r2 = 0x00000002
     MUL r0, r1, r2 ; r0 = r1*r2
POST r0 = 0x00000004
     r1 = 0x00000002
     r2 = 0x00000002
The long multiply instructions (SMLAL, SMULL, UMLAL, and UMULL) produce a 64-bit result. The result is too large to fit a single 32-bit register so the result is placed in two registers labeled RdLo and RdHi. RdLo holds the lower 32 bits of the 64-bit result, and RdHi holds the higher 32 bits of the 64-bit result. The following shows an example of a long unsigned multiply instruction.
Example: The instruction multiplies registers r2 and r3 and places the result into registers r0 and r1. Register r0 contains the lower 32 bits, and register r1 contains the higher 32 bits of the 64-bit result.
PRE  r0 = 0x00000000
     r1 = 0x00000000
     r2 = 0xf0000002
     r3 = 0x00000002
     UMULL r0, r1, r2, r3 ; [r1,r0] = r2*r3
POST r0 = 0xe0000004 ; = RdLo
     r1 = 0x00000001 ; = RdHi
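The split of a 64-bit product into RdLo and RdHi can be sketched in Python, whose integers do not overflow. The function names umull and smull below are hypothetical stand-ins for the instructions, and the signed variant is included only to contrast how the same bit patterns are interpreted.

```python
MASK32 = 0xFFFFFFFF  # assumed helper constant

def umull(rm, rs):
    """Model UMULL RdLo, RdHi, Rm, Rs; returns (RdLo, RdHi)."""
    product = (rm & MASK32) * (rs & MASK32)        # full 64-bit product
    return product & MASK32, (product >> 32) & MASK32

def to_signed32(x):
    """Interpret a 32-bit pattern as a two's complement number."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def smull(rm, rs):
    """Model SMULL RdLo, RdHi, Rm, Rs; returns (RdLo, RdHi)."""
    product = to_signed32(rm) * to_signed32(rs)
    return product & MASK32, (product >> 32) & MASK32

lo, hi = umull(0xF0000002, 0x00000002)
print(hex(lo), hex(hi))
```

The low word is the same for the signed and unsigned product of these operands; only the high word differs.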
BRANCH INSTRUCTIONS
A branch instruction changes the flow of execution or is used to call a routine. This type of instruction allows programs to have subroutines, if-then-else structures, and loops.
The change of execution flow forces the program counter pc to point to a new address. The ARMv5E instruction set includes four different branch instructions.
Syntax: B{<cond>} label
        BL{<cond>} label
        BX{<cond>} Rm
        BLX{<cond>} label | Rm
B      branch                       pc = label
BL     branch with link             pc = label
                                    lr = address of the next instruction after the BL
BX     branch exchange              pc = Rm & 0xfffffffe, T = Rm & 1
BLX    branch exchange with link    pc = label, T = 1
                                    pc = Rm & 0xfffffffe, T = Rm & 1
                                    lr = address of the next instruction after the BLX
- The address label is stored in the instruction as a signed pc-relative offset and must be within approximately 32 MB of the branch instruction.
- T refers to the Thumb bit in the cpsr. When instructions set T, the ARM switches to Thumb state.
Example: This example shows a forward and backward branch. Because these loops are address specific, we do not include the pre- and post-conditions. The forward branch skips three instructions. The backward branch creates an infinite loop.
        B    forward
        ADD  r1, r2, #4
        ADD  r0, r6, #2
        ADD  r3, r7, #4
forward
        SUB  r1, r2, #4
backward
        ADD  r1, r2, #4
        SUB  r1, r2, #4
        ADD  r4, r6, r7
        B    backward
In this example, forward and backward are the labels. The branch labels are placed at the beginning of the line and are used to mark an address that can be used later by the assembler to calculate the branch offset.
- The branch with link, or BL, instruction is similar to the B instruction but overwrites the link register lr with a return address. It performs a subroutine call.
Example: This example shows a simple fragment of code that branches to a subroutine using the BL instruction. To return from a subroutine, you copy the link register to the pc.
        BL    subroutine   ; branch to subroutine
        CMP   r1, #5       ; compare r1 with 5
        MOVEQ r1, #0       ; if (r1==5) then r1 = 0
        :
subroutine
        <subroutine code>
        MOV   pc, lr       ; return by moving pc = lr
- The branch exchange (BX) and branch exchange with link (BLX) are the third type of branch instruction.
- The BX instruction uses an absolute address stored in register Rm. It is primarily used to branch to and from Thumb code. The T bit in the cpsr is updated by the least significant bit of the branch register.
- Similarly the BLX instruction updates the T bit of the cpsr with the least significant bit and additionally sets the link register with the return address.
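The register forms of BX and BLX reduce to two bit operations on Rm, which the following Python sketch makes explicit. The function names and the returned tuples are assumptions made for illustration; a real core would of course update pc, lr and the cpsr directly.

```python
def bx(rm):
    """Model BX Rm: returns (pc, T)."""
    pc = rm & 0xFFFFFFFE   # bit 0 is never part of the target address
    t = rm & 1             # bit 0 selects Thumb (1) or ARM (0) state
    return pc, t

def blx_register(rm, next_instruction_address):
    """Model BLX Rm: as BX, and lr receives the return address."""
    pc, t = bx(rm)
    lr = next_instruction_address
    return pc, t, lr

print(bx(0x00008001))                        # branch to Thumb code at 0x8000
print(blx_register(0x00008000, 0x00000104))  # call ARM code, return to 0x104
```

The addresses used here are arbitrary example values.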
LOAD-STORE INSTRUCTIONS:
Load-store instructions transfer data between memory and processor registers. There are three types of load-store instructions: single-register transfer, multiple-register transfer, and swap.
Single-Register Transfer:
These instructions are used for moving a single data item in and out of a register.
- The data types supported are signed and unsigned words (32-bit), half-words (16-bit), and bytes.
Here are the various load-store single-register transfer instructions.
Syntax: <LDR|STR>{<cond>}{B} Rd, addressing1
        LDR{<cond>}SB|H|SH Rd, addressing2
        STR{<cond>}H Rd, addressing2
LDR      load word into a register               Rd <- mem32[address]
STR      save byte or word from a register       Rd -> mem32[address]
LDRB     load byte into a register               Rd <- mem8[address]
STRB     save byte from a register               Rd -> mem8[address]
LDRH     load halfword into a register           Rd <- mem16[address]
STRH     save halfword into a register           Rd -> mem16[address]
LDRSB    load signed byte into a register        Rd <- SignExtend(mem8[address])
LDRSH    load signed halfword into a register    Rd <- SignExtend(mem16[address])
Multiple-Register Transfer:
Syntax: <LDM|STM>{<cond>}<addressing mode> Rn{!}, <registers>{^}
LDM    load multiple registers    {Rd}*N <- mem32[start address + 4*N], optional Rn updated
STM    save multiple registers    {Rd}*N -> mem32[start address + 4*N], optional Rn updated
The following Table shows the different addressing modes for the load-store multiple instructions. Here N is the number of registers in the list of registers.
Table: Addressing Mode for Load-Store Multiple Instructions
Addressing mode   Description        Start address    End address      Rn!
IA                increment after    Rn               Rn + 4*N - 4     Rn + 4*N
IB                increment before   Rn + 4           Rn + 4*N         Rn + 4*N
DA                decrement after    Rn - 4*N + 4     Rn               Rn - 4*N
DB                decrement before   Rn - 4*N         Rn - 4           Rn - 4*N
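The start address, end address and updated base in the Table follow mechanically from Rn and N, as this small Python sketch shows. The function name ldm_stm_addresses is hypothetical, and the base value in the usage line is just an example.

```python
def ldm_stm_addresses(mode, rn, n):
    """Return (start, end, updated_rn) for a transfer of n registers.

    mode is one of "IA", "IB", "DA", "DB"; updated_rn is the value
    written back to the base register when Rn is followed by "!".
    """
    table = {
        "IA": (rn,             rn + 4 * n - 4, rn + 4 * n),
        "IB": (rn + 4,         rn + 4 * n,     rn + 4 * n),
        "DA": (rn - 4 * n + 4, rn,             rn - 4 * n),
        "DB": (rn - 4 * n,     rn - 4,         rn - 4 * n),
    }
    return table[mode]

for mode in ("IA", "IB", "DA", "DB"):
    start, end, new_rn = ldm_stm_addresses(mode, 0x80010, 3)
    print(mode, hex(start), hex(end), hex(new_rn))
```

In every mode the lowest-numbered register is transferred to or from the lowest address in the range from start to end.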
- Any subset of the current bank of registers can be transferred to memory or fetched from memory.
- The base register Rn determines the source or destination address for a load-store multiple instruction. This register can be optionally updated following the transfer. This occurs when register Rn is followed by the ! character, similar to the single-register load-store with writeback.
Example: In this example, register r0 is the base register Rn and is followed by !, indicating that the register is updated after the instruction is executed. You will notice within the load multiple instruction that the registers are not individually listed. Instead the "-" character is used to identify a range of registers. In this case the range is from register r1 to r3 inclusive.
Each register can also be listed, using a comma to separate each register within "{" and "}" brackets.
PRE  mem32[0x80018] = 0x03
     mem32[0x80014] = 0x02
     mem32[0x80010] = 0x01
     r0 = 0x00080010
     r1 = 0x00000000
     r2 = 0x00000000
     r3 = 0x00000000
     LDMIA r0!, {r1-r3}
POST r0 = 0x0008001c
     r1 = 0x00000001
     r2 = 0x00000002
     r3 = 0x00000003
The following Figure shows a graphical representation.
Address pointer    Memory address    Data          Registers
                   0x80020           0x00000005
                   0x8001c           0x00000004
                   0x80018           0x00000003    r3 = 0x00000000
                   0x80014           0x00000002    r2 = 0x00000000
r0 = 0x80010 ->    0x80010           0x00000001    r1 = 0x00000000
                   0x8000c           0x00000000
Figure: Pre-condition for LDMIA Instruction
- The base register r0 points to memory address 0x80010 in the PRE condition.
- Memory addresses 0x80010, 0x80014, and 0x80018 contain the values 1, 2, and 3 respectively.
- After the load multiple instruction executes, registers r1, r2, and r3 contain these values as shown in the following Figure.
Address pointer    Memory address    Data          Registers
                   0x80020           0x00000005
r0 = 0x8001c ->    0x8001c           0x00000004
                   0x80018           0x00000003    r3 = 0x00000003
                   0x80014           0x00000002    r2 = 0x00000002
                   0x80010           0x00000001    r1 = 0x00000001
                   0x8000c           0x00000000
Figure: Post Condition for LDMIA Instruction
- The base register r0 now points to memory address 0x8001c after the last loaded word.
- Now replace the LDMIA instruction with a load multiple and increment before (LDMIB) instruction and use the same PRE conditions.
- The first word pointed to by register r0 is ignored and register r1 is loaded from the next memory location as shown in the following Figure.
Address pointer    Memory address    Data          Registers
                   0x80020           0x00000005
r0 = 0x8001c ->    0x8001c           0x00000004    r3 = 0x00000004
                   0x80018           0x00000003    r2 = 0x00000003
                   0x80014           0x00000002    r1 = 0x00000002
                   0x80010           0x00000001
                   0x8000c           0x00000000
Figure: Post Condition for LDMIB Instruction
- After execution, register r0 now points to the last loaded memory location. This is in contrast with the LDMIA example, which pointed to the next memory location.
- The decrement versions DA and DB of the load-store multiple instructions decrement the start address and then store to ascending memory locations. This is equivalent to descending memory but accessing the register list in reverse order.
- With the increment and decrement load multiples, you can access arrays forwards or backwards.
- They also allow for stack push and pull operations.
The following Table shows a list of load-store multiple instruction pairs.
Table: Load-Store Multiple Pairs when Base Update used
Store Multiple    Load Multiple
STMIA             LDMDB
STMIB             LDMDA
STMDA             LDMIB
STMDB             LDMIA
- If you use a store with base update, then the paired load instruction of the same number of registers will reload the data and restore the base address pointer.
- This is useful when you need to temporarily save a group of registers and restore them later.
Example: This example shows an STM increment before instruction followed by an LDM decrement after instruction.
PRE    r0 = 0x00009000
       r1 = 0x00000009
       r2 = 0x00000008
       r3 = 0x00000007
       STMIB r0!, {r1-r3}
       MOV r1, #1
       MOV r2, #2
       MOV r3, #3
PRE(2) r0 = 0x0000900c
       r1 = 0x00000001
       r2 = 0x00000002
       r3 = 0x00000003
       LDMDA r0!, {r1-r3}
POST   r0 = 0x00009000
       r1 = 0x00000009
       r2 = 0x00000008
       r3 = 0x00000007
The STMIB instruction stores the values 7, 8, 9 to memory. We then corrupt registers r1 to r3. The LDMDA reloads the original values and restores the base pointer r0.
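The save-and-restore pairing of STMIB and LDMDA can be traced with a toy memory model in Python. Everything here is an assumption made for illustration: memory is a dictionary keyed by byte address, the helper names stmib and ldmda are hypothetical, and both helpers always write back the base, as the "!" does in the example.

```python
def stmib(mem, base, values):
    """Model STMIB base!, {regs}: increment the address, then store.

    values lists the registers in ascending register order.
    Returns the updated base register.
    """
    addr = base
    for v in values:
        addr += 4
        mem[addr] = v
    return addr

def ldmda(mem, base, count):
    """Model LDMDA base!, {regs}: returns (values, updated_base).

    The lowest register is loaded from the lowest address.
    """
    start = base - 4 * count + 4
    values = [mem[start + 4 * i] for i in range(count)]
    return values, base - 4 * count

mem = {}
r0 = stmib(mem, 0x00009000, [9, 8, 7])   # save r1, r2, r3
(r1, r2, r3), r0_restored = ldmda(mem, r0, 3)
print(hex(r0), r1, r2, r3, hex(r0_restored))
```

After the store the base is 0x900c, and the paired load returns both the original register values and the original base 0x9000.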
Example: We illustrate the use of the load-store multiple instructions with a block memory copy example. This example is a simple routine that copies blocks of 32 bytes from a source address location to a destination address location.
The example has two load-store multiple instructions, which use the same increment after addressing mode.
        ; r9 points to start of source data
        ; r10 points to start of destination data
        ; r11 points to end of the source
loop
        ; load 32 bytes from source and update r9 pointer
        LDMIA r9!, {r0-r7}
        ; store 32 bytes to destination and update r10 pointer
        STMIA r10!, {r0-r7} ; and store them
        ; have we reached the end
        CMP   r9, r11
        BNE   loop
- This routine relies on registers r9, r10, and r11 being set up before the code is executed.
- Registers r9 and r11 determine the data to be copied, and register r10 points to the destination in memory for the data.
- LDMIA loads the data pointed to by register r9 into registers r0 to r7. It also updates r9 to point to the next block of data to be copied.
- STMIA copies the contents of registers r0 to r7 to the destination memory address pointed to by register r10. It also updates r10 to point to the next destination location.
- CMP and BNE compare pointers r9 and r11 to check whether the end of the block copy has been reached.
- If the block copy is complete, then the routine finishes; otherwise the loop repeats with the updated values of registers r9 and r10.
The BNE is the branch instruction B with a condition mnemonic NE (not equal). If the previous compare instruction sets the condition flags to not equal, the branch instruction is executed.
The following Figure shows the memory map of the block memory copy and how the routine moves through memory.
Figure: Block Memory Copy in the Memory Map (labels: High memory, Source, Destination, Copy memory location, Low memory)
Theoretically this loop can transfer 32 bytes (8 words) in two instructions, for a maximum possible throughput of 46 MB/second being transferred at 33 MHz. These numbers assume a perfect memory system with fast memory.
Stack Operation: The ARM architecture uses the load-store multiple instructions to carry out stack operations.
- The pop operation (removing data from a stack) uses a load multiple instruction.
- The push operation (placing data onto the stack) uses a store multiple instruction.
When using a stack you have to decide whether the stack will grow up or down in memory. A stack is either
- ascending (A): stacks grow towards higher memory addresses, or
- descending (D): stacks grow towards lower memory addresses.
- When you use a full stack (F), the stack pointer sp points to an address that is the last used or full location (i.e., sp points to the last item on the stack).
- If you use an empty stack (E), the sp points to an address that is the first unused or empty location (i.e., it points after the last item on the stack).
There are a number of load-store multiple addressing mode aliases available to support stack operations (see the following Table).
Table: Addressing Methods for Stack Operations
Addressing mode   Description         Pop      = LDM    Push     = STM
FA                full ascending      LDMFA    LDMDA    STMFA    STMIB
FD                full descending     LDMFD    LDMIA    STMFD    STMDB
EA                empty ascending     LDMEA    LDMDB    STMEA    STMIA
ED                empty descending    LDMED    LDMIB    STMED    STMDA
- Next to the pop column is the actual load multiple instruction equivalent.
  - For example, a full ascending stack would have the notation FA appended to the load multiple instruction: LDMFA. This would be translated into an LDMDA instruction.
- ARM has specified an ARM-Thumb Procedure Call Standard (ATPCS) that defines how routines are called and how registers are allocated. In the ATPCS, stacks are defined as being full descending stacks. Thus, the LDMFD and STMFD instructions provide the pop and push functions, respectively.
Example: The STMFD instruction pushes registers onto the stack, updating the sp. The following Figure shows a push onto a full descending stack.
PRE      Address    Data          POST     Address    Data
         0x80018    0x00000001             0x80018    0x00000001
sp ->    0x80014    0x00000002             0x80014    0x00000002
         0x80010    Empty                  0x80010    0x00000003
         0x8000c    Empty         sp ->    0x8000c    0x00000002
Figure: STMFD Instruction, Full Stack push Operation
You can see that when the stack grows the stack pointer points to the last full entry in the stack.
PRE  r1 = 0x00000002
     r4 = 0x00000003
     sp = 0x00080014
     STMFD sp!, {r1,r4}
POST r1 = 0x00000002
     r4 = 0x00000003
     sp = 0x0008000c
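A full descending push and its matching pop can be sketched with the same kind of toy memory model. The helper names stmfd and ldmfd and the dictionary-based memory are assumptions for illustration only; they mirror STMFD sp! (equivalent to STMDB) and LDMFD sp! (equivalent to LDMIA).

```python
def stmfd(mem, sp, values):
    """Model STMFD sp!, {regs}: push onto a full descending stack.

    values lists the registers in ascending register order, so the
    lowest register ends up at the lowest address. Returns the new sp.
    """
    sp -= 4 * len(values)          # decrement before storing
    for i, v in enumerate(values):
        mem[sp + 4 * i] = v
    return sp

def ldmfd(mem, sp, count):
    """Model LDMFD sp!, {regs}: pop; returns (values, new_sp)."""
    values = [mem[sp + 4 * i] for i in range(count)]
    return values, sp + 4 * count  # increment after loading

mem = {0x80018: 0x1, 0x80014: 0x2}
sp = stmfd(mem, 0x80014, [0x2, 0x3])     # push r1 = 2 and r4 = 3
print(hex(sp), hex(mem[0x8000C]), hex(mem[0x80010]))
print(ldmfd(mem, sp, 2))
```

After the push, sp points at 0x8000c, the last full entry, matching the POST column of the Figure.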
Example: The
following Figure
shows a push
‘operation onan
Lakshmi R, Chethan Ghatage, Dr. Girijamma H A, Dr. Devaraju DB
leMicrocontroller
: Module 2: Introduction to the ARM Instruction Set ae
empty stack using
the STMED
instruction,
PRE Address Data POST Address Data
(0x80018 |_0x00000001 ‘0xB0018)]_0x00000001
0x80014 | 0x00000002 0xg0014 | 0xo0000002
sp —~| 0x80010 | Empry -0x80010 | 0x00000003
0x8000c | Empry 0x8000¢ | 0x00000002
0x80008 | Empry sp —~ [[0x80008;| Empry
igure: STMED Instruction — Empty Stack push Operation
The STMED instruction pushes the registers onto the stack but updates register sp to point to the next empty location.
PRE  r1 = 0x00000002
     r4 = 0x00000003
     sp = 0x00080010
     STMED sp!, {r1, r4}
POST r1 = 0x00000002
     r4 = 0x00000003
     sp = 0x00080008
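The difference from a full stack is only where the stores land relative to sp. A minimal sketch of the empty descending case follows, again with memory as a dictionary keyed by word address; the function name stmed is chosen here for illustration.

```python
def stmed(mem, sp, values):
    """Push values onto an empty descending stack (same behaviour as STMDA)."""
    new_sp = sp - 4 * len(values)
    for i, value in enumerate(values):
        # The first store goes one word above the new sp, so the
        # highest-numbered register fills the slot sp used to point at.
        mem[new_sp + 4 + 4 * i] = value
    return new_sp                    # sp now names the next empty word


mem = {0x80018: 0x00000001, 0x80014: 0x00000002}
sp = stmed(mem, 0x80010, [0x00000002, 0x00000003])   # STMED sp!, {r1, r4}
```

Here sp ends at 0x80008, which is still empty, while r1 and r4 occupy 0x8000c and 0x80010.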
• When handling a checked stack there are three attributes that need to be preserved: the stack base, the stack pointer, and the stack limit.
• The stack base is the starting address of the stack in memory.
• The stack pointer initially points to the stack base; as data is pushed onto the stack, the stack pointer descends memory and continuously points to the top of stack. If the stack pointer passes the stack limit, then a stack overflow error has occurred.
• Here is a small piece of code that checks for stack overflow errors for a descending stack:
     ; check for stack overflow
     SUB sp, sp, #size
     CMP sp, r10
     BLLO _stack_overflow ; condition
• ATPCS defines register r10 as the stack limit or sl. This is optional since it is only used when stack checking is enabled.
• The BLLO instruction is a branch with link instruction plus the condition mnemonic LO.
  o If sp is less than register r10 after the new items are pushed onto the stack, then a stack overflow error has occurred.
  o If the stack pointer goes back past the stack base, then a stack underflow error has occurred.
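The three attributes of a checked stack can be modelled in a few lines. This is a sketch, not ATPCS code: the class and exception names are invented for these notes, the stack is assumed to be full descending, and the model checks the limit before committing the new sp, whereas the assembly fragment subtracts first and then compares.

```python
class StackError(Exception):
    """Raised on overflow or underflow of the checked stack."""


class CheckedStack:
    """Full descending stack with a base, a pointer, and a limit (byte addresses)."""

    def __init__(self, base, limit):
        if limit >= base:
            raise ValueError("a descending stack needs limit < base")
        self.base, self.limit, self.sp = base, limit, base
        self.mem = {}

    def push(self, *words):
        new_sp = self.sp - 4 * len(words)       # SUB sp, sp, #size
        if new_sp < self.limit:                 # CMP sp, r10 / BLLO _stack_overflow
            raise StackError("stack overflow")
        for i, word in enumerate(words):
            self.mem[new_sp + 4 * i] = word
        self.sp = new_sp

    def pop(self, count=1):
        new_sp = self.sp + 4 * count
        if new_sp > self.base:                  # pointer would go back past the base
            raise StackError("stack underflow")
        words = [self.mem[self.sp + 4 * i] for i in range(count)]
        self.sp = new_sp
        return words
```

With CheckedStack(base=0x80020, limit=0x80010), four words can be pushed; a fifth push raises an overflow, and popping more words than were pushed raises an underflow.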
Swap Instruction:
The swap instruction is a special case of a load-store instruction. It swaps the contents of memory with the contents of a register.
This instruction is an atomic operation: it reads and writes a location in the same bus operation, preventing any other instruction from reading or writing to that location until it completes.
Syntax: SWP{B}{<cond>} Rd, Rm, [Rn]
SWP  | swap a word between memory and a register | tmp = mem32[Rn]
     |                                           | mem32[Rn] = Rm
     |                                           | Rd = tmp
SWPB | swap a byte between memory and a register | tmp = mem8[Rn]
     |                                           | mem8[Rn] = Rm
     |                                           | Rd = tmp
Swap cannot be interrupted by any other instruction or any other bus access. We say the system “holds the bus” until the transaction is complete. Also, the swap instruction allows for both a word and a byte swap.
Example: The swap instruction loads a word from memory into register r0 and overwrites the memory with register r1.
PRE  mem32[0x9000] = 0x12345678
     r0 = 0x00000000
     r1 = 0x11112222
     r2 = 0x00009000
     SWP r0, r1, [r2]
POST mem32[0x9000] = 0x11112222
     r0 = 0x12345678
     r1 = 0x11112222
     r2 = 0x00009000
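The tmp/mem/Rd steps of SWP and SWPB can be traced with a short model. This is a sketch with assumptions stated up front: memory is a bytearray, words are stored little-endian (ARM cores can be configured either way), registers live in a dictionary, and atomicity is not modelled at all.

```python
def swp(mem, regs, rd, rm, rn):
    """SWP Rd, Rm, [Rn]: swap a 32-bit word between memory and a register."""
    addr = regs[rn]
    tmp = int.from_bytes(mem[addr:addr + 4], "little")                 # tmp = mem32[Rn]
    mem[addr:addr + 4] = (regs[rm] & 0xFFFFFFFF).to_bytes(4, "little")  # mem32[Rn] = Rm
    regs[rd] = tmp                                                      # Rd = tmp


def swpb(mem, regs, rd, rm, rn):
    """SWPB Rd, Rm, [Rn]: swap a single byte; Rd receives it zero-extended."""
    addr = regs[rn]
    tmp = mem[addr]                  # tmp = mem8[Rn]
    mem[addr] = regs[rm] & 0xFF      # mem8[Rn] = low byte of Rm
    regs[rd] = tmp                   # Rd = tmp


# PRE state from the example.
mem = bytearray(0xA000)
mem[0x9000:0x9004] = (0x12345678).to_bytes(4, "little")
regs = {"r0": 0x00000000, "r1": 0x11112222, "r2": 0x00009000}
swp(mem, regs, "r0", "r1", "r2")     # SWP r0, r1, [r2]
```

After the call, r0 holds 0x12345678 and the word at 0x9000 holds 0x11112222, as in the POST state.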
Example: This example shows a simple data guard that can be used to protect data from being written by another task. The SWP instruction “holds the bus” until the transaction is complete.
spin
     LDR r1, =semaphore
     MOV r2, #1
     SWP r3, r2, [r1] ; hold the bus until complete
     CMP r3, #1
     BEQ spin
The address pointed to by the semaphore contains either the value 0 or 1. When the semaphore equals 1, the service in question is being used by another process. The routine will continue to loop around until the service is released by the other process, in other words, when the semaphore address location contains the value 0.
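The spin loop can be imitated with threads to see why the atomic swap matters. This is a rough analogy rather than ARM behaviour: a threading.Lock stands in for the bus being held during the swap, and the class, function, and variable names are invented for this sketch.

```python
import threading
import time


class Semaphore:
    """One memory location holding 0 (free) or 1 (in use)."""

    def __init__(self):
        self._value = 0
        self._bus = threading.Lock()     # stands in for "holding the bus"

    def swap(self, new):
        """Atomically store new and return the previous value (like SWP)."""
        with self._bus:
            old, self._value = self._value, new
        return old


def acquire(sem):
    while sem.swap(1) == 1:   # SWP r3, r2, [r1] / CMP r3, #1 / BEQ spin
        time.sleep(0)         # give the current owner a chance to run


def release(sem):
    sem.swap(0)               # writing 0 marks the service as free again


shared = {"count": 0}


def worker(sem, iterations):
    for _ in range(iterations):
        acquire(sem)
        shared["count"] += 1  # the guarded data
        release(sem)


sem = Semaphore()
threads = [threading.Thread(target=worker, args=(sem, 100)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each worker only touches the shared counter after its swap returned 0, so all 400 increments are counted.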
SOFTWARE INTERRUPT INSTRUCTION:
A software interrupt instruction (SWI) causes a software interrupt exception, which provides a mechanism for applications to call operating system routines.
Syntax: SWI{<cond>} SWI_number
SWI | software interrupt | lr_svc = address of instruction following the SWI
    |                    | spsr_svc = cpsr
    |                    | pc = vectors + 0x8
    |                    | cpsr mode = SVC
    |                    | cpsr I = 1 (mask IRQ interrupts)
When the processor executes an SWI instruction, it sets the program counter pc to the offset 0x8 in the vector table. The instruction also forces the processor mode to SVC, which allows an operating system routine to be called in a privileged mode.
Each SWI instruction has an associated SWI number, which is used to represent a particular function call or feature.
Example: Here we have a simple example of an SWI call with SWI number 0x123456, used by ARM toolkits as a debugging SWI. Typically the SWI instruction is executed in user mode.
PRE  cpsr = nzcVqift_USER
     pc = 0x00008000
     lr = 0x003fffff ; lr = r14
     r0 = 0x12
     0x00008000 SWI 0x123456
POST cpsr = nzcVqIft_SVC
     spsr = nzcVqift_USER
     pc = 0x00000008
     lr = 0x00008004
     r0 = 0x12
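The state change on SWI entry can be reproduced with a small model. This is a sketch with assumptions stated here and in the comments: the vector table is taken to start at address 0, only the fields relevant to the example are modelled, and the Cpu class and take_swi function are names invented for these notes.

```python
from dataclasses import dataclass, replace

I_BIT = 1 << 7           # IRQ mask bit in the cpsr
V_FLAG = 1 << 28         # overflow flag
MODE_MASK = 0x1F
MODE_USER = 0x10
MODE_SVC = 0x13
VECTOR_BASE = 0x00000000  # assumption: vector table at address 0


@dataclass(frozen=True)
class Cpu:
    pc: int               # address of the SWI instruction being executed
    cpsr: int
    lr_svc: int = 0
    spsr_svc: int = 0


def take_swi(cpu: Cpu) -> Cpu:
    """Return the processor state after an SWI exception is taken."""
    return replace(
        cpu,
        lr_svc=cpu.pc + 4,                                    # instruction after the SWI
        spsr_svc=cpu.cpsr,                                    # save the old cpsr
        pc=VECTOR_BASE + 0x8,                                 # SWI vector
        cpsr=(cpu.cpsr & ~MODE_MASK) | MODE_SVC | I_BIT,      # SVC mode, IRQs masked
    )


before = Cpu(pc=0x00008000, cpsr=V_FLAG | MODE_USER)          # nzcVqift_USER
after = take_swi(before)
```

The resulting state has pc = 0x8, lr_svc = 0x8004, spsr_svc equal to the old cpsr, and a cpsr with the V flag kept, the I bit set, and the mode changed to SVC.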
Since SWI instructions are used to call operating system routines, you need some form of parameter passing. This is achieved using registers. In this example, register r0 is used to pass the parameter 0x12. The return values are also passed back via registers.
Code called the SWI handler is required to process the SWI call. The handler obtains the SWI number using the address of the executed instruction, which is calculated from the link register lr.
The SWI number is determined by
     SWI_Number = <SWI instruction> AND NOT(0xff000000)
Here the SWI instruction is the actual 32-bit SWI instruction executed by the processor.
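The masking step is a one-line bit operation. A minimal sketch follows; the function name swi_number is chosen here, and 0xEF123456 is the ARM-state encoding of SWI 0x123456 with the "always" condition.

```python
def swi_number(instruction: int) -> int:
    """Clear bits [31:24] (condition and opcode), keeping the 24-bit SWI number."""
    return instruction & ~0xFF000000 & 0xFFFFFFFF


number = swi_number(0xEF123456)   # SWI 0x123456, condition AL
```

The result is 0x123456, the SWI number from the earlier example.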
Example: This example shows the start of an SWI handler implementation. The code fragment determines what SWI number is being called and places that number into register r10.
You can see from this example that the load instruction first copies the complete SWI instruction into register r10. The BIC instruction masks off the top bits of the instruction, leaving the SWI number. We assume the SWI has been called from ARM state.
SWI_handler
     ; Store registers r0-r12 and the link register
     STMFD sp!, {r0-r12, lr}
     ; Read the SWI instruction
     LDR r10, [lr, #-4]
     ; Mask off top 8 bits
     BIC r10, r10, #0xff000000
     ; r10 - contains the SWI number
     BL service_routine
     ; return from SWI handler (^ also restores cpsr from spsr)
     LDMFD sp!, {r0-r12, pc}^
The number in register r10 is then used by the SWI handler to call the appropriate SWI service routine.
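How a handler might turn the SWI number into a call can be sketched as a table lookup. Everything named here is hypothetical: memory is a dictionary of instruction words keyed by address, services is a made-up dispatch table, and the service function is a stand-in for an operating system routine.

```python
def swi_handler(memory, lr, services, *args):
    """Fetch the SWI instruction from lr - 4, extract its number, and dispatch."""
    instruction = memory[lr - 4]          # the SWI sits one word before the return address
    number = instruction & 0x00FFFFFF     # same effect as BIC r10, r10, #0xff000000
    if number not in services:
        raise ValueError(f"unknown SWI number {number:#x}")
    return services[number](*args)        # parameters and results travel in "registers"


def debug_service(r0):
    """Hypothetical service routine: returns its parameter plus one."""
    return r0 + 1


memory = {0x00008000: 0xEF123456}         # SWI 0x123456 at address 0x8000
services = {0x123456: debug_service}
result = swi_handler(memory, 0x00008004, services, 0x12)
```

A real handler would also save and restore registers around the call, as the assembly fragment does with its store and load multiple instructions.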
PROGRAM STATUS REGISTER INSTRUCTIONS:
The ARM instruction set provides two instructions to directly control a program status register (psr).
• The MRS instruction transfers the contents of either the cpsr or spsr into a register.
• The MSR instruction transfers the contents of a register into the cpsr or spsr.
Together these instructions are used to read and write the cpsr and spsr.
In the syntax we can see a label called fields. This can be any combination of control (c), extension (x), status (s), and flags (f).
Syntax: MRS{<cond>} Rd, <cpsr|spsr>
        MSR{<cond>} <cpsr|spsr>_<fields>, Rm
        MSR{<cond>} <cpsr|spsr>_<fields>, #immediate
MRS | copy program status register to a general-purpose register      | Rd = psr
MSR | move a general-purpose register to a program status register    | psr[field] = Rm
MSR | move an immediate value to a program status register            | psr[field] = immediate
These fields relate to particular byte regions in a psr, as shown in the following Figure.
Fields: Flags [24:31] | Status [16:23] | eXtension [8:15] | Control [0:7]
Bits:   31 30 29 28 = N Z C V          |  7 6 5 = I F T, 4:0 = Mode
Figure: psr Byte Fields
The c field controls the interrupt masks, Thumb state, and processor mode.
The following Example shows how to enable IRQ interrupts by clearing the I mask. This operation involves using both the MRS and MSR instructions to read from and then write to the cpsr.
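The effect of the field suffix and of the read-modify-write sequence can be modelled with plain bit masks. This is a sketch under assumptions: the function names are chosen for these notes, and the model ignores the rule that user mode cannot change the control field.

```python
FIELD_MASKS = {
    "c": 0x000000FF,   # control:   interrupt masks, Thumb bit, mode
    "x": 0x0000FF00,   # extension
    "s": 0x00FF0000,   # status
    "f": 0xFF000000,   # flags:     N, Z, C, V
}


def msr(psr, fields, value):
    """Write value into psr, but only in the byte regions named by fields."""
    mask = 0
    for field in fields:
        mask |= FIELD_MASKS[field]
    return (psr & ~mask & 0xFFFFFFFF) | (value & mask)


def enable_irq(cpsr):
    """Clear the I mask (bit 7) using a read-modify-write of the control field."""
    r1 = cpsr                     # MRS r1, cpsr
    r1 &= ~0x80 & 0xFFFFFFFF      # BIC r1, r1, #0x80
    return msr(cpsr, "c", r1)     # MSR cpsr_c, r1


new_cpsr = enable_irq(0x000000D3)  # SVC mode with I and F set
```

Only the control byte is rewritten, so the flags in bits 24 to 31 are left as they were.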
Example: The MRS first copies the cpsr into register r1. The BIC instruction clears bit 7 of r1. Register r1 is then copied back into the cpsr, which enables IRQ interrupts.
     MRS r1, cpsr
     BIC r1, r1, #0x80
     MSR cpsr_c, r1
COPROCESSOR INSTRUCTIONS:
Syntax: CDP{<cond>} cp, opcode1, Cd, Cn {, opcode2}
        <MRC|MCR>{<cond>} cp, opcode1, Rd, Cn, Cm {, opcode2}
        <LDC|STC>{<cond>} cp, Cd, addressing
CDP     | coprocessor data processing: perform an operation in a coprocessor
MRC MCR | coprocessor register transfer: move data to/from coprocessor registers
LDC STC | coprocessor memory transfer: load and store blocks of memory to/from a coprocessor
• In the syntax of the coprocessor instructions:
  o The cp field represents the coprocessor number between p0 and p15.
  o The opcode fields describe the operation to take place on the coprocessor.
  o The Cn, Cm, and Cd fields describe registers within the coprocessor.
• The coprocessor operations and registers depend on the specific coprocessor you are using.
• Coprocessor 15 (CP15) is reserved for system control purposes, such as memory management, write buffer control, cache control, and identification registers.
Example: This example shows a CP15 register being copied into a general-purpose register.
     ; transferring the contents of CP15 register c0 to register r10
     MRC p15, 0, r10, c0, c0, 0
Here CP15 register-0 contains the processor identification number. This register is copied into the general-purpose register r10.
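To see where each operand of MRC ends up, the instruction word can be assembled by hand. This is a sketch of the ARM-state MRC encoding; the function name encode_mrc and its parameter names are chosen for these notes.

```python
def encode_mrc(cp, opcode1, rd, crn, crm, opcode2=0, cond=0xE):
    """Build the 32-bit word for MRC cp, opcode1, Rd, Cn, Cm, opcode2."""
    limits = {"cp": (cp, 16), "opcode1": (opcode1, 8), "rd": (rd, 16),
              "crn": (crn, 16), "crm": (crm, 16), "opcode2": (opcode2, 8),
              "cond": (cond, 16)}
    for name, (value, limit) in limits.items():
        if not 0 <= value < limit:
            raise ValueError(f"{name} out of range: {value}")
    return ((cond << 28)        # condition, 0xE = always
            | (0b1110 << 24)    # coprocessor register transfer group
            | (opcode1 << 21)
            | (1 << 20)         # L = 1: coprocessor to ARM register (MRC)
            | (crn << 16)
            | (rd << 12)
            | (cp << 8)         # coprocessor number
            | (opcode2 << 5)
            | (1 << 4)          # distinguishes register transfer from CDP
            | crm)


word = encode_mrc(cp=15, opcode1=0, rd=10, crn=0, crm=0, opcode2=0)
```

For MRC p15, 0, r10, c0, c0, 0 this yields 0xEE10AF10.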
LOADING CONSTANTS:
You might have noticed that there is no ARM instruction to move a 32-bit constant into a register. Since ARM instructions are 32 bits in size, they obviously cannot specify a general 32-bit constant. To aid programming there are two pseudo-instructions to move a 32-bit value into a register.
Syntax: LDR Rd, =constant
        ADR Rd, label
LDR | load constant pseudoinstruction | Rd = 32-bit constant
ADR | load address pseudoinstruction  | Rd = 32-bit relative address
• The first pseudo-instruction writes a 32-bit constant to a register using whatever instructions are available. It defaults to a memory read if the constant cannot be encoded using other instructions.
• The second pseudo-instruction writes a relative address into a register, which will be encoded using a pc-relative expression.
Example: This example shows an LDR instruction loading a 32-bit constant 0xff00ffff into register r0.
     LDR r0, [pc, #constant_number-8-{PC}]
constant_number
     DCD 0xff00ffff
This example involves a memory access to load the constant, which can be expensive for time-critical routines.
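The "-8" in the offset expression comes from the pipeline: when an ARM-state instruction executes, pc reads as the instruction's own address plus 8. A minimal sketch of the arithmetic follows; the function name and the addresses in the example are made up for illustration.

```python
def pc_relative_offset(instruction_address, label_address):
    """Offset to encode so that pc + offset reaches label_address."""
    pc_value = instruction_address + 8          # pc is two instructions ahead
    offset = label_address - pc_value
    if abs(offset) > 4095:                      # LDR carries a 12-bit offset
        raise ValueError("label is out of range for a single LDR")
    return offset


# Hypothetical layout: LDR at 0x8000, constant_number at 0x8010.
offset = pc_relative_offset(0x8000, 0x8010)
```

Here the encoded offset is 8, because pc already reads as 0x8008 when the load executes.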
The following Example shows an alternative method to load the same constant into register r0 by using an MVN instruction.
Example: Loading the constant 0xff00ffff using an MVN.
PRE  none
     MVN r0, #0x00ff0000
POST r0 = 0xff00ffff
As you can see, there are alternatives to accessing memory, but they depend upon the constant you are trying to load.
The LDR pseudo-instruction either inserts a MOV or MVN instruction to generate a value (if possible) or generates an LDR instruction with a pc-relative address to read the constant from a literal pool, a data area embedded within the code.
The following Table shows two pseudo-instruction conversions.
Table: LDR pseudo-instruction Conversion
Pseudoinstruction   | Actual instruction
LDR r0, =0xff       | MOV r0, #0xff
LDR r0, =0x55555555 | LDR r0, [pc, #offset_12]
The first conversion produces a simple MOV instruction; the second conversion produces a pc-relative load.
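The choice between MOV, MVN, and a literal-pool load depends on whether the constant fits the ARM immediate format, an 8-bit value rotated right by an even number of bits. The following is a sketch of that decision only; real assemblers may pick differently in some cases, and the function names are invented for these notes.

```python
def encode_immediate(value):
    """Return (imm8, rotation) if value is an 8-bit constant rotated right
    by an even amount, otherwise None."""
    value &= 0xFFFFFFFF
    for rotation in range(0, 32, 2):
        # Rotating left by `rotation` undoes a rotate-right by `rotation`.
        imm8 = ((value << rotation) | (value >> (32 - rotation))) & 0xFFFFFFFF
        if imm8 < 0x100:
            return imm8, rotation
    return None


def expand_ldr(rd, constant):
    """Pick the instruction that 'LDR rd, =constant' could turn into."""
    constant &= 0xFFFFFFFF
    if encode_immediate(constant) is not None:
        return f"MOV {rd}, #{constant:#x}"
    inverted = ~constant & 0xFFFFFFFF
    if encode_immediate(inverted) is not None:
        return f"MVN {rd}, #{inverted:#010x}"
    return f"LDR {rd}, [pc, #offset_12]"      # fall back to the literal pool
```

expand_ldr("r0", 0xff) and expand_ldr("r0", 0x55555555) reproduce the two rows of the table, and expand_ldr("r0", 0xff00ffff) gives the MVN form used in the earlier example.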
Another useful pseudo-instruction is the ADR instruction, or address relative. This instruction places the address of the given label into register Rd, using a pc-relative add or subtract.
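Whether ADR becomes an add or a subtract depends on which side of the instruction the label lies. A small sketch follows; it assumes ARM state, where pc reads as the instruction address plus 8, it does not check that the offset fits the immediate format, and the function name and addresses are hypothetical.

```python
def expand_adr(rd, instruction_address, label_address):
    """Show the ADD or SUB that 'ADR rd, label' could be turned into."""
    offset = label_address - (instruction_address + 8)   # relative to the pc value
    operation = "ADD" if offset >= 0 else "SUB"
    return f"{operation} {rd}, pc, #{abs(offset)}"


forward = expand_adr("r0", 0x8000, 0x8010)    # label after the instruction
backward = expand_adr("r0", 0x8010, 0x8000)   # label before the instruction
```

A label that follows the instruction gives an ADD, and a label that precedes it gives a SUB.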