CIE2 - Notes
Logical Instructions
LSL <Rd>, <Rs>: Logical shift left by the amount held in a register. Example: LSL R1, R5 gives R1 = R1 << R5.
MODULE 3 NOTES- CA WITH ARM
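LSR <Rd>, <Rs>: Logical shift right by the amount held in a register. Example: LSR R3, R4 gives R3 = R3 >> R4.
ASR <Rd>, <Rs>, #immed: Arithmetic shift right by an immediate count, keeping the MSB (sign bit). Example: ASR R2, R3, #5 gives R2 = R3 >> 5, with the MSB of R3 copied into the vacated bits.
ASR <Rd>, <Rs>: Arithmetic shift right by the amount held in a register, keeping the MSB (sign bit). Example: ASR R3, R4 gives R3 = R3 >> R4, with the MSB of R3 copied into the vacated bits.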
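The difference between the logical and arithmetic shifts above can be sketched in C. This is only an illustrative model: the helper names lsl32, lsr32 and asr32 are hypothetical, and the handling of counts of 32 or more is a simplification of what the real instructions do with the low byte of Rs.

```c
#include <stdint.h>

/* LSL: logical shift left; vacated low bits are filled with 0. */
uint32_t lsl32(uint32_t value, unsigned count) {
    return count < 32 ? value << count : 0u;
}

/* LSR: logical shift right; vacated high bits are filled with 0. */
uint32_t lsr32(uint32_t value, unsigned count) {
    return count < 32 ? value >> count : 0u;
}

/* ASR: arithmetic shift right; vacated high bits are copies of the MSB. */
uint32_t asr32(uint32_t value, unsigned count) {
    uint32_t sign_fill = (value & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    if (count >= 32) return sign_fill;   /* every bit becomes the sign bit */
    if (count == 0) return value;        /* avoids an undefined shift by 32 below */
    return (value >> count) | (sign_fill << (32 - count));
}
```

For a value with the MSB set, lsr32 brings in zeros from the left while asr32 brings in ones, which is what "keeping the MSB bit" means in the ASR entries.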
Rotate Instructions
ROR <Rd>, <Rs>: Rotate right by the amount held in a register.
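As a rough illustration of rotate right, the bits shifted out at the bottom re-enter at the top. The helper name ror32 is hypothetical, and reducing the count modulo 32 is an assumption made to keep the C shifts well defined.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by count positions (count taken modulo 32). */
uint32_t ror32(uint32_t value, unsigned count) {
    count &= 31u;
    if (count == 0) return value;   /* avoids an undefined shift by 32 */
    return (value >> count) | (value << (32 - count));
}
```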
Data Conversion
SXTB <Rd>, <Rm>: Extract byte [7:0] from register, move to register, and sign-extend to 32 bits. Example: SXTB R1, R3 gives R1 = sign-extended byte present in R3.
SXTH <Rd>, <Rm>: Extract halfword [15:0] from register, move to register, and sign-extend to 32 bits. Example: SXTH R1, R3 gives R1 = sign-extended halfword present in R3.
UXTB <Rd>, <Rm>: Extract byte [7:0] from register, move to register, and zero-extend to 32 bits. Example: UXTB R1, R3 gives R1 = zero-extended (unsigned) byte present in R3.
UXTH <Rd>, <Rm>: Extract halfword [15:0] from register, move to register, and zero-extend to 32 bits. Example: UXTH R1, R3 gives R1 = zero-extended (unsigned) halfword present in R3.
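The four extend operations can be modelled with plain masking, as in the sketch below. The function names simply mirror the mnemonics and are not a real API; sign extension is written out with explicit masks rather than casts so the behavior does not depend on the compiler.

```c
#include <stdint.h>

/* SXTB: take bits [7:0] and copy bit 7 into bits [31:8]. */
uint32_t sxtb(uint32_t rm) {
    uint32_t byte = rm & 0xFFu;
    return (byte & 0x80u) ? (byte | 0xFFFFFF00u) : byte;
}

/* SXTH: take bits [15:0] and copy bit 15 into bits [31:16]. */
uint32_t sxth(uint32_t rm) {
    uint32_t half = rm & 0xFFFFu;
    return (half & 0x8000u) ? (half | 0xFFFF0000u) : half;
}

/* UXTB: take bits [7:0]; the upper 24 bits become 0. */
uint32_t uxtb(uint32_t rm) { return rm & 0xFFu; }

/* UXTH: take bits [15:0]; the upper 16 bits become 0. */
uint32_t uxth(uint32_t rm) { return rm & 0xFFFFu; }
```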
Reverse Instructions
REV <Rd>, <Rm>: Reverse bytes in word. Example: REV R2, R3 gives R2 = R3 with its four bytes in reverse order.
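A minimal C sketch of the byte reversal, using a hypothetical helper named rev32. It is the kind of operation used to convert a word between little-endian and big-endian byte order.

```c
#include <stdint.h>

/* Swap byte 0 with byte 3 and byte 1 with byte 2 of a 32-bit word. */
uint32_t rev32(uint32_t rm) {
    return ((rm & 0x000000FFu) << 24) |
           ((rm & 0x0000FF00u) << 8)  |
           ((rm & 0x00FF0000u) >> 8)  |
           ((rm & 0xFF000000u) >> 24);
}
```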
Compare Instructions
CMP <Rn>, #<immed_8>: Compare with an immediate 8-bit value. Example: CMP R1, #10 compares R1 with 10 (computes R1 - 10) and updates the flags.
CMP <Rn>, <Rm>: Compare registers. Example: CMP R1, R2 compares R1 with R2 (computes R1 - R2) and updates the flags.
CMN <Rn>, <Rm>: Compare a register value with the negation of another register value. Example: CMN R4, R6 compares R4 with -R6 (computes R4 + R6) and updates the flags.
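How the compare instructions affect the flags can be sketched as below. The Flags struct and the helper names are hypothetical; the model assumes the usual ARM convention that a subtraction is performed as Rn + NOT(Rm) + 1, so the C flag is set when no borrow occurs.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct { bool n, z, c, v; } Flags;

/* Flags produced by the 32-bit addition a + b + carry_in (carry_in is 0 or 1). */
static Flags add_flags(uint32_t a, uint32_t b, uint32_t carry_in) {
    uint64_t wide = (uint64_t)a + b + carry_in;
    uint32_t result = (uint32_t)wide;
    Flags f;
    f.n = (result >> 31) != 0;                     /* result is negative */
    f.z = result == 0;                             /* result is zero */
    f.c = (wide >> 32) != 0;                       /* carry out of bit 31 */
    f.v = ((~(a ^ b) & (a ^ result)) >> 31) != 0;  /* signed overflow */
    return f;
}

/* CMP Rn, Rm: flags from Rn - Rm; the result itself is discarded. */
Flags cmp_flags(uint32_t rn, uint32_t rm) { return add_flags(rn, ~rm, 1u); }

/* CMN Rn, Rm: flags from Rn + Rm, i.e. Rn compared with -Rm. */
Flags cmn_flags(uint32_t rn, uint32_t rm) { return add_flags(rn, rm, 0u); }
```

With this model, comparing equal values sets Z, and comparing a smaller unsigned value with a larger one clears C and sets N.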
Test Instructions
TST <Rn>, <Rm>: Test register value for set bits by ANDing it with another register value. Example: TST R1, R5 computes the bitwise AND of R1 and R5, discards the result, and updates flags N and Z.
TEQ <Rn>, <Rm>: Test register value against another register value by XORing them. Example: TEQ R1, R5 computes the bitwise XOR of R1 and R5, discards the result, and updates flags N and Z (Z = 1 when R1 equals R5).
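A small sketch of what the two test instructions compute. TestFlags, tst_flags and teq_flags are hypothetical names, and only the N and Z flags mentioned above are modelled.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct { bool n, z; } TestFlags;

/* Derive N and Z from a 32-bit result that is otherwise discarded. */
static TestFlags flags_from(uint32_t result) {
    TestFlags f;
    f.n = (result >> 31) != 0;
    f.z = result == 0;
    return f;
}

/* TST: Z = 1 when the operands have no set bits in common. */
TestFlags tst_flags(uint32_t rn, uint32_t rm) { return flags_from(rn & rm); }

/* TEQ: Z = 1 when the operands are identical. */
TestFlags teq_flags(uint32_t rn, uint32_t rm) { return flags_from(rn ^ rm); }
```

A typical use of TST is checking a single bit: tst_flags(value, 1u << k).z is false exactly when bit k of value is set.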
Bit Field Processing
BFC <Rd>, #<lsb>, #<width>: Clear a bit field of <width> bits in Rd, starting at bit <lsb>. Example: BFC R3, #5, #8 clears 8 bits of R3 starting at bit 5 (bits [12:5]).
BFI <Rd>, <Rn>, #<lsb>, #<width>: Insert bit field from one register value into another. Example: BFI R1, R2, #4, #6 inserts the lowest 6 bits of R2 into R1 starting at bit 4 (bits [9:4]).
CLZ <Rd>, <Rn>: Return number of leading zeros in register value. Example: CLZ R1, R4 gives R1 = number of leading zeros in the value present in R4.
RBIT <Rd>, <Rm>: Reverse bit order. Example: RBIT R5, R6 reverses R6 bitwise and saves the result in R5.
UBFX <Rd>, <Rn>, #<lsb>, #<width>: Copy bit field from register value to register and zero-extend to 32 bits. Example: UBFX R3, R6, #6, #8 gives R3 = the 8 bits of R6 starting at bit 6 (bits [13:6]), zero-extended.
SBFX <Rd>, <Rn>, #<lsb>, #<width>: Copy bit field from register value to register and sign-extend to 32 bits. Example: SBFX R4, R7, #4, #12 gives R4 = the 12 bits of R7 starting at bit 4 (bits [15:4]), sign-extended.
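The bit field operations reduce to masking and shifting, as the sketch below tries to show. All function names are hypothetical stand-ins for the mnemonics, and the helpers assume 1 <= width and lsb + width <= 32.

```c
#include <stdint.h>

/* Mask of `width` ones starting at bit `lsb`. */
static uint32_t field_mask(unsigned lsb, unsigned width) {
    uint32_t ones = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return ones << lsb;
}

/* BFC: clear `width` bits of rd starting at bit `lsb`. */
uint32_t bfc(uint32_t rd, unsigned lsb, unsigned width) {
    return rd & ~field_mask(lsb, width);
}

/* BFI: copy the lowest `width` bits of rn into rd at bit `lsb`. */
uint32_t bfi(uint32_t rd, uint32_t rn, unsigned lsb, unsigned width) {
    uint32_t mask = field_mask(lsb, width);
    return (rd & ~mask) | ((rn << lsb) & mask);
}

/* UBFX: extract a field and zero-extend it. */
uint32_t ubfx(uint32_t rn, unsigned lsb, unsigned width) {
    return (rn >> lsb) & field_mask(0, width);
}

/* SBFX: extract a field and sign-extend it from its top bit. */
uint32_t sbfx(uint32_t rn, unsigned lsb, unsigned width) {
    uint32_t field = ubfx(rn, lsb, width);
    uint32_t sign_bit = 1u << (width - 1);
    return (field ^ sign_bit) - sign_bit;
}

/* CLZ: count zero bits above the most significant set bit (32 for zero). */
unsigned clz32(uint32_t rn) {
    unsigned count = 0;
    for (uint32_t probe = 0x80000000u; probe != 0 && !(rn & probe); probe >>= 1) {
        count++;
    }
    return count;
}

/* RBIT: bit 0 becomes bit 31, bit 1 becomes bit 30, and so on. */
uint32_t rbit32(uint32_t rm) {
    uint32_t out = 0;
    for (int i = 0; i < 32; i++) {
        out = (out << 1) | (rm & 1u);
        rm >>= 1;
    }
    return out;
}
```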
Branching instructions
Stack memory is a memory usage mechanism that allows the system memory to be used as temporary
data storage that behaves as a first-in-last-out buffer. One of the essential elements of stack memory
operation is a register called the Stack Pointer. The stack pointer indicates where the current stack
memory location is, and is adjusted automatically each time a stack operation is carried out.
In the Cortex-M processors, the Stack Pointer is register R13 in the register bank. Physically there are
two stack pointers in the Cortex-M processors, but only one of them is used at a time, depending on the
current value of the CONTROL register and the state of the processor.
In common terms, storing data to the stack is called pushing (using the PUSH instruction) and restoring
data from the stack is called popping (using the POP instruction). Depending on processor architecture,
some processors perform storing of new data to stack memory using incremental address indexing and
some use decrement address indexing. In the Cortex-M processors, the stack operation is based on a
“full-descending” stack model. This means the stack pointer always points to the last filled data in the
stack memory, and the stack pointer predecrements for each new data store (PUSH).
PUSH and POP are commonly used at the beginning and at the end of a function or subroutine.
At the beginning of a function, the current contents of the registers used by the calling program are
stored onto the stack memory using PUSH operations, and at the end of the function, the data on the
stack memory is restored to the registers using POP operations. Typically, each register PUSH
operation should have a corresponding register POP operation; otherwise the stack pointer will not
return to its original value and the registers will not be restored correctly. This can result in
unpredictable behavior, for example, a function returning to an incorrect address.
The minimum data size to be transferred for each push and pop operation is one word (32-bit),
and multiple registers can be pushed or popped in one instruction. The stack memory accesses in the
Cortex-M processors are designed to be always word aligned (address values must be a multiple of 4,
for example, 0x0, 0x4, 0x8,…) as this gives the best efficiency for minimum design complexity. For
this reason, bits [1:0] of both stack pointers in the Cortex-M processors are hardwired to zeros and read
as zeros.
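The full-descending model described above can be illustrated with a small software stack. This is only a sketch: DemoStack, its size and the function names are hypothetical, and the stack pointer is modelled as a byte offset into a small array rather than a real address.

```c
#include <stdint.h>

#define STACK_WORDS 16u

typedef struct {
    uint32_t memory[STACK_WORDS]; /* stand-in for the stack region in RAM */
    uint32_t sp;                  /* byte offset, always a multiple of 4 */
} DemoStack;

/* An empty full-descending stack starts with SP just above the region. */
void stack_init(DemoStack *s) { s->sp = STACK_WORDS * 4u; }

/* PUSH: pre-decrement SP by one word, then store. Returns 0 if the stack is full. */
int stack_push(DemoStack *s, uint32_t value) {
    if (s->sp == 0u) return 0;
    s->sp -= 4u;
    s->memory[s->sp / 4u] = value; /* SP now points to the last filled word */
    return 1;
}

/* POP: load the word SP points to, then increment SP. Returns 0 if empty. */
int stack_pop(DemoStack *s, uint32_t *out) {
    if (s->sp == STACK_WORDS * 4u) return 0;
    *out = s->memory[s->sp / 4u];
    s->sp += 4u;
    return 1;
}
```

Pushing two values and popping twice returns them in reverse order and leaves sp at its initial value, which is the balance between PUSH and POP that a function must maintain.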
In programming, the stack pointer can be accessed as either R13 or SP in program code.
Depending on the processor state and the CONTROL register value, the stack pointer accessed can
either be the MSP or the PSP. In many simple applications, only one stack pointer is needed and by
default the MSP is used. The PSP is usually only required when an OS is used in the embedded
application.
In a typical embedded application with an OS, the OS kernel uses the MSP and the application
processes use the PSP. This allows the stack for the kernel to be separate from stack memory for the
application processes. This allows the OS to carry out context switching quickly (switching from
execution of one application process to another). Also, since exception handlers only use the main stack,
the stack space allocated to each application task does not need to reserve room for exception
handlers, which allows more efficient memory usage.
Even though the OS kernel only uses the MSP as its stack pointer, it can still access the value in PSP
by using special register access instructions (MRS and MSR).
Assembly Program
1. Write an assembly program to add 20 numbers
2. Write an assembly program to generate Fibonacci series
3. Write an assembly program to separate odd and even numbers of a given array
4. Write an assembly program to perform linear search
5. Write an assembly program to find y = 3x^2 + 4x + 5
6. Write an assembly program to generate multiples of 5
7. Write an assembly program to count the number of 1’s in a given number
8. Write an assembly program to find if given number is negative or positive
9. Write an assembly program to add, subtract, multiply and divide two 32-bit numbers
10. Write an assembly program to add, subtract and multiply two 64-bit numbers
Embedded C Program
1. Write a C program to add 20 numbers and find average
2. Write a C program to generate Fibonacci series
3. Write a C program to separate odd and even numbers of a given array
4. Write a C program to perform linear search
5. Write a C program to find GCD and LCM.
6. Write a C program to generate multiples of 5
The product of two n-bit numbers is at most 2n bits wide. Unsigned multiplication can be
viewed as addition of shifted versions of the multiplicand. Multiplication involves the generation
of partial products, one for each digit in the multiplier. These partial products are then summed
to produce the final product. When the multiplier bit is 0, the partial product is 0. When the
multiplier bit is 1, the partial product is the multiplicand. For this operation, each successive
partial product is shifted one position to the left relative to the preceding partial product.
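The partial-product view of unsigned multiplication can be sketched directly in C. The function name is hypothetical; the 64-bit accumulator reflects the fact that two 32-bit operands can need up to 64 bits for the product.

```c
#include <stdint.h>

/* Sum one shifted copy of the multiplicand for every 1 bit in the multiplier. */
uint64_t multiply_partial_products(uint32_t multiplicand, uint32_t multiplier) {
    uint64_t product = 0;
    for (unsigned i = 0; i < 32; i++) {
        if ((multiplier >> i) & 1u) {
            /* Partial product i is the multiplicand shifted left by i positions. */
            product += (uint64_t)multiplicand << i;
        }
        /* When bit i is 0 the partial product is 0, so nothing is added. */
    }
    return product;
}
```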
Array Multiplier
Binary multiplication can be implemented in a combinational two-dimensional logic
array called an array multiplier.
• The main component in each cell is a full adder, FA.
• The AND gate in each cell determines whether a multiplicand bit, mj, is added to the
incoming partial product (PP) bit, based on the value of the multiplier bit, qi.
• Each row i, where 0 <= i <= 3, adds the multiplicand (appropriately shifted) to the
incoming partial product, PPi, to generate the outgoing partial product, PP(i+1), if
qi = 1.
• If qi = 0, PPi is passed vertically downward unchanged. PP0 is all 0’s and PP4 is the
desired product. The multiplicand is shifted left one position per row by the diagonal
signal path.
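A bit-level simulation may help connect the cell description to the overall result. The sketch below assumes 4-bit operands and a ripple-carry connection within each row; the function name and the way the carry is passed along are illustrative choices, not a description of a specific circuit.

```c
/* Simulate a 4x4 array multiplier: row i handles multiplier bit q_i,
   and cell (i, j) handles multiplicand bit m_j at bit position i + j. */
unsigned array_multiply_4x4(unsigned m, unsigned q) {
    unsigned pp = 0;                              /* PP0 is all zeros */
    for (unsigned i = 0; i < 4; i++) {
        unsigned qi = (q >> i) & 1u;
        unsigned carry = 0;
        for (unsigned j = 0; j < 4; j++) {
            unsigned pos = i + j;                 /* diagonal shift per row */
            unsigned a = (pp >> pos) & 1u;        /* incoming PP bit */
            unsigned b = ((m >> j) & 1u) & qi;    /* AND gate: m_j AND q_i */
            unsigned sum = a ^ b ^ carry;         /* full adder sum output */
            carry = (a & b) | (a & carry) | (b & carry);
            pp = (pp & ~(1u << pos)) | (sum << pos);
        }
        pp |= carry << (i + 4);                   /* row carry-out extends the PP */
    }
    return pp;                                    /* PP4, the 8-bit product */
}
```

When qi is 0 every AND gate in the row outputs 0, so the full adders pass the incoming partial product through unchanged, as stated in the last bullet.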
Multiplier cell
Algorithm:
(1) The multiplier and multiplicand are loaded into two registers Q and M. A third
register A and the carry bit C are cleared to 0.
(2) In each cycle it performs 2 steps:
(a) If the LSB of the multiplier qi = 1, the control sequencer generates the Add signal,
which adds the multiplicand M to register A and stores the result in A.
(b) If qi = 0, it generates the Noadd signal, which keeps the previous value in register
A.
(3) Right shift the registers C, A and Q by 1 bit. Steps (2) and (3) are repeated once per
multiplier bit, after which the product is held in A and Q.
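The register-level algorithm above can be traced with a short simulation. This sketch assumes 16-bit operands (n = 16) so that the combined A and Q registers fit in 32 bits; the function name is hypothetical, while the variable names c, a, q and m mirror the registers in the text.

```c
#include <stdint.h>

/* Sequential shift-and-add multiplication using registers C, A, Q and M. */
uint32_t sequential_multiply(uint16_t multiplicand, uint16_t multiplier) {
    const unsigned n = 16;
    uint32_t m = multiplicand;     /* register M */
    uint32_t q = multiplier;       /* register Q */
    uint32_t a = 0;                /* register A, cleared */
    uint32_t c = 0;                /* carry bit C, cleared */

    for (unsigned cycle = 0; cycle < n; cycle++) {
        if (q & 1u) {              /* LSB of Q is 1: Add */
            uint32_t sum = a + m;
            c = (sum >> n) & 1u;   /* carry out of the n-bit addition */
            a = sum & 0xFFFFu;
        }                          /* LSB of Q is 0: Noadd, A is unchanged */

        /* Right shift C, A and Q together by one bit. */
        q = (q >> 1) | ((a & 1u) << (n - 1));
        a = (a >> 1) | (c << (n - 1));
        c = 0;
    }
    return (a << n) | q;           /* high half in A, low half in Q */
}
```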
[To extend the sign bit: since the operands are 5-bit signed numbers, a 10-bit product should be
generated. So, if the partial product’s MSB is 1, extend it to the left with 1s;
if the partial product’s MSB is 0, extend it to the left with 0s.]
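The effect of sign-extending the partial products can be checked with a small sketch for 5-bit two's complement operands. The worked example this note belongs to is not reproduced here, so the details are assumptions: the function name is hypothetical, and a negative multiplier is handled by subtracting the partial product of its sign bit (weight -2^4), which is one common way to do it.

```c
#include <stdint.h>

/* Multiply two 5-bit signed values (-16..15) using 10-bit sign-extended
   partial products; returns the signed 10-bit product. */
int signed_multiply_5bit(int multiplicand, int multiplier) {
    uint32_t m5 = (uint32_t)multiplicand & 0x1Fu;
    uint32_t q5 = (uint32_t)multiplier & 0x1Fu;

    /* Sign-extend the multiplicand to 10 bits: copy its MSB to the left. */
    uint32_t m10 = (m5 & 0x10u) ? (m5 | 0x3E0u) : m5;

    uint32_t product = 0;
    for (unsigned i = 0; i < 5; i++) {
        if ((q5 >> i) & 1u) {
            uint32_t partial = (m10 << i) & 0x3FFu;  /* keep 10 bits */
            if (i == 4) {
                /* Assumption: the multiplier sign bit has negative weight. */
                product = (product - partial) & 0x3FFu;
            } else {
                product = (product + partial) & 0x3FFu;
            }
        }
    }
    /* Interpret the 10-bit pattern as a signed value. */
    return (product & 0x200u) ? (int)product - 1024 : (int)product;
}
```

Without the extension step, a negative multiplicand such as -13 (10011) would be added as +19 and the sum of the partial products would be wrong.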