CIE2 - Notes

Uploaded by

amogh.patadi

MODULE 3 NOTES- CA WITH ARM

Module 3: ARM Architecture and Instruction Set-II

Syllabus: Logical instructions, Branch Instructions, Barrier Instructions, Cortex-M3 assembly programming

Logical Instructions

Operation | Assembler | Example
Bitwise AND register values | AND <Rd>, <Rm> | AND R1, R4
Bitwise AND register value with immediate value | AND <Rd>, <Rn>, #immed | AND R1, #44 -> R1 = R1 & 44
Bitwise OR register values | ORR <Rd>, <Rm> | ORR R2, R3
Bitwise OR register value with immediate value | ORR <Rd>, <Rn>, #immed | ORR R1, R5, #0xA5
Bitwise XOR register values | EOR <Rd>, <Rm> | EOR R2, R3
Bitwise XOR register value with immediate value | EOR <Rd>, <Rn>, #immed | EOR R1, R2, #60
Bit clear | BIC <Rd>, <Rm> | BIC R1, R2 -> R1 = R1 & (~R2)
Bit clear | BIC <Rd>, <Rn>, <Rm> | BIC R1, R2, R4 -> R1 = R2 & (~R4)
Logical OR NOT | ORN <Rd>, <Rm> | ORN R1, R2 -> R1 = R1 | (~R2)
Logical OR NOT | ORN <Rd>, <Rn>, #immed | ORN R1, R3, #0x33 -> R1 = R3 | (~0x33)
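
The less familiar rows of the table (BIC and ORN) can be checked against their C equivalents. This is a minimal sketch; the function names are illustrative only and are not part of any ARM library.

```c
#include <stdint.h>

/* BIC: clear every bit of rn that is set in rm (rn AND NOT rm). */
uint32_t bic(uint32_t rn, uint32_t rm) { return rn & ~rm; }

/* ORN: OR rn with the bitwise complement of rm. */
uint32_t orn(uint32_t rn, uint32_t rm) { return rn | ~rm; }

/* EOR: bitwise exclusive OR. */
uint32_t eor(uint32_t rn, uint32_t rm) { return rn ^ rm; }
```

For example, bic(0xFF, 0x0F) clears the low four bits and leaves 0xF0.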

Shift instructions

Operation | Assembler | Example
Logical shift left by immediate count | LSL <Rd>, <Rs>, #immed | LSL R1, R5, #3 -> R1 = R5 << 3
Logical shift left by number in register | LSL <Rd>, <Rs> | LSL R1, R5 -> R1 = R1 << R5
Logical shift right by immediate count | LSR <Rd>, <Rs>, #immed | LSR R2, R3, #5 -> R2 = R3 >> 5
Logical shift right by number in register | LSR <Rd>, <Rs> | LSR R3, R4 -> R3 = R3 >> R4
Arithmetic shift right by immediate count | ASR <Rd>, <Rs>, #immed | ASR R2, R3, #5 -> R2 = R3 >> 5, keeping the MSB (sign bit)
Arithmetic shift right by number in register | ASR <Rd>, <Rs> | ASR R3, R4 -> R3 = R3 >> R4, keeping the MSB (sign bit)
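
The difference between the logical and arithmetic right shifts is easiest to see on a negative value. The sketch below models the three shifts on 32-bit values; the function names are illustrative, and shift counts of 32 or more are simplified to "everything shifted out".

```c
#include <stdint.h>

/* LSL: zeros enter from the right. */
uint32_t lsl(uint32_t x, unsigned n) { return n < 32 ? x << n : 0; }

/* LSR: zeros enter from the left. */
uint32_t lsr(uint32_t x, unsigned n) { return n < 32 ? x >> n : 0; }

/* ASR: copies of the original MSB (sign bit) enter from the left. */
uint32_t asr(uint32_t x, unsigned n) {
    uint32_t sign = (x & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    if (n == 0) return x;
    if (n >= 32) return sign;
    return (x >> n) | (sign << (32 - n));
}
```

With x = 0xFFFFFFF0 (that is, -16), lsr(x, 4) gives 0x0FFFFFFF while asr(x, 4) gives 0xFFFFFFFF (that is, -1).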

Rotate Instructions

Operation | Assembler | Example
Rotate right by amount in register | ROR <Rd>, <Rs> | ROR R4, R1 -> R4 = R4 rotated right by R1 bits
Rotate right with extend (rotate right by one bit, with the carry flag included in the rotate) | RRX <Rd>, <Rm> | RRX R4, R1 -> R4 = (C << 31) | (R1 >> 1)
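
A rotate differs from a shift in that the bits leaving one end re-enter at the other. The following sketch models ROR and RRX on 32-bit values; the function names and the explicit carry_in parameter are assumptions made for illustration.

```c
#include <stdint.h>

/* ROR: bits shifted out on the right re-enter on the left. */
uint32_t ror(uint32_t x, unsigned n) {
    n &= 31u;                       /* rotating by 32 is the same as by 0 */
    return n ? (x >> n) | (x << (32 - n)) : x;
}

/* RRX: shift right by one bit; the old carry flag becomes bit 31. */
uint32_t rrx(uint32_t x, unsigned carry_in) {
    return (x >> 1) | ((uint32_t)(carry_in & 1u) << 31);
}
```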

Data Conversion

Operation | Assembler | Example
Extract byte [7:0] from register, move to register, and sign-extend to 32 bits | SXTB <Rd>, <Rm> | SXTB R1, R3 -> R1 = sign-extended byte [7:0] of R3
Extract halfword [15:0] from register, move to register, and sign-extend to 32 bits | SXTH <Rd>, <Rm> | SXTH R1, R3 -> R1 = sign-extended halfword [15:0] of R3
Extract byte [7:0] from register, move to register, and zero-extend to 32 bits | UXTB <Rd>, <Rm> | UXTB R1, R3 -> R1 = zero-extended byte [7:0] of R3
Extract halfword [15:0] from register, move to register, and zero-extend to 32 bits | UXTH <Rd>, <Rm> | UXTH R1, R3 -> R1 = zero-extended halfword [15:0] of R3
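
The four extend instructions differ only in the field width and in what fills the upper bits. A minimal C model, with illustrative function names, is:

```c
#include <stdint.h>

/* SXTB: take bits [7:0] and copy bit 7 into bits [31:8]. */
uint32_t sxtb(uint32_t x) {
    uint32_t b = x & 0xFFu;
    return (b & 0x80u) ? (b | 0xFFFFFF00u) : b;
}

/* SXTH: take bits [15:0] and copy bit 15 into bits [31:16]. */
uint32_t sxth(uint32_t x) {
    uint32_t h = x & 0xFFFFu;
    return (h & 0x8000u) ? (h | 0xFFFF0000u) : h;
}

/* UXTB / UXTH: take the field and fill the upper bits with zeros. */
uint32_t uxtb(uint32_t x) { return x & 0xFFu; }
uint32_t uxth(uint32_t x) { return x & 0xFFFFu; }
```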

Reverse Instructions

Operation | Assembler | Example
Reverse bytes in word | REV <Rd>, <Rm> | REV R2, R3 -> R2 = bytes of R3 in reverse order
Reverse bytes in each halfword | REV16 <Rd>, <Rn> | REV16 R2, R3 -> R2 = R3 with the two bytes of each halfword swapped
Reverse bytes in bottom halfword and sign-extend | REVSH <Rd>, <Rn> | REVSH R4, R1 -> R4 = lower halfword of R1 with its two bytes swapped, sign-extended to 32 bits
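
The three reverse instructions are commonly used for endianness conversion. The sketch below shows what each one computes; the function names are illustrative only.

```c
#include <stdint.h>

/* REV: reverse the four bytes of the word. */
uint32_t rev(uint32_t x) {
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* REV16: swap the two bytes inside each halfword. */
uint32_t rev16(uint32_t x) {
    return ((x >> 8) & 0x00FF00FFu) | ((x << 8) & 0xFF00FF00u);
}

/* REVSH: swap the bytes of the bottom halfword, then sign-extend. */
uint32_t revsh(uint32_t x) {
    uint32_t h = ((x >> 8) & 0xFFu) | ((x & 0xFFu) << 8);
    return (h & 0x8000u) ? (h | 0xFFFF0000u) : h;
}
```

For the input 0x12345678, rev gives 0x78563412, rev16 gives 0x34127856 and revsh gives 0x00007856.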

Compare Instructions

Operation | Assembler | Example
Compare with immediate 8-bit value | CMP <Rn>, #<immed_8> | CMP R1, #10 -> compare R1 with 10 (R1 - 10) and update the flags
Compare registers | CMP <Rn>, <Rm> | CMP R1, R2 -> compare R1 with R2 (R1 - R2) and update the flags
Compare negative: compare a register value with the negation of another register value | CMN <Rn>, <Rm> | CMN R4, R6 -> compare R4 with -R6 (R4 + R6) and update the flags

Test Instructions

Operation | Assembler | Example
Test register value for set bits by ANDing it with another register value | TST <Rn>, <Rm> | TST R1, R5 -> bitwise AND of R1 and R5; flags N and Z are updated and the result is discarded
Test two register values for equality by XORing them | TEQ <Rn>, <Rm> | TEQ R1, R5 -> bitwise XOR of R1 and R5; flags N and Z are updated and the result is discarded
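
Compare and test instructions produce no register result; they only update the flags. The sketch below models the N, Z, C and V flags for CMP and CMN, and the N and Z flags for TST and TEQ. The Flags type and the function names are assumptions made for illustration.

```c
#include <stdint.h>

typedef struct { unsigned n, z, c, v; } Flags;

/* CMP rn, rm: flags of the subtraction rn - rm. */
Flags cmp_flags(uint32_t rn, uint32_t rm) {
    uint32_t r = rn - rm;
    Flags f;
    f.n = r >> 31;                            /* result is negative */
    f.z = (r == 0);                           /* operands are equal */
    f.c = (rn >= rm);                         /* carry set means no borrow */
    f.v = ((rn ^ rm) & (rn ^ r)) >> 31;       /* signed overflow */
    return f;
}

/* CMN rn, rm: flags of the addition rn + rm. */
Flags cmn_flags(uint32_t rn, uint32_t rm) {
    uint32_t r = rn + rm;
    Flags f;
    f.n = r >> 31;
    f.z = (r == 0);
    f.c = (r < rn);                           /* unsigned carry out */
    f.v = (~(rn ^ rm) & (rn ^ r)) >> 31;      /* signed overflow */
    return f;
}

/* TST / TEQ: only N and Z are modelled; C and V are kept from prev. */
Flags tst_flags(uint32_t rn, uint32_t rm, Flags prev) {
    uint32_t r = rn & rm;
    prev.n = r >> 31;
    prev.z = (r == 0);
    return prev;
}

Flags teq_flags(uint32_t rn, uint32_t rm, Flags prev) {
    uint32_t r = rn ^ rm;
    prev.n = r >> 31;
    prev.z = (r == 0);
    return prev;
}
```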

Bit Field Processing

Operation | Assembler | Example
Clear bit field: width bits of Rd, starting from bit lsb, are cleared | BFC <Rd>, #<lsb>, #<width> | BFC R3, #5, #8 -> 8 bits of R3, starting from bit 5, are cleared to zero
Insert bit field from one register value into another | BFI <Rd>, <Rn>, #<lsb>, #<width> | BFI R1, R2, #4, #6 -> the lowest 6 bits of R2 are inserted into R1, starting from bit 4
Return number of leading zeros in register value | CLZ <Rd>, <Rn> | CLZ R1, R4 -> R1 = number of leading zeros of the value in R4
Reverse bit order | RBIT <Rd>, <Rm> | RBIT R5, R6 -> reverse R6 bitwise and save the result in R5
Copy bit field from register value to register and zero-extend to 32 bits | UBFX <Rd>, <Rn>, #<lsb>, #<width> | UBFX R3, R6, #6, #8 -> R3 = the 8 bits of R6 starting from bit 6, zero-extended
Copy bit field from register value to register and sign-extend to 32 bits | SBFX <Rd>, <Rn>, #<lsb>, #<width> | SBFX R4, R7, #4, #12 -> R4 = the 12 bits of R7 starting from bit 4, sign-extended
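
The roles of the #<lsb> and #<width> operands are easy to mix up, so a C model of the bit field instructions may help. This is a sketch under the assumption that 1 <= width and lsb + width <= 32; the function names are illustrative.

```c
#include <stdint.h>

/* Mask with `width` ones in the low bits (1 <= width <= 32). */
static uint32_t low_mask(unsigned width) {
    return width >= 32 ? 0xFFFFFFFFu : ((1u << width) - 1u);
}

/* BFC: clear `width` bits of rd starting at bit `lsb`. */
uint32_t bfc(uint32_t rd, unsigned lsb, unsigned width) {
    return rd & ~(low_mask(width) << lsb);
}

/* BFI: copy the low `width` bits of rn into rd at bit `lsb`. */
uint32_t bfi(uint32_t rd, uint32_t rn, unsigned lsb, unsigned width) {
    uint32_t m = low_mask(width) << lsb;
    return (rd & ~m) | ((rn << lsb) & m);
}

/* UBFX: extract a field and zero-extend it. */
uint32_t ubfx(uint32_t rn, unsigned lsb, unsigned width) {
    return (rn >> lsb) & low_mask(width);
}

/* SBFX: extract a field and sign-extend it from its top bit. */
uint32_t sbfx(uint32_t rn, unsigned lsb, unsigned width) {
    uint32_t f = ubfx(rn, lsb, width);
    uint32_t sign = 1u << (width - 1);
    return (f ^ sign) - sign;
}

/* CLZ: count zero bits above the highest set bit (32 for x == 0). */
unsigned clz(uint32_t x) {
    unsigned n = 0;
    if (x == 0) return 32;
    while (!(x & 0x80000000u)) { x <<= 1; n++; }
    return n;
}

/* RBIT: bit 0 becomes bit 31, bit 1 becomes bit 30, and so on. */
uint32_t rbit(uint32_t x) {
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) { r = (r << 1) | (x & 1u); x >>= 1; }
    return r;
}
```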


Branching instructions

Operation | Assembler | Example

Unconditional Branch (range: −16 MB to +16 MB)
Branch to label. The label address is copied to PC. The branch is unconditional, so it is always taken. | B Label | B STOP
Branch to the address held in Rm. The address in Rm is copied to PC, and the execution state of the processor (T bit) is set from the LSB of Rm, which must be 1 on the Cortex-M3 (Thumb state). | BX Rm | BX R2
Branch to the function named by the label. The return address (the address of the next instruction) is saved in LR and then the label address is copied to PC to call the function. | BL label | BL fun
Branch to the function whose address is held in Rm. The return address is saved in LR and then the address in Rm is copied to PC to call the function. The T bit is set from the LSB of Rm, which must be 1. | BLX Rm | BLX R4

Conditional Branch: One Flag Check

Operation | Condition | Assembler
Branch if Equal | If Z flag = 1 then branch, otherwise execute next instruction | BEQ Label
Branch if Not Equal | If Z flag = 0 then branch, otherwise execute next instruction | BNE Label
Branch if Carry Set / Branch if unsigned higher or same | If C flag = 1 then branch, otherwise execute next instruction | BCS Label / BHS Label
Branch if Carry Clear / Branch if unsigned lower | If C flag = 0 then branch, otherwise execute next instruction | BCC Label / BLO Label
Branch if Minus (negative) | If N flag = 1 then branch, otherwise execute next instruction | BMI Label
Branch if Plus (positive or zero) | If N flag = 0 then branch, otherwise execute next instruction | BPL Label
Branch if overflow set | If V flag = 1 then branch, otherwise execute next instruction | BVS Label
Branch if overflow clear | If V flag = 0 then branch, otherwise execute next instruction | BVC Label


Conditional Branch: Two Flags Check

Operation | Condition | Assembler
Branch if unsigned higher | If C flag = 1 and Z flag = 0, then branch, otherwise execute next instruction | BHI Label
Branch if unsigned lower or same | If C flag = 0 or Z flag = 1, then branch, otherwise execute next instruction | BLS Label

Conditional Branch: Three Flags Check

Operation | Condition | Assembler
Branch if signed greater than | If N == V and Z = 0, then branch, otherwise execute next instruction | BGT Label
Branch if signed greater than or equal | If N == V, then branch, otherwise execute next instruction | BGE Label
Branch if signed less than | If N != V, then branch, otherwise execute next instruction | BLT Label
Branch if signed less than or equal | If N != V or Z = 1, then branch, otherwise execute next instruction | BLE Label
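
The multi-flag conditions can be written as small predicates over the N, Z, C and V flags, which makes it easy to check a condition by hand after a CMP. In this sketch the Nzcv type, make_flags and the cond_* names are illustrative and not part of any ARM toolchain.

```c
#include <stdbool.h>

typedef struct { bool n, z, c, v; } Nzcv;

/* Helper to build a flag set for experiments. */
Nzcv make_flags(bool n, bool z, bool c, bool v) {
    Nzcv f = { n, z, c, v };
    return f;
}

/* Unsigned comparisons use C and Z. */
bool cond_hi(Nzcv f) { return f.c && !f.z; }          /* BHI */
bool cond_ls(Nzcv f) { return !f.c || f.z; }          /* BLS */

/* Signed comparisons use N, V and (for GT/LE) Z. */
bool cond_ge(Nzcv f) { return f.n == f.v; }           /* BGE */
bool cond_lt(Nzcv f) { return f.n != f.v; }           /* BLT */
bool cond_gt(Nzcv f) { return !f.z && f.n == f.v; }   /* BGT */
bool cond_le(Nzcv f) { return f.z || f.n != f.v; }    /* BLE */
```

After CMP R1, R2 with R1 equal to R2 the flags are N=0, Z=1, C=1, V=0, so the BLS, BGE and BLE branches are taken while BHI, BLT and BGT are not.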

Compare and Branch

Operation | Condition | Assembler | Example
Compare zero and branch | Compare Rn with 0; if Rn == 0, then jump to label, otherwise execute next instruction | CBZ <Rn>, <label> | CBZ R1, next
Compare not zero and branch | Compare Rn with 0; if Rn != 0, then jump to label, otherwise execute next instruction | CBNZ <Rn>, <label> | CBNZ R1, next

Stack Memory Operations: PUSH and POP

Stack memory is a memory usage mechanism that allows the system memory to be used as temporary
data storage that behaves as a first-in-last-out buffer. One of the essential elements of stack memory
operation is a register called the Stack Pointer. The stack pointer indicates where the current stack
memory location is, and is adjusted automatically each time a stack operation is carried out.

In the Cortex-M processors, the Stack Pointer is register R13 in the register bank. Physically there are
two stack pointers in the Cortex-M processors, but only one of them is used at a time, depending on the
current value of the CONTROL register and the state of the processor.


In common terms, storing data to the stack is called pushing (using the PUSH instruction) and restoring
data from the stack is called popping (using the POP instruction). Depending on processor architecture,
some processors perform storing of new data to stack memory using incremental address indexing and
some use decrement address indexing. In the Cortex-M processors, the stack operation is based on a
“full-descending” stack model. This means the stack pointer always points to the last filled data in the
stack memory, and the stack pointer predecrements for each new data store (PUSH).

PUSH and POP are commonly used at the beginning and at the end of a function or subroutine.
At the beginning of a function, the current contents of the registers used by the calling program are
stored onto the stack memory using PUSH operations, and at the end of the function, the data on the
stack memory is restored to the registers using POP operations. Typically, each register PUSH
operation should have a corresponding register POP operation; otherwise the stack pointer will not
return to its original value and the registers will not be restored correctly. This can result in
unpredictable behavior, for example a function returning to an incorrect address.
The minimum data size transferred by each push or pop operation is one word (32 bits),
and multiple registers can be pushed or popped in one instruction. The stack memory accesses in the
Cortex-M processors are designed to be always word aligned (address values must be a multiple of 4,
for example, 0x0, 0x4, 0x8, …) as this gives the best efficiency for minimum design complexity. For
this reason, bits [1:0] of both stack pointers in the Cortex-M processors are hardwired to zero and read
as zero.
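
The full-descending model described above can be sketched in C as follows. This is a toy model, not the processor's implementation: the Stack type, the STACK_BYTES size and the function names are assumptions, and no overflow or underflow checks are made.

```c
#include <stdint.h>

#define STACK_BYTES 64u   /* hypothetical stack size, a multiple of 4 */

typedef struct {
    uint32_t mem[STACK_BYTES / 4];
    uint32_t sp;          /* byte offset of the last pushed word */
} Stack;

/* Empty stack: SP sits just above the top of the stack region. */
void stack_init(Stack *s) { s->sp = STACK_BYTES; }

/* PUSH: pre-decrement SP by one word, then store. */
void push(Stack *s, uint32_t value) {
    s->sp -= 4;
    s->mem[s->sp / 4] = value;
}

/* POP: load from SP, then increment SP by one word. */
uint32_t pop(Stack *s) {
    uint32_t value = s->mem[s->sp / 4];
    s->sp += 4;
    return value;
}
```

Because SP only ever changes in steps of 4, its two lowest bits stay at zero, and values come back in the reverse of the order in which they were pushed.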

In program code, the stack pointer can be accessed as either R13 or SP.
Depending on the processor state and the CONTROL register value, the stack pointer accessed can
be either the MSP or the PSP. In many simple applications, only one stack pointer is needed and by
default the MSP is used. The PSP is usually only required when an OS is used in the embedded
application.
In a typical embedded application with an OS, the OS kernel uses the MSP and the application
processes use the PSP. This keeps the stack for the kernel separate from the stack memory for the
application processes, and allows the OS to carry out context switching quickly (switching from
execution of one application process to another). Also, since exception handlers only use the main stack,
the stack spaces allocated to application tasks do not need to reserve space for exception
handlers, which gives better memory usage efficiency.
Even though the OS kernel only uses the MSP as its stack pointer, it can still access the value in the PSP
by using the special register access instructions (MRS and MSR).

Assembly Program
1. Write an assembly program to add 20 numbers
2. Write an assembly program to generate Fibonacci series
3. Write an assembly program to separate odd and even numbers of a given array
4. Write an assembly program to perform linear search
5. Write an assembly program to find y = 3x^2 + 4x + 5
6. Write an assembly program to generate multiples of 5
7. Write an assembly program to count number of 1’s in a given number
8. Write an assembly program to find if given number is negative or positive
9. Write an assembly program to add, subtract, multiply and divide two 32-bit numbers
10. Write an assembly program to add, subtract and multiply two 64-bit numbers

Embedded C Program
1. Write a C program to add 20 numbers and find average
2. Write a C program to generate Fibonacci series
3. Write a C program to separate odd and even numbers of a given array
4. Write a C program to perform linear search
5. Write a C program to find GCD and LCM.
6. Write a C program to generate multiples of 5


Module 4: Computer Arithmetic and Basic Processing Unit

Syllabus: Multiplication of unsigned and signed numbers, Fast multiplication, Integer Division, Floating
point numbers and operations, Hardwired control

Multiplication of unsigned Numbers

The product of two n-bit numbers is at most 2n bits long. Unsigned multiplication can be
viewed as the addition of shifted versions of the multiplicand. Multiplication involves the generation
of partial products, one for each bit in the multiplier. When the multiplier bit is 0, the partial
product is 0. When the multiplier bit is 1, the partial product is the multiplicand. Each successive
partial product is shifted one position to the left relative to the preceding partial product, and
the partial products are then summed to produce the final product.

Multiplication of the two integers 13 and 11 gives 143:

          1 1 0 1    (13)  multiplicand
        x 1 0 1 1    (11)  multiplier
        ---------
          1 1 0 1
        1 1 0 1
      0 0 0 0
    1 1 0 1
    ---------------
    1 0 0 0 1 1 1 1  (143) product
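
The sum of shifted partial products can be sketched in C. This is a minimal illustration for 16-bit operands, not a model of any particular hardware; the function name is hypothetical.

```c
#include <stdint.h>

/* Unsigned multiplication as a sum of shifted partial products.
   Two 16-bit operands give a product of at most 32 bits. */
uint32_t mul_shift_add(uint16_t multiplicand, uint16_t multiplier) {
    uint32_t product = 0;
    for (unsigned i = 0; i < 16; i++) {
        if ((multiplier >> i) & 1u) {
            /* Multiplier bit i is 1: add the multiplicand shifted left by i. */
            product += (uint32_t)multiplicand << i;
        }
        /* Multiplier bit i is 0: the partial product is 0, nothing to add. */
    }
    return product;
}
```

Calling mul_shift_add(13, 11) adds 13, 26 and 104 (the multiplicand shifted by 0, 1 and 3 positions) to give 143.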

Array Multiplier
Binary multiplication can be implemented in a combinational two-dimensional logic
array called an array multiplier.
• The main component in each cell is a full adder, FA.
• The AND gate in each cell determines whether a multiplicand bit, mj, is added to the
incoming partial product (PP) bit, based on the value of the multiplier bit, qi.
• Each row i, where 0 <= i <= 3, adds the multiplicand (appropriately shifted) to the
incoming partial product, PPi, to generate the outgoing partial product, PP(i+1), if
qi = 1.
• If qi = 0, PPi is passed vertically downward unchanged. PP0 is all 0s and PP4 is the
desired product. The multiplicand is shifted left one position per row by the diagonal
signal path.


Multiplier cell

Array multiplication of positive binary operands


Disadvantages:
(1) An n-bit by n-bit array multiplier requires n^2 AND gates, n(n-2) full adders and n
half adders. (Half adders are used where there are 2 inputs and full adders where there are 3
inputs.)
(2) The longest path from input to output passes through n adders in the top row, n-1 adders in the
bottom row and n-3 adders in the middle rows. The longest path in a circuit is called the critical
path.

Sequential Circuit Multiplier


Multiplication is performed as a series of n conditional addition and shift operations,
such that if the given bit of the multiplier is 0 then only a shift operation is performed, while if
the given bit of the multiplier is 1 then an addition to the partial product and a shift operation are
performed.
The combinational array multiplier uses a large number of logic gates for multiplying
numbers. Multiplication of two n-bit numbers can also be performed in a sequential circuit that
uses a single n-bit adder.
The block diagram in the figure shows the hardware arrangement for sequential
multiplication. This circuit performs multiplication by using a single n-bit adder n times to
implement the spatial addition performed by the n rows of ripple-carry adders in the array multiplier.
Registers A and Q are shift registers, concatenated as shown. Together, they hold partial product
PPi while multiplier bit qi generates the signal Add/Noadd. This signal causes the multiplexer
MUX to select 0 when qi = 0, or to select the multiplicand M when qi = 1, to be added to PPi to
generate PP(i + 1). The product is computed in n cycles. The partial product grows in length by
one bit per cycle from the initial vector, PP0, of n 0s in register A. The carry-out from the adder
is stored in flip-flop C, shown at the left end of register A.

Algorithm:
(1) The multiplier and the multiplicand are loaded into registers Q and M respectively.
Register A and flip-flop C are cleared to 0.
(2) In each cycle, one of the following is performed, depending on the LSB of the multiplier, qi:
(a) If qi = 1, the control sequencer generates the Add signal, which
adds the multiplicand M to register A and stores the result in
A.
(b) If qi = 0, it generates the Noadd signal, which keeps the previous value in register
A.
(3) Registers C, A and Q are then shifted right by 1 bit. Steps (2) and (3) are repeated for n cycles.
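
The algorithm can be traced in software by modelling registers C, A, Q and M as variables. The following is a sketch under the assumption 1 <= n <= 16, with a hypothetical function name; it follows the add-then-shift steps above rather than any specific hardware design.

```c
#include <stdint.h>

/* Sequential (add-and-shift) multiplication of two n-bit unsigned numbers.
   Variables a, q, m and c play the roles of registers A, Q, M and flip-flop C. */
uint32_t seq_multiply(uint32_t m, uint32_t q, unsigned n) {
    uint32_t mask = (1u << n) - 1u;
    uint32_t a = 0, c = 0;
    m &= mask;
    q &= mask;
    for (unsigned i = 0; i < n; i++) {
        if (q & 1u) {                 /* Add: A = A + M, carry-out goes to C */
            uint32_t sum = a + m;
            c = (sum >> n) & 1u;
            a = sum & mask;
        } else {                      /* Noadd: A is unchanged, C = 0 */
            c = 0;
        }
        /* Shift C, A, Q right by one bit as a single long register. */
        q = (q >> 1) | ((a & 1u) << (n - 1));
        a = (a >> 1) | (c << (n - 1));
        c = 0;
    }
    return (a << n) | q;              /* the 2n-bit product is held in A,Q */
}
```

For M = 1101 (13) and Q = 1011 (11) with n = 4, the registers end with A = 1000 and Q = 1111, that is 10001111 = 143.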


Multiplication of Signed Numbers:


For the multiplication of 2’s-complement operands, generating a double-length product, the
general strategy is still to accumulate partial products by adding versions of the multiplicand as
selected by the multiplier bits.
First, consider the case of a positive multiplier and a negative multiplicand. When we add a
negative multiplicand to a partial product, we must extend the sign-bit value of the multiplicand
to the left as far as the product will extend. The figure shows an example in which a 5-bit signed
operand, −13, is the multiplicand. It is multiplied by +11 to get the 10-bit product, −143. The
sign extension of the multiplicand is shown in blue. The hardware discussed earlier can be used for
negative multiplicands if it is augmented to provide for sign extension of the partial products.
13 -> 1101
+13 -> 01101 [for a positive number, add 0 as the MSB]
-13 -> 10010 + 1 = 10011 [for a negative number, find the 2’s complement: invert the bits of +13 and add 1]



[To extend the sign bit: since the operands are 5-bit signed numbers, a 10-bit product must be
generated. So, if the partial product’s MSB is 1, extend it to the left with 1s;
if the partial product’s MSB is 0, extend it to the left with 0s.]

Example: Sign extension of negative multiplicand

For a negative multiplier, a straightforward solution is to form the 2’s-complement of
both the multiplier and the multiplicand and proceed as in the case of a positive multiplier.
This is possible because complementation of both operands does not change the value or the sign
of the product.
[If the sign bit is 0, the number is positive; if the sign bit is 1, the number is negative.]

The Booth Algorithm

1. The multiplicand is placed in BR and the multiplier in QR.
2. Accumulator register AC and the extra bit Qn+1 are initialized to 0.
3. Sequence counter SC is initialized to n (the number of bits).
4. Compare Qn and Qn+1 and perform the following:
   01 -> AC = AC + BR
   10 -> AC = AC + BR’ + 1 (that is, AC = AC − BR)
   00 -> No arithmetic operation
   11 -> No arithmetic operation
5. ASHR: arithmetic shift right of AC, QR and Qn+1 by one bit.
6. Decrement SC by 1. If SC is not 0, repeat from step 4.
The final product will be stored in AC, QR.
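
The steps can be traced with a small C model in which ac, qr, br and q_extra stand for AC, QR, BR and Qn+1. This is a sketch under stated assumptions: 2 <= n <= 16, both operands fit in n bits as 2’s-complement values, and the multiplicand is not the most negative n-bit value (whose negation does not fit in n bits). The function name is hypothetical.

```c
#include <stdint.h>

/* Booth multiplication of two n-bit 2's-complement numbers.
   Returns the 2n-bit product held in AC,QR, sign-extended to 32 bits. */
int32_t booth_multiply(int32_t multiplicand, int32_t multiplier, unsigned n) {
    uint32_t mask = (1u << n) - 1u;
    uint32_t br = (uint32_t)multiplicand & mask;
    uint32_t qr = (uint32_t)multiplier & mask;
    uint32_t ac = 0, q_extra = 0;                 /* q_extra is Qn+1 */

    for (unsigned sc = n; sc > 0; sc--) {
        uint32_t qn = qr & 1u;
        if (qn == 0 && q_extra == 1) {
            ac = (ac + br) & mask;                /* 01: AC = AC + BR */
        } else if (qn == 1 && q_extra == 0) {
            ac = (ac + (~br & mask) + 1u) & mask; /* 10: AC = AC + BR' + 1 */
        }
        /* Arithmetic shift right of AC, QR, Qn+1: the sign bit of AC is kept. */
        q_extra = qn;
        qr = (qr >> 1) | ((ac & 1u) << (n - 1));
        ac = (ac >> 1) | (ac & (1u << (n - 1)));
    }

    uint32_t product = (ac << n) | qr;            /* 2n-bit result */
    uint32_t sign = 1u << (2 * n - 1);
    return (int32_t)((product ^ sign) - sign);    /* sign-extend from 2n bits */
}
```

With n = 5, booth_multiply(-13, 11, 5) ends with AC = 11011 and QR = 10001, the 10-bit pattern 1101110001, which is −143.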