BIOEN 442: Microprocessor
Chapter 3 – Sec 3.3
Time Delay
Instruction Pipeline
Branch penalty
Week 7
Biomedical Engineering Department
College of Engineering
Imam Abdulrahman Bin Faisal University, Saudi Arabia
Eng. Kamran Hameed
Today's topic
0- Course Overview / Introduction to Computing
1- PIC Microcontrollers: history and features
2- PIC Architecture, PIC Programming using Assembly Language
3- Branch, Call and Time Delay Loop
4- PIC I/O Port Programming
5- Arithmetic, Logic Instructions, and Programs
12- LCD and Keypad Interfacing
13- ADC, DAC and sensor interfacing
17- Motor Control: Relay, PWM, DC and Stepper Motors
PIC 18 TIME DELAY AND
INSTRUCTION PIPELINE
❑ In the last section we used the DELAY subroutine.
❑ In this section we discuss how to generate various
time delays and calculate exact delays for the PIC18.
❑ We will also discuss instruction pipelining and its
impact on execution time.
❑ Branch penalty
Delay calculation for the PIC18
❑ Two factors can affect the accuracy of the delay:
1. The crystal frequency: the frequency of the crystal oscillator connected to
the OSC1 and OSC2 input pins. The duration of the clock period for the
instruction cycle is a function of this frequency.
2. The instruction cycle duration: the number of instruction cycles each
instruction takes.
❑ Most PIC18 instructions take 1 instruction cycle.
❑ Three design features make that possible (see next slide):
Delay calculation for the PIC18
Pipelining: overlapping the fetch and execute of instructions
Harvard architecture: to get the maximum amount of code and
data into the CPU
RISC architecture: fixed-size instructions
Pipeline Vs. Non-pipeline
❑ In a non-pipelined CPU, fetch and execute cannot overlap: the CPU can
either fetch or execute at a given time. In other words, the CPU has to
fetch an instruction from memory, then execute it, then fetch the next
instruction, execute it, and so on.
❑ The idea of pipelining in its simplest form is to allow the
CPU to fetch and execute at the same time
Pipelining instructions and executions in the
microcontroller
❑ During the first instruction cycle
clock, instruction 1 is fetched
from the memory, then during the
second instruction cycle clock,
instruction 2 is fetched while
instruction 1 executes.
❑ This overlapping, or pipelining,
allows most instructions to
execute in a single instruction cycle,
improving the efficiency of the
microcontroller.
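The benefit of overlapping fetch and execute can be illustrated with a small cycle count. This is only a sketch under simplifying assumptions: every fetch and every execute is taken to last exactly one instruction cycle, the code is straight-line (no branches), and the function names are illustrative rather than anything from the PIC18 documentation.

```python
def cycles_without_pipeline(n_instructions: int) -> int:
    """Fetch and execute strictly alternate: two cycles per instruction."""
    return 2 * n_instructions


def cycles_with_pipeline(n_instructions: int) -> int:
    """The first cycle only fetches; every later cycle executes one
    instruction while the next one is being fetched."""
    return n_instructions + 1 if n_instructions else 0


if __name__ == "__main__":
    for n in (1, 4, 100):
        print(n, cycles_without_pipeline(n), cycles_with_pipeline(n))
```

For long instruction sequences the pipelined count approaches one cycle per instruction, which is why most instructions can be treated as single-cycle in delay calculations.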
Pipeline Activity After the Instruction Has
Been Fetched
❑ In superpipelining, the process of executing instructions is split into
many small steps that are all executed in parallel.
❑ Q1: the instruction that is already fetched and sitting in the queue is
decoded.
❑ Q2: the operand is fetched from the file register.
❑ Q3: the operation is performed, for example the addition of the two
numbers.
❑ Q4: the result is written into the destination register.
Pipeline Activity for Both Fetch and Execute
❑ In reality, the PIC18 superpipeline can be drawn for four
instructions, as shown in the figure.
Instruction cycle time for the PIC
❑ The CPU takes a certain amount of time to execute an instruction.
In the PIC this time is measured in instruction cycles.
❑ All PIC18 instructions are either 2-byte or 4-byte, and
most take no more than one or two instruction
cycles to execute.
❑ The duration of the instruction cycle depends on the frequency of the
oscillator connected to the PIC system.
❑ The crystal oscillator, along with on-chip circuitry, provides the
clock source for the PIC CPU.
❑ One instruction cycle consists of four oscillator periods.
❑ To find the instruction cycle period, take 1/4 of the crystal frequency,
then take its inverse, as shown in Example 3-14 (see next slide).
Example 3-14
❑ The following shows the crystal frequency for three different
PIC-based systems. Find the period of the instruction cycle in
each case.
(a) 4 MHz (b) 16 MHz (c) 20 MHz
Solution:
(a) 4 MHz/4 = 1 MHz; instruction cycle = 1/(1 MHz) = 1 µs
(microsecond)
(b) 16 MHz/4 = 4 MHz; instruction cycle = 1/(4 MHz) = 0.25 µs
= 250 ns (nanosecond)
(c) 20 MHz/4 = 5 MHz; instruction cycle = 1/(5 MHz) = 0.2 µs
= 200 ns
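The same calculation can be written as a short helper. This is a minimal sketch; the function name `instruction_cycle_ns` is an illustrative choice, and the only rule it encodes is the one stated above (one instruction cycle is four oscillator periods).

```python
def instruction_cycle_ns(crystal_hz: float) -> float:
    """Return the instruction cycle period in nanoseconds.

    One instruction cycle is four oscillator periods, so the
    instruction clock runs at one quarter of the crystal frequency.
    """
    instruction_clock_hz = crystal_hz / 4
    return 1e9 / instruction_clock_hz


if __name__ == "__main__":
    for mhz in (4, 16, 20):
        period = instruction_cycle_ns(mhz * 1e6)
        print(f"{mhz} MHz crystal: {period:.0f} ns per instruction cycle")
```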
Branch penalty
❑ The overlapping of fetch and execution of instructions is
widely used in today's microcontrollers.
❑ A queue is needed to hold the prefetched instruction until it is ready
to be executed.
❑ In some circumstances, the CPU must flush out the queue. When?
❑ When a branch instruction is executed.
❑ The CPU starts to fetch code from the new memory location,
and the code in the queue that was fetched previously is
discarded.
Branch penalty (continued)
❑ In this case, the execution unit must wait until the fetch unit
fetches the new instruction. This is called a branch penalty.
❑ The penalty is an extra instruction cycle to fetch the instruction
from the target location instead of executing the instruction right
below the branch.
❑ Because of this penalty, the instructions that change the program flow
take two instruction cycles. These are GOTO, BRA, CALL, and all the
conditional branch instructions such as BNZ, BC, and so on, when they jump.
❑ A conditional branch instruction takes only one
instruction cycle if it does not jump. For example, BNZ
jumps if Z = 0, and that takes two instruction cycles. If Z =
1, it falls through and takes only one instruction cycle.
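The effect of the branch penalty on a countdown loop can be sketched as follows. The loop modelled here is hypothetical: a body of `body_cycles` single-cycle instructions (including the DECF that sets the Z flag) followed by a BNZ back to the top. The function names are illustrative.

```python
def bnz_cycles(taken: bool) -> int:
    """BNZ costs 2 instruction cycles when it jumps, 1 when it falls through."""
    return 2 if taken else 1


def countdown_loop_cycles(passes: int, body_cycles: int) -> int:
    """Total instruction cycles for a loop that runs `passes` times.

    BNZ jumps back on every pass except the last one, where the
    counter reaches zero (Z = 1) and execution falls through.
    """
    total = 0
    for remaining in range(passes, 0, -1):
        total += body_cycles
        total += bnz_cycles(taken=(remaining > 1))
    return total


if __name__ == "__main__":
    # Hypothetical loop: NOP, NOP, DECF (3 cycles) plus BNZ, 250 passes.
    print(countdown_loop_cycles(passes=250, body_cycles=3))
```

The result is always one cycle less than `passes * (body_cycles + 2)`, which is the correction applied by hand in the delay examples.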
Instruction cycle time for the PIC
Example 3-16
❑ Find the size of the delay of the code snippet below if the
crystal frequency is 4 MHz:
❑ Notice that BNZ takes two instruction cycles if it jumps back, and takes only
one when falling through the loop. That means the delay computed above must be
reduced by one instruction cycle, giving 1277 µs.
Example 3-17
❑ Find the size of the delay in the following program if the crystal
frequency is 4 MHz
Loop inside a loop delay
❑ Example 3-18: For an instruction cycle of 1 µs, find the time delay in the following
subroutine
Execution time of each instruction, in program order
(instruction cycles × repetitions, at 1 µs per cycle):
1 µs
1 µs
1 × 200 = 200 µs
1 × 200 = 200 µs
1 × 250 × 200 = 50000 µs
1 × 250 × 200 = 50000 µs
1 × 250 × 200 = 50000 µs
2 × 250 × 200 = 100000 µs
1 × 200 = 200 µs
2 × 200 = 400 µs
1 µs
Total delay = 251003 µs = 251003/1000 ms = 251.003 milliseconds
(this count charges BNZ two cycles on every pass, including the passes
where it falls through)
Loop inside a loop delay
❑ Example 3-18: For an instruction cycle of 1 µs, find the time delay in the following
subroutine
❑ For the HERE loop: (5 × 250) × 1 µs = 1250 µs
❑ The AGAIN loop repeats the HERE loop
200 times; therefore
200 × 1250 µs = 250000 µs
❑ The instructions at the beginning and end of the
AGAIN loop add 5 × 200 × 1 µs = 1000 µs
❑ The MOVLW, MOVWF and RETURN
instructions execute only once:
3 × 1 µs = 3 µs
❑ Subtract 200 µs for the 200 times BNZ HERE falls through
and 1 µs for the one time BNZ AGAIN falls through.
❑ As a result, we have 250000 + 1000 + 3 −
200 − 1 = 250802 µs ≈ 250.8 ms for the total
time delay.
Total delay = 250802 µs = 250802/1000 ms ≈ 250.8 milliseconds
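The arithmetic of Example 3-18 can be checked step by step. This is a sketch that simply mirrors the hand calculation; the variable names are illustrative, and the cycle counts (5 cycles per loop pass, 3 instructions executed once) are taken from the example as stated.

```python
CYCLE_US = 1          # instruction cycle of 1 µs, as given in the example

INNER_PASSES = 250    # HERE loop count
OUTER_PASSES = 200    # AGAIN loop count

# One run of the HERE loop, charging BNZ two cycles on every pass.
here_loop = 5 * INNER_PASSES                      # 1250 cycles
# The AGAIN loop repeats the HERE loop OUTER_PASSES times.
here_total = here_loop * OUTER_PASSES             # 250000 cycles
# Instructions at the beginning and end of the AGAIN loop.
again_overhead = 5 * OUTER_PASSES                 # 1000 cycles
# MOVLW, MOVWF and RETURN execute once.
executed_once = 3

uncorrected = here_total + again_overhead + executed_once

# BNZ HERE falls through once per AGAIN pass, BNZ AGAIN falls through once.
fall_through = OUTER_PASSES + 1

corrected = uncorrected - fall_through

if __name__ == "__main__":
    print("without fall-through correction:", uncorrected * CYCLE_US, "µs")
    print("with fall-through correction:   ", corrected * CYCLE_US, "µs")
```

The uncorrected figure is the per-instruction total of 251003 µs, and the corrected one is the 250802 µs obtained after subtracting the fall-through cycles.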
Example 3-19
❑ Find the time delay for the following subroutine, assuming a
crystal frequency of 4 MHz. Discuss the disadvantage of this
approach compared with Example 3-18.
Solution (Example 3-19)
❑ The time delay inside the AGAIN loop is [200 × (13 + 2)] × 1 µs
= 3000 µs.
❑ NOP is a 2-byte instruction, even though it does nothing
except waste cycle time.
❑ There are 17 instructions in the above DELAY program, and all
of them are 2-byte instructions.
❑ This means the DELAY program takes 17 × 2 = 34 bytes of ROM code space,
and gives us only a 3000 µs delay.
❑ That is the reason we use a nested loop instead of NOP
instructions to create a time delay.
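Delay generation for 1 sec with a 10 MHz crystal
❑ Instruction cycle = 1/(10 MHz/4) = 0.4 µs

                            Instruction cycles
        MOVLW   D'20'       1
        MOVWF   R4          1
BACK    MOVLW   D'100'      1
        MOVWF   R3          1
AGAIN   MOVLW   D'250'      1
        MOVWF   R2          1
HERE    NOP                 1
        NOP                 1
        DECF    R2, F       1
        BNZ     HERE        2
        DECF    R3, F       1
        BNZ     AGAIN       2
        DECF    R4, F       1
        BNZ     BACK        2
        RETURN              1

❑ HERE loop: (5 × 250) × 0.4 µs = 500 µs
❑ The AGAIN loop repeats the HERE loop 100 times: 500 µs × 100 = 50 ms
❑ The instructions at the beginning and end of the AGAIN loop add
(5 × 100) × 0.4 µs = 200 µs, so one run of the AGAIN loop takes
50 ms + 200 µs = 50.2 ms
❑ The BACK loop repeats the AGAIN loop 20 times: 50.2 ms × 20 = 1.004 s
❑ The instructions at the beginning and end of the BACK loop add
(5 × 20) × 0.4 µs = 40 µs, giving 1.004 s + 40 µs = 1.00404 s
❑ The first two instructions and RETURN execute once: 3 × 0.4 µs = 1.2 µs
❑ Subtract (100 × 20) × 0.4 µs = 800 µs for the 2000 times BNZ HERE falls through
❑ Subtract 20 × 0.4 µs = 8 µs for the 20 times BNZ AGAIN falls through
❑ Subtract 1 × 0.4 µs = 0.4 µs for the one time BNZ BACK falls through
❑ As a result, we have
1.00404 s + 1.2 µs − 800 µs − 8 µs − 0.4 µs
= 1.0032328 s, approximately 1 sec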
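The delay of a nested-loop subroutine of this shape can also be counted directly, loop by loop. The sketch below is an illustration rather than a tool from the course: the function name and parameters are hypothetical, and it uses the cycle counts listed on the slide (MOVLW/MOVWF setup = 2, NOP + NOP + DECF = 3, DECF = 1, BNZ = 2 when it jumps and 1 when it falls through, RETURN = 1). Because it counts every pass exactly, the figure it reports may differ slightly from a hand calculation that rounds intermediate results.

```python
def nested_delay_cycles(counts, innermost_body=3, setup=2, decrement=1):
    """Instruction cycles for a nested countdown delay subroutine.

    counts: loop counts ordered from the innermost loop outwards,
            e.g. [250, 100, 20] for HERE, AGAIN, BACK.
    """
    cycles = 0
    for depth, passes in enumerate(counts):
        if depth == 0:
            body = innermost_body                 # NOP, NOP, DECF
        else:
            body = setup + cycles + decrement     # reload inner loop, run it, DECF
        # BNZ costs 2 on every pass except the last, where it falls through.
        cycles = passes * (body + 2) - 1
    return setup + cycles + 1                     # outer MOVLW/MOVWF and RETURN


if __name__ == "__main__":
    cycle_us = 1 / (10e6 / 4) * 1e6               # 0.4 µs at 10 MHz
    total = nested_delay_cycles([250, 100, 20])
    print(total, "cycles =", total * cycle_us / 1e6, "s")
```

Called with `[250, 200]` the same function gives the cycle count of the two-level subroutine in Example 3-18, which makes it easy to try other loop counts when tuning a delay.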