Performance of Processor
Processor Performance - Terminologies
Clock Rate CR
Cycle Count CC
Cycle Time CT
Cycles Per Instruction CPI
Instruction Count IC
Million Instructions Per Second MIPS
Million Floating-Point Operations Per Second MFLOPS
Machine Clock Rate
Clock Rate (CR), measured in MHz or GHz, is the inverse of the clock Cycle Time (CT), also called the clock period
CT = 1 / CR
10 nsec clock cycle => 100 MHz clock rate
5 nsec clock cycle => 200 MHz clock rate
2 nsec clock cycle => 500 MHz clock rate
1 nsec clock cycle => 1 GHz clock rate
500 psec clock cycle => 2 GHz clock rate
250 psec clock cycle => 4 GHz clock rate
200 psec clock cycle => 5 GHz clock rate
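The table above follows directly from taking the inverse of the cycle time. A minimal Python sketch of the conversion is shown here; the function name `clock_rate_hz` is illustrative and not part of the slides.

```python
def clock_rate_hz(cycle_time_s: float) -> float:
    """Return the clock rate in Hz for a clock cycle time given in seconds."""
    if cycle_time_s <= 0:
        raise ValueError("cycle time must be positive")
    return 1.0 / cycle_time_s


# A few cycle times from the table above, expressed in seconds.
for label, cycle_time in [("10 ns", 10e-9), ("1 ns", 1e-9), ("250 ps", 250e-12)]:
    print(f"{label} -> {clock_rate_hz(cycle_time) / 1e6:.0f} MHz")
```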
Performance of computer
The clock is used to synchronize the operation of a unit
Clock cycle: one of the discrete time intervals at which events happen in a
computer system
The length of each clock cycle is called the clock period.
A clock period is also called a tick or clock tick.
Computer Clock
The clock rate is the inverse of the clock cycle time,
i.e., Clock Rate = 1 / Clock Cycle Time
The clock cycle time is the amount of time for one clock period to elapse
(e.g. 5 ns).
Question
If a computer has a clock cycle time of 5 ns, what is the clock rate?
Performance metrics of a computer system
Execution time: It is the time taken to finish a task
CPU execution time is a combination of user CPU time and system CPU time
Throughput: It is defined as the total quantity of completed work in a
specific period of time.
Processor Performance Metrics
Execution time: It is the time taken to finish a task
Response time: the time between the start and the completion of a task (in time
units)
Throughput: the total amount of tasks done in a given time period (in number of
tasks per unit of time)
• Example: Car assembly factory:
4 hours to produce a car (response time)
6 cars produced per hour (throughput)
Defining (Speed) Performance
We are normally interested in reducing
Response time (execution time): the time between the start and the completion of a task
Important to individual users
Thus, to maximize performance, we need to minimize execution time
$$\text{performance}_X = \frac{1}{\text{execution\_time}_X}$$
If X is n times faster than Y, then
$$\frac{\text{performance}_X}{\text{performance}_Y} = \frac{\text{execution\_time}_Y}{\text{execution\_time}_X} = n$$
How many times faster is machine A?
Problem:
Machine A runs a program in 20 seconds
Machine B runs the same program in 25 seconds
How many times faster is machine A than machine B?
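One way to check an answer to this kind of question is to compute the ratio of execution times directly. The sketch below assumes the definition performance = 1 / execution time; the helper name `times_faster` is made up for illustration.

```python
def times_faster(time_slow_s: float, time_fast_s: float) -> float:
    """Return n such that the faster machine is n times faster than the slower one.

    Uses performance = 1 / execution_time, so n = time_slow / time_fast.
    """
    if time_slow_s <= 0 or time_fast_s <= 0:
        raise ValueError("execution times must be positive")
    return time_slow_s / time_fast_s


# Machine B takes 25 s, machine A takes 20 s.
print(times_faster(25.0, 20.0))  # 1.25
```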
Performance Factors
We want to distinguish elapsed time from the time spent on our task
CPU execution time (CPU time): the time the CPU spends working on a task
It does not include time spent waiting for I/O or running other programs
$$\text{CPU execution time for a program} = \text{CPU clock cycles for a program} \times \text{clock cycle time}$$
or
$$\text{CPU execution time for a program} = \frac{\text{CPU clock cycles for a program}}{\text{clock rate}}$$
We can improve performance by reducing either the length of the clock cycle or
the number of clock cycles required for a program
Performance Equation
Our basic performance equation is then
$$\text{CPU time} = \text{Instruction\_count} \times \text{CPI} \times \text{clock\_cycle}$$
or
$$\text{CPU time} = \frac{\text{Instruction\_count} \times \text{CPI}}{\text{clock\_rate}}$$
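The performance equation can be sketched as a small Python function. The workload numbers in the example (2 billion instructions, CPI of 1.5, 1 GHz clock) are hypothetical values chosen only to show the arithmetic.

```python
def cpu_time_seconds(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time = instruction count x CPI / clock rate."""
    if clock_rate_hz <= 0:
        raise ValueError("clock rate must be positive")
    return instruction_count * cpi / clock_rate_hz


# Hypothetical workload: 2e9 instructions, CPI of 1.5, 1 GHz clock.
print(cpu_time_seconds(2e9, 1.5, 1e9))  # 3.0
```

Halving the CPI or doubling the clock rate halves the CPU time, which is why the equation is useful for comparing design options.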
Factors that affect Performance
These equations separate the three key factors that affect
performance: instruction count, CPI, and clock rate
We can measure the CPU execution time by running the program
The clock rate is usually given
We can measure the overall instruction count by using profilers or simulators,
without knowing all of the implementation details
CPI varies by instruction type and ISA implementation, so computing it
requires knowing the implementation details
Cycles Per Instruction (CPI)
Computing the CPI is done by looking at the different types of instructions
and their individual cycle counts
$$\text{CPI} = \sum_{i=1}^{n} \text{CPI}_i \times \text{IC}_i$$
where
IC_i is the fraction (percentage) of executed instructions that belong to
class i
CPI_i is the (average) number of clock cycles per instruction for
that instruction class
n is the number of instruction classes
Effective CPI
The overall effective CPI for an application can then be calculated as
$$\text{CPI} = \frac{\sum_{i=1}^{n} \text{CPI}_i \times \text{IC}_i}{\text{IC}_{total}}$$
where
IC_i is the number of instructions of class i executed
IC_total is the total number of machine instructions executed
CPI_i is the (average) number of clock cycles per instruction for that instruction class
n is the number of instruction classes
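The effective CPI is a weighted average of the per-class CPI values, weighted by how many instructions of each class execute. A minimal Python sketch is shown below; the instruction mix is a hypothetical example, not data from the slides.

```python
def effective_cpi(classes):
    """Return the effective CPI for a list of (instruction_count, cpi) pairs.

    Each pair describes one instruction class: how many instructions of that
    class were executed and how many cycles each one takes on average.
    """
    classes = list(classes)
    total_instructions = sum(count for count, _ in classes)
    if total_instructions == 0:
        raise ValueError("at least one instruction must be executed")
    total_cycles = sum(count * cpi for count, cpi in classes)
    return total_cycles / total_instructions


# Hypothetical mix: 500 one-cycle, 300 two-cycle, and 200 three-cycle instructions.
mix = [(500, 1), (300, 2), (200, 3)]
print(effective_cpi(mix))  # 1.7
```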
F: clock frequency
T: processor execution time
Ic: instruction count
MIPS: millions of instructions per second
For example, a program that executes 3 million instructions in 2
seconds has a MIPS rating of 1.5
Advantage: Easy to understand and measure
Disadvantage: May not reflect actual performance, since a mix of simple,
fast instructions yields a higher rating.
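The MIPS rating can be computed either from a measured run or from the clock rate and CPI, since CPU time = instruction count x CPI / clock rate. The sketch below shows both forms; the function names and the 100 MHz, CPI 2 example are illustrative assumptions.

```python
def mips_from_time(instruction_count: float, execution_time_s: float) -> float:
    """MIPS = instruction count / (execution time x 10^6)."""
    return instruction_count / (execution_time_s * 1e6)


def mips_from_cpi(clock_rate_hz: float, cpi: float) -> float:
    """MIPS = clock rate / (CPI x 10^6), obtained by substituting the CPU time equation."""
    return clock_rate_hz / (cpi * 1e6)


# The example from the text: 3 million instructions in 2 seconds.
print(mips_from_time(3e6, 2.0))   # 1.5
# Hypothetical processor: 100 MHz clock with an average CPI of 2.
print(mips_from_cpi(100e6, 2.0))  # 50.0
```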
MFLOPS: millions of floating-point operations per second
For example, a program that executes 4 million floating-point operations in
5 seconds has an MFLOPS rating of 0.8
Advantage: Easy to understand and measure
Disadvantages: Same as MIPS, and it measures only floating-point work
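Amdahl's law uses two factors of an enhancement:
Fraction enhanced (f): the fraction of execution time that benefits from the enhancement, e.g. the part that can run on N processors
Speedup enhanced (s): how much faster the enhanced fraction runs after the enhancement than before
$$\text{Speedup}_{overall} = \frac{1}{(1 - f) + \frac{f}{s}}$$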
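A small Python sketch of Amdahl's law is given below, with f as the enhanced fraction of execution time and s as the speedup of that fraction. The function name `amdahl_speedup` is an assumption made for illustration.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup = 1 / ((1 - f) + f / s)."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# 90% of the run time can be spread over 10 processors.
print(round(amdahl_speedup(0.9, 10), 2))  # 5.26
```

Even with an unlimited number of processors, the unenhanced 10% caps the speedup at 1 / 0.1 = 10.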
Problem
Suppose that we are considering developing a parallel program to improve on an existing
sequential program and that we determine that 10% of the execution time of the sequential
program is spent in inherently sequential code. (We have to inspect the code to determine this.)
The remaining code can be parallelized, although we do not as yet know how many processors
would be optimal. What is the maximum possible speedup that could be obtained if we were to
develop a parallel version that used ten processors?
Max. Speedup = 1 / (0.1 + 0.9/10) = 1 / 0.19 ≈ 5.26
Problem
Suppose that we know that 20% of inherently sequential computation in the
problem of interest is made parallel. What is the least number of processors
that we need to use to obtain a speedup of 6.0?
Amdahl's Law: Speedup-Based Problem
Consider a CPU used for web serving. An enhancement makes the computation in web service
applications, which accounts for 30% of the work, run 10 times faster. What is the overall speedup?
Solution
Fraction_enhanced = f = 30% = 0.3
Speedup_enhanced = SU_f = 10
Speedup_overall = 1 / ((1 - 0.3) + (0.3 / 10))
= 1 / (0.7 + 0.03)
= 1 / 0.73
≈ 1.37
MODULE 1
Introduction and Overview of Computer Architecture
Problems to be solved
Problem 1
Assume that the number of instructions in the program is 1,000,000,000.
Suppose we have two implementations of the same instruction set architecture
(ISA). For this program,
Machine A has a clock cycle time of 10 ns and a CPI of 2.0
Machine B has a clock cycle time of 20 ns and a CPI of 1.2
Which machine is faster for this program, and by
how much?
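One possible way to work Problem 1 is to apply CPU time = instruction count x CPI x cycle time to each machine and compare. The sketch below takes the cycle time in nanoseconds; the helper name is illustrative.

```python
def cpu_time_seconds(instruction_count: int, cpi: float, cycle_time_ns: float) -> float:
    """CPU time in seconds, with the clock cycle time given in nanoseconds."""
    return instruction_count * cpi * cycle_time_ns / 1e9


INSTRUCTIONS = 1_000_000_000
time_a = cpu_time_seconds(INSTRUCTIONS, 2.0, 10)  # Machine A
time_b = cpu_time_seconds(INSTRUCTIONS, 1.2, 20)  # Machine B

print(f"A: {time_a:.1f} s, B: {time_b:.1f} s, A is {time_b / time_a:.2f}x faster")
```

Under these figures Machine A takes about 20 s and Machine B about 24 s, so A comes out roughly 1.2 times faster despite its higher CPI.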
Problem 2
Solve the following:
The CPU clock rate is 1 MHz and the program takes 45 million cycles to
execute. What is the CPU time?
The CPU clock rate is 500 MHz and the program takes 45 million cycles to
execute. What is the CPU time?
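Both parts of Problem 2 use CPU time = clock cycles / clock rate. A short Python sketch of that calculation, with an illustrative function name, is:

```python
def cpu_time_seconds(cycle_count: float, clock_rate_hz: float) -> float:
    """CPU time = number of clock cycles / clock rate."""
    if clock_rate_hz <= 0:
        raise ValueError("clock rate must be positive")
    return cycle_count / clock_rate_hz


print(cpu_time_seconds(45e6, 1e6))    # 45.0  (1 MHz clock)
print(cpu_time_seconds(45e6, 500e6))  # 0.09  (500 MHz clock)
```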
Problem 3
You are on the design team for a new processor. The clock of the processor runs at 200 MHz. The
following table gives instruction frequencies for Benchmark B, as well as how many cycles the
instructions take, for the different classes of instructions. Calculate the CPI and MIPS.
Problem 4
Problem 5
Suppose a program (or a program task) takes 1 billion instructions to
execute on a processor running at 2 GHz. Suppose also that 50% of the
instructions execute in 3 clock cycles, 30% execute in 4 clock cycles, and
20% execute in 5 clock cycles. What is the execution time for the
program or task?
Note: We have the instruction count: 10^9 instructions. The clock cycle time can be
computed quickly from the clock rate to be 0.5×10^-9 seconds. So we only need to
compute clocks per instruction as an effective value:
CPI = 0.5×3 + 0.3×4 + 0.2×5 = 3.7
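The remaining arithmetic for Problem 5 can be sketched in Python as follows. The variable names are illustrative; the numbers come from the problem statement.

```python
INSTRUCTION_COUNT = 1e9  # 1 billion instructions
CLOCK_RATE_HZ = 2e9      # 2 GHz

# (fraction of instructions, cycles per instruction) for each class.
mix = [(0.5, 3), (0.3, 4), (0.2, 5)]

# Effective CPI is the frequency-weighted average of the per-class cycle counts.
effective_cpi = sum(fraction * cycles for fraction, cycles in mix)

# CPU time = instruction count x CPI / clock rate.
execution_time_s = INSTRUCTION_COUNT * effective_cpi / CLOCK_RATE_HZ

print(f"effective CPI = {effective_cpi:.1f}, execution time = {execution_time_s:.2f} s")
```

With these inputs the effective CPI works out to 3.7 and the execution time to about 1.85 seconds.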
Problem 6
Problem 7
We want to compare the computers R1 and R2, which differ in that R1 has machine
instructions for floating-point operations, while R2 does not (FP operations are implemented
in software using several non-FP instructions). Both computers have a clock frequency of
400 MHz. Both run the same program, which has the following instruction
mix:
a) Calculate the MIPS rating for the computers R1 and R2.
b) Calculate the CPU execution time of the program on the computers R1 and R2 if there are
12000 instructions in the program.
Problem 8
The clock of the processor runs at 200 MHz, and the benchmark averages
4.4 cycles per instruction. Compute the processor speed for the
benchmark in millions of instructions per second (MIPS).
Problem 9
Suppose that we are considering developing a parallel program to
improve on an existing sequential program and that we determine
that 20% of the execution time of the sequential program is spent
in inherently sequential code. (We have to inspect the code to
determine this.) The remaining code can be parallelized, although
we do not as yet know how many processors would be optimal.
What is the maximum possible speedup that could be obtained if
we were to develop a parallel version that used ten processors?
Problem 10
Suppose that we know that fraction of inherently sequential
computation is 0.12 in the problem of interest. What is the least
number of processors that we need to use to obtain a speedup of
5.0?
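Questions of this form can be answered by rearranging Amdahl's law for the processor count: with sequential fraction s and target speedup t, we need s + (1 - s)/p <= 1/t. The sketch below uses exact fractions to avoid floating-point rounding at the boundary; the function name `min_processors` is an assumption made for illustration.

```python
import math
from fractions import Fraction
from typing import Optional


def min_processors(serial_fraction: Fraction, target_speedup: Fraction) -> Optional[int]:
    """Smallest processor count p with 1 / (s + (1 - s)/p) >= target.

    Returns None when the target cannot be reached with any finite p,
    because the speedup is bounded above by 1 / s.
    """
    s = Fraction(serial_fraction)
    t = Fraction(target_speedup)
    if s > 0 and t >= 1 / s:
        return None
    # Rearranged inequality: p >= (1 - s) / (1/t - s).
    return max(1, math.ceil((1 - s) / (1 / t - s)))


# Sequential fraction 0.12, target speedup 5.0.
print(min_processors(Fraction("0.12"), Fraction(5)))  # 11
```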
Friday, March 24, 2023
Text Book(s)
David A. Patterson and John L. Hennessy, "Computer Organization and Design: The Hardware/Software Interface", 5th edition, Morgan Kaufmann, 2011.
Carl Hamacher, Zvonko Vranesic, Safwat Zaky, "Computer Organization", McGraw Hill, 5th edition, reprint 2011.
Reference Books
W. Stallings, "Computer Organization and Architecture", Prentice-Hall, 8th edition, 2009.