Processor Architecture and Design

Introduction

HOW DOES A MICROPROCESSOR HANDLE AN INSTRUCTION?
Fetch Cycle
The fetch cycle reads the required instruction from memory and stores it in the instruction register.

Execute Cycle
The execute cycle carries out the actual actions specified by the instruction.
[Figure: Block Diagram of a Microprocessor. The Bus Interface Unit (BIU: memory interface, program address generator, instruction register) drives the address and data buses to ROM, RAM, I/O ports, video, and discs; the Execution Unit (EU: ALU, registers A-H, control & timing, CLK) executes the fetched instructions.]
RISC VS CISC
Which is better?

WHAT IS THE EFFECT?
If operands can be present anywhere (register or memory), the size of an instruction varies, which complicates the instruction decoder.
ISA
CISC: operands for an arithmetic/logic operation can be in registers or memory.
RISC: operands for an arithmetic/logic operation may only be in registers (a register-register, or load-store, architecture).
RISC Vs CISC
Goal: multiply the data in memory location A with the data in B and put the result back in A.

CISC:
  MUL A, B

RISC:
  LDA R0, A
  LDA R1, B
  MUL R0, R1
  STR A, R0

[Figure: memory locations A, B, C and a register file R0-R3 feeding an ALU (×, ÷, +, −)]
$$ \frac{\text{Time}}{\text{Program}} = \frac{\text{Time}}{\text{Cycle}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Instructions}}{\text{Program}} $$

RISC aims to lower the cycles per instruction; CISC aims to lower the instructions per program.
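As a quick numeric illustration (the instruction counts and CPIs below are invented for this sketch, not from the slides), the equation can be evaluated for a CISC-like and a RISC-like machine in Python:

def exec_time_ns(cycle_ns, cpi, inst_count):
    # Time/Program = Time/Cycle x Cycles/Instruction x Instructions/Program
    return cycle_ns * cpi * inst_count

# Hypothetical machines: CISC runs fewer but slower instructions,
# RISC runs more but simpler ones.
print(exec_time_ns(10, 4.0, 1_000_000))  # CISC-like: 40,000,000.0 ns
print(exec_time_ns(10, 1.2, 2_500_000))  # RISC-like: 30,000,000.0 ns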
Processor Speed-up
Introduction
Speed-up techniques:
• Deeply pipelined machines
• Many instructions per cycle
• Out-of-order execution of instructions
• Aggressive branch-prediction techniques
The Three Walls
• The Power Wall
• The Memory Wall
• The ILP Wall
Power Wall
• Power dissipation depends on clock rate, capacitive load, and voltage
• Increases in clock frequency dissipate more power and demand more cooling
• Decreases in voltage reduce dynamic power consumption but increase static leakage through the transistors
• We have reached the practical power limit for cooling
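For reference, the standard CMOS relation behind these bullets (not printed on the slide) is

$$ P_{dynamic} \approx \alpha \, C \, V^{2} \, f $$

where α is the switching-activity factor, C the capacitive load, V the supply voltage, and f the clock frequency; static leakage power adds to this and grows as the supply and threshold voltages are lowered.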
The Memory Wall: the growing gap between processor speed and memory speed.
ILP (Course)
• Pipelined
• VLIW
• Superscalar

DLP
• SIMD
• Vector architectures
• GPU

TLP (Course)
• MIMD
• Multi-threaded
• Distributed-memory MIMD
• Shared-memory MIMD
Arch, Implementation & Realization
• Architecture: the ISA; the functional-level behavior of the processor
• Implementation: the micro-architecture; the logic structure that implements the architecture
• Realization: the physical implementation
ISA
• The contract between hardware and software
• Multiple machines can implement the same ISA
• Advantage: program portability
• Microprocessor design starts with the ISA
• The ISA gives rise to the micro-architecture
• The micro-architecture has to be rigorously verified
ISA
• Development is very slow
• ISAs have varied in:
  • Number of operands
  • Implied operands
  • Whether operands may be stored on a stack
Dynamic-Static Interface
The DSI separates what is done statically at compile time from what is done dynamically at run time.
DSI
[Figure: the DSI sits between the program (software) and the machine (hardware); complexity exposed to software is handled statically by the compiler, while complexity exposed to hardware is handled dynamically.]
DSI
[Figure: placement of the DSI between a high-level language (HLL) and the hardware for different ISA styles (DEL, CISC, VLIW, RISC) at interface levels DSI1, DSI2, and DSI3.]
What is parallel computing? Serial Computing
• Traditionally, software has been written for serial computation:
  • To be run on a single computer having a single Central Processing Unit (CPU)
  • A problem is broken into a discrete series of instructions
  • Instructions are executed one after another
  • Only one instruction may execute at any moment in time
Serial Computing
[Figure: a problem is processed by one CPU as a sequence of instructions T1, T2, T3, …, TN]
What is parallel computing
• In the simplest sense, parallel computing is the simultaneous use of multiple compute resources to solve a computational problem:
  • To be run using multiple CPUs
  • A problem is broken into discrete parts that can be solved concurrently
  • Each part is further broken down into a series of instructions
  • Instructions from each part execute simultaneously on different CPUs
What is parallel computing
[Figure: a problem split into four parts, each solved simultaneously on CPU 1 through CPU 4]
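A minimal sketch of this decomposition in Python (the worker function and the four-way split are invented for illustration):

from multiprocessing import Pool

def solve_part(part):
    # Each discrete part is a series of instructions run on its own CPU.
    return sum(x * x for x in part)

if __name__ == "__main__":
    problem = list(range(1_000_000))
    parts = [problem[i::4] for i in range(4)]  # break the problem into 4 parts
    with Pool(processes=4) as pool:            # CPU 1 .. CPU 4
        results = pool.map(solve_part, parts)  # parts are solved concurrently
    print(sum(results))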
Parallel Computing
• A single processor with multiple cores
• A single computer with multiple processors
• An arbitrary number of computers connected by a network
• A combination of all three
Parallel Computing
• The computational problem should be able to:
  • Be broken apart into discrete pieces of work that can be solved simultaneously
  • Execute multiple program instructions at any moment in time
  • Be solved in less time with multiple compute resources than with a single compute resource
The most important law in micro-architecture
Amdahl’s Law

$$ S = \frac{T_{total}}{T_{improved}} = \frac{T_{total}}{\left(T_{total} - T_{component}\right) + \dfrac{T_{component}}{n}} $$
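A worked example (numbers chosen purely for illustration): if the improved component accounts for 40% of the original execution time and is sped up n = 2 times, then, normalizing T_total to 1,

$$ S = \frac{1}{(1 - 0.4) + 0.4/2} = \frac{1}{0.8} = 1.25 $$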
Law of Diminishing Returns
[Figure: as the enhancement factor grows, total time approaches the unenhanced fraction 1 − f, which bounds the achievable speedup.]
Types and Levels of parallelism
• Functional parallelism: irregular
• Data-level parallelism: regular
Functional Parallelism
• Instruction level
• Loop level (recurrences)
• Procedure level
• Program level
Flynn’s Taxonomy
• SISD: single instruction, single data
• SIMD: single instruction, multiple data
• MISD: multiple instruction, single data
• MIMD: multiple instruction, multiple data
Basic Parallel Techniques
• Pipelining
• Replication

ILP
TYPES OF ILP-PROCESSORS
• Traditional von Neumann: sequential issue, sequential execution
• Scalar ILP: sequential issue, parallel execution
• Superscalar ILP: parallel issue, parallel execution
  • VLIW: static schedule
  • Superscalar: dynamic schedule
INTERNAL OPERATION
• Pipelined processors
• VLIW / superscalar
VLIW & SUPERSCALAR ARCHITECTURE
[Figure: execution units EU1, EU2, EU3 sharing a common register file]
VLIW
[Figure: a single instruction-fetch unit feeding EU1, EU2, EU3 directly over a shared register file]
SUPERSCALAR
[Figure: an instruction-fetch unit followed by a dispatch unit that issues to EU1, EU2, EU3 over a shared register file]
PIPELINE (Scalar)
AMDAHL’S LAW

$$ S = \frac{T_{total}}{T_{improved}} = \frac{T_{total}}{\left(T_{total} - T_{component}\right) + \dfrac{T_{component}}{n}} $$
PIPELINE – N STAGES
• Phase 1: filling
• Phase 2: full
• Phase 3: draining
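The standard timing behind these phases (not spelled out on the slide): k instructions on an N-stage pipeline take

$$ T_{pipe} = N + (k - 1) \text{ cycles}, \qquad S(k) = \frac{kN}{N + k - 1} \to N \text{ as } k \to \infty $$

so it is the filling and draining phases that keep a real pipeline below its ideal speedup of N.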
IDEALIZED PIPELINE EXECUTION
[Figure (repeated over three slides): idealized execution profile of an N-stage pipeline; the pipeline is full for a fraction g of the time, and the remaining 1 − g is spent filling and draining.]
REALISTIC PIPELINE EXECUTION PROFILE
[Figure (repeated over two slides): a realistic execution profile of an N-stage pipeline, with repeated filling and draining phases between full periods.]
AMDAHL’S LAW

$$ S = \frac{1}{(1 - g) + \dfrac{g}{N}} $$
AMDAHL’S LAW
N = 5:  g = 100% → S = 5;   g = 90% → S = 3.57
N = 10: g = 100% → S = 10;  g = 90% → S = 5.26
N = 20: g = 100% → S = 20;  g = 90% → S = 6.897
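A small check of these figures (a sketch, not from the slides):

def amdahl_speedup(g, n):
    # S = 1 / ((1 - g) + g/N): N-stage pipeline, full a fraction g of the time
    return 1.0 / ((1.0 - g) + g / n)

for n in (5, 10, 20):
    for g in (1.0, 0.9):
        print(f"N={n:2d}, g={g:.0%}: S = {amdahl_speedup(g, n):.3f}")
# Reproduces the table: 5.000, 3.571, 10.000, 5.263, 20.000, 6.897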
AMDAHL’S LAW

$$ S = \frac{1}{\dfrac{g_1}{1} + \dfrac{g_2}{2} + \cdots + \dfrac{g_N}{N}} $$

where g_i is the fraction of time i operations proceed in parallel.
AMDAHL’S LAW - SUPERSCALAR

$$ S = \frac{1}{(1 - f) + \dfrac{f}{N}} $$
AMDAHL’S LAW - SUPERSCALAR

$$ S = \frac{1}{\dfrac{1 - f}{N_1} + \dfrac{f}{N_2}} $$
3F
Reported ILP estimates:
• Flynn: 2
• Foster: 51
• Fisher: 90
PARAMETERS – JOUPPI CLASSIFICATION
• Operation latency (OL)
• Issue latency (IL)
• Machine parallelism (MP)
• Issue parallelism (IP)
All are static parameters.
Super-pipelined machines divide the base cycle into minor cycles.
[Figure: a base pipeline of Fetch (F unit), Decode (D unit), Execute (E unit), and Writeback stages, connected to the register file and cache memory]
BASE PIPELINE
Stages: IF → DE → EX → WB
OL = 1
MP = 4
IL = 1
IP = 1
SUPER-PIPELINED
Stages: IF → DE → EX → WB, each divided into 3 minor cycles
OL = 1 (= 3 minor cycles)
MP = 12
IL = 1 per minor cycle
IP = 3
PIPELINE
• Under-pipelined: execution > issue
• Super-pipelined: issue > execution; a deeply pipelined machine with restrictions on the forwarding paths
MIPS R4000
• 8 physical stages
• Each stage: 10 ns
• CLK: 50 MHz
• A clock doubler is present internally
• 20 ns base line (the 50 MHz external clock gives a 20 ns cycle; the internal doubler yields the 10 ns stage time)

MIPS R4000 pipeline: IF1 | IF2 | RF | EX | DF1 | DF2 | TC | WB
SUPERSCALAR
Three parallel pipelines: IF → DE → EX → WB
OL = 1
MP = 12
IL = 1
IP = 3
SUPERSCALAR - SUPER PIPELINED
Three parallel pipelines: IF → DE → EX → WB, each stage divided into 3 minor cycles
OL = 1 (= 3 minor cycles)
MP = 36
IL = 1 per minor cycle
IP = 9
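One pattern consistent with all four configurations above (my summary, not a slide formula):

$$ MP = (\text{base stages}) \times (\text{minor cycles per stage}) \times (\text{issue width}) $$

giving 4 × 1 × 1 = 4 (base), 4 × 3 × 1 = 12 (super-pipelined), 4 × 1 × 3 = 12 (superscalar), and 4 × 3 × 3 = 36 (combined).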
VLIW
[Figure: a single IF and DE stage feeding three parallel EX → WB units]
DYNAMIC BEHAVIOR: Effect of Dependencies
DEPENDENCIES BETWEEN INSTRUCTIONS
• Data
• Control
• Resource
DATA DEPENDENCY
• Straight-line code
• Loops
STRAIGHT LINE CODE
RAW / true dependency
Load-use:
I1: load r1, a
I2: add r2, r1, r1
Define-use:
I1: mul r1, r4, r5
I2: add r2, r1, r1
STRAIGHT LINE CODE
WAR / false / anti dependency
I1: mul r1, r2, r3
I2: add r2, r4, r5
STRAIGHT LINE CODE
WAW / output dependency
I1: mul r1, r2, r3
I2: add r1, r4, r5
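WAR and WAW are name dependencies rather than value dependencies, so they can be removed by register renaming. A minimal Python sketch (the renamer, register counts, and instruction encoding are invented for illustration):

def rename(instructions, num_arch_regs=8):
    # Map architectural to physical registers; every write gets a fresh
    # physical register, which eliminates WAR and WAW hazards.
    mapping = {f"r{i}": f"p{i}" for i in range(num_arch_regs)}
    free = [f"p{i}" for i in range(num_arch_regs, 2 * num_arch_regs)]
    renamed = []
    for op, dst, *srcs in instructions:
        srcs = [mapping[s] for s in srcs]  # read current mappings (keeps RAW)
        mapping[dst] = free.pop(0)         # fresh destination register
        renamed.append((op, mapping[dst], *srcs))
    return renamed

# The WAW pair above: both write r1, but after renaming they write different
# physical registers and may complete in any order.
print(rename([("mul", "r1", "r2", "r3"), ("add", "r1", "r4", "r5")]))
# [('mul', 'p8', 'p2', 'p3'), ('add', 'p9', 'p4', 'p5')]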
RECURRENCES
Inter-iteration / loop-carried dependency:
do I = 2, n
  X(I) = A * X(I-1) + B
end do
• First order
• kth order
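Why recurrences resist parallelization, in a minimal Python sketch (not from the slides): each iteration reads the value written by the previous one, so the iterations cannot simply run concurrently.

A, B = 2.0, 1.0
X = [1.0] * 8
for i in range(1, len(X)):   # loop-carried (inter-iteration) RAW dependency
    X[i] = A * X[i - 1] + B  # reads the result of the previous iteration
print(X)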
DATA DEPENDENCY GRAPH
i1: load r1, a
i2: load r2, b
i3: add r3, r2, r1
i4: mul r1, r2, r4
i5: div r1, r2, r4
[Figure: DDG with true (δt) edges i1 → i3 and i2 → i3, an anti (δa) edge i3 → i4, and an output (δo) edge i4 → i5]
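The edge types in such a graph can be derived mechanically. A minimal sketch (the (destination, sources) encoding is invented for illustration):

def classify(i1, i2):
    # i1 and i2 are (destination, set_of_sources) pairs, with i1 first.
    d1, s1 = i1
    d2, s2 = i2
    deps = []
    if d1 in s2: deps.append("RAW (true)")    # i2 reads what i1 wrote
    if d2 in s1: deps.append("WAR (anti)")    # i2 overwrites what i1 read
    if d1 == d2: deps.append("WAW (output)")  # both write the same register
    return deps

print(classify(("r3", {"r2", "r1"}), ("r1", {"r2", "r4"})))  # i3 → i4: ['WAR (anti)']
print(classify(("r1", {"r2", "r4"}), ("r1", {"r2", "r4"})))  # i4 → i5: ['WAW (output)']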
DIFFERENCE BETWEEN DFG & DDG
• A DFG has no control statements
• A DFG shows only RAW dependencies
• Compilers create both the DDG and the DFG
BASIC BLOCK
Straight-line code with a single entry and a single exit; the block ends at the branch.
calc: add r3, r1, r2
      sub r4, r1, r2
      mul r5, r3, r4
      mul r7, r6, r6
      sub r8, r7, r5
      jn negproc
CONTROL DEPENDENCIES
mul r1, r2, r3
jz zproc
sub r4, r7, r1
...
zproc: load r1, x
CONTROL DEPENDENCIES
Branch frequency:
• General-purpose programs: 20-30%
• Scientific/technical programs: 5-10%
Average branch distance:
• 4.6 (a branch every 3rd-6th instruction)
• 9.2 (a branch every 10th-20th instruction)
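These distances track the frequencies (my arithmetic, not on the slide): the average branch distance is roughly the reciprocal of the branch frequency, e.g.

$$ d \approx \frac{1}{f_{branch}}, \qquad f_{branch} \approx 22\% \Rightarrow d \approx 4.6 \text{ instructions} $$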
VLIW
RESOURCE DEPENDENCY
A single non-pipelined division unit:
div r1, r2, r3
div r4, r5, r6
The second divide must wait until the unit is free.