EE6304 Lecture13 Processors

This document discusses modern processor architectures such as scalar, superscalar, CISC, RISC, and vector processors. It provides details on the components and operation of vector processors, including vector registers, functional units, and load/store units. The document compares CISC and RISC processors, noting that CISC aims to minimize instructions while potentially sacrificing cycles per instruction, while RISC reduces cycles per instruction at the cost of more instructions. Control units can be hardwired or microprogrammed, with hardwired control generating signals based on the instruction step counter, instruction register contents, and computation/comparison results.

Uploaded by

Ashish Soni

EEDG/CE/CS 6304 Computer Architecture

Lecture 13 – Modern Processor Types

Benjamin Carrion Schaefer

Associate Professor
Department of Electrical and Computer Engineering
Course Overview
• Fundamentals of Design and Analysis of
Computers (2 lectures)
– History, technological breakthroughs, etc.
– Trends and metrics: performance,
power/energy, cost
• CPU (7 Lectures)
– Instruction Set Architecture
– Arithmetic for Computers (new)
– Instruction Level Parallelism (ILP)
– Dynamic instruction scheduling
– Branch prediction
– Thread-level parallelism
– Modern processors
• Memories (4 Lectures)
– Memory hierarchy
– Caches
– Secondary storage
– Virtual memory
• Buses (1 lecture)
• New computer structures: Heterogeneous
computing (1 lecture)
Objectives
Upon completion of this chapter, you will be able to:
– Differentiate between the different microprocessor
architectures
• Scalar
• Superscalar
• CISC
• RISC
• Vector Processor
– Understand how control units in CPUs work
• Hardwired vs. microprogrammed
– Understand current processors available and their architecture
• Intel, ARM, MIPS
– Application Specific Processors
– Soft Processors
– MicroBlaze
– Nios II

Ref: Miscellaneous Sources

Scalar Processor

• Can only execute one instruction at a time
• Flynn’s Taxonomy
– Single instruction stream
single data stream (SISD)
Superscalar Processor
• Processor is able to issue multiple
instructions in a single clock cycle
• Implements parallelism through instruction-level
parallelism → increases throughput
• Executes more than one instruction during a
clock cycle by simultaneously dispatching
multiple instructions to different execution
units on the processor
Vector Processing
• Vector processors have high-level operations that
work on linear arrays of numbers (vectors)

DAP Spr.‘98 ©UCB

Vector Processors Properties
• Operate on an entire vector in one instruction
• Each result is independent of the previous result
– Deep, wide pipeline → Compiler ensures no dependencies
– High clock rate
• Vector instructions access memory with known patterns
– Highly interleaved memory (spreading memory addresses evenly across
memory banks)
– Amortize memory latency
– No (data) cache required (only instruction cache)
• Vector operations are SIMD
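
As a rough illustration of "operate on an entire vector in one instruction", the sketch below models a scalar loop and a vector add and counts the instructions each one issues. The 64-element vector length and the counting model (loop overhead ignored) are assumptions made for illustration, not figures from the lecture.

```python
# Toy model: count "instructions" needed to add two 64-element arrays.
VLEN = 64  # assumed vector register length (elements)

def scalar_add(a, b):
    """One add instruction per element (loop overhead ignored)."""
    result, instructions = [], 0
    for x, y in zip(a, b):
        result.append(x + y)
        instructions += 1
    return result, instructions

def vector_add(a, b, vlen=VLEN):
    """One vector add instruction per chunk of up to vlen elements."""
    result, instructions = [], 0
    for i in range(0, len(a), vlen):
        # Element results are independent, so a pipeline can overlap them.
        result.extend(x + y for x, y in zip(a[i:i + vlen], b[i:i + vlen]))
        instructions += 1
    return result, instructions

a = list(range(64))
b = list(range(64, 128))
s_res, s_count = scalar_add(a, b)
v_res, v_count = vector_add(a, b)
print(s_count, v_count)
```

Both versions compute the same result; only the number of issued instructions differs, which is the effect the instruction-count comparison between RISC and vector machines measures.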
Types of Vector Architectures
• Based on how the operands are
fetched :
1. Memory-memory vector
processors
– Operands directly streamed to
functional units from memory and
stored back to memory
2. Vector-register processors
– All operands are read into vector
registers which feed to the
functional units and results stored
back into vector registers
• Vector equivalent of load-store
architecture
• Includes all vector machines since the late
1980s (Cray, Convex, Fujitsu, Hitachi, NEC)
• Assume vector-register machine from
now on
Operations and Instruction Count: RISC vs. Vector Processor*
Spec92fp     Operations (millions)     Instructions (millions)
Program      RISC   Vector   R/V       RISC   Vector   R/V
Swim256      115    95       1.1x      115    0.8      142x
Hydro2d      58     40       1.4x      58     0.8      71x
Nasa7        69     41       1.7x      69     2.2      31x
Su2cor       51     35       1.4x      51     1.8      29x
Tomcatv      15     10       1.4x      15     1.3      11x
wave5        27     25       1.1x      27     7.2      4x
mdljdp2      32     52       0.6x      32     15.8     2x

• Vector reduces
– ops by 1.2x
– Instructions by 20x

* Ref. F. Quintana (Universidad de Barcelona)
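
The summary figures (about 1.2x fewer operations, about 20x fewer instructions) can be sanity-checked from the table. The slide does not say which average was used; the sketch below assumes a geometric mean of the per-program RISC/vector ratios, which is only one plausible choice.

```python
import math

# (program, RISC ops, vector ops, RISC instr, vector instr), in millions,
# copied from the table above.
data = [
    ("Swim256", 115, 95, 115, 0.8),
    ("Hydro2d", 58, 40, 58, 0.8),
    ("Nasa7", 69, 41, 69, 2.2),
    ("Su2cor", 51, 35, 51, 1.8),
    ("Tomcatv", 15, 10, 15, 1.3),
    ("wave5", 27, 25, 27, 7.2),
    ("mdljdp2", 32, 52, 32, 15.8),
]

def geometric_mean(values):
    """Geometric mean, the usual way to average ratios."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

op_ratios = [r_ops / v_ops for _, r_ops, v_ops, _, _ in data]
instr_ratios = [r_ins / v_ins for _, _, _, r_ins, v_ins in data]

print(round(geometric_mean(op_ratios), 2))
print(round(geometric_mean(instr_ratios), 1))
```

Under this assumption the two means land close to the quoted 1.2x and 20x.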

Components of Vector Processor
• Vector Register
– Fixed length bank holding a single vector
• Has at least 2 read and 1 write ports
• Typically, 8-32 vector registers, each holding
64-128 64-bit elements
• Vector Functional Units (FUs)
– Fully pipelined → start new operations
every clock
• Typically, 4 to 8 FUs (e.g., FP add, FP mult,
integer add)
• Vector Load-Store Units (LSU)
– Fully pipelined unit to load or store a
vector
– May have multiple LSUs
• Scalar Registers
– Single elements for interconnecting FUs,
LSUs and registers
Example of Vector Machines

Machine      Year   Clock [MHz]   Regs     Elements/Reg   FUs   Load/Store units
Cray1        1976   80            8        64             6     1
Cray XMP     1983   120           8        64             8     2L, 1S
Cray YMP     1988   166           8        64             8     2L, 1S
Cray C-90    1991   240           8        128            8     4
Cray T-90    1996   455           8        128            8     4
Fuj VP200    1982   133           16       128            3     1
Fuj VP300    1996   133           8-256    32-1024        3     2
NEC SX/2     1984   160           8+8K     256+var        16    8
NEC SX/3     1995   400           8+8K     256+var        16    8
Vector Processors Declining Use

• More expensive than superscalar processors
– Sell very few copies → design cost is expensive
• Need high speed on-chip memory → expensive
• Few architectural innovations to improve
performance
Scalar vs. Vector Processors
• Vector processor
– Smaller program size (requires fewer instructions)
– Memory access is more “efficient” since every data item requested is
actually used
– Once data is being processed, other units (fetch, decode, etc.) can be
powered off → reduces power
– Reduces fetch and decode bandwidth, as fewer instructions are fetched
– Exploits parallelism in large scientific and multimedia applications
– Mainly used in supercomputers
• Scalar processor
– Instructions operate on a single data item
• Most current CPUs implement vector-like instructions (SIMD)
– Intel’s x86 MMX/SSE (Streaming SIMD Extensions)
– Cell processor (IBM, Toshiba, Sony) for PlayStation 3: 1 scalar + 8 SIMD
processors
Conclusion: Vector processors are not viable for economic reasons, BUT vector
instruction set architectures are.
CISC vs. RISC Processors
• CISC: Complex Instruction Set Computer
– Complete a task in as few lines of assembly as possible.
– Performance improvement by simplifying the compiler →
burden falls on the processor
– E.g., Intel x86 Processor
• RISC: Reduced Instruction Set Computer
– Only use simple instructions that can be executed within
one clock cycle
– Requires fewer transistors to produce processors
– Processor can execute the instructions more quickly
– However, a greater burden is placed upon the compiler
– E.g., ARM Processor

Example
• Find the product of two numbers - one stored in
location 23h and another stored in location 52h -
and then store the product back in the location
23h
• CISC
MUL (23h), (52h)
• RISC
LOAD A, (23h)
LOAD B, (52h)
MUL A, B
STORE (23h), A
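
The two instruction sequences can be mimicked with a toy machine. The sketch below is only an illustration: the memory contents (6 and 7) and the Python helper functions are made up, while the instruction names and addresses follow the slide.

```python
# Toy memory-and-register machine; instruction names follow the slide.
memory = {0x23: 6, 0x52: 7}   # hypothetical operand values
regs = {"A": 0, "B": 0}

def load(reg, addr):
    regs[reg] = memory[addr]

def mul(dst, src):
    regs[dst] = regs[dst] * regs[src]

def store(addr, reg):
    memory[addr] = regs[reg]

def cisc_mul(addr1, addr2):
    """Single memory-to-memory MUL: hardware does load, multiply, store."""
    memory[addr1] = memory[addr1] * memory[addr2]

# RISC sequence: four simple instructions.
load("A", 0x23)
load("B", 0x52)
mul("A", "B")
store(0x23, "A")
risc_result = memory[0x23]

# CISC: reset the operand and use one complex instruction.
memory[0x23] = 6
cisc_mul(0x23, 0x52)
cisc_result = memory[0x23]
print(risc_result, cisc_result)
```

Both paths leave the same product in location 23h; the difference is whether the load and store steps are visible to the compiler or hidden inside one instruction.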
CISC vs. RISC

CISC                                               RISC
Emphasis on hardware                               Emphasis on software
Includes multi-clock complex instructions          Single-clock, reduced instructions only
Memory-to-memory: "LOAD" and "STORE"               Register-to-register: "LOAD" and "STORE"
incorporated in instructions                       are independent instructions
Small code sizes, high cycles per instruction      Low cycles per instruction, large code sizes
The Performance Equation
• CISC approach attempts to minimize the number of
instructions per program, sacrificing the number of
cycles per instruction
• RISC does the opposite, reducing the cycles per
instruction at the cost of the number of instructions per
program.
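\[
\text{Processor Performance} = \frac{\text{Time}}{\text{Program}}
= \underbrace{\frac{\text{Instructions}}{\text{Program}}}_{\text{code size}}
\times \underbrace{\frac{\text{Cycles}}{\text{Instruction}}}_{\text{CPI}}
\times \underbrace{\frac{\text{Time}}{\text{Cycle}}}_{\text{cycle time} = 1/\text{frequency}}
\]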
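
A small numeric sketch of the performance equation (time per program = instructions per program x cycles per instruction x time per cycle). The instruction counts, CPI values, and clock rate below are hypothetical and chosen only to show the trade-off; they are not measurements of any real processor.

```python
def execution_time(instructions, cpi, clock_hz):
    """Time/program = instructions x cycles/instruction x seconds/cycle."""
    return instructions * cpi * (1.0 / clock_hz)

# Hypothetical numbers, chosen only to illustrate the trade-off.
cisc_time = execution_time(instructions=1_000_000, cpi=4.0, clock_hz=1e9)
risc_time = execution_time(instructions=3_000_000, cpi=1.2, clock_hz=1e9)
print(cisc_time, risc_time)
```

With these made-up inputs the RISC-style program runs three times as many instructions yet finishes slightly sooner, because its CPI is lower by more than a factor of three.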

Instruction Decoder Differences
• CISC
– Microprogrammed
• RISC
– decoder
Control Signals when Executing 1 Instr.
• Performing an Arithmetic or
Logic Operation

ADD R1, R2        [R1] + [R2] → R1

Step   Action
1.     R1out, Yin
2.     R2out, SelectY, Add, Zin
3.     Zout, R1in

It takes three clocks to complete.
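
The three control steps can be traced on a toy single-bus datapath. This is a sketch under assumptions: the initial register values (5 and 9) are made up, and the model handles only the signals named in the three steps (R1out, Yin, R2out, SelectY, Add, Zin, Zout, R1in).

```python
# Single-bus datapath sketch: one register drives the bus per step.
state = {"R1": 5, "R2": 9, "Y": 0, "Z": 0}   # hypothetical initial values

steps = [
    {"R1out", "Yin"},
    {"R2out", "SelectY", "Add", "Zin"},
    {"Zout", "R1in"},
]

def run_step(state, signals):
    # Exactly one "<reg>out" signal places a register on the bus.
    src = next(s[:-3] for s in signals if s.endswith("out"))
    bus = state[src]
    if "Add" in signals and "SelectY" in signals:
        alu = state["Y"] + bus      # ALU adds Y and the bus value
        if "Zin" in signals:
            state["Z"] = alu
    for s in signals:
        if s.endswith("in") and s != "Zin":
            state[s[:-2]] = bus     # latch the bus into the named register

for signals in steps:
    run_step(state, signals)

print(state["R1"])
```

Each call to `run_step` corresponds to one clock, so the loop takes three iterations to leave the sum in R1.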

Hardwired Control
• Two ways to generate control signals:
– Hardwired - faster
– Micro-programmed – slower but more flexible
• Instructions are executed in steps, each taking 1 clock
cycle
• Different actions are performed in each step depending on the
instruction being executed → the setting of control signals
depends on:
– Contents of step counter
– Contents of instruction register
– Result of a computation or a comparison operation
– External input signals e.g., interrupts
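
In a hardwired design, each control signal is a fixed logic function of the inputs listed above. The sketch below expresses that idea for the ADD R1, R2 steps only; the three-step numbering (fetch omitted) and the `ServiceInterrupt` output are hypothetical and stand in for whatever the real control signal generator would produce.

```python
def control_signals(step, instruction, interrupt=False):
    """Combinational sketch: outputs depend only on the current inputs."""
    add = instruction == "ADD R1, R2"   # decoded instruction line
    t1, t2, t3 = step == 1, step == 2, step == 3
    signals = {
        "R1out": t1 and add,
        "Yin": t1 and add,
        "R2out": t2 and add,
        "SelectY": t2 and add,
        "Add": t2 and add,
        "Zin": t2 and add,
        "Zout": t3 and add,
        "R1in": t3 and add,
        # Hypothetical: an interrupt request is only sampled in the last step.
        "ServiceInterrupt": t3 and interrupt,
    }
    return {name for name, active in signals.items() if active}

sequence = [control_signals(t, "ADD R1, R2") for t in (1, 2, 3)]
print(sequence)
```

Because the function has no stored state, changing the behaviour means changing the logic itself, which is why hardwired control is fast but inflexible.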

Hardwired Control - Circuit
• Instruction decoder interprets the opcode in the IR and sets INS1, INS2, …, INSm
signals
• Step counter indicates which execution step the processor is in (T1-T5)
• External inputs (e.g., interrupts) connected directly to control signal generator

Microprogrammed Control
• Control signals are generated by a program similar
to machine language programs (microprogram)
• Microprogram is stored on the processor in a
small and very fast memory called microprogram
memory (control store)
• The sequence of microinstructions corresponding to a
given machine instruction constitutes the
microroutine for that instruction
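
A minimal sketch of the same idea in code: the control store is a small table, and a microprogram counter walks through the microroutine of the current instruction. The store contents reuse the three ADD R1, R2 steps; the `End` marker, the start-address table, and the function name are assumptions for illustration.

```python
# Control store: each entry is one microinstruction (a set of control signals).
control_store = [
    {"R1out", "Yin"},                      # address 0: start of ADD microroutine
    {"R2out", "SelectY", "Add", "Zin"},    # address 1
    {"Zout", "R1in", "End"},               # address 2: End marks the last step
]
# Assumed mapping from opcode to microroutine start address.
start_address = {"ADD": 0}

def run_microroutine(opcode):
    upc = start_address[opcode]            # load the microprogram counter
    issued = []
    while True:
        microinstruction = control_store[upc]
        issued.append(microinstruction - {"End"})
        if "End" in microinstruction:
            return issued
        upc += 1                           # step to the next control store word

trace = run_microroutine("ADD")
print(len(trace))
```

Supporting a new instruction here means adding rows to `control_store` and one entry to `start_address`, which is the flexibility the microprogrammed approach trades speed for.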

Microprogrammed Control - Circuit
• Microinstruction address generator
– generates the address to be used for reading from the
control store
• μPC (microcounter)
– keeps track of control store addresses
• Control store
– Contains instructions and their control signals
[Figure: block diagram with IR, microinstruction address generator, μPC, and control store]
Microprogrammed Control
An example of microinstructions for ADD R1, [R3] (Slide 24).
Microinstruction Coding Scheme
Microinstruction

F1 (4 bits)         F2 (3 bits)         F3 (3 bits)         F4 (4 bits)          F5 (2 bits)
0000: No transfer   000: No transfer    000: No transfer    0000: Add            00: No action
0001: PCout         001: PCin           001: MARin          0001: Sub            01: Read
0010: MDRout        010: IRin           010: MDRin          ...                  10: Write
0011: Zout          011: Zin            011: TEMPin         1111: XOR
0100: R0out         100: R0in           100: Yin            (16 ALU functions)
0101: R1out         101: R1in
0110: R2out         110: R2in
0111: R3out         111: R3in
1010: TEMPout
1011: Offsetout

F6 (1 bit)          F7 (1 bit)          F8 (1 bit)
0: SelectY          0: No action        0: Continue
1: Select4          1: WMFC             1: End

19-bit micro-instruction

An example of a partial format for field-encoded microinstructions.
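
The field widths (4, 3, 3, 4, 2, 1, 1, 1 bits) add up to the 19-bit word. The sketch below packs and unpacks such a word; placing F1 in the most significant bits is an assumption, since the slide does not fix the bit order, and the example step (Zout, R1in, End) is chosen only for illustration.

```python
# Field widths from the slide: F1..F8 = 4, 3, 3, 4, 2, 1, 1, 1 bits (19 total).
FIELD_WIDTHS = [4, 3, 3, 4, 2, 1, 1, 1]

def encode(fields):
    """Pack F1..F8 into one integer, F1 in the most significant bits (assumed)."""
    word = 0
    for value, width in zip(fields, FIELD_WIDTHS):
        assert 0 <= value < (1 << width), "field value does not fit"
        word = (word << width) | value
    return word

def decode(word):
    """Split a packed word back into the list [F1, ..., F8]."""
    fields = []
    for width in reversed(FIELD_WIDTHS):
        fields.append(word & ((1 << width) - 1))
        word >>= width
    return list(reversed(fields))

# Zout (F1=0011), R1in (F2=101), no F3 transfer, F4 left at 0000
# (the ALU result is not latched in this step), no memory action,
# SelectY (0), no WMFC, End (1).
micro = encode([0b0011, 0b101, 0b000, 0b0000, 0b00, 0, 0, 1])
print(format(micro, "019b"))
```

Field encoding keeps the word short because signals that never occur together, such as two different registers driving the bus, share one field instead of each taking its own bit.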

Embedded Microprocessors
• A System on Chip (SoC) contains at least 1 embedded
processor
– High performance
– Low power: $P_{dynamic} = I_{avg} V_{dd} = \alpha C f V_{dd}^{2}$
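
The dynamic power relation (activity factor α times switched capacitance C times frequency f times the square of the supply voltage Vdd) shows why lowering Vdd is so effective. The numbers below are hypothetical, picked only to make the quadratic dependence visible.

```python
def dynamic_power(alpha, capacitance_f, frequency_hz, vdd):
    """P_dynamic = alpha * C * f * Vdd^2 (watts)."""
    return alpha * capacitance_f * frequency_hz * vdd ** 2

# Hypothetical values: activity factor 0.2, 1 nF switched capacitance, 1 GHz.
p_high = dynamic_power(0.2, 1e-9, 1e9, 1.0)
p_low = dynamic_power(0.2, 1e-9, 1e9, 0.8)
print(p_high, p_low)
```

Dropping the supply from 1.0 V to 0.8 V at the same frequency cuts dynamic power to 64% of its original value in this model.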

ARM Business model
• ARM Holdings licenses to third parties:
– the chip designs
– the ARM instruction set architectures
• Third parties:
– design their own products that implement one of
those architectures

ARM Acquisitions
ARM in 2023
ARM Cores

Apple A6/A6X – iPhone 5

Altera Cyclone V SoC, Xilinx Zynq

Apple A7/8/8X/9/9X
ARM Cores Naming convention
• In the past ARM numbered their cores:
– ARM7, ARM8, ARM9, ARM10, ARM11
• Cortex is the result of a new naming convention
from ARM → all newer ARM cores are named
Cortex plus a suffix.
– Cortex-A for application processors, the high end
running at > 1 GHz
– Cortex-R for real-time processors, the mid-range at
400-600 MHz
– Cortex-M for microcontrollers, the low end running at
< 200 MHz

ARM Cores Line Up
• Cortex-A Application Processor:
– Smartphones
– Netbooks
– eReaders
– Digital TV
– Home Gateways
– Servers and Networking
• Cortex-R Real-time embedded processors:
– Automotive braking systems
– Powertrain solutions
– Mass storage controller
– Networking & Printing
• Cortex-M Microcontroller:
– Microcontrollers
– Mixed signal devices
– Smart sensors
– Automotive body electronics and airbags
• SecurCore Security Applications:
– Security markets for mobile SIMs
– Identification applications
www.arm.com/products/processors/index.php

ARM Cortex-X
• Design is based on the ARM Cortex-A78, but
redesigned purely for performance instead of a
balance of performance, power, and area (PPA)
• X1:
– 5-wide decode out-of-order superscalar
– can fetch 5 instructions per cycle
– out-of-order window size has been increased to 224
entries
– Has 15 execution ports with a pipeline depth of 13
stages; the execution latencies consist of 10
stages
• Latest Cortex-X4
ARM Cores
• Reduced Instruction Set Computer (RISC)
– uniform register file load/store architecture,
where data processing operates only on
register contents, not directly on memory
contents.
– Simple addressing modes, with all load/store
addresses determined from register contents
and instruction fields only
• Differentiating features:
– Java acceleration (Jazelle)
– VFP : Vector Floating point (co-processor)
– security (TrustZone)
– SIMD
– Advanced SIMD (NEON) technologies.
– The ARMv8-architecture adds a Cryptographic
extension as an optional feature.
– Thumb instruction set provides a subset of
the most commonly used 32-bit ARM
instructions which have been compressed into
16-bit wide opcodes. On execution, these 16-
bit instructions are decompressed
transparently to full 32-bit ARM instructions in
real time without performance loss.

Thumb Instructions
• Re-encoded subset of the ARM instruction set
• Thumb instructions execute in their own processor
state
• Thumb instructions are half the size of ARM
instructions (16 bits compared with 32 bits)
Pros:
• Greater code density compared to the ARM
instruction set
Cons:
• Uses more instructions (ARM and Thumb)
ARM Cortex-A Architecture
• Cortex-A57 processor
– highest-performance and
most advanced processor
– Based on the ARMv8-A
Architecture
– launched in early 2015
– big.LITTLE technology
– ACP: Accelerator Coherency
Port (DMA)
– SCU : Snoop control unit
(maintains cache
coherence between ARM
processors)

Cortex-A57 Features
• Superscalar, variable-length, out-of-order
pipeline.
• Dynamic branch prediction with Branch Target
Buffer (BTB) and Global History Buffer
(GHB) RAMs
• Fixed 48KB L1 instruction cache and 32KB L1 data
cache.
• Shared L2 cache of 512KB, 1MB, or 2MB
configurable size.

Cortex-A57 Implementation Options

• When implementing the Cortex-A57 processor in an SoC
ARM big.LITTLE
• Pairs a high-performance core with a lower-performance core (e.g., ARM Cortex-A15
and Cortex-A7)
• Both processors support the same ARMv7-A ISA
• Differences in the internal microarchitecture allow them to provide
different power and performance characteristics
• KEY: dynamically allocate tasks to the right processor according to their
instantaneous performance requirement → cache-coherent interconnect →
no need to transfer data through main memory
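
The allocation idea can be sketched as a simple threshold policy: run a task on the small core while its demand fits, and move it to the big core otherwise. The 0.4 threshold, the normalised demand values, and the function name are hypothetical; real big.LITTLE schedulers use more elaborate heuristics.

```python
def pick_core(required_performance, little_max=0.4):
    """Return "LITTLE" when the demand fits the small core, else "big"."""
    return "LITTLE" if required_performance <= little_max else "big"

# Hypothetical demand trace, normalised to the big core's peak (1.0).
demand = [0.1, 0.3, 0.9, 0.5, 0.2]
placement = [pick_core(d) for d in demand]
print(placement)
```

Because both cores implement the same ISA and share a coherent view of memory, such a migration does not require recompiling the task or copying its data through main memory.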

ARM big.LITTLE Architecture
• Little: Cortex-A7
– Simple, in-order execution, 8 pipeline stages
• Big: Cortex-A15
– Complex, out-of-order execution, multi-issue pipelines

Apple A10
• Quad core
– 2 high-performance cores (64-bit
ARMv8-A – Hurricane – 4.16 mm²)
– 2 energy-efficient cores (64-bit
ARM cores – Zephyr – 0.78 mm²)
• ARM big.LITTLE technology, BUT
only one core can be active at a
time → works as dual core
• Small ones embedded into large
cores
– Share L2 cache (3 MBytes)
– L3 cache (4 MBytes) serves all CPUs
• Die area 125 mm², 3.3 billion
transistors (TSMC 16nm FinFET)
• Dedicated image processing unit
Apple’s A11 Application Processors
• Shrunk by 30% compared to A10 to
87.66 mm², 4.3 billion transistors
→ 25% faster (TSMC 10nm FinFET
process)
• Custom ARM architecture
• 6 CPUs
– 2 high-performance 64-bit ARMv8-A
cores – Monsoon – have own L2 $
– 4 energy-efficient cores – Mistral –
share L2 $
– Can be used simultaneously
• Dedicated hardware processing
blocks
– Neural Engine (600 billion
operations per second) → used for
e.g., Face ID
– Dedicated image processing unit
• Package-on-package device: CPU +
3GB of SDRAM
Apple A14 vs. A15 vs. A16
• A14 and A15: TSMC 5nm process technology
• A16: TSMC 4nm
• Equipped with the same number of CPU cores
– 2 high-performance cores
– 4 energy-efficient cores
• A15 has 5 GPU cores as compared to 4 GPU
cores of A14
• A16 has 5 GPU cores
• A15 has 15 billion transistors vs. 11.8 billion
for the A14; the A16 has 16 billion
– New image processing accelerator
A17 Pro
• 3nm process → 19 billion transistors
• 6 CPUs
– 2 High-performance cores (10% faster)
• Improved branch predictor
• Wider decode & execute engines
– 4 Efficiency cores (3x performance/watt)
• Neural engine
– 16 cores (up to 2x faster – 35 trillion
operations/second)
• Dedicated engines = HW accelerators
– ProRes codec
– Display engine
– AV1 decoder (video codec)
• New GPU architecture
– Hardware accelerated Ray tracing
Apple Silicon

A14: 5nm, 11.8 billion transistors; M1 Ultra: 114 billion transistors
Apple M1
• 5nm process
• 16 billion transistors
• SoC and System in a
Package (DRAM in same
package as CPU)
• big.LITTLE architecture (4
large, 4 small cores)
• 64-bit architecture
• Dedicated HW accelerators
(neural engine)
Running Intel Programs on M1
• Need to emulate Intel x86 instructions on the ARM-based
M1 processor
• Rosetta 2 released for this purpose
• Invisible to the user, BUT with a performance slowdown
• Apple used the same approach (the original Rosetta) during its
transition from IBM PowerPC to Intel in 2006
MIPS Technologies

• Founded in 1984 at Stanford University
– Founded by Prof. Hennessy and his student Chris Rowen
• Fabless semiconductor company developing RISC CPU chips
• 1988 Silicon Graphics (SGI) adopted the MIPS architecture for
its computers
• 1989 IPO
• 1992 fully acquired by SGI
• 1998 SGI spun the business off
• 2013 acquired by Imagination Technologies (UK) –
embedded graphics chips.
• 2017 sold to a venture capital firm

David Patterson and John Hennessy

https://www.cnet.com/news/risc-chip-inventors-hennessy-patterson-win-computing-turing-prize/
MIPS Processors
• 2013 acquired by Imagination Technologies
• 2017 sold to venture capitalist firm

http://imgtec.com/mips/
MIPS I6400 Internal Structure
• Data coherence across cores important

RISC-V
• Open-Source RISC ISA
• Free and extensible: software and hardware
freedom on the architecture
RISC-V
CISC
• Intel x86 (and compatible AMD) are the only CISC processors
• CISC chips are becoming increasingly unwieldy and difficult
to develop
• Intel has the resources to plow through development and
produce powerful processors
BUT
• Cost of RAM has decreased significantly.
– In 1977, 1MB of DRAM cost about $5,000.
– In 1994, the same amount of memory cost only $6 (when adjusted
for inflation).
• Compiler technology has also become more sophisticated
• RISC processors consume less power → ideal for embedded
systems
Intel Processors : 13th Gen Intel Core
Intel Processors : 14th Gen
Intel's E-core and P-core chip
• E: Efficiency cores
• P: Performance cores
Intel Turbo Boost
• The CPU determines how close the processor is to its
maximum thermal design power (TDP)
• If Intel Turbo Boost Technology sees that the
CPU is operating well within its limits, Turbo
Boost can kick in
ASIPs – Application Specific Instruction Set Processors

• The instruction set of an ASIP is tailored to benefit a
specific application.
• This specialization of the core provides a tradeoff
between the flexibility of a general-purpose CPU
and the performance of an ASIC

ASIPs flexibility vs. performance (source: Synopsys)

ASIPs Design Flow – Target Compiler Technologies

Target acquired by Synopsys
Tensilica Xtensa
• Provided as synthesizable RTL core ASIP
– Gate count range: 25,000 – 150,000+
– Increase in gates as customer adds instructions or
optional features
• Software development tools
• Basic architecture
– 78 instructions
– five-stage pipeline that supports single-cycle execution
– 1 - load/store model
– 32-entry orthogonal register file
– 32 optional extra registers
• Founded by Chris Rowen
Xtensa Family
• Xtensa I to V
Tensilica (acquired by Cadence)
• Customizable Processor IP (ASIP)
– http://ip.cadence.com/ipportfolio/tensilica-ip
– Xtensa Processor Generator
Soft Processors
• FPGAs also provide their own configurable ‘soft
processors’ (implemented on LUTs)
– Altera – Nios
– Xilinx – MicroBlaze
• Benefits of a Soft processor:

Intel (Altera) Nios II
• Core Block diagram
• 32-bit RISC processor

Xilinx MicroBlaze
• Core Block diagram
• 32-bit RISC processor

Summary
• Vector processors
• CISC vs. RISC processors
• Embedded SoC microprocessors
• ARM
– Cortex-A|R|M
– Internal structure
– Power saving features
– big.LITTLE
• MIPS
• Application Specific Instruction Set Processors (ASIPs)
– Target
– Tensilica
• Soft processors
– Nios II
– MicroBlaze
