CS2100 Finals Cheatsheet
Computer Organisation (National University of Singapore)
Hazard and resolution

Structural Hazards
Simultaneous use of a hardware resource (e.g. memory unit used by both a load and an instruction fetch). No issue for MIPS, as the data and instruction memories are separate.

Data Hazards
RAW (Read After Write): the register file writes first, then reads (within the same cycle).
Without data forwarding: dependent instruction directly after the writer: 2-cycle delay; two instructions after: 1-cycle delay.
With data forwarding: instruction dependent on lw: 1-cycle delay; otherwise: no delay.
Detect a load-use hazard when ID/EX.instruction == Load &&
(ID/EX.rt == IF/ID.rs || ID/EX.rt == IF/ID.rt)

Data Forwarding
Resolves all RAW hazards except after lw (needs one stall); sw after lw might not need to stall at all.

Direct Mapped Cache
Per-block overhead: valid flag (1 bit) + tag length (initially, all valid flags are unset).
Blocks in cache: 2^M; bytes per block: 2^N
For each memory address val:
Set Index = (val mod 2^(N+M)) // 2^N
Word Index = (val mod 2^N) // Bytes_per_word
Tag = val // 2^(N+M)

Set-Associative Cache
A block maps to a unique set of N possible cache locations; an N-way SAC has N cache blocks per set.
Bytes per block: 2^M
Cache blocks = Size_cache / Size_block
Sets = Cache blocks / N = 2^N

Performance
Single Cycle
One instruction = 1 clock cycle.
Clock cycle time: longest latency amongst all instructions (usually lw).
Total Execution Time = Number of Instructions × Clock Cycle Time

Multi Cycle
One stage = 1 clock cycle. Cycle time decreases, clock frequency increases.
Different instructions take a variable number of clock cycles (since not all stages are needed).
Clock cycle time: longest latency amongst all stages.
Total Execution Time = I × Average CPI × Clock Cycle Time

Pipeline
One stage = 1 clock cycle.
Clock cycle time: longest latency amongst all stages + Td (time needed to store into the pipeline register).
Cycles needed for I instructions with N stages: I + N − 1
Total Execution Time = (I + N − 1) × Clock Cycle Time
If N(instructions) >> N(stages): Speedup(pipeline) = Time(single cycle) / Time(pipeline) ≈ N
Forwarding paths: from EX/MEM to the ALU for a dependent instruction 1 behind; from MEM/WB to the ALU for one 2 behind.

Control Hazards (Branching/Jumping)
Without ANY control measures: 3-cycle delay.
Early branch resolution: move the branch decision calculation from the EX/MEM stage to the ID stage – stall 1 cycle instead of 3 (may cause further stalls if a register is written by a previous instruction):
o Involved in RAW with the previous instruction (not lw): stall 2 cycles
o Involved in RAW with the previous instruction (lw): stall 3 cycles
o Not involved in any RAW: stall 1 cycle
Branch prediction (not taken): guess the outcome and speculatively execute instructions; if the guess is wrong, flush:
o With early branching: 1 cycle occurs before instructions get flushed/not flushed
o Without early branching: 3 cycles occur before instructions get flushed/not flushed
Delayed branch: X instructions following a branch are always executed regardless of the outcome (requires compiler re-ordering of instructions into the branch-delay slot(s), or adding nop instructions). Try to find independent instructions from before the branch:
o With early branching: shift 1 instruction
o Without early branching: shift 3 instructions

Fully-Associative Cache
A block can be placed anywhere, but all blocks must be searched.
No more conflict misses. Capacity misses = total misses − cold misses

Cache Performance
Larger block trade-off:
- Spatial locality advantage (hit rate increases)
- Miss penalty increases due to loading more data
- Temporal locality disadvantage past a certain limit (miss rate increases)
Rule of thumb: a direct-mapped cache of size N has almost the same miss rate as a 2-way set-associative cache of size N/2.
- Cold/compulsory misses do not depend on size/associativity
- For the same cache size, conflict misses decrease with increasing associativity
- Conflict misses are 0 for a FA cache
- For the same cache size, capacity misses do not depend on associativity
- Capacity misses decrease with increasing size
Units: 1 GiB = 2^30 bytes, 1 KiB = 2^10 bytes
Temporal locality: the same item tends to be re-referenced soon.
Spatial locality: nearby items tend to be referenced soon.

Block replacement policy
Least recently used (LRU): the usual policy, hard to track
First in first out (FIFO) – with a second-chance variant
Random replacement (RR)
Least frequently used (LFU)

Hit rate: fraction of memory accesses that are in the cache.
Average access time = (hit rate) × (hit time) + (1 − hit rate) × (miss penalty)
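The direct-mapped address breakdown on this sheet (2^M blocks of 2^N bytes per block) can be sketched in Python; the function name and example parameters below are illustrative, not part of the original:

```python
def split_address(val, M, N, bytes_per_word=4):
    """Direct-mapped cache with 2**M blocks of 2**N bytes each.
    Returns (tag, set_index, word_index) for byte address val."""
    set_index = (val % 2 ** (N + M)) // 2 ** N      # which cache block
    word_index = (val % 2 ** N) // bytes_per_word   # which word within the block
    tag = val // 2 ** (N + M)                       # identifies the memory block
    return tag, set_index, word_index

# Example: 2**2 = 4 blocks of 2**4 = 16 bytes, byte address 74
print(split_address(74, M=2, N=4))  # (1, 0, 2)
```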
Cache block/line: smallest unit of transfer between memory and cache.
Types of misses:
- Cold/Compulsory: the block has never been accessed before
- Conflict: the same index gets overwritten (direct-mapped & set-associative)
- Capacity: the cache cannot contain all the blocks (fully associative)

Write Policy
Write-through: write data to both the cache and main memory, using a write buffer to queue memory writes.
Write-back: write data to the cache only; write to main memory when the block is evicted, using a "dirty bit" on each cache block.
Write miss policy
Write allocate: load the block into the cache, then follow the write policy.
Write around: write directly to main memory.

Performance
Performance = 1 / Response Time
Speedup n of X over Y: n = Performance_X / Performance_Y = ExecutionTime_Y / ExecutionTime_X
CPU Time = Instructions/Program × Cycles/Instruction × Seconds/Cycle
Factors affecting performance: different compiler (affects Instructions per Program), different ISA (affects CPI).
Cannot use CPI alone to determine performance/time – use total time!
Amdahl's Law (performance is limited by the non-sped-up program portion): with P the fraction of program time that can be improved and n the speedup of that portion, Overall Speedup = 1 / ((1 − P) + P / n)

Pipelining (pipeline register contents)
IF/ID: instruction from memory & PC + 4
ID/EX: data read from the register file, 32-bit sign-extended Imm, & PC + 4
EX/MEM: branch target (PC + 4) + (Imm × 4), ALU result, isZero signal, & RD2 from the register file
MEM/WB: ALU result, memory read data & write register number (passed through all the pipeline stages)

Boolean Algebra
Precedence: NOT > AND > OR
Identity: A + 0 = A and A · 1 = A
Complement: A + A' = 1 and A · A' = 0
Commutative: A + B = B + A and A · B = B · A
Associative: A + (B + C) = (A + B) + C and A · (B · C) = (A · B) · C
Distributive: A + (B · C) = (A + B) · (A + C) and A · (B + C) = (A · B) + (A · C)
Duality (not a real law): if we flip the AND/OR operators and flip the identity elements (0 and 1), the Boolean equation still holds.
Idempotency: X + X = X and X · X = X
One/Zero Element: X + 1 = 1 and X · 0 = 0
Involution: (X')' = X
Absorption: X + (X · Y) = X and X · (X + Y) = X
Absorption (variant): X + (X' · Y) = X + Y and X · (X' + Y) = X · Y
De Morgan's (can be used on >2 variables): (X · Y)' = X' + Y' and (X + Y)' = X' · Y'
Consensus: (X · Y) + (X' · Z) + (Y · Z) = (X · Y) + (X' · Z) and (X + Y) · (X' + Z) · (Y + Z) = (X + Y) · (X' + Z)

Logic Gates
Complete set of logic: any set of gates sufficient for building any boolean function, e.g. {AND, OR, NOT}.
{NAND} is self-sufficient (a universal gate) ≡ Negative-OR.
{NOR} is self-sufficient (a universal gate) – the output is 1 only when both inputs are 0.

Logic Circuits
Combinational circuit: each output depends entirely on the present inputs.
Sequential circuit: each output depends on both the present inputs and the state.

K-map
Prime implicant: an implicant that is not a subset of any other implicant.
Essential prime implicant: a prime implicant with at least one '1' that is not in any other prime implicant (must appear in the final equation).
Simplified SOP expression – group the '1's on the K-map.
Simplified POS expression – find the SOP expression using the '0's on the K-map, then negate the resulting expression.
Grouping 2^N cells (only power-of-2 sizes are allowed) eliminates N variables.
EPIs are counted only by checking 1s, not Xs.
K-maps help to obtain a canonical SOP, but might not give the simplest possible expression (use boolean algebra for that).

• Priority Encoder: deals with multiple switched-on inputs by assigning priorities to the inputs; add a valid bit to flag the case where nothing is switched on.

• Demultiplexer:
- One input data line, N selection lines.
- Directs data from the input to a selected output line among the 2^N possibilities.
- Demultiplexer ≡ Decoder with enable.

• Multiplexer:
- Selects one of 2^n inputs to a single output line, using n selection lines.
- To implement a function of n variables, pass the variables to the n-bit selector and set the 2^n inputs to the appropriate constants from the truth table.
- To implement a function of n + 1 variables, pass the first n variables to the n-bit selector and set each input appropriately to '0', '1', Z, or Z' (Z is the last variable).
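The multiplexer-based function implementation described on this sheet can be checked in Python; the example function f = X ⊕ Y ⊕ Z and the helper `mux` are illustrative, not part of the original:

```python
def mux(inputs, sel_bits):
    """2**n-to-1 multiplexer: route inputs[sel] to the output (MSB first)."""
    sel = int("".join(map(str, sel_bits)), 2)
    return inputs[sel]

# n variables: pass X, Y, Z to an 8-to-1 mux selector and wire each input
# to the truth-table constant. Here f(X, Y, Z) = X xor Y xor Z (minterms 1, 2, 4, 7).
TABLE = [0, 1, 1, 0, 1, 0, 0, 1]  # f for XYZ = 000 .. 111

for x in (0, 1):
    for y in (0, 1):
        for z in (0, 1):
            assert mux(TABLE, [x, y, z]) == x ^ y ^ z

# n + 1 variables: pass X, Y to a 4-to-1 mux selector and feed each input
# 0, 1, Z, or Z'. For XOR every input happens to be Z or Z':
for x in (0, 1):
    for y in (0, 1):
        for z in (0, 1):
            inputs = [z, 1 - z, 1 - z, z]  # f = Z, Z', Z', Z for XY = 00, 01, 10, 11
            assert mux(inputs, [x, y]) == x ^ y ^ z
```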
• Half-Adder: C = X · Y, S = X ⊕ Y
• Full-Adder: Cout = X · Y + (X ⊕ Y) · Cin, S = X ⊕ (Y ⊕ Cin) = (X ⊕ Y) ⊕ Cin
With negated outputs, use NAND to simulate OR and NOR to simulate AND.
• 4-bit parallel adder: built by cascading 4 full-adders via their carries.
• Adder-cum-subtractor: XOR the Y inputs with S (0/1 depending on add/subtract) and pass S in as C-in (X − Y = X + (1s-complement of Y) + 1).
• Magnitude Comparator: input: 2 unsigned values A and B; output: "A > B", "A = B", "A < B".

Circuit Delays
• For each component, time = max(∀ t_input) + t_current component
• Propagation delay of ripple-carry parallel adders ∝ no. of bits

Larger Components
- Remove a decoder that gives duplicate outputs (w.r.t. another decoder) by using an OR gate on the outputs from the first decoder, feeding the enable input of the second.
ALU Build (diagram)

MSI Components
SOP expression – implement using a 2-level AND-OR circuit or a 2-level NAND circuit.
POS expression – implement using a 2-level OR-AND circuit or a 2-level NOR circuit.

• Decoder (n-to-m-line decoder): converts binary information from n input lines to one of the m ≤ 2^n output lines (e.g. 2 x 4). Note: 0 is the least significant input!
- Each output line represents a minterm.
- Active high: generate the minterms and use OR on minterms to form a function. Alternatively, use NOR on the maxterms.
- Active low: AND the maxterms or NAND the minterms.
- Can add an Enable signal.
- Larger decoders can be constructed from smaller ones with an inverter (e.g. a 3 x 8 decoder built from two 2 x 4 decoders).

Minterms & Maxterms
A minterm/maxterm of n variables is a product/sum term that contains n literals, one from each variable → n variables → 2^n minterms, 2^n maxterms.
Minterm: m0 = X' · Y' · Z'
Maxterm: M0 = X + Y + Z
m0' = M0
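Since each decoder output is a minterm, a function can be built by ORing the decoder outputs for the wanted minterms. A minimal Python sketch; `decoder` and `from_minterms` are hypothetical helper names, not from the original:

```python
from itertools import product

def decoder(sel_bits, enable=1):
    """n-to-2**n line decoder (active high): output i is minterm m_i.
    All outputs are 0 when enable is 0 (a demux puts data on enable)."""
    idx = int("".join(map(str, sel_bits)), 2)   # sel_bits[0] is the MSB
    return [1 if enable and i == idx else 0 for i in range(2 ** len(sel_bits))]

def from_minterms(wanted, sel_bits):
    """OR together the decoder outputs listed in `wanted` (sum of minterms)."""
    outs = decoder(sel_bits)
    return max(outs[i] for i in wanted)

# f(X, Y, Z) = m0 + m7 = X'.Y'.Z' + X.Y.Z
for bits in product((0, 1), repeat=3):
    x, y, z = bits
    expected = ((1 - x) & (1 - y) & (1 - z)) | (x & y & z)
    assert from_minterms([0, 7], list(bits)) == expected
```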
Functions can be expressed as a sum of minterms or a product of maxterms.
Sum of 2 distinct maxterms is 1; product of 2 distinct minterms is 0.

K-map
Implicant: a product term whose cells are all '1' or 'X', with at least one '1'.

• Encoder: opposite of a decoder.
- Exactly ONE input should be '1'.
- If more than one input is switched on, the output is X (don't-care values).
- The position of the single active input line among the 2^n possibilities is coded as an n-bit code.

Sequential Circuits
Self-correcting: any unused state transits to a used state after a finite number of cycles.
Synchronous: outputs change at specific times (with the clock).
Asynchronous: outputs change at any time.
Multivibrator: sequential circuits that operate/swing between
HIGH and LOW state
Bistable: 2 stable states (e.g. latch, flip-flop)
Monostable / one-shot: 1 stable state
Astable: no stable state (e.g. clock)
Memory element: device that can remember value indefinitely, or change
value on command from its inputs. Same input does not always give same
output!
Pulse-triggered: activated by +ve/−ve pulses (e.g. latch)
Edge-triggered: activated by rising/falling edge (e.g. flip-flop)
S-R latch ("Set-Reset"): active-high version built from 2 cross-coupled NOR gates; active-low version from NAND gates.
Gated S-R latch: outputs change only when EN is HIGH (via AND gates); the value is memorised while EN is LOW.
Gated D latch ("Data"): can be built from a gated S-R latch (no invalid inputs).
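The latch behaviour above can be sketched as next-state functions in Python; the function names are mine, and the sketch models behaviour, not gate timing:

```python
def sr_latch(q, s, r):
    """Next state of an S-R latch: Q+ = S + R'.Q (S = R = 1 is invalid)."""
    assert not (s and r), "invalid input combination"
    return s | ((1 - r) & q)

def gated_d_latch(q, d, en):
    """Gated D latch from a gated S-R latch: S = D, R = D' while EN is HIGH;
    the value is memorised while EN is LOW."""
    if not en:
        return q
    return sr_latch(q, d, 1 - d)

for q in (0, 1):
    assert gated_d_latch(q, 1, en=1) == 1   # D = 1 sets
    assert gated_d_latch(q, 0, en=1) == 0   # D = 0 resets
    assert gated_d_latch(q, 1, en=0) == q   # EN low: hold
```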
• S-R flip-flop: Similar to gated S-R latch
• D (data) flip-flop: Similar to gated D latch (No invalid Inputs)
• J-K flip-flop: J:“Set”, K:“Reset”, Toggle if both HIGH
• T flip-flop (“Toggle”): J-K flip-flop with tied inputs
J-K Flip Flop: Q and Q’ fed back to NAND gates
T Flip Flop: Tie both inputs of J-K together
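The flip-flop behaviour above follows the standard characteristic equations; a small Python sketch (function names are mine):

```python
def jk_next(q, j, k):
    """J-K flip-flop next state: Q+ = J.Q' + K'.Q (hold/reset/set/toggle)."""
    return (j & (1 - q)) | ((1 - k) & q)

def d_next(q, d):
    """D flip-flop: Q+ = D."""
    return d

def t_next(q, t):
    """T flip-flop: J-K with tied inputs, so Q+ = T xor Q."""
    return jk_next(q, t, t)

for q in (0, 1):
    assert jk_next(q, 0, 0) == q        # hold
    assert jk_next(q, 0, 1) == 0        # reset
    assert jk_next(q, 1, 0) == 1        # set
    assert jk_next(q, 1, 1) == 1 - q    # toggle
    for t in (0, 1):
        assert t_next(q, t) == q ^ t
```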