18-447

Computer Architecture
Lecture 1: Introduction and Basics

Prof. Onur Mutlu

Carnegie Mellon University
Spring 2015, 1/12/2015
Question: What Is This?

2
Answer: Masterpiece of A Famous Architect

3
Your First 447 Assignment
 Go and visit Fallingwater

 Appreciate the importance of out-of-the-box and creative thinking
 Think about tradeoffs in the design of the building
 Strengths, weaknesses
 Derive principles on your own for good design and innovation

 Due date: After passing this course

 Apply what you have learned in this course
 Think out-of-the-box

4
But First, Today’s First Assignment
 Find The Differences Of This and That

5
Find Differences Of This and That

6
Many Tradeoffs Between Two Designs
 You can list them after you complete the first assignment…

7
A Key Question
 How Was Wright Able To Design Fallingwater?
 Can have many guesses
 (Ultra) hard work, perseverance, dedication (over decades)
 Experience of decades
 Creativity
 Out-of-the-box thinking
 Principled design
 A good understanding of past designs
 Good judgment and intuition
 Strong combination of skills (math, architecture, art, …)
 …

 (You will be exposed to and hopefully develop/enhance many of these skills in this course)
8
A Quote from The Architect Himself
 “architecture […] based upon principle, and not upon
precedent”

9
A Principled Design

10
11
A Quote from The Architect Himself
 “architecture […] based upon principle, and not upon
precedent”

12
Major High-Level Goals of This Course
 Understand the principles
 Understand the precedents

 Based on such understanding:

 Enable you to evaluate tradeoffs of different designs and ideas
 Enable you to develop principled designs
 Enable you to develop novel, out-of-the-box designs

 The focus is on:

 Principles, precedents, and how to use them for new designs

 In Computer Architecture

13
Role of the (Computer) Architect

from Yale Patt’s lecture notes

Role of The (Computer) Architect
 Look backward (to the past)
 Understand tradeoffs and designs, upsides/downsides, past
workloads. Analyze and evaluate the past.
 Look forward (to the future)
 Be the dreamer and create new designs. Listen to dreamers.
 Push the state of the art. Evaluate new design choices.
 Look up (towards problems in the computing stack)
 Understand important problems and their nature.
 Develop architectures and ideas to solve important problems.
 Look down (towards device/circuit technology)
 Understand the capabilities of the underlying technology.
 Predict and adapt to the future of technology (you are
designing for N years ahead). Enable the future technology.
15
Takeaways
 Being an architect is not easy
 You need to consider many things in designing a new
system + have good intuition/insight into ideas/tradeoffs

 But, it is fun and can be very technically rewarding

 And, enables a great future
 E.g., many scientific and everyday-life innovations would not
have been possible without architectural innovation that
enabled very high performance systems
 E.g., your mobile phones

 This course will teach you how to become a good computer architect
16
So, I Hope You Are Here for This
18-213: “C” as a model of computation
Programmer’s view of how a computer system works

 How does an assembly program end up executing as digital logic?
 What happens in-between?
 How is a computer designed using logic gates and wires to satisfy specific goals?

Architect/microarchitect’s view: How to design a computer that meets system design goals. Choices critically affect both the SW programmer and the HW designer.

18-240: Digital logic as a model of computation
HW designer’s view of how a computer system works
17
Levels of Transformation
“The purpose of computing is insight” (Richard Hamming)
We gain and generate insight by solving problems
How do we ensure problems are solved by electrons?

Problem
Algorithm
Program/Language
Runtime System
(VM, OS, MM)
ISA (Architecture)
Microarchitecture
Logic
Circuits
Electrons

18
Aside: A Paper By Hamming
 Hamming, “Error Detecting and Error Correcting Codes,”
Bell System Technical Journal 1950.

 Introduced the concept of Hamming distance

 number of locations in which the corresponding symbols of two equal-length strings are different
 Developed a theory of codes used for error detection and
correction

 Also:
 Hamming, “You and Your Research,” Talk at Bell Labs,
1986.
 http://www.cs.virginia.edu/~robins/YouAndYourResearch.html
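The definition of Hamming distance given above can be sketched in a few lines of Python. This is only an illustration of the definition; the function name `hamming_distance` is an assumption made for this example and does not come from Hamming's paper.

```python
def hamming_distance(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance is defined only for equal-length strings")
    return sum(x != y for x, y in zip(a, b))


print(hamming_distance("1011101", "1001001"))  # 2
print(hamming_distance("karolin", "kathrin"))  # 3
```

A code whose codewords are all at least distance d apart can detect up to d-1 symbol errors, which is why this metric underlies error detection and correction.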
19
The Power of Abstraction
 Levels of transformation create abstractions
 Abstraction: A higher level only needs to know about the
interface to the lower level, not how the lower level is
implemented
 E.g., high-level language programmer does not really need to
know what the ISA is and how a computer executes instructions

 Abstraction improves productivity

 No need to worry about decisions made in underlying levels
 E.g., programming in Java vs. C vs. assembly vs. binary vs. by
specifying control signals of each transistor every cycle

 Then, why would you want to know what goes on underneath or above?

20
Crossing the Abstraction Layers
 As long as everything goes well, not knowing what happens
in the underlying level (or above) is not a problem.

 What if
 The program you wrote is running slow?
 The program you wrote does not run correctly?
 The program you wrote consumes too much energy?

 What if
 The hardware you designed is too hard to program?
 The hardware you designed is too slow because it does not provide the
right primitives to the software?

 What if
 You want to design a much more efficient and higher performance
system?
21
Crossing the Abstraction Layers
 Two key goals of this course are

 to understand how a processor works underneath the software layer and how decisions made in hardware affect the software/programmer

 to enable you to be comfortable in making design and optimization decisions that cross the boundaries of different layers and system components

22
An Example: Multi-Core Systems
[Figure: die photo of a multi-core chip with labeled blocks: CORE 0 to CORE 3, L2 CACHE 0 to L2 CACHE 3, SHARED L3 CACHE, DRAM MEMORY CONTROLLER, DRAM INTERFACE, and DRAM BANKS.]

*Die photo credit: AMD Barcelona

23
Unexpected Slowdowns in Multi-Core
[Figure: slowdowns of two applications running together on Core 0 and Core 1; annotations: “High priority”, “Low priority”, “Memory Performance Hog”.]
Moscibroda and Mutlu, “Memory performance attacks: Denial of memory service
in multi-core systems,” USENIX Security 2007.
24
A Question or Two
 Can you figure out why there is a disparity in slowdowns if you do not know how the system executes the programs?

 Can you fix the problem without knowing what is happening “underneath”?

25
Why the Disparity in Slowdowns?

[Figure: multi-core chip with matlab on CORE 1 and gcc on CORE 2, each core with its own L2 CACHE, connected through the INTERCONNECT to the shared DRAM memory system: a DRAM MEMORY CONTROLLER in front of DRAM Bank 0 to Bank 3. Annotation: “unfairness”.]

26
DRAM Bank Operation
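[Figure: a DRAM bank drawn as an array of rows and columns. The row decoder selects a row by row address, the row buffer holds the selected row, and the column mux selects the data by column address. Access sequence shown: (Row 0, Column 0), (Row 0, Column 1), (Row 0, Column 85), (Row 1, Column 0). The row buffer starts empty; the later accesses to Row 0 are row HITs, and the access to Row 1 is a row CONFLICT.]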
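The access sequence on this slide can be replayed with a small model of one bank. This is a minimal sketch that assumes a single row buffer which starts empty and keeps the last accessed row open; the function name `classify_accesses` and the outcome labels are illustrative only.

```python
def classify_accesses(accesses):
    """Label each (row, column) access as 'empty', 'hit', or 'conflict'.

    Models one DRAM bank with a single row buffer that starts empty.
    """
    open_row = None  # no row is latched in the row buffer yet
    outcomes = []
    for row, _column in accesses:
        if open_row is None:
            outcomes.append("empty")      # row must be activated first
        elif row == open_row:
            outcomes.append("hit")        # data already in the row buffer
        else:
            outcomes.append("conflict")   # close the open row, then activate
        open_row = row
    return outcomes


accesses = [(0, 0), (0, 1), (0, 85), (1, 0)]
print(classify_accesses(accesses))
# ['empty', 'hit', 'hit', 'conflict']
```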

27
DRAM Controllers
 A row-conflict memory access takes significantly longer
than a row-hit access

 Current controllers take advantage of the row buffer

 Commonly used scheduling policy (FR-FCFS) [Rixner 2000]*

(1) Row-hit first: Service row-hit memory accesses first
(2) Oldest-first: Then service older accesses first

 This scheduling policy aims to maximize DRAM throughput

*Rixner et al., “Memory Access Scheduling,” ISCA 2000.

*Zuravleff and Robinson, “Controller for a synchronous DRAM …,” US Patent 5,630,096, May 1997.
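The two FR-FCFS rules above can be expressed as a sort key: row hits sort before row conflicts, and older requests sort before younger ones. The sketch below is an illustration for a single bank, not the implementation from Rixner et al.; the `Request` record, the thread names, and the function names are assumptions made for this example.

```python
from collections import namedtuple

# Hypothetical request record: arrival order, issuing thread, target row.
Request = namedtuple("Request", ["arrival", "thread", "row"])


def fr_fcfs_pick(queue, open_row):
    """Return the request FR-FCFS would service next for one bank.

    (1) Row-hit first: prefer requests to the currently open row.
    (2) Oldest first: break ties by earliest arrival.
    """
    return min(queue, key=lambda r: (r.row != open_row, r.arrival))


def service_order(queue, open_row):
    """Drain the queue, returning requests in the order they are serviced."""
    pending = list(queue)
    order = []
    while pending:
        nxt = fr_fcfs_pick(pending, open_row)
        pending.remove(nxt)
        open_row = nxt.row  # the serviced row is now in the row buffer
        order.append(nxt)
    return order


queue = [
    Request(arrival=0, thread="T1", row=5),
    Request(arrival=1, thread="T0", row=0),
    Request(arrival=2, thread="T0", row=0),
    Request(arrival=3, thread="T1", row=5),
]
order = service_order(queue, open_row=0)
print([(r.thread, r.arrival) for r in order])
# [('T0', 1), ('T0', 2), ('T1', 0), ('T1', 3)]
```

Note that the oldest request (from T1) is serviced third: younger requests that hit in the open row jump ahead of it.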

28
The Problem
 Multiple applications share the DRAM controller
 DRAM controllers designed to maximize DRAM data
throughput

 DRAM scheduling policies are unfair to some applications

 Row-hit first: unfairly prioritizes apps with high row buffer locality
 Threads that keep on accessing the same row
 Oldest-first: unfairly prioritizes memory-intensive applications

 DRAM controller vulnerable to denial of service attacks

 Can write programs to exploit unfairness

29
A Memory Performance Hog
STREAM
// initialize large arrays A, B
for (j=0; j<N; j++) {
    index = j*linesize;   // streaming
    A[index] = B[index];
    …
}
- Sequential memory access
- Very high row buffer locality (96% hit rate)
- Memory intensive

RANDOM
// initialize large arrays A, B
for (j=0; j<N; j++) {
    index = rand();   // random
    A[index] = B[index];
    …
}
- Random memory access
- Very low row buffer locality (3% hit rate)
- Similarly memory intensive

Moscibroda and Mutlu, “Memory Performance Attacks,” USENIX Security 2007.
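To see why the two access patterns differ so much in row buffer locality, the sketch below replays each pattern against an idealized single bank. It assumes the 8KB row and 64B cache block sizes used later in the lecture, one row buffer, and no interference from other threads, so its numbers are not expected to match the 96% and 3% measured on a real system. The function name and the array size are illustrative.

```python
import random

ROW_BYTES = 8 * 1024   # row size (assumed, matches the later slide)
LINE_BYTES = 64        # cache block size (assumed, matches the later slide)


def row_hit_rate(addresses, row_bytes=ROW_BYTES):
    """Fraction of accesses that find their row already open.

    Assumes a single bank with one row buffer that starts empty,
    and no interference from other threads.
    """
    open_row = None
    hits = 0
    for addr in addresses:
        row = addr // row_bytes
        if row == open_row:
            hits += 1
        open_row = row
    return hits / len(addresses)


n = 128 * 100  # touch 100 full rows, one cache block at a time
stream = [j * LINE_BYTES for j in range(n)]

rng = random.Random(0)  # fixed seed so the run is repeatable
random_addrs = [rng.randrange(n) * LINE_BYTES for _ in range(n)]

print(f"STREAM hit rate: {row_hit_rate(stream):.3f}")   # 0.992 (127 of every 128)
print(f"RANDOM hit rate: {row_hit_rate(random_addrs):.3f}")
```

In this model the streaming pattern misses only on the first block of each row, while the random pattern hits only when two consecutive accesses happen to land in the same row.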

30
What Does the Memory Hog Do?

[Figure: a DRAM bank whose memory request buffer holds requests from T0 (STREAM), all to Row 0, interleaved with requests from T1 (RANDOM) to other rows (Row 5, Row 16, Row 111). Row 0 is held in the row buffer.]

Row size: 8KB, cache block size: 64B
128 (8KB/64B) requests of T0 serviced before T1

Moscibroda and Mutlu, “Memory Performance Attacks,” USENIX Security 2007.

31
Now That We Know What Happens Underneath
 How would you solve the problem?

 What is the right place to solve the problem?
 Programmer?
 System software?
 Compiler?
 Hardware (Memory controller)?
 Hardware (DRAM)?
 Circuits?

 Two other goals of this course:
 Enable you to think critically
 Enable you to think broadly

[Figure: the levels of transformation stack: Problem, Algorithm, Program/Language, Runtime System (VM, OS, MM), ISA (Architecture), Microarchitecture, Logic, Circuits, Electrons.]
32
Reading on Memory Performance Attacks
 Thomas Moscibroda and Onur Mutlu,
"Memory Performance Attacks: Denial of Memory Service
in Multi-Core Systems"
Proceedings of the 16th USENIX Security Symposium (USENIX SECURITY),
pages 257-274, Boston, MA, August 2007. Slides (ppt)

 One potential reading for your Homework 1 assignment

33
If You Are Interested … Further Readings
 Onur Mutlu and Thomas Moscibroda,
"Stall-Time Fair Memory Access Scheduling for Chip
Multiprocessors"
Proceedings of the 40th International Symposium on Microarchitecture
(MICRO), pages 146-158, Chicago, IL, December 2007. Slides (ppt)

 Sai Prashanth Muralidhara, Lavanya Subramanian, Onur Mutlu, Mahmut Kandemir, and Thomas Moscibroda,
"Reducing Memory Interference in Multicore Systems via
Application-Aware Memory Channel Partitioning"
Proceedings of the 44th International Symposium on Microarchitecture
(MICRO), Porto Alegre, Brazil, December 2011. Slides (pptx)

34
Takeaway
 Breaking the abstraction layers (between components and
transformation hierarchy levels) and knowing what is
underneath enables you to solve problems

35
Another Example
 DRAM Refresh

36
DRAM in the System
[Figure: die photo of a multi-core chip with labeled blocks: CORE 0 to CORE 3, L2 CACHE 0 to L2 CACHE 3, SHARED L3 CACHE, DRAM MEMORY CONTROLLER, DRAM INTERFACE, and DRAM BANKS.]

*Die photo credit: AMD Barcelona

37
A DRAM Cell

[Figure: DRAM cell circuit; labels: wordline (row enable), bitline.]
 A DRAM cell consists of a capacitor and an access transistor
 It stores data in terms of charge in the capacitor
 A DRAM chip consists of (10s of 1000s of) rows of such cells
DRAM Refresh
 DRAM capacitor charge leaks over time

 The memory controller needs to refresh each row periodically to restore charge
 Activate each row every N ms
 Typical N = 64 ms

 Downsides of refresh
-- Energy consumption: Each refresh consumes energy
-- Performance degradation: DRAM rank/bank unavailable while
refreshed
-- QoS/predictability impact: (Long) pause times during refresh
-- Refresh rate limits DRAM capacity scaling
39
First, Some Analysis
 Imagine a system with 1 ExaByte DRAM
 Assume a row size of 8 KiloBytes

 How many rows are there?

 How many refreshes happen in 64ms?
 What is the total power consumption of DRAM refresh?
 What is the total energy consumption of DRAM refresh
during a day?

 Part of your Homework 1
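The first two questions reduce to arithmetic once units are fixed. The sketch below assumes binary prefixes (1 ExaByte = 2^60 bytes, 8 KiloBytes = 2^13 bytes) and that every row is refreshed exactly once per 64 ms. The slides give no per-refresh energy, so power is left as a function of that unknown rather than computed; all names are illustrative.

```python
EXABYTE = 2**60              # assuming binary prefixes
ROW_BYTES = 8 * 2**10        # 8 KB row
REFRESH_INTERVAL_S = 0.064   # each row refreshed once every 64 ms

rows = EXABYTE // ROW_BYTES
refreshes_per_interval = rows  # one refresh per row per interval
refreshes_per_second = rows / REFRESH_INTERVAL_S
refreshes_per_day = refreshes_per_second * 24 * 60 * 60


def refresh_power_watts(energy_per_row_refresh_j):
    """Average refresh power, given the energy of one row refresh in joules.

    The slides do not give this energy; it must come from a datasheet or model.
    """
    return energy_per_row_refresh_j * refreshes_per_second


print(f"rows = 2^{rows.bit_length() - 1} = {rows}")   # rows = 2^47 = 140737488355328
print(f"row refreshes per second = {refreshes_per_second:.3e}")
```

Energy per day follows by multiplying the average power by the 86,400 seconds in a day.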

40
Refresh Overhead: Performance

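[Figure: performance overhead of refresh; highlighted value: 46%]

Liu et al., “RAIDR: Retention-Aware Intelligent DRAM Refresh,” ISCA 2012.

41
Refresh Overhead: Energy

[Figure: energy overhead of refresh; highlighted values: 47% and 15%]

Liu et al., “RAIDR: Retention-Aware Intelligent DRAM Refresh,” ISCA 2012.

42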

How Do We Solve the Problem?
 Do we need to refresh all rows every 64ms?

 What if we knew what happened underneath and exposed that information to upper layers?

43
Underneath: Retention Time Profile of DRAM

Liu et al., “RAIDR: Retention-Aware Intelligent DRAM Refresh,” ISCA 2012.

44

Taking Advantage of This Profile
 Expose this retention time profile information to
 the memory controller
 the operating system
 the programmer?
 the compiler?

 How much information to expose?

 Affects hardware/software overhead, power consumption,
verification complexity, cost

 How to determine this profile information?

 Also, who determines it?

45
An Example: RAIDR
 Observation: Most DRAM rows can be refreshed much less often
without losing data [Kim+, EDL’09][Liu+ ISCA’13]
 Key idea: Refresh rows containing weak cells
more frequently, other rows less frequently
1. Profiling: Profile retention time of all rows
2. Binning: Store rows into bins by retention time in memory controller
Efficient storage with Bloom Filters (only 1.25KB for 32GB memory)
3. Refreshing: Memory controller refreshes rows in different bins at
different rates

 Results: 8-core, 32GB, SPEC, TPC-C, TPC-H

74.6% refresh reduction @ 1.25KB storage
~16%/20% DRAM dynamic/idle power reduction

~9% performance improvement

Benefits increase with DRAM capacity

46
Liu et al., “RAIDR: Retention-Aware Intelligent DRAM Refresh,” ISCA 2012.
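The binning and refreshing steps can be illustrated with a toy Bloom filter. This is a sketch under stated assumptions, not the RAIDR hardware design: the three refresh intervals (64, 128, 256 ms), the filter sizes, the SHA-256 based hashing, and all names are chosen here for illustration. The property that matters is that a Bloom filter can return false positives but never false negatives, so a row may be refreshed more often than necessary but never less often.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


# Assumed bins (not taken from the slides): rows that need 64 ms or 128 ms
# refresh are recorded; every other row falls back to a 256 ms default.
bin_64ms = BloomFilter(num_bits=1024, num_hashes=4)
bin_128ms = BloomFilter(num_bits=1024, num_hashes=4)


def refresh_interval_ms(row):
    """Pick a refresh interval for a row, checking the weakest bin first."""
    if row in bin_64ms:
        return 64
    if row in bin_128ms:
        return 128
    return 256


# Hypothetical profiling result: row 17 is weakest, row 42 is intermediate.
bin_64ms.add(17)
bin_128ms.add(42)

print(refresh_interval_ms(17))  # 64
print(refresh_interval_ms(42))  # never longer than 128
```

Because only the weak rows are stored, the filters stay small regardless of how many strong rows the memory contains.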
Reading on RAIDR
 Jamie Liu, Ben Jaiyen, Richard Veras, and Onur Mutlu,
"RAIDR: Retention-Aware Intelligent DRAM Refresh"
Proceedings of the 39th International Symposium on Computer Architecture
(ISCA), Portland, OR, June 2012. Slides (pdf)

 One potential reading for your Homework 1 assignment

47
If You Are Interested … Further Readings
 Onur Mutlu,
"Memory Scaling: A Systems Architecture Perspective"
Technical talk at MemCon 2013 (MEMCON), Santa Clara, CA, August 2013.
Slides (pptx) (pdf) Video

 Kevin Chang, Donghyuk Lee, Zeshan Chishti, Alaa Alameldeen, Chris Wilkerson,
Yoongu Kim, and Onur Mutlu,
"Improving DRAM Performance by Parallelizing
Refreshes with Accesses"
Proceedings of the 20th International Symposium on High-Performance
Computer Architecture (HPCA), Orlando, FL, February 2014. Slides (pptx) (pdf)

48
Takeaway
 Breaking the abstraction layers (between components and
transformation hierarchy levels) and knowing what is
underneath enables you to solve problems and design
better future systems

 Cooperation between multiple components and layers can enable more effective solutions and systems

49
Yet Another Example
 DRAM Row Hammer (or, DRAM Disturbance Errors)

50
Disturbance Errors in Modern DRAM

[Figure: rows of DRAM cells, each row sharing a wordline. An aggressor row is repeatedly opened (wordline driven high) and closed (wordline driven low), disturbing the adjacent victim rows on either side.]

Repeatedly opening and closing a row enough times within a refresh interval induces disturbance errors in adjacent rows in most real DRAM chips you can buy today

Kim+, “Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors,” ISCA 2014.

51
Most DRAM Modules Are At Risk
Manufacturer   Modules at risk   Errors observed
A company      86% (37/43)       Up to 1.0×10^7
B company      83% (45/54)       Up to 2.7×10^6
C company      88% (28/32)       Up to 3.3×10^5

Kim+, “Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors,” ISCA 2014.

52
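[Figure: an x86 CPU running the loop below against a DRAM module; X and Y mark the two locations being accessed in the module.]

loop:
  mov (X), %eax     # read X
  mov (Y), %ebx     # read Y
  clflush (X)       # flush X from the cache
  clflush (Y)       # flush Y from the cache
  mfence            # order the flushes before the next reads
  jmp loop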
Observed Errors in Real Systems
CPU Architecture            Errors   Access Rate
Intel Haswell (2013)        22.9K    12.3M/sec
Intel Ivy Bridge (2012)     20.7K    11.7M/sec
Intel Sandy Bridge (2011)   16.1K    11.6M/sec
AMD Piledriver (2012)       59       6.1M/sec

• A real reliability & security issue

• In a more controlled environment, we can induce as many as ten million disturbance errors

Kim+, “Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors,” ISCA 2014.

57
Errors vs. Vintage

[Figure: errors vs. module vintage; annotation: “First Appearance”]

All modules from 2012–2013 are vulnerable

58
How Do We Solve The Problem?
 Do business as usual but better: Improve circuit and device technology such that disturbance does not happen. Use stronger error correcting codes.

 Tolerate it: Make DRAM and controllers more intelligent so that they can proactively fix the errors

 Eliminate or minimize it: Replace DRAM with a different technology that does not have the problem

 Embrace it: Design heterogeneous-reliability memories that map error-tolerant data to less reliable portions

 …
59
More on DRAM Disturbance Errors
 Yoongu Kim, Ross Daly, Jeremie Kim, Chris Fallin, Ji Hye Lee, Donghyuk
Lee, Chris Wilkerson, Konrad Lai, and Onur Mutlu,
"Flipping Bits in Memory Without Accessing Them: An
Experimental Study of DRAM Disturbance Errors"
Proceedings of the 41st International Symposium on Computer
Architecture (ISCA), Minneapolis, MN, June 2014. Slides (pptx) (pdf)
Lightning Session Slides (pptx) (pdf) Source Code and Data

 Source Code to Induce Errors in Modern DRAM Chips

 https://github.com/CMU-SAFARI/rowhammer

 One potential reading for your Homework 1 assignment

60
Recap: Some Goals of 447
 Teach/enable/empower you to:
 Understand how a computing platform (processor + memory +
interconnect) works
 Implement a simple platform (with not so simple parts), with a
focus on the processor and memory
 Understand how decisions made in hardware affect the
software/programmer as well as hardware designer
 Think critically (in solving problems)
 Think broadly across the levels of transformation
 Understand how to analyze and make tradeoffs in design

61
Review: Major High-Level Goals of This Course
 Understand the principles
 Understand the precedents

 Based on such understanding:

 Enable you to evaluate tradeoffs of different designs and ideas
 Enable you to develop principled designs
 Enable you to develop novel, out-of-the-box designs

 The focus is on:

 Principles, precedents, and how to use them for new designs

 In Computer Architecture

62
Agenda
 Intro to 18-447
 Course logistics, info, requirements
 What 447 is about
 Lab assignments
 Homeworks, readings, etc

 Assignments for the next two weeks

 Homework 0 (due this Friday: January 16)
 Homework 1 (due Jan 28)
 Lab 1 (due Jan 23)

 Basic concepts in computer architecture

63
Handouts for Today
 Online
 Homework 0
 Syllabus
 Website and Past Websites

64
Course Info: Who Are We?
 Instructor: Prof. Onur Mutlu
 [email protected]
 Office: CIC 4105
 Office Hours: W 2:30-3:30pm (or by appointment)
 http://www.ece.cmu.edu/~omutlu
 PhD from UT-Austin, worked at Microsoft Research, Intel,
AMD
 Research and teaching interests:
 Computer architecture, hardware/software interaction
 Many-core systems
 Memory and storage systems
 Improving programmer productivity
 Interconnection networks
 Hardware/software interaction and co-design (PL, OS, Architecture)
 Fault tolerance
 Hardware security
 Algorithms and architectures for bioinformatics, genomics, health applications
65
Course Info: Who Are We?
 Teaching Assistants
 Kevin Chang
 [email protected]
 Rachata Ausavarungnirun
 [email protected]
 Albert Cho
 [email protected]
 Jeremie Kim
 [email protected]
 Clement Loh
 [email protected]

 Reach all of us at
 [email protected] and Piazza
66
Your Turn
 Who are you?

 Homework 0 (absolutely required)

 Your opportunity to tell us about yourself
 Due this Friday (midnight)
 Attach your picture (absolutely required)
 Submit via Autolab

 All grading predicated on receipt of Homework 0

67
Where to Get Up-to-date Course Info?
 Website: http://www.ece.cmu.edu/~ece447
 Lecture notes and videos
 Project information
 Homeworks
 Course schedule, handouts, papers, FAQs
 Material from past incarnations of 447
 This is your single point of access to all resources: Learn it well

 Your email

 Me and the TAs

 Piazza
68
Lecture and Lab Locations, Times
 Lectures:
 MWF 12:30-2:20pm
 Hamerschlag Hall 1107
 Attendance is for your benefit and is therefore important
 Some days, we may have recitation sessions or guest lectures

 Recitations:
 T 10:30am-1:20pm, Th 1:30-4:20pm, F 6:30-9:20pm
 Hamerschlag Hall 1303
 You can attend any session
 Goals: to enhance your understanding of the lecture material,
help you with homework assignments, exams, and labs, and
get one-on-one help from the TAs on the labs.

69
Tentative Course Schedule
 Tentative schedule is in syllabus and online
 To get an idea of topics, you can look at last year’s
schedule, lectures, videos, etc:
 http://www.ece.cmu.edu/~ece447/s14
 http://www.ece.cmu.edu/~ece447/s13

 But don’t believe the “static” schedule

 Systems that perform best are usually dynamically
scheduled
 Static vs. Dynamic scheduling
 Compile time vs. Run time

70
A Note on Hardware vs. Software
 This course is classified under “Computer Hardware”

 However, you will be much more capable if you master both hardware and software (and the interface between them)
 Can develop better software if you understand the underlying
hardware
 Can design better hardware if you understand what software
it will execute
 Can design a better computing system if you understand both

 This course covers the HW/SW interface and microarchitecture
 We will focus on tradeoffs and how they affect software
71
What Do I Expect From You?
 Required background: 240 (digital logic, RTL implementation,
Verilog), 213 (systems, virtual memory, assembly)

 Learn the material thoroughly

 attend lectures, do the readings, do the homeworks
 Do the work & work hard
 Ask questions, take notes, participate
 Perform the assigned readings
 Come to class on time
 Start early – do not procrastinate
 If you want feedback, come to office hours

 Remember “Chance favors the prepared mind.” (Pasteur)

72
What Do I Expect From You?
 How you prepare and manage your time is very important

 There will be an assignment due almost every week

 8 Labs and 7 Homework Assignments

 This will be a heavy course

 However, you will learn a lot of fascinating topics and
understand how a microprocessor actually works (and how it
can be made to work better)
 And, it will hopefully change how you look at and think about
designs around you

73
How Will You Be Evaluated?

 Seven Homeworks + Reading Summaries: 14%

 Eight Lab Assignments: 40% (+ many extra credit chances)
 Midterm I: 12%
 Midterm II: 12%
 Final: 22%
 Our evaluation of your performance: 5%
 Participation counts
 Doing the readings counts

74
More on Homeworks and Labs
 Homeworks
 Do them to truly understand the material, not to get the grade
 Content from lectures, readings, labs, discussions
 All homework writeups must be your own work, written up
individually and independently
 However, you can discuss with others
 No late homeworks accepted

 Labs
 These will take time.
 You need to start early and work hard.
 Labs will be done individually unless specified otherwise.
 A total of five late lab days per semester allowed.

75
A Note on Cheating and Academic Dishonesty
 Absolutely no form of cheating will be tolerated

 You are all adults and we will treat you so

 See syllabus, CMU Policy, and ECE Academic Integrity Policy

 Linked from syllabus

 Cheating → Failing grade (no exceptions)

 And, perhaps more

76
Homeworks for the Next Two Weeks (I)
 Homework 0
 Due this Friday (Jan 16)

77
Homeworks for the Next Two Weeks (II)
 Homework 1
 Due Wednesday Jan 28
 Refresh question, MIPS warmup, ISA concepts, basic
performance evaluation, …
 Write a ½-page summary for the following paper:
 Patt, “Requirements, Bottlenecks, and Good Fortune: Agents for Microprocessor
Evolution,” Proceedings of the IEEE 2001.
 Write a ½-page summary for one of the following papers:
 Moscibroda and Mutlu, “Memory Performance Attacks: Denial of Memory Service in
Multi-Core Systems,” USENIX Security 2007.
 Liu+, “RAIDR: Retention-Aware Intelligent DRAM Refresh,” ISCA 2012.
 Kim+, “Flipping Bits in Memory Without Accessing Them: An Experimental Study of
DRAM Disturbance Errors,” ISCA 2014.
 A handout on how to write a good critical summary will be posted
 0.5% extra credit for each well-done additional summary
78
Lab Assignment 1
 A functional C-level simulator for a subset of the MIPS ISA
 Due Friday Jan 23, at the end of the Friday recitation session

 Start early, you will have a lot to learn

 Homework 1 and Lab 1 are synergistic
 Homework questions are meant to help you in the Lab
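To give a feel for what “functional simulator” means, here is a minimal fetch-decode-execute sketch for two MIPS instructions, ADDIU and ADDU. It is written in Python for brevity, whereas the lab itself is C-level, and it is not the lab's framework: the `step` function, the dictionary-based instruction memory, and the halting convention are all assumptions made for this illustration.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Sign-extend a 16-bit immediate to a Python int."""
    return value - 0x10000 if value & 0x8000 else value


def step(regs, pc, memory):
    """Execute one instruction; return the next PC.

    regs: list of 32 unsigned 32-bit register values
    memory: dict mapping word-aligned byte addresses to 32-bit instruction words
    """
    instr = memory[pc]
    opcode = (instr >> 26) & 0x3F
    rs = (instr >> 21) & 0x1F
    rt = (instr >> 16) & 0x1F
    rd = (instr >> 11) & 0x1F
    funct = instr & 0x3F
    imm = instr & 0xFFFF

    if opcode == 0x09:                      # ADDIU rt, rs, imm
        regs[rt] = (regs[rs] + sign_extend16(imm)) & MASK32
    elif opcode == 0x00 and funct == 0x21:  # ADDU rd, rs, rt
        regs[rd] = (regs[rs] + regs[rt]) & MASK32
    else:
        raise NotImplementedError(f"unsupported instruction {instr:#010x}")

    regs[0] = 0  # register $0 always reads as zero
    return pc + 4


# addiu $1, $0, 5 ; addiu $2, $0, -3 ; addu $3, $1, $2
program = {0: 0x24010005, 4: 0x2402FFFD, 8: 0x00221821}
regs, pc = [0] * 32, 0
while pc in program:  # stop when the PC runs past the program (sketch only)
    pc = step(regs, pc, program)
print(regs[1], regs[2], regs[3])  # 5 4294967293 2
```

A functional simulator like this models only architectural state (registers, memory, PC) and says nothing about timing.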

79
Required Readings for This Week
 Patt, “Requirements, Bottlenecks, and Good Fortune: Agents for
Microprocessor Evolution,” Proceedings of the IEEE 2001.
 One of
 Moscibroda and Mutlu, “Memory Performance Attacks: Denial of Memory
Service in Multi-Core Systems,” USENIX Security 2007.
 Liu+, “RAIDR: Retention-Aware Intelligent DRAM Refresh,” ISCA 2012.
 Kim+, “Flipping Bits in Memory Without Accessing Them: An Experimental
Study of DRAM Disturbance Errors,” ISCA 2014.

 P&P Chapter 1 (Fundamentals)

 P&H Chapters 1 and 2 (Intro, Abstractions, ISA, MIPS)

 Reference material throughout the course

 MIPS ISA Reference Manual + x86 ISA Reference Manual
 http://www.ece.cmu.edu/~ece447/s15/doku.php?id=techdocs
80
A Note on Books
 None required

 But, I expect you to be resourceful in finding and doing the readings…

81
Recitations Next Week
 MIPS ISA Tutorial
 You can attend any recitation session

82
What Will You Learn
 Computer Architecture: The science and art of
designing, selecting, and interconnecting hardware
components and designing the hardware/software interface
to create a computing system that meets functional,
performance, energy consumption, cost, and other specific
goals.

 Traditional definition: “The term architecture is used here to describe the attributes of a system as seen by the programmer, i.e., the conceptual structure and functional behavior as distinct from the organization of the dataflow and controls, the logic design, and the physical implementation.” Gene Amdahl, IBM Journal of R&D, April 1964
84
Computer Architecture in Levels of Transformation

Problem
Algorithm
Program/Language
Runtime System
(VM, OS, MM)
ISA (Architecture)
Microarchitecture
Logic
Circuits
Electrons

 Read: Patt, “Requirements, Bottlenecks, and Good Fortune: Agents for Microprocessor Evolution,” Proceedings of the IEEE 2001.

85
Levels of Transformation, Revisited
 A user-centric view: computer designed for users
Problem
Algorithm
Program/Language User

Runtime System
(VM, OS, MM)
ISA
Microarchitecture
Logic
Circuits
Electrons

 The entire stack should be optimized for user

86
What Will You Learn?
 Fundamental principles and tradeoffs in designing the
hardware/software interface and major components of a
modern programmable microprocessor
 Focus on state-of-the-art (and some recent research and trends)
 Trade-offs and how to make them

 How to design, implement, and evaluate a functional modern processor
 Semester-long lab assignments
 A combination of RTL implementation and higher-level simulation
 Focus is functionality first (some on “how to do even better”)

 How to dig out information, think critically and broadly

 How to work even harder!
87
Course Goals
 Goal 1: To familiarize those interested in computer system
design with both fundamental operation principles and design
tradeoffs of processor, memory, and platform architectures in
today’s systems.
 Strong emphasis on fundamentals and design tradeoffs.

 Goal 2: To provide the necessary background and experience to design, implement, and evaluate a modern processor by performing hands-on RTL and C-level implementation.
 Strong emphasis on functionality and hands-on design.

250 pages
Computer Organization Unit 1: Overview
No ratings yet
Computer Organization Unit 1: Overview
32 pages
Introduction To Computer Organization and Architecture (COA)
No ratings yet
Introduction To Computer Organization and Architecture (COA)
35 pages
Chapter 1 Edit PDF
No ratings yet
Chapter 1 Edit PDF
40 pages
Lecture 01 - Computer Abstractions and Technology
No ratings yet
Lecture 01 - Computer Abstractions and Technology
24 pages
Computer Architecture Basics
No ratings yet
Computer Architecture Basics
24 pages
01 Intro
No ratings yet
01 Intro
17 pages
Arcgis Assignments PDF
No ratings yet
Arcgis Assignments PDF
96 pages
CSE 820 Graduate Computer Architecture: Dr. Enbody
No ratings yet
CSE 820 Graduate Computer Architecture: Dr. Enbody
25 pages
1.1 Project Summary:: Digital Scrapbook
No ratings yet
1.1 Project Summary:: Digital Scrapbook
30 pages
CH6 - Computer Abstractions and Technology
No ratings yet
CH6 - Computer Abstractions and Technology
69 pages
Lecture1 ch1
No ratings yet
Lecture1 ch1
24 pages
Computer History Timeline PPTX 1
100% (1)
Computer History Timeline PPTX 1
11 pages
Computer Science
No ratings yet
Computer Science
5 pages
Advance Operating System-Computer Organization: Chap 1a: Overview
No ratings yet
Advance Operating System-Computer Organization: Chap 1a: Overview
71 pages
Inspect S50: Easy To Use Mainstream SEM Enabling Quick, Accurate Answers
No ratings yet
Inspect S50: Easy To Use Mainstream SEM Enabling Quick, Accurate Answers
4 pages
PDF
No ratings yet
PDF
41 pages
Chapter - 01 - Computer Abstractions
No ratings yet
Chapter - 01 - Computer Abstractions
37 pages
Learn C Programming
100% (10)
Learn C Programming
169 pages
Back To 'Certificate Final Exam/': Incorrect 0.00 Points Out of 1.00
No ratings yet
Back To 'Certificate Final Exam/': Incorrect 0.00 Points Out of 1.00
15 pages
Instructor: L. N. Bhuyan
No ratings yet
Instructor: L. N. Bhuyan
32 pages
Chapter 7: Operations and Postimplementation Chapter Objectives
No ratings yet
Chapter 7: Operations and Postimplementation Chapter Objectives
7 pages
EE360 Embedded Systems: Omputer Rganization and Esign
No ratings yet
EE360 Embedded Systems: Omputer Rganization and Esign
70 pages
Computer Architecture and Operating Systems (Caos) Course Code: CS31702 4-0-0
No ratings yet
Computer Architecture and Operating Systems (Caos) Course Code: CS31702 4-0-0
33 pages
Stereo Amplifier for Hi-Fi Systems
No ratings yet
Stereo Amplifier for Hi-Fi Systems
12 pages
N4 Computerised Financial Systems
No ratings yet
N4 Computerised Financial Systems
29 pages
FDMS - Adobe Photoshop - Course Outline
No ratings yet
FDMS - Adobe Photoshop - Course Outline
6 pages
NoSQL M2
No ratings yet
NoSQL M2
47 pages
That One Privacy Guy's Email Comparison Chart
No ratings yet
That One Privacy Guy's Email Comparison Chart
18 pages
Process Simulator & Visio: Optimize Business Models
No ratings yet
Process Simulator & Visio: Optimize Business Models
2 pages
VBA ArrayList Guide for Developers
No ratings yet
VBA ArrayList Guide for Developers
15 pages
BCA 1st To 6th Sem
No ratings yet
BCA 1st To 6th Sem
111 pages
Windows 10 Pro System Report
No ratings yet
Windows 10 Pro System Report
34 pages
TDSSKiller.3.1.0.28 08.11.2020 18.26.36 Log
No ratings yet
TDSSKiller.3.1.0.28 08.11.2020 18.26.36 Log
46 pages
Shinymanager
No ratings yet
Shinymanager
20 pages
PCIe Training PDF
86% (7)
PCIe Training PDF
133 pages
VLSI Interview Experience Synopsys
No ratings yet
VLSI Interview Experience Synopsys
2 pages
UVM Verification of An I2C Master Core PDF
100% (1)
UVM Verification of An I2C Master Core PDF
144 pages
CDC
No ratings yet
CDC
38 pages
Venus X1
No ratings yet
Venus X1
12 pages
Coreldraw Syllabus
No ratings yet
Coreldraw Syllabus
6 pages
Notes Chapter 2.3 Lecture 2.3.4 (Cursors)
No ratings yet
Notes Chapter 2.3 Lecture 2.3.4 (Cursors)
7 pages
Trello Tips for Beating Procrastination
No ratings yet
Trello Tips for Beating Procrastination
7 pages
3615B English User Manual
No ratings yet
3615B English User Manual
14 pages
Metastability and CDC-1
No ratings yet
Metastability and CDC-1
32 pages
UVM Basics: Nagesh Loke ARM CPU Verification Lead/Manager
No ratings yet
UVM Basics: Nagesh Loke ARM CPU Verification Lead/Manager
22 pages
Intuit Quickbook Job Description
No ratings yet
Intuit Quickbook Job Description
4 pages
IT Project Manager
No ratings yet
IT Project Manager
3 pages
Week 3
No ratings yet
Week 3
3 pages
AXI Interview Que
100% (3)
AXI Interview Que
7 pages
CRM Upgrade Boosts Productivity
No ratings yet
CRM Upgrade Boosts Productivity
5 pages
Digital Design Interview Prep
No ratings yet
Digital Design Interview Prep
5 pages
ARM SoC Verification Techniques
No ratings yet
ARM SoC Verification Techniques
28 pages
VC SpyGlass RDC Training 06-2021
100% (1)
VC SpyGlass RDC Training 06-2021
28 pages
SV Assertions
No ratings yet
SV Assertions
99 pages
Low Power SoC Design Guide
No ratings yet
Low Power SoC Design Guide
54 pages
Coverage UVM Cookbook
0% (1)
Coverage UVM Cookbook
97 pages
Qualcomm Interview All
100% (3)
Qualcomm Interview All
17 pages
Automated HDL Synthesis Guide
No ratings yet
Automated HDL Synthesis Guide
41 pages
State Machine Signaling Guide
No ratings yet
State Machine Signaling Guide
47 pages
Digital Logic RTL & Verilog Interview Questions Preview
33% (6)
Digital Logic RTL & Verilog Interview Questions Preview
34 pages
SystemVerilog Interview Q&A Guide
100% (6)
SystemVerilog Interview Q&A Guide
6 pages
Fifo Verif Plan
50% (2)
Fifo Verif Plan
20 pages
Cummings Why Use Classes For UVM Transactions
No ratings yet
Cummings Why Use Classes For UVM Transactions
2 pages
Pciec Tutorial
No ratings yet
Pciec Tutorial
305 pages
FPGA
100% (6)
FPGA
122 pages
Digital Design: Register-Transfer Level (RTL) Design
No ratings yet
Digital Design: Register-Transfer Level (RTL) Design
88 pages
Uvm
100% (7)
Uvm
46 pages
UVM Presentation DAC2011 Final
100% (1)
UVM Presentation DAC2011 Final
105 pages
Clock Distribution and Metrics
100% (2)
Clock Distribution and Metrics
52 pages
Intel Interview Questions
No ratings yet
Intel Interview Questions
11 pages