Arvind PDF

Computer architecture has evolved greatly over time due to advances in technology and an improved understanding of software. Early computers like Babbage's Analytical Engine used mechanical components to store and process data according to programmed instructions, laying the foundations for modern programmable machines. However, it wasn't until the development of electronic computers like ENIAC and EDVAC in the 1940s that general-purpose programmable computing became possible. The stored program concept developed for EDVAC established the basic architecture that became universal, with memory to store both instructions and data. Advances in vacuum tubes and magnetic technologies improved reliability and allowed the commercialization of computing, exemplified by the influential IBM 701.

Computer Architecture: A Historical Perspective

Arvind
Computer Science and Artificial Intelligence Laboratory
M.I.T.

CompArch Summer School on Parallel Programming and Architectures, Brown University, Providence, RI.
August 20-21, 2008

August 21, 2008

Computing Devices Then

EDSAC, University of Cambridge, UK, 1949

Computing Devices Now

A journey through this space

What do computer architects actually do?
Illustrate via historical examples:
- Prehistory: Babbage and the Analytical Engine
- Early days: ENIAC, EDVAC and EDSAC
- Arrival of the IBM 650 and then the IBM 360
- Seymour Cray: CDC 6600, Cray 1
- Microprocessors and PCs
- Multicores and cell phones

Computer Architecture is the design of the abstraction layers

Application
Algorithm
Programming Language
Operating System/Virtual Machine
Instruction Set Architecture (ISA)
Microarchitecture
Register-Transfer Level (RTL)
Circuits
Devices
Physics

[Figure annotations on the layer stack:]
Original domain of the computer architect (50s-80s)
Domain of recent computer architecture (90s)
Expansion of computer architecture, mid-2000s onward
Parallel computing, security, reliability, power

Importance of Technology

New technologies not only provide greater speed, size and reliability at lower cost, but more importantly these dictate the kinds of structures that can be considered and thus come to shape our whole view of what a computer is.
Bell & Newell

Technology is the dominant factor in computer design

Technology: Transistors, Integrated circuits, VLSI (initially), Flash memories, ...
Computers

Technology: Core memories, Magnetic tapes, Disks
Computers

Technology: ROMs, RAMs, VLSI, Packaging, Low Power
Computers

But Software...

As people write programs and use computers, our understanding of programming and program behavior improves.
This has a profound though slower impact on computer architecture.
Modern architects cannot avoid paying attention to software and compilation issues.

[Figure: Technology, Computers, Software]

Architecture is Engineering

Design under constraints
Factors to consider:
Performance of whole system on target applications
- Average case & worst case
Cost of manufacturing chips and supporting system
Power to run system
- Peak power & energy per operation
Reliability of system
- Soft errors & hard errors
Cost to design chips (engineers, computers, CAD tools)
- Becoming a limiting factor in many situations; fewer unique chips can be justified
Cost to develop applications and system software
- Often the dominant constraint for any programmable device

At different points in history, and for different applications at the same point in time, the relative balance of these factors can result in widely varying architectural choices.

Prehistory: Charles Babbage & Ada Byron

Charles Babbage 1791-1871

Lucasian Professor of Mathematics, Cambridge University, 1827-1839

Charles Babbage

Difference Engine, 1823
Analytical Engine, 1833
- The forerunner of the modern digital computer!

Application
- Mathematical tables: astronomy
- Nautical tables: Navy

Background
- Any continuous function can be approximated by a polynomial (Weierstrass)

Technology
- mechanical: gears, Jacquard's loom, simple calculators

Difference Engine

A machine to compute mathematical tables

Weierstrass:
- Any continuous function can be approximated by a polynomial
- Any polynomial can be computed from difference tables

An example
f(n)  = n^2 + n + 41
d1(n) = f(n) - f(n-1) = 2n
d2(n) = d1(n) - d1(n-1) = 2

f(n) = f(n-1) + d1(n) = f(n-1) + (d1(n-1) + 2)

n        0    1    2    3    4   ...
d2(n)              2    2    2
d1(n)         2    4    6    8
f(n)     41   43   47   53   61

all you need is an adder!
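
The recurrence above can be sketched in a few lines of Python. This is only an illustration of the method of differences, not a model of Babbage's mechanism; the function name `difference_table` and its parameters are made up for this example, and the starting value d1(0) = 0 follows from d1(n) = 2n.

```python
def difference_table(f0, d1_0, d2, count):
    """Tabulate a quadratic using additions only (method of differences).

    f0    -- f(0), the first table entry
    d1_0  -- d1(0), the first difference at n = 0 (assumed known up front)
    d2    -- the constant second difference
    count -- how many table entries to produce
    """
    values = [f0]
    f, d1 = f0, d1_0
    for _ in range(count - 1):
        d1 = d1 + d2   # d1(n) = d1(n-1) + d2
        f = f + d1     # f(n)  = f(n-1) + d1(n)
        values.append(f)
    return values


# f(n) = n^2 + n + 41: f(0) = 41, d1(0) = 0, d2 = 2
table = difference_table(41, 0, 2, 5)
print(table)
```

The loop body contains no multiplication, which is the point of the slide: once the initial differences are set up, every further table entry costs two additions.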



Difference Engine

1823: Babbage's paper is published
1834: The paper is read by Scheutz & his son in Sweden
1842: Babbage gives up the idea of building it; he is onto the Analytical Engine!
1855: Scheutz displays his machine at the Paris World's Fair
- Can compute any 6th degree polynomial
- Speed: 33 to 44 32-digit numbers per minute!

- Now the machine is at the Smithsonian.
- Also a working replica of Difference Engine-2 is on display at the Computer History Museum in Mountain View, CA

Analytical Engine

1833: Babbage's paper was published
- conceived during a hiatus in the development of the difference engine

Inspiration: Jacquard looms
- looms were controlled by punched cards
- The set of cards with fixed punched holes dictated the pattern of weave => program
- The same set of cards could be used with different colored threads => numbers

1871: Babbage dies
- The machine remains unrealized.

It is not clear if the Analytical Engine could be built even today using only mechanical technology

Analytical Engine

The first conception of a general purpose computer
1. The store in which all variables to be operated upon, as well as all those quantities which have arisen from the results of the operations are placed.
2. The mill into which the quantities about to be operated upon are always brought.

The program: Operation, variable1, variable2, variable3

An operation in the mill required feeding two punched cards and producing a new punched card for the store.
An operation to alter the sequence was also provided!

The first programmer

Ada Byron aka "Lady Lovelace" 1815-52

Ada's tutor was Babbage himself!

Babbage's Influence

Babbage's ideas had great influence later primarily because of
- Luigi Menabrea, who published notes of Babbage's lectures in Italy
- Lady Lovelace, who translated Menabrea's notes into English and thoroughly expanded them.
"... Analytical Engine weaves algebraic patterns...."

In the early twentieth century the focus shifted to analog computers, but
- Harvard Mark I, built in 1944, is very close in spirit to the Analytical Engine.

Harvard Mark I

Built in 1944 in IBM Endicott laboratories
- Howard Aiken, Professor of Physics at Harvard
- Essentially mechanical but had some electromagnetically controlled relays and gears
- Weighed 5 tons and had 750,000 components
- A synchronizing clock that beat every 0.015 seconds

Performance:
- 0.3 seconds for addition
- 6 seconds for multiplication
- 1 minute for a sine calculation

Broke down once a week!

Early Developments: From ENIAC to IBM 701

Electronic Numerical Integrator and Computer (ENIAC)

Designed and built by Eckert and Mauchly at the University of Pennsylvania during 1943-45
The first completely electronic, operational, general-purpose analytical calculator!
- 30 tons, 72 square meters, 200 kW

Performance
- Read in 120 cards per minute
- Addition took 200 µs, division 6 ms
- 1000 times faster than Mark I

WW-2 effort
Not very reliable!

Application: ballistic calculations
angle = f(location, tail wind, cross wind, air density, temperature, weight of shell, propellant charge, ...)

Electronic Discrete Variable Automatic Computer (EDVAC)

ENIAC's programming system was external
- Sequences of instructions were executed independently of the results of the calculation
- Human intervention required to take instructions "out of order"

EDVAC was designed by Eckert, Mauchly and von Neumann in 1944 to solve this problem
- Solution was the stored program computer => program can be manipulated as data

First Draft of a Report on the EDVAC was published in 1945, but just had von Neumann's signature!
- Without a doubt the most influential paper in computer architecture

Stored Program Computer

Program = A sequence of instructions

How to control instruction sequencing?
manual control: calculators
automatic control
- external (paper tape): Harvard Mark I, 1944; Zuse's Z1, WW2
- internal
  - plug board: ENIAC, 1946
  - read-only memory: ENIAC, 1948
  - read-write memory: EDVAC, 1947 (concept)

The same storage can be used to store program and data

EDSAC, 1950, Maurice Wilkes
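
To make the stored-program idea concrete, here is a minimal sketch of a toy machine in which instructions and data live in the same read-write memory and sequencing depends on computed results. The instruction names (`LOAD`, `ADD`, `STORE`, `JUMPNZ`, `HALT`) and the memory layout are invented for illustration; they do not correspond to the EDVAC or EDSAC instruction sets.

```python
def run(memory):
    """Execute a toy stored program; instructions and data share `memory`."""
    pc, acc = 0, 0
    while True:
        op, addr = memory[pc]        # fetch the instruction from memory
        pc += 1
        if op == "LOAD":
            acc = memory[addr]
        elif op == "ADD":
            acc += memory[addr]
        elif op == "STORE":
            memory[addr] = acc       # may overwrite data or an instruction
        elif op == "JUMPNZ":         # sequencing depends on a computed value
            if acc != 0:
                pc = addr
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown operation {op!r}")


# Cells 0-7 hold the program, cells 10-12 hold data (hypothetical layout).
program = [
    ("LOAD", 10), ("ADD", 11), ("STORE", 10),   # total += counter
    ("LOAD", 11), ("ADD", 12), ("STORE", 11),   # counter += -1
    ("JUMPNZ", 0), ("HALT", 0),
    0, 0,
    0,    # cell 10: total
    3,    # cell 11: counter
    -1,   # cell 12: constant -1
]
result = run(list(program))
print(result[10])   # sum 3 + 2 + 1
```

Because `STORE` can write to any cell, the same mechanism that updates data could also rewrite an instruction, which is what "program can be manipulated as data" means in practice.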

The Spread of Ideas

ENIAC & EDVAC had immediate impact
- brilliant engineering: Eckert & Mauchly
- lucid paper: Burks, Goldstine & von Neumann

Machine    Site         Years   Lead
IAS        Princeton    46-52   Bigelow
EDSAC      Cambridge    46-50   Wilkes
MANIAC     Los Alamos   49-52   Metropolis
JOHNIAC    Rand         50-53
ILLIAC     Illinois     49-52
           Argonne      49-53
SWAC       UCLA-NBS

UNIVAC - the first commercial computer, 1951

Alan Turing's direct influence on these developments is often debated by historians.

Dominant Technology Issue: Reliability

ENIAC: 18,000 tubes, 20 10-digit numbers
EDVAC: 4,000 tubes, 2000 word storage, mercury delay lines

Mean time between failures (MTBF)
- MIT's Whirlwind with an MTBF of 20 min. was perhaps the most reliable machine!

Reasons for unreliability:
1. Vacuum tubes
2. Storage medium
- acoustic delay lines
- mercury delay lines
- Williams tubes
- Selectrons

CORE: J. Forrester, 1954

BINAC

Two processors that checked each other for reliability.
Didn't work well because processors never agreed

And then there was IBM 701

IBM 701 -- 30 machines were sold in 1953-54
IBM 650 -- a cheaper, drum based machine; more than 120 were sold in 1954 and there were orders for 750 more!

Users stopped building their own machines.

Why was IBM late getting into computers?
IBM was making too much money!
Even without computers, IBM revenues were doubling every 4 to 5 years in 40s and 50s.

Software Developments

up to 1955: Libraries of numerical routines
- Floating point operations
- Transcendental functions
- Matrix manipulation, equation solvers, . . .

1955-60: High level Languages - Fortran 1956
Operating Systems
- Assemblers, Loaders, Linkers, Compilers
- Accounting programs to keep track of usage and charges

Machines required experienced operators
- Most users could not be expected to understand these programs, much less write them
- Machines had to be sold with a lot of resident software

The first definition of an Instruction Set Abstraction: IBM 360

Programmer's view of the machine: IBM 650

A drum machine with 44 instructions
Instruction: 60 1234 1009
- "Load the contents of location 1234 into the distribution; put it also into the upper accumulator; set lower accumulator to zero; and then go to location 1009 for the next instruction."

Programmer's view of the machine was inseparable from the actual hardware implementation
Good programmers optimized the placement of instructions on the drum to reduce latency!
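
As a rough illustration of why instruction placement mattered, the sketch below splits the example word into its three fields and estimates how long the drum must rotate before a target word arrives under the read head. The names `Instruction650`, `decode` and `rotational_wait` are invented here, and the figure of 50 words per drum band, like the one-word-per-time-step model, is an assumption made for the example rather than something stated on the slide.

```python
from typing import NamedTuple


class Instruction650(NamedTuple):
    opcode: int         # operation code, e.g. 60
    data_address: int   # operand location, e.g. 1234
    next_address: int   # where to fetch the next instruction, e.g. 1009


def decode(word: str) -> Instruction650:
    """Split a word written as 'op data next' into its three fields."""
    op, data, nxt = word.split()
    return Instruction650(int(op), int(data), int(nxt))


def rotational_wait(head_position: int, target_address: int,
                    words_per_band: int = 50) -> int:
    """Word-times until `target_address` rotates under the head.

    Assumes a word's angular position is its address modulo
    `words_per_band` and that the drum advances one word per time step.
    """
    return (target_address - head_position) % words_per_band


inst = decode("60 1234 1009")
print(inst)
# A next instruction placed just ahead of the head is cheap to fetch;
# one placed just behind it costs almost a full revolution.
print(rotational_wait(10, 12), rotational_wait(12, 10))
```

Since every instruction names its successor explicitly, a programmer could choose `next_address` so that the wait is close to zero, which is the optimization the slide refers to.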

Compatibility Problem at IBM

By early 60s, IBM had 4 incompatible lines of computers!
701  => 7094
650  => 7074
702  => 7080
1401 => 7010

Each system had its own
- Instruction set
- I/O system and Secondary Storage: magnetic tapes, drums and disks
- assemblers, compilers, libraries, ...
- market niche: business, scientific, real time, ...

=> IBM 360

IBM 360: Design Premises
Amdahl, Blaauw and Brooks, 1964

- The design must lend itself to growth and successor machines
- General method for connecting I/O devices
- Total performance - answers per month rather than bits per microsecond => programming aids
- Machine must be capable of supervising itself without manual intervention
- Built-in hardware fault checking and locating aids to reduce down time
- Simple to assemble systems with redundant I/O devices, memories etc. for fault tolerance
- Some problems required floating point words larger than 36 bits

IBM 360: A General-Purpose Register (GPR) Machine

Processor State
- 16 General-Purpose 32-bit Registers
  - may be used as index and base register
  - Register 0 has some special properties
- 4 Floating Point 64-bit Registers
- A Program Status Word (PSW)
  - PC, Condition codes, Control flags

A 32-bit machine with 24-bit addresses
- No instruction contains a 24-bit address!

Data Formats
- 8-bit bytes, 16-bit half-words, 32-bit words, 64-bit double-words
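
One way to see how a machine can have 24-bit addresses without any instruction carrying one is base-plus-displacement addressing. The sketch below assumes the System/360 storage-operand form in which an address is built from a base register, an optional index register and a 12-bit displacement, with register 0 meaning "no register" in either role. The function name `effective_address` and the register contents are made up for the example.

```python
ADDRESS_BITS = 24
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1
DISPLACEMENT_BITS = 12


def effective_address(regs, base, index, disp):
    """Form a 24-bit address from base register + index register + disp.

    regs  -- list of 16 general-purpose register values
    base  -- base register number (0 means no base contribution)
    index -- index register number (0 means no index contribution)
    disp  -- unsigned 12-bit displacement carried in the instruction
    """
    if not 0 <= disp < (1 << DISPLACEMENT_BITS):
        raise ValueError("displacement must fit in 12 bits")
    base_value = regs[base] if base != 0 else 0
    index_value = regs[index] if index != 0 else 0
    return (base_value + index_value + disp) & ADDRESS_MASK


regs = [0] * 16
regs[0] = 0xFFFF       # ignored when register 0 is named as base or index
regs[12] = 0x012000    # hypothetical base of a data area
regs[3] = 0x10         # hypothetical array offset
print(hex(effective_address(regs, 12, 3, 0x0A4)))
```

The instruction only needs a few bits for the register numbers plus 12 bits of displacement, so it stays short, and moving a program in memory only requires reloading its base register.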


IBM 360: Initial Implementations (1964)

                  Model 30           ...   Model 70
Memory Capacity   8K - 64 KB               256K - 512 KB
Memory Cycle      2.0 µs             ...   1.0 µs
Datapath          8-bit                    64-bit
Circuit Delay     30 nsec/level            5 nsec/level
Registers         in Main Store            in Transistor
Control Store     Read only 1 µsec         Dedicated circuits

Six implementations (Models 30, 40, 50, 60, 62, 70)
50X performance difference across models
ISA completely hid the underlying technological differences between various models.
With minor modifications, IBM 360 ISA is still in use

IBM 360: Forty years later
The zSeries z990 Microprocessor

64-bit virtual addressing
- original 360 was 24-bit; 370 was a 31-bit extension
Dual core design
Dual-issue in-order superscalar
10-stage CISC pipeline
Out-of-order memory accesses
Redundant datapaths
- every instruction performed in two parallel datapaths and results compared
256KB L1 I-cache, 256KB L1 D-cache on-chip
32MB shared L2 unified cache, off-chip
512-entry L1 TLB + 4K-entry L2 TLB
- very large TLB, to support multiple virtual machines
8K-entry Branch Target Buffer
- Very large buffer to support commercial workloads
Up to 64 processors (48 visible to customer) in one machine
1.2 GHz in IBM 130nm SOI CMOS technology, 55W for both cores

[ IBM Journal R&D, 48(3/4), May/July 2004 ]

Seymour Cray: The champion designer of fastest computers

CDC 6600
Seymour Cray, 1963

A fast pipelined machine with 60-bit words
- 128 Kword main memory capacity, 32 banks
Ten functional units (parallel, unpipelined)
- Floating Point: adder, 2 multipliers, divider
- Integer: adder, 2 incrementers, ...
Hardwired control (no microcoding)
Dynamic scheduling of instructions using a scoreboard
Ten Peripheral Processors for Input/Output
- a fast multi-threaded 12-bit integer ALU
Very fast clock, 10 MHz (FP add in 4 clocks)
>400,000 transistors, 750 sq. ft., 5 tons, 150 kW, novel freon-based technology for cooling
Fastest machine in world for 5 years (until 7600)
- over 100 sold ($7-10M each)

IBM Memo on CDC6600

Thomas Watson Jr., IBM CEO, August 1963:
"Last week, Control Data ... announced the 6600 system. I understand that in the laboratory developing the system there are only 34 people including the janitor. Of these, 14 are engineers and 4 are programmers... Contrasting this modest effort with our vast development activities, I fail to understand why we have lost our industry leadership position by letting someone else offer the world's most powerful computer."

To which Cray replied: "It seems like Mr. Watson has answered his own question."

CDC 6600: A Load/Store Architecture

Separate instructions to manipulate three types of reg.
- 8 60-bit data registers (X)
- 8 18-bit address registers (A)
- 8 18-bit index registers (B)

All arithmetic and logic instructions are reg-to-reg
- Ri <- (Rj) op (Rk), selected by the opcode field

Only Load and Store instructions refer to memory!
- Ri <- M[(Rj) + disp], with a 6-bit opcode and an 18-bit disp

Touching address registers 1 to 5 initiates a load, 6 to 7 initiates a store
- very useful for vector operations
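
The address-register side effect can be sketched as follows. This is a simplified model, not a 6600 simulator: the class name `Cdc6600Sketch`, the method `set_a` and the small memory size are invented for the example, and 60-bit word width, timing and the functional units are ignored.

```python
class Cdc6600Sketch:
    """Toy model of the coupling between A and X registers."""

    def __init__(self, memory_words=64):
        self.mem = [0] * memory_words
        self.X = [0] * 8   # data registers
        self.A = [0] * 8   # address registers

    def set_a(self, i, address):
        """Write address register Ai and trigger the implied memory access."""
        self.A[i] = address
        if 1 <= i <= 5:
            self.X[i] = self.mem[address]    # A1..A5: load into Xi
        elif i in (6, 7):
            self.mem[address] = self.X[i]    # A6, A7: store from Xi
        # A0 has no memory side effect in this sketch


def vector_add(m, a, b, c, n):
    """c[k] = a[k] + b[k], driven only by address-register updates."""
    for k in range(n):
        m.set_a(1, a + k)            # loads X1
        m.set_a(2, b + k)            # loads X2
        m.X[6] = m.X[1] + m.X[2]     # register-to-register arithmetic
        m.set_a(6, c + k)            # stores X6


m = Cdc6600Sketch()
m.mem[0:3] = [1, 2, 3]
m.mem[10:13] = [10, 20, 30]
vector_add(m, a=0, b=10, c=20, n=3)
print(m.mem[20:23])
```

Stepping through a vector only requires incrementing address registers; each increment doubles as the load or store, which is why the slide calls the scheme useful for vector operations.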

CDC 6600: Datapath

[Figure: datapath with blocks labeled Central Memory, Operand Regs (8 x 60-bit), Address Regs (8 x 18-bit), Index Regs (8 x 18-bit), Inst. Stack (8 x 60-bit), IR and 10 Functional Units; paths labeled operand, result, oprnd addr, result addr]

Microprocessor Evolution: 4004 to Pentium-4

First Microprocessor

Intel 4004, 1971
- 4-bit accumulator architecture
- 8 µm pMOS
- 2,300 transistors
- 3 x 4 mm^2
- 750 kHz clock
- 8-16 cycles/inst.

Microprocessors in the Seventies

Initial target was embedded control
- Intel 4004 was designed for a desktop printing calculator
Constrained by what could fit on single chip
- Single accumulator architectures
8-bit micros used in hobbyist personal computers
- Micral, Altair, TRS-80, Apple-II
Little impact on conventional computer market until VISICALC spreadsheet for Apple-II (6502, 1MHz)
- First "killer" business application for personal computers

Microprocessor Evolution through 70s

Rapid progress in size and speed
- Fueled by advances in MOSFET technology and expanding markets
Intel i432
- Most ambitious seventies' micro; started in 1975 - released 1981
- 32-bit capability-based object-oriented architecture
Motorola 68000 (1979, 8MHz, 68,000 transistors)
- Heavily microcoded (and nanocoded)
- 32-bit general purpose register architecture (24 address pins)
Intel 8086 (1978, 8MHz, 29,000 transistors)
- "Stopgap" 16-bit processor, architected in 10 weeks
- Extended accumulator architecture, assembly-compatible with 8080

IBM PC, 1981

Hardware
- Team from IBM building PC prototypes in 1979
- Motorola 68000 chosen initially, but 68000 was late
- IBM builds "stopgap" prototypes using 8088 boards from Displaywriter word processor
- 8088 is 8-bit bus version of 8086 => allows cheaper system
- Estimated sales of 250,000
- 100,000,000s sold

Software
- Microsoft negotiates to provide OS for IBM. Later buys and modifies QDOS from Seattle Computer Products.

Open System
- Standard processor, Intel 8088
- Standard interfaces
- Standard OS, MS-DOS
- IBM permits cloning and third-party software

The Eighties: Personal Computer Revolution

Personal computer market emerges
- Huge business and consumer market for spreadsheets, word processing and games
- Based on inexpensive 8-bit and 16-bit micros: Zilog Z80, MOS Technology 6502, Intel 8088/86, ...
Minicomputers replaced by workstations
- Distributed network computing and high-performance graphics for scientific and engineering applications (Sun, Apollo, HP, ...)
- Based on powerful 32-bit microprocessors with virtual memory, caches, pipelined execution, hardware floating-point
- Commercial RISC processors developed for workstation market
Massively Parallel Processors (MPPs) appear
- Use many cheap micros to approach supercomputer performance (Sequent, Intel, Parsytec)

The Nineties

Advanced superscalar microprocessors appear
- first superscalar microprocessor is IBM POWER in 1990
MPPs have limited success in supercomputing market
- Highest-end mainframes and vector supercomputers survive "killer micro" onslaught
64-bit addressing becomes essential at high-end
- In 2004, 4GB DRAM costs <$1,000
Parallel microprocessor-based SMPs take over low-end server
Workstation and PC markets merge
- By late 90s RISC vendors get wiped out and the CISC x86 ISA thrives!
- Only exception: Apple PowerPC-based systems

Intel Pentium 4 (2000)


MIPS R10000 (1995)

0.35 µm CMOS, 4 metal layers
Four instructions per cycle
Out-of-order execution
Register renaming
Speculative execution past 4 branches
On-chip 32KB/32KB split I/D cache, 2-way set-associative
Off-chip L2 cache
Non-blocking caches

Compare with simple 5-stage pipeline (R5K series)
- ~1.6x performance SPECint95
- ~5x CPU logic area
- ~10x design effort

By 2004 Industry gives up on clock speed and starts pushing Multi-cores

"Learn how the multi-core processor architecture plays a central role in Intel's platform approach."
"AMD is leading the industry to multi-core technology for the x86 based computing market"
"Sun's multicore strategy centers around multi-threaded software..."

[Figure: IBM Power 5]

By 2004 Industry gives up on clock speed and starts pushing Multi-cores for power reasons

Present: Multicores and Cell Phones

The power of numbers

Last year 950M cell phones were sold, as opposed to 100M PCs
- India & China are selling ~7M new cell phone connections per month
- In developing countries the cell phone is the only computer most people have

A shift in research is underway from PCs to cell phones, not very different from the shift from mainframes and minis to PCs in the early eighties.

The future will be dominated by the concerns of
- cheap & powerful handheld devices
and
- the powerful infrastructure needed to support services on these devices.

Current Cellphone Architecture

[Figure: blocks labeled WLAN RF, WCDMA/GSM RF, Comms. Processing and Application Processing]

Many specialized complex blocks
Two chips, each with an ARM general-purpose processor (GPP) and a DSP (TI OMAP 2420)
Complex, high performance, but must not dissipate more than 3 watts

Real power saving implies specialized hardware

H.264 implementations in software vs hardware
- the power/energy savings could be 100 to 1000 fold

but our mind set is that hardware design is:
- Difficult, risky
- Increased time-to-market
- Inflexible, brittle, error prone, ...
- How to deal with changing standards, errors

New design flows and tools can change this mind set

SoC & Multicore Convergence: more application specific blocks

[Figure: Application-specific processing units, On-chip memory banks, General-purpose processors, Structured on-chip networks]

Can we design a variety of such chips economically?

These are truly exciting times for computer architects!

Thanks!