
Computer Technology

and Instructions

Sam Amiri
The Computer Revolution

 Progress in computer technology
 Underpinned by Moore’s Law
 Makes novel applications feasible
 Computers in automobiles
 Cell phones
 Human genome project
 World Wide Web
 Search engines
 Computers are pervasive

2
Classes of Computers

 Desktop computers
 General purpose, variety of software
 Subject to cost/performance tradeoff
 Server computers
 Network based
 High capacity, performance, reliability
 Range from small servers to building sized
 Embedded computers
 Hidden as components of systems
 Stringent power/performance/cost constraints

3
The Processor Market

4
Understanding Performance

 Algorithm
 Determines number of operations executed
 Programming language, compiler, architecture
 Determine number of machine instructions executed per operation
 Processor and memory system
 Determine how fast instructions are executed
 I/O system (including OS)
 Determines how fast I/O operations are executed

5
Below Your Program

 Application software
 Written in high-level language

 System software
 Compiler: translates HLL code to machine code
 Operating System: service code
 Handling input/output
 Managing memory and storage
 Scheduling tasks & sharing resources

 Hardware
 Processor, memory, I/O controllers
6
Levels of Program Code

 High-level language
 Level of abstraction closer to problem domain
 Provides for productivity and portability
 Assembly language
 Textual representation of instructions
 Hardware representation
 Binary digits (bits)
 Encoded instructions and data

7
Components of a Computer

 Same components for all kinds of computer
 Desktop, server, embedded

 Input/output includes
 User-interface devices
 Display, keyboard, mouse
 Storage devices
 Hard disk, CD/DVD, flash
 Network adapters
 For communicating with other computers
8
Inside the Processor (CPU)

 Datapath: performs operations on data

 Control: sequences datapath, memory, ...

 Cache memory
 Small fast SRAM memory for immediate access to data

9
Abstractions

 Abstraction helps us deal with complexity
 Hide lower-level detail
 Instruction set architecture (ISA)
 The hardware/software interface
 Application binary interface (ABI)
 The ISA plus system software interface
 Implementation
 The details underlying an interface

10
Defining Performance

 Response time
 How long it takes to do a task

 Throughput
 Total work done per unit time
 e.g., tasks/transactions/… per hour

 How are response time and throughput affected by
 Replacing the processor with a faster version?
 Adding more processors?

 We’ll focus on response time for now…

11
Measuring Execution Time

 Elapsed time
 Total response time, including all aspects
 Processing, I/O, OS overhead, idle time
 Determines system performance
 CPU time
 Time spent processing a given job
 Discounts I/O time, other jobs’ shares
 Comprises user CPU time and system CPU time
 Different programs are affected differently by CPU and system performance

12
CPU Clocking

 Operation of digital hardware governed by a constant-rate clock
 Clock period: duration of a clock cycle
 e.g., 250 ps = 0.25 ns = 250×10^–12 s
 Clock frequency (rate): cycles per second
 e.g., 4.0 GHz = 4000 MHz = 4.0×10^9 Hz
13
CPU Time

CPU Time = CPU Clock Cycles × Clock Cycle Time
         = CPU Clock Cycles / Clock Rate

 Performance improved by
 Reducing number of clock cycles
 Increasing clock rate
 Hardware designer must often trade off clock rate against cycle count

14
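As a quick check of the relation above, here is a minimal Python sketch (the workload numbers are hypothetical):

```python
def cpu_time(clock_cycles, clock_rate_hz):
    """CPU Time = CPU Clock Cycles / Clock Rate
    (equivalently, CPU Clock Cycles x Clock Cycle Time)."""
    return clock_cycles / clock_rate_hz

# Hypothetical program: 10 billion cycles on a 4.0 GHz clock
t = cpu_time(10e9, 4.0e9)   # 2.5 seconds
```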
Instruction Count and CPI

Clock Cycles = Instruction Count × Cycles per Instruction

CPU Time = Instruction Count × CPI × Clock Cycle Time
         = Instruction Count × CPI / Clock Rate

 Instruction Count for a program
 Determined by program, ISA and compiler
 Average cycles per instruction (CPI)
 Determined by CPU hardware
 If different instructions have different CPI
 Average CPI affected by instruction mix
15
CPI in More Detail

 If different instruction classes take different numbers of cycles

Clock Cycles = Σ (i = 1 to n) (CPIᵢ × Instruction Countᵢ)

 Weighted average CPI

CPI = Clock Cycles / Instruction Count
    = Σ (i = 1 to n) (CPIᵢ × (Instruction Countᵢ / Instruction Count))

where Instruction Countᵢ / Instruction Count is the relative frequency of class i.

16
CPI Example

 Alternative compiled code sequences using instructions in classes A, B, C

 Sequence 1: IC = 5
 Clock Cycles = 2×1 + 1×2 + 2×3 = 10
 Avg. CPI = 10/5 = 2.0

 Sequence 2: IC = 6
 Clock Cycles = 4×1 + 1×2 + 1×3 = 9
 Avg. CPI = 9/6 = 1.5
17
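The example can be reproduced with a small Python sketch of the weighted-average CPI formula; the per-class CPIs of 1, 2 and 3 for classes A, B, C are inferred from the arithmetic on the slide:

```python
def cycles_and_avg_cpi(class_cpi, instr_counts):
    """Total cycles = sum(CPI_i * IC_i); average CPI = cycles / total IC."""
    cycles = sum(c * n for c, n in zip(class_cpi, instr_counts))
    return cycles, cycles / sum(instr_counts)

cpi = [1, 2, 3]  # assumed CPIs for classes A, B, C
cycles1, avg1 = cycles_and_avg_cpi(cpi, [2, 1, 2])  # sequence 1: IC = 5
cycles2, avg2 = cycles_and_avg_cpi(cpi, [4, 1, 1])  # sequence 2: IC = 6
```

Sequence 2 executes more instructions yet finishes in fewer cycles, which is why CPI alone is a poor performance metric.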
Performance Summary

CPU Time = (Instructions / Program) × (Clock cycles / Instruction) × (Seconds / Clock cycle)

 Performance depends on
 Algorithm: affects IC, possibly CPI
 Programming language: affects IC, CPI
 Compiler: affects IC, CPI
 Instruction set architecture: affects IC, CPI, Tc

18
Power Trends

 The power wall
 We can’t reduce voltage further
 We can’t remove more heat
 How else can we improve performance?

 In CMOS IC technology:

Power = Capacitive load × Voltage² × Frequency

 e.g., frequency ×1000, voltage 5 V → 1 V, power ×30

19
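A sketch of the dynamic power formula in Python, using the slide's scaling figures with a constant capacitive load (the slide's ×30 power growth is lower than this ratio because capacitive load also shrank over the same period):

```python
def dynamic_power(cap_load, voltage, freq):
    """Dynamic CMOS power: Capacitive load x Voltage^2 x Frequency."""
    return cap_load * voltage ** 2 * freq

# Hypothetical scaling with constant capacitive load:
old = dynamic_power(1.0, 5.0, 1.0)      # 5 V, base frequency
new = dynamic_power(1.0, 1.0, 1000.0)   # 1 V, frequency x1000
ratio = new / old                       # voltage^2 drop offsets most of the frequency growth
```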
Uniprocessor Performance

Constrained by power, instruction-level parallelism, memory latency


20
Multiprocessors

 Multicore microprocessors
 More than one processor per chip
 Requires explicitly parallel programming
 Compare with instruction-level parallelism
 Hardware executes multiple instructions at once
 Hidden from the programmer
 Hard to do
 Programming for performance
 Load balancing
 Optimizing communication and synchronization

21
Instruction Set

 The repertoire of instructions of a computer
 Different computers have different instruction sets
 But with many aspects in common
 Early computers had very simple instruction sets
 Simplified implementation
 Many modern computers also have simple instruction sets

22
The MIPS Instruction Set

 Used as the example throughout this module
 Stanford MIPS commercialized by MIPS Technologies (www.mips.com)
 Large share of embedded core market
 Applications in consumer electronics, network/storage equipment, cameras, printers, …
 Typical of many modern ISAs

23
Arithmetic Operations

 Add and subtract, three operands
 Two sources and one destination

add a, b, c # a gets b + c

 All arithmetic operations have this form
 Design Principle 1: Simplicity favours regularity
 Regularity makes implementation simpler
 Simplicity enables higher performance at lower cost

24
Arithmetic Example

 C code:

f = (g + h) - (i + j);

 Compiled MIPS code:

add t0, g, h # temp t0 = g + h
add t1, i, j # temp t1 = i + j
sub f, t0, t1 # f = t0 - t1

25
Register Operands

 Arithmetic instructions use register operands
 MIPS has a 32 × 32-bit register file
 Use for frequently accessed data
 Numbered 0 to 31
 32-bit data called a “word”
 Assembler names
 $t0, $t1, …, $t9 for temporary values
 $s0, $s1, …, $s7 for saved variables
 Design Principle 2: Smaller is faster
 c.f. main memory: millions of locations

26
Register Operand Example

 C code:

f = (g + h) - (i + j);

 f, …, j in $s0, …, $s4

 Compiled MIPS code:

add $t0, $s1, $s2
add $t1, $s3, $s4
sub $s0, $t0, $t1

27
Memory Operands

 Main memory used for composite data
 Arrays, structures, dynamic data
 To apply arithmetic operations
 Load values from memory into registers
 Store result from register to memory
 Memory is byte addressed
 Each address identifies an 8-bit byte
 Words are aligned in memory
 Address must be a multiple of 4
 MIPS is Big Endian
 Most-significant byte at least address of a word
 c.f. Little Endian: least-significant byte at least address

28
Memory Operand Example 1

 C code:

g = h + A[8];

 g in $s1, h in $s2, base address of A in $s3

 Compiled MIPS code:
 Index 8 requires offset of 32
 4 bytes per word

lw $t0, 32($s3) # load word (offset 32, base register $s3)
add $s1, $s2, $t0

29
Memory Operand Example 2

 C code:

A[12] = h + A[8];

 h in $s2, base address of A in $s3

 Compiled MIPS code:
 Index 8 requires offset of 32

lw $t0, 32($s3) # load word
add $t0, $s2, $t0
sw $t0, 48($s3) # store word
30
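Since MIPS memory is byte-addressed with 4-byte words, the offsets in the two examples above can be derived with a one-line Python sketch:

```python
WORD_SIZE = 4  # bytes per MIPS word

def word_offset(index):
    """Byte offset of A[index] from the base address of a word array."""
    return index * WORD_SIZE

off_a8 = word_offset(8)    # 32, as in lw $t0, 32($s3)
off_a12 = word_offset(12)  # 48, as in sw $t0, 48($s3)
```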
Registers vs. Memory

 Registers are faster to access than memory
 Operating on memory data requires loads and stores
 More instructions to be executed
 Compiler must use registers for variables as much as possible
 Only spill to memory for less frequently used variables
 Register optimization is important!
 Register optimization is important!

31
Immediate Operands

 Constant data specified in an instruction

addi $s3, $s3, 4

 No subtract immediate instruction
 Just use a negative constant

addi $s2, $s1, -1

 Design Principle 3: Make the common case fast
 Small constants are common
 Immediate operand avoids a load instruction

32
The Constant Zero

 MIPS register 0 ($zero) is the constant 0
 Cannot be overwritten
 Useful for common operations
 E.g., move between registers

add $t2, $s1, $zero

33
MIPS R-format Instructions

op rs rt rd shamt funct
6 bits 5 bits 5 bits 5 bits 5 bits 6 bits

 Instruction fields
 op: operation code (opcode)
 rs: first source register number
 rt: second source register number
 rd: destination register number
 shamt: shift amount (00000 for now)
 funct: function code (extends opcode)

34
R-format Example

op rs rt rd shamt funct
6 bits 5 bits 5 bits 5 bits 5 bits 6 bits

add $t0, $s1, $s2

special $s1 $s2 $t0 0 add

0 17 18 8 0 32

000000 10001 10010 01000 00000 100000

00000010001100100100000000100000₂ = 02324020₁₆

35
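The field packing above can be sketched in Python by shifting each field into position (register numbers as on the slide: $s1 = 17, $s2 = 18, $t0 = 8):

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6/5/5/5/5/6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $t0, $s1, $s2: op=0, rs=17, rt=18, rd=8, shamt=0, funct=32
word = encode_r(0, 17, 18, 8, 0, 32)   # 0x02324020
```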
MIPS I-format Instructions

op rs rt constant or address
6 bits 5 bits 5 bits 16 bits

 Immediate arithmetic and load/store instructions
 rt: destination or source register number
 Constant: –2^15 to +2^15 – 1
 Address: offset added to base address in rs
 Design Principle 4: Good design demands good compromises
 Different formats complicate decoding, but allow 32-bit instructions uniformly
 Keep formats as similar as possible

36
Logical Operations

 Instructions for bitwise manipulation

 Useful for extracting and inserting groups of bits in a word

37
Shift Operations

op rs rt rd shamt funct
6 bits 5 bits 5 bits 5 bits 5 bits 6 bits

 shamt: how many positions to shift
 Shift left logical
 Shift left and fill with 0 bits
 sll by i bits multiplies by 2^i
 Shift right logical
 Shift right and fill with 0 bits
 srl by i bits divides by 2^i (unsigned only)

38
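The multiply/divide behaviour of logical shifts can be sketched in Python, masking to 32 bits to mimic a MIPS register (Python integers are otherwise unbounded):

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, as in a MIPS register

def sll(x, i):
    """Shift left logical: multiply by 2**i (modulo 2**32)."""
    return (x << i) & MASK32

def srl(x, i):
    """Shift right logical: unsigned divide by 2**i."""
    return (x & MASK32) >> i

a = sll(5, 3)    # 40 == 5 * 2**3
b = srl(40, 3)   # 5  == 40 // 2**3
```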
NOT Operations

 Useful to invert bits in a word
 Change 0 to 1, and 1 to 0
 MIPS has a 3-operand NOR instruction
 a NOR b == NOT ( a OR b )

nor $t0, $t1, $zero # register 0 ($zero): always reads as zero

$t1 0000 0000 0000 0000 0011 1100 0000 0000
$t0 1111 1111 1111 1111 1100 0011 1111 1111

39
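The NOT-via-NOR trick on the slide can be checked with a short Python sketch on the same bit pattern:

```python
MASK32 = 0xFFFFFFFF

def mips_nor(a, b):
    """a NOR b == NOT (a OR b), restricted to 32 bits."""
    return ~(a | b) & MASK32

t1 = 0x00003C00          # 0000 ... 0011 1100 0000 0000
t0 = mips_nor(t1, 0)     # NOR with $zero inverts every bit of t1
```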
Conditional Operations

 Branch to a labeled instruction if a condition is true
 Otherwise, continue sequentially
 beq rs, rt, L1
 if (rs == rt) branch to instruction labeled L1;
 bne rs, rt, L1
 if (rs != rt) branch to instruction labeled L1;
 j L1
 unconditional jump to instruction labeled L1

40
Compiling If Statements

 C code:
if (i==j) f = g+h;
else f = g-h;
 f, g, … in $s0, $s1, …
 Compiled MIPS code:
bne $s3, $s4, Else
add $s0, $s1, $s2
j Exit
Else: sub $s0, $s1, $s2
Exit: …
Assembler calculates addresses

41
More Conditional Operations

 Set result to 1 if a condition is true
 Otherwise, set to 0
 slt rd, rs, rt
 if (rs < rt) rd = 1; else rd = 0;
 slti rt, rs, constant
 if (rs < constant) rt = 1; else rt = 0;
 Use in combination with beq, bne

slt $t0, $s1, $s2 # if ($s1 < $s2)
bne $t0, $zero, L # branch to L
42
Branch Addressing

 Branch instructions specify
 Opcode, two registers, target address
 Most branch targets are near the branch
 Forward or backward

op rs rt constant or address
6 bits 5 bits 5 bits 16 bits

 PC-relative addressing
 Target address = PC + offset × 4
 PC already incremented by 4 by this time
43
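A Python sketch of PC-relative branch addressing, with hypothetical addresses; the offset counts words, and the PC has already advanced past the branch when the target is computed:

```python
def branch_target(branch_pc, offset):
    """PC-relative target: (PC + 4) + offset * 4.

    branch_pc is the address of the branch instruction; offset is the
    signed 16-bit word offset from the instruction's immediate field."""
    return (branch_pc + 4) + offset * 4

t_fwd = branch_target(0x1000, 3)    # 0x1010: skip forward 3 instructions
t_bwd = branch_target(0x1000, -2)   # 0x0FFC: loop backward 2 instructions
```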
Jump Addressing

 Jump (j and jal) targets could be anywhere in the text segment
 Encode full address in instruction

op address
6 bits 26 bits

 (Pseudo)Direct jump addressing
 Target address = PC₃₁…₂₈ : (address × 4)

44
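Pseudodirect addressing can likewise be sketched in Python: the top 4 bits of the PC are concatenated with the 26-bit address field shifted left by 2 (the example addresses are hypothetical):

```python
def jump_target(pc, address26):
    """Pseudodirect target: PC[31:28] concatenated with (26-bit address x 4)."""
    return (pc & 0xF0000000) | ((address26 << 2) & 0x0FFFFFFC)

t = jump_target(0x00400018, 0x0100040)   # stays within the PC's 256 MB region
```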
Addressing Mode Summary

45
Thank You!

46
