
COMPUTER ORGANIZATION AND DESIGN, 5th Edition
The Hardware/Software Interface

Chapter 3
Arithmetic for Computers
§3.1 Introduction
Arithmetic for Computers
 Operations on integers
 Addition and subtraction
 Multiplication and division
 Floating-point real numbers
 Representation and operations

Chapter 3 — Arithmetic for Computers — 2


§3.3 Multiplication
Multiplication
 Start with the long-multiplication approach. Decimal example: 1000 × 4012; binary example: 1000two × 1011two

                   decimal         binary
 multiplicand         1000           1000
 multiplier         × 4012         × 1011
                      2000           1000
                     1000           1000
                    0000           0000
                   4000           1000
 product           4012000        1011000

Binary makes it easy:
 0  place 0 (0 × multiplicand)
 1  place a copy (1 × multiplicand)
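That rule is literally a shift-and-add loop. A minimal C sketch (the name shift_add_mul is ours, not from the slides):

```c
#include <stdint.h>

/* Shift-and-add multiplication, one step per multiplier bit: a 1 bit
   adds a copy of the (shifted) multiplicand, a 0 bit adds nothing. */
uint64_t shift_add_mul(uint32_t multiplicand, uint32_t multiplier) {
    uint64_t product = 0;
    uint64_t mcand = multiplicand;   /* widened so the shifts can't overflow */
    while (multiplier != 0) {
        if (multiplier & 1)          /* 1 -> place a copy */
            product += mcand;
        mcand <<= 1;                 /* next bit is worth twice as much */
        multiplier >>= 1;
    }
    return product;
}
```

For the binary example above, shift_add_mul(0b1000, 0b1011) reproduces 1011000two (88ten).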


Multiplication Hardware

[Figure: multiplication hardware; the Product register is initially 0.]


Example
 Using 4-bit numbers to save space, multiply 2ten × 3ten, or 0010two × 0011two


Optimized Multiplier
 Perform steps in parallel: add/shift
 One cycle per partial-product addition
 That's OK if the frequency of multiplications is low


Example
 Multiply 0010two × 0011two using the optimized multiplier hardware


Signed Multiplication
 The simplest approach:
 Negate all negative operands at the beginning, perform unsigned multiplication on the resulting numbers, and then negate the product if necessary.
 Disadvantage:
 Extra clock cycles may be needed to negate the multiplicand, the multiplier, and the double-length product.
Booth’s Algorithm
 E.g.: 2ten × 6ten = 0010two × 0110two
 6 = 0110two, and 6 = −2 + 8, so 0110two = −0010two + 1000two
 Consider 01110two = 1×2^3 + 1×2^2 + 1×2^1 (three additions)
 Faster calculation:
 01110two = 1×2^4 − 1×2^1 (one addition and one subtraction)
 14 = 16 − 2. Similarly, 011110two = 2^5 − 2^1 = 30
 Another example (bit positions 9 down to 0):
 00111111000two = ?
 00111111111two = 2^9 − 1
 −      111two = 2^3 − 1
 00111111000two = (2^9 − 1) − (2^3 − 1) = 2^9 − 2^3
 In general, for a run of 1s from bit n up to bit m:
 000111111111000000two (1s in bits m down to n) = 2^(m+1) − 2^n
Booth’s Algorithm
 The key insight behind Booth’s algorithm:
 Classify groups of bits as the beginning, the middle, or the end of a run of 1s
Booth’s Algorithm
 Booth’s algorithm
 1. Depending on the current bit and the previous bit (the bit to its right), do one of the following:
 00: middle of a string of 0s  no arithmetic op
 01: end of a string of 1s  add the multiplicand to the left half of the product
 10: beginning of a string of 1s  subtract the multiplicand from the left half of the product
 11: middle of a string of 1s  no arithmetic op
 2. Shift the Product register right 1 bit

Booth’s Algorithm
 Requirements:
 Start with a 0 as the bit to the right of the rightmost bit
 The Booth op is selected according to the values of these 2 bits
 Extend the sign when the product is shifted to the right (sign extension)
 E.g., 2ten × 6ten = 0010two × 0110two
Example
 Let’s try Booth’s algorithm with negative numbers:
 2ten × −3ten = −6ten, or 0010two × 1101two = 1111 1010two (note the sign extension during the right shifts)
2-Bit Booth Encoding
 Using more bits for faster multiplies (b: multiplicand)

 bits   operation
 000    NOP
 001    +b
 010    +b
 011    +2b
 100    −2b
 101    −b
 110    −b
 111    NOP
MIPS Multiplication
 Two 32-bit registers for product
 HI: most-significant 32 bits
 LO: least-significant 32 bits
 Instructions
 mult rs, rt / multu rs, rt
 64-bit product in HI/LO
 mfhi rd / mflo rd
 Move from HI/LO to rd
 Can test HI value to see if product overflows 32 bits
 mul rd, rs, rt
 Least-significant 32 bits of product  rd


§3.4 Division
Division
 Check for 0 divisor
 Long-division approach
 If divisor ≤ dividend bits  1 bit in quotient, subtract
 Otherwise  0 bit in quotient, bring down next dividend bit
 Restoring division
 Do the subtract, and if the remainder goes < 0, add the divisor back
 Signed division
 Divide using absolute values
 Adjust signs of quotient and remainder as required
 n-bit operands yield n-bit quotient and remainder

                  1001      quotient
 divisor  1000 ) 1001010    dividend
                -1000
                   10
                  101
                  1010
                 -1000
                    10      remainder
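The restoring scheme can be sketched in C for 8-bit unsigned operands (restoring_div8 and div8_t are our names; the "0 divisor" check is left to the caller):

```c
#include <stdint.h>

/* Restoring division for 8-bit unsigned operands. A double-width
   remainder register starts with the dividend in its low half; each
   step shifts left, tries the subtract against the left half, and
   restores (adds the divisor back) when the result goes negative. */
typedef struct { uint8_t quotient, remainder; } div8_t;

div8_t restoring_div8(uint8_t dividend, uint8_t divisor) {
    int32_t rem = dividend;              /* low half holds the dividend */
    uint8_t q = 0;
    for (int i = 0; i < 8; i++) {
        rem <<= 1;                       /* shift remainder register left */
        rem -= (int32_t)divisor << 8;    /* try the subtract */
        if (rem < 0) {
            rem += (int32_t)divisor << 8;   /* restore */
            q = (uint8_t)(q << 1);          /* quotient bit 0 */
        } else {
            q = (uint8_t)((q << 1) | 1);    /* quotient bit 1 */
        }
    }
    return (div8_t){ q, (uint8_t)(rem >> 8) };
}
```

On the long-division example above, restoring_div8(0x4A, 0x8) gives quotient 1001two and remainder 10two.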


Division Hardware

[Figure: division hardware; the divisor is initially in the left half of the Divisor register, and the Remainder register initially holds the dividend. Why the left half?]
Example
 Using 4-bit operands: dividing 7ten by 2ten, or 0000 0111two by 0010two
Optimized Divider
 Start: place Dividend in Remainder register

 1. Shift Remainder register left 1 bit
 2. Subtract Divisor from the left half of the Remainder and put the result in the left half of the Remainder
 3. Test the Remainder:
 Remainder ≥ 0  3a. Shift Remainder left, setting the new rightmost bit to 1
 Remainder < 0  3b. Restore the original value by adding the Divisor to the left half of the Remainder and putting the sum there; shift Remainder left, setting the new rightmost bit to 0
 32nd repetition? No (< 32 repetitions)  repeat from step 2. Yes (32 repetitions)  done: shift the left half of the Remainder right 1 bit

 One cycle per partial-remainder subtraction
 Looks a lot like a multiplier!
 Same hardware can be used for both


Example
 Using the optimized divider hardware to divide 7ten by 2ten, or 0000 0111two by 0010two
Signed Division
 Simplest solution:
 Remember the signs of the divisor and dividend, divide using absolute values, and then negate the quotient if the signs disagree
 Note: the dividend and the remainder must have the same sign!

 Examples
 +7 ÷ +2: Quotient = +3, Remainder = +1
 −7 ÷ +2: Quotient = −3, Remainder = −1
 +7 ÷ −2: Quotient = −3, Remainder = +1
 −7 ÷ −2: Quotient = +3, Remainder = −1
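C's integer division (C99 and later) follows the same convention, truncating toward zero with the remainder taking the dividend's sign, so the four cases can be checked directly:

```c
/* C (C99 and later) divides integers the same way the slide does:
   the quotient truncates toward zero, and the remainder takes the
   sign of the dividend, so (a / b) * b + a % b == a always holds. */
int c_quot(int a, int b) { return a / b; }
int c_rem(int a, int b)  { return a % b; }
```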
MIPS Division
 Use HI/LO registers for result
 HI: 32-bit remainder
 LO: 32-bit quotient
 Instructions
 div rs, rt / divu rs, rt
 No overflow or divide-by-0 checking
 Software must perform checks if required
 Use mfhi, mflo to access result


§3.5 Floating Point
Floating Point
 Representation for non-integral numbers
 Including very small and very large numbers
 Like scientific notation
 –2.34 × 10^56 (normalized)
 +0.002 × 10^–4 (not normalized)
 +987.02 × 10^9 (not normalized)
 In binary
 ±1.xxxxxxx two × 2^yyyy
 Types float and double in C
Floating Point Standard
 Defined by IEEE Std 754-1985
 Developed in response to divergence of
representations
 Portability issues for scientific code
 Now almost universally adopted
 Two representations
 Single precision (32-bit)
 Double precision (64-bit)


IEEE Floating-Point Format

          S      Exponent    Fraction
 single:  1 bit  8 bits      23 bits
 double:  1 bit  11 bits     52 bits

 x = (–1)^S × (1 + Fraction) × 2^(Exponent – Bias)

 S: sign bit (0  non-negative, 1  negative)
 Normalize significand: 1.0 ≤ |significand| < 2.0
 Always has a leading pre-binary-point 1 bit, so no need to represent it explicitly (hidden bit)
 Significand is the Fraction with the “1.” restored
 Exponent: excess representation: actual exponent + Bias
 Ensures exponent is unsigned
 Single: Bias = 127; Double: Bias = 1023
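A small decoder following the formula makes the format concrete (our sketch; decode_single handles only normalized values, not the reserved exponents 0 and 255):

```c
#include <stdint.h>

/* 2^e computed with a plain loop to stay libm-free. */
static double pow2(int e) {
    double p = 1.0;
    for (int i = 0; i < (e < 0 ? -e : e); i++) p *= 2.0;
    return e < 0 ? 1.0 / p : p;
}

/* Decode a normalized single-precision bit pattern using the formula:
   x = (-1)^S * (1 + Fraction) * 2^(Exponent - Bias), Bias = 127. */
double decode_single(uint32_t bits) {
    int s         = bits >> 31;            /* sign bit */
    int exponent  = (bits >> 23) & 0xFF;   /* 8-bit biased exponent */
    uint32_t frac = bits & 0x7FFFFF;       /* 23-bit fraction */
    double significand = 1.0 + frac / 8388608.0;   /* 1 + Fraction/2^23 */
    return (s ? -1.0 : 1.0) * significand * pow2(exponent - 127);
}
```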


Single-Precision Range
 Exponents 00000000 and 11111111 reserved
 Smallest value
 Exponent: 00000001  actual exponent = 1 – 127 = –126
 Fraction: 000…00  significand = 1.0
 ±1.0 × 2^–126 ≈ ±1.2 × 10^–38
 Largest value
 Exponent: 11111110  actual exponent = 254 – 127 = +127
 Fraction: 111…11  significand ≈ 2.0
 ±2.0 × 2^+127 ≈ ±3.4 × 10^+38


Double-Precision Range
 Exponents 0000…00 and 1111…11 reserved
 Smallest value
 Exponent: 00000000001  actual exponent = 1 – 1023 = –1022
 Fraction: 000…00  significand = 1.0
 ±1.0 × 2^–1022 ≈ ±2.2 × 10^–308
 Largest value
 Exponent: 11111111110  actual exponent = 2046 – 1023 = +1023
 Fraction: 111…11  significand ≈ 2.0
 ±2.0 × 2^+1023 ≈ ±1.8 × 10^+308


Floating-Point Precision
 Relative precision
 All fraction bits are significant
 Single: approx 2^–23
 Equivalent to 23 × log10(2) ≈ 23 × 0.3 ≈ 6 decimal digits of precision
 Double: approx 2^–52
 Equivalent to 52 × log10(2) ≈ 52 × 0.3 ≈ 16 decimal digits of precision


Floating-Point Example
 Represent –0.75
 –0.75 = (–1)^1 × 1.1two × 2^–1
 S = 1
 Fraction = 1000…00two
 Exponent = –1 + Bias
 Single: –1 + 127 = 126 = 01111110two
 Double: –1 + 1023 = 1022 = 01111111110two
 Single: 1 01111110 1000…00
 Double: 1 01111111110 1000…00
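The single-precision pattern can be checked against a compiler that uses IEEE 754 floats (float_bits is our helper name): –0.75f should come out as 1 01111110 1000…00, i.e. 0xBF400000.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw IEEE-754 bit pattern of a float. memcpy is the
   portable way to reinterpret the bytes without aliasing trouble. */
uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```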


Floating-Point Example
 What number is represented by the single-precision float
 1 10000001 01000…00
 S = 1
 Fraction = 01000…00two
 Exponent = 10000001two = 129
 x = (–1)^1 × (1 + 0.01two) × 2^(129 – 127)
 = (–1) × 1.25 × 2^2
 = –5.0


Infinities and NaNs
 Exponent = 111...1, Fraction = 000...0
 ±Infinity
 Can be used in subsequent calculations,
avoiding need for overflow check
 Exponent = 111...1, Fraction ≠ 000...0
 Not-a-Number (NaN)
 Indicates illegal or undefined result
 e.g., 0.0 / 0.0
 Can be used in subsequent calculations


Denormal Numbers
 Exponent = 000...0  hidden bit is 0
 x = (–1)^S × (0 + Fraction) × 2^–126 (single precision)
 Smaller than normal numbers
 Allow gradual underflow, with diminishing precision
 The smallest single-precision de-normalized number is
 ±0.000…01two × 2^–126 = ±2^–23 × 2^–126 = ±2^–149 ≈ ±1.4 × 10^–45
 De-normal with Fraction = 000...0:
 x = (–1)^S × (0 + 0) × 2^–126 = ±0.0
  two representations of 0.0!


Floating-Point Summary

 Single Precision          Double Precision          Meaning
 Exponent   Significand    Exponent   Significand
 0          0              0          0              0
 0          Non-zero       0          Non-zero       ± de-normalized number
 1–254      Anything       1–2046     Anything       ± floating-point number
 255        0              2047       0              ± infinity
 255        Non-zero       2047       Non-zero       NaN (Not a Number)

 The smallest positive single-precision normalized number is:
 1.000…00two × 2^–126 = 2^–126 ≈ 1.2 × 10^–38
 The smallest single-precision de-normalized number is:
 0.000…01two × 2^–126 = 2^–23 × 2^–126, or 2^–149 ≈ 1.4 × 10^–45
Floating-Point Addition
 Consider a 4-digit decimal example
 9.999 × 10^1 + 1.610 × 10^–1
 1. Align decimal points
 Shift the number with the smaller exponent
 9.999 × 10^1 + 0.016 × 10^1
 2. Add significands
 9.999 × 10^1 + 0.016 × 10^1 = 10.015 × 10^1
 3. Normalize result & check for over/underflow
 1.0015 × 10^2
 4. Round and renormalize if necessary
 1.002 × 10^2 (fits in 4 digits and is still normalized, so no further change)


Floating-Point Addition
 Now consider a 4-digit binary example
 1.000two × 2^–1 + –1.110two × 2^–2 (0.5 + –0.4375)
 1. Align binary points
 Shift the number with the smaller exponent
 1.000two × 2^–1 + –0.111two × 2^–1
 2. Add significands
 1.000two × 2^–1 + –0.111two × 2^–1 = 0.001two × 2^–1
 3. Normalize result & check for over/underflow
 1.000two × 2^–4, with no over/underflow
 4. Round and renormalize if necessary
 1.000two × 2^–4 (no change) = 0.0625


FP Adder Hardware
 Much more complex than integer adder
 Doing it in one clock cycle would take too
long
 Much longer than integer operations
 Slower clock would penalize all instructions
 FP adder usually takes several cycles
 Can be pipelined


FP Adder Hardware

[Figure: FP adder hardware, with the datapath organized around Steps 1–4 of the addition algorithm. Mux1 selects the larger exponent; Mux2 selects the fraction with the smaller exponent; Mux3 selects the fraction with the larger exponent.]


Floating-Point Multiplication
 Consider a 4-digit decimal example
 1.110 × 10^10 × 9.200 × 10^–5
 1. Add exponents
 For biased exponents, subtract the bias from the sum
 New exponent = 10 + –5 = 5
 2. Multiply significands
 1.110 × 9.200 = 10.212  10.212 × 10^5
 3. Normalize result & check for over/underflow
 1.0212 × 10^6
 4. Round and renormalize if necessary
 1.021 × 10^6
 5. Determine sign of result from signs of operands
 +1.021 × 10^6


Floating-Point Multiplication
 Now consider a 4-digit binary example
 1.000two × 2^–1 × –1.110two × 2^–2 (0.5 × –0.4375)
 1. Add exponents
 Unbiased: –1 + –2 = –3
 Biased: (–1 + 127) + (–2 + 127) – 127 = –3 + 254 – 127 = –3 + 127
 2. Multiply significands
 1.000two × 1.110two = 1.110two  1.110two × 2^–3
 3. Normalize result & check for over/underflow
 1.110two × 2^–3 (no change), with no over/underflow
 4. Round and renormalize if necessary
 1.110two × 2^–3 (no change)
 5. Determine sign: +ve × –ve  –ve
 –1.110two × 2^–3 = –0.21875


FP Arithmetic Hardware
 FP multiplier is of similar complexity to FP
adder
 But uses a multiplier for significands instead of
an adder
 FP arithmetic hardware usually does
 Addition, subtraction, multiplication, division,
reciprocal, square-root
 FP  integer conversion
 Operations usually take several cycles
 Can be pipelined


FP Instructions in MIPS
 FP hardware is coprocessor 1
 Adjunct processor that extends the ISA
 Separate FP registers
 32 single-precision: $f0, $f1, … $f31
 Paired for double-precision: $f0/$f1, $f2/$f3, …
 Release 2 of the MIPS ISA supports 32 × 64-bit FP reg’s
 FP instructions operate only on FP registers
 Programs generally don’t do integer ops on FP data,
or vice versa
 More registers with minimal code-size impact
 FP load and store instructions
 lwc1, ldc1, swc1, sdc1
 e.g., ldc1 $f8, 32($sp)


FP Instructions in MIPS
 Single-precision arithmetic
 add.s, sub.s, mul.s, div.s
 e.g., add.s $f0, $f1, $f6
 Double-precision arithmetic
 add.d, sub.d, mul.d, div.d
 e.g., mul.d $f4, $f4, $f6
 Single- and double-precision comparison
 c.xx.s, c.xx.d (xx is eq, lt, le, …)
 Sets or clears FP condition-code bit
 e.g. c.lt.s $f3, $f4
 Branch on FP condition code true or false
 bc1t, bc1f
 e.g., bc1t TargetLabel


FP Example: Array Multiplication
 X = X + Y × Z
 All 32 × 32 matrices, 64-bit double-precision elements
 C code:
void mm (double x[32][32],
         double y[32][32], double z[32][32]) {
  int i, j, k;
  for (i = 0; i != 32; i = i + 1)
    for (j = 0; j != 32; j = j + 1)
      for (k = 0; k != 32; k = k + 1)
        x[i][j] = x[i][j]
                + y[i][k] * z[k][j];
}
 Addresses of x, y, z in $a0, $a1, $a2, and
i, j, k in $s0, $s1, $s2
FP Example: Array Multiplication
 MIPS code:
li $t1, 32 # $t1 = 32 (row size/loop end)
li $s0, 0 # i = 0; initialize 1st for loop
L1: li $s1, 0 # j = 0; restart 2nd for loop
L2: li $s2, 0 # k = 0; restart 3rd for loop
sll $t2, $s0, 5 # $t2 = i * 32 (size of row of x)
addu $t2, $t2, $s1 # $t2 = i * size(row) + j
sll $t2, $t2, 3 # $t2 = byte offset of [i][j]
addu $t2, $a0, $t2 # $t2 = byte address of x[i][j]
l.d $f4, 0($t2) # $f4 = 8 bytes of x[i][j]
L3: sll $t0, $s2, 5 # $t0 = k * 32 (size of row of z)
addu $t0, $t0, $s1 # $t0 = k * size(row) + j
sll $t0, $t0, 3 # $t0 = byte offset of [k][j]
addu $t0, $a2, $t0 # $t0 = byte address of z[k][j]
l.d $f16, 0($t0) # $f16 = 8 bytes of z[k][j]


FP Example: Array Multiplication

sll $t0, $s0, 5 # $t0 = i*32 (size of row of y)
addu $t0, $t0, $s2 # $t0 = i*size(row) + k
sll $t0, $t0, 3 # $t0 = byte offset of [i][k]
addu $t0, $a1, $t0 # $t0 = byte address of y[i][k]
l.d $f18, 0($t0) # $f18 = 8 bytes of y[i][k]
mul.d $f16, $f18, $f16 # $f16 = y[i][k] * z[k][j]
add.d $f4, $f4, $f16 # f4=x[i][j] + y[i][k]*z[k][j]
addiu $s2, $s2, 1 # k = k + 1
bne $s2, $t1, L3 # if (k != 32) go to L3
s.d $f4, 0($t2) # x[i][j] = $f4
addiu $s1, $s1, 1 # j = j + 1
bne $s1, $t1, L2 # if (j != 32) go to L2
addiu $s0, $s0, 1 # i = i + 1
bne $s0, $t1, L1 # if (i != 32) go to L1


Example: Rounding with Guard Digits
Interpretation of Data
The BIG Picture
 Bits have no inherent meaning
 Interpretation depends on the instructions applied
 Computer representations of numbers
 Finite range and precision
 Need to account for this in programs


§3.6 Parallelism and Computer Arithmetic: Associativity
Associativity
 Parallel programs may interleave operations in unexpected orders
 Assumptions of associativity may fail

            (x + y) + z    x + (y + z)
 x          –1.50E+38      –1.50E+38
 y           1.50E+38       1.50E+38
 z           1.0            1.0
 x + y       0.00E+00
 y + z                      1.50E+38
 sum         1.00E+00       0.00E+00

 Need to validate parallel programs under varying degrees of parallelism
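The table runs as real code; a minimal C sketch in single precision:

```c
/* The two groupings from the table, evaluated in single precision.
   With x = -1.5e38f, y = 1.5e38f, z = 1.0f, the 1.0 is far below
   float's precision at magnitude 1.5e38, so it vanishes in y + z,
   while x + y cancels exactly to 0 first in the other grouping. */
float left_assoc(float x, float y, float z)  { return (x + y) + z; }
float right_assoc(float x, float y, float z) { return x + (y + z); }
```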
§3.8 Fallacies and Pitfalls
Right Shift and Division
 Left shift by i places multiplies an integer by 2^i
 Right shift divides by 2^i?
 Only for unsigned integers
 For signed integers
 Arithmetic right shift: replicate the sign bit
 e.g., –5 / 4
 11111011two >> 2 = 11111110two = –2
 Rounds toward –∞
 c.f. 11111011two >>> 2 = 00111110two = +62
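The –5 example can be checked in C (asr8 and lsr8 are our helper names; C has no >>> operator, so the logical shift goes through an unsigned cast):

```c
#include <stdint.h>

/* Arithmetic vs. logical right shift on the slide's 8-bit example.
   Note: in C, >> on a negative signed value is implementation-defined,
   but mainstream compilers implement it as an arithmetic shift. */
int8_t asr8(int8_t x, int n) {
    return (int8_t)(x >> n);            /* replicates the sign bit */
}

uint8_t lsr8(int8_t x, int n) {
    return (uint8_t)((uint8_t)x >> n);  /* shifts in zeros */
}
```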


Who Cares About FP Accuracy?
 Important for scientific code
 But for everyday consumer use?
 “My bank balance is out by 0.0002¢!” 
 The Intel Pentium FDIV bug
 The market expects accuracy
 See Colwell, The Pentium Chronicles


§3.9 Concluding Remarks
Concluding Remarks
 ISAs support arithmetic
 Signed and unsigned integers
 Floating-point approximation to reals
 Bounded range and precision
 Operations can overflow and underflow
 MIPS ISA
 Core instructions: 54 most frequently used
 Other instructions: less frequent
