COMPUTER ORGANIZATION AND DESIGN
RISC-V Edition
The Hardware/Software Interface
Week 4
Arithmetic operations
(Chapter 2, 3)
§3.1 Introduction
Contents
■ Operations on integers
■ Addition and subtraction
■ Multiplication
■ Dealing with overflow
■ Operations on fixed-/floating-point numbers
■ Addition and multiplication
Chapter 3 — Arithmetic for Computers — 2
§3.2 Addition and Subtraction
Integer Addition
■ Example: 7 + 6
■ Overflow if result out of range
■ Adding +ve and –ve operands, no overflow
■ Adding two +ve operands
■ Overflow if result sign is 1
■ Adding two –ve operands
■ Overflow if result sign is 0
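The overflow rules above can be checked with a small sketch. It assumes 32-bit two's-complement operands; the helper name `add32` is hypothetical and chosen only for illustration.

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit registers


def add32(a, b):
    """Add two signed 32-bit values; return (wrapped result, overflow flag)."""
    raw = (a + b) & MASK
    # Reinterpret the 32-bit pattern as a signed value.
    result = raw - (1 << 32) if raw & 0x80000000 else raw
    # Overflow only when both operands share a sign and the result's sign differs.
    overflow = (a >= 0) == (b >= 0) and (result >= 0) != (a >= 0)
    return result, overflow


print(add32(7, 6))           # small operands, no overflow
print(add32(2**31 - 1, 1))   # two +ve operands, result sign is 1
print(add32(-2**31, -1))     # two -ve operands, result sign is 0
```

Mixed-sign operands never set the flag, which matches the "no overflow" case on the slide.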
Integer Subtraction
■ Add negation of second operand
■ Example: 7 – 6 = 7 + (–6)
+7: 0000 0000 … 0000 0111
–6: 1111 1111 … 1111 1010
+1: 0000 0000 … 0000 0001
■ Overflow if result out of range
■ Subtracting two +ve or two –ve operands, no overflow
■ Subtracting +ve from –ve operand
■ Overflow if result sign is 0
■ Subtracting –ve from +ve operand
■ Overflow if result sign is 1
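As a rough illustration of "add the negation of the second operand", the sketch below forms the two's-complement negation (invert, then add 1) and applies the overflow rules listed above. The 32-bit width and the name `sub32` are assumptions.

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit registers


def sub32(a, b):
    """Compute a - b as a + (-b); return (wrapped result, overflow flag)."""
    neg_b = (~b + 1) & MASK            # two's-complement negation of b
    raw = ((a & MASK) + neg_b) & MASK  # plain addition does the rest
    result = raw - (1 << 32) if raw & 0x80000000 else raw
    # Overflow only when operand signs differ and the result's sign differs from a's.
    overflow = (a >= 0) != (b >= 0) and (result >= 0) != (a >= 0)
    return result, overflow


print(sub32(7, 6))            # the 7 - 6 example
print(sub32(-2**31, 1))       # +ve subtracted from -ve, result sign is 0
print(sub32(2**31 - 1, -1))   # -ve subtracted from +ve, result sign is 1
```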
§3.3 Multiplication
Multiplication
■ Start with long-multiplication approach
  multiplicand      1000
  multiplier      × 1001
                    1000
                   0000
                  0000
                 1000
  product        1001000
Length of product is the sum of operand lengths
Chapter 3 — Arithmetic for Computers — 5
Multiplication Hardware
[Figure: multiplication hardware datapath; one register is labeled "Initially 0"]
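The long-multiplication approach corresponds to a shift-and-add loop. The following is a minimal software sketch of that idea, not a description of the exact datapath in the figure; the function name and the default 4-bit width are assumptions.

```python
def shift_add_multiply(multiplicand, multiplier, n_bits=4):
    """Multiply two unsigned n_bits-wide integers by shift-and-add."""
    product = 0  # the accumulating product starts at 0
    for _ in range(n_bits):
        if multiplier & 1:           # low multiplier bit is 1: add the multiplicand
            product += multiplicand
        multiplicand <<= 1           # next partial product moves one place left
        multiplier >>= 1             # expose the next multiplier bit
    return product


result = shift_add_multiply(0b1000, 0b1001)
print(bin(result))
```

For 4-bit operands the product needs up to 8 bits, which is the "sum of operand lengths" rule.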
Faster Multiplier
■ Uses multiple adders
■ Cost/performance tradeoff
■ Can be pipelined
■ Several multiplications performed in parallel
Fixed-Point Multiplication
■ Example: 6.5625 × 4.25
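One way to work this example is to store each operand as an integer scaled by a power of two. The sketch below assumes 4 fraction bits per operand (enough to represent both values exactly); the names `FRAC_BITS` and `to_fixed` are hypothetical.

```python
FRAC_BITS = 4  # assumption: 4 fraction bits per operand


def to_fixed(x):
    """Scale a real value to an integer with FRAC_BITS fraction bits."""
    return round(x * (1 << FRAC_BITS))


a = to_fixed(6.5625)   # 110.1001 in binary
b = to_fixed(4.25)     # 100.0100 in binary

# An ordinary integer multiply; the product carries 2 * FRAC_BITS fraction bits.
product = a * b
value = product / (1 << (2 * FRAC_BITS))
print(value)
```

The binary point of the product sits at the sum of the operands' fraction-bit counts, so a real design either keeps the wider result or shifts right by `FRAC_BITS` to return to the operand format.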
Floating-Point Addition
■ Consider a 4-digit decimal example
■ 9.999 × 10¹ + 1.610 × 10⁻¹
■ 1. Align decimal points
■ Shift number with smaller exponent
■ 9.999 × 10¹ + 0.016 × 10¹
■ 2. Add significands
■ 9.999 × 10¹ + 0.016 × 10¹ = 10.015 × 10¹
■ 3. Normalize result & check for over/underflow
■ 1.0015 × 10²
■ 4. Round and renormalize if necessary
■ 1.002 × 10²
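The final result of the decimal example can be cross-checked with Python's `decimal` module set to 4 significant digits. This is only a check of the end result: `decimal` rounds once from the exact sum, whereas the slide drops digits while aligning, and the half-up rounding mode is an assumption.

```python
from decimal import Decimal, Context, ROUND_HALF_UP

# Assumption: a 4-significant-digit decimal format that rounds half up.
ctx = Context(prec=4, rounding=ROUND_HALF_UP)

a = Decimal("9.999E1")
b = Decimal("1.610E-1")

# Exact sum is 100.151; rounding to 4 digits gives 1.002 x 10^2.
total = ctx.add(a, b)
print(total)
```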
Floating-Point Addition
■ Now consider a 4-digit binary example
■ 1.000₂ × 2⁻¹ + –1.110₂ × 2⁻² (0.5 + –0.4375)
■ 1. Align binary points
■ Shift number with smaller exponent
■ 1.000₂ × 2⁻¹ + –0.111₂ × 2⁻¹
■ 2. Add significands
■ 1.000₂ × 2⁻¹ + –0.111₂ × 2⁻¹ = 0.001₂ × 2⁻¹
■ 3. Normalize result & check for over/underflow
■ 1.000₂ × 2⁻⁴, with no over/underflow
■ 4. Round and renormalize if necessary
■ 1.000₂ × 2⁻⁴ (no change) = 0.0625
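The four steps can be sketched in software with the significand held as a signed integer scaled by 2³. This is a simplified illustration, not a full IEEE 754 adder: it truncates instead of rounding, skips the over/underflow check, and all names are hypothetical.

```python
FRAC_BITS = 3  # 4-bit significand: 1 integer bit + 3 fraction bits


def fp_add(sig_a, exp_a, sig_b, exp_b):
    """Add two values given as (signed scaled significand, unbiased exponent)."""
    # Step 1: align by shifting the operand with the smaller exponent right.
    if exp_a < exp_b:
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    mag_b = abs(sig_b) >> (exp_a - exp_b)
    sig_b = -mag_b if sig_b < 0 else mag_b
    # Step 2: add significands.
    total, exp = sig_a + sig_b, exp_a
    if total == 0:
        return 0, 0
    # Step 3: normalize so the magnitude lies in [1.000, 10.000) binary.
    sign, mag = (-1 if total < 0 else 1), abs(total)
    one = 1 << FRAC_BITS
    while mag >= 2 * one:
        mag, exp = mag >> 1, exp + 1
    while mag < one:
        mag, exp = mag << 1, exp - 1
    # Step 4: rounding is reduced to truncation in this sketch.
    return sign * mag, exp


def to_float(sig, exp):
    return sig / (1 << FRAC_BITS) * 2.0 ** exp


# 1.000 (binary) x 2^-1 is (8, -1); -1.110 (binary) x 2^-2 is (-14, -2).
sig, exp = fp_add(8, -1, -14, -2)
print(sig, exp, to_float(sig, exp))
```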
FP Adder Hardware
■ Much more complex than integer adder
■ Doing it in one clock cycle would take too long
■ Much longer than integer operations
■ Slower clock would penalize all instructions
■ FP adder usually takes several cycles
■ Can be pipelined
FP Adder Hardware
[Figure: FP adder datapath, with stages labeled Step 1 through Step 4]
Floating-Point Multiplication
■ Consider a 4-digit decimal example
■ 1.110 × 10¹⁰ × 9.200 × 10⁻⁵
■ 1. Add exponents
■ For biased exponents, subtract bias from sum
■ New exponent = 10 + –5 = 5
■ 2. Multiply significands
■ 1.110 × 9.200 = 10.212 ⇒ 10.212 × 10⁵
■ 3. Normalize result & check for over/underflow
■ 1.0212 × 10⁶
■ 4. Round and renormalize if necessary
■ 1.021 × 10⁶
■ 5. Determine sign of result from signs of operands
■ +1.021 × 10⁶
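As a quick cross-check of the decimal example, Python's `decimal` module can be limited to 4 significant digits. The half-up rounding mode is an assumption, and the module hides the individual steps; only the final value is compared.

```python
from decimal import Decimal, Context, ROUND_HALF_UP

# Assumption: a 4-significant-digit decimal format that rounds half up.
ctx = Context(prec=4, rounding=ROUND_HALF_UP)

a = Decimal("1.110E10")
b = Decimal("9.200E-5")

# Exact product is 1021200; rounding to 4 digits gives 1.021 x 10^6.
product = ctx.multiply(a, b)
print(product)
```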
Floating-Point Multiplication
■ Now consider a 4-digit binary example
■ 1.000₂ × 2⁻¹ × –1.110₂ × 2⁻² (0.5 × –0.4375)
■ 1. Add exponents
■ Unbiased: –1 + –2 = –3
■ Biased: (–1 + 127) + (–2 + 127) – 127 = –3 + 254 – 127 = –3 + 127
■ 2. Multiply significands
■ 1.000₂ × 1.110₂ = 1.110₂ ⇒ 1.110₂ × 2⁻³
■ 3. Normalize result & check for over/underflow
■ 1.110₂ × 2⁻³ (no change) with no over/underflow
■ 4. Round and renormalize if necessary
■ 1.110₂ × 2⁻³ (no change)
■ 5. Determine sign: +ve × –ve ⇒ –ve
■ –1.110₂ × 2⁻³ = –0.21875
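The five steps can be sketched with a sign bit, an integer significand scaled by 2³, and a biased exponent. This is a simplified illustration rather than a complete IEEE 754 multiplier: rounding is reduced to truncation, the over/underflow check is omitted, and the function and constant names are hypothetical.

```python
FRAC_BITS = 3  # 4-bit significand: 1 integer bit + 3 fraction bits
BIAS = 127     # single-precision exponent bias, as in the example above


def fp_mul(sign_a, sig_a, exp_a, sign_b, sig_b, exp_b):
    """Multiply two values given as (sign bit, scaled significand, biased exponent)."""
    # Step 1: add the biased exponents, then subtract the bias once.
    exp = exp_a + exp_b - BIAS
    # Step 2: multiply significands; the product has 2 * FRAC_BITS fraction bits.
    prod = sig_a * sig_b
    # Step 3: normalize if the product reached 10.0 (binary) or more.
    if prod >= (2 << (2 * FRAC_BITS)):
        prod >>= 1
        exp += 1
    # Step 4: drop the extra fraction bits (truncation stands in for rounding).
    sig = prod >> FRAC_BITS
    # Step 5: the result is negative exactly when the operand signs differ.
    return sign_a ^ sign_b, sig, exp


def to_float(sign, sig, exp):
    return (-1) ** sign * sig / (1 << FRAC_BITS) * 2.0 ** (exp - BIAS)


# +1.000 (binary) x 2^-1 and -1.110 (binary) x 2^-2, exponents stored biased.
result = fp_mul(0, 8, -1 + BIAS, 1, 14, -2 + BIAS)
print(result, to_float(*result))
```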
FP Arithmetic Hardware
■ FP multiplier is of similar complexity to FP adder
■ But uses a multiplier for significands instead of an adder
■ FP arithmetic hardware usually does
■ Addition, subtraction, multiplication, division, reciprocal, square-root
■ FP ↔ integer conversion
■ Operations usually take several cycles
■ Can be pipelined
§3.4 Division
Division
■ Check for 0 divisor
■ Long division approach
■ If divisor ≤ dividend bits
■ 1 bit in quotient, subtract
■ Otherwise
■ 0 bit in quotient, bring down next dividend bit
■ Restoring division
■ Do the subtract, and if remainder goes < 0, add divisor back
■ Signed division
■ Divide using absolute values
■ Adjust sign of quotient and remainder as required

                   1001      quotient
divisor  1000 ) 1001010      dividend
               -1000
                   10
                   101
                   1010
                  -1000
                     10      remainder

n-bit operands yield n-bit quotient and remainder
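Restoring division and the signed wrapper can be sketched as follows. The loop loosely mirrors the basic hardware scheme (divisor starts in the upper half and shifts right each step); the function names and the default 8-bit width are assumptions, and giving the remainder the sign of the dividend is one common convention for "as required".

```python
def restoring_divide(dividend, divisor, n_bits=8):
    """Unsigned restoring division; operands must fit in n_bits."""
    if divisor == 0:
        raise ZeroDivisionError("divisor is 0")
    remainder = dividend
    divisor <<= n_bits               # divisor starts in the upper half
    quotient = 0
    for _ in range(n_bits + 1):
        remainder -= divisor         # trial subtraction
        if remainder < 0:
            remainder += divisor     # went negative: restore, quotient bit 0
            quotient <<= 1
        else:
            quotient = (quotient << 1) | 1   # quotient bit 1
        divisor >>= 1
    return quotient, remainder


def signed_divide(dividend, divisor, n_bits=8):
    """Divide absolute values, then adjust the signs."""
    q, r = restoring_divide(abs(dividend), abs(divisor), n_bits)
    if (dividend < 0) != (divisor < 0):
        q = -q                       # quotient is negative if the signs differ
    if dividend < 0:
        r = -r                       # assumption: remainder follows the dividend
    return q, r


print(restoring_divide(0b1001010, 0b1000))   # the long-division example
print(signed_divide(-7, 2))
```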
Division Hardware
[Figure: division hardware datapath. Labels: "Initially divisor in left half"; "Initially dividend"]
Optimized Divider
■ One cycle per partial-remainder subtraction
■ Looks a lot like a multiplier!
■ Same hardware can be used for both
Faster Division
■ Can’t use parallel hardware as in multiplier
■ Subtraction is conditional on sign of remainder
■ Faster dividers (e.g. SRT division) generate multiple quotient bits per step
■ Still require multiple steps
RISC-V Division
■ Four instructions:
■ div, rem: signed divide, remainder
■ divu, remu: unsigned divide, remainder
■ Overflow and division-by-zero don’t produce errors
■ Just return defined results
■ Faster for the common case of no error
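The "defined results" can be modeled in a few lines. The sketch below follows the values given in the RISC-V "M" extension specification and assumes a 32-bit register width (XLEN = 32); the Python function names simply reuse the instruction mnemonics.

```python
XLEN = 32                       # assumption: RV32
MASK = (1 << XLEN) - 1
INT_MIN = -(1 << (XLEN - 1))


def div(a, b):
    """Signed divide, rounding toward zero."""
    if b == 0:
        return -1               # division by zero: all bits set
    if a == INT_MIN and b == -1:
        return INT_MIN          # overflow: result is the dividend
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q


def rem(a, b):
    """Signed remainder; the sign follows the dividend."""
    if b == 0:
        return a                # division by zero: result is the dividend
    if a == INT_MIN and b == -1:
        return 0                # overflow: remainder is 0
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def divu(a, b):
    """Unsigned divide."""
    return MASK if b == 0 else a // b


def remu(a, b):
    """Unsigned remainder."""
    return a if b == 0 else a % b


print(div(7, 0), rem(7, 0), hex(divu(7, 0)), remu(7, 0))
print(div(INT_MIN, -1), rem(INT_MIN, -1))
```

Software that needs an error on a zero divisor has to test for it explicitly, typically with a branch before the divide.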
§3.9 Fallacies and Pitfalls
Right Shift and Division
■ Left shift by i places multiplies an integer by 2ⁱ
■ Right shift divides by 2ⁱ?
■ Only for unsigned integers
■ For signed integers
■ Arithmetic right shift: replicate the sign bit
■ e.g., –5 / 4
■ 11111011₂ >> 2 = 11111110₂ = –2
■ Rounds toward –∞
■ cf. 11111011₂ >>> 2 = 00111110₂ = +62
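The pitfall is easy to reproduce on 8-bit patterns. In this sketch the helper names are hypothetical; it relies on Python's `>>` being an arithmetic shift for negative integers.

```python
def to_signed8(bits):
    """Interpret an 8-bit pattern as a two's-complement value."""
    return bits - 256 if bits & 0x80 else bits


def sra8(bits, n):
    """Arithmetic right shift: the sign bit is replicated."""
    return (to_signed8(bits) >> n) & 0xFF


def srl8(bits, n):
    """Logical right shift: zeros are shifted in."""
    return (bits & 0xFF) >> n


minus5 = 0b11111011
arith = sra8(minus5, 2)      # pattern 11111110
logical = srl8(minus5, 2)    # pattern 00111110
truncated = int(-5 / 4)      # division that rounds toward zero

print(to_signed8(arith), logical, truncated)
```

The arithmetic shift yields –2 while truncating division of –5 by 4 yields –1, so a compiler cannot blindly replace a signed divide by a shift.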
Associativity
■ Parallel programs may interleave operations in unexpected orders
■ Assumptions of associativity may fail
■ Need to validate parallel programs under varying degrees of parallelism
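A small sketch of why the order matters for floating-point sums. The operand values are chosen for illustration; Python floats are IEEE 754 double precision.

```python
x = -1.5e38
y = 1.5e38
z = 1.0

left = (x + y) + z    # x and y cancel first, so z survives
right = x + (y + z)   # z is far below the precision of y and is lost

print(left, right, left == right)
```

A parallel reduction that changes the grouping of the additions can therefore change the result from run to run.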
Who Cares About FP Accuracy?
■ Important for scientific code
■ But for everyday consumer use?
■ The Intel Pentium FDIV bug
■ The market expects accuracy
■ See Colwell, The Pentium Chronicles
■ https://www.youtube.com/watch?v=Frl_jjZkmvw
§3.10 Concluding Remarks
Concluding Remarks
■ Bits have no inherent meaning
■ Interpretation depends on the instructions applied
■ Computer representations of numbers
■ Finite range and precision
■ Need to account for this in programs
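To illustrate that bits have no inherent meaning, the sketch below reads one 32-bit pattern three ways using the standard `struct` module. The pattern itself is an arbitrary choice for illustration.

```python
import struct

bits = 0xBF800000                 # one 32-bit pattern (chosen for illustration)
raw = struct.pack("<I", bits)     # the same four bytes are reused below

as_unsigned = struct.unpack("<I", raw)[0]   # unsigned integer
as_signed = struct.unpack("<i", raw)[0]     # two's-complement integer
as_float = struct.unpack("<f", raw)[0]      # IEEE 754 single precision

print(as_unsigned, as_signed, as_float)
```

The bytes never change; only the operation applied to them decides whether they mean a large positive integer, a negative integer, or –1.0.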
Concluding Remarks
■ ISAs support arithmetic
■ Signed and unsigned integers
■ Floating-point approximation to reals
■ Bounded range and precision
■ Operations can overflow and underflow