Carnegie Mellon
Floating Point Numbers
N. Navet - Computing Infrastructure 1 / Lecture 2
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
IEEE Floating Point standard
IEEE 754 Standard
▪ Established in 1985 as a uniform standard for floating point arithmetic
▪ Before that, many proprietary formats existed, leading to non-portable
applications
▪ In the mid-1970s, Intel hired prof. Kahan (Berkeley) to devise a floating
point coprocessor (8087) for the 8086 processor → this work was later
re-used in the IEEE standard
▪ Nowadays, IEEE 754 is supported in HW by virtually all CPUs that have a
floating point unit (otherwise it can be implemented in SW)
Driven by numerical concerns
▪ Good standards for rounding, overflow, underflow
▪ Hard to make fast in hardware
▪ Numerical analysts predominated over hardware designers in defining
the standard
Principles of floating point numbers
Basis for supporting (an approximation of) arithmetic with real
numbers
A floating point number is a rational number (i.e., a quotient of two
integers)
Real numbers that cannot be represented as floating points are
approximated, leading to numerical imprecision (real numbers
form a continuum, floating points do not → rounding to the
nearest expressible value is needed)
A floating point is a number of the form significand · base^exponent,
where significand, exponent and base are all integers, e.g. in base
10, 5.367 = 5367 · 10^-3
"Floating point" because the point can "float": it can be placed
anywhere relative to the significant digits of the number (depending
on the value of the exponent), e.g. 536.7 · 10^-2 = 5367 · 10^-3
Principles of floating point numbers
As there is more than one way to represent a number, we need
a single standardized representation
Familiar base-10 (normalized) scientific notation used in
physics, math and engineering: n = f · 10^e where
▪ f is the fraction (aka mantissa or significand) with one non-zero decimal
digit before the decimal point
▪ e is a positive or negative integer called the exponent
[Figure: examples written in normalized scientific notation]
Range is determined by the number of digits of the exponent
Precision by the number of digits in the fraction
In computers, the base is 2; the floating-point representation
encodes rational numbers of the form V = x × 2^y
Tiny Floating Point Example #1
Base 10
Signed 3-digit significand that can be either 0, or (0.1 ≤ f < 1) or (−1 < f ≤ −0.1)
Signed 2-digit exponent (what are the min and max exponents?)
Range over nearly 200 orders of magnitude: −0.999 · 10^99 to +0.999 · 10^99
The separation between expressible numbers is not constant: e.g., the
separation between +0.998 × 10^99 and +0.999 × 10^99 is much larger than the
separation between +0.998 × 10^0 and +0.999 × 10^0
But the relative error introduced by rounding is about the same (i.e., the
separation between a number and its successor, expressed as a percentage of
that number, is approximately the same over the whole range)
How to increase the accuracy of representation ?
How to increase the range of expressible numbers ?
Course reading – “Structured Computer Organization”:
Appendix B: floating point numbers
Example #1: the real line is divided up into seven regions
1. Large negative numbers, less than −0.999 × 10^99
2. Negative numbers between −0.999 × 10^99 and −0.100 × 10^−99
3. Small negative numbers, between −0.100 × 10^−99 and zero
4. Zero
5. Small positive numbers, between 0 and 0.100 × 10^−99
6. Positive numbers between 0.100 × 10^−99 and 0.999 × 10^99
7. Large positive numbers, greater than 0.999 × 10^99
It is not possible to express any number in regions 1, 3, 5 and 7:
e.g., 10^60 × 10^60 = 10^120 → positive overflow
[Number line: expressible values span −0.999 · 10^99 to −0.1 · 10^−99 and 0.1 · 10^−99 to 0.999 · 10^99]
Nb: underflow is less serious than overflow since 0
is usually a satisfactory approximation in regions 3 and 5
Normalized numbers and hidden bits
The "normalized" format represents all numbers except those
close to 0, which are represented with the "denormalized" format
(seen later in the lecture)
312.25 can be represented with the integer 31225 as the
significand and 10^-2 as the power term, but in many other ways too
Its normalized scientific notation in base 10 is 3.1225 × 10^2, that
is, with one non-zero decimal digit before the decimal point
Same principle for the normalized form in base 2: 1.xxx × 2^y
As the most significant bit is always a 1, it is not necessary to
store it → this is the hidden bit
IEEE 754 double precision: the size of the significand is 52 bits not
including the hidden bit, 53 bits with it
Floating Point Representation – normalized numbers
The IEEE 754 standard represents FP numbers having the following form:
(–1)^s · M · 2^E
▪ Sign bit s determines whether the number is negative or positive
▪ Significand M is (except in special cases) a fractional binary number in the range
[1.0, 2.0) (the interval starts at 1 because of the leading 1: 1.xxxx…x × 2^E)
▪ Exponent E weights the value by a power of two
How to express 0?
Encoding of a FP number is done over 3 fields:
▪ the most significant bit is the sign bit s
▪ the exp field encodes E (but is not equal to E)
▪ the frac field encodes M (but is not equal to M)
s exp frac
Precision options
As a programmer, you can expect a precision of 7 decimal digits in
single precision and 15 in double precision. Except for good reasons,
you should always use double precision numbers.
Single precision: 32 bits
s exp frac
1 8-bits 23-bits
Double precision: 64 bits
s exp frac
1 11-bits 52-bits
Extended precision: 80 bits (not supported by all CPUs and
compilers) – out of the scope of the course
s exp frac
1 15-bits 64-bits
3 types of floating point encodings
Determined by the value of the exponent – here we consider
single precision numbers, that is with an exponent of 8 bits
Denormalized numbers are a "sub-format" within the IEEE 754 floating-point format
Not A Number (NaN): a value that is undefined
examples: 0/0, √−5
Visualization: Floating Point Encodings
[Number line: −∞ | −Normalized | −Denorm | −0 +0 | +Denorm | +Normalized | +∞;
values beyond the normalized ranges cannot be represented]
Denormalized encoding is for 0 and
numbers that are very close to 0
Case 1: "Normalized" Values v = (–1)^s · M · 2^E
Most common case: when the bit pattern of exp is ≠ 000…0 and ≠
111…1 (i.e., ≠ 255 for single precision and ≠ 2047 for double)
Exponent coded as a biased value: E = Exp – Bias
▪ Exp: unsigned value of the exp field of the floating point number
▪ Bias = 2^(k−1) − 1, where k is the number of exponent bits
▪ Single precision: bias = 127 (Exp: 1…254, E: −126…127)
▪ Double precision: bias = 1023 (Exp: 1…2046, E: −1022…1023)
Significand coded with implied leading 1: M = 1.xxx…x₂
▪ xxx…x: bits of the frac field
▪ Minimum when frac = 000…0 (M = 1.0)
▪ Maximum when frac = 111…1 (M = 2.0 – ε)
▪ Get extra leading bit for "free" (hidden bit)
Beyond the lecture's scope: thanks to the bias, the exp field can be
encoded as unsigned (as it is positive) and not in two's complement,
which allows for faster comparison of FP numbers
Normalized Encoding : example
v = (–1)^s · M · 2^E
in single precision, E = Exp – Bias
Value: float F = 15213.0;
▪ 15213₁₀ = 11101101101101.0₂ × 2^0
= 1.1101101101101₂ × 2^13
5 steps: a) (unsigned) binary form b) normalized form c) encode significand
d) encode exponent e) sign bit
Significand
M = 1.1101101101101₂
frac field (23 bits) = 11011011011010000000000₂
Exponent (single precision)
E = 13
Bias = 127
Exp field (8 bits) = 140 = 10001100₂
Result (bit 31 … bit 0):
0 10001100 11011011011010000000000
s exp frac
v = (–1)^s · M · 2^E
Example #2 E = Exp – Bias
http://www.binaryconvert.com/convert_float.html
1) Write 4.0 as v = (–1)^s · M · 2^E: 4 = (–1)^0 · 1.0 · 2^2
2) Encode 4.0 as a floating point number (single precision)
Example #2 (answer)
4 = (–1)^0 · 1.0 · 2^2, encoded over 32 bits = 4 bytes
[Answer figure: the resulting single-precision bit pattern, bit 31 … bit 0]
v = (–1)^s · M · 2^E
Example #3 E = Exp – Bias
Encode 4.75 as a floating point number
in single precision format
v = (–1)^s · M · 2^E
Example #4 E = Exp – Bias
Encode 1.0 in IEEE 754
single precision format
1 = (–1)^0 · (1+0) · 2^0
How would 1.0 be encoded without the BIAS?
Case 2: Denormalized numbers v = (–1)^s · M · 2^E
E = 1 – Bias
exp = 000…0 indicates a denormalized number
Purpose: represent 0 and numbers very close to 0 that normalized
numbers cannot represent
Exponent value is constant : E = 1 – Bias (i.e., E = -126 in single
precision or E=-1022 in double precision)
Significand coded with implied leading 0: M = 0.xxx…x₂
▪ xxx…x: bits of frac
Cases
▪ exp = 000…0, frac = 000…0
▪ Represents the value zero
▪ Two distinct values: +0 and –0 (all bits are zero, except possibly the sign bit)
▪ exp = 000…0, frac ≠ 000…0
▪ Numbers are equi-spaced in that range as the exponent is constant
Why can't 0 be represented with the normalized encoding?
v = (–1)^s · M · 2^E
Example #5 E = −126
a) Encode the smallest strictly positive denormalized number in
single precision floating point b) Express this value as a power of 2
= (–1)^0 · 2^−23 · 2^−126 = 2^−149
v = (–1)^s · M · 2^E
Example #6 E = −126
Single precision floating point: what is the encoding of the largest
positive denormalized number in binary?
= (–1)^0 · (2^−1 + 2^−2 + … + 2^−22 + 2^−23) · 2^−126
= 2^−126 · (1 − 2^−23)
Case 3: Special Values
Condition: exp = 111…1
Case: exp = 111…1, frac = 000…0
▪ Represents the value ∞ (infinity)
▪ Can be used as an operand and behaves according to the usual
mathematical rules for ∞
▪ As expected, both positive and negative ∞ exist
▪ E.g., 1.0/0.0 = −1.0/−0.0 = +∞, 1.0/−0.0 = −∞
Case: exp = 111…1, frac ≠ 000…0
▪ Not-a-Number (NaN)
▪ Represents cases where no numeric value can be determined
▪ E.g., sqrt(–1), ∞ − ∞, ∞ × 0
IEEE 754: a recap
[Recap table: exp ≠ 000…0 and ≠ 111…1 → normalized; exp = 000…0 → denormalized;
exp = 111…1 → ∞ (frac = 0) or NaN (frac ≠ 0)]
Floating Point Zero is the same as Integer Zero
▪ All bits = 0
Supplementary material
Outside the scope of the course
Tiny Floating Point Example #2
s exp frac
1 4-bits 3-bits
8-bit Floating Point Representation
▪ the sign bit is in the most significant bit
▪ the next four bits are the exponent, with a bias of 7
▪ the last three bits are the frac
v = (–1)^s · M · 2^E
Normalized: E = Exp – Bias
Denormalized: E = 1 – Bias
Same general form as the IEEE format
▪ normalized, denormalized
▪ representation of 0, NaN, infinity
a) What is the smallest strictly positive normalized number, and what
is the largest?
b) List all positive denormalized numbers
v = (–1)^s · M · 2^E
Range (Positive Only)   Normalized: E = Exp – Bias
                        Denormalized: E = 1 – Bias
s exp frac   E   Value
Denormalized numbers:
0 0000 001  -6  1/8 * 1/64 = 1/512   (closest to zero)
0 0000 010  -6  2/8 * 1/64 = 2/512
…
0 0000 110  -6  6/8 * 1/64 = 6/512
0 0000 111  -6  7/8 * 1/64 = 7/512   (largest denorm)
Normalized numbers:
0 0001 000  -6  8/8 * 1/64 = 8/512   (smallest norm)
0 0001 001  -6  9/8 * 1/64 = 9/512
…
0 0110 110  -1  14/8 * 1/2 = 14/16
0 0110 111  -1  15/8 * 1/2 = 15/16   (closest to 1 below)
0 0111 000   0   8/8 * 1   = 1
0 0111 001   0   9/8 * 1   = 9/8     (closest to 1 above)
0 0111 010   0  10/8 * 1   = 10/8
…
0 1110 110   7  14/8 * 128 = 224
0 1110 111   7  15/8 * 128 = 240     (largest norm)
0 1111 000  n/a  inf
Tiny Floating Point Example #3
6-bit IEEE-like format
▪ e = 3 exponent bits
▪ f = 2 fraction bits
▪ Bias is 2^(3−1) − 1 = 3
s exp frac
1 3-bits 2-bits
Notice how the distribution gets denser toward zero.
[Number line from −15 to +15: Denormalized (8 values near zero), Normalized, Infinity]
Distribution of Values (close-up view)
6-bit IEEE-like format
▪ e = 3 exponent bits
▪ f = 2 fraction bits
▪ Bias is 3
s exp frac
1 3-bits 2-bits
[Number line from −1 to +1: Denormalized, Normalized, Infinity]