FLOATING POINT REPRESENTATION
COMPUTER SCIENCE 9618 PAPER 3
Floating point representation is a method used in computers to represent real numbers, for example:
14.25 15.75 192.3125 -14.625
Mantissa And Exponent
In maths we use decimal (denary) numbers, and when we need to write a very large or very small number we use the concept of exponents.
For example :
3.45 x 10 ^ 12
In the example above, 3.45 is the mantissa and 12 is the exponent. The base is 10 because decimal numbers use base 10.
Mantissa : The significant digits of the floating point number (FPN)
Exponent : The power the base is raised to. Because we are dealing with binary numbers, the base will be 2.
Note : We will be using a memorized list of place values which will help us in solving the questions.
(You are supposed to memorize it properly.)
128 64 32 16 8 4 2 1 . 1/2 1/4 1/8 1/16 1/32
128 64 32 16 8 4 2 1 . 0.5 0.25 0.125 0.0625 0.03125
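Each bit in a binary number represents a denary place value, which helps when converting denary numbers to binary. To the left of the binary point each bit is an increasing power of 2, starting from 1 at the rightmost bit. To the right of the point the values halve each time (1/2, 1/4, 1/8 ...).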
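As a quick check of the memorized list, here is a minimal sketch that adds up the place values of every 1 in a pattern. The function name fixed_point_to_denary is made up for illustration, and it assumes an unsigned pattern written with a single binary point.

```python
def fixed_point_to_denary(bits: str) -> float:
    """Add up the place values of every 1 in an unsigned pattern such as '1110.11'."""
    whole, _, fraction = bits.partition(".")
    total = 0.0
    # Whole part: place values 1, 2, 4, 8 ... counted from the rightmost bit.
    for power, bit in enumerate(reversed(whole)):
        if bit == "1":
            total += 2 ** power
    # Fractional part: place values 1/2, 1/4, 1/8 ... counted from the binary point.
    for position, bit in enumerate(fraction, start=1):
        if bit == "1":
            total += 2 ** -position
    return total


print(fixed_point_to_denary("1110.11"))   # 14.75
print([2 ** -n for n in range(1, 6)])     # [0.5, 0.25, 0.125, 0.0625, 0.03125]
```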
Question Example For Reference
Certain things will be given in your question: the number of bits for the mantissa and the number of bits for the exponent.
Case 1 : Positive Decimal Number To Binary
Conversion
Question : You have 12 bits for Mantissa and 4 bits for
Exponent. Represent 14.75 in Floating Point
Representation
Step 1 : Write down the memorized list
128 64 32 16 8 4 2 1 . 0.5 0.25 0.125 0.0625 0.03125
Step 2 : Put a 1 under the numbers which add up to the value in the question, and a 0 under any number in between
8 4 2 1 . 0.5 0.25
1 1 1 0 . 1 1
8 + 4 + 2 + 0.5 + 0.25 = 14.75, so 14.75 in binary is 1110.11
This representation is known as Fixed Point Representation
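Steps 1 and 2 can be sketched in code as follows. This is only an illustration under stated assumptions: the name denary_to_fixed_point and the fraction_bits parameter are hypothetical, and the value must be non-negative.

```python
def denary_to_fixed_point(value: float, fraction_bits: int = 5) -> str:
    """Unsigned fixed point pattern for a non-negative denary value."""
    whole = int(value)
    remainder = value - whole
    whole_bits = bin(whole)[2:]        # e.g. 14 -> '1110'
    fraction = ""
    place = 0.5                        # first place value after the binary point
    for _ in range(fraction_bits):
        if remainder >= place:         # this place value fits, so put a 1 under it
            fraction += "1"
            remainder -= place
        else:
            fraction += "0"
        place /= 2                     # next place value: 0.25, 0.125 ...
    return whole_bits + "." + fraction


print(denary_to_fixed_point(14.75, fraction_bits=2))   # 1110.11
```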
Step 3 : Move the binary point all the way to the left, leaving a 0 in front of it as the sign bit, and write down the exponent value.
1 1 1 0 . 1 1
0 . 1 1 1 0 1 1
As we moved the point 4 places to the left, the exponent value will be 4. If we move the point to the left the exponent increases, and if we move the point to the right the exponent decreases.
Step 4 : Ignore the binary point and convert the exponent into binary form
0 1 1 1 0 1 1
The value of the exponent is 4, so in binary it will be represented as 0100
8 4 2 1
0 1 0 0
At this stage we have the mantissa value and the exponent value
0 . 1 1 1 0 1 1 x 2 ^ 4
The binary point is always between the first two bits of the mantissa, and the base is always 2 for binary numbers, so we don’t write the point or the base in our final answer.
Mantissa Exponent
0 1 1 1 0 1 1 | 0 1 0 0
Step 5 : Store them in the given spaces
Note : Always store the mantissa from the left side and always store the exponent from the right side. The empty boxes will be 0.
Mantissa Exponent
0 1 1 1 0 1 1 0 0 0 0 0 | 0 1 0 0
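The whole of Case 1 can be sketched as one function. This is a minimal illustration, not an official method: the name encode_positive and its parameters are assumptions, it only handles positive values, and it stores the exponent as a fixed-width two's complement pattern.

```python
def encode_positive(value: float, mantissa_bits: int = 12, exponent_bits: int = 4) -> str:
    """Return 'mantissa exponent' for a positive denary value."""
    if value <= 0:
        raise ValueError("this sketch only handles positive values")
    exponent = 0
    # Move the binary point until the value has the form 0.1xxx (0.5 <= value < 1).
    while value >= 1:
        value /= 2
        exponent += 1          # point moved left, exponent increases
    while value < 0.5:
        value *= 2
        exponent -= 1          # point moved right, exponent decreases
    if not -(2 ** (exponent_bits - 1)) <= exponent < 2 ** (exponent_bits - 1):
        raise OverflowError("exponent does not fit in the given bits")
    # Read off the mantissa bits; the leading 0 is the sign bit.
    mantissa = "0"
    for _ in range(mantissa_bits - 1):
        value *= 2
        if value >= 1:
            mantissa += "1"
            value -= 1
        else:
            mantissa += "0"
    # A negative exponent wraps round to its two's complement pattern.
    exponent_pattern = format(exponent % (2 ** exponent_bits), f"0{exponent_bits}b")
    return mantissa + " " + exponent_pattern


print(encode_positive(14.75))   # 011101100000 0100
```

Any bits of the value that do not fit in the mantissa are simply dropped here, which is one way rounding error can appear.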
Exam Style Question
Case 2 : Negative Decimal Number To Binary
Conversion
Question : You have 12 bits for Mantissa and 4 bits for
Exponent. Represent -14.75 in Floating Point Representation
Step 1 : Ignore the negative sign and follow the steps of Case 1 up to Step 4.
8 4 2 1 . 0.5 0.25
1 1 1 0 . 1 1
Mantissa Exponent
0 1 1 1 0 1 1 | 0 1 0 0
Step 2 : The exponent remains the same. Apply the two’s complement method to the mantissa only, then store it.
Two’s Complement
Two's complement is a mathematical operation on
binary numbers, and it is the most common method
of representing signed integers in computers.
Steps to Find Two's Complement
1. Invert All Bits: Change all 0s to 1s and all 1s to 0s.
2. Add One: Add 1 to the inverted binary number.
Mantissa Exponent
0 1 1 1 0 1 1 | 0 1 0 0
Mantissa            0 1 1 1 0 1 1
One’s Complement    1 0 0 0 1 0 0
Add 1             +             1
Two’s Complement    1 0 0 0 1 0 1
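The invert-and-add-one steps can be checked with a short sketch. The function name twos_complement is chosen here for illustration, and the pattern is assumed to be a string of 0s and 1s whose length must be kept.

```python
def twos_complement(bits: str) -> str:
    """Invert every bit, then add 1, keeping the same number of bits."""
    inverted = "".join("1" if b == "0" else "0" for b in bits)   # one's complement
    total = (int(inverted, 2) + 1) % (2 ** len(bits))            # add 1, drop any carry out
    return format(total, f"0{len(bits)}b")


print(twos_complement("0111011"))        # 1000101
print(twos_complement("011101100000"))   # 100010100000
```

Applying the function twice returns the original pattern, which is why the same steps are used when converting back to denary.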
Binary Addition
Addition Sum Carry
0+0 0 0
0+1 1 0
1+0 1 0
1+1 0 1
1+1+1 1 1
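The addition table can be turned into a small adder that works from the rightmost bit and passes the carry along. This is a sketch only: add_bits and add_binary are hypothetical names, and the final carry is dropped because the result keeps a fixed width.

```python
def add_bits(a: int, b: int, carry_in: int = 0) -> tuple[int, int]:
    """One row of the table: returns (sum, carry)."""
    total = a + b + carry_in
    return total % 2, total // 2


def add_binary(x: str, y: str) -> str:
    """Add two bit patterns, keeping the width of the longer one."""
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)
    carry = 0
    result = ""
    for a, b in zip(reversed(x), reversed(y)):   # start from the rightmost bit
        bit, carry = add_bits(int(a), int(b), carry)
        result = str(bit) + result
    return result                                 # any final carry is discarded


print(add_bits(1, 1, 1))            # (1, 1)
print(add_binary("1000100", "1"))   # 1000101
```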
Mantissa Exponent
1 0 0 0 1 0 1 | 0 1 0 0
Stored in 12 bits for the mantissa and 4 bits for the exponent:
Mantissa Exponent
1 0 0 0 1 0 1 0 0 0 0 0 | 0 1 0 0
Exam Style Question
Case 3 : Positive Binary Number to Denary
Note : You are just supposed to reverse the steps from Case 1. Calculate the exponent value, move the binary point back to its original position and use the memorized list to find the denary value.
Mantissa Exponent
0 1 1 1 0 1 1 0 0 0 0 0 | 0 1 0 0
Exponent
8 4 2 1
0 1 0 0 = 4
Mantissa
0 . 1 1 1 0 1 1 0 0 0 0 0
Move the point 4 places to the right:
8 4 2 1 . 0.5 0.25
1 1 1 0 . 1 1
So after adding all the values with a 1 under them (8 + 4 + 2 + 0.5 + 0.25) we get 14.75
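Reversing the steps can be sketched like this. The function name decode_positive is an assumption, and the sketch only covers a positive mantissa with a positive exponent, as in this case.

```python
def decode_positive(mantissa: str, exponent: str) -> float:
    """Denary value of a positive mantissa (0.xxxx) with a positive exponent."""
    shift = int(exponent, 2)                    # e.g. 0100 -> 4
    digits = mantissa[1:].ljust(shift, "0")     # bits after the binary point
    # Moving the point `shift` places right splits the bits into whole and fraction.
    whole, fraction = digits[:shift], digits[shift:]
    value = float(int(whole, 2)) if whole else 0.0
    for position, bit in enumerate(fraction, start=1):
        if bit == "1":
            value += 2 ** -position             # 0.5, 0.25, 0.125 ...
    return value


print(decode_positive("011101100000", "0100"))   # 14.75
```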
Exam Style Question
Case 4 : Negative Binary Number to Denary
Note : You are just supposed to reverse the steps from Case 2. Calculate the exponent value, apply two’s complement to the mantissa, then move the binary point according to the exponent value and use the memorized list to find the denary value.
Mantissa Exponent
1 0 0 0 1 0 1 0 0 0 0 0 | 0 1 0 0
Mantissa            1 0 0 0 1 0 1
One’s Complement    0 1 1 1 0 1 0
Add 1             +             1
Two’s Complement    0 1 1 1 0 1 1
The exponent 0 1 0 0 is 4, so move the point in 0 . 1 1 1 0 1 1 four places to the right:
8 4 2 1 . 0.5 0.25
1 1 1 0 . 1 1
So after adding all the values with a 1 under them we get 14.75, and as the mantissa was negative the answer is -14.75
Exam Style Question
Case 5 : Negative Exponent Binary Conversion
Mantissa Exponent
1 0 1 1 0 0 0 0 | 1 1 1 0
In this question both the mantissa and the exponent are negative, as they both start with 1. So we need to apply two’s complement to both and then calculate.
Two’s Complement of the Exponent
Exponent            1 1 1 0
One’s Complement    0 0 0 1
Add 1             +       1
Two’s Complement    0 0 1 0 = 2
So the exponent is -2
Two’s Complement of the Mantissa
Mantissa            1 0 1 1 0 0 0 0
One’s Complement    0 1 0 0 1 1 1 1
Add 1             +               1
Two’s Complement    0 1 0 1 0 0 0 0
Mantissa Exponent
0 . 1 0 1 0 0 0 0   -2
As the exponent is negative, move the point two places to the left:
0.5 0.25 0.125 0.0625 0.03125
0 . 0 0 1 0 1
Adding the values with a 1 under them gives 0.125 + 0.03125 = 0.15625, and as the mantissa was negative the answer is -0.15625
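As a cross-check for Cases 4 and 5, here is a sketch that reads both fields directly as two's complement numbers instead of negating them first. It is a different route to the same value, and the names signed_value and decode are assumptions. Note that a mantissa starting with 1 gives a negative result.

```python
def signed_value(bits: str) -> int:
    """Two's complement integer value of a bit pattern."""
    value = int(bits, 2)
    if bits[0] == "1":                 # a leading 1 means the number is negative
        value -= 2 ** len(bits)
    return value


def decode(mantissa: str, exponent: str) -> float:
    """Denary value of mantissa x 2 ^ exponent, both in two's complement."""
    # The binary point sits after the first mantissa bit, so scale the
    # signed integer down by 2 ^ (number of mantissa bits - 1).
    m = signed_value(mantissa) / 2 ** (len(mantissa) - 1)
    return m * 2.0 ** signed_value(exponent)


print(decode("10110000", "1110"))        # -0.15625
print(decode("100010100000", "0100"))    # -14.75
print(decode("011101100000", "0100"))    # 14.75
```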
Normalization
Normalization is a technique used to store a floating point number with the maximum precision for the given number of bits, by removing redundant leading bits from the mantissa.
A normalized mantissa starts with:
0.1 Represents positive number
1.0 Represents negative number
How will we figure out that a binary representation is normalized?
The first and second bits of the mantissa should never be the same.
Mantissa Exponent
0 0 1 1 0 1 1 1 0 1 0 1
Here the mantissa starts with 0 0, so this representation is not normalized.
The same idea is used in denary. Take 0.356 x 10 ^ 6. To write it in standard form you use one power from the exponent, and the final answer is 3.56 x 10 ^ 5. You are supposed to use the same technique to normalize a floating point representation.
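The "first two bits must differ" rule can be sketched as a shift loop. This is only an illustration: the name normalize is hypothetical, the exponent is passed as a denary integer, and the example patterns are made up rather than taken from a question.

```python
def normalize(mantissa: str, exponent: int) -> tuple[str, int]:
    """Shift the mantissa left until its first two bits differ."""
    if "1" not in mantissa:
        return mantissa, exponent          # zero cannot be normalized
    while mantissa[0] == mantissa[1]:
        mantissa = mantissa[1:] + "0"      # drop one redundant leading bit
        exponent -= 1                      # each left shift uses one power from the exponent
    return mantissa, exponent


print(normalize("00110111", 5))   # ('01101110', 4)
print(normalize("11100000", 0))   # ('10000000', -2)
```

The value is unchanged in both examples. For instance 1.1100000 x 2 ^ 0 and 1.0000000 x 2 ^ -2 are both -0.25.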
Exam Style Question
What problems could occur if the binary representation is not normalized?
Multiple representations of a single number
Precision is lost
Redundant leading zeros in the mantissa
What are the problems in floating point representation?
Numbers such as 0.1, 0.2 and 0.4 cannot be represented exactly in binary. The solution to this problem is rounding, which causes rounding errors.
For example, 0.2 may be stored as a value slightly greater than 0.2 and 0.4 as a value slightly greater than 0.4, so after calculating with these rounded numbers the difference increases and can become significant.
Why does rounding error occur?
Because some denary numbers have no exact binary representation in the available number of bits.
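Rounding error is easy to see in practice. The sketch below uses Python's built-in float, which is a 64-bit format and not the 16-bit layout used in these questions, so it only illustrates the idea.

```python
from decimal import Decimal

stored = Decimal(0.2)            # the exact value actually held for 0.2
print(stored)                    # slightly greater than 0.2
print(stored > Decimal("0.2"))   # True

# Errors from rounded values show up after arithmetic.
print(0.1 + 0.2 == 0.3)          # False

# Values built only from place values (0.5, 0.25 ...) are stored exactly.
print(0.5 + 0.25 == 0.75)        # True
```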
Explain the reason why binary numbers are stored in normalized form.
Normalization minimizes the number of leading zeros
Maximizes the precision of the number for the given number of bits
Enables very large or very small numbers to be stored with accuracy
Avoids the possibility of many numbers having multiple representations
Trade Off Between Mantissa And Exponent
A trade-off means that gaining more of one thing costs you some of another.
Mantissa Exponent
0 1 1 1 0 1 1 0 0 0 0 0 | 0 1 0 0
Precision Range
12 bits for mantissa --> 8 bits for mantissa
If we reduce the bits for the mantissa from 12 to 8, that means less precision.
4 bits for exponent --> 8 bits for exponent
If we increase the bits for the exponent from 4 to 8, that means better range.
Question : What is the trade-off between mantissa and exponent?
The trade-off is between precision and range.
If more bits are used for the mantissa, that means better precision.
If more bits are used for the exponent, that means better range.
For a fixed total number of bits, more bits for the mantissa means fewer bits for the exponent.
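To put numbers on the trade-off, this sketch compares the two 16-bit splits mentioned above. The function name precision_and_range is made up, and it assumes a two's complement mantissa with the binary point after the first bit and a two's complement exponent.

```python
def precision_and_range(mantissa_bits: int, exponent_bits: int) -> tuple[float, int]:
    """Return (value of the last mantissa bit, largest exponent)."""
    smallest_step = 2.0 ** -(mantissa_bits - 1)        # smaller step = better precision
    largest_exponent = 2 ** (exponent_bits - 1) - 1    # bigger exponent = better range
    return smallest_step, largest_exponent


print(precision_and_range(12, 4))   # (0.00048828125, 7)
print(precision_and_range(8, 8))    # (0.0078125, 127)
```

Moving 4 bits from the mantissa to the exponent makes the smallest step 16 times coarser, but raises the largest exponent from 7 to 127.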
Largest Positive Number And Smallest Positive Number In A Given Scenario
You have to make sure the answer is normalized, so we are going to use a key which can help us in solving questions regarding the largest or smallest number.
Largest Positive
Mantissa : 0.1 and the rest of the bits should be 1, as we need to make the largest number.
Exponent : 0 first and the rest of the bits should also be 1, as we need the largest positive exponent, which will give us the largest number.
Mantissa Exponent
0 1 1 1 1 1 1 1 1 1 1 1 | 0 1 1 1
Smallest Positive
Mantissa : 0.1 and the rest of the bits should be 0, as we need to make the smallest number, so the mantissa should be the smallest.
Exponent : 1 first and the rest of the bits should be 0, as we need the largest negative exponent, which will move our binary point to the left and give us the smallest number.
Mantissa Exponent
0 1 0 0 0 0 0 0 0 0 0 0 | 1 0 0 0
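The key for the largest and smallest positive normalized numbers can be sketched as below. The function names are hypothetical, and the denary values assume two's complement for both the mantissa and the exponent.

```python
def largest_positive(mantissa_bits: int, exponent_bits: int) -> tuple[str, str, float]:
    mantissa = "0" + "1" * (mantissa_bits - 1)          # 0.1111...
    exponent = "0" + "1" * (exponent_bits - 1)          # largest positive exponent
    value = (1 - 2.0 ** -(mantissa_bits - 1)) * 2.0 ** (2 ** (exponent_bits - 1) - 1)
    return mantissa, exponent, value


def smallest_positive(mantissa_bits: int, exponent_bits: int) -> tuple[str, str, float]:
    mantissa = "01" + "0" * (mantissa_bits - 2)         # 0.1000... = 0.5
    exponent = "1" + "0" * (exponent_bits - 1)          # most negative exponent
    value = 0.5 * 2.0 ** -(2 ** (exponent_bits - 1))
    return mantissa, exponent, value


print(largest_positive(12, 4))    # ('011111111111', '0111', 127.9375)
print(smallest_positive(12, 4))   # ('010000000000', '1000', 0.001953125)
```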
Overflow : Less space, more bits to store
Underflow : More space, fewer bits to store
Question : State when an overflow error occurs in floating point representation.
An overflow error occurs when, following an arithmetic operation, the number produced exceeds the maximum value that can be stored in the mantissa and exponent. This could occur when dividing by a very small number.
Question : State when an underflow error occurs in floating point representation.
An underflow error occurs when, following an arithmetic operation, the result is smaller than the smallest number that can be stored in the mantissa and exponent. This could occur when dividing by a very large number.
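Both situations can be seen with Python's built-in float. This is only an illustration: Python uses a 64-bit format rather than the 16-bit layout in these notes, and for division it returns inf or 0.0 instead of reporting an error.

```python
import math

# Dividing a large number by a very small number: the result is too big to store.
overflowed = 1e308 / 1e-10
print(overflowed)                # inf
print(math.isinf(overflowed))    # True

# Dividing a small number by a very large number: the result is too close to zero.
underflowed = 1e-308 / 1e100
print(underflowed)               # 0.0
```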
Question : A system uses 10 bits for the mantissa and 6 bits for the exponent. The denary number 513 cannot be stored accurately as a normalized floating-point number in this system (3)
Answer :
513 in binary is 1000000001. As a positive normalized mantissa this is 0.1000000001, so it requires 11 bits to store accurately. Only 10 bits are available for the mantissa, which results in overflow.
Question : Describe an alteration to the way floating-point numbers are stored to enable this number to be stored accurately using the same total number of bits (2)
Answer : The number of bits for the mantissa must be increased: 11 bits for the mantissa and 5 bits for the exponent.
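The bit count in this answer can be checked with a short sketch. The helper name mantissa_bits_needed is made up, and it assumes a positive whole number stored with one extra sign bit. Trailing zeros are not counted because the exponent can absorb them.

```python
def mantissa_bits_needed(n: int) -> int:
    """Mantissa bits needed to hold a positive whole number exactly."""
    trailing_zeros = (n & -n).bit_length() - 1     # zeros at the right-hand end
    significant = n.bit_length() - trailing_zeros  # bits from the first 1 to the last 1
    return significant + 1                         # plus the sign bit 0


print(bin(513))                    # 0b1000000001
print(mantissa_bits_needed(513))   # 11
print(mantissa_bits_needed(511))   # 10
```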
Question : Explain why a binary representation is
sometimes only an approximation to the real number it
represents.
Answer :
Real numbers can have a fractional part (such as 0.4 and 0.25)
The fixed length of the storage means that you can’t store a very large number or a very small number exactly
Only a limited set of fractional place values is available (0.5, 0.25, 0.125 ...)
It isn’t possible to store all fractions with the level of precision provided by the system
The fractional part stored is as close as possible to the real value within the number of bits given
Floating Point Representation
Questions 1 to 25
Answers 1 to 25