Math 332: Intro to Numerical Analysis
Chapter 2: Error and Computer Arithmetic
Goals
▪ Understand how numbers are represented inside
computers
▪ Determine different sources of computational errors
What is the value of n after the while loop?
x = 0.0; n = 0;
while x ~= 1.0    % loop while x is not equal to 1.0
    x = x + 0.1;
    n = n + 1;
end
2.1 Floating-point numbers
Floating-point format for decimal numbers
Each decimal number is represented in a UNIQUE way as:
$x = \sigma \cdot \bar{x} \cdot 10^e$
where
$\sigma = \pm 1$ (the sign);
$e$: an integer (the exponent);
$1 \le \bar{x} < 10$ (the significand, or mantissa).
▪ The ability to store a decimal number depends on how
many bits are used to store the exponent and significand.
Binary format
▪ Each decimal number is represented in a UNIQUE way as:
$x = \sigma \cdot \bar{x} \cdot 2^e$
where
$\sigma = \pm 1$;
$e$: an integer;
$(1)_2 \le \bar{x} < (10)_2$: a binary fraction.
▪ Note: $(10)_2 = 2$ in the decimal representation.
▪ Note: the leading digit of $\bar{x}$ is always 1 EXCEPT when $x$ is zero, so there is no need to store this digit.
Examples
Write the following numbers in the floating-point
representation:
1) 124.62
2) -0.0245
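For instance (worked out here; the slide leaves these as exercises):
1) $124.62 = +1 \cdot 1.2462 \cdot 10^{2}$
2) $-0.0245 = -1 \cdot 2.45 \cdot 10^{-2}$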
Examples
Write the following numbers in the binary format:
1) $x = (11011.0111)_2$
2) $x = -(110.11001110011)_2$
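Worked out here for illustration (the slide leaves these as exercises): shifting the binary point gives
1) $(11011.0111)_2 = +1 \cdot (1.10110111)_2 \cdot 2^{4}$
2) $-(110.11001110011)_2 = -1 \cdot (1.1011001110011)_2 \cdot 2^{2}$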
IEEE floating-point formats for decimal numbers
$x = \sigma \cdot \bar{x} \cdot 2^e$
▪ Single-precision format (32 bits): 1 sign bit, 8 bits for the exponent, 23 bits for the significand.
▪ Double-precision format (64 bits): 1 sign bit, 11 bits for the exponent, 52 bits for the significand.
Accuracy of floating-point representation
▪ The accuracy of floating-point representation is defined as
the difference between 1 and the next larger number that
can be stored in that format.
▪ In single precision (23 binary digits after the binary point in the significand), the next number larger than 1 is
$(1.00000000000000000000001)_2 = 1 + 2^{-23}$.
So the precision is $2^{-23} \approx 1.19 \times 10^{-7}$ (about 7 significant digits in the decimal format).
Accuracy of floating-point representation
▪ In double precision, the number next to 1 is
$(1.000\ldots0001)_2 = 1 + 2^{-52}$ (52 binary digits after the binary point).
▪ There are 52 bits to store the significand, so the precision is
$2^{-52} \approx 2.22 \times 10^{-16}$
(about 16 significant digits in the decimal format).
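A quick check of these values in MATLAB (a small sketch, not part of the original slides; eps is MATLAB's built-in spacing-of-floats function):

eps('single')    % 2^(-23), about 1.1921e-07: single-precision spacing at 1
eps              % 2^(-52), about 2.2204e-16: double-precision spacing at 1
1 + eps/2 == 1   % true: eps/2 is too small to change 1 in double precision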
Rounding and Chopping error
▪ Recall the program:
x = 0.0; n = 0;
while x ~= 1.0    % loop while x is not equal to 1.0
    x = x + 0.1;
    n = n + 1;
end
▪ A number with more than the maximum number of digits in
the significand will be truncated by rounding or chopping.
▪ Rounding is more accurate than chopping.
▪ Why did the above algorithm not stop after 10 iterations?
$0.1 = (0.000110011001100110011\ldots)_2$ in binary format, i.e.
$0.1 = (1.1001\;1001\;1001\;1001\;1001\;1001\;1001\ldots)_2 \cdot 2^{-4}$
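Because 0.1 cannot be stored exactly, each added 0.1 carries a small rounding error, so the running sum never equals 1.0 exactly and the loop never terminates. A minimal check (a sketch, not from the original slides):

x = 0.0;
for k = 1:10
    x = x + 0.1;   % each step adds a slightly rounded 0.1
end
x == 1.0           % logical 0 (false): the sum is not exactly 1
x - 1.0            % tiny nonzero residual, on the order of 1e-16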
2.2 Errors: definitions and sources
Absolute and relative errors
▪ Suppose we approximate a number 𝑥𝑇 by a number 𝑥𝐴 .
▪ Absolute error = the absolute value of (true value − approximate value):
$\text{absolute error} = |x_T - x_A|$
▪ Relative error = the absolute error divided by the absolute value of the true value:
$\text{relative error} = \dfrac{|x_T - x_A|}{|x_T|}$
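For instance (a small illustration, not from the original slides), approximating $x_T = \pi$ by $x_A = 3.14$:

xT = pi;  xA = 3.14;
abs_err = abs(xT - xA)            % about 1.59e-03
rel_err = abs(xT - xA)/abs(xT)    % about 5.07e-04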
Sources of errors
▪ Modeling errors: mathematical models are used to
approximate real-world problems.
▪ Physical measurement errors: Physical quantities are
measured by instruments with errors.
▪ Machine representation error (rounding/chopping)
▪ Mathematical approximation error: solving equations,
approximating integrals, derivatives, etc. This is the focus
of this course.
▪ Programming/computational mistakes.
Loss of significance errors
▪ When subtracting numbers that are almost equal to each other, significant digits may be lost, which can cause large errors.
▪ Example: calculate the following function near x = 0:
$f(x) = x^2 + 1 - 1$
(take x = 1e-8, 1e-9)
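What happens in double precision (a sketch assuming the formula is evaluated literally as written above; not part of the original slides):

x = 1e-8;
f_naive = x^2 + 1 - 1    % returns 0: x^2 = 1e-16 is lost when added to 1
f_exact = x^2            % 1e-16, the mathematically equivalent value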
Loss of significance error
▪ Example: Consider the function
$f(x) = \dfrac{1 - \cos(x)}{x^2}$
What should be a “good” approximate value of the above
function near x = 0?
Using MATLAB and keeping only 8 decimal places of cos(x), the computed values of f(x) lose accuracy as x approaches 0, because 1 − cos(x) suffers cancellation.
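Full double precision runs into the same problem for small enough x (a sketch, not from the slides):

x = 1e-8;
f_naive = (1 - cos(x))/x^2    % returns 0: cos(1e-8) rounds to exactly 1
% the true value of f(x) near 0 is 1/2 (see the Taylor expansion below)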
Using Taylor polynomials to avoid loss of
significance error
▪ Example: Consider the function
$f(x) = \dfrac{1 - \cos(x)}{x^2}$
Use Taylor polynomials to approximate f(x) near x = 0
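One way to carry this out (a standard expansion, written out here for completeness): since
$\cos x = 1 - \dfrac{x^2}{2} + \dfrac{x^4}{24} - \dfrac{x^6}{720} + \cdots$,
we get
$f(x) = \dfrac{1 - \cos x}{x^2} = \dfrac{1}{2} - \dfrac{x^2}{24} + \dfrac{x^4}{720} - \cdots$,
so near $x = 0$ the value $f(x) \approx 1/2$ is obtained with no subtraction of nearly equal numbers.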
Taylor polynomials may also cause errors
▪ When Taylor polynomials are used to approximate small function values at large values of the variable, they may also cause large errors.
▪ Example: Use Taylor polynomials of $e^x$ centered at $x = 0$ to approximate
$e^{-5}$
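A sketch of the experiment (assuming double-precision MATLAB; the choice of n = 40 terms is an illustration, not from the slides, made large enough that truncation error is negligible):

n = 40;
k = 0:n;
direct = sum((-5).^k ./ factorial(k));      % alternating terms as large as ~26 nearly cancel
stable = 1 / sum(5.^k ./ factorial(k));     % uses e^(-5) = 1/e^5: all terms positive
exact  = exp(-5);
[abs(direct - exact), abs(stable - exact)]  % the direct alternating sum is typically less accurate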
Noise in function evaluation
▪ Look at the graph of the function $f(x) = (x-1)^3$ near $x = 1$:
▪ When the function value is near zero, we may see “noise” in
its values.
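A way to reproduce such a picture (a hypothetical sketch, not the slide's own code; the expanded form $x^3 - 3x^2 + 3x - 1$ is algebraically identical to $(x-1)^3$ but suffers cancellation):

x = linspace(1 - 2e-5, 1 + 2e-5, 401);
f_expanded = x.^3 - 3*x.^2 + 3*x - 1;   % cancellation of O(1) terms leaves rounding noise
f_factored = (x - 1).^3;                % smooth reference curve
plot(x, f_expanded, '.', x, f_factored, '-')
legend('expanded form', 'factored form')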
Propagation of errors
▪ Errors in one step of computation can be carried over to
the next steps.
▪ Each arithmetic operation (+, −, ×, ÷) propagates errors.
▪ Functions also propagate errors.
Estimating approximation errors
Example: Estimate a bound of the error and relative error
when ln(𝑒) is approximated by ln 2.71 .
One way to bound the error is to use the Mean Value Theorem:
$f(x) - f(a) = f'(c)\,(x - a)$
for some $c$ between $x$ and $a$.
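Carried out for this example (worked here for illustration; the slide leaves it as an exercise): with $f(x) = \ln x$, $x = e$, $a = 2.71$,
$|\ln e - \ln 2.71| = \dfrac{1}{c}\,|e - 2.71|$ for some $c$ in $(2.71, e)$.
Since $c > 2.71$,
$|\ln e - \ln 2.71| \le \dfrac{e - 2.71}{2.71} \approx \dfrac{0.0083}{2.71} \approx 3.1 \times 10^{-3}$,
and since the true value is $\ln e = 1$, the relative error has the same bound, about $3.1 \times 10^{-3}$.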
Estimating the error bound
Example: Without computing the exact value of $\cos\sqrt{2}$, estimate a bound of the error and the relative error when it is approximated by $\cos(1.414)$.
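One possible route (worked here for illustration, reading the exact argument as $\sqrt{2} \approx 1.414$): by the Mean Value Theorem with $f(x) = \cos x$,
$|\cos\sqrt{2} - \cos(1.414)| = |\sin c|\,|\sqrt{2} - 1.414| \le |\sqrt{2} - 1.414| \approx 2.14 \times 10^{-4}$,
since $|\sin c| \le 1$. Using $\cos(1.414) \approx 0.156$ as a stand-in for the true value, the relative error is roughly $2.14 \times 10^{-4} / 0.156 \approx 1.4 \times 10^{-3}$.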
Summary
Ways to avoid loss of significance errors:
▪ Avoid subtracting two numbers that are close to each other, or dividing one small number by another small number.
▪ Avoid subtracting large numbers that nearly cancel (as in the alternating Taylor series for $e^{-5}$).
▪ Avoid evaluating a function at points so close together that rounding noise dominates its values.