Introduction to Numerical Methods
T. Gambill
http://courses.engr.illinois.edu/cs357/su2013/
Grades:
Compass quizzes 150 points maximum
MPs 150 points maximum
Midterm Exam 150 points
Final Exam 250 points
Homework:
no dropped scores
May discuss MPs with TAs or other students, but do not copy!
- copying partial solutions is still copying
- see departmental policy re: cheating (https://agora.cs.illinois.edu/display/undergradProg/Honor+Code)
Questions?
Definition
Numerical Analysis - The study of algorithms (methods) for problems involving
quantities that take on continuous (as opposed to discrete) values.
Example (analytic)    Example (numerical)
1/4                   0.25
1/3                   0.33333 . . . (?)
π                     3.14159 . . . (?)
tan(83)               0.88472 . . . (?)
Is it a “good” method?
Is it a robust (stable) algorithm?
Is it a fast implementation?
Accuracy Cost
Definition (Trefethen)
Study of algorithms for the problems of continuous mathematics
Approximates π with
$(8/9)^2 \cdot 4 \approx 3.1605$
$\pi = 16 \arctan(1/5) - 4 \arctan(1/239)$
Led to calculation of the first 100 digits of π
Uses the Taylor series of arctan in the algorithm
$\arctan(x) = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \dots$
Used until 1973 to find the first million digits
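The arctan formula above (commonly known as Machin's formula) can be tried out directly with the Taylor series. The following is a minimal sketch, not the historical procedure: the function names `arctan_taylor` and `machin_pi` and the choice of 12 terms are assumptions made for illustration, and exact rational arithmetic is used so that only the truncation of the series contributes error.

```python
from fractions import Fraction
import math

def arctan_taylor(x, terms):
    """Partial sum of x - x^3/3 + x^5/5 - ... with `terms` terms (exact rationals)."""
    total = Fraction(0)
    power = x                      # holds x^(2k+1)
    for k in range(terms):
        term = power / (2 * k + 1)
        total += term if k % 2 == 0 else -term
        power *= x * x
    return total

def machin_pi(terms):
    """Approximate pi as 16*arctan(1/5) - 4*arctan(1/239)."""
    return (16 * arctan_taylor(Fraction(1, 5), terms)
            - 4 * arctan_taylor(Fraction(1, 239), terms))

pi_approx = machin_pi(12)
print(float(pi_approx), math.pi)
```

Because 1/5 and 1/239 are small, each extra term of the series adds more than one correct decimal digit, which is what makes this formula practical for hand and early machine computation.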
Numerical focus:
Approximation: An approximate solution is sought. How close is this to the
desired solution?
Efficiency: How fast and cheap (memory) can we compute a solution?
Stability: Is the solution sensitive to small variations in the problem setup?
Error: What is the role of finite precision of our computers?
Why?
Numerical methods improve scientific simulation
Some disasters attributable to bad numerical computing (Douglas Arnold)
- The Patriot Missile failure, in Dhahran, Saudi Arabia, on February 25, 1991,
  which resulted in 28 deaths, is ultimately attributable to poor handling of
  rounding errors.
- The explosion of the Ariane 5 rocket just after lift-off on its maiden voyage off
  French Guiana, on June 4, 1996, was ultimately the consequence of a
  simple overflow.
- The sinking of the Sleipner A offshore platform in Gandsfjorden near
  Stavanger, Norway, on August 23, 1991, resulted in a loss of nearly one
  billion dollars. It was found to be the result of inaccurate finite element
  analysis.
Sets of Numbers
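Natural Numbers = $\mathbb{N} = \{1, 2, 3, \dots\}$
Integers = $\mathbb{Z} = \{0, \pm 1, \pm 2, \pm 3, \dots\}$
Rationals = $\mathbb{Q} = \{a/b \mid a \in \mathbb{Z},\ b \in \mathbb{Z},\ b \neq 0\}$
Reals = $\mathbb{R} = \{\pm d_n d_{n-1} \dots d_2 d_1 . d_{-1} d_{-2} \dots \mid n \in \mathbb{N} \text{ and } d_j \in \{0, 1, 2, \dots, 9\},\ j = n, n-1, n-2, \dots, 1, -1, -2, \dots\}$
n-tuples of Reals = $\mathbb{R}^n = \{(r_1, r_2, \dots, r_n) \mid r_i \in \mathbb{R} \text{ and } n \in \mathbb{N}\}$
Complex Numbers = $\mathbb{C} = \{(a, b) = a + bi \mid a \in \mathbb{R},\ b \in \mathbb{R},\ i^2 = -1\}$
Extended Reals = $\overline{\mathbb{R}} = \mathbb{R} \cup \{\pm\infty\}$
Interval Numbers = $\mathbb{IR} = \{[a, b] \mid a \leq b,\ a \in \mathbb{R},\ b \in \mathbb{R}\}$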
$0.d_0 d_1$     $e = -1$
$d_0.d_1$       $e = 0$
$d_0 d_1.0$     $e = 1$
$s_1 \; e_1 e_2 \dots e_{11} \; d_1 d_2 \dots d_{52}$
as seen from the picture in memory
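If we write $1.d_1 d_2 \dots d_{52}$ as $1.f$ and let $E$ be the integer stored in the
exponent bits $e_1 e_2 \dots e_{11}$, then the above expression can be written in the
compact form
$(-1)^s \, 2^{E-1023} \, (1.f)$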
"Special" numbers
Special numbers use values of E = 0 or E = 2047.
For example, zero is represented by $E = 0$, $f = 0$
$(-1)^s \, 2^{-1022} \, (0.0)$
bit 63    bits 62 ← 52    bits 51 ← 0
s         0...0           0...0
subnormal (denormalized) numbers by $E = 0$, $f \neq 0$
$(-1)^s \, 2^{-1022} \, (0.f)$
bit 63    bits 62 ← 52    bits 51 ← 0
s         0...0           f
Infinity is represented by $E = 2047$, $f = 0$
$(-1)^s \, 2^{1024} \, (1.0)$
NaN is represented by $E = 2047$, $f \neq 0$
$(-1)^s \, 2^{1024} \, (1.f)$
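The special cases above can be inspected on a real machine by pulling the three bit fields out of a double. This is a small sketch using Python's standard `struct` module; the helper names `decode_double` and `classify` are hypothetical, and the code assumes Python floats are IEEE 754 doubles (true on essentially all current platforms).

```python
import struct

def decode_double(x):
    """Split a float into the fields (s, E, f) of its 64-bit representation."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    s = bits >> 63                    # bit 63: sign
    E = (bits >> 52) & 0x7FF          # bits 62..52: stored exponent
    f = bits & ((1 << 52) - 1)        # bits 51..0: fraction
    return s, E, f

def classify(x):
    """Name the category of x according to the E and f values."""
    s, E, f = decode_double(x)
    if E == 0:
        return "zero" if f == 0 else "subnormal"
    if E == 2047:
        return "infinity" if f == 0 else "NaN"
    return "normal"

for value in [0.0, 5e-324, 1.0, float("inf"), float("nan")]:
    print(value, decode_double(value), classify(value))
```

For example, `decode_double(1.0)` gives a stored exponent of 1023, matching an actual exponent of 0 once the bias is subtracted.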
Overflow/Underflow
computations too close to zero may result in underflow
computations too large may result in overflow
overflow error is considered more severe
underflow can just fall back to 0
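A quick way to see both effects is to push a double just past each end of its range. The sketch below assumes IEEE 754 doubles with the default round-to-nearest mode; the variable names are illustrative only.

```python
import math
import sys

largest = sys.float_info.max            # largest finite double, about 1.8e308
smallest_subnormal = 5e-324             # smallest positive double, 2**-1074

overflowed = largest * 2.0              # too large to represent: becomes inf
underflowed = smallest_subnormal / 2.0  # too small to represent: becomes 0.0

print(overflowed, math.isinf(overflowed))
print(underflowed)
```

The overflowed result has lost all information about its magnitude, while the underflowed result is still a reasonable approximation in absolute terms, which is one way to read the remark that overflow is the more severe of the two.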
T. Gambill (UIUC) CS 357 June 16, 2014 29 / 53
Test your understanding
Example
Effects of spacing of floating point values
first = []
a = 1.0
# halve a until it underflows to 0.0
while a > 0.0:
    a = a / 2.0
    first.append(a)

second = []
a = 1.0
# halve a until adding it to 1.0 no longer changes 1.0
while a + 1.0 > 1.0:
    a = a / 2.0
    second.append(a)

print('len(first) =', len(first))
print('first[-1] =', first[-1])
print('len(second) =', len(second))
print('second[-1] =', second[-1])
$x = 0.b_1 b_2 b_3 b_4 \dots$
$\phantom{x} = b_1 \cdot 2^{-1} + \dots$
$2x = b_1 \cdot 2^{0} + b_2 \cdot 2^{-1} + b_3 \cdot 2^{-2} + \dots$
Example
Compute the binary representation of 0.625
$2 \cdot 0.625 = 1.25 \Rightarrow b_1 = 1$
$2 \cdot 0.25 = 0.5 \Rightarrow b_2 = 0$
$2 \cdot 0.5 = 1.0 \Rightarrow b_3 = 1$
So $(0.625)_{10} = (0.101)_2$
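The doubling procedure in this example translates almost line for line into code. The sketch below is one possible version; the function name `binary_fraction_digits` and the `max_digits` cutoff are assumptions added so that non-terminating expansions such as 0.1 still stop.

```python
def binary_fraction_digits(x, max_digits=20):
    """Return the bits b1, b2, ... of a number 0 <= x < 1 by repeated doubling."""
    bits = []
    while x != 0 and len(bits) < max_digits:
        x *= 2
        bit = int(x)      # the integer part of 2x is the next bit
        bits.append(bit)
        x -= bit          # keep only the fractional part
    return bits

print(binary_fraction_digits(0.625))
print(binary_fraction_digits(0.1, max_digits=10))
```

For 0.625 the loop terminates after three bits, as in the worked example, whereas 0.1 keeps producing bits until the cutoff because it has no finite binary expansion.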
Data Error
Data error can occur when measuring a physical quantity. It can also arise
from a data entry mistake.
Machine Epsilon
The machine epsilon $\epsilon_m$ is the smallest positive machine number such that
$1 + \epsilon_m \neq 1$
>> eps
ans = 2.2204e-16
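The same value can be recovered in Python, either from `sys.float_info.epsilon` or by a halving loop. This is a sketch under the assumption of IEEE 754 doubles with round-to-nearest; the function name `machine_epsilon` is hypothetical.

```python
import sys

def machine_epsilon():
    """Halve eps for as long as 1 + eps/2 is still distinguishable from 1."""
    eps = 1.0
    while 1.0 + eps / 2.0 != 1.0:
        eps /= 2.0
    return eps

print(machine_epsilon())
print(sys.float_info.epsilon)
```

The loop stops at $2^{-52}$, which is the spacing between 1 and the next larger double and agrees with the MATLAB output above.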
Absolute Error($x_a$) = $x_a - x_t$
= approximate value − true value
This doesn’t tell the whole story. For example, if the values are large, like
billions, then an Error of 100 is small. If the values are smaller, say
around 10, then an Error of 100 is large. We need the relative error:
Relative Error($x_a$) = $\frac{x_a - x_t}{x_t}$
= (approximate value − true value) / (true value)
when true value ≠ 0
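The two definitions can be compared on the situation described above, where the same absolute error of 100 is negligible for a value in the billions and large for a value near 10. The function names and sample values in this sketch are illustrative assumptions.

```python
def absolute_error(x_a, x_t):
    """Approximate value minus true value."""
    return x_a - x_t

def relative_error(x_a, x_t):
    """Absolute error divided by the true value (true value must be nonzero)."""
    if x_t == 0:
        raise ValueError("relative error is undefined when the true value is 0")
    return (x_a - x_t) / x_t

for x_t in [2.0e9, 10.0]:
    x_a = x_t + 100.0
    print(x_t, absolute_error(x_a, x_t), relative_error(x_a, x_t))
```

Both cases report an absolute error of 100, but the relative errors differ by many orders of magnitude.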
After some simple algebra on the definition of relative error we can write
$x_a = x_t \, (1 + \text{Relative Error}(x_a))$
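$x = (1.d_1 d_2 d_3 \dots d_{52} \dots)_2 \times 2^e$
$x_- = (1.d_1 d_2 \dots d_{52})_2 \times 2^e$
$x_+ = \left((1.d_1 d_2 \dots d_{52})_2 + 2^{-52}\right) \times 2^e$
Taking $x_-$ to be the nearer of the two neighbors (the case of $x_+$ is symmetric),
$|x_- - x| \leq \frac{|x_+ - x_-|}{2} = 2^{e-53}$
$\left|\frac{x_- - x}{x}\right| \leq \frac{2^{e-53}}{2^e} = 2^{-53} = \epsilon_m / 2$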
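The bound $\epsilon_m / 2 = 2^{-53}$ can be checked exactly with rational arithmetic. In this sketch (hypothetical names, assuming IEEE 754 doubles and values within the normal range) `float(x)` plays the role of rounding to the nearest machine number, and `Fraction` of a float recovers that machine number exactly.

```python
from fractions import Fraction

def relative_rounding_error(x):
    """Exact value of |fl(x) - x| / |x| for a nonzero rational x."""
    fl_x = Fraction(float(x))     # float() rounds to nearest; Fraction(float) is exact
    return abs(fl_x - x) / abs(x)

half_eps = Fraction(1, 2 ** 53)
samples = [Fraction(1, 3), Fraction(1, 10), Fraction(22, 7), Fraction(10 ** 20 + 1)]

for x in samples:
    err = relative_rounding_error(x)
    print(x, float(err), err <= half_eps)
```

Numbers that are already machine numbers, such as 0.625, give a relative rounding error of exactly zero.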
$\frac{|y - x|}{|x|} \leq 3.26 \cdot 10^{-7}$
From the above example we would like to make the following assertions:
If absolute error is less than $0.5 \cdot 10^{-t}$ then there are t equal digits to the
right of the decimal point between y and x, when both numbers are in the
non-scientific notation form,
Example
Given x = 0.00351 and an approximation y = 0.00346 then
$\frac{|y - x|}{|x|} \leq 1.43 \cdot 10^{-2}$
Now x and y agree in three digits to the right of the decimal but the first
assertion says it should be four. However if we rounded (to nearest) the fifth
decimal digits of x and y then the assertion would be true.
If we re-write $x = 0.351 \cdot 10^{-2}$ and $y = 0.346 \cdot 10^{-2}$ these numbers agree only
with one digit but the assertion says it should be two. Again, however, if we
round (to nearest) the third decimal digits of x and y then the assertion would
be true. How can we overcome this discrepancy in our assertions?
Significant Digits
If y is an approximation of $x = 0.d_1 d_2 \dots d_t \dots \cdot 10^e$ and we have
$\frac{|y - x|}{|x|} \leq 5.0 \cdot 10^{-t}$
then the t digits starting in the position > $10^{-t}$ of y are called "significant
digits".
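Under this definition, the number of significant digits is the largest t for which the relative error is at most $5.0 \cdot 10^{-t}$. The sketch below computes that t with a logarithm; the function name is hypothetical, the clamp to zero for very poor approximations is an added convention, and floating point rounding could shift the answer by one when the relative error sits exactly on a boundary.

```python
import math

def significant_digits(x, y):
    """Largest t with |y - x| / |x| <= 5 * 10**(-t), clamped below at 0."""
    rel = abs(y - x) / abs(x)
    if rel == 0.0:
        return math.inf           # y reproduces x exactly
    return max(0, math.floor(math.log10(5.0 / rel)))

print(significant_digits(0.00351, 0.00346))
```

For x = 0.00351 and y = 0.00346 the relative error is about $1.43 \cdot 10^{-2}$, so the result is 2, in line with the second assertion discussed in the example.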
How?
Compute the condition "number" for the problem.
Incorporate roundoff-truncation knowledge into
- the method
- the algorithm
An alternative view
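$a + bi \leftrightarrow \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$
Example
$(2 + 3i) \cdot (-1 - 4i) = 10 - 11i$
since
$\begin{pmatrix} 2 & 3 \\ -3 & 2 \end{pmatrix} \cdot \begin{pmatrix} -1 & -4 \\ 4 & -1 \end{pmatrix} = \begin{pmatrix} 10 & -11 \\ 11 & 10 \end{pmatrix}$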
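The correspondence between complex multiplication and 2×2 matrix multiplication can be verified numerically. This sketch uses plain nested lists rather than a matrix library; `to_matrix` and `matmul2` are hypothetical helper names.

```python
def to_matrix(z):
    """Map a + bi to the matrix [[a, b], [-b, a]]."""
    a, b = z.real, z.imag
    return [[a, b], [-b, a]]

def matmul2(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

z, w = complex(2, 3), complex(-1, -4)
product_matrix = matmul2(to_matrix(z), to_matrix(w))

print(z * w)
print(product_matrix)
print(product_matrix == to_matrix(z * w))
```

Multiplying the two matrices gives the same entries as mapping the complex product `z * w` to its matrix.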
Example
[−1, 1] + [−0.1, 0.1] = [−1.1, 1.1]
Example
[2, 3] − [2, 3] = [−1, 1] thus x − x ≠ 0 except for intervals of zero width, e.g.
[3, 3].
Example
[−1, 2] ∗ ([−2, 3] + [3, 4]) = [−1, 2] ∗ [1, 7] = [−7, 14] but
[−1, 2] ∗ [−2, 3] + [−1, 2] ∗ [3, 4] = [−4, 6] + [−4, 8] = [−8, 14] so the distributive
law doesn’t hold.
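The interval examples above can be reproduced with three small functions. This is a simplified sketch: intervals are plain `(lo, hi)` tuples, the names `iadd`, `isub` and `imul` are hypothetical, and the outward rounding that a real interval arithmetic library would apply is ignored.

```python
def iadd(x, y):
    """[a, b] + [c, d] = [a + c, b + d]"""
    return (x[0] + y[0], x[1] + y[1])

def isub(x, y):
    """[a, b] - [c, d] = [a - d, b - c]"""
    return (x[0] - y[1], x[1] - y[0])

def imul(x, y):
    """[a, b] * [c, d] = [min, max] over the four endpoint products."""
    products = [x[0] * y[0], x[0] * y[1], x[1] * y[0], x[1] * y[1]]
    return (min(products), max(products))

a, b, c = (-1, 2), (-2, 3), (3, 4)
print(isub((2, 3), (2, 3)))               # x - x for a nonzero-width interval
print(imul(a, iadd(b, c)))                # a * (b + c)
print(iadd(imul(a, b), imul(a, c)))       # a * b + a * c
```

The last two results differ, with `a * (b + c)` contained in `a * b + a * c`, which is the failure of the distributive law shown in the example.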
Example
x ◦ y need not be in DPFP even if both x, y ∈ DPFP. Consider x = 1, y = 3 and
x/y: the quotient 1/3 has no finite binary expansion.
Example
Note that fl(fl(x + y) + z) = fl(x + fl(y + z)) fails for some choice of
x, y, z ∈ DPFP. For example, when $x = 1.11\dots100 \times 2^{0}$ (fifty 1 bits
followed by two 0 bits after the binary point) and $y = z = 1.0\dots0 \times 2^{-53}$.
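Both of the last two examples can be checked directly, assuming Python floats are IEEE 754 doubles with round-to-nearest-even. The value of x below is written as $2 - 2^{-50}$, which has exactly the bit pattern described (fifty 1 bits followed by two 0 bits); the variable names are illustrative.

```python
from fractions import Fraction

x = 2.0 - 2.0 ** -50          # 1.11...100 in binary, times 2**0
y = z = 2.0 ** -53

left = (x + y) + z            # each addition is a tie that rounds back to x
right = x + (y + z)           # y + z = 2**-52 is exact, and so is x + 2**-52

print(left == right)
print(right - left)

# Closure also fails: 1/3 is rounded, so the stored quotient is not exactly 1/3.
third = 1.0 / 3.0
third_is_exact = Fraction(third) == Fraction(1, 3)
print(third_is_exact)
```

Adding the two small terms first lets them combine into one unit in the last place of x, whereas adding them one at a time loses each of them to rounding.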
The Rationals, Reals and Complex Numbers each form a field under the
operations + and ∗.
Properties of a Field F
closure: If a ∈ F and b ∈ F then a + b ∈ F and a ∗ b ∈ F.
associativity: a + (b + c) = (a + b) + c and a ∗ (b ∗ c) = (a ∗ b) ∗ c
for all a ∈ F, b ∈ F, c ∈ F.
commutativity: a + b = b + a and a ∗ b = b ∗ a for all a ∈ F, b ∈ F.
additive and multiplicative identity: a + 0 = a and a ∗ 1 = a for all a ∈ F.
additive and multiplicative inverses: a + (−a) = 0 and b ∗ (1/b) = 1
for all a ∈ F and b ∈ F, b ≠ 0.
distributivity: a ∗ (b + c) = a ∗ b + a ∗ c for all a ∈ F, b ∈ F, c ∈ F.
Upper Bound
Given any non-empty set S ⊂ R, we say that S has an upper bound if
there exists a real number r ∈ R such that s ≤ r for every s ∈ S.
Dedekind Completeness
Every non-empty set S ⊂ R that has an upper bound has a least
upper bound.
Definition
A function is an ordered triple of sets f = (Domain, CoDomain, Graph)
where Graph ⊆ {(x, y) | x ∈ Domain, y ∈ CoDomain} and
Graph has the property that if (x, y) ∈ Graph and (x, z) ∈ Graph then y = z.
f : A → B
Example
The triple ([−1, 1], [−1, 1], {(x, y) | $x^2 + y^2 = 1$, x, y ∈ [−1, 1]}) is NOT
a function,
however ([−1, 1], [0, 1], {(x, y) | $x^2 + y^2 = 1$, x ∈ [−1, 1], y ∈ [0, 1]}) is a function.
Definition
Given a function,
Example
The function (R, R, {(x, y) | $y = x^3$, x, y ∈ R}) is a bijection.
Example
The function below is a bijection.
$\left(\mathbb{R}^2,\ \mathbb{R}^2,\ \{(x, y) \mid y = M x,\ x, y \in \mathbb{R}^2\}\right), \quad M = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}$
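One way to see that the map x ↦ Mx with M = [[1, 0], [2, 1]] is a bijection is to exhibit its inverse. The sketch below uses integer entries and nested lists; `apply`, `det2` and the matrix `M_inv` are names introduced here for illustration.

```python
M = [[1, 0], [2, 1]]
M_inv = [[1, 0], [-2, 1]]     # candidate inverse, checked below

def apply(A, v):
    """Multiply the 2x2 matrix A by the vector v."""
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

v = [3, -5]
print(det2(M))                       # nonzero, so M is invertible
print(apply(M_inv, apply(M, v)))     # recovers v, so the map is one-to-one
print(apply(M, apply(M_inv, v)))     # also v, so every v is reached (onto)
```

Since applying `M_inv` after `M`, or `M` after `M_inv`, returns the original vector, each y has exactly one preimage x.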
Next time:
Taylor Series
Order of Convergence
Condition Number
Stability