Chapter 1
Algorithm Design and Analysis
Why learn algorithms?
Course Goals
• Learn to design algorithms that are correct and efficient.
• How do we know that:
  • an algorithm is correct? -> correctness proofs
  • an algorithm is efficient? -> analysis of algorithms (time complexity)
• How do we design solutions for new problems?
  • Learn general design techniques.
  • Study a set of well-known algorithms that serve as success stories for applying those general design techniques.
What does this course talk about?
• Data Structures: Queue, Stack, Linked List, Tree, Heap, Hash Table.
• Algorithms: Sorting, Searching, Pattern Matching, Graphs.
• Paradigms: Divide and Conquer, Optimization, Dynamic Programming, Greedy, Brute Force, Recursion.
• Complexity: Complexity analysis and big-O notation.
All of these will be implemented in Python.
Jobs in the Market & Average Salary
http://www.businessinsider.com/skills-that-can-get-you-hired-in-us-2016-11
Textbooks
Analysis of Algorithms
Measure Performance
Computational thinking consists of the skills to:
• formulate a problem as a computational problem, and
• construct a good computational solution (i.e., an algorithm) for the problem, or explain why there is no such solution.
A computational thinker won’t, however, be satisfied with just any
solution: the solution has to be a ‘good’ one. You have already seen
that some solutions for finding a word in a dictionary are much better (in
particular, faster) than others. The search for good computational
solutions is a theme that runs throughout this module. Finally,
computational thinking goes beyond finding solutions: if no good solution
exists, one should be able to explain why this is so.
Analysis of Algorithms
[Figure: Input -> Algorithm -> Output]
Our goal in the analysis of algorithms:
• Measure performance (running time)
• Measure space complexity
Running Time
• Most algorithms transform input objects into output objects.
• The running time of an algorithm typically grows with the input size.
• Average-case time is often difficult to determine.
• We focus on the worst-case running time.
  • Easier to analyze
  • Crucial to applications such as games, finance, and robotics
[Figure: running time plotted against input size, showing best-case, average-case, and worst-case curves]
Why discard the average case and choose the worst case instead?
• An algorithm may run faster on some inputs than it does on others of the same size. Thus, we may wish to express the running time of an algorithm as a function of the input size.
• An average-case analysis usually requires that we calculate expected running times based on a given input distribution, which usually involves sophisticated probability theory. Therefore, we characterize running times in terms of the worst case, as a function of the input size n of the algorithm.
• Worst-case analysis is much easier than average-case analysis, as it requires only the ability to identify the worst-case input, which is often simple.
Experimental Studies
• Write a program implementing the algorithm.
• Run the program with inputs of varying size and composition, noting the time needed.
• Plot the results.
[Figure: measured time (ms) plotted against input size]
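For illustration, a minimal experimental study in Python might look like the following sketch (here the built-in sorted() stands in for the algorithm under test; substitute any algorithm you want to measure):

import random
import time

# Time the algorithm on inputs of increasing size and print the results.
for n in (1000, 2000, 4000, 8000):
    data = [random.random() for _ in range(n)]
    start = time.perf_counter()
    sorted(data)                                   # the algorithm under test
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"n = {n}: {elapsed_ms:.2f} ms")         # plot these points afterwards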
Limitations of Experiments
• It is necessary to implement the whole algorithm before conducting any experiment, which may be difficult.
• Results may not be indicative of the running time on inputs not included in the experiment.
• In order to compare two algorithms, the same hardware and software environments must be used.
So we need another way to measure the performance of algorithms
• We therefore turn to theoretical (asymptotic) analysis.
• Uses a high-level description of the algorithm (pseudocode) instead of an implementation.
• Characterizes running time as a function of the input size n.
• Takes into account all possible inputs.
• Allows us to evaluate the speed of an algorithm independently of the hardware/software environment.
Pseudocode
• High-level description of an algorithm.
• More structured than English prose.
• Less detailed than a program.
• Preferred notation for describing algorithms.
• Hides program design issues.
Big-Oh Notation
• Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive constants c and n0 such that f(n) ≤ c·g(n) for all n ≥ n0.
• Example: 2n + 10 is O(n)
  • 2n + 10 ≤ cn
  • (c - 2)n ≥ 10
  • n ≥ 10/(c - 2)
  • Pick c = 3 and n0 = 10.
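As a quick numerical sanity check of this example (a sketch, not a proof), we can verify that 2n + 10 ≤ 3n holds from n0 = 10 onward:

# Check f(n) = 2n + 10 <= c*g(n) = 3n for all n in a test range starting at n0 = 10.
c, n0 = 3, 10
assert all(2 * n + 10 <= c * n for n in range(n0, 100000))
print("2n + 10 <= 3n for all tested n >= 10")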
Big-Oh and Growth Rate
• The big-Oh notation gives an upper bound on the growth rate of a function.
• The statement "f(n) is O(g(n))" means that the growth rate of f(n) is no more than the growth rate of g(n).
• We can use the big-Oh notation to rank functions according to their growth rate.

                    f(n) is O(g(n))    g(n) is O(f(n))
g(n) grows more     Yes                No
f(n) grows more     No                 Yes
Relatives of Big-Oh (for knowledge only)
big-Omega
• f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1 such that f(n) ≥ c·g(n) for all n ≥ n0.
big-Theta
• f(n) is Θ(g(n)) if there are constants c′ > 0 and c″ > 0 and an integer constant n0 ≥ 1 such that c′·g(n) ≤ f(n) ≤ c″·g(n) for all n ≥ n0.
Essential seven functions to estimate algorithm performance
g(n) = 1 (constant):

print("Hello Algorithms")
Essential seven functions to estimate algorithm performance
g(n) = n (linear):

for i in range(0, n):
    print(i)
Essential seven functions to estimate algorithm performance
g(n) = lg n (logarithmic):

def power_of_2(a):
    # Count how many times a can be halved before reaching 1,
    # i.e., roughly log2(a) iterations.
    x = 0
    while a > 1:
        a = a // 2
        x = x + 1
    return x
Essential seven functions to estimate algorithm performance
g(n) = n lg n: an O(log n) operation repeated n times.

# Calls the O(log n) function power_of_2 (defined above) n times.
for i in range(0, n):
    power_of_2(n)
Essential seven functions to estimate algorithm performance
g(n) = n^2 (quadratic):

for i in range(0, n):
    for j in range(0, n):
        print(i * j)
Essential seven functions to estimate algorithm performance
g(n) = n^3 (cubic):

for i in range(0, n):
    for j in range(0, n):
        for k in range(0, n):
            print(i * j)
Essential seven functions to estimate algorithm performance
g(n) = 2^n (exponential): the naive Fibonacci recursion below makes an exponential number of calls.

def F(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return F(n - 1) + F(n - 2)
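To see the exponential blow-up concretely (an illustrative sketch, assuming the F above), we can count the calls the recursion makes:

# The call count satisfies C(n) = 1 + C(n-1) + C(n-2), which grows
# exponentially in n (by a factor of about 1.6 per step).
def count_calls(n):
    if n <= 1:
        return 1
    return 1 + count_calls(n - 1) + count_calls(n - 2)

print([count_calls(n) for n in range(0, 21, 5)])  # 1, 15, 177, 1973, 21891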
Seven Important Functions
• Seven functions that often appear in algorithm analysis:
  • Constant: 1
  • Logarithmic: log n
  • Linear: n
  • N-Log-N: n log n
  • Quadratic: n^2
  • Cubic: n^3
  • Exponential: 2^n
• In a log-log plot of these functions, the slope of the line corresponds to the growth rate.
Comparison of Two Algorithms
Insertion sort takes n^2/4 operations; merge sort takes 2n lg n. To sort a million items, insertion sort takes roughly 70 hours, while merge sort takes roughly 12 seconds.
How to calculate an algorithm's complexity
We may not be able to predict to the nanosecond how long a Python program will take, but we can estimate the time:

for i in range(0, n):
    print(i)

This loop takes time k*n, where:
k: how long it takes to go through the loop once
n: the number of iterations (we can use this as the "size" of the problem)
The total time k*n is linear in n.
Constant time
• Constant time means there is some constant k such that the operation always takes k nanoseconds.
• A Python statement takes constant time if:
  • It does not include a loop.
  • It does not call a function whose running time is unknown or is not constant.
• If a statement involves a choice (e.g., an if/elif chain) among operations, each of which takes constant time, we consider the statement to take constant time. This is consistent with worst-case analysis.
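For instance (an illustrative sketch, assuming a list named data):

x = 5 + 3            # constant time: simple arithmetic, no loop or call
y = max(2, 7)        # constant time: a fixed number of arguments
total = sum(data)    # NOT constant: scans all of data, O(len(data))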
Prefix Averages (Quadratic)
The following algorithm computes prefix averages in quadratic time by applying the definition directly.
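A definition-based version might look like the following sketch (the name prefix_average1 is assumed for illustration):

def prefix_average1(X):
    # A[j] is the average of X[0], ..., X[j].
    n = len(X)
    A = [0] * n
    for j in range(n):
        total = 0
        for i in range(j + 1):     # recompute the prefix sum from scratch
            total += X[i]
        A[j] = total / (j + 1)     # the inner loop makes this O(n^2) overall
    return A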
Prefix Averages 2 (Looks Better)
The following algorithm uses a built-in Python function to simplify the code.
Algorithm prefixAverage2 still runs in O(n^2) time!
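A plausible sketch of this version, using Python's built-in sum():

def prefix_average2(X):
    n = len(X)
    A = [0] * n
    for j in range(n):
        A[j] = sum(X[0:j + 1]) / (j + 1)   # sum() scans j+1 elements each time
    return A

The code is shorter, but sum(X[0:j+1]) still takes time proportional to j + 1, so the total remains quadratic.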
Prefix Averages 3 (Linear Time)
The following algorithm computes prefix averages in linear time by keeping a running sum.
Algorithm prefixAverage3 runs in O(n) time.
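A sketch of the linear-time version with a running sum:

def prefix_average3(X):
    n = len(X)
    A = [0] * n
    total = 0
    for j in range(n):
        total += X[j]              # update the running sum in O(1)
        A[j] = total / (j + 1)
    return A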
Activity
Recursion
What Is Recursion?
Recursion is a method of solving problems that
involves breaking a problem down into smaller and
smaller subproblems until you get to a small
enough problem that it can be solved trivially.
Usually recursion involves a function calling itself.
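A classic illustration (a sketch, not from the examples below):

def factorial(n):
    # Base case: a problem small enough to solve trivially.
    if n <= 1:
        return 1
    # Recursive case: reduce to a smaller subproblem.
    return n * factorial(n - 1)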
Analysis of a recursive algorithm: the back-substitution method
Ex. 1:

def A(n):
    if n > 1:
        return A(n - 1)
    else:
        return 1

T(n) = c + T(n-1)    (1)

This means the running time T(n) equals a constant c, representing the constant number of operations performed at each call (for example, checking whether n is greater than 1), plus the cost of the recursive call T(n-1).
Using the back-substitution method:
T(n-1) = c + T(n-2)    (2)
T(n-2) = c + T(n-3)    (3)
From equations 1, 2, and 3, at step k: T(n) = kc + T(n-k)
The substitution stops once n-k = 1, which implies k = n-1.
Hence T(n) = (n-1)c + T(1) = nc - c + 1, which is O(n).
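We can check the O(n) result empirically with a sketch that counts the constant-time steps A(n) performs, mirroring T(n) = c + T(n-1) with c = 1:

# Each call contributes one unit of constant work plus the recursive call.
def A_steps(n):
    if n > 1:
        return 1 + A_steps(n - 1)
    return 1

print([A_steps(n) for n in (1, 10, 100, 1000)])  # 1, 10, 100, 1000: linear growth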
Analysis of a recursive algorithm
Ex. 2:

T(n) = n + T(n-1)   for n > 1    (1)
T(n) = 1            otherwise

Using the back-substitution method:
T(n-1) = (n-1) + T(n-2)    (2)
T(n-2) = (n-2) + T(n-3)    (3)
From equations 1, 2, and 3, at step k: T(n) = n + (n-1) + (n-2) + ... + (n-k) + T(n-(k+1))
The substitution stops once n-k-1 = 1, which implies k = n-2.
Hence T(n) = n + (n-1) + (n-2) + ... + 2 + 1 (the sum of an arithmetic series)
= n(n+1)/2, which is O(n^2).
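As a quick check (an illustrative sketch), tabulating the recurrence confirms the closed form n(n+1)/2:

# T(n) = n + T(n-1) with T(1) = 1, compared against n(n+1)/2.
def T(n):
    return 1 if n <= 1 else n + T(n - 1)

for n in (1, 5, 10, 100):
    print(n, T(n), n * (n + 1) // 2)   # the last two columns agree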
The Master Theorem
• Given: a divide-and-conquer algorithm
  • An algorithm that divides a problem of size n into a subproblems, each of size n/b.
  • Let the cost of each stage (i.e., the work to divide the problem and combine the solved subproblems) be described by a function f(n).
• The Master Theorem then gives us a cookbook for the algorithm's running time.
Basic Idea
[Diagram: a problem of size n is divided into subproblem 1 and subproblem 2, each of size n/2; their solutions are combined into a solution to the original problem.]
General Divide-and-Conquer Recurrence
T(n) = aT(n/b) + f(n), where f(n) ∈ Θ(n^d), d ≥ 0

Master Theorem:
• If a < b^d: T(n) ∈ O(n^d)
• If a = b^d: T(n) ∈ O(n^d log n)
• If a > b^d: T(n) ∈ O(n^(log_b a))

Examples:
• T(n) = T(n/2) + n
  Here a = 1, b = 2, d = 1, so a < b^d: O(n)
• T(n) = 2T(n/2) + 1
  Here a = 2, b = 2, d = 0, so a > b^d: O(n^(log_2 2)) = O(n)
Examples
• T(n) = T(n/2) + 1
  Here a = 1, b = 2, d = 0, so a = b^d: O(log n)
• T(n) = 4T(n/2) + n
  Here a = 4, b = 2, d = 1, so a > b^d: O(n^(log_2 4)) = O(n^2)
• T(n) = 4T(n/2) + n^2
  Here a = 4, b = 2, d = 2, so a = b^d: O(n^2 log n)
• T(n) = 4T(n/2) + n^3
  Here a = 4, b = 2, d = 3, so a < b^d: O(n^3)
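The three cases can be captured in a small helper (an illustrative sketch, not part of the course material):

import math

# Simplified Master Theorem for T(n) = a*T(n/b) + Theta(n^d).
def master(a, b, d):
    if a < b ** d:
        return f"O(n^{d})"
    if a == b ** d:
        return f"O(n^{d} log n)"
    return f"O(n^{math.log(a, b):g})"

print(master(1, 2, 0))   # T(n) = T(n/2) + 1    -> O(n^0 log n) = O(log n)
print(master(4, 2, 1))   # T(n) = 4T(n/2) + n   -> O(n^2)
print(master(4, 2, 3))   # T(n) = 4T(n/2) + n^3 -> O(n^3)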
Analysis of a recursive algorithm
Ex.:
T(n) = 2T(n/2) + c   for n > 1
T(n) = 1             otherwise
Solution: O(n). (By the Master Theorem: a = 2, b = 2, d = 0, so a > b^d and T(n) ∈ O(n^(log_2 2)) = O(n).)
Analysis of a recursive algorithm: the recursion-tree method
Ex.:
T(n) = 2T(n/2) + n   for n > 1
T(n) = 1             otherwise
Solution: O(n log n). (The recursion tree has about log_2 n levels, and the total work across each level is n.)
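This last recurrence is exactly the one satisfied by merge sort, sketched below for illustration:

def merge_sort(S):
    # T(n) = 2T(n/2) + n: two half-size recursive calls plus a linear merge.
    if len(S) <= 1:
        return S
    mid = len(S) // 2
    left = merge_sort(S[:mid])      # T(n/2)
    right = merge_sort(S[mid:])     # T(n/2)
    merged = []                     # merging the halves costs the "+ n" term
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))   # [1, 2, 2, 3, 4, 5, 6, 7]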