Algorithms & Data Structures
• Professor Reza Sedaghat
• COE428: Engineering Algorithms & Data Structures
• Email address: [email protected]
• Course outline: www.ee.ryerson.ca/~courses/COE428/
• Course References:
1) Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest. Introduction to Algorithms, MIT, 2002, ISBN: 0-07-013151-1 (McGraw-Hill) (Course Text)
2.3.2 Analysis of Merge Sort Algorithm
• How can we determine the time required to perform this algorithm?
• Time to sort n numbers:
T(n) = Tdivide (time to divide the numbers)
     + T(n/2) (time to sort the left side, size = n/2)
     + T(n/2) (time to sort the right side, size = n/2)
     + c1·n  (time to merge, total size = n/2 + n/2 = n)
• Assumptions:
• Let T(n) represent the time to sort n numbers
• Let Tdivide be the time to divide the numbers, and assume Tdivide = 0
• Let c1n + c0 be the time to merge n numbers (merging is a linear-time algorithm), and assume c0 = 0
T(n) = Tdivide + T(n/2) + T(n/2) + c1n + c0 = 2T(n/2) + c1n
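The three terms of the recurrence (divide, two recursive sorts, linear merge) can be seen directly in code. A minimal Python sketch (function names are my own, not from the slides):

```python
def merge_sort(a):
    """Sort a list: a constant-time divide, two recursive sorts of
    size n/2, and a linear-time merge -- the terms of T(n)."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2                   # Tdivide: constant (assumed 0)
    left = merge_sort(a[:mid])          # T(n/2)
    right = merge_sort(a[mid:])         # T(n/2)
    # Merge step: c1*n work in total
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

print(merge_sort([5, 2, 4, 6, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 6]
```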
Analysis of Merge Sort Algorithm
T(n) = Tdivide + T(n/2) + T(n/2) + c1n + c0 = 2T(n/2) + cn
• Let c be a constant that describes the running time for the base case and also the time per array element for the divide and merge steps.
• We rewrite the recurrence as
T(n) = 2T(n/2) + cn
• Growth? The cn term is linear; what does the recursive term 2T(n/2) contribute?
Recurrence for merge sort

T(n) = Θ(1)             if n = 1;
       2T(n/2) + Θ(n)   if n > 1.
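Taking both hidden constants equal to c (T(1) = c, T(n) = 2T(n/2) + cn), the recurrence has the exact solution T(n) = cn(lg n + 1) for n a power of two. A quick numeric sketch (helper names are my own) checks this:

```python
import math

def T(n, c=1):
    """Evaluate the merge-sort recurrence T(n) = 2*T(n/2) + c*n,
    T(1) = c, for n a power of two."""
    if n == 1:
        return c
    return 2 * T(n // 2, c) + c * n

# Compare against the closed form c*n*(lg n + 1) with c = 1
for n in [2, 8, 64, 1024]:
    assert T(n) == n * (math.log2(n) + 1)
```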
Recursion tree
• Shows successive expansions of the recurrence.
Solve T(n) = 2T(n/2) + cn, where c > 0 is constant.
• Expanding step by step: T(n) becomes a root of cost cn with two children T(n/2); each T(n/2) becomes a node of cost cn/2 with two children T(n/4); and so on, until the leaves cost Θ(1). The fully expanded tree, with per-level totals on the right:

                     cn                       → cn
                  /      \
              cn/2        cn/2                → cn
             /    \      /    \
h = lg n   cn/4  cn/4  cn/4  cn/4             → cn
             …     …     …     …                 …
           Θ(1)  Θ(1)  …  Θ(1)  Θ(1)   #leaves = n → Θ(n)

Total: Θ(n lg n)
• Analysis of Merge Sort Alg.
• Continue expanding until the problem sizes get down to 1.
• Each level has cost cn:
• The top level has cost cn.
• The next level down has 2 subproblems, each contributing cost cn/2.
• The next level has 4 subproblems, each contributing cost cn/4.
• Each time we go down one level, the number of subproblems doubles but the cost per subproblem halves, so the cost per level stays the same.
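The "cost per level stays the same" claim can be checked numerically. A small sketch (my own helper, assuming n is a power of two): level L has 2^L subproblems of cost c·n/2^L each, so every level should total cn.

```python
def level_costs(n, c=1):
    """Total cost contributed by each level of the merge-sort
    recursion tree on n = 2^k elements: 2^L subproblems, each
    of cost c * n / 2^L."""
    costs = []
    subproblems, size = 1, n
    while size >= 1:
        costs.append(subproblems * c * size)
        subproblems *= 2
        size //= 2
    return costs

print(level_costs(8))  # [8, 8, 8, 8] -- every level sums to c*n
```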
Merging levels for the input 5 2 4 6 1 3 2 6 (read bottom-up):

1 2 2 3 4 5 6 6
          merge
2 4 5 6 | 1 2 3 6
    merge     merge
2 5 | 4 6 | 1 3 | 2 6
merge  merge  merge  merge
5 2 | 4 6 | 1 3 | 2 6
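The rows of this diagram can be reproduced by a bottom-up merging sketch (my own helper names; assumes the input length is a power of two):

```python
def merge(a, b):
    """Merge two sorted lists into one sorted list (linear time)."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]

def merge_levels(xs):
    """Bottom-up merge sort; return the array after each level of merges.
    Assumes len(xs) is a power of two."""
    runs = [[x] for x in xs]
    levels = []
    while len(runs) > 1:
        runs = [merge(runs[k], runs[k + 1]) for k in range(0, len(runs), 2)]
        levels.append([x for run in runs for x in run])
    return levels

for level in merge_levels([5, 2, 4, 6, 1, 3, 2, 6]):
    print(level)
```

This prints the diagram's rows from the bottom up: [2, 5, 4, 6, 1, 3, 2, 6], then [2, 4, 5, 6, 1, 2, 3, 6], then the sorted [1, 2, 2, 3, 4, 5, 6, 6].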
L (Level)   Array after merging at this level      Subarray size n / 2^L (n = 8)
0           1 2 2 3 4 5 6 6                        8 / 2^0 = 8
1           2 4 5 6 | 1 2 3 6                      8 / 2^1 = 4
2           2 5 | 4 6 | 1 3 | 2 6                  8 / 2^2 = 2
3           5 2 | 4 6 | 1 3 | 2 6                  8 / 2^3 = 1
• At the deepest level, n / 2^L = 1, so n = 2^L and therefore L = lg n: there are lg n levels of merging.
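The table's conclusion L = lg n can be confirmed by counting halvings directly (a small sketch, helper name my own):

```python
import math

def merge_depth(n):
    """Number of halving levels until subarrays reach size 1
    (n a power of two)."""
    levels = 0
    while n > 1:
        n //= 2
        levels += 1
    return levels

assert merge_depth(8) == math.log2(8) == 3
```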
• Complexity
• Algorithms can be classified as linear, quadratic, logarithmic, etc., depending on the time they take as a function of a number n denoting the size of the problem.
• The insertion sort algorithm (worst case) was quadratic, whereas merge sort has n lg n complexity.
• The task is to compare the time or space complexity of different algorithms for large sizes of inputs.
• Assume that the time (or space) required for performing the algorithm is some function of n, which is the size of the input.
• The comparative performance of two algorithms f and g for large problem sizes can be evaluated with the limit

lim n→∞ f(n)/g(n)
• Complexity
• Three cases for lim n→∞ f(n)/g(n):
1) The limit is ∞. In this case, we say that f(n) grows asymptotically much faster than g(n) (or, equivalently, that g(n) grows much slower than f(n)).
2) The limit is 0. This is the opposite of the first case: f(n) grows much slower than g(n) (or, equivalently, g(n) grows much faster).
3) The limit is some constant C > 0. In this case, both f(n) and g(n) grow at roughly the same rate (asymptotically).
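The three cases can be illustrated numerically with sample functions (these particular f and g are illustrative choices, not from the slides):

```python
def ratio(f, g, n):
    """Evaluate f(n)/g(n) at a large n to suggest the limit behavior."""
    return f(n) / g(n)

n = 10**6
# Case 1: f = n^2, g = n  ->  the ratio grows without bound
assert ratio(lambda n: n**2, lambda n: n, n) == n
# Case 2: f = n, g = n^2  ->  the ratio tends to 0
assert ratio(lambda n: n, lambda n: n**2, n) == 1 / n
# Case 3: f = 3n + 7, g = n  ->  the ratio tends to the constant 3
assert abs(ratio(lambda n: 3 * n + 7, lambda n: n, n) - 3) < 1e-3
```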
• Complexity
• Definition: Θ-Notation
• For a given function g(n), the set of functions denoted by Θ(g(n)) is
• Θ(g(n)) = { f(n) : positive constants c1, c2, and n0 exist such that 0 ≤ c1 g(n) ≤ f(n) ≤ c2 g(n) for all n ≥ n0 }
• A function f(n) belongs to the set Θ(g(n)) if positive constants c1 and c2 exist such that it can be "sandwiched" between c1 g(n) and c2 g(n) for sufficiently large n.
• Because Θ(g(n)) is a set, we could write "f(n) ∈ Θ(g(n))" to indicate that f(n) is a member of Θ(g(n))
• Complexity, Θ-Notation
• Instead, "f(n) = Θ(g(n))" is used to express the same notion
• The figure gives an intuitive diagram of functions f(n) and g(n) for f(n) = Θ(g(n))
• For all values of n to the right of n0, the value of f(n) lies at or above c1 g(n) and at or below c2 g(n)
• In other words, for all n ≥ n0, the function f(n) is equal to g(n) to within a constant factor. We say g(n) is an asymptotically tight bound for f(n)
Θ(g(n)) = { f(n) : 0 ≤ c1 g(n) ≤ f(n) ≤ c2 g(n) for all n ≥ n0 }
Example
f(n) = (1/2)n² – 3n = Θ(n²)
• Determine c1, c2, and n0
• To be proven: g(n) = n² is an asymptotically tight bound for f(n):
0 ≤ c1 g(n) ≤ f(n) ≤ c2 g(n) for all n ≥ n0
c1 n² ≤ (1/2)n² – 3n ≤ c2 n²
• Remember that the largest term dominates the running time; for the upper bound we can drop the lower-order term 3n: (1/2)n² – 3n ≤ (1/2)n² ≤ c2 n², so assuming n0 ≥ 1, the right side (upper bound) gives c2 ≥ 1/2
• Dividing through by n² gives
c1 ≤ 1/2 – 3/n ≤ c2
• On the left side (lower bound), c1 ≤ 1/2 – 3/n: if n = 6 this forces c1 = 0, which is invalid (c1 must be positive)
• So n must satisfy n0 ≥ 7; taking n0 = 7 gives 1/2 – 3/7 = 1/14, and results in c1 ≤ 1/14
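The constants derived above can be verified numerically. A sketch using exact rational arithmetic to avoid floating-point rounding (the range bound is an arbitrary test choice): with c1 = 1/14, c2 = 1/2, and n0 = 7, the sandwich should hold for all n ≥ n0.

```python
from fractions import Fraction

def f(n):
    """f(n) = (1/2)n^2 - 3n, computed exactly."""
    return Fraction(1, 2) * n * n - 3 * n

c1, c2, n0 = Fraction(1, 14), Fraction(1, 2), 7
for n in range(n0, 2000):
    assert c1 * n * n <= f(n) <= c2 * n * n  # sandwich holds

# At n = n0 = 7 the lower bound is tight: c1 * 49 == f(7) == 7/2
assert c1 * 49 == f(7)
```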
• Complexity
• Definition: O-Notation
• Remember that the Θ-notation asymptotically bounds a function from above and below
• The O-notation is an asymptotic upper bound, as illustrated
• For a given function g(n), the set of functions denoted by O(g(n)) is
O(g(n)) = { f(n) : positive constants c and n0 exist such that 0 ≤ f(n) ≤ c g(n) for all n ≥ n0 }
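As a numeric sketch of the upper bound (the sample function is my choice, not from the slides): f(n) = n(n – 1)/2, the worst-case comparison count of insertion sort, satisfies f(n) ≤ c·n² with c = 1 for all n ≥ 1, so f(n) = O(n²).

```python
def f(n):
    """Worst-case comparison count of insertion sort: n(n-1)/2."""
    return n * (n - 1) // 2

# Check 0 <= f(n) <= c * n^2 with c = 1, n0 = 1
for n in range(1, 1000):
    assert 0 <= f(n) <= 1 * n * n
```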
• Complexity
• Definition: Ω-Notation
• Remember that the O-notation provides an asymptotic upper bound on a function
• The Ω-notation is an asymptotic lower bound, as illustrated below
• For a given function g(n), the set of functions denoted by Ω(g(n)) is
Ω(g(n)) = { f(n) : positive constants c and n0 exist such that 0 ≤ c g(n) ≤ f(n) for all n ≥ n0 }
• Complexity
• Summary:
Θ(g(n)): 0 ≤ c1 g(n) ≤ f(n) ≤ c2 g(n) for all n ≥ n0
O(g(n)): 0 ≤ f(n) ≤ c g(n) for all n ≥ n0
Ω(g(n)): 0 ≤ c g(n) ≤ f(n) for all n ≥ n0