LECTURE 05
DIVIDE-AND-CONQUER (PART 2)
Learning Outcome
For today’s lecture
Review Big-Oh
Divide-and-conquer paradigm
Review Merge-sort
Recurrence Equations
Iterative substitution
Recursion trees
The master method
Integer Multiplication
Quick-select
Recap: Big Oh
You’ve learnt why we use Big-Oh to describe algorithms: to predict their running time and to select the better/faster algorithm.
Big-Oh is relatively simple to derive for the majority of algorithms.
However, some of the sorting algorithms we’ve covered so far make Big-Oh derivations not so simple.
Recap: Divide-and-conquer
Divide-and-conquer is a general algorithm design
paradigm:
Divide: divide the input data S into two or more
disjoint subsets S1, S2, …
Recur: solve the subproblems recursively
Conquer: combine the solutions for S1, S2, …, into
a solution for S
The base case for the recursion consists of subproblems of
constant size
Analysis can be done using
recurrence equations
Recap: Mergesort
Mergesort on input sequence S with n elements
consists of three steps:
Divide: partition S into two sequences S1 and S2 of
about n/2 elements each
Recur: recursively sort S1 and S2
Conquer: merge S1 and S2 into a sorted sequence
Mergesort Review
Input Parameters: array a, start index p, end index r.
Output Parameter: array a sorted.
Mergesort (a, p, r) {
    // Continue only if there is more than one element.
    if (p < r) {
        // Divide: divide a into two nearly equal parts.
        m = (p + r) / 2
        // Recur: sort each half.
        Mergesort (a, p, m)
        Mergesort (a, m + 1, r)
        // Conquer: merge the two sorted halves.
        Merge (a, p, m, r)
    }
}
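For concreteness, here is a minimal runnable version of the same procedure in Python (the names merge_sort and merge, and the temporary-list merge, are choices of this sketch, not mandated by the pseudocode):

def merge_sort(a, p, r):
    # Sort a[p..r] in place, mirroring the pseudocode above.
    if p < r:
        m = (p + r) // 2          # Divide: split into two nearly equal parts.
        merge_sort(a, p, m)       # Recur: sort the left half.
        merge_sort(a, m + 1, r)   # Recur: sort the right half.
        merge(a, p, m, r)         # Conquer: merge the two sorted halves.

def merge(a, p, m, r):
    # Merge the sorted runs a[p..m] and a[m+1..r].
    left, right = a[p:m + 1], a[m + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        # Take from the left run while its front element is the smaller one.
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1

data = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(data, 0, len(data) - 1)
print(data)   # [1, 2, 2, 3, 4, 5, 6, 7]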
Recurrence Equation Analysis
The conquer step of merge-sort merges two sorted
sequences, each with n/2 elements. Implemented by
means of a doubly linked list, this takes at most cn
steps, for some constant c.
Recurrence Equation Analysis
Likewise, the base case (n < 2) takes at most c steps.
Therefore, if we let T(n) denote the running time of
merge-sort:

T(n) = { c              if n < 2
       { 2T(n/2) + cn   if n ≥ 2
Recurrence Equation Analysis
We can therefore analyze the running time of merge-sort
by finding a closed-form solution to the above equation.
That is, find a solution that has T(n) only on the
left-hand side.
The question now is: how?
Iterative Substitution
In the iterative substitution (or "plug-and-chug") technique, we
iteratively apply the recurrence equation to itself and see if we
can find a pattern:

T(n) = 2T(n/2) + cn
     = 2(2T(n/2²) + c(n/2)) + cn
     = 2²T(n/2²) + 2cn
     = 2³T(n/2³) + 3cn
     …
     = 2^i T(n/2^i) + icn

Note that the base case, T(n) = c, occurs when 2^i = n. That is, i = log n.
So,

T(n) = nT(n/n) + cn log n
     = nT(1) + cn log n
     = cn + cn log n          since T(1) = c

Thus, T(n) is O(n log n).
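As a quick sanity check (not part of the original derivation), the recurrence and the closed form can be compared numerically; the function t below assumes c = 1 and n a power of two:

import math

def t(n, c=1):
    # T(n) = c if n < 2, else 2T(n/2) + cn.
    if n < 2:
        return c
    return 2 * t(n // 2, c) + c * n

for n in [2, 8, 64, 1024]:
    closed = n + n * math.log2(n)    # cn + cn log n, with c = 1
    print(n, t(n), closed)           # the two values agree exactly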
The Recursion Tree
Draw the recursion tree for the recurrence relation
and look for a pattern:
T(n) = { c              if n < 2
       { 2T(n/2) + cn   if n ≥ 2

depth   number of T's   size of each   time per level
0       1               n              cn
1       2               n/2            cn
…       …               …              …
i       2^i             n/2^i          cn

There are 1 + log n levels, each costing cn, so the
total time = cn + cn log n
(the last level plus all previous levels)
Master Method
The master method is a cookbook method for solving
recurrences
Although it can handle many recurrences, it cannot
solve all of them
To begin the method, notice that many divide-and-
conquer recurrence equations have the form:
T(n) = { b                if n < d
       { aT(n/b) + f(n)   if n ≥ d

where a and b are constants, a > 0 and b > 1, and f is
a function of n
Master Method
The Master Theorem:
1. if f(n) is O(n^(log_b a − ε)), then T(n) is Θ(n^(log_b a))
2. if f(n) is Θ(n^(log_b a) · log^k n), then T(n) is Θ(n^(log_b a) · log^(k+1) n)
3. if f(n) is Ω(n^(log_b a + ε)), then T(n) is Θ(f(n)),
   provided a·f(n/b) ≤ δ·f(n) for some δ < 1
Note: Case 2 is sometimes written with just f(n) = Θ(n^(log_b a)), i.e. k = 0,
giving T(n) = Θ(n^(log_b a) · log n)
Master Method
The three cases describe where the running time of the recursion tree is concentrated:
dominated by the cost of the leaves (Case 1: Θ(n^(log_b a))),
evenly distributed throughout the tree (Case 2: Θ(n^(log_b a) · log^(k+1) n)), or
dominated by the cost of the root (Case 3: Θ(f(n)))
ε is a constant where ε > 0
Case 1: the time is bounded by the divide portion, i.e. the
subproblems (aT(n/b)) have the dominant impact, because f(n)
grows polynomially more slowly than n^(log_b a)
Case 2: divide and merge have the same impact
Case 3: the divide portion has a very low impact; the cost is
dominated by the merge, f(n)
To Apply the Master Method
1. Extract a, b and f(n) from a given recurrence.
2. Determine n^(log_b a) (just fill in the values of a and b).
3. Compare f(n) and n^(log_b a) asymptotically.
4. Determine the appropriate Master Theorem case
and apply it (see the sketch below).
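As an illustrative sketch (not from the slides), these four steps can be mechanized for driving functions of the shape f(n) = n^p · log^k n; the function names and the case-splitting below are assumptions of this sketch:

import math

def theta(p, k):
    # Render Theta(n^p log^k n) as a string, omitting trivial factors.
    s = f"n^{p:g}"
    if k:
        s += f" log^{k} n"
    return f"Theta({s})"

def classify(a, b, p, k=0):
    # Master method for T(n) = a*T(n/b) + n**p * (log n)**k.
    e = math.log(a, b)                       # the critical exponent log_b a
    if p < e:
        return "Case 1: " + theta(e, 0)      # leaves dominate
    if p == e and k >= 0:
        return "Case 2: " + theta(e, k + 1)  # cost evenly distributed
    if p > e:
        return "Case 3: " + theta(p, k)      # root dominates (regularity holds for n^p log^k n)
    return "The master method does not apply (gap between the cases)"

print(classify(4, 2, 1))       # T(n) = 4T(n/2) + n        -> Case 1: Theta(n^2)
print(classify(2, 2, 1, 1))    # T(n) = 2T(n/2) + n log n  -> Case 2: Theta(n^1 log^2 n)
print(classify(9, 3, 3))       # T(n) = 9T(n/3) + n^3      -> Case 3: Theta(n^3)
print(classify(2, 2, 1, -1))   # f(n) = n/log n            -> does not apply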
Master Method, Example 1
The form:
T(n) = { c                if n < d
       { aT(n/b) + f(n)   if n ≥ d

The Master Theorem:
1. if f(n) is O(n^(log_b a − ε)), then T(n) is Θ(n^(log_b a))
2. if f(n) is Θ(n^(log_b a) · log^k n), then T(n) is Θ(n^(log_b a) · log^(k+1) n)
3. if f(n) is Ω(n^(log_b a + ε)), then T(n) is Θ(f(n)),
   provided a·f(n/b) ≤ δ·f(n) for some δ < 1.
Example:
T(n) = 4T(n/2) + n
Solution: log_b a = 2 and f(n) = n is O(n^(2−ε)), so Case 1 says T(n) is Θ(n²).
Master Method, Example 2
Example:
T(n) = 2T(n/2) + n log n
Solution: log_b a = 1 and f(n) = n log n = Θ(n^1 · log^1 n), so Case 2 (with k = 1) says T(n) is Θ(n log² n).
Master Method, Example 3
Example:
T(n) = T(n/3) + n log n
Solution: log_b a = 0 and f(n) = n log n is Ω(n^(0+ε)), so Case 3 says T(n) is Θ(n log n).
Master Method, Example 4
Example:
T(n) = 8T(n/2) + n²
Solution: log_b a = 3 and f(n) = n² is O(n^(3−ε)), so Case 1 says T(n) is Θ(n³).
Master Method, Example 5
Example:
T(n) = 9T(n/3) + n³
Solution: log_b a = 2 and f(n) = n³ is Ω(n^(2+ε)), so Case 3 says T(n) is Θ(n³).
Master Method, Example 6
Example:
T(n) = T(n/2) + 1 (binary search)
Solution: log_b a = 0 and f(n) = 1 = Θ(n^0), so Case 2 (with k = 0) says T(n) is Θ(log n).
Master Method, Example 7
Example:
T(n) = 2T(n/2) + log n (heap construction)
Solution: log_b a = 1 and f(n) = log n is O(n^(1−ε)), so Case 1 says T(n) is Θ(n).
Master Method, Example 8
Example:
T(n) = 2T(n − 2) + log n
This cannot be solved using the master method because the
recurrence is not of the form aT(n/b) + f(n): the subproblem
size shrinks by subtraction rather than division.
Exercise
For each of the following recurrences, give the Big-Oh
expression for the runtime T(n) if it can be solved using the
master method; otherwise indicate that it cannot be solved,
and justify why.
1. T(n) = T(n/2) + 2n
2. T(n) = 16T(n/4) + n
3. T(n) = 2T(n/4) + n^0.51
4. T(n) = 3T(n/3) + n/2
5. T(n) = 3T(n/3) + √n
6. T(n) = 64T(n/8) − n² log n
7. T(n) = 4T(n/2) + n/log n
8. T(n) = T(n+2) + n²
9. T(n) = 9T(n/3) + n
10. T(n) = T(n/2) + √n
MORE DIVIDE-AND-CONQUER ALGORITHMS
Applying the Master Method to Identify Running Complexity
Integer Multiplication
Old-school method of performing multiplication:
123 × 45 ?
What if you have 1234 × 5678?

   123
×   45
------
   615
  492
------
  5535

To multiply 123 by 5, start from 3 × 5 = 15; the 1 is
carried forward. Then 2 × 5, to which we need to add
the carried 1, and so on.
This means that there are two operations per digit,
multiply and add, causing each of the n rows to take
≤ 2n operations.
After obtaining the result of each row, we sum the
digits column by column, which adds another 2n
operations per row.
Thus, we have about 4n² operations.
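A minimal Python sketch of this digit-by-digit method (the function name schoolbook_multiply and the little-endian digit lists are assumptions of this illustration, not from the slides):

def schoolbook_multiply(x, y):
    # Multiply two non-negative integers digit by digit: O(n^2) digit operations.
    a = [int(d) for d in str(x)][::-1]   # little-endian digit list of x
    b = [int(d) for d in str(y)][::-1]   # little-endian digit list of y
    result = [0] * (len(a) + len(b))
    for i, db in enumerate(b):           # one row per digit of y
        carry = 0
        for j, da in enumerate(a):       # multiply and add, LSD to MSD
            total = result[i + j] + da * db + carry
            result[i + j] = total % 10
            carry = total // 10
        result[i + len(a)] += carry
    return int("".join(map(str, result[::-1])))

print(schoolbook_multiply(123, 45))      # 5535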
Integer Multiplication
Multiplication of two n-digit integers has O(n²) time
complexity
This is because we are performing the operation
one digit at a time from least significant digit (LSD)
to most significant digit (MSD)
Is it possible to get a multiplication algorithm with
better asymptotic complexity?
Try divide-and-conquer?
Integer Multiplication (cont.)
Take the two integers and divide them:
123 = (12 × 10) + 3
45 = (4 × 10) + 5
Then multiply (i.e. 'merge') them:
123 × 45 = ([12 × 10] + 3)([4 × 10] + 5)
         = (12 × 4 × 10²) + ([12 × 5 + 4 × 3] × 10) + (3 × 5)
The refined version of this idea is the Karatsuba algorithm,
a method discovered by Anatoly Karatsuba in 1960 (shown
later in this lecture).
This is equivalent to multiplying polynomials of the form ax² + bx + c.
E.g.: (x + a)(x + b) = x² + (a + b)x + ab
Integer Multiplication (cont.)
Note how the integers can be represented as polynomials:
num1 = x1·10^(n/2) + y1 and num2 = x2·10^(n/2) + y2
So the result of the multiplication is
num1·num2 = x1·x2·10^n + (x1·y2 + x2·y1)·10^(n/2) + y1·y2
This also fits polynomial multiplication, ax² + bx + c.
The recurrence is 4T(n/2) because there are 4 multiplications.
Recursion
Let T(n) be the running time to multiply two n-digit
numbers, a and b. Assume length(a) = length(b).
Algorithm Multiply (a, b)
    if length(a) <= 1 then
        return a * b
    Partition a, b into a = x1·10^(n/2) + y1 and b = x2·10^(n/2) + y2
    A = Multiply(x1, x2)
    B = Multiply(y1, x2)
    C = Multiply(x1, y2)
    D = Multiply(y1, y2)
    return A·10^n + (B + C)·10^(n/2) + D
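A runnable Python sketch of this four-multiplication scheme (splitting by powers of ten and the name dc_multiply are choices of this illustration):

def dc_multiply(a, b):
    # Multiply two non-negative integers using 4 recursive multiplications.
    if a < 10 or b < 10:
        return a * b
    half = min(len(str(a)), len(str(b))) // 2
    x1, y1 = divmod(a, 10 ** half)       # a = x1*10^half + y1
    x2, y2 = divmod(b, 10 ** half)       # b = x2*10^half + y2
    A = dc_multiply(x1, x2)
    B = dc_multiply(y1, x2)
    C = dc_multiply(x1, y2)
    D = dc_multiply(y1, y2)
    return A * 10 ** (2 * half) + (B + C) * 10 ** half + D

print(dc_multiply(1234, 5678))           # 7006652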
Time Complexity
Instead of one big n-digit multiplication, we've reduced it to
just 4 half-size multiplications, but is it any better?
The running complexity T(n) is the total complexity to solve
each portion, i.e. T(n) = 4T(n/2) + n
Integer Multiplication (D&C)
Four multiply operations
So, T(n) = 4T(n/2) + n
Is this the best?
Time Complexity of the new method
The n term is the merging/addition cost at layer 0.
Expanding from layer 0 down to layer i, where i = log_2 n,
the pattern of T(n) is:
T(n) = 4^i · T(n/2^i) + n(1 + 2 + 2² + … + 2^(i−1))
     = 4^i · T(n/2^i) + n(2^i − 1)
With i = log_2 n this gives n²·T(1) + n(n − 1), so T(n) is O(n²).
No different from the old-school formula!
Can we further improve it?
How to improve?
A few considerations:
Merge faster: not suitable for integer multiplication.
Make the subproblems smaller: this results in more
subproblems for integer multiplication and makes the
algorithm more complicated.
Decrease the number of subproblems.
An Improved Integer Multiplication
Algorithm
The Karatsuba algorithm attempts to reduce the number of
multiplication operations required.
Note that
(a + b)(c + d) = a·c + b·d + a·d + b·c
which can be rearranged as
a·d + b·c = (a + b)(c + d) − a·c − b·d
Reuse!
So we can change our original multiply equation to
num1·num2 = x1·x2·10^n + (x1·y2 + y1·x2)·10^(n/2) + y1·y2
          = x1·x2·10^n + [(x1 + y1)(x2 + y2) − x1·x2 − y1·y2]·10^(n/2) + y1·y2
An Improved Integer Multiplication Algorithm
Algorithm Multiply (a, b)
    if length(a) <= 1 then
        return a * b
    Partition a, b into a = x1·10^(n/2) + y1 and b = x2·10^(n/2) + y2
    A = Multiply(x1, x2)
    B = Multiply(y1, y2)
    C = Multiply(x1 + y1, x2 + y2)
    return A·10^n + (C − A − B)·10^(n/2) + B
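A runnable Python sketch of the three-multiplication version (again, the splitting scheme and the name karatsuba are illustrative assumptions):

def karatsuba(a, b):
    # Multiply two non-negative integers using 3 recursive multiplications.
    if a < 10 or b < 10:
        return a * b
    half = min(len(str(a)), len(str(b))) // 2
    x1, y1 = divmod(a, 10 ** half)           # a = x1*10^half + y1
    x2, y2 = divmod(b, 10 ** half)           # b = x2*10^half + y2
    A = karatsuba(x1, x2)                    # high parts
    B = karatsuba(y1, y2)                    # low parts
    C = karatsuba(x1 + y1, x2 + y2)          # (x1 + y1)(x2 + y2)
    # The middle term x1*y2 + y1*x2 is recovered as C - A - B.
    return A * 10 ** (2 * half) + (C - A - B) * 10 ** half + B

print(karatsuba(1234, 5678))                 # 7006652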
Time Complexity of the Improved Algorithm
So, T(n) = 3T(n/2) + n, which implies T(n) is
O(n^(log_2 3)) by the Master Theorem.
Thus, T(n) is O(n^1.585).
Example
Assume we have 13 × 45. With n = 2: x1 = 1, y1 = 3, x2 = 4, y2 = 5.
A = 1 × 4 = 4
B = 3 × 5 = 15
C = (1 + 3)(4 + 5) = 36
13 × 45 = 4·10² + (36 − 4 − 15)·10 + 15 = 400 + 170 + 15 = 585
Only three single-digit multiplications were needed.
QUICKSELECT
Quick-Select
Quick-select is a randomized selection algorithm
based on the prune-and-search paradigm:
Prune: pick a random element x (called the pivot) and
partition S into
L: elements less than x
E: elements equal to x
G: elements greater than x
Search: depending on k, either the answer is in E,
or we need to recurse in either L or G
Algorithm Quick-Select
Input Parameters: array a, start index p, end index r,
target rank k (0-based: the k-th smallest).
Output: the k-th smallest element, a[pi], at its correct position.
QuickSelect (a, p, r, k) {
    if (p <= r) {
        pi = Partition (a, p, r)   // pivot index.
        if (k == pi)
            return a[pi]
        if (k < pi)
            return QuickSelect (a, p, pi - 1, k)
        else
            return QuickSelect (a, pi + 1, r, k)
    }
}
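A runnable Python sketch, using the Lomuto partition scheme with a random pivot (the slides do not fix a particular partition routine, so this choice is an assumption):

import random

def partition(a, p, r):
    # Lomuto partition of a[p..r] around a random pivot; returns the pivot's final index.
    swap = random.randint(p, r)              # pick a random pivot and move it to the end
    a[swap], a[r] = a[r], a[swap]
    pivot, i = a[r], p
    for j in range(p, r):
        if a[j] < pivot:                     # move smaller elements to the left block
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[r] = a[r], a[i]                  # place the pivot at its final position
    return i

def quickselect(a, p, r, k):
    # Return the element whose 0-based sorted position is k.
    if p <= r:
        pi = partition(a, p, r)
        if k == pi:
            return a[pi]
        if k < pi:
            return quickselect(a, p, pi - 1, k)
        return quickselect(a, pi + 1, r, k)

data = [77, 99, 22, 66, 55, 44, 11, 88, 33]
print(quickselect(data, 0, len(data) - 1, 4))   # 5th smallest -> 55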
Partition
The partition step of Quick-Select is the same partition
as in Quick-Sort, and it takes O(n) time.
Based on probabilistic facts, Quick-Select runs in O(n)
expected time.
Quick-Select has the same worst case as Quick-Sort,
which takes O(n²) time.
Quick-Select Visualization
Find 5th smallest number, hence k = 4
In this example, pivot is always the last element.
S = (77 99 22 66 55 44 11 88 33), call partition
pi = 2, S = (22 11 33 77 99 66 55 44 88), k > pi, call partition on the right
pi = 7, S = (22 11 33 77 66 55 44 88 99), k < pi, call partition on the left
pi = 3, S = (22 11 33 44 77 66 55 88 99), k > pi, call partition on the right
pi = 4, S = (22 11 33 44 55 77 66 88 99), k = pi, stop
5th smallest = 55
Another Quick-Select Example
Find the k = 8th smallest element from the following array
(positions 1 to 15):
65 28 59 33 21 56 22 95 50 12 90 53 28 77 39
A random pivot is selected (here 59, as the partition below shows):
65 28 59 33 21 56 22 95 50 12 90 53 28 77 39
Perform the partition function:
28 33 21 56 22 50 12 53 28 39 59 65 95 90 77
The pivot's position (11) is greater than the target k, so
repeat the partition function on the 'lesser' side.
Based on an example from Algorithm Design, Pearson Addison-Wesley – J. Kleinberg
Another Quick-Select Example
Find a random pivot in the 'lesser' section (here 33):
28 33 21 56 22 50 12 53 28 39 59 65 95 90 77
Perform the partition function with this pivot:
21 22 12 28 28 33 56 50 53 39 59 65 95 90 77
The pivot's position (6) is less than the target k, so repeat
the partition function on the 'greater' side (positions 7 to 10):
21 22 12 28 28 33 56 50 53 39 59 65 95 90 77
A random pivot is picked there (here 50).
Another Quick-Select Example
Partition:
21 22 12 28 28 33 39 50 56 53 59 65 95 90 77
It is not necessary to repeat the partition function: the pivot
has landed at position 8, exactly the k we want (only the
elements at the ninth and tenth positions remain unsorted).
The kth element is 50.
How does this compare with the fastest sorting algorithm?
Comparison
As the data size increases, the speed advantage of
quick-select over full sorting becomes more obvious.
Source : http://blog.teamleadnet.com/2012/07/quick-select-algorithm-find-kth-element.html
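To see the gap for yourself, here is a small, unscientific timing sketch (an assumption of this illustration, not taken from the source above) comparing the quickselect function defined earlier against full sorting:

import random, time

def kth_by_sorting(a, k):
    # Baseline: sort everything, then index -- O(n log n).
    return sorted(a)[k]

n = 1_000_000
data = [random.random() for _ in range(n)]
k = n // 2

t0 = time.perf_counter()
v1 = quickselect(data[:], 0, n - 1, k)   # expected O(n); copy, since partition mutates
t1 = time.perf_counter()
v2 = kth_by_sorting(data, k)             # O(n log n)
t2 = time.perf_counter()

print(v1 == v2)                          # True
print(f"quickselect: {t1 - t0:.3f}s, sort: {t2 - t1:.3f}s")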