Module III
Lecturer: Dr. Reshmi R Class: CSE-A&B
Syllabus:
Divide & Conquer and Greedy Strategy:
Divide & Conquer: The Control Abstraction of Divide and Conquer, 2-way Merge Sort, Strassen's Algorithm for Matrix Multiplication, Analysis.
Greedy Strategy: The Control Abstraction of Greedy Strategy, Fractional Knapsack Problem, Minimum Cost Spanning Tree Computation (Kruskal's Algorithm, Analysis), Single Source Shortest Path Algorithm (Dijkstra's Algorithm, Analysis).
1.1 DIVIDE & CONQUER
Given a function to compute on n inputs, the divide-and-conquer strategy splits the inputs into k distinct subsets,
1 < k ≤ n, yielding k subproblems. These subproblems must be solved, and then a method must be found to combine
these subsolutions into a solution of the whole.
If the subproblems are still relatively large, then the divide-and-conquer strategy can possibly be reapplied.
1. The problem is divided until eventually subproblems that are small enough to be solved without splitting are produced.
2. Often the subproblems resulting from a divide-and-conquer design are of the same type as the original problem.
3. Smaller and smaller subproblems of the same kind are generated until eventually subproblems that are small enough to be solved without splitting are produced.

1.1.2 Control Abstraction of Divide and Conquer
A control abstraction is a procedure whose flow of control is clear but whose primary operations are specified by other
procedures whose precise meanings are left undefined.
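This control abstraction can be written as a minimal Python sketch; small, solve, divide, and combine are problem-specific placeholders passed in by the caller (they correspond to the Small, S, and Combine operations explained below, plus a divide step):

def d_and_c(P, small, solve, divide, combine):
    # Control abstraction: the flow of control is fixed, but the
    # primary operations are supplied by the caller.
    if small(P):                  # small enough to answer directly
        return solve(P)
    subproblems = divide(P)       # split P into P1, P2, ..., Pk
    return combine([d_and_c(Pi, small, solve, divide, combine)
                    for Pi in subproblems])

# Example instantiation: summing a list by halving it.
total = d_and_c(
    [3, 1, 4, 1, 5, 9],
    small=lambda p: len(p) <= 1,
    solve=lambda p: p[0] if p else 0,
    divide=lambda p: (p[:len(p) // 2], p[len(p) // 2:]),
    combine=sum,
)  # total == 23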
Explanation:
DAndC is initially invoked as DAndC(P), where P is the problem to be solved. Small(P) is a Boolean-valued function that determines whether the input size is small enough that the answer can be computed without splitting. If this is so, the function S is invoked. Otherwise the problem P is divided into smaller subproblems. These subproblems P1, P2, ..., Pk are solved by recursive applications of DAndC. Combine is a function that determines the solution to P using the solutions to the k subproblems. If the size of P is n and the sizes of the k subproblems are n1, n2, ..., nk respectively, then the computing time of DAndC is described by the recurrence relation:
T(n) = g(n), for small n
T(n) = T(n1) + T(n2) + ... + T(nk) + f(n), otherwise
where T (n) is the time for DAndC on any input of size n and g(n) is the time to compute the answer directly for small
inputs. The function f (n) is the time for dividing P and combining the solutions to subproblems.
The complexity of many divide-and-conquer algorithms is given by a recurrence of the form
T(n) = T(1), for n = 1
T(n) = aT(n/b) + f(n), for n > 1
where a and b are known constants. Assume that T(1) is known and n is a power of b (i.e., n = b^k).
Example:
Consider the case in which a = 2 and b = 2. Let T(1) = 2 and f(n) = n. We have
T(n) = 2T(n/2) + n
     = 2[2T(n/4) + n/2] + n = 4T(n/4) + 2n
     = 4[2T(n/8) + n/4] + n = 8T(n/8) + 3n
In general, T(n) = 2^i T(n/2^i) + i·n, for any 1 ≤ i ≤ log2 n. In particular, T(n) = 2^(log2 n) T(n/2^(log2 n)) + n·log2 n, corresponding to the choice i = log2 n. Thus, T(n) = nT(1) + n·log2 n = n·log2 n + 2n.
1.2 Merge Sort
Merge sort is an ideal example of the divide-and-conquer strategy in which the splitting is into two equal-sized sets and the combining operation is the merging of two sorted sets into one.
Example: Consider the array a[1 : 10] = (310, 285, 179, 652, 351, 423, 861, 254, 450, 520). Repeated recursive calls on the first half produce the configuration
(310 | 285 | 179 | 652, 351 | 423, 861, 254, 450, 520)
The pieces are then merged, leaving a[1 : 5] sorted. At this point the algorithm has returned to the first invocation of MergeSort and is about to process the second recursive call. Repeated recursive calls are invoked, producing the following subarrays:
(179, 285, 310, 351, 652 | 423 | 861 | 254 | 450, 520)
Elements a[6] and a[7] are merged. Then a[8] is merged with a[6 : 7]:
(179, 285, 310, 351, 652 | 254, 423, 861 | 450, 520)
Next a[9] and a[10] are merged, and then a[6 : 8] and a[9 : 10]:
(179, 285, 310, 351, 652 | 254, 423, 450, 520, 861)
At this point there are two sorted subarrays, and the final merge produces the fully sorted result
(179, 254, 285, 310, 351, 423, 450, 520, 652, 861)
[Figure: tree of recursive calls of MergeSort; the pair of values in each node are the values of the parameters low and high.]
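The walkthrough above can be reproduced with a short Python sketch of merge sort; this minimal version returns a new sorted list rather than sorting a[low : high] in place as the textbook algorithm does:

def merge(left, right):
    # Merge two sorted lists into one sorted list.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out

def merge_sort(a):
    # Split into two equal-sized halves, sort each recursively,
    # and merge the two sorted halves.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

print(merge_sort([310, 285, 179, 652, 351, 423, 861, 254, 450, 520]))
# [179, 254, 285, 310, 351, 423, 450, 520, 652, 861]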
Complexity Analysis:
If the time for the merging operation is proportional to n, then the computing time for merge sort is described by the recurrence relation:
T(n) = a, n = 1 (a is a constant)
T(n) = 2T(n/2) + cn, n > 1 (c is a constant)
Solving this recurrence (for n a power of 2) gives T(n) = O(n log n).

1.3 Matrix Multiplication
Let A and B be two n × n matrices. The product matrix C = AB is also an n × n matrix whose (i, j)th element is formed by taking the elements in the ith row of A and the jth column of B and multiplying them to get
C(i, j) = Σ_{1≤k≤n} A(i, k) · B(k, j)
for all i and j between 1 and n. To compute C(i, j) using this formula, we need n multiplications. As the matrix C has n² elements, the time for the resulting matrix multiplication algorithm, which we refer to as the conventional method, is Θ(n³).
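A minimal Python sketch of the conventional method; the three nested loops of n iterations each are what give the Θ(n³) bound:

def mat_mul(A, B):
    # Conventional Θ(n^3) multiplication of two n x n matrices.
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):               # n multiplications per C[i][j]
                C[i][j] += A[i][k] * B[k][j]
    return C

print(mat_mul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]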
1.3.1 Matrix Multiplication: Divide & Conquer
The divide-and-conquer strategy suggests another way to compute the product of two n × n matrices. Assume that n is a power of 2, that is, there exists a nonnegative integer k such that n = 2^k. In case n is not a power of two, enough rows and columns of zeros can be added to both A and B so that the resulting dimensions are a power of two.
Imagine that A and B are each partitioned into four square submatrices, each having dimensions (n/2) × (n/2). Then the product AB can be computed by using the formula for the product of 2 × 2 matrices: if AB is

[A11 A12] [B11 B12]   [C11 C12]
[A21 A22] [B21 B22] = [C21 C22]

then
C11 = A11 B11 + A12 B21
C12 = A11 B12 + A12 B22
C21 = A21 B11 + A22 B21
C22 = A21 B12 + A22 B22
If n = 2, then the above formulas are computed using a multiplication operation for the elements of A and B. These elements are typically floating point numbers. For n > 2, the elements of C can be computed using matrix multiplication and addition operations applied to matrices of size (n/2) × (n/2). Since n is a power of 2, these matrix products can be recursively computed by the same algorithm used for the n × n case. The algorithm continues applying itself to smaller-sized submatrices until n becomes suitably small (n = 2), so that the product is computed directly.
To compute AB using the above formulas, we need to perform eight multiplications of (n/2) × (n/2) matrices and four additions of (n/2) × (n/2) matrices. Since two (n/2) × (n/2) matrices can be added in time cn² for some constant c, the overall computing time T(n) of the resulting divide-and-conquer algorithm is described by the recurrence:
T(n) = b, n ≤ 2 (b is a constant)
T(n) = 8T(n/2) + cn², n > 2
This recurrence can be solved in the same way as earlier recurrences to obtain T(n) = O(n³). Hence no improvement over the conventional method has been made.
Matrix multiplications are more expensive than matrix additions (O(n³) versus O(n²)), so we need to reformulate the equations for C(i, j) so as to have fewer multiplications and possibly more additions.
Volker Strassen discovered a way to compute the C(i, j) using only 7 multiplications and 18 additions or subtractions. His method involves first computing the seven (n/2) × (n/2) matrices P, Q, R, S, T, U, and V. Then the C(i, j) are computed using the formulas below. P, Q, R, S, T, U, and V can be computed using 7 matrix multiplications and 10 matrix additions or subtractions. The C(i, j) require an additional 8 additions or subtractions.

P = (A11 + A22)(B11 + B22)
Q = (A21 + A22) B11
R = A11 (B12 − B22)
S = A22 (B21 − B11)
T = (A11 + A12) B22
U = (A21 − A11)(B11 + B12)
V = (A12 − A22)(B21 + B22)

C11 = P + S − T + V
C12 = R + T
C21 = Q + S
C22 = P + R − Q + U
The resulting recurrence relation for T(n) is:
T(n) = b, n ≤ 2 (b is a constant)
T(n) = 7T(n/2) + an², n > 2 (a is a constant)
Working with this recurrence gives T(n) = O(n^(log2 7)) ≈ O(n^2.81).
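A minimal Python sketch of Strassen's scheme, assuming n is a power of 2; for brevity it recurses down to 1 × 1 matrices instead of stopping at n = 2, which does not change the idea:

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def quadrants(M):
    # Split M into its four (n/2) x (n/2) submatrices.
    m = len(M) // 2
    return ([row[:m] for row in M[:m]], [row[m:] for row in M[:m]],
            [row[:m] for row in M[m:]], [row[m:] for row in M[m:]])

def strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = quadrants(A)
    B11, B12, B21, B22 = quadrants(B)
    # Seven recursive multiplications (instead of eight).
    P = strassen(add(A11, A22), add(B11, B22))
    Q = strassen(add(A21, A22), B11)
    R = strassen(A11, sub(B12, B22))
    S = strassen(A22, sub(B21, B11))
    T = strassen(add(A11, A12), B22)
    U = strassen(sub(A21, A11), add(B11, B12))
    V = strassen(sub(A12, A22), add(B21, B22))
    # Combine using additions/subtractions only.
    C11 = add(sub(add(P, S), T), V)
    C12 = add(R, T)
    C21 = add(Q, S)
    C22 = add(sub(add(P, R), Q), U)
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom

print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]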
Note: Strassen's method for matrix multiplication is not preferred for practical applications because
• The constants hidden in Strassen's method are high, and for a typical application the naive method works better.
• For sparse matrices, there are better methods especially designed for them.

1.4 The Greedy Method
The greedy method suggests that one can devise an algorithm that works in stages, considering one input at a time. At
each stage, a decision is made regarding whether a particular input is in an optimal solution. This is done by considering
the inputs in an order determined by some selection procedure.
If the inclusion of the next input into the partially constructed optimal solution will result in an infeasible solution, then this input is not added to the partial solution. Otherwise, it is added.
The selection procedure itself is based on some optimization measure; these measures are objective functions.
All problems have n inputs and require us to obtain a subset that satisfies some constraints. Any subset that satis-
fies these constraints is called a feasible solution.
A feasible solution that either maximizes or minimizes a given objective function is called an optimal solution.
Several different optimization measures may be plausible for a given problem; most of these, however, will result in algorithms that generate suboptimal solutions. This version of the greedy technique is called the subset paradigm.
In certain greedy methods, decisions are made by considering the inputs in some order. Each decision is made using an optimization criterion that can be computed using decisions already made. This version of the greedy method is called the ordering paradigm.
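A minimal Python sketch of this control abstraction; select, feasible, and union are problem-specific placeholders corresponding to the functions Select, Feasible, and Union explained below:

def greedy(a, select, feasible, union):
    # a is the list of inputs; the solution is built one decision at a time.
    solution = []
    while a:
        x = select(a)                 # pick and remove the next input from a
        if feasible(solution, x):     # does x keep the partial solution feasible?
            solution = union(solution, x)
    return solution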
The function Select selects an input from a[] and removes it. The selected input's value is assigned to x. Feasible is a Boolean-valued function that determines whether x can be included into the solution vector. The function Union combines x with the solution and updates the objective function.

1.5 The Fractional Knapsack Problem
Given n objects and a knapsack or bag. Object i has a weight wi, and the knapsack has a capacity m. If a fraction xi, 0 ≤ xi ≤ 1, of object i is placed into the knapsack, then a profit of pi·xi is earned. The objective is to obtain a filling of the knapsack that maximizes the total profit earned. Since the knapsack capacity is m, we require the total weight of all chosen objects to be at most m:

maximize Σ_{1≤i≤n} pi·xi
subject to Σ_{1≤i≤n} wi·xi ≤ m, and 0 ≤ xi ≤ 1 for 1 ≤ i ≤ n
Example: Consider n = 3, m = 20, (p1, p2, p3) = (25, 24, 15), and (w1, w2, w3) = (18, 15, 10). Four feasible solutions are:

    (x1, x2, x3)       Σ wi·xi    Σ pi·xi
1.  (1/2, 1/3, 1/4)    16.5       24.25
2.  (1, 2/15, 0)       20         28.2
3.  (0, 2/3, 1)        20         31
4.  (0, 1, 1/2)        20         31.5

Of these, solution 4 yields the maximum profit and is optimal. Three greedy strategies suggest themselves:
1. Fill the knapsack by including next the object with largest profit. If an object under consideration doesn’t fit, then
a fraction of it is included to fill the knapsack. Thus each time an object is included (except possibly when the
last object is included) into the knapsack, we obtain the largest possible increase in profit value.
In the above example, object one has the largest profit value (p1 = 25), so it is placed into the knapsack first. Then x1 = 1 and a profit of 25 is earned. Only 2 units of knapsack capacity are left. Object two has the next largest profit (p2 = 24). However, w2 = 15 and it doesn't fit into the knapsack. Using x2 = 2/15 fills the knapsack exactly with part of object 2, and the value of the resulting solution is 28.2. This is solution 2, and it is readily seen to be suboptimal.
Note: This method used to obtain the solution is termed a greedy method because at each step (except pos-
sibly the last one) we chose to introduce that object which would increase the objective function value the most.
However, this greedy method did not yield an optimal solution.
2. Consider the objects in order of non-decreasing weights wi. In the above example, this yields solution 3. This too is suboptimal: even though capacity is used up slowly, profits aren't coming in rapidly enough.
3. The next greedy strategy strives to achieve a balance between the rate at which profit increases and the rate at which capacity is used. At each step, include the object which has the maximum profit per unit of capacity used; that is, consider the objects in order of non-increasing ratio pi/wi. In the above example, this yields solution 4, which is optimal.
Disregarding the time to initially sort the objects, each of the three strategies outlined above requires only O(n) time.
When one applies the greedy method to the solution of the knapsack problem, there are at least three different measures
one can attempt to optimize when determining which object to include next. These measures are total profit, capacity
used, and the ratio of accumulated profit to capacity used.
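A minimal Python sketch of the third strategy (greedy by profit per unit of weight); the helper name fractional_knapsack is illustrative, not the textbook's pseudocode:

def fractional_knapsack(p, w, m):
    # Consider objects in non-increasing order of profit/weight ratio.
    n = len(p)
    order = sorted(range(n), key=lambda i: p[i] / w[i], reverse=True)
    x = [0.0] * n                  # solution vector
    remaining = m
    for i in order:
        if w[i] <= remaining:      # the whole object fits
            x[i] = 1.0
            remaining -= w[i]
        else:                      # take the fraction that exactly fills the knapsack
            x[i] = remaining / w[i]
            break
    return x

print(fractional_knapsack([25, 24, 15], [18, 15, 10], 20))
# [0.0, 1.0, 0.5]  -> profit 24 + 7.5 = 31.5 (solution 4 above)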
Example (Fractional Knapsack Problem): For the given set of items and a knapsack capacity of 10 kg, find the subset of the items to be added to the knapsack such that the profit is maximum.
Item no:      1    2    3    4    5
Weight (kg):  3    3    2    5    1
Profit:       10   15   10   20   8
Solution:
Step 1:
Given, n = 5
Wi = {3, 3, 2, 5, 1}
Pi = {10, 15, 10, 20, 8}
Calculate Pi/Wi for all the items:

Item no:      1     2    3    4    5
Weight (kg):  3     3    2    5    1
Profit:       10    15   10   20   8
Pi/Wi:        3.3   5    5    4    8
Step 2:
Arrange all the items in descending order based on Pi/Wi.
Step 3:
Without exceeding the knapsack capacity, insert the items in the knapsack with maximum profit.
Knapsack = {5, 2, 3}
However, the knapsack can still hold 4 kg more, but the next item weighs 5 kg and would exceed the capacity. Therefore, only 4 kg of the 5 kg item is added to the knapsack. Hence, the knapsack holds the following:

Item no:      5    2    3    4     1
Weight (kg):  1    3    2    5     3
Profit:       8    15   10   20    10
Pi/Wi:        8    5    5    4     3.3
Knapsack:     1    1    1    4/5   0
Maximum profit, Σ_{1≤i≤5} pi·xi = (1·8) + (1·15) + (1·10) + (4/5·20) = 49.
Solution vector X = {x1, x2, x3, x4, x5} = {0, 1, 1, 4/5, 1}.

1.6 Minimum Cost Spanning Trees
Definition 1: Let G = (V, E) be an undirected connected graph. A subgraph t = (V, E′) of G is a spanning tree of G iff t is a tree.
[Figure: An undirected graph and three of its spanning trees]
A spanning tree is a subgraph of G which covers all the vertices with the minimum possible number of edges. Hence, a spanning tree does not have cycles and cannot be disconnected.
A minimum spanning tree is a spanning tree whose sum of edge weights is minimum. There are two methods to find a minimum spanning tree:
1. Kruskal’s Algorithm
2. Prim’s Algorithm
• If the nodes of G represent cities and the edges represent possible communication links connecting two cities, then a spanning tree represents all feasible choices with the minimum number of links needed to connect the n cities. The minimum number of links needed is n − 1.
• To construct highways or railroads spanning several cities, the concept of a minimum spanning tree can be used.
• To supply houses with electric power, water, telephone lines, and sewage lines, we can connect the houses using a minimum-cost spanning tree to reduce cost.
1.6.1 Kruskal's Algorithm
Kruskal's algorithm constructs a minimum spanning tree for a connected weighted graph. It is a greedy algorithm: the greedy choice is to pick the smallest-weight edge that does not make a cycle in the MST constructed so far. Steps to find a minimum spanning tree using Kruskal's algorithm:
1. Sort all the edges in non-decreasing order of their cost.
2. Starting only with the vertices of G and proceeding sequentially, add each edge which does not result in a cycle, until (n − 1) edges are used.

Example: Find the minimum cost spanning tree of the below graph using Kruskal's method.
The next cost is 3, and the associated edges are (A, C) and (C, D). Add these edges. The next cost in the table is 4, and we observe that adding it would create a circuit in the graph. In the process we shall ignore/avoid all edges that create a circuit.
The edges with cost 5 and 6 also create circuits. So ignore them and move on.
We are now left with only one node to be added. Between the two least-cost edges available, 7 and 8, we add the edge with cost 7. By adding edge (S, A) we have included all the nodes of the graph, and we now have a minimum cost spanning tree.
Complexity Analysis:
Initially E is the set of all edges in G. The only functions we wish to perform on this set are
(1) determine an edge with minimum cost and
(2) delete this edge.
Both these functions can be performed efficiently if the edges in E are maintained as a sorted sequential list. If the
edges are maintained as a minheap, then the next edge to consider can be obtained in O(log |E|) time. The construction
of the heap itself takes O(|E|) time.
To be able to perform this cycle check efficiently, the vertices in G should be grouped together in such a way that one can easily determine whether the vertices v and w are already connected by the earlier selection of edges. If they are, then the edge (v, w) is to be discarded. If they are not, then (v, w) is to be added to t. Using the set representation and the union and find algorithms, we can obtain an efficient (almost linear) implementation.
as the worst case which is O(|E| log |E|).
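A minimal Python sketch of Kruskal's algorithm combining sorted edges with the grouping scheme described above; the dictionary-based union-find is a simplified stand-in for the textbook's Union and Find algorithms:

def kruskal(vertices, edges):
    # edges: list of (cost, u, v) tuples.
    parent = {v: v for v in vertices}

    def find(x):
        # Find the representative of x's group, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree, total = [], 0
    for cost, u, v in sorted(edges):        # consider edges in non-decreasing cost
        ru, rv = find(u), find(v)
        if ru != rv:                        # (u, v) does not create a cycle
            parent[ru] = rv                 # merge the two groups
            tree.append((u, v, cost))
            total += cost
            if len(tree) == len(vertices) - 1:
                break                       # n - 1 edges selected
    return tree, total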
1.7 Single Source Shortest Paths
In a shortest-paths problem, we are given a weighted, directed graph G = (V, E), with weight function w : E → R mapping edges to real-valued weights. The weight of a path p = ⟨v0, v1, ..., vk⟩ is the sum of the weights of its constituent edges:
w(p) = Σ_{i=1}^{k} w(v_{i−1}, v_i)
Variants of the problem include:
• Single-destination shortest-paths problem: Find a shortest path to a given destination vertex t from each vertex
v. By reversing the direction of each edge in the graph, we can reduce this problem to a single-source problem.
• Single-pair shortest-path problem: Find a shortest path from u to v for given vertices u and v. If we solve
the single-source problem with source vertex u, we solve this problem also. Moreover, no algorithms for this
problem are known that run asymptotically faster than the best single-source algorithms in the worst case.
• All-pairs shortest-paths problem: Find a shortest path from u to v for every pair of vertices u and v. Although
this problem can be solved by running a single source algorithm once from each vertex, it can usually be solved
faster.
1.7.1 Dijkstra’s Algorithm
Dijkstra's algorithm is a single-source shortest path algorithm. Here, single-source means that only one source is given, and we have to find the shortest path from the source to all the other nodes.
Example:
First, consider any vertex as the source vertex; suppose we choose vertex s as the source. Initially we do not know the distances, so the distance from s to every other vertex is taken as infinity. First, we find the vertices which are directly connected to the vertex s. For an edge (u, v), the distance to v can be updated as:
d(v) = min{d(u) + c(u, v), d(v)}
i.e., if d(u) + c(u, v) < d(v), then d(v) = d(u) + c(u, v).
Step 1: Vertex s is the source vertex. The distances from vertex s to t and y are updated.
Step 2: The vertex with minimum distance (vertex y) is selected.
Repeating this process, all vertices are visited and the distances from vertex s to all other vertices are updated.
Algorithm:
[Figure: Dijkstra's algorithm pseudocode and the worked example with final distances]
The shortest paths from the source s found in the example are:
s → y
s → y → t
s → y → z
s → y → t → x
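A minimal Python sketch of Dijkstra's algorithm, assuming an adjacency-list representation and using a binary heap to select the vertex with minimum distance:

import heapq

def dijkstra(graph, source):
    # graph: dict mapping each vertex to a list of (neighbor, cost) pairs.
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)          # vertex with minimum distance
        if d > dist[u]:
            continue                        # stale heap entry; skip it
        for v, c in graph[u]:
            if d + c < dist[v]:             # relaxation: d(v) = min{d(u) + c(u, v), d(v)}
                dist[v] = d + c
                heapq.heappush(heap, (dist[v], v))
    return dist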
Complexity Analysis:
Any shortest path algorithm must examine each edge in the graph at least once since any of the edges could be in a
shortest path. Hence, the minimum possible time for such an algorithm would be Ω(|E|).Since cost adjacency matrices
were used to represent the graph, it takes O(n2 ) time just to determine which edges are in G, and so any shortest path
algorithm using this representation must take Ω(n²) time. If a change to adjacency lists is made, only the edges adjacent to the current vertex need to be examined at each step, so the total time for updating distances drops to O(|E|); combined with a heap for selecting the minimum-distance vertex, the overall running time becomes O((n + |E|) log n).