
Module-4

Dynamic Programming
and
THE GREEDY METHOD
Dynamic Programming

▪ Dynamic programming is a technique for solving problems with overlapping sub-problems.

▪ Typically, these sub-problems arise from a recurrence relating a given problem's solution to solutions of its smaller sub-problems.

▪ Rather than solving overlapping sub-problems again and again, dynamic programming suggests solving each of the smaller sub-problems only once and recording the results in a table from which a solution to the original problem can then be obtained.

▪ Dynamic programming can also be used when the solution to a problem can be viewed as the result of a sequence of decisions.



Three Basic Examples
EXAMPLE 1: Coin-row problem

▪ There is a row of n coins whose values are some positive integers c1, c2, ..., cn, not
necessarily distinct.
▪ The goal is to pick up the maximum amount of money subject to the constraint
that no two coins adjacent in the initial row can be picked up.
▪ Let F(n) be the maximum amount that can be picked up from the row of n coins.
▪ To derive a recurrence for F(n), we partition all the allowed coin selections into two
groups: those that include the last coin and those without it.
▪ The largest amount we can get from the first group is equal to cn +F(n−2)—the
value of the nth coin plus the maximum amount we can pick up from the first n−2
coins.
▪ The maximum amount we can get from the second group is equal to F(n−1) by the
definition of F(n).
▪ Thus, we have the following recurrence subject to the obvious initial conditions:

F(n) = max(cn + F(n−2), F(n−1)) for n > 1, F(0) = 0, F(1) = c1. --- (8.3)
▪ The application of the algorithm to the coin row of denominations 5, 1, 2, 10, 6, 2
is shown in Figure 8.1.



▪ It yields the maximum amount of 17.
▪ It is worth pointing out that, in fact, we also solved the problem for the first i coins in the row given, for every 1 ≤ i ≤ 6.
▪ For example, for i = 3, the maximum amount is F(3) = 7.
▪ To find the coins with the maximum total value, we need to backtrace the computations to see which of the two possibilities—cn + F(n−2) or F(n−1)—produced the maximum in formula (8.3).
▪ In the last application of the formula, it was the sum c6 + F(4), which means that the coin c6 = 2 is a part of an optimal solution.
▪ Moving to computing F(4), the maximum was produced by the sum c4 + F(2), which means that the coin c4 = 10 is a part of an optimal solution as well.
▪ Finally, the maximum in computing F(2) was produced by F(1), implying that the coin c2 is not part of an optimal solution and the coin c1 = 5 is.
▪ Thus, the optimal solution is {c1, c4, c6}.
▪ To avoid repeating the same computations during the backtracking, the information about which of the two terms in (8.3) was larger can be recorded in an extra array when the values of F are computed.

▪ Using the coin-row algorithm to find F(n), the largest amount of money that can be picked up, as well as the coins composing an optimal set, clearly takes Θ(n) time and Θ(n) space.
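A minimal sketch of the coin-row algorithm in Python, including the backtracing step; the function name and the explicit backtracing loop are illustrative, not from the text:

```python
def coin_row(coins):
    """Return the maximum pickable amount and one optimal coin selection."""
    n = len(coins)
    F = [0] * (n + 1)
    if n >= 1:
        F[1] = coins[0]
    for i in range(2, n + 1):
        F[i] = max(coins[i - 1] + F[i - 2], F[i - 1])   # formula (8.3)
    # Backtrace: see which of the two terms produced each maximum.
    picked, i = [], n
    while i >= 1:
        if i == 1 or coins[i - 1] + F[i - 2] > F[i - 1]:
            picked.append(i)    # coin c_i is part of an optimal solution
            i -= 2
        else:
            i -= 1              # coin c_i was skipped
    return F[n], sorted(picked)

print(coin_row([5, 1, 2, 10, 6, 2]))   # (17, [1, 4, 6]), as in Figure 8.1
```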



EXAMPLE 2: Change-making problem

▪ Consider the general instance of the following well-known problem.

▪ Give change for amount n using the minimum number of coins of denominations d1 < d2 < ... < dm.
▪ Here, we consider a dynamic programming algorithm for the general case,
assuming availability of unlimited quantities of coins for each of the m
denominations, d1 < d2 <...< dm where d1=1.
▪ Let F(n) be the minimum number of coins whose values add up to n; it is
convenient to define F(0) =0.
▪ The amount n can only be obtained by adding one coin of denomination dj to the
amount n−dj for j=1,2,...,m such that n≥dj.
▪ Therefore, we can consider all such denominations and select the one minimizing
F(n−dj)+1.
▪ Since 1 is a constant, we can, of course, find the smallest F(n−dj) first and then
add 1 to it.
▪ Hence, we have the following recurrence for F(n):

F(n) = min {F(n − dj) : j = 1, . . . , m, n ≥ dj} + 1 for n > 0, F(0) = 0. --- (8.4)

▪ We can compute F(n) by filling a one-row table left to right in a manner similar to the way it was done above for the coin-row problem, but computing a table entry here requires finding the minimum of up to m numbers.



▪ The application of the algorithm to amount n = 6 and denominations 1, 3, 4 is
shown in Figure 8.2.



▪ To find the coins of an optimal solution, we need to backtrace the computations to see which of the denominations produced the minima in formula (8.4).
▪ For the instance considered, in the last application of the formula (for n = 6) the minimum was produced by d2 = 3.
▪ The second minimum (for n = 6 − 3) was also produced by a coin of that denomination.
▪ Thus, the minimum-coin set for n = 6 is two 3's.

The time and space efficiencies of the algorithm are obviously O(nm) and Θ(n), respectively.
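A Python sketch of this algorithm, again with an illustrative backtracing step; it assumes d1 = 1 so that F(n) is always defined:

```python
def change_making(denoms, n):
    """Minimum number of coins adding up to n; denoms must include 1."""
    F = [0] + [float('inf')] * n
    for amount in range(1, n + 1):
        # Formula (8.4): smallest F(amount - d) over usable denominations, plus one coin.
        F[amount] = min(F[amount - d] for d in denoms if d <= amount) + 1
    # Backtrace one optimal coin set.
    coins, amount = [], n
    while amount > 0:
        best = min((d for d in denoms if d <= amount), key=lambda d: F[amount - d])
        coins.append(best)
        amount -= best
    return F[n], coins

print(change_making([1, 3, 4], 6))   # (2, [3, 3]), as in Figure 8.2
```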



EXAMPLE 3: Coin-collecting problem

▪ Several coins are placed in cells of an n × m board, no more than one coin per cell.
▪ A robot, located in the upper left cell of the board, needs to collect as many of
the coins as possible and bring them to the bottom right cell.
▪ On each step, the robot can move either one cell to the right or one cell down
from its current location.
▪ When the robot visits a cell with a coin, it always picks up that coin.
▪ Design an algorithm to find the maximum number of coins the robot can collect
and a path it needs to follow to do this.
▪ Let F(i,j) be the largest number of coins the robot can collect and bring to the cell
(i, j) in the ith row and jth column of the board.
▪ It can reach this cell either from the adjacent cell (i − 1, j) above it or from the
adjacent cell (i, j − 1) to the left of it.
▪ The largest numbers of coins that can be brought to these cells are F(i−1, j) and F(i,
j−1), respectively.
▪ Of course, there are no adjacent cells above the cells in the first row, and there
are no adjacent cells to the left of the cells in the first column.
▪ For those cells, we assume that F(i− 1,j) and F(i,j−1) are equal to 0 for their
nonexistent neighbors.
▪ Therefore, the largest number of coins the robot can bring to cell (i, j) is the
maximum of these two numbers plus one possible coin at cell (i, j) itself.



▪ In other words, we have the following formula for F(i, j):

F(i, j) = max(F(i−1, j), F(i, j−1)) + cij for 1 ≤ i ≤ n, 1 ≤ j ≤ m, --- (8.5)

where cij = 1 if there is a coin in cell (i, j), cij = 0 otherwise, and F(0, j) = F(i, 0) = 0 for the nonexistent neighbors.

▪ Using these formulas, we can fill in the n×m table of F(i,j) values either row by row
or column by column, as is typical for dynamic programming algorithms involving
two-dimensional tables.

▪ The algorithm is illustrated in Figure 8.3b for the coin setup in Figure 8.3a.



▪ Since computing the value of F(i, j) by formula (8.5) for each cell of the table takes constant time, the time efficiency of the algorithm is Θ(nm).
▪ Its space efficiency is, obviously, also Θ(nm).

▪ Tracing the computations backward makes it possible to get an optimal path:


▪ if F(i−1, j) > F(i, j−1), an optimal path to cell (i, j) must come down from the adjacent cell above it;
▪ if F(i−1, j) < F(i, j−1), an optimal path to cell (i, j) must come from the adjacent cell on the left; and
▪ if F(i−1, j) = F(i, j−1), it can reach cell (i, j) from either direction.

▪ This yields two optimal paths for the instance in Figure 8.3a, which are shown in Figure 8.3c.
▪ If ties are ignored, one optimal path can be obtained in Θ(n + m) time.
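A Python sketch of the coin-collecting algorithm with recovery of one optimal path; ties are broken toward the upper cell, and the sample board is illustrative:

```python
def coin_collecting(board):
    """board[i][j] is 1 if cell (i, j) holds a coin. Returns (max coins, one path)."""
    n, m = len(board), len(board[0])
    F = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            up   = F[i - 1][j] if i > 0 else 0
            left = F[i][j - 1] if j > 0 else 0
            F[i][j] = max(up, left) + board[i][j]   # formula (8.5)
    # Trace one optimal path backward from the bottom-right cell.
    path, i, j = [(n - 1, m - 1)], n - 1, m - 1
    while i > 0 or j > 0:
        up   = F[i - 1][j] if i > 0 else -1
        left = F[i][j - 1] if j > 0 else -1
        if up >= left:
            i -= 1              # came down from the cell above
        else:
            j -= 1              # came from the cell on the left
        path.append((i, j))
    return F[n - 1][m - 1], path[::-1]

board = [[0, 1, 0],
         [1, 0, 1],
         [0, 1, 0]]
print(coin_collecting(board))   # (2, [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)])
```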



Knapsack Problem

▪ We start this section with designing a dynamic programming algorithm for the
knapsack problem: given n items of known weights w1, . . . , wn and values v1, . . . , vn
and a knapsack of capacity W, find the most valuable subset of the items that fit into
the knapsack.
▪ To design a dynamic programming algorithm, we need to derive a recurrence relation
that expresses a solution to an instance of the knapsack problem in terms of solutions
to its smaller subinstances.
▪ Let us consider an instance defined by the first i items, 1≤ i ≤ n, with weights w1, . . . ,
wi, values v1, . . . , vi , and knapsack capacity j, 1 ≤ j ≤ W.
▪ Let F(i, j) be the value of an optimal solution to this instance.
▪ We can divide all the subsets of the first i items that fit the knapsack of capacity j into two categories: those that do not include the ith item and those that do. Note the following:
Among the subsets that do not include the ith item, the value of an optimal
subset is, by definition, F(i − 1, j).
Among the subsets that do include the ith item (hence, j − wi ≥ 0), an optimal
subset is made up of this item and an optimal subset of the first i−1 items that
fits into the knapsack of capacity j − wi . The value of such an optimal subset is vi
+ F(i − 1, j − wi).



▪ Thus, the value of an optimal solution among all feasible subsets of the first i items is the maximum of these two values:

F(i, j) = max(F(i−1, j), vi + F(i−1, j−wi)) if j − wi ≥ 0,
F(i, j) = F(i−1, j) if j − wi < 0.

▪ It is convenient to define the initial conditions as follows:


F(0, j) = 0 for j ≥ 0 and F(i, 0) = 0 for i ≥ 0.

▪ Our goal is to find F(n, W), the maximal value of a subset of the n given items that fit
into the knapsack of capacity W, and an optimal subset itself.



Example-1: Let us consider the instance given by the following data (knapsack capacity W = 5):

item    weight    value
1       2         $12
2       1         $10
3       3         $20
4       2         $15

The dynamic programming table, filled by applying the recurrence above, is given below (rows are items i = 0..4, columns are capacities j = 0..5):

i\j    0    1    2    3    4    5
0      0    0    0    0    0    0
1      0    0   12   12   12   12
2      0   10   12   22   22   22
3      0   10   12   22   30   32
4      0   10   15   25   30   37

Thus, the maximal value is F(4, 5) = $37.



▪ We can find the composition of an optimal subset by backtracing the computations
of this entry in the table.
▪ Since F(4, 5) > F(3, 5), item 4 has to be included in an optimal solution along with an
optimal subset for filling 5 − 2 = 3 remaining units of the knapsack capacity.
▪ The value of the latter is F(3, 3). Since F(3, 3) = F(2, 3), item 3 need not be in an
optimal subset.
▪ Since F(2, 3) > F(1, 3), item 2 is a part of an optimal selection, which leaves element
F(1, 3 − 1) to specify its remaining composition.
▪ Similarly, since F(1, 2) > F(0, 2), item 1 is the final part of the optimal solution {item
1, item 2, item 4}.
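A Python sketch of the bottom-up algorithm with the backtracing step, applied to the data of Example-1 above:

```python
def knapsack(weights, values, W):
    """F[i][j] = value of an optimal subset of the first i items with capacity j."""
    n = len(weights)
    F = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, W + 1):
            F[i][j] = F[i - 1][j]                     # ith item not included
            if j >= weights[i - 1]:                   # ith item may be included
                F[i][j] = max(F[i][j],
                              values[i - 1] + F[i - 1][j - weights[i - 1]])
    # Backtrace the composition of an optimal subset.
    items, j = [], W
    for i in range(n, 0, -1):
        if F[i][j] != F[i - 1][j]:                    # item i has to be included
            items.append(i)
            j -= weights[i - 1]
    return F[n][W], sorted(items)

print(knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5))    # (37, [1, 2, 4])
```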



Analysis
▪ The time efficiency and space efficiency of this algorithm are both in Θ(nW).
▪ The time needed to find the composition of an optimal solution is in O(n).



Memory Functions
▪ The direct top-down approach to finding a solution to a recurrence such as the one for the knapsack problem leads to an algorithm that solves common subproblems more than once and hence is very inefficient.
▪ The classic dynamic programming approach, on the other hand, works bottom up: it
fills a table with solutions to all smaller subproblems, but each of them is solved only
once.
▪ An unsatisfying aspect of this approach is that solutions to some of these smaller
subproblems are often not necessary for getting a solution to the problem given.
▪ Since this drawback is not present in the top-down approach, it is natural to try to
combine the strengths of the top-down and bottom-up approaches.
▪ The goal is to get a method that solves only subproblems that are necessary and does
so only once.
▪ Such a method exists; it is based on using memory functions.
▪ This method solves a given problem in the top-down manner but, in addition, maintains
a table of the kind that would have been used by a bottom-up dynamic programming
algorithm.
▪ Initially, all the table’s entries are initialized with a special “null” symbol to indicate that
they have not yet been calculated.
▪ Thereafter, whenever a new value needs to be calculated, the method checks the corresponding entry in the table first: if this entry is not "null," it is simply retrieved from the table; otherwise, it is computed by the recursive call whose result is then recorded in the table.
▪ The following algorithm implements this idea for the knapsack problem.
▪ After initializing the table, the recursive function needs to be called with i = n (the number of items) and j = W (the knapsack capacity).
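A Python sketch of this memory function; the textbook's pseudocode keeps the weights, values, and the table V as global arrays, whereas this sketch keeps them in a closure and uses None instead of a negative number as the "null" marker:

```python
def mf_knapsack(weights, values, W):
    n = len(weights)
    # Row 0 and column 0 hold the initial conditions; None means "not yet computed".
    V = [[0] * (W + 1)] + [[0] + [None] * W for _ in range(n)]

    def mfk(i, j):
        """Value of an optimal subset of the first i items with capacity j."""
        if V[i][j] is None:
            without_i = mfk(i - 1, j)
            if j < weights[i - 1]:
                V[i][j] = without_i
            else:
                V[i][j] = max(without_i,
                              values[i - 1] + mfk(i - 1, j - weights[i - 1]))
        return V[i][j]

    return mfk(n, W)

print(mf_knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5))   # 37
```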



Example-2
Let us apply the memory function method to the instance considered in Example 1. The figure below gives the results.
Only 11 out of 20 nontrivial values (i.e., not those in row 0 or in column 0) have been computed. Just one nontrivial entry, V(1, 2), is retrieved rather than being recomputed. For larger instances, the proportion of such entries can be significantly larger.

Figure: Example of solving an instance of the knapsack problem by the memory function algorithm.

In general, we cannot expect more than a constant-factor gain in using the memory
function method for the knapsack problem, because its time efficiency class is the same
as that of the bottom-up algorithm.



Transitive Closure using Warshall’s Algorithm

Definition:
The transitive closure of a directed graph with n vertices can be defined as the n×n
boolean matrix T = { tij }, in which the element in the ith row and the jth column is 1 if
there exists a nontrivial path (i.e., directed path of a positive length) from the ith vertex
to the jth vertex; otherwise, tij is 0.

Example:
An example of a digraph, its adjacency matrix, and its transitive closure is given below.



▪ We can generate the transitive closure of a digraph with the help of Depth-First
Search or Breadth-First Search.
▪ Performing either traversal starting at the ith vertex gives the information about the
vertices reachable from it and hence the columns that contain 1’s in the ith row of the
transitive closure.
▪ Thus, doing such a traversal for every vertex as a starting point yields the transitive
closure in its entirety.

▪ Since this method traverses the same digraph several times, we can use a better
algorithm called Warshall’s algorithm.
▪ Warshall’s algorithm constructs the transitive closure through a series of n × n
boolean matrices:

▪ Each of these matrices provides certain information about directed paths in the
digraph.
▪ Specifically, the element rij(k) in the ith row and jth column of matrix R(k) (i, j = 1, 2, . . .
, n, k = 0, 1, . . . , n) is equal to 1 if and only if there exists a directed path of a positive
length from the ith vertex to the jth vertex with each intermediate vertex, if any,
numbered not higher than k.



▪ Thus, the series starts with R(0) , which does not allow any intermediate vertices in
its paths; hence, R(0) is nothing other than the adjacency matrix of the digraph.
▪ R(1) contains the information about paths that can use the first vertex as
intermediate.
▪ The last matrix in the series, R(n) , reflects paths that can use all n vertices of the
digraph as intermediate and hence is nothing other than the digraph’s transitive
closure.
▪ Suppose that rij(k) = 1. This means that there exists a path from the ith vertex vi to the jth vertex vj with each intermediate vertex numbered not higher than k:

vi, a list of intermediate vertices each numbered not higher than k, vj. --- (*)

▪ Two situations regarding this path are possible.


1. In the first, the list of its intermediate vertices does not contain the kth vertex. Then this path from vi to vj has intermediate vertices numbered not higher than k − 1; hence, rij(k−1) = 1.
2. The second possibility is that path (*) does contain the kth vertex vk among the intermediate vertices. Then path (*) can be rewritten as

vi, vertices numbered ≤ k − 1, vk, vertices numbered ≤ k − 1, vj,

which implies that rik(k−1) = 1 and rkj(k−1) = 1.



▪ Thus, we have the following formula for generating the elements of matrix R(k) from the elements of matrix R(k−1):

rij(k) = rij(k−1) or (rik(k−1) and rkj(k−1)).

▪ Warshall's algorithm works based on the above formula.


▪ This formula implies the following rule for generating elements of matrix R(k) from
elements of matrix R(k-1), which is particularly convenient for applying Warshall’s algorithm by
hand:
If an element rij is 1 in R(k-1), it remains 1 in R(k).
If an element rij is 0 in R(k-1), it has to be changed to 1 in R(k) if and only if the element
in its row i and column k and the element in its column j and row k are both 1’s in
R(k-1). This rule is illustrated in Figure.
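A Python sketch of Warshall's algorithm based on this formula; it updates a single matrix in place, which is safe here because entries only change from 0 to 1:

```python
def warshall(adj):
    """Transitive closure of a digraph given by a 0/1 adjacency matrix."""
    n = len(adj)
    R = [row[:] for row in adj]        # R(0) is the adjacency matrix
    for k in range(n):                 # allow vertex k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                # r_ij(k) = r_ij(k-1) or (r_ik(k-1) and r_kj(k-1))
                if R[i][k] and R[k][j]:
                    R[i][j] = 1
    return R
```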

▪ As an example, the application of Warshall’s algorithm to the digraph is shown below.


New 1’s are in bold.
Analysis
Its time efficiency is Θ(n³). We can make the algorithm run faster by treating matrix rows as bit strings and employing the bitwise or operation available in most modern computer languages.

Space efficiency: Although the algorithm, as described, records intermediate results in separate matrices, this can be avoided by updating a single matrix in place.



All Pairs Shortest Paths using Floyd's Algorithm

Problem definition:
Given a weighted connected graph (undirected or directed), the all-pairs shortest-paths problem asks to find the distances, i.e., the lengths of the shortest paths, from each vertex to all other vertices.

Applications:
▪ Solution to this problem finds applications in communications, transportation
networks, and operations research.
▪ Among recent applications of the all-pairs shortest-path problem is pre-computing
distances for motion planning in computer games.
▪ We store the lengths of shortest paths in an n x n matrix D called the distance matrix:
the element dij in the ith row and the jth column of this matrix indicates the length of
the shortest path from the ith vertex to the jth vertex.



▪ We can generate the distance matrix with an algorithm that is very similar to
Warshall’s algorithm. It is called Floyd’s algorithm.
▪ Floyd’s algorithm computes the distance matrix of a weighted graph with n vertices
through a series of n × n matrices:

▪ The element dij(k) in the ith row and the jth column of matrix D(k) (i, j = 1, 2, . . . , n, k = 0, 1, . . . , n) is equal to the length of the shortest path among all paths from the ith vertex to the jth vertex with each intermediate vertex, if any, numbered not higher than k.

▪ As in Warshall’s algorithm, we can compute all the elements of each matrix D(k) from
its immediate predecessor D(k−1)

▪ Consider all the paths from the ith vertex vi to the jth vertex vj with each intermediate vertex numbered not higher than k; every such path has the form

vi, a list of intermediate vertices each numbered not higher than k, vj.

▪ We can partition all such paths into two disjoint subsets: those that do not use the
kth vertex vk as intermediate and those that do.
▪ Since the paths of the first subset have their intermediate vertices numbered not higher than k − 1, the shortest of them is, by the definition of our matrices, of length dij(k−1).
▪ In the second subset the paths are of the form

vi, vertices numbered ≤ k − 1, vk, vertices numbered ≤ k − 1, vj.
▪ The situation is depicted symbolically in Figure, which shows the underlying idea of
Floyd’s algorithm.

▪ Taking into account the lengths of the shortest paths in both subsets leads to the following recurrence:

dij(k) = min(dij(k−1), dik(k−1) + dkj(k−1)) for k ≥ 1, dij(0) = wij,

where wij is the weight of the edge from the ith vertex to the jth vertex (∞ if there is no such edge).


Analysis:
Its time efficiency is Θ(n³), the same as that of Warshall's algorithm.
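A Python sketch of Floyd's algorithm; the weight matrix uses float('inf') for missing edges and 0 on the diagonal, and a single matrix is updated in place:

```python
def floyd(weights):
    """All-pairs shortest-path distances from weight matrix weights."""
    n = len(weights)
    D = [row[:] for row in weights]    # D(0) is the weight matrix
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # d_ij(k) = min(d_ij(k-1), d_ik(k-1) + d_kj(k-1))
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D
```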



▪ Application of Floyd’s algorithm to the digraph is shown below.
▪ Updated elements are shown in bold



The Greedy Method
Prim’s Algorithm

▪ Prim's algorithm constructs a minimum spanning tree through a sequence of expanding sub-trees.

▪ The initial sub-tree in such a sequence consists of a single vertex selected arbitrarily
from the set V of the graph's vertices.

▪ On each iteration it expands the current tree in the greedy manner by simply
attaching to it the nearest vertex not in that tree. (By the nearest vertex, we mean a
vertex not in the tree connected to a vertex in the tree by an edge of the smallest
weight. Ties can be broken arbitrarily.)

▪ The algorithm stops after all the graph's vertices have been included in the tree being
constructed.

▪ Since the algorithm expands a tree by exactly one vertex on each of its iterations, the total number of such iterations is n − 1, where n is the number of vertices in the graph. The tree generated by the algorithm is obtained as the set of edges used for the tree expansions.



Correctness:
Prim’s algorithm always yields a minimum spanning tree.



Example:

An example of prim’s algorithm is shown below.


The parenthesized labels of a vertex in the middle column indicate the nearest tree vertex
and edge weight; selected vertices and edges are shown in bold.

Analysis of Efficiency:

The efficiency of Prim’s algorithm depends on the data structures chosen for the graph
itself and for the priority queue of the set V − VT whose vertex priorities are the distances
to the nearest tree vertices.

1. If a graph is represented by its weight matrix and the priority queue is implemented as an unordered array, the algorithm's running time will be in Θ(|V|²). Indeed, on each of the |V| − 1 iterations, the array implementing the priority queue is traversed to find and delete the minimum and then to update, if necessary, the priorities of the remaining vertices.

We can implement the priority queue as a min-heap. (A min-heap is a complete binary tree in which every element is less than or equal to its children.)

Deletion of the smallest element from and insertion of a new element into a min-heap of size n are O(log n) operations.



2. If a graph is represented by its adjacency lists and the priority queue is implemented as a min-heap, the running time of the algorithm is in O(|E| log |V|).

This is because the algorithm performs |V| − 1 deletions of the smallest element and makes |E| verifications and, possibly, changes of an element's priority in a min-heap of size not exceeding |V|.

Each of these operations, as noted earlier, is an O(log |V|) operation.

Hence, the running time of this implementation of Prim's algorithm is in

(|V| − 1 + |E|) · O(log |V|) = O(|E| log |V|),

because, in a connected graph, |V| − 1 ≤ |E|.
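A Python sketch of the adjacency-list/min-heap implementation; since the heapq module has no decrease-key operation, this variant pushes duplicate heap entries and skips stale ones, which stays within the same efficiency class:

```python
import heapq

def prim(graph, start):
    """graph: {u: [(v, weight), ...]} for a connected graph. Returns MST edges."""
    tree_vertices, tree_edges = {start}, []
    # Candidate edges (weight, tree vertex, fringe vertex) in a min-heap.
    heap = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(heap)
    while heap and len(tree_vertices) < len(graph):
        w, u, v = heapq.heappop(heap)      # nearest vertex not yet in the tree
        if v in tree_vertices:
            continue                       # stale heap entry; skip it
        tree_vertices.add(v)
        tree_edges.append((u, v, w))
        for x, wx in graph[v]:
            if x not in tree_vertices:
                heapq.heappush(heap, (wx, v, x))
    return tree_edges
```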



Kruskal’s Algorithm

Background
▪ Kruskal's algorithm is another greedy algorithm for the minimum spanning tree problem that also always yields an optimal solution.
▪ It is named after Joseph Kruskal.
▪ Kruskal's algorithm looks at a minimum spanning tree of a weighted connected graph G = (V, E) as an acyclic subgraph with |V| − 1 edges for which the sum of the edge weights is the smallest.
▪ Consequently, the algorithm constructs a minimum spanning tree as an expanding sequence of subgraphs, which are always acyclic but are not necessarily connected on the intermediate stages of the algorithm.

Working
▪ The algorithm begins by sorting the graph's edges in nondecreasing order of their weights.
▪ Then, starting with the empty subgraph, it scans this sorted list, adding the next edge on the list to the current subgraph if such an inclusion does not create a cycle and simply skipping the edge otherwise.



▪ Note that ET, the set of edges composing a minimum spanning tree of graph G, is actually a tree in Prim's algorithm but is generally just an acyclic subgraph in Kruskal's algorithm.

▪ Kruskal's algorithm is not simpler than Prim's, because it has to check whether the addition of the next edge to the edges already selected would create a cycle.

▪ We can consider the algorithm's operations as a progression through a series of forests containing all the vertices of a given graph and some of its edges.

▪ The initial forest consists of |V| trivial trees, each comprising a single vertex of the
graph. The final forest consists of a single tree, which is a minimum spanning
tree of the graph.

▪ On each iteration, the algorithm takes the next edge (u, v) from the sorted list of
the graph's edges, finds the trees containing the vertices u and v, and, if these trees
are not the same, unites them in a larger tree by adding the edge (u, v).



Analysis of Efficiency

▪ The crucial check of whether two vertices belong to the same tree can be performed efficiently using union-find algorithms.

▪ The efficiency of Kruskal's algorithm is dominated by the time needed to sort the edge weights of a given graph.

▪ Hence, with an efficient sorting algorithm, the time efficiency of Kruskal's algorithm will be in O(|E| log |E|).
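A Python sketch of Kruskal's algorithm with a simple union-find structure; using path compression only and vertex names 0..n−1 are implementation choices of the sketch:

```python
def kruskal(n, edges):
    """edges: list of (weight, u, v) tuples over vertices 0..n-1. Returns MST edges."""
    parent = list(range(n))

    def find(x):                        # root of the tree containing x
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression (halving)
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):       # nondecreasing order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                    # no cycle: u and v are in different trees
            parent[ru] = rv             # unite the two trees
            tree.append((u, v, w))
    return tree
```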



Illustration

An example of Kruskal’s algorithm is shown below.


The selected edges are shown in bold.

Single Source Shortest Paths

▪ Definition: For a given vertex called the source in a weighted connected graph, the
problem is to find shortest paths to all its other vertices.

▪ The single-source shortest-paths problem asks for a family of paths, each leading from
the source to a different vertex in the graph, though some paths may, of course, have
edges in common.



Dijkstra's Algorithm

▪ Dijkstra's algorithm is the best-known algorithm for the single-source shortest-paths problem.
▪ This algorithm is applicable to undirected and directed graphs with nonnegative weights only.

Working :

Dijkstra's algorithm finds the shortest paths to a graph's vertices in order of their distance
from a given source.
▪ First, it finds the shortest path from the source to a
vertex nearest to it, then to a second nearest, and so
on.
▪ In general, before its ith iteration commences, the
algorithm has already identified the shortest paths to
i-1 other vertices nearest to the source.
▪ These vertices, the source, and the edges of the
shortest paths leading to them from the source form
a sub-tree Ti of the given graph shown in the figure.



▪ Since all the edge weights are nonnegative, the next vertex nearest to the source
can be found among the vertices adjacent to the vertices of Ti.

▪ The set of vertices adjacent to the vertices in Ti can be referred to as "fringe vertices"; they are the candidates from which Dijkstra's algorithm selects the next vertex nearest to the source.

▪ To identify the ith nearest vertex, the algorithm computes, for every fringe vertex u, the sum of the distance to the nearest tree vertex v (given by the weight of the edge (v, u)) and the length dv of the shortest path from the source to v (previously determined by the algorithm), and then selects the vertex with the smallest such sum.

▪ The fact that it suffices to compare the lengths of such special paths is the central
insight of Dijkstra's algorithm.



▪ To facilitate the algorithm's operations, we label each vertex with two labels.

✔ The numeric label d indicates the length of the shortest path from the source
to this vertex found by the algorithm so far; when a vertex is added to the
tree, d indicates the length of the shortest path from the source to that vertex.
✔ The other label indicates the name of the next-to-last vertex on such a path, i.e., the parent of the vertex in the tree being constructed. (It can be left unspecified for the source and for vertices that are adjacent to none of the current tree vertices.)
✔ With such labeling, finding the next nearest vertex u* becomes a simple task of
finding a fringe vertex with the smallest d value. Ties can be broken arbitrarily.

▪ After we have identified a vertex u* to be added to the tree, we need to perform two
operations:
✔ Move u* from the fringe to the set of tree vertices.
✔ For each remaining fringe vertex u that is connected to u* by an edge of weight w(u*, u) such that du* + w(u*, u) < du, update the labels of u by u* and du* + w(u*, u), respectively.



Illustration:
An example of Dijkstra's algorithm is shown below. The next closest vertex is shown in
bold.



▪ The shortest paths (identified by following nonnumeric labels backward from a
destination vertex in the left column to the source) and their lengths (given by
numeric labels of the tree vertices) are as follows:



▪ The pseudocode of Dijkstra’s algorithm is given below.
▪ Note that in the following pseudocode, VT contains a given source vertex and the
fringe contains the vertices adjacent to it after iteration 0 is completed.
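A Python sketch consistent with this pseudocode; as with Prim's algorithm above, it uses heapq with duplicate entries instead of a decrease-key operation:

```python
import heapq

def dijkstra(graph, source):
    """graph: {u: [(v, weight), ...]} with nonnegative weights.
    Returns (d, parent): shortest distances from source and tree parents."""
    d, parent = {source: 0}, {source: None}
    heap, tree = [(0, source)], set()
    while heap:
        du, u = heapq.heappop(heap)     # fringe vertex u* with the smallest d value
        if u in tree:
            continue                    # stale entry; u is already a tree vertex
        tree.add(u)                     # move u* from the fringe to the tree
        for v, w in graph[u]:
            if v not in tree and du + w < d.get(v, float('inf')):
                d[v] = du + w           # update the labels of v
                parent[v] = u
                heapq.heappush(heap, (d[v], v))
    return d, parent
```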



Analysis:
▪ The time efficiency of Dijkstra’s algorithm depends on the data structures used
for implementing the priority queue and for representing an input graph itself.
▪ For graphs represented by their adjacency lists and the priority queue implemented
as a min-heap, it is in O(|E| log |V|).

Applications:
▪ Transportation planning and packet routing in communication networks, including
the Internet.
▪ Finding shortest paths in social networks, speech recognition, document formatting,
robotics, compilers, and airline crew scheduling.



Optimal Tree Problem

Background

▪ Suppose we have to encode a text that comprises characters from some n-symbol
alphabet by assigning to each of the text's characters some sequence of bits called
the codeword.
▪ There are two types of encoding: Fixed-length encoding, Variable-length encoding

Fixed-length encoding:

This method assigns to each character a bit string of the same length m (m ≥ log2 n).
This is exactly what the standard ASCII code does.
One way of getting a coding scheme that yields a shorter bit string on the average is based on the old idea of assigning shorter codewords to more frequent characters and longer codewords to less frequent characters.



Variable-length encoding:

This method assigns codewords of different lengths to different characters, which introduces a problem that fixed-length encoding does not have.
Namely, how can we tell how many bits of an encoded text represent the first (or, more generally, the ith) character?
To avoid this complication, we can limit ourselves to prefix-free (or simply prefix) codes.
In a prefix code, no codeword is a prefix of a codeword of another character.
Hence, with such an encoding, we can simply scan a bit string until we get the first group of bits that is a codeword for some character, replace these bits by this character, and repeat this operation until the bit string's end is reached.



▪ If we want to create a binary prefix code for some alphabet, it is natural to associate
the alphabet's characters with leaves of a binary tree in which all the left edges are
labelled by 0 and all the right edges are labelled by 1 (or vice versa).
▪ The codeword of a character can then be obtained by recording the labels on the
simple path from the root to the character's leaf.
▪ Since there is no simple path to a leaf that continues to another leaf, no codeword
can be a prefix of another codeword; hence, any such tree yields a prefix code.

▪ Among the many trees that can be constructed in this manner for a given alphabet with known frequencies of the character occurrences, a tree that assigns shorter bit strings to high-frequency characters and longer ones to low-frequency characters can be constructed by the following greedy algorithm, invented by David Huffman.



Huffman Trees and Codes

Huffman's Algorithm:

Step 1:
Initialize n one-node trees and label them with the characters of the alphabet.
Record the frequency of each character in its tree's root to indicate the tree's
weight. (More generally, the weight of a tree will be equal to the sum of the
frequencies in the tree's leaves.)

Step 2:
Repeat the following operation until a single tree is obtained.
Find two trees with the smallest weights (ties can be broken arbitrarily).
Make them the left and right sub-trees of a new tree and record the sum of their weights in the root of the new tree as its weight.

▪ A tree constructed by the above algorithm is called a Huffman tree.

▪ It defines, in the manner described above, a Huffman code.
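A Python sketch of Huffman's algorithm using heapq as the priority queue of tree weights; a counter breaks ties between equal weights, so the 0/1 labeling may differ from the figure's, but the codeword lengths (and hence the average) are the same:

```python
import heapq
from itertools import count

def huffman(freqs):
    """freqs: {symbol: frequency}. Returns {symbol: codeword}."""
    tiebreak = count()
    # Step 1: n one-node trees, each weighted by its character's frequency.
    # Each tree is represented as a dict: symbol -> code built so far (leaf up).
    heap = [(f, next(tiebreak), {sym: ''}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    # Step 2: repeatedly merge the two trees with the smallest weights.
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in left.items()}          # left edges get 0
        merged.update({s: '1' + c for s, c in right.items()})   # right edges get 1
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

codes = huffman({'A': 0.35, 'B': 0.1, 'C': 0.2, 'D': 0.2, '_': 0.15})
print(codes)   # e.g. {'C': '00', 'D': '01', 'B': '100', '_': '101', 'A': '11'}
```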



Example: Consider the five-symbol alphabet {A, B, C, D, _} with the following occurrence frequencies in a text made up of these symbols:

symbol       A      B      C      D      _
frequency    0.35   0.1    0.2    0.2    0.15

The Huffman tree construction for the above problem is shown in Figure.

The resulting codewords are as follows:

symbol       A      B      C      D      _
codeword     11     100    00     01     101


▪ Hence, DAD is encoded as 011101, and 10011011011101 is decoded as BAD_AD.
▪ With the occurrence frequencies given and the codeword lengths obtained, the
average number of bits per symbol in this code is

2 * 0.35 + 3 * 0.1+ 2 * 0.2 + 2 * 0.2 + 3 * 0.15 = 2.25

▪ Had we used a fixed-length encoding for the same alphabet, we would have to use at least 3 bits per symbol.
▪ Thus, for this example, Huffman’s code achieves the compression ratio (a standard
measure of a compression algorithm’s effectiveness) of (3−2.25)/3*100%= 25%.
▪ In other words, Huffman’s encoding of the above text will use 25% less memory than
its fixed-length encoding


