
Design and Analysis of Algorithms (CS252)

Module 3: Greedy Method, Transform and Conquer, Dynamic Programming
Syllabus
Minimum cost spanning trees: Prim’s Algorithm, Kruskal’s Algorithm, union find method

Single source shortest paths: Dijkstra's Algorithm.

Optimal Tree problem: Huffman Trees and Codes.

Transform and Conquer Approach: Heaps and Heap Sort.

Dynamic Programming: General method with Examples, Transitive Closure: Warshall's Algorithm,

All Pairs Shortest Paths: Floyd's Algorithm

Text book 1: 9.1, 9.2, 9.3, 9.4, 8.1, 8.4

Greedy Approach
As discussed in the previous module, greedy algorithms are generally used for optimization problems –
either to minimize or to maximize the value of some objective function. The greedy approach constructs
a solution through a sequence of steps, each expanding the partially constructed solution obtained so far,
until a complete solution to the problem is reached. Although such an approach may not work for some
computational tasks, there are many for which it is optimal.

In the previous module we have seen two maximization problems – the knapsack problem, where we maximize
the value of items that can be taken in a knapsack of a certain capacity, and the job sequencing problem,
where we try to find the maximum profit obtainable by optimally scheduling a subset of tasks.

In this module we will explore three greedy algorithms (Prim’s, Kruskal’s and Dijkstra’s) that work on
graphs and another one to create an optimal code tree.

3.1 Minimum cost Spanning Trees (MST)

The following problem arises naturally in many practical situations: given n points, connect them in the
cheapest possible way so that there will be a path between every pair of points. It has direct applications
to the design of all kinds of networks— including communication, computer, transportation, and


electrical—by providing the cheapest way to achieve connectivity. It identifies clusters of points in data
sets.

We can represent the points given by vertices of a graph, possible connections by the graph’s edges, and
the connection costs by the edge weights. Then the question can be posed as the minimum spanning tree
problem.

A spanning tree of a connected graph is its connected acyclic subgraph (i.e., a tree) that contains all the
vertices of the graph. If such a graph has weights assigned to its edges, a minimum spanning tree (MST)
is its spanning tree of the smallest weight, where the weight of a tree is defined as the sum of the weights
on all its edges.

The minimum spanning tree problem is the problem of finding a minimum spanning tree for a given
weighted connected graph.

The figure below shows a weighted graph and its three spanning trees. The spanning trees have weights of
6, 9, and 8, and the minimum spanning tree is the one with weight 6.

Figure 1: Graph and its spanning trees

A brute-force approach (using exhaustive search) to finding a minimum spanning tree is to list all possible
spanning trees and pick the one with the minimum cost among them; this approach has exponential
time complexity.

All the well-known efficient algorithms for finding minimum spanning trees are applications of the greedy
method. We apply the greedy method by iteratively choosing objects (edges in our case) to join a growing
collection, by incrementally picking an object that minimizes the value of an objective function. In the case
of the minimal spanning tree problem, the objective function is the sum of edge weights in the spanning
tree.


Two classic algorithms for the minimum spanning tree problem are Prim’s algorithm and Kruskal’s
algorithm. Both these algorithms solve the problem by building the final spanning tree edge by edge.
However, the ways they select edges differ, though both of them always yield an optimal solution.

Prim’s algorithm

Prim’s algorithm constructs a minimum spanning tree through a sequence of expanding subtrees. The
initial subtree in such a sequence consists of a single vertex selected arbitrarily from the set V of the graph’s
vertices. On each iteration, the algorithm expands the current tree in the greedy manner by simply attaching
to it the nearest vertex not in that tree. (By the nearest vertex, we mean a vertex not in the tree connected
to a vertex in the tree by an edge of the smallest weight. Ties can be broken arbitrarily). The algorithm
stops after all the graph’s vertices have been included in the tree being constructed.

Since the algorithm expands a tree by exactly one vertex on each of its iterations, the total number of such
iterations is n − 1, where n is the number of vertices in the graph.

A sketch of Prim's algorithm is given below.
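This is a minimal C sketch rather than the textbook's pseudocode: it assumes the graph is given as an n × n weight matrix with a large value INF marking absent edges, and uses a plain array scan in place of a priority queue (the Θ(|V|²) variant discussed below). All identifiers, and the sample graph in main(), are illustrative.

#include <stdio.h>

#define N 5            /* number of vertices in the sample graph */
#define INF 1000000    /* stands for "no edge" */

/* Prim's algorithm on weight matrix W; prints the MST edges.
   dist[v] is the weight of the cheapest edge linking v to the
   tree built so far; parent[v] records that edge's other end. */
void prim(int W[N][N]) {
    int dist[N], parent[N], inTree[N];
    for (int v = 0; v < N; v++) { dist[v] = INF; parent[v] = -1; inTree[v] = 0; }
    dist[0] = 0;                           /* start from vertex 0 (arbitrary) */
    for (int i = 0; i < N; i++) {
        int u = -1;                        /* nearest vertex not yet in the tree */
        for (int v = 0; v < N; v++)
            if (!inTree[v] && (u == -1 || dist[v] < dist[u])) u = v;
        inTree[u] = 1;
        if (parent[u] != -1)
            printf("edge %d-%d (weight %d)\n", parent[u], u, W[parent[u]][u]);
        for (int v = 0; v < N; v++)        /* re-check distances to the grown tree */
            if (!inTree[v] && W[u][v] < dist[v]) { dist[v] = W[u][v]; parent[v] = u; }
    }
}

int main(void) {
    /* a small illustrative weight matrix (symmetric; INF = no edge) */
    int W[N][N] = { {INF, 2,   INF, 6,   INF},
                    {2,   INF, 3,   8,   5  },
                    {INF, 3,   INF, INF, 7  },
                    {6,   8,   INF, INF, 9  },
                    {INF, 5,   7,   9,   INF} };
    prim(W);
    return 0;
}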

Exercise - Find the MST of the following graph


The table below shows the growth of the tree after every iteration of the loop in the algorithm.


Running time of Prim’s algorithm

The running time of Prim's algorithm depends on the data structures chosen for the graph itself and for the
priority queue of the set V − VT, whose vertex priorities are the distances to the nearest tree vertices. Since
there are |V| − 1 iterations of the loop, the algorithm's running time will be in Θ(|V|²) if the graph is
represented by its weight matrix and the priority queue is implemented as an unordered array. If the graph
is represented by its adjacency lists and the priority queue is implemented as a min-heap, the running time
of the algorithm is in O(|E| log |V|).


3.2 Kruskal’s Algorithm


Kruskal’s algorithm begins by sorting the graph's edges in non-decreasing order of their weights. Then,
starting with the empty subgraph, it scans this sorted list, adding the next edge on the list to the current
subgraph if such an inclusion does not create a cycle and simply skipping the edge otherwise.
In short, Prim’s algorithm works vertex by vertex, while Kruskal’s works edge by edge.

Example usage of Kruskal’s Algorithm

Apply Kruskal’s algorithm to find an MST of the graph shown.


Edge considered    Action
bc:1               Add to MST
ef:2               Add to MST
ab:3               Add to MST
bf:4               Add to MST
cf:4               Discard: cf would form a cycle
af:5               Discard: af would form a cycle
df:5               Add to MST

We can stop here, as 5 edges have already been added to the MST: the MST of
a graph with |V| nodes can have only |V| − 1 edges, and adding any other
edge would create a cycle.


Though both algorithms (Prim’s and Kruskal’s) appear to be simple, Kruskal’s algorithm has to check
whether the addition of the next edge to the edges already selected would create a cycle. A new cycle is
created if and only if the new edge connects two vertices already connected by a path, i.e., if and only if
the two vertices belong to the same connected component. Note also that each connected component of a
subgraph generated by Kruskal’s algorithm is a tree, because it has no cycles. An efficient way to check
whether adding a new edge to this set of disconnected trees creates a cycle is the union-find algorithm.

With an efficient union find algorithm, the running time of Kruskal's algorithm will be dominated by the
time needed for sorting the edge weights of a given graph. Hence, with an efficient sorting algorithm, the
time efficiency of Kruskal's algorithm will be in O(|E| log |E|).

Applications (Prim’s and Kruskal’s):

Designing networking connections and laying telephone cable at minimum cost.

Eliminating loops in route directions, e.g., in Google Maps.


3.3 Disjoint Subsets and Union-Find Algorithms


As noted above, Kruskal's algorithm has to check whether the addition of the next edge to the edges already
selected for the minimum spanning tree would create a cycle, i.e., whether the new edge's two endpoints
already belong to the same connected component.

3.3.1 Union Find Algorithm used to detect cycles

Kruskal's algorithm requires a dynamic partition of some n-element set S into a collection of disjoint subsets
S1, S2, ..., Sk.

1. After being initialized as a collection of n one-element subsets, each containing a different element
of S, the collection is subjected to a sequence of intermixed union and find operations.

2. The number of union operations in any such sequence must be bounded above by n − 1 because
each union increases a subset’s size at least by 1 and there are only n elements in the entire set S.

This is represented by an abstract data type for a collection of disjoint subsets of a finite set. Let us use an
array of integers, called par[] (for parent). If we are dealing with n items, the i-th element of the array holds
the parent of the i-th item. Initially, every entry is set to -1, marking each item as the root of its own tree:

for (int i = 0; i < n; i++)
    par[i] = -1;    /* each item starts as the root of a one-node tree */

Three operations can be defined on this collection of disjoint subsets:

Dr. Radhakrishnan G, Dept. of CSE, CITech 2024-25 Page 12


Design and Analysis of Algorithms (CS252) Module 3

1. makeset(x) creates a one-element set {x}. It is assumed that this operation can be applied to each of
the elements of set S only once.

2. find(x) returns the subset containing x. The task is to find the representative of the set containing a given
element; the representative is always the root of the tree. We implement find() by following the parent
array upward until we hit a root, i.e., a node whose parent entry is -1:

while (par[u] != -1)    /* climb toward the root */
    u = par[u];
/* u is now the representative of the subset containing x */

3. union(x, y) constructs the union of the disjoint subsets Sx and Sy containing x and y, respectively,
and adds it to the collection to replace Sx and Sy, which are deleted from it. The task is to combine
two sets into one. It takes two elements as input, finds the representatives of their sets using the find
operation, and finally puts one of the trees (representing one set) under the root node of the other tree:

/* let u = find(x) and v = find(y) */

/* if u == v, x and y are already in the same subset;
   in Kruskal's algorithm this means the edge would create a cycle */

if (u != v) {
    if (u < v)
        par[v] = u;    /* make u the representative of the merged set */
    else
        par[u] = v;    /* make v the representative of the merged set */
}

For example, let set S be {1, 2, 3, 4, 5, 6}.

Applying makeset(i) six times, once for each element in the set, initializes the structure to the
collection of six singleton sets: {1}, {2}, {3}, {4}, {5}, {6}.

Performing union(1, 4) and union(5, 2) yields {1, 4}, {5, 2}, {3}, {6}.

If union(4, 5) is now called, we get the sets {1, 4, 5, 2}, {3}, {6}.
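Putting the three operations together, the following is a minimal self-contained C sketch that uses the par[] convention above (-1 marks a root) and replays this example's sequence of unions. The function name unite stands in for union, which cannot be used as an identifier in C; all names are illustrative.

#include <stdio.h>

#define N 7                 /* elements 1..6; index 0 unused */
int par[N];

void makeset(int x) { par[x] = -1; }           /* one-element set {x} */

int find(int u) { while (par[u] != -1) u = par[u]; return u; }

void unite(int x, int y) {
    int u = find(x), v = find(y);
    if (u == v) return;                        /* already in the same subset */
    if (u < v) par[v] = u; else par[u] = v;    /* smaller root becomes representative */
}

int main(void) {
    for (int i = 1; i <= 6; i++) makeset(i);
    unite(1, 4); unite(5, 2);     /* {1,4}, {5,2}, {3}, {6} */
    unite(4, 5);                  /* {1,4,5,2}, {3}, {6} */
    for (int i = 1; i <= 6; i++)
        printf("element %d -> representative %d\n", i, find(i));
    return 0;
}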

Most implementations of this abstract data type designate one element of each subset as the subset's
representative; this representative is usually one of the members of the subset. Some implementations
do not impose any specific constraints on which element may serve as the representative; others require
the smallest element of each subset to be used as the subset's representative.

The representative object of any two objects x and y will be different unless both of them are in the same
subset. It is usually assumed that the set elements are (or can be mapped into) integers.

3.3.2 Using Union-Find in Kruskal’s Algorithm to find minimum spanning tree


Find the MST of the following graph using Union-Find Method in Kruskal’s Algorithm:

Edges arranged in ascending order of weights: 1-2, 4-5, 0-1, 1-5, 2-5, 0-5, 3-5, 0-4, 2-3, 3-4.
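Since these edges are already arranged in ascending order of weight, the cycle-detection core of Kruskal's algorithm can be traced with union-find alone. The following minimal C sketch (illustrative, not the textbook's pseudocode) scans the sorted list, accepting an edge only when its endpoints have different representatives, and stops after |V| − 1 = 5 edges:

#include <stdio.h>

#define V 6
int par[V];

int find(int u) { while (par[u] != -1) u = par[u]; return u; }

int main(void) {
    /* the edges above, already sorted by weight */
    int e[][2] = {{1,2},{4,5},{0,1},{1,5},{2,5},{0,5},{3,5},{0,4},{2,3},{3,4}};
    int m = sizeof e / sizeof e[0], taken = 0;
    for (int i = 0; i < V; i++) par[i] = -1;       /* makeset for every vertex */
    for (int i = 0; i < m && taken < V - 1; i++) {
        int u = find(e[i][0]), v = find(e[i][1]);
        if (u == v) {                              /* same component: cycle */
            printf("discard %d-%d (cycle)\n", e[i][0], e[i][1]);
        } else {                                   /* different components: accept */
            if (u < v) par[v] = u; else par[u] = v;
            printf("add %d-%d to MST\n", e[i][0], e[i][1]);
            taken++;
        }
    }
    return 0;
}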


There are two principal alternatives for implementing this data structure.
The first one, called quick find, optimizes the time efficiency of the find operation; the second one,
called quick union, optimizes the union operation.
Under the quick union scheme described above, makeset(x) simply marks x as the root of a one-node tree,
which is obviously in O(1); hence the initialization of n singleton subsets is in O(n). A single find(x)
operation takes O(n) time in the worst case, since the tree may degenerate into a chain, while union(x, y),
once the two roots are known, is a constant-time O(1) operation.


Single-source shortest-paths problem – Dijkstra’s Algorithm


We consider the single-source shortest-paths problem: for a given vertex called the source in a weighted
connected graph, find shortest paths to all its other vertices. Here, we consider the best-known algorithm
for the single-source shortest-paths problem, called Dijkstra's algorithm.

There are many applications where we need the weights of paths between nodes of a graph. For example, in
a road network, the interconnection structure of a set of roads is modelled as a graph whose vertices are
intersections and dead ends, and whose edges are the segments of road that exist between pairs of such
vertices. In such contexts, we often want to find the shortest path between two vertices of the road network.

The application of Dijkstra's algorithm is limited to graphs with nonnegative weights only.

Dijkstra's algorithm finds the shortest paths to a graph's vertices in order of their distance from a given
source. First, it finds the shortest path from the source to the vertex nearest to it, then to a second nearest,
and so on. The vertices adjacent to the vertices of the tree Ti constructed so far, but not yet in it, are called
"fringe vertices"; they are the candidates from which Dijkstra's algorithm selects the next vertex nearest
to the source.

To facilitate the algorithm’s operations, we label each vertex with two labels. The numeric label d indicates
the length of the shortest path from the source to that vertex. The other label indicates the name of the next-
to-last vertex on such a path, i.e., the parent of the vertex in the tree being constructed. With such labelling,
finding the next nearest vertex u∗ becomes a simple task of finding a fringe vertex with the smallest d
value.

The labeling and mechanics of Dijkstra's algorithm are quite similar to those used by Prim's algorithm.
Both of them construct an expanding subtree of vertices by selecting the next vertex from the priority
queue of the remaining vertices. It is important not to mix them up, however. They solve different problems
and therefore operate with priorities computed in a different manner: Dijkstra's algorithm compares path
lengths and therefore must add edge weights, while Prim's algorithm compares the edge weights as given.

Example: Use Dijkstra’s algorithm to find the shortest path from a to all other vertices of the graph given
below.


a-b : 3

a-b-d : 3+2=5

a-b-c : 3+4=7

a-b-d-e : 3+2+4=9

The shortest paths (identified by following nonnumeric labels backward from a destination vertex in the
left column to the source) and their lengths (given by numeric labels of the tree vertices) are as follows:

from a to b : a − b of length 3

from a to d : a − b − d of length 5

from a to c : a − b − c of length 7

from a to e : a − b − d − e of length 9


Initialize(Q): creates an empty priority queue.

Insert(Q, v, dv): inserts vertex v with priority value dv into the queue.

Decrease(Q, v, dv): updates vertex v with the new (smaller) value dv.

DeleteMin(Q): deletes the minimum-priority vertex from the priority queue.
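As a concrete counterpart to these operations, here is a minimal C sketch of Dijkstra's algorithm, assuming a weight-matrix representation and a plain linear scan for the minimum in place of a priority queue (the Θ(|V|²) variant discussed below). The weights in main() are one assignment consistent with the example trace above; all identifiers are illustrative.

#include <stdio.h>

#define N 5
#define INF 1000000     /* stands for "no edge" / "not reached yet" */

/* Dijkstra from source s on weight matrix W: d[v] ends up holding the
   shortest-path length from s to v, prev[v] the next-to-last vertex. */
void dijkstra(int W[N][N], int s, int d[N], int prev[N]) {
    int done[N] = {0};
    for (int v = 0; v < N; v++) { d[v] = INF; prev[v] = -1; }
    d[s] = 0;
    for (int i = 0; i < N; i++) {
        int u = -1;                     /* DeleteMin: nearest unprocessed vertex */
        for (int v = 0; v < N; v++)
            if (!done[v] && (u == -1 || d[v] < d[u])) u = v;
        done[u] = 1;
        for (int v = 0; v < N; v++)     /* Decrease: relax the edges leaving u */
            if (!done[v] && W[u][v] != INF && d[u] + W[u][v] < d[v]) {
                d[v] = d[u] + W[u][v];
                prev[v] = u;
            }
    }
}

int main(void) {
    int W[N][N], d[N], prev[N];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) W[i][j] = (i == j) ? 0 : INF;
    /* vertices a..e = 0..4; weights consistent with the trace above */
    W[0][1] = W[1][0] = 3;  W[0][3] = W[3][0] = 7;
    W[1][2] = W[2][1] = 4;  W[1][3] = W[3][1] = 2;
    W[2][3] = W[3][2] = 5;  W[2][4] = W[4][2] = 6;  W[3][4] = W[4][3] = 4;
    dijkstra(W, 0, d, prev);
    for (int v = 1; v < N; v++)
        printf("a to %c: length %d\n", 'a' + v, d[v]);   /* b:3 c:7 d:5 e:9 */
    return 0;
}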

Time efficiency of Dijkstra's algorithm

Dijkstra's algorithm is very similar to Prim's algorithm, except that the values assigned to the vertices in the
priority queue are path weights rather than edge weights. The time efficiency of Dijkstra's algorithm
depends on the data structures used for implementing the priority queue and for representing the input graph
itself. For graphs represented by their weight matrix and the priority queue implemented as an unordered
array, the time complexity is Θ(|V|²). For graphs represented by their adjacency lists and the priority queue
implemented as a min-heap, it is in O(|E| log |V|).


Exercise: Apply Dijkstra's algorithm to the graph shown, considering the source as a.


Exercise 5: Apply Dijkstra's algorithm to the graph shown, considering the source as a.


Optimal Tree problem: Huffman Trees and Codes.


Suppose we have to encode a text that comprises characters from some n-character alphabet by assigning
to each of the text's characters some sequence of bits called the code word. For example, we can use a
fixed-length encoding that assigns to each character a bit string of the same length m, as the standard
ASCII code does (each ASCII character uses 8 bits). For instance, we can code the characters {a, b, c, d} as
{00, 01, 10, 11} using a fixed-length encoding in which each character uses 2 bits.

Another way is a coding scheme that yields a shorter bit string on average by assigning shorter
code words to more frequent characters and longer code words to less frequent characters. But this
introduces a problem: how can we tell how many bits of an encoded text represent the next symbol?
To avoid ambiguity we use a prefix code, in which no code word is a prefix of another symbol's code word.
To construct such a code we apply a greedy algorithm invented by David Huffman.

Huffman's Algorithm

Step 1 Initialize n one-node trees and label them with the characters of the alphabet. Record the frequency
of each character in its tree's root to indicate the tree's weight. (More generally, the weight of a tree will be
equal to the sum of the frequencies in the tree's leaves.)

Step 2 Repeat the following operation until a single tree is obtained. Find two trees with the smallest weights
(ties can be broken arbitrarily). Make them the left and right subtrees of a new tree and record the sum of
their weights in the root of the new tree as its weight.
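These two steps can be realized compactly in C. The sketch below is illustrative: it assumes a small alphabet, uses a linear scan to find the two lightest trees instead of a priority queue, and takes its symbols and frequencies from the five-symbol example that follows. Because ties and left/right orientation are broken arbitrarily, the exact bit patterns may differ from the figure, but the code-word lengths agree.

#include <stdio.h>

#define MAXSYM 26

/* One record per tree node; the "forest" is the set of live roots. */
struct Node { double w; int left, right; char sym; };
struct Node node[2 * MAXSYM];
int alive[2 * MAXSYM], nnodes = 0;

/* Step 1: a one-node tree per symbol, weighted by its frequency. */
int make_leaf(char sym, double w) {
    node[nnodes].w = w; node[nnodes].left = node[nnodes].right = -1;
    node[nnodes].sym = sym; alive[nnodes] = 1;
    return nnodes++;
}

int lightest(void) {                     /* index of the lightest live root */
    int best = -1;
    for (int i = 0; i < nnodes; i++)
        if (alive[i] && (best == -1 || node[i].w < node[best].w)) best = i;
    return best;
}

/* Step 2: repeatedly merge the two lightest trees; return the root. */
int huffman(int nsym) {
    for (int k = 1; k < nsym; k++) {
        int a = lightest(); alive[a] = 0;
        int b = lightest(); alive[b] = 0;
        node[nnodes].w = node[a].w + node[b].w;
        node[nnodes].left = a; node[nnodes].right = b; node[nnodes].sym = 0;
        alive[nnodes++] = 1;
    }
    return lightest();
}

/* Walk the tree, printing the code word accumulated along each path. */
void print_codes(int t, char *buf, int depth) {
    if (node[t].left == -1) { buf[depth] = '\0'; printf("%c : %s\n", node[t].sym, buf); return; }
    buf[depth] = '0'; print_codes(node[t].left, buf, depth + 1);
    buf[depth] = '1'; print_codes(node[t].right, buf, depth + 1);
}

int main(void) {
    char syms[] = { 'A', 'B', 'C', 'D', '_' };
    double freq[] = { 0.35, 0.1, 0.2, 0.2, 0.15 };
    char buf[MAXSYM];
    for (int i = 0; i < 5; i++) make_leaf(syms[i], freq[i]);
    print_codes(huffman(5), buf, 0);
    return 0;
}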


The tree constructed by the above algorithm is called a Huffman tree, and the code obtained from it is
called a Huffman code.

Huffman Tree (HT) – a binary tree that minimizes the weighted path length from the root to the leaves,
given a set of predefined leaf weights.

Huffman Code (HC) – an optimal variable-length prefix encoding scheme that assigns bit strings
to symbols based on their frequencies in a given text.

Example

Consider the five-symbol alphabet {A, B, C, D, _} with the following occurrence frequencies/probabilities
in a text made up of these symbols:

Symbol       A      B      C      D      _
Frequency    0.35   0.1    0.2    0.2    0.15

Encode the string DAD and decode the code 1001101101110101.

Solution: The Huffman tree construction for this input is shown in Figure below.


Hence, DAD is encoded as 011101, and 1001101101110101 is decoded as BAD_ADD.

With the occurrence frequencies/probabilities given and the codeword lengths obtained, the average
number of bits per symbol in this code (the mean) is:

2 × 0.35 + 3 × 0.1 + 2 × 0.2 + 2 × 0.2 + 3 × 0.15 = 2.25.

Had we used a fixed-length encoding for the same alphabet, we would have to use at least 3 bits per
symbol. Thus, for this example, Huffman's code achieves a compression ratio (a standard measure of a
compression algorithm's effectiveness) of (3 − 2.25)/3 = 0.25 = 25%. In other words, Huffman's encoding of
the text will use 25% less memory than its fixed-length encoding.

Compression ratio (in percent) =
[(bits per symbol in fixed-length encoding − average bits per symbol) / bits per symbol in fixed-length encoding] × 100

Another Example of Huffman code.

We want to store a hypothetical text document in compressed form. The document contains words
consisting of only six characters (a, b, c, d, e, f). We have scanned the document and counted the occurrences
of each character. The resulting frequencies are:

Character    a      b      c      d      e      f
Frequency    0.45   0.13   0.12   0.16   0.09   0.05


The average number of bits per symbol in this code (the mean) is:

1 × 0.45 + 3 × 0.13 + 3 × 0.12 + 3 × 0.16 + 4 × 0.09 + 4 × 0.05 = 0.45 + 0.39 + 0.36 + 0.48 + 0.36 + 0.20 = 2.24

Compression ratio = (3 − 2.24)/3 ≈ 0.253, i.e., about 25.3%.

Example 3: Construct a Huffman code for the following data, encode the string ABACABAD, and decode
100010111001010.

Symbol       A     B     C     D     -
Frequency    0.4   0.1   0.2   0.15  0.15


Transform and Conquer Approach: Heaps and Heap Sort


The transform-and-conquer technique of problem solving typically has two stages. First, in the
transformation stage, the problem's instance is modified to be, for one reason or another, more amenable
to solution. Then, in the second or conquering stage, it is solved.

There are three major variations of this idea that differ in what we transform a given instance into:

1. Transformation to a simpler or more convenient instance of the same problem; we call it instance
simplification.

Some examples of instance simplification are


 Checking element uniqueness in an array: the brute-force algorithm takes O(n²). We can
transform the array by sorting it and then check for uniqueness in the sorted array with a single
scan. This can be done in O(n log n) overall (see the sketch below).
 A simple binary search tree (BST) can be transformed into a balanced BST (an AVL tree) to
improve the performance of searching for a key in the tree.
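A minimal C sketch of the presorting idea from the first example (the identifiers are illustrative): after sorting, any duplicates must sit next to each other.

#include <stdio.h>
#include <stdlib.h>

static int cmp(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Instance simplification: sorting makes duplicates adjacent. */
int all_unique(int a[], int n) {
    qsort(a, n, sizeof a[0], cmp);       /* O(n log n) transform */
    for (int i = 0; i + 1 < n; i++)      /* O(n) scan of neighbours */
        if (a[i] == a[i + 1]) return 0;
    return 1;
}

int main(void) {
    int a[] = { 7, 3, 9, 3, 1 };
    printf("all unique? %s\n", all_unique(a, 5) ? "yes" : "no");  /* no: 3 repeats */
    return 0;
}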

2. Transformation to a different representation of the same instance - we call it representation change.

An example of representation change is using a heap representation to sort an array.

3. Transformation to an instance of a different problem for which an algorithm is already available; we
call it problem reduction.

The Least Common Multiple (LCM) of two numbers, 'a' and 'b', can be calculated using the
formula: lcm(a, b) = (|a * b|) / gcd(a, b), where gcd(a, b) is the Greatest Common Divisor (GCD) of 'a'
and 'b'.
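For instance, this reduction takes only a few lines of C (a minimal sketch; lcm and gcd are illustrative names):

#include <stdio.h>

/* Euclid's algorithm for the greatest common divisor. */
long gcd(long a, long b) { return b == 0 ? a : gcd(b, a % b); }

/* Problem reduction: LCM computed via the GCD. */
long lcm(long a, long b) { return a / gcd(a, b) * b; }

int main(void) {
    printf("lcm(24, 60) = %ld\n", lcm(24, 60));   /* prints 120 */
    return 0;
}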

Heaps and Heap sort

A heap is a data structure that is commonly used to implement a priority queue. It is also employed in the
heapsort algorithm.

A priority queue is a multiset of items with an orderable characteristic called an item's priority, with the
following operations:

1. Finding an item with the highest (i.e., largest) priority; a heap organized for this is called a max-heap
(a min-heap is organized to find the smallest item instead).
2. Deleting an item with the highest priority (the lowest priority in the case of a min-heap).
3. Adding a new item to the multiset.

What is a heap data structure?


A heap can be defined as a binary tree with keys assigned to its nodes, one key per node, provided the
following two conditions are met:

1. The shape property—the binary tree is essentially complete (or simply complete), i.e., all its levels
are full except possibly the last level, where only some rightmost leaves may be missing.
2. The parental dominance or heap property—the key in each node is greater than (less than in the
case of min-heap) or equal to the keys in its children. (This condition is considered automatically
satisfied for all leaves.)

The figure below shows some trees which are heaps and a few others which are not heaps

Note that key values in a heap are ordered top down; that is, a sequence of values on any path from the
root to a leaf is decreasing (non-increasing, if equal keys are allowed). However, there is no left-to-right
order in key values; that is, there is no relationship among key values for nodes either on the same level of
the tree or, more generally, in the left and right subtrees of the same node.

Here is a list of important properties of heaps

1. There exists exactly one essentially complete binary tree with n nodes. Its height is equal to ⌊log₂ n⌋.
2. The root of a heap always contains its largest element.
3. A node of a heap considered with all its descendants is also a heap.
4. A heap can be implemented as an array by recording its elements in the top-down, left-to-right
fashion.

Array implementation of heap

We could also define a heap as an array H[1..n]. In this array, the parental node keys occupy the first
⌊n/2⌋ positions while the leaf keys occupy the last ⌈n/2⌉ positions; the children of a key in position i
(1 ≤ i ≤ ⌊n/2⌋) are in positions 2i and 2i + 1, and the parent of a key in position i (2 ≤ i ≤ n) is in
position ⌊i/2⌋.


Constructing a heap for a given list of keys.

There are two alternative approaches to constructing a heap for a given list of keys: the bottom-up heap
construction algorithm and the top-down heap construction algorithm.

The bottom-up heap construction algorithm

This algorithm initializes the essentially complete binary tree with n nodes by placing keys in the order
given and then "heapifies" the tree as follows. Starting with the last parental node, the algorithm checks
whether the parental dominance holds for the key at this node. If it does not, the algorithm exchanges the
node's key K with the larger key of its children and checks whether the parental dominance holds for K in
its new position. This process continues until the parental dominance requirement for K is satisfied. After
completing the "heapification" of the subtree rooted at the current parental node, the algorithm proceeds
to do the same for the node's immediate predecessor. The algorithm stops after this is done for the tree's
root.


A sketch of the bottom-up construction is given below.
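This is an illustrative C sketch rather than the textbook's pseudocode; it uses a 1-based array with H[0] unused, and the sample keys in main() are illustrative:

#include <stdio.h>

/* Sift the key at position i down until parental dominance holds
   within the heap H[1..n] (1-based indexing). */
void sift_down(int H[], int n, int i) {
    while (2 * i <= n) {
        int j = 2 * i;                          /* left child */
        if (j < n && H[j + 1] > H[j]) j++;      /* pick the larger child */
        if (H[i] >= H[j]) break;                /* dominance holds: done */
        int t = H[i]; H[i] = H[j]; H[j] = t;    /* swap with larger child */
        i = j;
    }
}

/* Bottom-up construction: heapify every parent, last parent first. */
void heap_bottom_up(int H[], int n) {
    for (int i = n / 2; i >= 1; i--)
        sift_down(H, n, i);
}

int main(void) {
    int H[] = { 0, 2, 9, 7, 6, 5, 8 };          /* H[0] unused; sample keys */
    heap_bottom_up(H, 6);
    for (int i = 1; i <= 6; i++) printf("%d ", H[i]);   /* 9 6 8 2 5 7 */
    printf("\n");
    return 0;
}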

The top-down heap construction algorithm

This algorithm constructs a heap by successive insertions of a new key into a previously constructed heap;
the algorithm starts with an empty heap and stops when all elements are inserted into the heap.

Inserting an element to a heap

First, attach a new node with key K in it after the last leaf of the existing heap. Then shift K up to its
appropriate place in the new heap as follows.


Compare K with its parent's key: if the latter is greater than or equal to K, stop (the structure is a
heap); otherwise, swap these two keys and compare K with its new parent. The swapping continues
until K is not greater than its current parent or it reaches the root (illustrated in the figure).

The figure below shows the sequence of key insertions done in the top-down construction of a heap for the
keys 2,8,6,1,10.
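A minimal C sketch of this insertion (sift-up) operation, replayed on those five keys; the identifiers are illustrative and the array is 1-based:

#include <stdio.h>

/* Insert key K into the max-heap H[1..*n], growing it by one node. */
void heap_insert(int H[], int *n, int K) {
    int i = ++(*n);                    /* attach K after the last leaf */
    while (i > 1 && H[i / 2] < K) {    /* sift K up past smaller parents */
        H[i] = H[i / 2];
        i /= 2;
    }
    H[i] = K;
}

int main(void) {
    int H[16], n = 0;
    int keys[] = { 2, 8, 6, 1, 10 };   /* the keys from the example */
    for (int k = 0; k < 5; k++) heap_insert(H, &n, keys[k]);
    for (int i = 1; i <= n; i++) printf("%d ", H[i]);   /* 10 8 6 1 2 */
    printf("\n");
    return 0;
}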

Delete operation in max-heap

The deletion operation is normally performed to delete the root element of the heap; this is the dequeue
operation when the heap is used as a priority queue. The algorithm for deleting an element from a heap is
given below: swap the root with the last leaf, shrink the heap by one, and sift the new root down.


The figure below shows the use of the algorithm to delete from a heap.

Time complexity of heap operations

Inserting an element into a heap of n elements cannot require more key comparisons than the heap's height.
Since the height of a heap with n nodes is about log₂ n, the time efficiency of insertion is in O(log n).

The efficiency of deletion is determined by the number of key comparisons needed to “heapify” the tree
after the swap has been made and the size of the tree is decreased by 1. Since this cannot require more key
comparisons than twice the heap’s height, the time efficiency of deletion is in O(log n) as well.

Heap sort
Heapsort is a sorting algorithm that uses the heap data structure to sort elements efficiently. It has a time
complexity of O(n log n) in the worst, average, and best cases. It is a two-stage algorithm that works
as follows.

Stage 1 (heap construction): Construct a heap for a given array.

Stage 2 (maximum deletions): Apply the root-deletion operation n − 1 times to the remaining heap.


In stage 2, the element deleted is placed immediately after the last element of the heap in the
underlying array structure.
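Combining the sift-down routine from the construction sketch with n − 1 root deletions gives the following minimal C version of heapsort (an illustrative sketch, 1-based array with H[0] unused):

#include <stdio.h>

/* Same sift-down as in bottom-up heap construction. */
void sift_down(int H[], int n, int i) {
    while (2 * i <= n) {
        int j = 2 * i;
        if (j < n && H[j + 1] > H[j]) j++;
        if (H[i] >= H[j]) break;
        int t = H[i]; H[i] = H[j]; H[j] = t;
        i = j;
    }
}

void heap_sort(int H[], int n) {
    for (int i = n / 2; i >= 1; i--)          /* Stage 1: build the heap */
        sift_down(H, n, i);
    for (int k = n; k > 1; k--) {             /* Stage 2: n - 1 root deletions */
        int t = H[1]; H[1] = H[k]; H[k] = t;  /* deleted max goes past the boundary */
        sift_down(H, k - 1, 1);               /* re-heapify the shrunk heap */
    }
}

int main(void) {
    int H[] = { 0, 2, 9, 7, 6, 5, 8 };        /* H[0] unused; 1-based */
    heap_sort(H, 6);
    for (int i = 1; i <= 6; i++) printf("%d ", H[i]);   /* 2 5 6 7 8 9 */
    printf("\n");
    return 0;
}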

For a given array of numbers, the steps in Stage 1 are shown in the figure below.

Stage 2 repeatedly deletes the maximum element from the heap and places it at the array position immediately
after the heap's boundary. Each deletion essentially swaps the root element with the last leaf node of
the heap and then re-heapifies the resulting tree. The heap structure and the array elements after each
deletion and re-heapification are shown in the figures below. The array elements that are no longer part of
the heap sit at the end of the array representation.


Time complexity of heap sort

The time efficiency of heapsort is Θ(n log n) in both the worst and average cases. Each deletion
from the heap takes O(log n) key comparisons, and since there are n − 1 such deletions, Stage 2 of heapsort
has a complexity of O(n log n). Stage 1, which constructs the heap, also has O(n log n) complexity. Hence
the overall complexity of heapsort is Θ(n log n).

Note: heapsort is an in-place algorithm.

As a rule of thumb, quicksort performs better than heapsort in practice, and heapsort, being in place, is
preferable to merge sort in terms of extra memory.

Problem 2


Dynamic Programming - General Method


Dynamic programming is a technique for solving problems with overlapping sub-problems. Typically,
these sub-problems arise from a recurrence relating a solution to a given problem with solutions to its
smaller sub-problems of the same type. Rather than solving overlapping sub-problems again and again,
dynamic programming suggests solving each of the smaller sub-problems only once and recording the
results in a table from which we can then obtain a solution to the original problem. The word
"programming" in the name of this technique stands for "planning" and does not refer to computer
programming.

Dynamic programming is more often used in optimization problems though the technique can be used in
other problems also. We will first look at non-optimization problems where dynamic programming is used.

Example of Dynamic Programming: Computing Fibonacci number

The Fibonacci numbers are the elements of the sequence

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...,

which can be defined by the simple recurrence

F(n) = F(n − 1) + F(n − 2) for n > 1

and two initial conditions

F(0) = 0, F(1) = 1.

If we try to use the recurrence directly to compute the nth Fibonacci number F(n), we would have to recompute
the same values of this function many times. For example, computation of F(10) requires the computation
of F(9) and F(8), and the computation of F(9) also requires F(8). With plain recursive calls, F(8) would be
computed twice. The running time can be reduced by computing F(8) only once: dynamic programming
achieves this by noting down the result of the first computation of F(8) and reusing this value wherever
F(8) is required.

Dynamic programming is an algorithmic paradigm in which a problem is solved by identifying a collection
of sub-problems and tackling them one by one, smallest first, using the answers to small problems to help
figure out larger ones, until all of them are solved. For example, to compute F(n) we can simply
fill the elements of a one-dimensional array with consecutive values of F(i), starting with 0 and 1. This is
an example of the classic bottom-up dynamic programming approach, where we solve all the smaller

sub-problems of a given problem to obtain the final solution. The iterative version of computing a Fibonacci
number follows this bottom-up dynamic programming approach. There also exists a top-down variation
of dynamic programming that uses memory functions.
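A minimal C sketch of this bottom-up table-filling (the names are illustrative):

#include <stdio.h>

/* Bottom-up DP: every F(i) is computed once and stored in a table. */
long fib(int n) {
    if (n < 2) return n;
    long F[n + 1];
    F[0] = 0; F[1] = 1;                  /* the two initial conditions */
    for (int i = 2; i <= n; i++)
        F[i] = F[i - 1] + F[i - 2];      /* reuse stored results, no recomputation */
    return F[n];
}

int main(void) {
    printf("F(10) = %ld\n", fib(10));    /* prints 55 */
    return 0;
}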

Computing a Binomial Coefficient

Computing a binomial coefficient is a standard example of applying dynamic programming to a non-
optimization problem. The binomial coefficient, denoted C(n, k), is the number of combinations
(subsets) of k elements from an n-element set. The name "binomial coefficients" comes from the
participation of these numbers in the binomial formula:

(a + b)^n = C(n, 0) a^n + ... + C(n, k) a^(n−k) b^k + ... + C(n, n) b^n

Two of the properties of binomial coefficients are:

C(n, k) = C(n − 1, k − 1) + C(n − 1, k) for n > k > 0, and
C(n, 0) = C(n, n) = 1.

The following sketch can be used to compute all the binomial coefficients.
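This is a minimal C sketch of the table-filling scheme, assuming 0 ≤ k ≤ n (the identifiers are illustrative):

#include <stdio.h>

/* Fill the table row by row using
   C(i, j) = C(i-1, j-1) + C(i-1, j), with C(i, 0) = C(i, i) = 1. */
long binomial(int n, int k) {
    long C[n + 1][k + 1];
    for (int i = 0; i <= n; i++)
        for (int j = 0; j <= (i < k ? i : k); j++)
            if (j == 0 || j == i)
                C[i][j] = 1;                              /* base cases */
            else
                C[i][j] = C[i - 1][j - 1] + C[i - 1][j];  /* one addition */
    return C[n][k];
}

int main(void) {
    printf("C(10, 5) = %ld\n", binomial(10, 5));          /* prints 252 */
    return 0;
}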


What is the time efficiency of this algorithm? The algorithm's basic operation is addition. While i < k, the
inner loop runs about i times; once i reaches k, it runs about k times. Hence the first k − 1 rows contribute
O(k²) additions in total, each of the remaining n − k + 1 rows contributes about k additions, and the total
number of additions is in Θ(nk).

Warshall's and Floyd's Algorithms

Warshall's algorithm is for computing the transitive closure of a directed graph and Floyd's algorithm is
for the all-pairs shortest-paths problem. These algorithms are based on essentially the same idea, which we
can interpret as an application of the dynamic programming technique.

Warshall's algorithm for computing the transitive closure

The adjacency matrix A of an unweighted directed graph is the Boolean matrix that has 1 in its ith row and
jth column if there is a directed edge from the ith vertex to the jth vertex and 0 otherwise. A transitive closure
of the digraph is a Boolean matrix that has 1 in its ith row and jth column if there is a path from ith vertex to
the jth vertex and 0 otherwise.

An example of a digraph, its adjacency matrix, and its transitive closure are given below.


One way to find out whether there is a path from i to j is to run either DFS or BFS from node i. This can
be repeated for all nodes to construct the transitive closure matrix.

Warshall's algorithm, named after Stephen Warshall who discovered it, provides a more efficient method
to find the transitive closure. It is convenient to assume that the digraph's vertices, and hence the rows and
columns of the adjacency matrix, are numbered from 1 to n. Warshall's algorithm constructs the transitive
closure through a series of n × n Boolean matrices R(0), ..., R(k), ..., R(n):

R(0) does not allow any intermediate vertices in its paths (direct edges only); it is the adjacency matrix.

R(1) does not allow any intermediate vertices in its paths except vertex 1.

R(2) does not allow any intermediate vertices in its paths except vertices 1 and 2.

...

R(n) reflects paths that can use all n vertices of the digraph as intermediate vertices and hence is nothing
other than the digraph's transitive closure.

The entry in the ith row and jth column of R(k) tells us whether there is a path from vertex i to vertex j
that uses only vertices numbered 1 to k as intermediate vertices. For example, R(0) is the original adjacency
matrix, which shows direct edges (i.e., 0 intermediate vertices), and R(1) has a 1 at [i, j] iff there is a path


from i to j involving vertex 1 as the only intermediate vertex in the path (i.e., there should be an edge from
i to 1 and from 1 to j).

The elements of matrix R(k) are computed from R(k−1) using the formula

rij(k) = rij(k−1) or ( rik(k−1) and rkj(k−1) )

Equivalently, the following rules can be used for generating the elements of matrix R(k):

1. If an element rij is 1 in R(k−1), it remains 1 in R(k).


2. If an element rij is 0 in R(k−1), it has to be changed to 1 in R(k) if and only if the element in its row i
and column k and the element in its column j and row k are both 1’s in R(k−1).

Warshall’s algorithm is sketched below.
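This is a minimal C sketch of Warshall's algorithm, updating R in place (0-based indices; the digraph in main() is illustrative, not the figure from the example):

#include <stdio.h>

#define N 4

/* R starts as the adjacency matrix and ends as the transitive closure.
   Pass k allows vertex k as one more possible intermediate vertex. */
void warshall(int R[N][N]) {
    for (int k = 0; k < N; k++)
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                R[i][j] = R[i][j] || (R[i][k] && R[k][j]);
}

int main(void) {
    /* adjacency matrix of a small illustrative digraph */
    int R[N][N] = { {0, 1, 0, 0},
                    {0, 0, 0, 1},
                    {0, 0, 0, 0},
                    {1, 0, 1, 0} };
    warshall(R);
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) printf("%d ", R[i][j]);
        printf("\n");
    }
    return 0;
}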

Time complexity

The time complexity of Warshall’s algorithm is Θ(n³).

Example: Apply Warshall’s algorithm on the graph shown below.

The following figure shows the application of Warshall’s algorithm.


Floyd’s Algorithm for the All-Pairs Shortest-Paths Problem

The all-pairs shortest-paths problem asks for the shortest distance from each vertex to every other vertex in a
weighted graph (undirected or directed). Floyd's algorithm is an all-pairs shortest-path algorithm that
uses concepts similar to the ones used in Warshall's algorithm. It is applicable to both undirected
and directed weighted graphs, provided that they do not contain a cycle of negative length.

Floyd’s algorithm records the lengths of shortest paths in an n × n matrix D called the distance matrix: the
element dij in the ith row and the jth column of this matrix indicates the length of the shortest path from the
ith vertex to the jth vertex.

Like Warshall’s algorithm, Floyd’s algorithm computes the shortest paths through a series of n × n distance
matrices D(0), ..., D(k), ..., D(n).


The entry in the ith row and jth column of D(k) gives the weight (length) of the shortest path from vertex i to
vertex j that uses only vertices numbered 1 to k as intermediate vertices.

D(0) is the original adjacency matrix, which shows the weight of the direct edges (i.e., no intermediate
vertices in the path). The entry in the ith row, jth column of this matrix is initialized with the weight of the
edges in the graph. If there is no edge between i and j in the graph, dij(0) has value infinity. The figure below
shows the adjacency matrix and the final distance matrix of a directed graph.

The elements of matrix D(k) are computed from D(k−1) using the formula

dij(k) = min( dij(k−1), dik(k−1) + dkj(k−1) )

This formula tells us that the shortest path from vertex i to vertex j through vertices numbered 1 to k is the
minimum of the following:

1. the length of the path from i to j that has only intermediate nodes numbered 1 to k − 1 (i.e., dij(k−1)),
and
2. the sum of the lengths of two paths: the path from i to k and the path from k to j, each passing solely
through nodes numbered 1 to k − 1 (i.e., dik(k−1) + dkj(k−1)).
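A minimal C sketch of Floyd's algorithm, updating D in place. The weight matrix in main() is one assignment consistent with the worked entries shown below (0-based indices with a, b, c, d = 0..3 and a large INF standing in for infinity); all identifiers are illustrative.

#include <stdio.h>

#define N 4
#define INF 1000000     /* stands in for "no edge" (infinity) */

/* D starts as the weight matrix and ends as the distance matrix.
   Pass k applies d_ij = min(d_ij, d_ik + d_kj) for every pair (i, j). */
void floyd(int D[N][N]) {
    for (int k = 0; k < N; k++)
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                if (D[i][k] + D[k][j] < D[i][j])
                    D[i][j] = D[i][k] + D[k][j];
}

int main(void) {
    /* vertices a, b, c, d = 0..3; weights consistent with the example below */
    int D[N][N] = { {0,   INF, 3,   INF},
                    {2,   0,   INF, INF},
                    {INF, 7,   0,   1  },
                    {6,   INF, INF, 0  } };
    floyd(D);
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++)
            printf("%6d", D[i][j]);
        printf("\n");
    }
    return 0;
}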

Time complexity

The time complexity of Floyd’s algorithm is Θ(n³).


Application of Floyd’s algorithm on a graph is given below.

For illustration, here is how two entries in D(3) are calculated; note that the vertex numbered 3 is c.

dba(3) = min( dba(2), dbc(2) + dca(2) ) = min(2, 5 + 9) = 2

dab(3) = min( dab(2), dac(2) + dcb(2) ) = min(∞, 3 + 7) = 10
