Module 5 Graphs
Hashing – Introduction, Static Hashing, Dynamic Hashing Text Book 3 -8.1 – 8.3 Graphs - Graph representation, Elementary graph operations,
Minimum cost spanning Trees – Kruskal’s Algorithm, Prim’s algorithm Text Book 3 - 6.1,6.2,6.3.1,6.3.2
Hashing – Introduction,
Hashing is the process of transforming any given key or a string of characters into another value. This is usually a shorter, fixed-length value or key that represents the original string and makes it easier to find or employ.
What is Hashing?
Hashing in Data Structures refers to the process of transforming a given key to another value. It involves mapping data to a specific index in a hash table using a hash function that enables
fast retrieval of information based on its key. The transformation of a key to the corresponding value is done using a Hash Function, and the value obtained from the hash function is known as the hash value.
Components of Hashing
There are majorly three components of hashing:
1. Key: A key can be anything, a string or an integer, which is fed as input to the hash function, the technique that determines an index or location for storage of
an item in a data structure.
2. Hash Function: The hash function receives the input key and returns the index of an element in an array called a hash table. The index
is known as the hash index .
3. Hash Table: A hash table is a data structure that maps keys to values using a special function called a hash function. It stores the data in an associative manner in an array where each data value has its own unique index.
Suppose we have a set of strings {“ab”, “cd”, “efg”} and we would like to store it in a table.
Our main objective here is to search or update the values stored in the table quickly in O(1) time, and we are not concerned about the ordering of strings in the table. So the given set of strings can act as keys and the string itself will act as the value. But how do we store the value corresponding to the key?
Step 1: We know that hash functions (which are mathematical formulas) are used to calculate the hash value, which acts as the index of the data structure where the value will
be stored.
Step 2: Assign a numerical value to each character:
o “a” = 1, “b” = 2, “c” = 3, and so on.
Step 3: Therefore, the numerical value obtained by summation of all characters of the string:
o “ab” = 1 + 2 = 3,
o “cd” = 3 + 4 = 7,
o “efg” = 5 + 6 + 7 = 18
Step 4: Now, assume that we have a table of size 7 to store these strings. The hash function used here is the sum of the characters in the key mod table size. We can compute the location of the string in the array by taking sum(string) mod 7.
o “ab” in 3 mod 7 = 3,
o “cd” in 7 mod 7 = 0,
o “efg” in 18 mod 7 = 4.
The above technique enables us to calculate the location of a given string by using a simple hash function and rapidly find the value that is stored in that location. Therefore the idea of hashing seems like a great way to store (key, value) pairs of the data in a table.
The hash function creates a mapping between key and value, this is done through the use of mathematical formulas known as hash functions. The result of the
hash function is referred to as a hash value or hash. The hash value is a representation of the original string of characters but usually smaller than the original.
For example: Consider an array as a Map where the key is the index and the value is the value at that index. So for an array A if we have index i which will be
treated as the key then we can find the value by simply looking at the value at A[i].
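The character-sum hash walked through above can be written as a short sketch. The function name here is illustrative, not part of any library:

```python
def char_sum_hash(key, table_size=7):
    """Sum the positional values of the characters (a = 1, b = 2, ...)
    and reduce modulo the table size to get a slot index.
    Assumes lowercase letters only, as in the worked example."""
    total = sum(ord(c) - ord('a') + 1 for c in key)
    return total % table_size

print(char_sum_hash("ab"))    # → 3  (1 + 2 = 3, 3 mod 7 = 3)
print(char_sum_hash("cd"))    # → 0  (3 + 4 = 7, 7 mod 7 = 0)
print(char_sum_hash("efg"))   # → 4  (5 + 6 + 7 = 18, 18 mod 7 = 4)
```

The results match the hand computation in Step 4 above.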
Static Hashing
In static hashing, the number of buckets in the hash table is fixed, so the hash function always maps a given key to the same bucket address.
Dynamic Hashing
In the dynamic hashing mechanism, collisions are resolved in two different ways, namely open addressing and closed addressing.
Open addressing
o Linear probing
o Quadratic probing
o Double hashing
Closed addressing
o Separate chaining
Open addressing and separate chaining are the two main strategies for collision handling in hashing operations.
Separate Chaining:
The idea behind separate chaining is to implement each slot of the array as a linked list called a chain.
The linked list data structure is used to implement this technique. So what happens is, when multiple elements are hashed into the same slot index, these elements are inserted into a linked list at that index.
Here, all those elements that hash into the same slot index are inserted into a linked list. Now, we can use a key K to search in the linked list by just linearly traversing it. If the intrinsic key for any entry is equal to K, then it means that we have found our entry. If we have reached the end of the linked list and yet we haven’t found our entry, then it means that the entry does not exist.
Hence, the conclusion is that in separate chaining, if two different elements have the same hash value then we store both elements in the same linked list one after the other.
Example: Let us consider a simple hash function as “key mod 5” and a sequence of keys as 12, 22, 15, 25
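This example can be sketched as follows. The class name and methods are illustrative (Python lists stand in for the linked-list chains):

```python
class ChainedHashTable:
    """Separate chaining: each slot of the array holds a chain of keys."""
    def __init__(self, size=5):
        self.size = size
        self.table = [[] for _ in range(size)]

    def insert(self, key):
        index = key % self.size           # hash function: key mod 5
        self.table[index].append(key)     # collisions extend the chain

    def search(self, key):
        # Linearly traverse the chain at the hashed slot.
        return key in self.table[key % self.size]

ht = ChainedHashTable()
for k in (12, 22, 15, 25):
    ht.insert(k)
# 12 and 22 both hash to slot 2; 15 and 25 both hash to slot 0.
print(ht.table)   # → [[15, 25], [], [12, 22], [], []]
```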
Open Addressing is a method for handling collisions. In Open Addressing, all elements are stored in the hash table itself. So at any point, the size of the table must
be greater than or equal to the total number of keys (Note that we can increase table size by copying old data if needed). This approach is also known as closed
hashing. This entire procedure is based upon probing.
Insert(k): Keep probing until an empty slot is found. Once an empty slot is found, insert k.
Search(k): Keep probing until the slot’s key becomes equal to k or an empty slot is reached.
Delete(k): Delete operation is interesting. If we simply delete a key, then the search may fail. So slots of deleted keys are marked specially as “deleted”.
The insert can insert an item in a deleted slot, but the search doesn’t stop at a deleted slot.
1. Linear Probing:
In linear probing, the hash table is searched sequentially, starting from the original hash location. If the location that we get is already occupied, then
we check the next location.
For example, the typical gap between two probes is 1, as seen in the example below:
Let hash(x) be the slot index computed using a hash function and S be the table size
Example: Let us consider a simple hash function as “key mod 5” and a sequence of keys that are to be inserted are 50, 70, 76, 85, 93.
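A minimal sketch of these linear-probing insertions (the function name is illustrative; it assumes the table has enough room for all keys):

```python
def linear_probe_insert(keys, size=5):
    """Insert keys using open addressing with linear probing:
    on a collision, step to the next slot (wrapping around)."""
    table = [None] * size
    for key in keys:
        index = key % size                # initial hash: key mod 5
        while table[index] is not None:   # probe until an empty slot
            index = (index + 1) % size
        table[index] = key
    return table

print(linear_probe_insert([50, 70, 76, 85, 93]))
# → [50, 70, 76, 85, 93]
```

Trace: 50 lands at slot 0; 70 collides at 0 and moves to 1; 76 collides at 1 and moves to 2; 85 probes 0, 1, 2 and lands at 3; 93 collides at 3 and lands at 4.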
2. Quadratic Probing
If you observe carefully, then you will understand that the interval between probes increases with each attempt. Quadratic probing is a method
with the help of which we can solve the problem of clustering that was discussed above. In this method, we
look for the i²-th slot in the i-th iteration. We always start from the original hash location. If the location is occupied, then we check the other slots.
Example: Let us consider table size = 7, the hash function as Hash(x) = x % 7, and the collision resolution strategy to be f(i) = i². Insert = 22, 30, and 50.
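A sketch of this quadratic-probing example (illustrative function name; note quadratic probing can fail to find a free slot even in a non-full table, which this sketch does not handle):

```python
def quadratic_probe_insert(keys, size=7):
    """Open addressing with quadratic probing: on the i-th attempt,
    probe slot (hash + i*i) mod size."""
    table = [None] * size
    for key in keys:
        for i in range(size):
            index = (key % size + i * i) % size
            if table[index] is None:
                table[index] = key
                break
    return table

print(quadratic_probe_insert([22, 30, 50]))
# → [None, 22, 30, None, None, 50, None]
```

Trace: 22 lands at slot 1 and 30 at slot 2; 50 hashes to 1 (occupied), probes (1 + 1) mod 7 = 2 (occupied), then (1 + 4) mod 7 = 5, which is free.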
A graph can be represented in two ways:
1. Adjacency Matrix
2. Adjacency List
Adjacency Matrix
An adjacency matrix is a way of representing a graph as a matrix of booleans (0s and 1s).
Let’s assume there are n vertices in the graph. So, create a 2D matrix adjMat[n][n] having dimension n x n.
If there is an edge from vertex i to j, mark adjMat[i][j] as 1.
If there is no edge from vertex i to j, mark adjMat[i][j] as 0.
The below figure shows an undirected graph. Initially, the entire matrix is initialized to 0. If there is an edge from source to destination, we insert 1 in both cases
(adjMat[source][destination] and adjMat[destination][source]) because we can go either way.
The below figure shows a directed graph. Initially, the entire matrix is initialized to 0. If there is an edge from source to destination, we insert 1 only for that
particular adjMat[source][destination].
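The construction above can be sketched as a small helper (the function name is illustrative):

```python
def build_adj_matrix(n, edges, directed=False):
    """Build an n x n adjacency matrix, marking 1 for each edge."""
    adjMat = [[0] * n for _ in range(n)]
    for src, dst in edges:
        adjMat[src][dst] = 1
        if not directed:
            adjMat[dst][src] = 1   # undirected: mark both directions
    return adjMat

# Undirected graph on 3 vertices with edges 0-1, 1-2, 2-0
print(build_adj_matrix(3, [(0, 1), (1, 2), (2, 0)]))
# → [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
```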
Adjacency List
An array of Lists is used to store edges between two vertices. The size of array is equal to the number of vertices (i.e, n). Each index in this array represents a
specific vertex in the graph. The entry at the index i of the array contains a linked list containing the vertices that are adjacent to vertex i.
Let’s assume there are n vertices in the graph. So, create an array of lists of size n as adjList[n].
adjList[0] will have all the nodes which are connected (neighbour) to vertex 0.
adjList[1] will have all the nodes which are connected (neighbour) to vertex 1 and so on.
The below undirected graph has 3 vertices. So, an array of lists of size 3 will be created, where each index represents a vertex. Now, vertex 0 has two neighbours
(i.e., 1 and 2). So, insert vertices 1 and 2 at index 0 of the array. Similarly, vertex 1 has two neighbours (i.e., 2 and 0), so insert vertices 2 and 0 at index 1 of the
array. Similarly, for vertex 2, insert its neighbours into the array of lists.
The below directed graph has 3 vertices. So, an array of lists of size 3 will be created, where each index represents a vertex. Now, vertex 0 has no neighbours.
Vertex 1 has two neighbours (i.e., 0 and 2), so insert vertices 0 and 2 at index 1 of the array. Similarly, for vertex 2, insert its neighbours into the array of lists.
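The same construction for an adjacency list can be sketched as follows (illustrative helper; the order of neighbours within each list depends on the order edges are inserted):

```python
def build_adj_list(n, edges, directed=False):
    """Build an array of lists: adjList[i] holds the neighbours of vertex i."""
    adjList = [[] for _ in range(n)]
    for src, dst in edges:
        adjList[src].append(dst)
        if not directed:
            adjList[dst].append(src)   # undirected: record both directions
    return adjList

# Undirected 3-vertex graph from the text: edges 0-1, 0-2, 1-2
print(build_adj_list(3, [(0, 1), (0, 2), (1, 2)]))
# → [[1, 2], [0, 2], [0, 1]]
```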
BFS of Graphs
Breadth-First Search (BFS) is an algorithm for traversing or searching tree or graph data structures. It starts at a given node (often called the "source" or "root"
node) and explores all of the neighbor nodes at the present depth level before moving on to nodes at the next depth level. Here's a step-by-step explanation of how
BFS works:
1. Initialization:
   o Start with a queue and enqueue the starting node.
   o Mark the starting node as visited.
2. Process:
   o While the queue is not empty:
      Dequeue a node from the queue.
      Process the dequeued node (e.g., print it, check if it is the target node, etc.).
      Enqueue all adjacent (neighboring) nodes of the dequeued node that have not been visited yet and mark them as visited.
3. Termination:
   o The algorithm continues until the queue is empty, meaning all reachable nodes have been visited.
Example
Graph = {
'A': ['B', 'C'],
'B': ['A', 'D', 'E'],
'C': ['A', 'F'],
'D': ['B'],
'E': ['B', 'F'],
'F': ['C', 'E']
}
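The BFS steps above can be run on this example graph as a sketch, with collections.deque serving as the queue:

```python
from collections import deque

def bfs(graph, start):
    """Breadth-first traversal: visit nodes level by level using a queue."""
    visited = {start}            # mark the starting node as visited
    queue = deque([start])       # enqueue the starting node
    order = []
    while queue:
        node = queue.popleft()   # dequeue and process
        order.append(node)
        for neighbour in graph[node]:
            if neighbour not in visited:   # enqueue unvisited neighbours
                visited.add(neighbour)
                queue.append(neighbour)
    return order

graph = {
    'A': ['B', 'C'], 'B': ['A', 'D', 'E'], 'C': ['A', 'F'],
    'D': ['B'], 'E': ['B', 'F'], 'F': ['C', 'E'],
}
print(bfs(graph, 'A'))   # → ['A', 'B', 'C', 'D', 'E', 'F']
```

Starting from A, the traversal visits A first, then its neighbours B and C, and then the next level D, E, F.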
1. Initialization: Enqueue the starting node into a queue and mark it as visited.
2. Process: Dequeue a node from the queue and visit it (e.g., print its value), then enqueue all of its unvisited neighbours and mark them as visited.
3. Termination: Repeat until the queue is empty.
This algorithm ensures that all nodes in the graph are visited in a breadth-first manner, starting from the starting node.
Starting from the root, all the nodes at a particular level are visited first and then the nodes of the next level are traversed till all the nodes are visited.
To do this a queue is used. All the adjacent unvisited nodes of the current level are pushed into the queue, the nodes of the current level are marked visited, and then they are popped from the queue.
Illustration:
Let us understand the working of the algorithm with the help of the following example.
Step 1: Initially, the queue and the visited array are empty.
Step 2: Push node 0 into the queue and mark it visited.
Step 3: Remove node 0 from the front of the queue, visit its unvisited neighbours, and push them into the queue.
Step 4: Remove node 1 from the front of the queue, visit its unvisited neighbours, and push them into the queue.
Step 5: Remove node 2 from the front of the queue, visit its unvisited neighbours, and push them into the queue.
Step 6: Remove node 3 from the front of the queue, visit its unvisited neighbours, and push them into the queue. As we can see, every neighbour of node 3 is visited, so move to the next node at the front of the queue.
Step 7: Remove node 4 from the front of the queue, visit its unvisited neighbours, and push them into the queue. As we can see, every neighbour of node 4 is visited, so move to the next node at the front of the queue.
A Minimum Cost Spanning Tree (MCST) is a subset of the edges in a connected, weighted graph that connects all the vertices together without any cycles and with the minimum possible
total edge weight. The problem of finding an MCST is fundamental in graph theory and has practical applications in network design, such as designing least-cost communication networks.
There are several algorithms to find an MCST, with the two most popular being:
1. Kruskal's Algorithm:
o This algorithm sorts all the edges in the graph by their weights.
o It adds edges one by one to the spanning tree, starting with the smallest weight, ensuring that no cycles are formed.
o It uses a union-find data structure to efficiently check for cycles.
2. Prim's Algorithm:
o This algorithm starts with a single vertex and grows the spanning tree one edge at a time.
o It always adds the smallest edge that connects a vertex in the tree to a vertex outside the tree.
o It can be efficiently implemented using a priority queue (or min-heap).
Kruskal’s Algorithm,
A minimum spanning tree (MST) or minimum weight spanning tree for a weighted, connected, undirected graph is a spanning tree with a weight less than or equal to the weight of every other spanning tree.
Kruskal’s algorithm sorts all edges of the given graph in increasing order. Then it keeps on adding new edges and nodes to the MST if the newly added edge does not form a cycle. It picks
the minimum weighted edge first and the maximum weighted edge last. Thus we can say that it makes a locally optimal choice in each step in order to find the optimal solution. Hence
this is a Greedy Algorithm.
Below are the steps for finding MST using Kruskal’s algorithm:
1. Sort all the edges in non-decreasing order of their weight.
2. Pick the smallest edge. Check if it forms a cycle with the spanning tree formed so far. If a cycle is not formed, include this edge. Else, discard it.
3. Repeat step 2 until there are (V-1) edges in the spanning tree.
Illustration:
Input Graph:
The graph contains 9 vertices and 14 edges. So, the minimum spanning tree formed will have (9 – 1) = 8 edges.
After sorting:
Weight   Source   Destination
1        7        6
2        8        2
2        6        5
4        0        1
4        2        5
6        8        6
7        2        3
7        7        8
8        0        7
8        1        2
9        3        4
10       5        4
11       1        7
14       3        5
Now pick all edges one by one from the sorted list of edges
Step 6: Pick edge 8-6. Since including this edge results in a cycle, discard it. Pick edge 2-3: no cycle is formed, so include it.
Step 7: Pick edge 7-8. Since including this edge results in a cycle, discard it. Pick edge 0-7: no cycle is formed, so include it.
Step 8: Pick edge 1-2. Since including this edge results in a cycle, discard it. Pick edge 3-4: no cycle is formed, so include it.
Note: Since the number of edges included in the MST equals (V – 1), the algorithm stops here.
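The whole procedure can be sketched with a simple union-find structure, using the sorted edge list from the illustration above (a minimal sketch; production code would also use union by rank):

```python
def kruskal(n, edges):
    """Kruskal's MST: sort edges by weight and add an edge only if its
    endpoints lie in different components (checked via union-find)."""
    parent = list(range(n))

    def find(x):                      # find root, with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):     # edges as (weight, src, dest)
        ru, rv = find(u), find(v)
        if ru != rv:                  # no cycle formed: include the edge
            parent[ru] = rv
            mst.append((u, v, w))
            total += w
        if len(mst) == n - 1:         # stop at (V - 1) edges
            break
    return mst, total

# Sorted edge list from the illustration: (weight, source, destination)
edges = [(1, 7, 6), (2, 8, 2), (2, 6, 5), (4, 0, 1), (4, 2, 5),
         (6, 8, 6), (7, 2, 3), (7, 7, 8), (8, 0, 7), (8, 1, 2),
         (9, 3, 4), (10, 5, 4), (11, 1, 7), (14, 3, 5)]
mst, total = kruskal(9, edges)
print(total)   # → 37 (8 edges in the MST)
```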
Prim’s algorithm
The algorithm starts with an empty spanning tree. The idea is to maintain two sets of vertices. The first set contains the vertices already included in the MST, and the other set contains the
vertices not yet included. At every step, it considers all the edges that connect the two sets and picks the minimum weight edge from these edges. After picking the edge, it moves the other
endpoint of the edge into the set containing the MST.
A group of edges that connects two sets of vertices in a graph is called a cut in graph theory. So, at every step of Prim’s algorithm, find a cut, pick the
minimum weight edge from the cut, and include this vertex in the MST set (the set that contains the already included vertices).
The working of Prim’s algorithm can be described by using the following steps:
Note: For determining a cycle, we can divide the vertices into two sets [one set contains the vertices included in MST and the other contains the fringe vertices.]
Consider the following graph as an example for which we need to find the Minimum Spanning Tree (MST).
Example of a graph
Step 1: Firstly, we select an arbitrary vertex that acts as the starting vertex of the Minimum Spanning Tree. Here we have selected vertex 0 as the starting vertex.
Step 2: All the edges connecting the incomplete MST and other vertices are the edges {0, 1} and {0, 7}. Between these two the edge with minimum weight is {0, 1}.
So include the edge and vertex 1 in the MST.
Step 3: The edges connecting the incomplete MST to other vertices are {0, 7}, {1, 7} and {1, 2}. Among these edges the minimum weight is 8 which is of the edges
{0, 7} and {1, 2}. Let us here include the edge {0, 7} and the vertex 7 in the MST. [We could have also included edge {1, 2} and vertex 2 in the MST].
Step 4: The edges that connect the incomplete MST with the fringe vertices are {1, 2}, {7, 6} and {7, 8}. Add the edge {7, 6} and the vertex 6 in the MST as it has
the least weight (i.e., 1).
Step 5: The connecting edges now are {7, 8}, {1, 2}, {6, 8} and {6, 5}. Include edge {6, 5} and vertex 5 in the MST as the edge has the minimum weight (i.e., 2)
among them.
Step 6: Among the current connecting edges, the edge {5, 2} has the minimum weight. So include that edge and the vertex 2 in the MST.
Step 7: The connecting edges between the incomplete MST and the other edges are {2, 8}, {2, 3}, {5, 3} and {5, 4}. The edge with minimum weight is edge {2, 8}
which has weight 2. So include this edge and the vertex 8 in the MST.
Step 8: See here that the edges {7, 8} and {2, 3} both have the same minimum weight. But both endpoints of edge {7, 8} are already part of the MST, so including it would form a cycle. So we will consider the edge {2, 3} and
include that edge and vertex 3 in the MST.
Step 9: Only the vertex 4 remains to be included. The minimum weighted edge from the incomplete MST to 4 is {3, 4}.
The final structure of the MST is as follows and the weight of the edges of the MST is (4 + 8 + 1 + 2 + 4 + 2 + 7 + 9) = 37.
Note: If we had selected the edge {1, 2} in the third step then the MST would look like the following.
Structure of the alternate MST if we had selected edge {1, 2} in the MST
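As a sketch, Prim's algorithm can be implemented with a min-heap (Python's heapq). The edge list below is read off the sorted edge table in the Kruskal illustration, since both algorithms use the same 9-vertex example graph:

```python
import heapq

def prim(adj, start=0):
    """Prim's MST: grow the tree from `start`, always taking the
    minimum-weight edge from the tree to a vertex outside it."""
    visited = set()
    heap = [(0, start)]            # (edge weight, vertex)
    total = 0
    while heap:
        w, u = heapq.heappop(heap)
        if u in visited:           # stale heap entry: skip
            continue
        visited.add(u)             # move u into the MST set
        total += w
        for weight, v in adj[u]:
            if v not in visited:
                heapq.heappush(heap, (weight, v))
    return total

# Edges of the example graph as (u, v, weight), from the sorted edge table
edges = [(0, 1, 4), (0, 7, 8), (1, 2, 8), (1, 7, 11), (2, 3, 7),
         (2, 8, 2), (2, 5, 4), (3, 4, 9), (3, 5, 14), (4, 5, 10),
         (5, 6, 2), (6, 7, 1), (6, 8, 6), (7, 8, 7)]
adj = {v: [] for v in range(9)}
for u, v, w in edges:
    adj[u].append((w, v))          # undirected: add both directions
    adj[v].append((w, u))

print(prim(adj))   # → 37, the MST weight found in the illustration
```

Note that the total matches the weight (4 + 8 + 1 + 2 + 4 + 2 + 7 + 9) = 37 computed in the step-by-step illustration.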