Data Structures and Files
MODULE 08
GRAPHS
Motivation: This chapter develops the fundamentals of graph theory, which is used in
analyzing the logical complexity of computer programs. It also discusses the types of
graphs and the operations that can be performed on a graph.
Objective: To know the basic operations that can be performed on the graph, to understand the
different traversing methods of graph, to know about the graph ADT and to know about shortest
path algorithm.
Syllabus:
Prerequisite: Knowledge of relationship between objects, definition of traversal scheme, data
structure presentation.
Content                          Duration   Self Study Time
The graph Abstract Data Type     1 Hr       1 Hr
Data Structures for Graphs       1 Hr       1 Hr
Graph Traversals                 1 Hr       1 Hr
Directed Graphs                  1 Hr       1 Hr
Abbreviations:
ADT: Abstract Data Type
Key Definitions:
1. A graph is a way of representing relationships that exist between pairs of objects.
2. A graph G is simply a set V of vertices and a collection E of pairs of vertices from V, called edges.
3. A subgraph of a graph G is a graph H whose vertices and edges are subsets of the vertices and
edges of G.
4. A traversal is a systematic procedure for exploring a graph by examining all of its vertices and
edges.
5. A directed graph (digraph) is a graph whose edges are all directed.
6. A weighted graph is a graph that has a numeric label w(e) associated with each edge e, called
the weight of edge e.
Theory:
The Graph Abstract Data Type
A graph is a way of representing relationships that exist between pairs of objects. That is, a
graph is a set of objects, called vertices, together with a collection of pairwise connections
between them.
Graphs have applications in a host of different domains, including mapping, transportation,
electrical engineering, and computer networks.
A graph G is simply a set V of vertices and a collection E of pairs of vertices from V, called
edges. Thus, a graph is a way of representing connections or relationships between pairs of
objects from some set V.
Edges in a graph are either directed or undirected.
An edge (u, v) is said to be directed from u to v if the pair (u, v) is ordered, with u preceding
v.
An edge (u, v) is said to be undirected if the pair (u, v) is not ordered.
Fig 1: Directed Graph
Fig 2: Undirected Graph
A mixed graph G is a graph in which some edges may be directed and some may be undirected.
The two vertices joined by an edge are called the end vertices (or endpoints) of the edge. If an
edge is directed, its first endpoint is its origin and the other is the destination of the edge.
Two vertices u and v are said to be adjacent if there is an edge whose end vertices are u and v.
An edge is said to be incident on a vertex if the vertex is one of the edge's endpoints.
The degree of a vertex v, denoted deg(v), is the number of incident edges of v. The in-degree
and out-degree of a vertex v are the number of the incoming and outgoing edges of v, and are
denoted indeg(v) and outdeg(v), respectively.
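The degree definitions above can be illustrated with a small sketch. It assumes a digraph given as a list of directed edges, each stored as a two-element array {origin, destination}, with no self-loops; the class name DegreeSketch and its method names are hypothetical and chosen only for this example.

```java
import java.util.List;

// Hypothetical helper: computes degrees from a list of directed edges.
// Each edge is an int[] of the form {origin, destination}; self-loops are
// assumed not to occur.
class DegreeSketch {
    // outdeg(v): number of edges whose origin is v.
    static int outdeg(List<int[]> edges, int v) {
        int count = 0;
        for (int[] e : edges) {
            if (e[0] == v) count++;
        }
        return count;
    }

    // indeg(v): number of edges whose destination is v.
    static int indeg(List<int[]> edges, int v) {
        int count = 0;
        for (int[] e : edges) {
            if (e[1] == v) count++;
        }
        return count;
    }

    // deg(v): number of edges incident on v, incoming or outgoing.
    static int deg(List<int[]> edges, int v) {
        return indeg(edges, v) + outdeg(edges, v);
    }
}
```

For the edges 0→1, 0→2 and 2→1, vertex 0 has out-degree 2 and in-degree 0, while vertex 1 has in-degree 2 and out-degree 0.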
A path is a sequence of alternating vertices and edges that starts at a vertex and ends at a vertex
such that each edge is incident to its predecessor and successor vertex.
A cycle is a path with at least one edge that has its start and end vertices the same.
A subgraph of a graph G is a graph H whose vertices and edges are subsets of the vertices and
edges of G, respectively. For example, in the flight network of Figure 11.3, vertices BOS, JFK, and
MIA, and edges AA 903 and DL 247 form a subgraph.
A spanning subgraph of G is a subgraph of G that contains all the vertices of the graph G. A
graph is connected if, for any two vertices, there is a path between them.
A forest is a graph without cycles. A tree is a connected forest, that is, a connected graph
without cycles.
Fig 3: Graph
The Graph ADT
As an abstract data type, a graph is a collection of elements that are stored at the graph's
positions—its vertices and edges. Hence, we can store elements in a graph at either its edges or its
vertices (or both). In Java, this means we can define Vertex and Edge interfaces that each extend
the Position interface.
Methods used in a graph
vertices()
Return an iterable collection of all the vertices of the graph.
edges()
Return an iterable collection of all the edges of the graph.
incidentEdges(v)
Return an iterable collection of the edges incident upon vertex v.
opposite(v,e)
Return the end vertex of edge e distinct from vertex v; an error occurs if e is not incident
on v.
endVertices(e)
Return an array storing the end vertices of edge e.
areAdjacent(v,w)
Test whether vertices v and w are adjacent.
replace(v,x)
Replace the element stored at vertex v with x.
replace(e,x)
Replace the element stored at edge e with x.
insertVertex(x)
Insert and return a new vertex storing element x.
insertEdge(v,w,x)
Insert and return a new undirected edge with end vertices v and w and storing element
x.
removeVertex(v)
Remove vertex v and all its incident edges and return the element stored at v.
removeEdge(e)
Remove edge e and return the element stored at e.
Data Structures for Graphs
There are three popular ways of representing graphs, which are usually referred to as the
edge list structure, the adjacency list structure, and the adjacency matrix. In all three
representations, we use a collection to store the vertices of the graph.
The edge list structure and the adjacency list structure only store the edges actually present
in the graph, while the adjacency matrix stores a placeholder for every pair of vertices
(whether there is an edge between them or not).
For a graph G with n vertices and m edges, an edge list or adjacency list representation uses
O(n + m) space, whereas an adjacency matrix representation uses O(n^2) space.
The Edge List Structure
The edge list structure is possibly the simplest, though not the most efficient, representation of a
graph G. In this representation, a vertex v of G storing an element o is explicitly represented by a
vertex object. All such vertex objects are stored in a collection V, such as an array list or node list.
If V is an array list, for example, then we naturally think of the vertices as being numbered.
Vertex Objects
The vertex object for a vertex v storing element o has instance variables for:
• A reference to o.
• A reference to the position (or entry) of the vertex-object in collection V.
The distinguishing feature of the edge list structure is not how it represents vertices, however,
but the way in which it represents edges. In this structure, an edge e of G storing an element o is
explicitly represented by an edge object. The edge objects are stored in a collection E, which
would typically be an array list or node list.
Edge Objects
The edge object for an edge e storing element o has instance variables for:
• A reference to o.
• References to the vertex objects associated with the endpoint vertices of e.
• A reference to the position (or entry) of the edge-object in collection E.
Visualizing the Edge List Structure
Fig 4: schematic representation of the edge list structure for G
The reason this structure is called the edge list structure is that the simplest and most common
implementation of the edge collection E is with a list.
The main feature of the edge list structure is that it provides direct access from edges to the
vertices they are incident upon. This allows us to define simple algorithms for methods
endVertices(e) and opposite(v, e).
Details for selected methods of edge list graph ADT are as follows:
• Methods vertices() and edges() are implemented by calling V.iterator() and E.iterator(),
respectively.
• Methods incidentEdges and areAdjacent both take O(m) time, since to determine which edges
are incident upon a vertex v we must inspect all edges.
• Since the collections V and E are lists implemented with a doubly linked list, we can insert
vertices, and insert and remove edges, in O(1) time.
• The update method removeVertex(v) takes O(m) time, since it requires that we inspect all
the edges to find and remove those incident upon v.
Thus, the edge list representation is simple but has significant limitations.
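A minimal Java sketch of the edge list structure is given below. The method names follow the graph ADT listed earlier, but the class name EdgeListGraph, its fields, and the helper methods numVertices() and numEdges() are assumptions made for illustration. As a simplification, the sketch uses array lists and does not store each object's position in its collection, so removals here scan the list rather than running in O(1) time.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the edge list structure; names and bodies are illustrative.
class EdgeListGraph<V, E> {
    class Vertex {
        V element;                      // reference to the stored element
        Vertex(V element) { this.element = element; }
    }

    class Edge {
        E element;                      // reference to the stored element
        final Vertex v, w;              // references to the endpoint vertex objects
        Edge(Vertex v, Vertex w, E element) {
            this.v = v;
            this.w = w;
            this.element = element;
        }
    }

    private final List<Vertex> vertexList = new ArrayList<>(); // collection V
    private final List<Edge> edgeList = new ArrayList<>();     // collection E

    Vertex insertVertex(V x) {
        Vertex v = new Vertex(x);
        vertexList.add(v);
        return v;
    }

    Edge insertEdge(Vertex v, Vertex w, E x) {
        Edge e = new Edge(v, w, x);
        edgeList.add(e);
        return e;
    }

    // O(1): an edge object refers directly to its endpoints.
    Vertex opposite(Vertex v, Edge e) {
        if (e.v == v) return e.w;
        if (e.w == v) return e.v;
        throw new IllegalArgumentException("edge is not incident on vertex");
    }

    // O(m): every edge must be inspected.
    List<Edge> incidentEdges(Vertex v) {
        List<Edge> result = new ArrayList<>();
        for (Edge e : edgeList) {
            if (e.v == v || e.w == v) result.add(e);
        }
        return result;
    }

    // O(m): scans the edge collection for an edge joining v and w.
    boolean areAdjacent(Vertex v, Vertex w) {
        for (Edge e : edgeList) {
            if ((e.v == v && e.w == w) || (e.v == w && e.w == v)) return true;
        }
        return false;
    }

    // Removes v and all its incident edges; returns the element stored at v.
    V removeVertex(Vertex v) {
        edgeList.removeIf(e -> e.v == v || e.w == v);
        vertexList.remove(v);
        return v.element;
    }

    int numVertices() { return vertexList.size(); }
    int numEdges() { return edgeList.size(); }
}
```

Note how incidentEdges and areAdjacent have no choice but to loop over the whole edge collection, which is the limitation the adjacency list structure addresses.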
The Adjacency List Structure
The adjacency list structure for a graph G adds extra information to the edge list structure that
supports direct access to the incident edges (and thus to the adjacent vertices) of each vertex. The
adjacency list structure includes all the structural components of the edge list structure plus the
following:
• A vertex object v holds a reference to a collection I(v), called the incidence collection of v,
whose elements store references to the edges incident on v.
• The edge object for an edge e with end vertices v and w holds references to the positions (or
entries) associated with edge e in the incidence collections I(v) and I(w).
Traditionally, the incidence collection I(v) for a vertex v is a list, which is why we call this way of
representing a graph the adjacency list structure. The adjacency list structure provides direct
access both from the edges to the vertices and from the vertices to their incident edges.
Fig 5: schematic representation of the adjacency list structure of G
Details for selected methods of adjacency list graph ADT are as follows:
• Method incidentEdges(v) takes time proportional to the number of edges incident on v,
that is, O(deg(v)) time.
• Method areAdjacent(u,v) can be performed by inspecting either the incidence collection of u
or that of v. By choosing the smaller of the two, we get O(min(deg(u),deg(v))) running time.
• Method removeVertex(v) takes O(deg(v)) time.
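The following Java sketch shows how an incidence collection I(v) per vertex supports these running times. The class name AdjacencyListGraph and its nested classes are hypothetical, vertex elements are plain strings, and edges carry no element; the sketch also omits the stored positions that a full implementation would keep for O(1) removal.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the adjacency list structure; names are illustrative assumptions.
class AdjacencyListGraph {
    static class Vertex {
        final String element;
        final List<Edge> incidence = new ArrayList<>(); // incidence collection I(v)
        Vertex(String element) { this.element = element; }
    }

    static class Edge {
        final Vertex u, w;
        Edge(Vertex u, Vertex w) { this.u = u; this.w = w; }
        // End vertex of this edge distinct from v.
        Vertex opposite(Vertex v) { return v == u ? w : u; }
    }

    final List<Vertex> vertices = new ArrayList<>();
    final List<Edge> edges = new ArrayList<>();

    Vertex insertVertex(String x) {
        Vertex v = new Vertex(x);
        vertices.add(v);
        return v;
    }

    Edge insertEdge(Vertex v, Vertex w) {
        Edge e = new Edge(v, w);
        edges.add(e);
        v.incidence.add(e);  // register e in I(v)
        w.incidence.add(e);  // register e in I(w)
        return e;
    }

    // O(deg(v)) to iterate: the collection is already stored at v.
    List<Edge> incidentEdges(Vertex v) {
        return v.incidence;
    }

    // O(min(deg(v), deg(w))): scan the smaller incidence collection.
    boolean areAdjacent(Vertex v, Vertex w) {
        Vertex smaller = v.incidence.size() <= w.incidence.size() ? v : w;
        Vertex other = (smaller == v) ? w : v;
        for (Edge e : smaller.incidence) {
            if (e.opposite(smaller) == other) return true;
        }
        return false;
    }
}
```

Compared with the edge list structure, the only extra state is the incidence list held by each vertex, which is what removes the need to scan every edge.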
The Adjacency Matrix Structure
Like the adjacency list structure, the adjacency matrix structure of a graph also extends the edge
list structure with an additional component. In this case, we augment the edge list with a matrix (a
two-dimensional array) A that allows us to determine adjacencies between pairs of vertices in
constant time. In the adjacency matrix representation, we think of the vertices as being the
integers in the set {0,1,..., n − 1} and the edges as being pairs of such integers. This allows us to
store references to edges in the cells of a two-dimensional n × n array A. Specifically, the
adjacency matrix representation extends the edge list structure as follows:
• A vertex object v stores a distinct integer i in the range 0,1,..., n − 1, called the index of v.
• We keep a two-dimensional n × n array A such that the cell A[i,j] holds a reference to the edge
(v, w), if it exists, where v is the vertex with index i and w is the vertex with index j. If there is no
such edge, then A[i,j] = null.
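A simplified Java sketch of the matrix component is shown below. It assumes vertices are identified directly by their indices 0..n-1 and, for brevity, stores each edge's element (a string) in the matrix cell instead of a reference to a separate edge object; the class name AdjacencyMatrixGraph and the method adjacentVertices are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the adjacency matrix for an undirected graph on vertices 0..n-1.
class AdjacencyMatrixGraph {
    // A[i][j] holds the element of edge (i, j), or null if there is no such edge.
    private final String[][] A;

    AdjacencyMatrixGraph(int n) {
        A = new String[n][n];   // O(n^2) space, all cells initially null
    }

    // For an undirected edge, both A[i][j] and A[j][i] refer to the same element.
    void insertEdge(int i, int j, String element) {
        A[i][j] = element;
        A[j][i] = element;
    }

    // O(1): a single cell lookup.
    boolean areAdjacent(int i, int j) {
        return A[i][j] != null;
    }

    // O(n): the whole row of vertex i must be examined.
    List<Integer> adjacentVertices(int i) {
        List<Integer> result = new ArrayList<>();
        for (int j = 0; j < A.length; j++) {
            if (A[i][j] != null) result.add(j);
        }
        return result;
    }
}
```

The trade-off is visible in the two query methods: adjacency tests take constant time, but listing the neighbors of a vertex costs O(n) regardless of its degree.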
Fig 6: schematic representation of the simplified adjacency matrix
Graph Traversals
A traversal is a systematic procedure for exploring a graph by examining all of its vertices and edges.
Depth-First Search
Depth-first search is useful for testing a number of properties of graphs, including whether there is
a path from one vertex to another and whether or not a graph is connected.
Traversing mechanism
Depth-first search in an undirected graph G is analogous to wandering in a labyrinth with a string
and a can of paint without getting lost. We begin at a specific starting vertex s in G, which we
initialize by fixing one end of our string to s and painting s as "visited." The vertex s is now our
"current" vertex—call our current vertex u. We then traverse G by considering an (arbitrary) edge
(u,v) incident to the current vertex u. If the edge (u,v) leads us to an already visited (that is,
painted) vertex v, we immediately return to vertex u. If, on the other hand, (u, v) leads to an
unvisited vertex v, then we unroll our string, and go to v. We then paint v as "visited," and make it
the current vertex, repeating the computation above. Eventually, we will get to a "dead-end," that
is, a current vertex u such that all the edges incident on u lead to vertices already visited. Thus,
taking any edge incident on u will cause us to return to u. To get out of this impasse, we roll our
string back up, backtracking along the edge that brought us to u, going back to a previously visited
vertex v. We then make v our current vertex and repeat the computation above for any edges
incident upon v that we have not looked at before. If all of v's incident edges lead to visited
vertices, then we again roll up our string and backtrack to the vertex we came from to get to v,
and repeat the procedure at that vertex. Thus, we continue to backtrack along the path that we
have traced so far until we find a vertex that has yet unexplored edges, take one such edge, and
continue the traversal. The process terminates when our backtracking leads us back to the start
vertex s, and there are no more unexplored edges incident on s.
Fig 7: depth-first search traversal on a graph starting at vertex A
Discovery Edges and Back Edges
We can visualize a DFS traversal by orienting the edges along the direction in which they are
explored during the traversal, distinguishing the edges used to discover new vertices, called
discovery edges, or tree edges, from those that lead to already visited vertices, called back
edges. In the analogy above, discovery edges are the edges where we unroll our string when we
traverse them, and back edges are the edges where we immediately return without unrolling
any string. As we will see, the discovery edges form a spanning tree of the connected
component of the starting vertex s. We call the edges not in this tree "back edges" because,
assuming that the tree is rooted at the start vertex, each such edge leads back from a vertex in
this tree to one of its ancestors in the tree.
The pseudo-code for a DFS traversal starting at a vertex v follows our analogy with string and
paint. We use recursion to implement the string analogy, and we assume that we have a
mechanism (the paint analogy) to determine if a vertex or edge has been explored or not, and to
label the edges as discovery edges or back edges. This mechanism will require additional space
and may affect the running time of the algorithm.
Code for DFS algorithm
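The recursive traversal described above can be sketched in Java as follows. This is an illustrative version, not necessarily the listing the notes originally showed: it assumes an undirected graph whose vertices are numbered 0..n-1 and whose edges are given as pairs, and the names DfsSketch, Label, incident, and order are hypothetical. The visited array plays the role of the paint, and the recursion plays the role of the string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of recursive DFS that labels edges as discovery or back edges.
class DfsSketch {
    enum Label { UNEXPLORED, DISCOVERY, BACK }

    final int[][] edges;                 // edges[k] = {u, w}, an undirected edge
    final List<List<Integer>> incident;  // incident.get(v) = indices of edges incident on v
    final boolean[] visited;             // the "paint" on each vertex
    final Label[] edgeLabel;             // label of each edge
    final List<Integer> order = new ArrayList<>(); // vertices in the order visited

    DfsSketch(int n, int[][] edges) {
        this.edges = edges;
        incident = new ArrayList<>();
        for (int v = 0; v < n; v++) incident.add(new ArrayList<>());
        for (int k = 0; k < edges.length; k++) {
            incident.get(edges[k][0]).add(k);
            incident.get(edges[k][1]).add(k);
        }
        visited = new boolean[n];
        edgeLabel = new Label[edges.length];
        Arrays.fill(edgeLabel, Label.UNEXPLORED);
    }

    // End vertex of edge k distinct from v.
    int opposite(int v, int k) {
        return edges[k][0] == v ? edges[k][1] : edges[k][0];
    }

    void dfs(int v) {
        visited[v] = true;
        order.add(v);
        for (int k : incident.get(v)) {
            if (edgeLabel[k] != Label.UNEXPLORED) continue; // already examined
            int w = opposite(v, k);
            if (!visited[w]) {
                edgeLabel[k] = Label.DISCOVERY; // unroll the string and move to w
                dfs(w);                         // returning here is the backtracking step
            } else {
                edgeLabel[k] = Label.BACK;      // w is already painted
            }
        }
    }
}
```

For example, on the graph with edges {0,1}, {1,2}, {2,0}, {2,3}, calling dfs(0) visits the vertices in the order 0, 1, 2, 3 and labels {2,0} as the only back edge; the three discovery edges form a spanning tree.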
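In terms of its running time, depth-first search is an efficient method for traversing a graph.
Note that DFS is called exactly once on each vertex, and that every edge is examined exactly
twice, once from each of its end vertices. Thus, if n_s vertices and m_s edges are in the connected
component of vertex s, a DFS starting at s runs in O(n_s + m_s) time, provided the following
conditions are satisfied:
• The graph is represented by a data structure in which iterating through incidentEdges(v) takes
O(deg(v)) time and opposite(v, e) takes O(1) time. The adjacency list structure is one such
structure; the adjacency matrix structure is not.
• We can mark a vertex or edge as explored, and test whether it has been explored, in O(1) time.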
Breadth-First Search
Like DFS, BFS traverses a connected component of a graph, and in so doing defines a useful
spanning tree. BFS is less "adventurous" than DFS, however. Instead of wandering the graph, BFS
proceeds in rounds and subdivides the vertices into levels. BFS can also be thought of as a
traversal using a string and paint, with BFS unrolling the string in a more conservative manner.
Traversing Mechanism
BFS starts at vertex s, which is at level 0 and defines the "anchor" for our string. In the first round,
we let out the string the length of one edge and we visit all the vertices we can reach without
unrolling the string any farther. In this case, we visit, and paint as "visited," the vertices adjacent
to the start vertex s—these vertices are placed into level 1. In the second round, we unroll the
string the length of two edges and we visit all the new vertices we can reach without unrolling our
string any farther. These new vertices, which are adjacent to level 1 vertices and not previously
assigned to a level, are placed into level 2, and so on. The BFS traversal terminates when every
vertex has been visited.
Algorithm for a BFS starting at a vertex s
We use auxiliary space to label edges, mark visited vertices, and store collections associated with
levels. That is, the collections L_0, L_1, L_2, and so on, store the vertices that are in level 0, level 1, level
2, and so on. These collections could, for example, be implemented as queues. They also allow BFS
to be nonrecursive.
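A level-by-level BFS along these lines can be sketched in Java as follows. This is an illustrative version under stated assumptions: the graph is given as adjacency lists over vertices 0..n-1, edge labeling is omitted, and the names BfsSketch and bfsLevels are hypothetical. Each list in the returned result corresponds to one level collection L_i.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of nonrecursive BFS that returns the level collections L_0, L_1, ...
class BfsSketch {
    // adj.get(v) lists the vertices adjacent to v; s is the start vertex.
    static List<List<Integer>> bfsLevels(List<List<Integer>> adj, int s) {
        boolean[] visited = new boolean[adj.size()];
        List<List<Integer>> levels = new ArrayList<>();

        List<Integer> current = new ArrayList<>(); // level L_0 holds only s
        current.add(s);
        visited[s] = true;

        while (!current.isEmpty()) {
            levels.add(current);
            List<Integer> next = new ArrayList<>(); // level L_{i+1}
            for (int v : current) {
                for (int w : adj.get(v)) {
                    if (!visited[w]) {   // w has not been assigned to a level yet
                        visited[w] = true;
                        next.add(w);
                    }
                }
            }
            current = next;
        }
        return levels;
    }
}
```

On the graph with edges 0-1, 0-2, 1-3, 2-3 and 3-4, a BFS from vertex 0 produces the levels [0], [1, 2], [3], [4].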
Fig 8: breadth-first search traversal
One of the nice properties of the BFS approach is that, in performing the BFS traversal, we can
label each vertex by the length of a shortest path (in terms of the number of edges) from the
start vertex s. In particular, if vertex v is placed into level i by a BFS starting at vertex s, then the
length of a shortest path from s to v is i.
Directed Graphs
A directed graph (digraph) is a graph whose edges are all directed.
Methods Dealing with Directed Edges
When we allow for some or all the edges in a graph to be directed, we should add the following
two methods to the graph ADT in order to deal with edge directions.
isDirected(e): Test whether edge e is directed.
insertDirectedEdge(v, w, o): Insert and return a new directed edge with origin v and destination w
and storing element o.
Also, if an edge e is directed, the method endVertices(e) should return an array A such that A[0] is
the origin of e and A[1] is the destination of e. The running time for the method isDirected(e)
should be O(1), and the running time of the method insertDirectedEdge(v, w, o) should match that
of undirected edge insertion.
Reachability
Reachability deals with determining where we can get to in a directed graph. A traversal in a
directed graph always goes along directed paths, that is, paths where all the edges are traversed
according to their respective directions. Given vertices u and v of a digraph G, we say that u reaches
v (and v is reachable from u) if G has a directed path from u to v. We also say that a vertex v reaches
an edge (w,z) if v reaches the origin vertex w of the edge.
A digraph G is strongly connected if for any two vertices u and v of G, u reaches v and v reaches u. A
directed cycle of G is a cycle where all the edges are traversed according to their respective
directions. (Note that G may have a cycle consisting of two edges with opposite direction between
the same pair of vertices.)
A digraph is acyclic if it has no directed cycles.
The transitive closure of a digraph G is the digraph G* such that the vertices of G* are the same as the
vertices of G, and G* has an edge (u, v) whenever G has a directed path from u to v. That is, we define G* by
starting with the digraph G and adding in an extra edge (u, v) for each u and v such that v is
reachable from u.
Figure 9: Examples of reachability in a digraph: (a) a directed path from BOS to LAX; (b) a directed
cycle (ORD, MIA, DFW, LAX, ORD), whose vertices induce a strongly connected subgraph; (c) the
subgraph of the vertices and edges reachable from ORD is shown in blue; (d) removing the
dashed blue edges gives an acyclic digraph.
Interesting problems that deal with reachability in a digraph include the following:
• Given vertices u and v, determine whether u reaches v.
• Find all the vertices of G that are reachable from a given vertex s.
• Determine whether G is strongly connected.
• Determine whether G is acyclic.
• Compute the transitive closure of G.
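Two of these problems can be sketched together in Java. The notes do not name a specific method here, so the sketch below uses a Warshall-style triple loop over a boolean adjacency matrix as one possible approach; the class name TransitiveClosureSketch and its method names are hypothetical. It runs in O(n^3) time for a digraph with n vertices.

```java
// Sketch: transitive closure of a digraph given as a boolean adjacency matrix,
// where adj[i][j] is true when there is a directed edge from i to j.
class TransitiveClosureSketch {
    // Returns reach, where reach[i][j] is true when G has a directed path
    // with at least one edge from vertex i to vertex j.
    static boolean[][] closure(boolean[][] adj) {
        int n = adj.length;
        boolean[][] reach = new boolean[n][];
        for (int i = 0; i < n; i++) {
            reach[i] = adj[i].clone();   // start with the edges of G itself
        }
        // Allow vertex k as an intermediate stop, for k = 0, 1, ..., n-1.
        for (int k = 0; k < n; k++) {
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    if (reach[i][k] && reach[k][j]) {
                        reach[i][j] = true;
                    }
                }
            }
        }
        return reach;
    }

    // G is strongly connected if every vertex reaches every other vertex.
    static boolean isStronglyConnected(boolean[][] adj) {
        boolean[][] reach = closure(adj);
        for (int i = 0; i < reach.length; i++) {
            for (int j = 0; j < reach.length; j++) {
                if (i != j && !reach[i][j]) return false;
            }
        }
        return true;
    }
}
```

For the digraph with edges 0→1 and 1→2, the closure adds the edge 0→2, and the digraph is not strongly connected; adding the edge 2→0 makes it strongly connected.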
Traversing a Digraph
As with undirected graphs, we can explore a digraph in a systematic way with methods akin to the
depth-first search (DFS) and breadth-first search (BFS) algorithms defined previously for
undirected graphs.
The Directed DFS algorithm.
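A directed DFS can be sketched in Java as follows. This is an illustrative version, not necessarily the listing the notes originally showed: it assumes the digraph is given as lists of outgoing neighbors over vertices 0..n-1, and the names DirectedDfsSketch, reachableFrom, and visit are hypothetical. The only difference from the undirected case is that each edge is followed solely from its origin to its destination.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch of DFS on a digraph: collects every vertex reachable from s.
class DirectedDfsSketch {
    // out.get(v) lists the destinations of the edges leaving v.
    static Set<Integer> reachableFrom(List<List<Integer>> out, int s) {
        Set<Integer> visited = new LinkedHashSet<>(); // keeps visit order
        visit(out, s, visited);
        return visited;
    }

    private static void visit(List<List<Integer>> out, int v, Set<Integer> visited) {
        visited.add(v);
        for (int w : out.get(v)) {       // follow edges only along their direction
            if (!visited.contains(w)) {
                visit(out, w, visited);
            }
        }
    }
}
```

With edges 0→1, 1→2, 2→0 and 3→0, the vertices reachable from 0 are {0, 1, 2}, while vertex 3 reaches all four vertices. Testing whether u reaches v then amounts to checking whether v is in reachableFrom(out, u).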
Directed Acyclic Graphs
Directed graphs without directed cycles are encountered in many applications. Such a digraph is
often referred to as a directed acyclic graph, or DAG, for short. Applications of such graphs
include the following:
• Inheritance between classes of a Java program.
• Prerequisites between courses of a degree program.
• Scheduling constraints between the tasks of a project.
Example Directed Acyclic Graph:
In order to manage a large project, it is convenient to break it up into a collection of smaller tasks.
The tasks, however, are rarely independent, because scheduling constraints exist between them.
(For example, in a house building project, the task of ordering nails obviously precedes the task of
nailing shingles to the roof deck.) Clearly, scheduling constraints cannot have circularities, because
they would make the project impossible. (For example, in order to get a job you need to have
work experience, but in order to get work experience you need to have a job.) The scheduling
constraints impose restrictions on the order in which the tasks can be executed. Namely, if a
constraint says that task a must be completed before task b is started, then a must precede b in
the order of execution of the tasks. Thus, if we model a feasible set of tasks as vertices of a
directed graph, and we place a directed edge from v to w whenever the task for v must be
executed before the task for w, then we define a directed acyclic graph.
The example above motivates the following definition. Let G be a digraph with n vertices. A
topological ordering of G is an ordering v_1,...,v_n of the vertices of G such that for every edge (v_i, v_j) of G, i
< j. That is, a topological ordering is an ordering such that any directed path in G traverses vertices
in increasing order.
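One common way to compute a topological ordering is to repeatedly output a vertex with no remaining incoming edges (often called Kahn's algorithm). The notes do not specify an algorithm at this point, so the Java sketch below is offered only as an illustration; the class name TopologicalSortSketch and the method topologicalOrder are hypothetical, and the digraph is assumed to be given as lists of outgoing neighbors over vertices 0..n-1.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch: topological ordering by repeatedly removing vertices of in-degree 0.
class TopologicalSortSketch {
    // Returns a topological ordering of the digraph, or null if the digraph
    // contains a directed cycle (in which case no such ordering exists).
    static List<Integer> topologicalOrder(List<List<Integer>> out) {
        int n = out.size();
        int[] indeg = new int[n];
        for (List<Integer> targets : out) {
            for (int w : targets) indeg[w]++;
        }

        Deque<Integer> ready = new ArrayDeque<>(); // vertices with no pending predecessors
        for (int v = 0; v < n; v++) {
            if (indeg[v] == 0) ready.add(v);
        }

        List<Integer> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            int v = ready.remove();
            order.add(v);
            for (int w : out.get(v)) {
                // "Remove" edge (v, w); w becomes ready once all its predecessors are output.
                if (--indeg[w] == 0) ready.add(w);
            }
        }
        return order.size() == n ? order : null;
    }
}
```

For the task graph with edges 0→1, 0→2, 1→3 and 2→3, this sketch returns the ordering 0, 1, 2, 3. A null result doubles as a test for whether the digraph is acyclic.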
Fig 10: Two topological orderings of the same acyclic digraph
Multiple Choice Questions:
1. A connected graph T without any cycles is called
a. a tree graph b. free tree c. a tree d. All of above
2. In a graph, e=(u, v) means
a. u is adjacent to v but v is not adjacent to u b. e begins at u and ends at v
c. u is predecessor and v is successor d. both b and c
3. If every node u in G is adjacent to every other node v in G, the graph is said to be
a. isolated b. complete c. finite d. strongly connected
4. In a graph, if e=[u, v], then u and v are called
a. endpoints of e b. adjacent nodes c. neighbours d. all of above
5. Finding the location of the element with a given value is:
a. Traversal b. Search c. Sort d. None of above
Subjective Questions:
1. Draw a simple, connected, undirected, weighted graph with 8 vertices and 16 edges, each with
unique edge weights. Illustrate the execution of Kruskal's algorithm on this graph.
2. Describe weighted graph. Also explain shortest path algorithm.
3. Explain depth first search algorithm in detail.
4. Explain various operations that can be performed on graph ADT.
University Questions:
1. Explain graph traversal techniques.
2. Write a program to give array representation of a graph.
3. Explain applications of the graph.
4. Describe shortest path algorithm.
5. Write an algorithm for (a) Depth first search (b) Breadth first search.
6. Explain graph representation with example.
References:
1. Data Structures and Algorithms in Java by Michael T. Goodrich & Roberto Tamassia
2. Data structure with Java by John R Hubbard
3. Data structure using Java by Tanenbaum