
MATH20902: Discrete Mathematics

Mark Muldoon

April 25, 2021


Contents

I Notions and Notation

1 First Steps in Graph Theory
1.1 The Königsberg Bridge Problem
1.2 Definitions: graphs, vertices and edges
1.3 Standard examples
1.4 A first theorem about graphs

2 Representation, Sameness and Parts
2.1 Ways to represent a graph
2.1.1 Edge lists
2.1.2 Adjacency matrices
2.1.3 Adjacency lists
2.2 When are two graphs the same?
2.3 Terms for parts of graphs

3 Graph Colouring
3.1 Notions and notation
3.2 An algorithm to do colouring
3.2.1 The greedy colouring algorithm
3.2.2 Greedy colouring may use too many colours
3.3 An application: avoiding clashes

4 Efficiency of algorithms
4.1 Introduction
4.2 Examples and issues
4.2.1 Greedy colouring
4.2.2 Matrix multiplication
4.2.3 Primality testing and worst-case estimates
4.3 Bounds on asymptotic growth
4.4 Analysing the examples
4.4.1 Greedy colouring
4.4.2 Matrix multiplication
4.4.3 Primality testing via trial division
4.5 Afterword

5 Walks, Trails, Paths and Connectedness
5.1 Walks, trails and paths
5.2 Connectedness
5.2.1 Connectedness in undirected graphs
5.2.2 Connectedness in directed graphs
5.3 Afterword: a useful proposition

II Trees and the Matrix-Tree Theorem

6 Trees and forests
6.1 Basic definitions
6.1.1 Leaves and internal nodes
6.1.2 Kinds of trees
6.2 Three useful lemmas
6.2.1 A festival of proofs by induction
6.2.2 Graph surgery
6.3 A theorem about trees
6.3.1 Proof of the theorem

7 The Matrix-Tree Theorems
7.1 Kirchhoff's Matrix-Tree Theorem
7.2 Tutte's Matrix-Tree Theorem
7.2.1 Arborescences: directed trees
7.2.2 Tutte's theorem
7.3 From Tutte to Kirchhoff

8 Matrix-Tree Ingredients
8.1 Lightning review of permutations
8.1.1 The Symmetric Group Sn
8.1.2 Cycles and sign
8.2 Using graphs to find the cycle decomposition
8.3 The determinant is a sum over permutations
8.4 The Principle of Inclusion/Exclusion
8.4.1 A familiar example
8.4.2 Three subsets
8.4.3 The general case
8.4.4 An example
8.5 Appendix: Proofs for Inclusion/Exclusion
8.5.1 Proof of Lemma 8.12, the case of two sets
8.5.2 Proof of Theorem 8.13
8.5.3 Alternative proof

9 Proof of Tutte's Matrix-Tree Theorem
9.1 Single predecessor graphs
9.2 Counting spregs with determinants
9.2.1 Counting spregs
9.2.2 An example
9.2.3 Counting spregs in general
9.3 Proof of Tutte's theorem

III Eulerian and Hamiltonian Graphs

10 Eulerian Multigraphs
10.1 Eulerian tours and trails

11 Hamiltonian graphs and the Bondy-Chvátal Theorem
11.1 Hamiltonian graphs
11.2 The closure of a graph
11.2.1 An algorithm to construct [G]
11.2.2 An example
11.3 The Bondy-Chvátal Theorem
11.4 Afterword

IV Distance in Graphs and Scheduling

12 Distance in Graphs
12.1 Adding weights to edges
12.2 A notion of distance
12.3 Shortest path problems
12.3.1 Uniform weights & Breadth First Search
12.3.2 Bellman's equations
12.4 Appendix: BFS revisited

13 Tropical Arithmetic and Shortest Paths
13.1 All pairs shortest paths
13.2 Counting walks using linear algebra
13.3 Tropical arithmetic
13.3.1 Tropical matrix operations
13.3.2 A tropical version of Bellman's equations
13.4 Minimal-weight paths in a tropical style

14 Critical Path Analysis
14.1 Scheduling problems
14.1.1 From tasks to weighted digraphs
14.1.2 From weighted digraphs to schedules
14.2 Graph-theoretic details
14.2.1 Shortest times and maximal-weight paths
14.2.2 Topological ordering
14.3 Critical paths
14.3.1 Earliest starts
14.3.2 Latest starts
14.3.3 Critical paths

V Planar Graphs

15 Planar Graphs
15.1 Drawing graphs in the plane
15.1.1 The topology of curves in the plane
15.1.2 Faces of a planar graph
15.2 Euler's formula for planar graphs
15.3 Planar graphs can't have many edges
15.3.1 Preliminaries: bridges and girth
15.3.2 Main result: an inequality relating n and m
15.3.3 Gritty details of the proof of Theorem 15.12
15.3.4 The maximal number of edges in a planar graph
15.4 Two non-planar graphs
15.5 Kuratowski's Theorem
15.6 Wagner's Theorem
15.7 Afterword

Part I

Notions and Notation


Chapter 1

First Steps in Graph Theory

This chapter introduces Graph Theory, the main subject of the course, and includes
some basic definitions as well as a number of standard examples.
Reading: Some of the material in this chapter comes from the beginning of Chap-
ter 1 in

Dieter Jungnickel (2013), Graphs, Networks and Algorithms, 4th edition,


which is available online via SpringerLink.

If you are at the university, either physically or via the VPN, you can download the
chapters of this book as PDFs.

1.1 The Königsberg Bridge Problem


[Video 1.1] Graph theory is usually said to have been invented in 1736 by the great
Leonhard Euler, who used it to solve the Königsberg Bridge Problem. I used to find this
hard to believe—the graph-theoretic graph is such a natural and useful abstraction
hard to believe—the graph-theoretic graph is such a natural and useful abstraction
that it’s difficult to imagine that no one hit on it earlier—but Euler’s paper about
graphs[1] is generally acknowledged[2] as the first one and it certainly provides a sat-
isfying solution to the bridge problem. The sketch in the left panel of Figure 1.1
comes from Euler’s original paper and shows the main features of the problem. As
one can see by comparing Figures 1.1 and 1.2, even this sketch is already a bit of
an abstraction.
The question is: can one make a walking tour of the city that (a) starts and
finishes in the same place and (b) crosses every bridge exactly once? The short
answer to this question is “No” and the key idea behind proving this is illustrated in
the right panel of Figure 1.1. It doesn’t matter what route one takes while walking
around on, say, the smaller island: all that really matters are the ways in which
the bridges connect the four land masses. Thus we can shrink the small island to a
[1] L. Euler (1736), Solutio problematis ad geometriam situs pertinentis, Commentarii Academiae
Scientiarum Imperialis Petropolitanae 8, pp. 128–140.
[2] See, for example, Robin Wilson and John J. Watkins (2013), Combinatorics: Ancient &
Modern, OUP. ISBN 978-0-19-965659-2.


Figure 1.1: The panel at left shows the seven bridges and four land masses
that provide the setting for the Königsberg bridge problem, which asks whether it is
possible to make a circular walking tour of the city that crosses every bridge exactly
once. The panel at right includes a graph-theoretic abstraction that helps one prove
that no such tour exists.

Figure 1.2: Königsberg is a real place—a port on the Baltic—and during Euler’s
lifetime it was part of the Kingdom of Prussia. The panel at left is a bird’s-eye view
of the city that shows the celebrated seven bridges. It was made by Matthäus Merian
and published in 1652. The city is now called Kaliningrad and is part of the Russian
Federation. It was bombed heavily during the Second World War: the panel at right
shows a recent satellite photograph and one can still recognise the two islands and
modern versions of some of the bridges, but very little else appears to remain.

point—and do the same with the other island, as well as with the north and south
banks of the river—and then connect them with arcs that represent the bridges.
The problem then reduces to the question whether it is possible to draw a path that
starts and finishes at the same dot, but traces over each of the seven arcs exactly
once.
One can prove that such a tour is impossible by contradiction. Suppose that
one exists: it must then visit the easternmost island (see Figure 1.3) and we are
free to imagine that the tour actually starts there. To continue we must leave the
island, crossing one of its three bridges. Then, later, because we are required to


Figure 1.3: The Königsberg Bridge graph on its own: it is not possible to trace a
path that starts and ends on the eastern island without crossing some bridge at least
twice.

cross each bridge exactly once, we will have to return to the eastern island via a
different bridge from the one we used when setting out. Finally, having returned
to the eastern island once, we will need to leave again in order to cross the island’s
third bridge. But then we will be unable to return without recrossing one of the
three bridges. And this provides a contradiction: the walk is supposed to start and
finish in the same place and cross each bridge exactly once.

1.2 Definitions: graphs, vertices and edges


The abstraction behind Figure 1.3 turns out to be very powerful: one can draw
similar diagrams to represent “connections” between “things” in a very general way.
Examples include: representations of social networks in which the points are people
and the arcs represent acquaintance; genetic regulatory networks in which the points
are genes and the arcs represent activation or repression of one gene by another and
scheduling problems in which the points are tasks that contribute to some large
project and the arcs represent interdependence among the tasks. To help us make
more rigorous statements, we’ll use the following definition:

Definition 1.1. A graph is a finite, nonempty set V , the vertex set, along with
a set E, the edge set, whose elements e ∈ E are pairs e = (a, b) with a, b ∈ V .

We will often write G(V, E) to mean the graph G with vertex set V and edge
set E. An element v ∈ V is called a vertex (plural vertices) while an element e ∈ E
is called an edge.
The definition above is deliberately vague about whether the pairs that make
up the edge set E are ordered pairs—in which case (a, b) and (b, a) with a ̸= b are
distinct edges—or unordered pairs. In the unordered case (a, b) and (b, a) are just
two equivalent ways of representing the same edge.

Definition 1.2. An undirected graph is a graph in which the edge set consists of
unordered pairs.


Figure 1.4: Diagrams representing graphs with vertex set V = {a, b} and edge
set E = {(a, b)}. The diagram at left is for an undirected graph, while the one at
right shows a directed graph. Thus the arrow on the right represents the ordered pair
(a, b).

Definition 1.3. A directed graph is a graph in which the edge set consists of
ordered pairs. The term “directed graph” is often abbreviated as digraph.

Although graphs are defined abstractly as above, it’s very common to draw
diagrams to represent them. These are drawings in which the vertices are shown
as points or disks and the edges as line segments or arcs. Figure 1.4 illustrates the
graphical convention used to mark the distinction between undirected and directed
edges: the former are drawn as line segments or arcs, while the latter are shown as
arrows. A directed edge e = (a, b) appears as an arrow that points from a to b.
Sometimes one sees graphs with more than one edge[3] connecting the same two
vertices; the Königsberg Bridge graph is an example. Such edges are called multiple
or parallel edges. Additionally, one sometimes sees graphs with edges of the form
e = (v, v). These edges, which connect a vertex to itself, are called loops or self
loops. All these terms are illustrated in Figure 1.5.


Figure 1.5: A graph whose edge set includes the self loop (v1 , v1 ) and two parallel
copies of the edge (v1 , v2 ).

It is important to bear in mind that diagrams such as those in Figures 1.3–1.5
are only illustrations of the edges and vertices. In particular, the arcs representing
edges may cross, but this does not necessarily imply anything: see Figure 1.6.

Remark. In this course when we say "graph" we will normally mean an undirected
graph that contains no loops or parallel edges: if you look in other books you may
see such objects referred to as simple graphs. By contrast, we will refer to a graph
that contains parallel edges as a multigraph.

[3] In this case it is a slight abuse of terminology to talk about the edge "set" of the graph, as
sets contain only a single copy of each of their elements. Very scrupulous books (and students)
might prefer to use the term edge list in this context, but I will not insist on this nicety.

Figure 1.6: Two diagrams for the same graph: the crossed edges in the leftmost
version do not signify anything.

Definition 1.4. Two vertices a ≠ b in an undirected graph G(V, E) are said to be
adjacent or to be neighbours if (a, b) ∈ E. In this case we also say that the edge

Definition 1.5. If the directed edge e = (u, v) is present in a directed graph
H(V′, E′) we will say that u is a predecessor of v and that v is a successor
of u. We will also say that u is the tail or tail vertex of the edge (u, v), while v is
the tip or tip vertex.

1.3 Standard examples


[Video 1.2] In this section I'll introduce a few families of graphs that we will refer to
throughout the rest of the term.

The complete graphs Kn


The complete graph Kn is the undirected graph on n vertices whose edge set includes
every possible edge. If one numbers the vertices consecutively the edge and vertex
set are

V = {v1 , v2 , . . . , vn }
E = {(vj , vk ) | 1 ≤ j ≤ (n − 1), (j + 1) ≤ k ≤ n} .

There are thus

    |E| = (n choose 2) = n(n − 1)/2

edges in total: see Figure 1.7 for the first few examples.

The path graphs Pn


These graphs are formed by stringing n vertices together in a path. The word “path”
actually has a technical meaning in graph theory, but you needn’t worry about that


Figure 1.7: The first five members of the family Kn of complete graphs.


Figure 1.8: Diagrams for the path graphs P4 and P5 .

today. Pn has vertex and edge sets as listed below,

V = {v1 , v2 , . . . , vn }
E = {(vj , vj+1 ) | 1 ≤ j < n} ,

and Figure 1.8 shows two examples.

The cycle graphs Cn


The cycle graph Cn , sometimes also called the circuit graph, is a graph in which
n ≥ 3 vertices are arranged in a ring. If one numbers the vertices consecutively the
edge and vertex set are

V = {v1 , v2 , . . . , vn }
E = {(v1 , v2 ), (v2 , v3 ), . . . , (vj , vj+1 ), . . . , (vn−1 , vn ), (vn , v1 )} .

Cn has n edges that are often written (vj , vj+1 ), where the subscripts are taken to
be defined periodically so that, for example, vn+1 ≡ v1 . See Figure 1.9 for examples.


Figure 1.9: The first three members of the family Cn of cycle graphs.

The complete bipartite graphs Km,n
The complete bipartite graph Km,n is a graph whose vertex set is the union of a set V1
of m vertices with a second set V2 of n different vertices and whose edge set includes
every possible edge running between these two subsets:
V = V1 ∪ V2
= {u1 , . . . , um } ∪ {v1 , . . . , vn }
E = {(u, v) | u ∈ V1 , v ∈ V2 } .
Km,n thus has |E| = mn edges: see Figure 1.10 for examples.

Figure 1.10: A few members of the family Km,n of complete bipartite graphs: K1,3,
K2,2 and K2,3.
Here the two subsets of the vertex set are illustrated with colour: the white vertices
constitute V1 , while the red ones form V2 .

There are other sorts of bipartite graphs too:


Definition 1.6. A graph G(V, E) is said to be a bipartite graph if
• it has a nonempty edge set: E ≠ ∅ and
• its vertex set V can be decomposed into two nonempty, disjoint subsets
V = V1 ∪ V2 with V1 ∩ V2 = ∅ and V1 ≠ ∅ and V2 ≠ ∅
in such a way that all the edges (u, v) ∈ E contain a member of V1 and a
member of V2.

The cube graphs Id


These graphs are specified in a way that's closer to the purely combinatorial, set-
theoretic definition of a graph given above. Id , the d-dimensional cube graph, has
vertices that are strings of d zeroes or ones, and all possible labels occur. Edges
connect those vertices whose labels differ in exactly one position. Thus, for example,
I2 has vertex and edge sets

V = {00, 01, 10, 11} and E = {(00, 01), (00, 10), (01, 11), (10, 11)}.

Figure 1.11 shows diagrams for the first few cube graphs and these go a long way
toward explaining the name. More generally, Id has vertex and edge sets given by

V = {w | w ∈ {0, 1}^d}
E = {(w, w′) | w and w′ differ in a single position}.

This means that Id has |V| = 2^d vertices, but it's a bit harder to count the edges.
In the last part of the chapter we'll prove a theorem that enables one to show that
Id has |E| = d·2^(d−1) edges.


Figure 1.11: The first three members of the family Id of cube graphs. Notice
that all the cube graphs are bipartite (the red and white vertices are the two disjoint
subsets from Definition 1.6), but that, for example, I3 has only 12 edges, while the
complete bipartite graph K4,4 has 16.
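
Since these families will recur all term, it may help to see them built programmatically.
The following Python sketch (an illustration of my own, not part of the printed notes)
constructs the edge lists straight from the definitions above; the assertions check the
edge counts quoted in the text.

```python
from itertools import combinations, product

def complete_graph(n):
    """K_n: every unordered pair of distinct vertices is an edge."""
    return list(combinations(range(1, n + 1), 2))

def path_graph(n):
    """P_n: edges (v_j, v_{j+1}) for 1 <= j < n."""
    return [(j, j + 1) for j in range(1, n)]

def cycle_graph(n):
    """C_n (n >= 3): a path plus the closing edge (v_n, v_1)."""
    return path_graph(n) + [(n, 1)]

def complete_bipartite(m, n):
    """K_{m,n}: every edge running between the u-vertices and the v-vertices."""
    return [(f"u{i}", f"v{j}") for i in range(1, m + 1) for j in range(1, n + 1)]

def cube_graph(d):
    """I_d: binary strings of length d, joined when they differ in one position."""
    V = ["".join(bits) for bits in product("01", repeat=d)]
    return [(w, x) for w, x in combinations(V, 2)
            if sum(a != b for a, b in zip(w, x)) == 1]

assert len(complete_graph(5)) == 5 * 4 // 2   # |E| = n(n-1)/2
assert len(complete_bipartite(4, 4)) == 16    # |E| = mn
assert len(cube_graph(3)) == 12               # |E| = d * 2^(d-1), as in Figure 1.11
```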

v       a  b  c  d  e  f  g  h
deg(v)  1  1  1  2  1  1  4  1

Figure 1.12: The degrees of the vertices in a small graph. Note that the graph
consists of two “pieces”.

1.4 A first theorem about graphs


[Video 1.3] I find it wearisome to give, or learn, one damn definition after another and
so I'd like to conclude the chapter with a small, but useful theorem. To do this we need
one more definition:

Definition 1.7. In an undirected graph G(V, E) the degree of a vertex v ∈ V is
the number of times that the vertex appears in the edge set. One writes deg(v) and
says "the degree of v".

So, for example, every vertex in the complete graph Kn has degree n − 1, while
every vertex in a cycle graph Cn has degree 2; Figure 1.12 provides more examples.
The generalisation of degree to directed graphs is slightly more involved. A vertex v
in a digraph has two degrees: an in-degree that counts the number of edges having
v at their tip and an out-degree that counts the number of edges having v at their tail.
See Figure 1.13 for an example.

v   deg_in(v)   deg_out(v)
a   2           0
b   1           1
c   1           1
d   0           2

Figure 1.13: The degrees of the vertices in a small digraph.

Once we have the notion of degree, we can formulate our first theorem:

Theorem 1.8 (Handshaking Lemma, Euler 1736). If G(V, E) is an undirected graph
then

    Σ_{v∈V} deg(v) = 2|E|.    (1.1)

Proof. Each edge contributes twice to the sum of degrees, once for each of the two
vertices on which it is incident.
The following two results are immediate consequences:

Corollary 1.9. In an undirected graph there must be an even number of vertices


that have odd degree.

Corollary 1.10. The cube graph Id has |E| = d·2^(d−1).

The first is fairly obvious: the right hand side of (1.1) is clearly an even number, so
the sum of degrees appearing on the left must be even as well. To get the formula
for the number of edges in Id, note that it has 2^d vertices, each of degree d, so the
Handshaking Lemma tells us that

    2|E| = Σ_{v∈V} deg(v) = 2^d × d

and thus |E| = (d × 2^d)/2 = d·2^(d−1).
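
The Handshaking Lemma is easy to check by machine. Here is a small Python sketch
(again, an illustration rather than part of the notes) that computes degrees by literally
counting appearances in the edge list, as in Definition 1.7, and then verifies Theorem 1.8
and Corollary 1.10 on the cube graph I3.

```python
from collections import Counter
from itertools import combinations, product

def degrees(edges):
    """deg(v) = number of times v appears in the edge list (Definition 1.7)."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

# Build the cube graph I_3 and check the two results.
d = 3
V = ["".join(bits) for bits in product("01", repeat=d)]
E = [(w, x) for w, x in combinations(V, 2)
     if sum(p != q for p, q in zip(w, x)) == 1]

deg = degrees(E)
assert all(deg[v] == d for v in V)         # every vertex of I_d has degree d
assert sum(deg.values()) == 2 * len(E)     # Theorem 1.8, the Handshaking Lemma
assert len(E) == d * 2 ** (d - 1)          # Corollary 1.10
```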

Chapter 2

Representation, Sameness and


Parts

Reading: Some of the material in this chapter comes from the beginning of Chap-
ter 1 in
Dieter Jungnickel (2013), Graphs, Networks and Algorithms, 4th edition,
which is available online via SpringerLink.

If you are at the university, either physically or via the VPN, you can download the
chapters of this book as PDFs.

2.1 Ways to represent a graph


[Video 1.4] The first part of this chapter is concerned with various ways of specifying a graph.
It may seem unnecessary to have so many different descriptions for a mathematical
object that is, fundamentally, just a pair of finite sets, but each of the representa-
tions below will prove convenient when we are developing algorithms (step-by-step
computational recipes) to solve problems involving graphs.

2.1.1 Edge lists


From the first chapter, we already know how to represent a graph G(V, E) by specifying
its vertex set V and its edge list E as, for example,
Example 2.1 (Edge list representation). The undirected graph G(V, E) with

V = {1, 2, 3} and E = {(1, 2), (2, 3), (1, 3)}

is K3 , the complete graph on three vertices. But if we regard the edges as directed
then G is the graph pictured at the right of Figure 2.1.
Of course, if every vertex in G(V, E) appears in some edge (equivalently, if every
vertex has nonzero degree), then we can dispense with the vertex set and specify
the graph by its edge list alone.


Figure 2.1: If the graph from Example 2.1 is regarded as undirected (our default
assumption) then it is K3 , the complete graph on three vertices, but if it’s directed
then it’s the digraph at right above.

2.1.2 Adjacency matrices


A second approach is to give an adjacency matrix, often written A. One builds an
adjacency matrix by first numbering the vertices, so that the vertex set becomes
V = {v1 , v2 , . . . , vn } for a graph on n vertices. The adjacency matrix A is then an
n × n matrix whose entries are given by the following rule:

1 if (vi , vj ) ∈ E
Aij = (2.1)
0 otherwise

Once again, the directed and undirected cases are different. For the graphs from
Example 2.1 we have: if G is the undirected graph (K3, at left in Figure 2.1) then

        [ 0 1 1 ]
    A = [ 1 0 1 ] ,
        [ 1 1 0 ]

but if G is the directed graph (at right in Figure 2.1) then

        [ 0 1 1 ]
    A = [ 0 0 1 ] .
        [ 0 0 0 ]

Remark 2.2. The following properties of the adjacency matrix follow readily from
the definition in Eqn. (2.1).

• The adjacency matrix is not unique because it depends on a numbering scheme


for the vertices. If one renumbers the vertices, the rows and columns of A will
be permuted accordingly.

• If G(V, E) is undirected then its adjacency matrix A is symmetric. That’s


because we think of the edges as unordered pairs, so, for example, (1, 2) ∈ E
is the same thing as (2, 1) ∈ E.

• If the graph has no loops then Ajj = 0 for 1 ≤ j ≤ n. That is, there are zeroes
down the main diagonal of A.

11
• One can compute the degree of a vertex by adding up entries in the adjacency
matrix. I leave it as an exercise for the reader to establish that in an undirected
graph,

    deg(vj) = Σ_{k=1}^{n} Ajk = Σ_{k=1}^{n} Akj ,    (2.2)

where the first sum runs across the j-th row, while the second runs down the
j-th column. Similarly, in a directed graph we have

    deg_out(vj) = Σ_{k=1}^{n} Ajk  and  deg_in(vj) = Σ_{k=1}^{n} Akj .    (2.3)

(These rules are illustrated in the Python sketch after this list.)

• Sometimes one sees a modified form of the adjacency matrix used to describe
multigraphs (graphs that permit two or more edges between a given pair of
vertices). In this case one takes

Aij = number of times the edge (i, j) appears in E (2.4)
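
To make Eqns (2.1)–(2.3) concrete, here is a possible Python sketch (illustrative only;
the function name is mine) that builds adjacency matrices for the two graphs of
Example 2.1 and reads degrees off the row and column sums.

```python
def adjacency_matrix(n, edges, directed=False):
    """The n x n matrix of Eqn (2.1); vertices are numbered 1..n."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i - 1][j - 1] = 1
        if not directed:
            A[j - 1][i - 1] = 1   # unordered pairs give a symmetric matrix
    return A

edges = [(1, 2), (2, 3), (1, 3)]                 # Example 2.1

A = adjacency_matrix(3, edges)                   # undirected: this is K_3
assert A == [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
assert sum(A[1]) == 2                            # deg(v_2) via a row sum, Eqn (2.2)

D = adjacency_matrix(3, edges, directed=True)
assert D == [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
assert sum(D[0]) == 2                            # deg_out(v_1), Eqn (2.3)
assert sum(row[0] for row in D) == 0             # deg_in(v_1), Eqn (2.3)
```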

2.1.3 Adjacency lists


One can also specify an undirected graph by giving the adjacency lists of all its
vertices.

Definition 2.3. In an undirected graph G(V, E) the adjacency list associated with
a vertex v is the set Av ⊆ V defined by

Av = {u ∈ V | (u, v) ∈ E}.

An example appears in Figure 2.2. It follows readily from the definition of degree
that
deg(v) = |Av |. (2.5)

A1 = {2}
A2 = {1, 3, 4}
A3 = {2, 4, 5}
A4 = {2, 3}
A5 = {3}

Figure 2.2: The graph at left has adjacency lists as shown at right.

v   Predecessors   Successors
1   ∅              {2, 3}
2   {1}            {3}
3   {1, 2}         ∅

Figure 2.3: The directed graph at left has the predecessor and successor lists shown
at right.

Similarly, one can specify a directed graph by providing separate lists of succes-
sors or predecessors (these terms were defined in Lecture 1) for each vertex.

Definition 2.4. In a directed graph G(V, E) the predecessor list of a vertex v
is the set Pv ⊆ V defined by

Pv = {u ∈ V | (u, v) ∈ E}

while the successor list of v is the set Sv ⊆ V defined by

Sv = {u ∈ V | (v, u) ∈ E}.

Figure 2.3 gives some examples. The analogues of Eqn. (2.5) for a directed graph
are
degin (v) = |Pv | and degout (v) = |Sv |. (2.6)
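
Definitions 2.3 and 2.4 translate directly into code. The sketch below (another
illustration; the data are read off Figures 2.2 and 2.3) builds all three kinds of lists
from an edge list.

```python
def adjacency_lists(edges):
    """A_v = {u | (u, v) in E} for an undirected graph (Definition 2.3)."""
    A = {}
    for u, v in edges:
        A.setdefault(u, set()).add(v)
        A.setdefault(v, set()).add(u)
    return A

def pred_succ_lists(edges):
    """Predecessor and successor lists of a digraph (Definition 2.4)."""
    P, S = {}, {}
    for u, v in edges:
        S.setdefault(u, set()).add(v)
        P.setdefault(v, set()).add(u)
    return P, S

# The undirected graph of Figure 2.2:
A = adjacency_lists([(1, 2), (2, 3), (2, 4), (3, 4), (3, 5)])
assert A[3] == {2, 4, 5}                   # A_3, and deg(v) = |A_v| as in Eqn (2.5)

# The digraph of Figure 2.3:
P, S = pred_succ_lists([(1, 2), (1, 3), (2, 3)])
assert P[3] == {1, 2} and S[1] == {2, 3}   # Eqn (2.6): deg_in and deg_out
```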

2.2 When are two graphs the same?


[Video 2.1] For the small graphs that appear in these notes, it's usually fairly obvious when
two of them are the same. But in general it's nontrivial to be rigorous about what
we mean when we say two graphs are “the same”. The point is that if we stick to
the abstract definition of a graph-as-two-sets, we need to formulate our definition
of sameness in a similar style. Informally we’d like to say that two graphs are the
same (we’ll use the term isomorphic for this) if it is possible to relabel the vertex
sets in such a way that their edge sets match up. More precisely:

Definition 2.5. Two graphs G1 (V1 , E1 ) and G2 (V2 , E2 ) are said to be isomorphic
if there exists a bijection[1] α : V1 → V2 such that the edge (α(a), α(b)) ∈ E2 if and
only if (a, b) ∈ E1 .

Generally it’s difficult to decide whether two graphs are isomorphic. In particu-
lar, there are no known fast algorithms[2] (we'll learn to speak more precisely about
what it means for an algorithm to be “fast” later in the term) to decide. One can,
[1] Recall that a bijection is a mapping that's one-to-one and onto.
[2] Algorithms for graph isomorphism are the subject of intense current research: see Erica Klar-
reich's Jan. 2017 article in Quanta Magazine, Complexity Theory Problem Strikes Back, for a
popular account of some recent results.


Figure 2.4: Here are three different graphs that are all isomorphic to the cube
graph I3 , which is the middle one. The bijections that establish the isomorphisms
are listed in Table 2.1.
v        000  001  010  011  100  101  110  111
αL(v)    1    2    3    4    5    6    7    8
αR(v)    a    b    c    d    e    f    g    h

Table 2.1: If we number the graphs in Figure 2.4 so that the leftmost is G1 (V1 , E1 )
and the rightmost is G3 (V3 , E3 ), then the bijections αL : V2 → V1 and αR : V2 → V3
listed above establish that G2 is isomorphic, respectively, to G1 and G3 .

of course, simply try all possible bijections between the two vertex sets, but there
are n! of these for graphs on n vertices and so this brute force approach rapidly
becomes impractical. On the other hand, it’s often possible to detect quickly that
two graphs aren’t isomorphic. The simplest such tests are based on the following
propositions, whose proofs are left to the reader.

Proposition 2.6. If G1 (V1 , E1 ) and G2 (V2 , E2 ) are isomorphic then |V1 | = |V2 | and
|E1 | = |E2 |.

Proposition 2.7. If G1 (V1 , E1 ) and G2 (V2 , E2 ) are isomorphic and α : V1 → V2 is
the bijection that establishes the isomorphism, then deg(v) = deg(α(v)) for every
v ∈ V1 and deg(u) = deg(α^(−1)(u)) for every u ∈ V2.

Another simple test depends on the following quantity, examples of which appear
in Figure 2.5.

Definition 2.8. The degree sequence of an undirected graph G(V, E) is a list of
the degrees of the vertices, arranged in ascending order.

The corresponding test for non-isomorphism depends on the following proposition,
whose proof is left as an exercise.

Proposition 2.9. If G1 (V1 , E1 ) and G2 (V2 , E2 ) are isomorphic then they have the
same degree sequence.

14
(1, 2, 2, 3) (2, 2, 2) (1, 1, 2, 2)

Figure 2.5: Three small graphs and their degree sequences.


Figure 2.6: These two graphs both have degree sequence (1, 2, 2, 2, 3), but they’re
not isomorphic: see Example 2.10 for a proof.

Unfortunately, although having the same degree sequence is a necessary condition
for two graphs to be isomorphic, it isn't sufficient to establish isomorphism. That
is, it's possible for two graphs to have the same degree sequence, but not be
isomorphic: Figure 2.6 shows one such pair, but it's easy to make up more.

Example 2.10 (Proof that the graphs in Figure 2.6 aren’t isomorphic). Both graphs
in Figure 2.6 have the same degree sequence, (1, 2, 2, 2, 3), so both contain a single
vertex of degree 1 and a single vertex of degree 3. These vertices are adjacent in the
graph at left, but not in the one at right and this observation forms the basis for a
proof by contradiction that the graphs aren’t isomorphic.
Assume, for contradiction, that they are isomorphic and that

α : {v1 , v2 , v3 , v4 , v5 } → {u1 , u2 , u3 , u4 , u5 }

is the bijection that establishes the isomorphism. Then Prop. 2.7 implies that it must
be true that α(v1 ) = u1 (as these are the sole vertices of degree one) and α(v2 ) = u3 .
But then the presence of the edge (v1 , v2 ) on the left would imply the existence of an
edge (α(v1 ), α(v2 )) = (u1 , u3 ) on the right, and no such edge exists. This contradicts
our assumption that α establishes an isomorphism, so no such α can exist and the
graphs aren’t isomorphic.

2.3 Terms for parts of graphs
[Video 2.2] Finally, we'll often want to speak of parts of graphs and the two most useful
definitions here are:

Definition 2.11. A subgraph of a graph G(V, E) is a graph G′(V′, E′) where
V′ ⊆ V and E′ ⊆ E.

and

Definition 2.12. Given a graph G(V, E) and a subset of its vertices V ′ ⊆ V , the
subgraph induced by V ′ is the subgraph G′ (V ′ , E ′ ) where

E ′ = {(u, v) | u, v ∈ V ′ and (u, v) ∈ E}.

That is, the subgraph induced by the vertices V ′ consists of V ′ itself and all those
edges in the original graph that involve only vertices from V ′ . Both these definitions
are illustrated in Figure 2.7.

Figure 2.7: The three graphs at right are subgraphs of the one at left. The middle
one is the subgraph induced by the blue shaded vertices.

Chapter 3

Graph Colouring

The material from the first two chapters provides enough background that we can
begin to discuss a problem—graph colouring—that is both mathematically rich and
practically applicable.
Reading:
The material for this chapter appears, in very condensed form, in Chapter 9 of

Dieter Jungnickel (2013), Graphs, Networks and Algorithms, 4th edition


(Available from SpringerLink)

A somewhat longer discussion, with many interesting exercises, appears in

John M. Harris, Jeffry L. Hirst and Michael J. Mossinghoff (2008), Com-


binatorics and Graph Theory, 2nd edition. (Available from SpringerLink)

3.1 Notions and notation

[Video 2.3] Definition 3.1. A k-colouring of an undirected graph G(V, E) is a function

    ϕ : V → {1, . . . , k}

that assigns distinct values to adjacent vertices: that is, (u, v) ∈ E ⇒ ϕ(u) ≠ ϕ(v).
If G has a k-colouring then it is said to be k-colourable.

I’ll refer to the values assigned by ϕ(v) as “colours” and say that a graph is k-
colourable if one can draw it in such a way that no two adjacent vertices have the
same colour. Examples of graphs and colourings include

• Kn , the complete graph on n vertices, is clearly n-colourable, but not
(n − 1)-colourable;

• Km,n , the complete bipartite graph on groups of m and n vertices, is
2-colourable.


Figure 3.1: The complete graphs K4 and K5 as well as the complete bipartite
graphs K2,2 and K3,4 , each coloured using the smallest possible number of colours.
Here the colouring is represented in two ways: as numbers giving ϕ(v) for each vertex
v and with, well, colours (see the electronic version).

Both classes of example are illustrated in Figure 3.1.


Definition 3.2. The chromatic number χ(G) is the smallest number k such that
G is k-colourable.
Thus, as the examples above suggest, χ(Kn ) = n and χ(Km,n ) = 2. The latter is a
special case of the following easy lemma, whose proof is left as an exercise.
Lemma 3.3. A graph G has chromatic number χ(G) = 2 if and only if it is bipartite.
Another useful result is
Lemma 3.4. If H is a subgraph of G and G is k-colourable, then so is H.
and an immediate corollary is
Lemma 3.5. If H is a subgraph of G then χ(H) ≤ χ(G).
which comes in handy when trying to prove that a graph has a certain chromatic
number.
The proof of Lemma 3.4 is straightforward: the idea is that constraints on the
colours of vertices arise from edges and so, as every edge in H is also present in
G, it can’t be any harder to colour H than it is to colour G. Equivalently: if we
have a colouring of G and want a colouring of H we can simply use the same colour
assignments. To be more formal, say that the vertex and edge sets of G are V and E,
respectively, while those of H are V ′ ⊆ V and E ′ ⊆ E. If a map ϕG : V → {1, . . . , k}
is a k-colouring of G, then ϕH : V ′ → {1, . . . , k} defined as the restriction of ϕG to
V ′ produces a k-colouring of H.
Lemma 3.5 then follows from the observation that, although Lemma 3.4 assures
us that H has a colouring that uses χ(G) colours, it may also be possible to find
some other colouring that uses fewer.
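
Checking Definition 3.1 mechanically is straightforward, and doing so also illustrates
the restriction argument in the proof of Lemma 3.4. This is a sketch of my own, not
course code:

```python
def is_proper_colouring(edges, phi):
    """Definition 3.1: adjacent vertices must receive distinct colours."""
    return all(phi[u] != phi[v] for u, v in edges)

K4 = [(a, b) for a in range(4) for b in range(a + 1, 4)]
phi = {0: 1, 1: 2, 2: 3, 3: 4}         # a 4-colouring of K_4
assert is_proper_colouring(K4, phi)

P3 = [(0, 1), (1, 2)]                  # a subgraph of K_4
assert is_proper_colouring(P3, phi)    # the restriction of phi still works
```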

3.2 An algorithm to do colouring
[Video 2.4] The chromatic number χ(G) is defined as a kind of ideal: it's the minimal k for
which we can find a k-colouring. This might make you suspect that it's hard to find
χ(G) for an arbitrary graph—how could you ever know that you’d used the smallest
possible number of colours? And, aside from a few exceptions such as those in the
previous section, you’d be right to think this: there is no known fast (we’ll make
the notion of “fast” more precise soon) algorithm to find an optimal (in the sense of
using the smallest number of colours) colouring.

3.2.1 The greedy colouring algorithm


There is, however, a fairly easy way to compute a (possibly non-optimal) colouring
c : V → N. The idea is to number the vertices and then, starting with c(v1 ) = 1,
visit the remaining vertices in order, assigning them the lowest-numbered colour not
yet used for a neighbour. The algorithm is called greedy because it has no sense of
long-range strategy: it just proceeds through the list of vertices, blindly choosing
the colour that seems best at the moment.

Algorithm 3.6 (Greedy colouring).


Given a graph G with edge set E, vertex set V = {v1 , . . . , vn } and adjacency lists
Av , construct a function c : V → N such that if the edge e = (vi , vj ) ∈ E, then
c(vi ) ̸= c(vj ).

(1) Initialize
Set c(vj ) ← 0 for all 1 ≤ j ≤ n
c(v1 ) ← 1
j ← 2

(2) c(vj ) ← min{ k ∈ N | k > 0 and c(u) ≠ k ∀u ∈ Avj }

(3) Are we finished? Is j = n?

• If so, stop: we’ve constructed a function c with the desired properties.


• If not, set j ← (j + 1) and go to step (2).

Remarks

• The algorithm above is meant to be explicit enough that one could implement
it in R or MATLAB. It thus includes expressions such as j ← 2 which means
“set j to 2” or “j gets the (new) value 2”. The operator ← is sometimes called
the assignment operator and it appears in some form in all the programming
languages I know. Sometimes it’s expressed with notation like j = j + 1,
but this is a jarring, nonsensical-looking thing for a mathematician and so I’ll
avoid it.

• We will discuss several more algorithms in this course, but will not be much
more formal about how they are specified. This is mainly because a truly rig-
orous account of computation would take us into the realms of computability
theory, a part of mathematical logic, and would require much of the rest of
the term, leaving little time for our main subjects.
Finally, to emphasise further the mechanical nature of greedy colouring, we could
rewrite it in a style that looks even closer to MATLAB code:
Algorithm 3.7 (Greedy colouring: as pseudo-code).
Given a graph G with edge set E, vertex set V = {v1 , . . . , vn } and adjacency lists
Av , construct a function c : V → N such that if the edge e = (vi , vj ) ∈ E, then
c(vi ) ̸= c(vj ).
(1) Set c(vj ) ← 0 for 1 ≤ j ≤ n.
(2) c(v1 ) ← 1.
(3) for 2 ≤ j ≤ n {
(4) Choose a colour k > 0 for vertex vj that differs from those of its neighbours:
    c(vj ) ← min{ k ∈ N | k > 0 and c(u) ≠ k ∀u ∈ Avj }
(5) } End of loop over vertices vj .
Both versions of the algorithm perform exactly the same steps, in the same
order, so comparison of these two examples may clarify the different approaches to
presenting algorithms.
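
For readers who want to experiment, here is one way to transcribe Algorithm 3.7 into
Python; it is a sketch that mirrors the pseudo-code step for step (the function and
variable names are mine).

```python
def greedy_colouring(vertices, adj):
    """Algorithm 3.7: vertices is the ordered list v_1, ..., v_n and adj maps
    each vertex to its adjacency list A_v.  Returns the function c as a dict."""
    c = {v: 0 for v in vertices}        # step (1): mark every vertex uncoloured
    c[vertices[0]] = 1                  # step (2): c(v_1) <- 1
    for v in vertices[1:]:              # steps (3)-(5): remaining vertices in order
        used = {c[u] for u in adj[v]}   # colours already used by neighbours
        k = 1
        while k in used:                # smallest k > 0 avoiding all neighbours
            k += 1
        c[v] = k
    return c

# Colour the cycle C_5.
V = [1, 2, 3, 4, 5]
adj = {1: [2, 5], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 1]}
print(greedy_colouring(V, adj))   # {1: 1, 2: 2, 3: 1, 4: 2, 5: 3}
```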

3.2.2 Greedy colouring may use too many colours


If we use Algorithm 3.7 to construct a function c : V → N, then we can regard it as
a k-colouring by setting ϕ(vj) = c(vj), where k is given by

    k = max_{vj ∈ V} c(vj).    (3.1)

For the reasons discussed above, this k provides only an upper bound on the chro-
matic number of G. To drive this point home, consider Figure 3.2, which illustrates
the process of applying the greedy colouring algorithm to two graphs, one in each
column.
For the graph in the left column—call it G1 —the algorithm produces a 3-
colouring, which is actually optimal. To see why, notice that the subgraph induced
by the vertices {v1 , v2 , v3 } is isomorphic to K3 . Thus we need at least 3 colours for
these three vertices and so, using Lemma 3.5, we can conclude that χ(G1 ) ≥ 3. On
the other hand, the greedy algorithm provides an explicit example of a 3-colouring,
which implies that χ(G1 ) ≤ 3, so we have proven that χ(G1 ) = 3.
The graph in the right column—call it G2 —is isomorphic to G1 (a very keen
reader could write out the isomorphism explicitly), but its vertices are numbered
differently and this means that Algorithm 3.7 colours them in a different order and
arrives at a sub-optimal k-colouring with k = 4.
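
This order-dependence is easy to reproduce with the greedy_colouring sketch above.
The exact graph of Figure 3.2 isn't reproduced here, so the example below uses a
standard alternative: a bipartite graph, hence 2-colourable by Lemma 3.3, on which
an unlucky vertex ordering forces the greedy algorithm to use three colours.

```python
# Six vertices in two groups; u_i is joined to every w_j with j != i.
adj = {"u1": ["w2", "w3"], "u2": ["w1", "w3"], "u3": ["w1", "w2"],
       "w1": ["u2", "u3"], "w2": ["u1", "u3"], "w3": ["u1", "u2"]}

good = ["u1", "u2", "u3", "w1", "w2", "w3"]   # all u's first
bad  = ["u1", "w1", "u2", "w2", "u3", "w3"]   # alternate between the groups

print(max(greedy_colouring(good, adj).values()))   # 2: optimal
print(max(greedy_colouring(bad, adj).values()))    # 3: greedy is fooled
```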


Figure 3.2: Two examples of applying Algorithm 3.7: the colouring process runs
from the top of a column to the bottom. The graphs in the right column are the same
as those in the left, save that the labels on vertices 4 and 5 have been switched. As
in Figure 3.1, the colourings are represented both numerically and graphically.

3.3 An application: avoiding clashes
[Video 2.5] I'd like to conclude by introducing a family of applications that involve avoiding
some sort of clash—where some things shouldn't be allowed to happen at the same
time or in the same place. A prototypical example is:

Example 3.8. Suppose that a group of ministers serve on committees as described
below:
Committee Members
Culture, Media & Sport Alexander, Burt, Clegg
Defence Clegg, Djanogly, Evers
Education Alexander, Gove
Food & Rural Affairs Djanogly, Featherstone
Foreign Affairs Evers, Hague
Justice Burt, Evers, Gove
Technology Clegg, Featherstone, Hague
What is the minimum number of time slots needed so that one can schedule meetings
of these committees in such a way that the ministers involved have no clashes?

One can turn this into a graph-colouring problem by constructing a graph whose
vertices are committees and whose edges connect those that have members in com-
mon: such committees can’t meet simultaneously, or their shared members will
have clashes. A suitable graph appears at left in Figure 3.3, where, for example,
the vertex for the Justice committee (labelled Just) is connected to the one for the
Education committee (Ed) because Gove serves on both.
The version of the graph at right in Figure 3.3 shows a three-colouring and,
as the vertices CMS, Ed and Just form a subgraph isomorphic to K3 , this is the
smallest number of colours one can possibly use and so the chromatic number of the
committee-and-clash graph is 3. This means that we need at least three time slots to
schedule the meetings. To see why, think of a vertex’s colour as a time slot: none of
the vertices that receive the same colour are adjacent, so none of the corresponding
committees share any members and thus that whole group of committees can be
scheduled to meet at the same time. There are variants of this problem that involve,
for example, scheduling exams so that no student will be obliged to be in two places
at the same time or constructing sufficiently many storage cabinets in a lab so that
chemicals that would react explosively if stored together can be housed separately:
see this week’s Problem Set for another example.
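
The whole construction can be carried out mechanically. The sketch below (reusing
the greedy_colouring function from earlier in the chapter; committee data from
Example 3.8, with labels abbreviated as in Figure 3.3) builds the clash graph and
assigns time slots; here the greedy algorithm happens to find an optimal three-slot
schedule.

```python
from itertools import combinations

committees = {
    "CMS":  {"Alexander", "Burt", "Clegg"},
    "Def":  {"Clegg", "Djanogly", "Evers"},
    "Ed":   {"Alexander", "Gove"},
    "FRA":  {"Djanogly", "Featherstone"},
    "FA":   {"Evers", "Hague"},
    "Just": {"Burt", "Evers", "Gove"},
    "Tech": {"Clegg", "Featherstone", "Hague"},
}

# Clash graph: join two committees whenever they share a member.
names = list(committees)
adj = {c: [] for c in names}
for a, b in combinations(names, 2):
    if committees[a] & committees[b]:
        adj[a].append(b)
        adj[b].append(a)

slots = greedy_colouring(names, adj)
print(slots)                 # a clash-free assignment of committees to time slots
print(max(slots.values()))   # 3, matching the chromatic number found above
```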


Figure 3.3: The graph at left has vertices labelled with abbreviated committee names
and edges given by shared members. The graph at right is isomorphic, but has been
redrawn for clarity and given a three-colouring, which turns out to be optimal.

Chapter 4

Efficiency of algorithms

The development of algorithms (systematic recipes for solving problems) will be a
theme that runs through the whole course. We'll be especially interested in saying
rigorous-but-useful things about how much work an algorithm requires. Today's
lecture introduces the standard vocabulary for such discussions and illustrates it with
several examples including the Greedy Colouring Algorithm, the standard algorithm
for multiplication of two square n × n matrices and the problem of testing whether
a number is prime.
Reading:
The material in the first part of today’s lecture comes from Section 2.5 of

Dieter Jungnickel (2013), Graphs, Networks and Algorithms, 4th edition


(available online via SpringerLink),

though the discussion there mentions some graph-theoretic matters that we have
not yet covered.

4.1 Introduction
[Video 3.1] The aim of today's lecture is to develop some convenient terms for the way in which
the amount of work required to solve a problem with a particular algorithm depends
on the “size” of the input. Note that this is a property of the algorithm: it may
be possible to find a better approach that solves the problem with less work. If we
were being very careful about these ideas we would make a distinction between the
quantities I’ll introduce below, which, strictly speaking, describe the time complexity
of an algorithm, and a separate set of bounds that say how much computer memory
(or how many sheets of paper, if we’re working by hand) an algorithm requires. This
latter quantity is called the space complexity of the algorithm, but we won’t worry
about that much in this course.
To get an idea of the kinds of results we’re aiming for, recall the standard algo-
rithm for pencil-and-paper addition of two numbers (write one number above the
other, draw a line underneath . . . ).

      2011
    +   21
    ------
      2032
The basic step in this process is the addition of two decimal digits, for example, in
the first column, 1 + 1 = 2. The calculation here thus requires two basic steps.
More generally, the number of basic steps required to perform an addition a + b
using the pencil-and-paper algorithm depends on the numbers of decimal digits in
a and b. The following proposition (whose proof is left to the reader) explains why
it’s thus natural to think of log10 (a) and log10 (b) as the sizes of the inputs to the
addition algorithm.
Definition 4.1 (Floor and ceiling). For a real number x ∈ R, define ⌊x⌋, which
is read as “floor of x”, to be the greatest integer less-than-or-equal-to x. Similarly,
define ⌈x⌉, “ceiling of x”, to be the least integer greater-than-or-equal-to x. In more
conventional notation these functions are given by

⌊x⌋ = max {k ∈ Z | k ≤ x} and ⌈x⌉ = min {k ∈ Z | k ≥ x} .

Proposition 4.2 (Logs and length in decimal digits). The decimal representation
of a natural number n > 0 has exactly d = 1 + ⌊log10(n)⌋ decimal digits.
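
A quick, purely illustrative spot-check of Proposition 4.2 in Python:

```python
import math

# d = 1 + floor(log10(n)) should equal the number of decimal digits of n.
for n in (1, 9, 10, 999, 1000, 2011):
    assert 1 + math.floor(math.log10(n)) == len(str(n))
```
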
In light of these results, one might hope to say something along the lines of
The pencil-and-paper addition algorithm computes the sum a + b
in 1 + min (⌊log10 (a)⌋, ⌊log10 (b)⌋) steps.
This is a bit of a mouthful and, worse, isn't even right. Quick-witted readers will
have noticed that we haven't taken proper account of carried digits. The example
above didn't involve any, but if instead we had computed

      1959
    +   21
    ------
      1980

we would have carried a 1 from the first column to the second and so would have
needed to do three basic steps: 9 + 1, 1 + 5, and 6 + 2.
In general, if the larger of a and b has d decimal digits then computing a + b
could require as many as d − 1 carrying additions. That means our statement above
should be replaced by something like
The number of steps N required for the pencil-and-paper addition algo-
rithm to compute the sum a + b satisfies the following bounds:

1 + min (⌊log10 (a)⌋, ⌊log10 (b)⌋) ≤ N ≤ 1 + ⌊log10 (a)⌋ + ⌊log10 (b)⌋

This introduces an important theme: knowing the size of the input isn’t always
enough to determine exactly how long a computation will take, but may enable one
to place bounds on the running-time.

The statements above are rather fiddly and not especially useful. In practice,
people want such estimates so they can decide whether a problem is do-able at
all. They want to know whether, say, given a calculator that can add two 5-digit
numbers in one second[1], it would be possible to work out the University of Manch-
ester's payroll in less than a month. For these sorts of questions[2] one doesn't want
the cumbersome, though precise sorts of statements formulated above, but rather
something semi-quantitative along the lines of:

If we measure the size of the inputs to the pencil-and-paper addition


algorithm in terms of the number of digits, then if we double the size of
the input, the amount of work required to get the answer also doubles.

The remainder of today's lecture will develop a rigorous framework that we can use
to make such statements.

[1] Up until the mid-1940's, a calculator was a person and so the speed mentioned here was not
unrealistic. Although modern computing machinery doesn't work with decimal representations
and can perform basic arithmetic operations in tiny fractions of a second, similar reasoning still
allows one to determine the limits of practical computability.

[2] Estimates similar to the ones we've done here bear on much less contrived questions such as: if
there are processors that can do arithmetic on two 500-digit numbers in a nanosecond, how many
of them would GCHQ need to buy if they want to be able to crack a 4096-bit RSA cryptosystem
within an hour?

4.2 Examples and issues


The three main examples in the rest of the lecture will be

• Greedy colouring of a graph G(V, E) with |V | vertices and |E| edges.

• Multiplication of two n × n matrices A and B.

• Testing whether a number n ∈ N is prime by trial division.

In the rest of this section I’ll discuss the associated algorithms briefly, with an eye
to answering three key questions:

(1) What are the details of the algorithm and what should we regard as the basic
step?

(2) How should we measure the size of the input?

(3) What kinds of estimates can we hope to make?

4.2.1 Greedy colouring


Algorithm 3.7 constructs a colouring for a graph G(V, E) by examining, in turn, the
adjacency lists of each of the vertices in V = {v1 , . . . , vn }. If we take as our basic
step the process of looking at a neighbour’s colour, then the algorithm requires a
number of basic steps given by
    Σ_{v∈V} |Av| = Σ_{v∈V} deg(v) = 2|E|    (4.1)

where the final equality follows from Theorem 1.8, the Handshaking Lemma. This
suggests that we should measure the size of the problem in terms of the number of
edges.

4.2.2 Matrix multiplication


If A and B are square, n × n matrices then the i, j-th entry in the product AB is
given by:

    (AB)ij = Σ_{k=1}^{n} Aik Bkj .

Here we'll measure the size of the problem with n, the number of rows in the matrices,
and take as our basic steps the arithmetic operations, addition and multiplication
of two real numbers. The formula above then makes it easy to see that it takes
n multiplications and (n − 1) additions to compute a single entry in the product
matrix and so, as there are n^2 such entries, we need

    n^2 (n + (n − 1)) = 2n^3 − n^2    (4.2)

total arithmetic operations to compute the whole product.
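
The count (4.2) can be confirmed empirically. The sketch below (an illustration, with
an explicit operation counter bolted onto the naive triple loop) multiplies two n × n
matrices and tallies every addition and multiplication it performs.

```python
def matmul_with_count(A, B):
    """Naive n x n matrix product that counts each arithmetic operation."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    ops = 0
    for i in range(n):
        for j in range(n):
            s = A[i][0] * B[0][j]        # n multiplications per entry...
            ops += 1
            for k in range(1, n):
                s += A[i][k] * B[k][j]   # ...and (n - 1) additions
                ops += 2                 # one multiplication, one addition
            C[i][j] = s
    return C, ops

n = 4
I = [[float(i == j) for j in range(n)] for i in range(n)]
_, ops = matmul_with_count(I, I)
assert ops == 2 * n ** 3 - n ** 2        # Eqn (4.2)
```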


Notice that I have not included any measure of the magnitude of the matrix
entries in our characterisation of the size of the problem. I did this because (a)
it agrees with the usual practice among numerical analysts, who typically analyse
algorithms designed to work with numbers whose machine representations have a
fixed size and (b) I wanted to emphasise that an efficiency estimate depends in
detail on how one chooses to measure size of the inputs. The final example involves
a problem and algorithm where it does make sense to think about the magnitude of
the input.

4.2.3 Primality testing and worst-case estimates


It’s easy to prove by contradiction the following proposition.

Proposition 4.3 (Smallest divisor of a composite number). If n ∈ N is composite
(that is, not a prime), then it has a divisor b that satisfies b ≤ √n.

We can thus test whether a number is prime with the following simple algorithm:

Algorithm 4.4 (Primality testing via trial division).
Given a natural number n ∈ N, determine whether it is prime.

(1) For each b ∈ N in the range 2 ≤ b ≤ √n {

(2) Ask: does b divide n?

    If Yes, report that n is composite.
    If No, continue to the next value of b.

(3) }

(4) If no divisors were found, report that n is prime.
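
A direct Python transcription of Algorithm 4.4 might look as follows; math.isqrt
computes ⌊√n⌋ exactly, and returning early plays the role of "report that n is
composite".

```python
import math

def is_prime(n):
    """Algorithm 4.4: trial division by every b with 2 <= b <= sqrt(n)."""
    if n < 2:
        return False
    for b in range(2, math.isqrt(n) + 1):
        if n % b == 0:         # basic step: does b divide n?
            return False       # a divisor was found, so n is composite
    return True

assert [p for p in range(2, 30) if is_prime(p)] == \
       [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```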

This problem is more subtle than the previous examples in a couple of ways. A
natural candidate for the basic step here is the computation of (n mod b), which
answers the question “Does b divide n?”. But the kinds of numbers whose primality
one wants to test in, for example, cryptographic applications are large and so we
might want take account of the magnitude of n in our measure of the input size.
If we compute (n mod b) with the standard long-division algorithm, the amount of
work required for a basic step will itself depend on the number of digits in n and so,
as in our analysis of the pencil-and-paper addition algorithm, it’ll prove convenient
to measure the size of the input with d = log10 (n), which is approximately the
number of decimal digits in n.
A further subtlety is that because the algorithm reports an answer as soon as
it finds a factor, the amount of work required varies wildly, even among n with the
same number of digits. For example, half of all 100-digit numbers are even and so
will be revealed as composite by the very first value of b we'll try. Primes, on the
other hand, will require ⌊√n⌋ − 1 tests. A standard way to deal with this second
issue is to make estimates about the worst case efficiency: in this case, that's the
running-time required for primes. A much harder approach is to make an estimate
of the average case running-time obtained by averaging over all inputs with a given
size.

4.3 Bounds on asymptotic growth


[Video 3.2] As the introduction hinted, we're really after statements about how quickly the
amount of work increases as the size of the problem does. The following definitions
provide a very convenient language in which to formulate such statements.

Definition 4.5 (Bounds on asymptotic growth). The rate of growth of a function


f : N → R+ is often characterised in terms of some simpler function g : N → R+ in
the following ways

• f (n) = O(g(n)) if ∃ c1 > 0 such that, for all sufficiently large n, f (n) ≤ c1 g(n);

• f (n) = Ω(g(n)) if ∃ c2 > 0 such that, for all sufficiently large n, f (n) ≥ c2 g(n);

• f (n) = Θ(g(n)) if f (n) = O(g(n)) and f (n) = Ω(g(n)).

Notice that the definitions of f(n) = O(g(n)) and f(n) = Ω(g(n)) include the phrase
“for all sufficiently large n”. This is equivalent to saying, for example,

    f(n) = O(g(n)) if there exist some c1 > 0 and N1 ≥ 0 such that for all
    n ≥ N1, f(n) ≤ c1 g(n).
The point is that the definitions are only concerned with asymptotic growth—they’re
all about the limit of large n.

4.4 Analysing the examples


Here I’ll apply the definitions from the previous section to our three examples. In
practice people are mainly concerned with the O-behaviour of an algorithm. That
is, they are mainly interested in getting an asymptotic upper bound on the amount
of work required. This makes sense when one doesn’t know anything special about
the inputs and is thinking about buying computer hardware, but as an exercise I’ll
obtain Θ-type bounds where possible.

4.4.1 Greedy colouring


Proposition 4.6 (Greedy colouring). If we take the process of checking a neighbour’s
colour as the basic step, greedy colouring requires Θ(|E|) basic steps.
We’ve already argued that the number of basic steps is 2|E|, so if we take c1 = 3
and c2 = 1 we have, for all graphs, that the number of operations f (|E|) = 2|E|
required for the greedy colouring algorithm satisfies
c2 |E| ≤ f (|E|) ≤ c1 |E| or |E| ≤ 2|E| ≤ 3|E|
and so the algorithm is both O(|E|) and Ω(|E|) and hence is Θ(|E|).
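As a sanity check on the count of 2|E| basic steps, here is one possible Python sketch of greedy colouring with a counter for the neighbour-colour checks. The vertex ordering and the adjacency-list representation are illustrative assumptions of mine, not the only way to realise the algorithm from Lecture 3.

    def greedy_colouring(adj):
        """Greedily colour vertices 0, ..., n-1 of a graph given as an
        adjacency list, counting the neighbour-colour checks."""
        colour, checks = {}, 0
        for v in range(len(adj)):
            forbidden = set()
            for u in adj[v]:             # one basic step per neighbour
                checks += 1
                if u in colour:
                    forbidden.add(colour[u])
            c = 0
            while c in forbidden:        # smallest colour no neighbour uses
                c += 1
            colour[v] = c
        return colour, checks

    # For the 4-cycle with adj = [[1, 3], [0, 2], [1, 3], [0, 2]] this
    # uses 2 colours and performs 2|E| = 8 checks.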

4.4.2 Matrix multiplication


Proposition 4.7 (Matrix multiplication). If we characterise the size of the inputs
with n, the number of rows in the matrices, and take as our basic step the arith-
metic operations of addition and multiplication of two matrix entries, then matrix
multiplication requires Θ(n^3) basic steps.
We argued in Section 4.2.2 that the number of arithmetic operations required to
multiply a pair of n × n matrices is f(n) = 2n^3 − n^2. It’s not hard to see that for
n ≥ 1,

    n^3 ≤ 2n^3 − n^2    or equivalently    0 ≤ n^3 − n^2

and so f(n) = Ω(n^3). Further, it’s also easy to see that for n ≥ 1 we have

    2n^3 − n^2 ≤ 3n^3    or equivalently    −n^3 − n^2 ≤ 0,

and so we also have that f(n) = O(n^3). Combining these bounds, we’ve established
that

    f(n) = Θ(n^3),

which is a special case of a more general result. One can prove—see the Problem
Sets—that if f : N → R+ is a polynomial in n of degree k, then f = Θ(n^k). Algo-
rithms that are O(n^k) for some k ∈ R are often called polynomial time algorithms.

4.4.3 Primality testing via trial division


Proposition 4.8 (Primality testing). If we measure the size of the input with d,
the number of decimal digits in n, and take as our basic step the computation of
n mod b for an integer 2 ≤ b ≤ √n, then primality testing via trial division requires
O(10^{d/2}) steps.
In the most demanding cases, when n is actually prime, we need to compute
n mod b for each integer b in the range 2 ≤ b ≤ √n. This means we need to do
⌊√n⌋ − 1 basic steps and so the number of steps f(d) required by the algorithm
satisfies

    f(d) ≤ ⌊√n⌋ − 1 ≤ ⌊√n⌋ ≤ √n.

To get the righthand side in terms of the problem size d = ⌊log10 (n)⌋ + 1, note that,
for all x ∈ R, x < ⌊x⌋ + 1 and so

log10 (n) < d.

Then

    √n = n^{1/2} = (10^{log10(n)})^{1/2}

which implies that

    f(d) ≤ √n
         = (10^{log10(n)})^{1/2}
         ≤ (10^d)^{1/2}
         = 10^{d/2},

where, in passing from the second line to the third, I have replaced log10 (n) with d.
This change increases the righthand side of the inequality (which thus still provides
an upper bound on f (d)) because, as we saw above, d > log10 (n). Thus we have
established that primality testing via trial division is O(10^{d/2}).
Such algorithms are often called exponential-time or just exponential and they
are generally regarded as impractical for, as one increases d, the computational
requirements can jump abruptly from something modest and doable to something
impossible. Further, we haven’t yet taken any account of the way the sizes of the
numbers n and b affect the amount of work required for the basic step. If we were to
do so—if, say, we choose single-digit arithmetic operations as our basic steps—the
bound on the operation count would only grow larger: trial division is not a feasible
way to test large numbers for primality.

4.5 Afterword
I chose the algorithms discussed above for simplicity, but they are not necessarily
the best known ways to solve the problems. I also simplified the analysis by judi-
cious choice of problem-size measurement and basic step. For example, in practical
matrix multiplication problems the multiplication of two matrix elements is more
computationally expensive than addition of two products, to the extent that people
often just ignore the additions and try to estimate the number of multiplications.
The standard algorithm is still Θ(n^3), but more efficient algorithms are known. The
basic idea is related to a clever observation about the number of multiplications
required to compute the product of two complex numbers

(a + ib) × (c + id) = (ac − bd) + i(ad + bc). (4.3)

The most straightforward approach requires us to compute the four products ac,
bd, ad and bc. But Gauss noticed that one can instead compute just three products,
ac, bd and
q = (a + b)(c + d) = ac + ad + bc + bd,
and then use the relation
(ad + bc) = q − ac − bd
to compute the imaginary part of the product in Eqn. (4.3). In 1969 Volker Strassen
discovered a similar trick whose simplest application allows one to compute the
product of two 2 × 2 matrices with only 7 multiplications, as opposed to the 8
that the standard algorithm requires. Building on this observation, he found an
algorithm that can compute all the entries in the product of two n × n matrices
using only O(n^{log2(7)}) ≈ O(n^{2.807}) multiplications³.
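Gauss’s three-multiplication trick is easy to try out directly. The following Python sketch is mine; the function name and the convention of passing the real and imaginary parts separately are arbitrary.

    def complex_mult_gauss(a, b, c, d):
        """Compute (a + ib)(c + id) using three real multiplications
        instead of four, via Gauss's trick."""
        p1 = a * c
        p2 = b * d
        q = (a + b) * (c + d)          # q = ac + ad + bc + bd
        return p1 - p2, q - p1 - p2    # (ac - bd, ad + bc)

    # complex_mult_gauss(1, 2, 3, 4) gives (-5, 10), matching
    # (1 + 2i)(3 + 4i) = -5 + 10i.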
More spectacularly, it turns out that there is a polynomial-time algorithm for
primality testing. It was discovered in the early years of this century by Agrawal,
Kayal and Saxena (often shortened to AKS)⁴. This is particularly cheering in that
two of the authors, Kayal and Saxena, were undergraduate project students when
they did this work.

³ I learned about Strassen’s work in a previous edition of William H. Press, Saul A. Teukolsky,
William T. Vetterling and Brian P. Flannery (2007), Numerical Recipes in C++, 3rd edition, CUP,
Cambridge. ISBN: 978-0-521-88068-8, which is very readable, but for a quick overview of the area
you might want to look at Sara Robinson (2005), Toward an optimal algorithm for matrix
multiplication, SIAM News, 38(9).
⁴ See: Manindra Agrawal, Neeraj Kayal and Nitin Saxena (2004), PRIMES is in P, Annals of
Mathematics, 160(2):781–793. DOI: 10.4007/annals.2004.160.781. The original AKS paper
is quite approachable, but an even more reader-friendly treatment of their proof appears in
Andrew Granville (2005), It is easy to determine whether a given integer is prime, Bulletin of the
AMS, 42:3–38. DOI: 10.1090/S0273-0979-04-01037-7.

Chapter 5

Walks, Trails, Paths and Connectedness

Reading: Some of the material in this lecture comes from Section 1.2 of
Dieter Jungnickel (2013), Graphs, Networks and Algorithms, 4th edition,
which is available online via SpringerLink.

If you are at the university, either physically or via the VPN, you can download the
chapters of this book as PDFs.

Several of the examples in the previous lectures—for example two of the sub-
graphs in Figure 2.7 and the graph in Figure 1.12—consist of two or more “pieces”.
If one thinks about the definition of a graph as a pair of sets, these multiple pieces
don’t present any mathematical problem, but it proves useful to have precise vocab-
ulary to discuss them.

5.1 Walks, trails and paths


[Video 3.3] The first definition we need involves a sequence of edges

    (e1, e2, . . . , eL)    (5.1)

Note that some edges may appear more than once.


Definition 5.1. A sequence of edges such as the one in Eqn (5.1) is a walk in a
graph G(V, E) if there exists a corresponding sequence of vertices

(v0 , v1 , . . . , vL ). (5.2)

such that ej = (vj−1 , vj ) ∈ E. Note that the vertices don’t have to be distinct. A
walk for which v0 = vL is a closed walk.
This definition makes sense in both directed and undirected graphs; in the directed
case the walk traverses each edge in the sense of the arrow that represents it.

Figure 5.1: Two examples of walks in a graph on the vertices 1, . . . , 5. The walk
specified by the edge sequence

    (e1, e2, e3, e4) = ((1, 2), (2, 3), (3, 1), (1, 5))

has corresponding vertex sequence (v0, v1, v2, v3, v4) = (1, 2, 3, 1, 5), while the
vertex sequence (v0, v1, v2, v3) = (1, 2, 3, 1) corresponds to a closed walk.

Definition 5.2. The length of a walk is the number of edges in the sequence. For
the walk in Eqn. 5.1 the length is thus L.

It’ll prove useful to define two more constrained sorts of walk:

Definition 5.3. A trail is a walk in which all the edges ej are distinct and a closed
trail is a closed walk that is also a trail.

Definition 5.4. A path is a trail in which all the vertices in the sequence in
Eqn (5.2) are distinct.

Definition 5.5. A cycle is a closed trail in which all the vertices are distinct, except
for the first and last, which are identical.

Remark 5.6. In an undirected graph a cycle is a subgraph isomorphic to one of the


cycle graphs Cn and must include at least three edges, but in directed graphs and
multigraphs it is possible to have a cycle with just two edges.

Remark 5.7. As the three terms walk, trail and path mean very similar things in
ordinary speech, it can be hard to keep their graph-theoretic definitions straight, even
though they make useful distinctions. The following observations may help:

• All trails are walks and all paths are trails. In set-theoretic notation:

Walks ⊇ Trails ⊇ Paths

• There are trails that aren’t paths: see Figure 5.2.

5.2 Connectedness
[Video 3.4] We want to be able to say that two vertices are connected if we can get
from one to the other by moving along the edges of the graph. Here’s a definition
that builds on the terms defined in the previous section:

Figure 5.2: The walk specified by the vertex sequence (a, b, c, d, e, b, f) is a trail as
all the edges are distinct, but it’s not a path as the vertex b is visited twice.

Definition 5.8. In a graph G(V, E), two vertices a and b are said to be connected
if there is a walk given by a vertex sequence (v0 , . . . , vL ) where v0 = a and vL = b.
Additionally, we will say that a vertex is connected to itself.
Definition 5.9. A graph in which each pair of vertices is connected is a connected
graph.
See Figure 5.3 for an example of a connected graph and another that is not con-
nected.

Figure 5.3: The graph at left is connected, but the one at right is not, because
there is no walk connecting the shaded vertices labelled a and b.

Once we have the definitions above, it’s possible to make a precise definition of
the “pieces” of a graph. It depends on the notion of an equivalence relation, which
you should have met earlier in your studies.
Definition 5.10. A relation ∼ on a set S is an equivalence relation if it is:
reflexive: a ∼ a for all a ∈ S;

symmetric: a ∼ b ⇒ b ∼ a for all a, b ∈ S;

transitive: a ∼ b and b ∼ c ⇒ a ∼ c for all a, b, c ∈ S.


The main use of an equivalence relation on S is that it decomposes S into a collection
of disjoint equivalence classes. That is, we can write
    S = ⋃_j Sj

where Sj ∩ Sk = ∅ if j ≠ k and a ∼ b if and only if a, b ∈ Sj for some j.

5.2.1 Connectedness in undirected graphs
The key idea is that “is-connected-to” is an equivalence relation on the vertex set of
a graph. To see this, we need only check the three properties:

reflexive: This is true by definition, and is the main reason why we say that a
vertex is always connected to itself.

symmetric: If there is a walk from a to b then we can simply reverse the corre-
sponding sequence of edges to get a walk from b to a.

transitive: Suppose a is connected to b, so that the graph contains a walk corre-


sponding to some vertex sequence

    (a = u0, u1, . . . , u_{L1−1}, u_{L1} = b)

that connects a to b. If there is also a walk from b to c given by some vertex


sequence
    (b = v0, v1, . . . , v_{L2−1}, v_{L2} = c)
then we can get a walk from a to c by tracing over the two walks listed above,
one after the other. That is, there is a walk from a to c given by the vertex
sequence
    (a, u1, . . . , u_{L1−1}, b, v1, . . . , v_{L2−1}, c).
We have shown that if a is connected to b and b is connected to c, then a is
connected to c and this is precisely what it means for “is-connected-to” to be
a transitive relation.

The process of traversing one walk after another, as we did in the proof of the
transitive property, is sometimes called concatenation of walks.

Definition 5.11. In an undirected graph G(V, E) a connected component is a


subgraph induced by an equivalence class under the relation “is-connected-to” on V .

The disjointness of equivalence classes means that each vertex belongs to exactly one
connected component and so we will sometimes talk about the connected component
of a vertex.
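In practice one finds the equivalence classes of “is-connected-to” with a graph search rather than by manipulating walks directly. The Python sketch below uses breadth-first search over an adjacency list; both of those representational choices are mine and nothing in the definitions forces them.

    from collections import deque

    def connected_components(adj):
        """Split the vertices 0, ..., n-1 of an undirected graph, given
        as an adjacency list, into their connected components."""
        n = len(adj)
        seen = [False] * n
        components = []
        for s in range(n):
            if seen[s]:
                continue
            comp, queue = [], deque([s])   # breadth-first search from s
            seen[s] = True
            while queue:
                v = queue.popleft()
                comp.append(v)
                for u in adj[v]:
                    if not seen[u]:
                        seen[u] = True
                        queue.append(u)
            components.append(comp)
        return components

    # A path 0-1-2 plus an isolated vertex 3:
    # connected_components([[1], [0, 2], [1], []]) gives [[0, 1, 2], [3]].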

5.2.2 Connectedness in directed graphs


In directed graphs “is-connected-to” isn’t an equivalence relation because it’s not
symmetric. That is, even if we know that there’s a walk from some vertex a to
another vertex b, we have no guarantee that there’s a walk from b to a: Figure 5.4
provides an example. None the less, there is an analogue of a connected component
in a directed graph that’s captured by the following definitions:

Definition 5.12. In a directed graph G(V, E) a vertex b is said to be accessible


or reachable from another vertex a if G contains a walk from a to b. Additionally,
we’ll say that all vertices are accessible (or reachable) from themselves.

Figure 5.4: In a directed graph it’s possible to have a walk from vertex a to vertex
b without having a walk from b to a, as in the digraph at left. In the digraph at right
there are walks from u to v and from v to u, so this pair is strongly connected.

Definition 5.13. Two vertices a and b in a directed graph are strongly connected
if b is accessible from a and a is accessible from b. Additionally, we regard a vertex
as strongly connected to itself.
With these definitions it’s easy to show (see the Problem Sets) that “is-strongly-
connected-to” is an equivalence relation on the vertex set of a directed graph and so
the vertex set decomposes into a disjoint union of strongly connected components.
This prompts the following definition:
Definition 5.14. A directed graph G(V, E) is strongly connected if every pair of
its vertices is strongly connected. Equivalently, a digraph is strongly connected if it
contains exactly one strongly connected component.
Finally, there’s one other notion of connectedness applicable to directed graphs,
weak connectedness:
Definition 5.15. A directed graph G(V, E) is weakly connected if, when one
converts all its edges to undirected ones, it becomes a connected, undirected graph.
Figure 5.5 illustrates the difference between strongly and weakly connected graphs.
Finally, I’d like to introduce a piece of notation for the graph that one gets by
ignoring the directedness of the edges in a digraph:
Definition 5.16. If G(V, E) is a directed multigraph then |G| is the undirected
multigraph produced by ignoring the directedness of the edges. Note that if both the
directed edges (a, b) and (b, a) are present in a digraph G(V, E), then two parallel
copies of the undirected edge (a, b) appear in |G|.

5.3 Afterword: a useful proposition


[Video 3.5] As with the Handshaking Lemma in Lecture 1, I’d like to finish off a
long run of definitions by using them to formulate and prove a small, useful result.
Proposition 5.17 (Connected vertices are joined by a path). If two vertices a and
b are connected, so that there is a walk from a to b, then there is also a path from a
to b.

Figure 5.5: Converting directed edges to undirected ones: the graph at the top is
weakly connected, but not strongly connected, while the one at the bottom is both
weakly and strongly connected.

Proof of Proposition 5.17


To say that two vertices a and b in a graph G(V, E) are connected means that there
is a walk given by a vertex sequence

(v0 , . . . , vL ) (5.3)

where v0 = a and vL = b. There are two possibilities:

(i) all the vertices in the sequence are distinct;

(ii) some vertex or vertices appear more than once.

In the first case the walk is also a path and we are finished. In the second case
it is always possible to find a path from a to b by removing some edges from the
walk in Eqn. (5.3). This sort of “path surgery” is outlined below and illustrated in
Example 5.18.
We are free to assume that the set of repeated vertices doesn’t include a or b
as we can easily make this true by trimming some vertices off the two ends of the
sequence. To be concrete, we can define a new walk by first trimming off everything
before the last appearance of a—say that’s vj —to yield a walk specified by the
vertex sequence
(vj , . . . , vL )
and then, in that walk, remove everything that comes after the first appearance of
b—say that’s vk —so that we end up with a new walk

    (a = v′_0, . . . , v′_{L′} = b) = (vj, . . . , vk).    (5.4)

To finish the proof we then need to deal with the case where the walk in (5.4),
which doesn’t contain any repeats of a or b, still contains repeats of one or more

Figure 5.6: In the graph above the shaded vertices a and b are connected by the path
(a, r, s, u, v, b).

other vertices. Suppose that c ∈ V with c ≠ a, b is such a repeated vertex: we can
eliminate repeated visits to c by defining a new walk specified by the vertex sequence

    (a = u0, . . . , u_{L′′} = b) = (v′_0, . . . , v′_j, v′_{k+1}, . . . , v′_{L′})    (5.5)

where v′_j is the first appearance of c in the sequence at left in Eqn. (5.4) and v′_k is
the last. There can only be finitely many repeated vertices in the original walk (5.3)
and so, by using the approach sketched above repeatedly, we can eliminate them all,
leaving a path from a to b. Very scrupulous students may wish to rewrite this proof
using induction on the number of repeated vertices.

Example 5.18 (Connected vertices are connected by a path). Consider the graph
in Figure 5.6. The vertices a and b are connected by the walk

(v0 , . . . , v15 ) = (a, q, r, a, r, s, t, u, s, t, u, v, b, w, v, b)

which contains many repeated vertices. To trim it down to a path we start by


eliminating repeats of a and b using the approach from Eqn. (5.4), which amounts
to trimming off those vertices that are underlined in the vertex sequence above.
To see how this works, notice that the vertex a appears as v0 and v3 in the
original walk and we want v′_0 = vj in (5.4) to be its last appearance, so we set
vj = v3. Similarly, b appears as v12 and v15 in the original walk and we want
v′_{L′} = vk to be b’s first appearance, so we set vk = v12. This leaves us with

    (v′_0, . . . , v′_9) = (v3, . . . , v12)
                        = (a, r, s, t, u, s, t, u, v, b)    (5.6)

Finally, we eliminate the remaining repeated vertices by applying the approach
from Eqn. (5.5) to the sequence (v′_0, . . . , v′_9). This amounts to chopping out the
sequence of vertices underlined in Eqn. (5.6). To follow the details, note that each
of the vertices s, t and u appears twice in (5.6). To eliminate, say, the repeated
appearances of s we should use (5.5) with v′_j = v′_2, the first appearance of s in
Eqn. (5.6), and v′_k = v′_5, the last. This leaves us with the new walk

    (u0, . . . , u6) = (v′_0, v′_1, v′_2, v′_6, v′_7, v′_8, v′_9) = (a, r, s, t, u, v, b)

which is a path connecting a to b.
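The path surgery above can also be carried out mechanically. The Python sketch below is mine and uses a slightly different cutting rule, removing the closed detour back to the first appearance of each repeated vertex as it scans the walk, but it too produces a path between the same endpoints; applied to the walk of Example 5.18 it recovers exactly the path found above.

    def walk_to_path(walk):
        """Trim a walk, given as a vertex sequence, to a path with the
        same endpoints by cutting out loops at repeated vertices."""
        path = []
        for v in walk:
            if v in path:
                # v was visited before: discard the closed detour, i.e.
                # everything after v's first appearance.
                path = path[:path.index(v) + 1]
            else:
                path.append(v)
        return path

    walk = ['a','q','r','a','r','s','t','u','s','t','u','v','b','w','v','b']
    print(walk_to_path(walk))   # ['a', 'r', 's', 't', 'u', 'v', 'b']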

Part II

Trees and the Matrix-Tree Theorem
Chapter 6

Trees and forests

This section of the notes introduces an important family of graphs—trees and


forests—and also serves as an introduction to inductive proofs on graphs.
Reading:
The material in today’s lecture comes from Section 1.2 of

Dieter Jungnickel (2013), Graphs, Networks and Algorithms, 4th edition


(available online via SpringerLink),

though the discussion there includes a lot of material about counting trees that we’ll
handle in a different way.

6.1 Basic definitions


[Video 4.1] We begin with a flurry of definitions.
Definition 6.1. A graph G(V, E) is acyclic if it doesn’t include any cycles.

Another way to say a graph is acyclic is to say that it contains no subgraphs


isomorphic to one of the cycle graphs.

Definition 6.2. A tree is a connected, acyclic graph.

Definition 6.3. A forest is a graph whose connected components are trees.

Trees play an important role in many applications: see Figure 6.1 for examples.

6.1.1 Leaves and internal nodes


Trees have two sorts of vertices: leaves (sometimes also called leaf nodes) and internal
nodes: these terms are defined more carefully below and are illustrated in Figure 6.2.

Definition 6.4. A vertex v ∈ V in a tree T (V, E) is called a leaf or leaf node if


deg(v) = 1 and it is called an internal node if deg(v) > 1.


Figure 6.1: The two graphs at left (white and yellow vertices) are trees, but the two
at right aren’t: the one at upper right (with green vertices) has multiple connected
components (and so it isn’t connected) while the one at lower right (blue vertices)
contains a cycle. The graph at upper right is, however, a forest as each of its
connected components is a tree.

Figure 6.2: In the two trees above the internal nodes are white, while the leaf nodes
are coloured green or yellow.

6.1.2 Kinds of trees
Definition 6.5. A binary tree is a tree in which every internal node has degree
three.

Definition 6.6. A rooted tree is a tree with a distinguished leaf node called the
root node.

Warning to the reader: The definition of rooted tree above is common among
biologists, who use trees to represent evolutionary lineages (see Darwin’s sketch at
right in Figure 6.3). Other researchers, especially computer scientists, use the same
term to mean something slightly different.

Figure 6.3: At left are three examples of rooted binary trees. In all cases the root
node is brown, the leaves are green and the internal nodes are white. At right is a page
from one of Darwin’s notebooks, showing the first known sketch of an evolutionary
tree: here the nodes represent species and the edges indicate evolutionary descent.

6.2 Three useful lemmas

[Video 4.2] Lemma 6.7 (Minimal |E| in a connected graph). A connected graph on
n vertices has at least (n − 1) edges.

Lemma 6.8 (Maximal |E| in an acyclic graph). An acyclic graph on n vertices has
at most (n − 1) edges.

Definition 6.9. A vertex v is said to be isolated if it has no neighbours. Equiva-


lently, v is isolated if deg(v) = 0.

Lemma 6.10 (Vertices of degree 1). If a graph G(V, E) has n ≥ 2 vertices, none
of which are isolated, and (n − 1) edges then G has at least two vertices of degree 1.

Figure 6.4: A graph G(V, E) and the subgraphs G\v formed by deleting the yellow
vertex v and G\e formed by deleting the red edge e.

6.2.1 A festival of proofs by induction


Proofs by induction about graphs generally have three parts

• a base case that typically involves a graph with very few vertices or edges
(often just one or two) and for which the result is obvious;

• an inductive hypothesis in which one assumes the result is true for all
graphs with, say, n0 or fewer vertices (or perhaps m0 or fewer edges);

• an inductive step where one starts with a graph that satisfies the hypotheses
of the theorem and has, say, n0 + 1 vertices (or m0 + 1 edges or whatever is
appropriate) and then reduces the theorem as it applies to this larger graph
to something involving smaller graphs (to which the inductive hypothesis ap-
plies), typically by deleting an edge or vertex.

6.2.2 Graph surgery


The proofs below accomplish their inductive steps by deleting either an edge or a
vertex, so here I introduce some notation for these processes.

Definition 6.11. If G(V, E) is a graph and v ∈ V is one of its vertices then G\v
is defined to be the subgraph formed by deleting v and all the edges that are incident
on v.

Definition 6.12. If G(V, E) is a graph and e ∈ E is one of its edges then G\e is
defined to be the subgraph formed by deleting e.

Both these definitions are illustrated in Figure 6.4.

Figure 6.5: In the inductive step of the proof of Lemma 6.7 we delete some arbitrary
vertex v ∈ V in a connected graph G(V, E) to form the graph G\v. The result may
still be a connected graph, as in G\v1 at upper right, or may fall into several connected
components, as in G\v2 at lower right.

Proof of Lemma 6.7


[Video 4.3] We’ll prove Lemma 6.7 by induction on the number of vertices. First
let us rephrase the lemma in an equivalent way:

    If G(V, E) is a connected graph on |V| = n vertices, then |E| ≥ n − 1.

Base case: There is only one graph with |V | = 1 and it is, by definition, connected
and has |E| = 0, which satisfies the lemma. One could alternatively start from
K2 , which is the only connected graph on two vertices and has |E| = 1.

Inductive hypothesis: Suppose that the lemma is true for all graphs G(V, E) with
1 ≤ |V | ≤ n0 , for some fixed n0 .

Inductive step: Now consider a connected graph G(V, E) with |V | = n0 + 1: the


lemma we’re trying to prove then says |E| ≥ n0 . Choose some vertex v ∈ V
and delete it, forming the graph G\v. We’ll say that the new graph has vertex
set V ′ = V \v and edge set E ′ . There are two possibilities (see Figure 6.5):

(i) G\v is still a connected graph;


(ii) G\v has k ≥ 2 connected components: call these G1 (V1 , E1 ), . . . , Gk (Vk , Ek ).

In the first case—where G\v is connected—we also know |V ′ | = |V | − 1 = n0


and so the inductive hypothesis applies and tells us that |E ′ | ≥ (n0 − 1). But
as G was connected, the vertex v that we deleted must have had at least one
neighbour, and hence at least one edge, so we have

|E| ≥ |E ′ | + 1 ≥ (n0 − 1) + 1 ≥ n0

which is exactly the result we sought.

In the second case—where deleting v causes G to fall into k ≥ 2 connected
components—we can call the components G1 (V1 , E1 ), G2 (V2 , E2 ), · · · , Gk (Vk , Ek )
with nj = |Vj |. Then

    ∑_{j=1}^{k} nj = ∑_{j=1}^{k} |Vj| = |V′| = |V| − 1 = n0.

Further, the j-th connected component is a connected graph on nj < n0


vertices and so the inductive hypothesis applies to each component separately,
telling us that |Ej | ≥ nj − 1. But then we have
    |E′| = ∑_{j=1}^{k} |Ej| ≥ ∑_{j=1}^{k} (nj − 1) = (∑_{j=1}^{k} nj) − k = n0 − k.    (6.1)

And, as we know that the original graph G was connected, we also know
that the deleted vertex v was connected by at least one edge to each of the k
components of G\v. Combining this observation with Eqn. (6.1) gives us

|E| ≥ |E ′ | + k ≥ (n0 − k) + k ≥ n0 ,

which proves the lemma for the second case too.

Proof of Lemma 6.8


Once again, we’ll do induction on the number of vertices. As above, we begin by
rephrasing the lemma:

If G(V, E) is an acyclic graph on |V | = n vertices, then |E| ≤ n − 1.

Base case: Either K1 or K2 could serve as the base case: both are acyclic graphs
that have a maximum of |V | − 1 edges.

Inductive hypothesis: Suppose that Lemma 6.8 is true for all acyclic graphs with
|V | ≤ n0 , for some fixed n0 .

Inductive step: Consider an acyclic graph G(V, E) with |V | = n0 + 1: we want to


prove that |E| ≤ n0 . Choose an arbitrary edge e = (a, b) ∈ E and delete it to
form G′ (V, E ′ ) = G\e, which has the same vertex set as G, but a smaller edge
set E ′ = E\e.
First note that G′ must have one more connected component than G does be-
cause a and b, the two vertices that appear in the deleted edge e, are connected
in G, but cannot be connected in G′ . If they were still connected, there would
(by Prop. 5.17) be a path connecting them in G′ that, when combined with
e, would form a cycle in G, contradicting the assumption that G is acyclic.
Thus we know that G′ has k ≥ 2 connected components that we can call
G1 (V1 , E1 ), . . . , Gk (Vk , Ek ).

If we again define nj = |Vj |, we know that nj ≤ n0 for all j and so the inductive
hypothesis applies to each component separately: |Ej | ≤ nj − 1. Adding these
up yields
    |E′| = ∑_{j=1}^{k} |Ej| ≤ ∑_{j=1}^{k} (nj − 1) = (∑_{j=1}^{k} nj) − k = (n0 + 1) − k.

And then, as |E| = |E′| + 1, we have

    |E| = |E′| + 1 ≤ (n0 + 1 − k) + 1 = n0 + (2 − k) ≤ n0,

where the final inequality follows from the observation that G′ has k ≥ 2
connected components.

Proof of Lemma 6.10


The final Lemma in this section is somewhat technical: we’ll use it in the proof of
a theorem in Section 6.3. The lemma says that graphs G(V, E) that have |V | = n
and |E| = (n − 1) and have no isolated vertices must contain at least two vertices
with degree one. The proof is by contradiction and uses the Handshaking Lemma.
Imagine the vertices are numbered and arranged in order of increasing degree
so V = {v1 , . . . , vn } and deg(v1 ) ≤ deg(v2 ) ≤ · · · ≤ deg(vn ). The Handshaking
Lemma then tells us that
    ∑_{j=1}^{n} deg(vj) = 2|E| = 2(n − 1) = 2n − 2.    (6.2)

As there are no isolated vertices, we also know that deg(vj ) ≥ 1 for all j. Now
assume—aiming for a contradiction—that there is at most a single vertex with
degree one. That is, assume deg(v1 ) ≥ 1, but deg(vj ) ≥ 2 ∀j ≥ 2. Then
    ∑_{j=1}^{n} deg(vj) = deg(v1) + ∑_{j=2}^{n} deg(vj)
                        ≥ 1 + ∑_{j=2}^{n} 2
                        = 1 + (n − 1) × 2
                        = 2n − 1.
This contradicts Eqn. (6.2), which says that the sum of degrees is 2n − 2. Thus
it must be true that two or more vertices have degree one, which is the result we
sought.

6.3 A theorem about trees


[Video 4.4] The lemmas of the previous section make it possible to give several nice
characterisations of a tree and the theorem below, which has a form that one often
finds in Discrete Maths or Algebra books, shows that they’re all equivalent.

Theorem 6.13 (Jungnickel’s Theorem 1.2.8). For a graph G(V, E) on |V | = n
vertices, any two of the following imply the third:

(a) G is connected.

(b) G is acyclic.

(c) G has (n − 1) edges.

6.3.1 Proof of the theorem


The theorem above is really three separate propositions bundled into one statement:
we’ll prove them in turn.

(a) and (b) =⇒ (c)


On the one hand, our lemma about the minimal number of edges in a connected
graph (Lemma 6.7) says that property (a) implies that |E| ≥ (n − 1). On the
other hand our lemma about the maximal number of edges in an acyclic graph
(Lemma 6.8) says |E| ≤ (n − 1). The only possibility compatible with both these
inequalities is |E| = (n − 1).

(a) and (c) =⇒ (b)


To prove this by contradiction, assume that it’s possible to have a connected graph
G(V, E) that has (n − 1) edges and contains a cycle. Choose some edge e that’s part
of the cycle and delete it to form H = G\e. H is then a connected graph (removing
an edge from a cycle does not change the number of connected components) with
only n−2 edges, which contradicts our earlier result (Lemma 6.7) about the minimal
number of edges in a connected graph.

(b) and (c) =⇒ (a)


We’ll prove—by induction on n = |V |—that an acyclic graph with |V | − 1 edges
must be connected.

Base case: There is only one graph with |V | = 1. It’s acyclic, has |V | − 1 = 0
edges and is connected.

Inductive hypothesis: Suppose that all acyclic graphs with 1 ≤ |V | ≤ n0 vertices


and |E| = |V | − 1 edges are connected.

Inductive step: Now consider an acyclic graph G(V, E) with |V | = n0 + 1 and


|E| = n0 : we need to prove that it’s connected. First, notice that such a
graph cannot have any isolated vertices, for suppose there was some vertex v
with deg(v) = 0. We could then delete v to produce H = G\v, which would
be an acyclic graph with n0 vertices and n0 edges, contradicting our lemma
(Lemma 6.8) about the maximal number of edges in an acyclic graph.

Thus G contains no isolated vertices and so, by the technical lemma from
the previous section (Lemma 6.10), we know that it has at least two vertices
of degree one. Say that one of these two is u ∈ V and delete it to make
G′ (V ′ , E ′ ) = G\u. Then G′ is still acyclic, because G is, and deleting vertices
can’t create cycles. Furthermore G′ has |V ′ | = |V | − 1 = n0 vertices and
|E ′ | = |E| − 1 = n0 − 1 edges. This means that the inductive hypothesis
applies and we can conclude that G′ is connected. But if G′ is connected, so
is G and we are finished.
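Theorem 6.13 also yields a cheap computational test for treeness: it is enough to count edges and check connectedness, since these two properties together imply acyclicity. Here is a hedged Python sketch under my usual assumption of an adjacency-list representation.

    def is_tree(adj):
        """Decide whether an undirected graph (adjacency list) is a tree
        using Theorem 6.13: connected with |E| = |V| - 1 implies acyclic."""
        n = len(adj)
        num_edges = sum(len(nbrs) for nbrs in adj) // 2   # each edge appears twice
        if num_edges != n - 1:
            return False
        seen, stack = {0}, [0]        # depth-first search from vertex 0
        while stack:
            v = stack.pop()
            for u in adj[v]:
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
        return len(seen) == n         # connected exactly when we saw everything

    # is_tree([[1], [0, 2], [1]]) is True, while the triangle
    # [[1, 2], [0, 2], [0, 1]] fails the edge count and is not a tree.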

Chapter 7

The Matrix-Tree Theorems

This section of the notes introduces a pair of very beautiful theorems that use linear
algebra to count trees in graphs.
Reading:
The next few lectures are not covered in Jungnickel’s book, though a few definitions
in our Section 7.2.1 come from his Section 1.6. But the main argument draws on
ideas that you should have met in Foundations of Pure Mathematics, Linear Algebra
and Algebraic Structures.

7.1 Kirchhoff’s Matrix-Tree Theorem


[Video 5.1] Our goal over the next few lectures is to establish a lovely connection
between Graph Theory and Linear Algebra. It is part of a circle of beautiful results
discovered by the great German physicist Gustav Kirchhoff in the mid-19th century,
when he was studying electrical circuits. To formulate his result we need a few new
definitions.

Definition 7.1. A subgraph T (V, E ′ ) of a graph G(V, E) is a spanning tree if it


is a tree that contains every vertex in V .

Figure 7.1 gives some examples.

Definition 7.2. If G(V, E) is a graph on n vertices with V = {v1 , . . . , vn } then its


graph Laplacian L is an n × n matrix whose entries are

    Lij = deg(vj)   if i = j,
    Lij = −1        if i ≠ j and (vi, vj) ∈ E,
    Lij = 0         otherwise.

Equivalently, L = D − A, where D is a diagonal matrix with Djj = deg(vj ) and A


is the graph’s adjacency matrix.

Figure 7.1: A graph G(V, E) with V = {v1, . . . , v4} and three of its spanning
trees: T1, T2 and T3. Note that although T1 and T3 are isomorphic, we regard them
as different spanning trees for the purposes of the Matrix-Tree Theorem.

Example 7.3 (Graph Laplacian). The graph G whose spanning trees are illustrated
in Figure 7.1 has graph Laplacian

    L = D − A

        [ 2 0 0 0 ]   [ 0 1 1 0 ]
      = [ 0 2 0 0 ] − [ 1 0 1 0 ]
        [ 0 0 3 0 ]   [ 1 1 0 1 ]
        [ 0 0 0 1 ]   [ 0 0 1 0 ]

        [  2 −1 −1  0 ]
      = [ −1  2 −1  0 ]    (7.1)
        [ −1 −1  3 −1 ]
        [  0  0 −1  1 ]
Once we have these two definitions it’s easy to state the Matrix-Tree theorem.
Theorem 7.4 (Kirchhoff’s Matrix-Tree Theorem, 1847). If G(V, E) is an undirected
graph and L is its graph Laplacian, then the number NT of spanning trees contained
in G is given by the following computation.
(1) Choose a vertex vj and eliminate the j-th row and column from L to get a new
matrix L̂j ;
(2) Compute
NT = det(L̂j ). (7.2)

The number NT in Eqn. (7.2) counts spanning trees that are distinct as subgraphs
of G: equivalently, we regard the vertices as distinguishable. Thus some of the trees
that contribute to NT may be isomorphic: see Figure 7.1 for an example.
This result is remarkable in many ways—it seems amazing that the answer
doesn’t depend on which vertex we choose when constructing L̂j —but to begin
with let’s simply use the theorem to compute the number of spanning trees for the
graph in Example 7.3
Example 7.5 (Counting spanning trees). If we take G to be the graph whose Lapla-
cian is given in Eqn. (7.1) and choose vj = v1 we get

    L̂1 = [  2 −1  0 ]
          [ −1  3 −1 ]
          [  0 −1  1 ]
and so the number of spanning trees is

    NT = det(L̂1)
       = 2 × det [  3 −1 ]  − (−1) × det [ −1 −1 ]
                 [ −1  1 ]               [  0  1 ]
       = 2 × (3 − 1) + (−1 − 0)
       = 4 − 1 = 3

I’ll leave it as an exercise for the reader to check that one gets the same result from
det(L̂2 ), det(L̂3 ) and det(L̂4 ).
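The whole computation is easy to mechanise. The Python sketch below is my own, using numpy; it builds L = D − A from the adjacency matrix of the graph in Example 7.3, deletes the row and column of v1 and rounds the determinant (which numpy returns as a float) to the nearest integer.

    import numpy as np

    def count_spanning_trees(A):
        """Count spanning trees via Kirchhoff's theorem: delete one row
        and column of the Laplacian L = D - A, then take a determinant."""
        A = np.asarray(A)
        L = np.diag(A.sum(axis=1)) - A
        L_hat = L[1:, 1:]            # delete the row and column of v1
        return int(round(np.linalg.det(L_hat)))

    A = [[0, 1, 1, 0],
         [1, 0, 1, 0],
         [1, 1, 0, 1],
         [0, 0, 1, 0]]
    print(count_spanning_trees(A))   # 3, as in Example 7.5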

7.2 Tutte’s Matrix-Tree Theorem


[Video 5.2] We’ll prove Kirchhoff’s theorem as a consequence of a much more recent
result¹ about directed graphs. To formulate this we need a few more definitions that
generalise the notion of a tree to digraphs.

7.2.1 Arborescences: directed trees


Recall the definition of accessible from Lecture 5:

In a directed graph G(V, E) a vertex b is said to be accessible from


another vertex a if G contains a walk from a to b. Additionally, we’ll
say that all vertices are accessible from themselves.

This allows us to define the following suggestive term:

Definition 7.6. A vertex v ∈ V in a directed graph G(V, E) is a root if every other


vertex is accessible from v.

We’ll then be interested in the following directed analogue of a tree:

Definition 7.7. A directed graph T (V, E) is a directed tree or arborescence if

(i) T contains a root

(ii) The graph |T | that one obtains by ignoring the directedness of the edges is a
tree.

See Figure 7.2 for an example. Of course, it’s then natural to define an analogue of
a spanning tree:

Definition 7.8. A subgraph T (V, E ′ ) of a digraph G(V, E) is a spanning arbores-


cence if T is an arborescence that contains all the vertices of G.

¹ Proved by Bill Tutte about a century after Kirchhoff’s result in W.T. Tutte (1948), The dis-
section of equilateral triangles into equilateral triangles, Math. Proc. Cambridge Phil. Soc.,
44(4):463–482.

Figure 7.2: The graph at left is an arborescence whose root vertex is shaded red,
while the graph at right contains a spanning arborescence whose root is shaded red
and whose edges are blue.

7.2.2 Tutte’s theorem


Theorem 7.9 (Tutte’s Directed Matrix-Tree Theorem, 1948). If G(V, E) is a di-
graph with vertex set V = {v1, . . . , vn} and L is an n × n matrix whose entries are
given by

    Lij = deg_in(vj)   if i = j,
    Lij = −1           if i ≠ j and (vi, vj) ∈ E,    (7.3)
    Lij = 0            otherwise,

then the number Nj of spanning arborescences with root at vj is

    Nj = det(L̂j)    (7.4)

where L̂j is the matrix produced by deleting the j-th row and column from L.
Here again, the number Nj in Eqn. (7.4) counts spanning arborescences that are
distinct as subgraphs of G: equivalently, we regard the vertices as distinguishable.
Thus some of the arborescences that contribute to Nj may be isomorphic, but if they
involve different edges we’ll count them separately.
Example 7.10 (Counting spanning arborescences). First we need to build the matrix
L defined by Eqn. (7.3) in the statement of Tutte’s theorem. If we choose G to be
the graph pictured at upper left in Figure 7.3 then this is L = D_in − A, where D_in
is a diagonal matrix with Djj = deg_in(vj) and A is the graph’s adjacency matrix.

    L = D_in − A

        [ 2 0 0 0 ]   [ 0 1 0 0 ]
      = [ 0 3 0 0 ] − [ 1 0 1 1 ]
        [ 0 0 1 0 ]   [ 0 1 0 1 ]
        [ 0 0 0 2 ]   [ 1 1 0 0 ]

        [  2 −1  0  0 ]
      = [ −1  3 −1 −1 ]
        [  0 −1  1 −1 ]
        [ −1 −1  0  2 ]
Then Table 7.1 summarises the results for the number of spanning arborescences
rooted at each of the four vertices.

Figure 7.3: The digraph at upper left, on which the vertices v1, . . . , v4 are labelled,
has three spanning arborescences rooted at v4.

    j    L̂j                   det(L̂j)

    1    [  3 −1 −1 ]
         [ −1  1 −1 ]          2
         [ −1  0  2 ]

    2    [  2  0  0 ]
         [  0  1 −1 ]          4
         [ −1  0  2 ]

    3    [  2 −1  0 ]
         [ −1  3 −1 ]          7
         [ −1 −1  2 ]

    4    [  2 −1  0 ]
         [ −1  3 −1 ]          3
         [  0 −1  1 ]

Table 7.1: The number of spanning arborescences for the four possible roots in the
graph at upper left in Figure 7.3.


Figure 7.4: If we convert an undirected graph such as G at left to a directed graph


such as H at right, it is easy to count the spanning trees in G by counting spanning
arborescences in H.

Figure 7.5: The undirected graph at left is a spanning tree for G in Figure 7.4,
while the directed graph at right is a spanning arborescence for H (right side of
Fig. 7.4) rooted at the shaded vertex v.

7.3 From Tutte to Kirchhoff


[Video 5.3] The proofs of these theorems are long and so I will merely sketch some
parts. One of these is the connection between Tutte’s directed Matrix-Tree theorem
and Kirchhoff’s undirected version. The key idea is illustrated in Figures 7.4 and 7.5.
If we want to
count spanning trees in an undirected graph G(V, E) we should first make a directed
graph H(V, E ′ ) that has the same vertex set as G, but has two directed edges—one
running in each direction—for each of the edges in G. That is, if G has an undirected
edge e = (a, b) then H has both the directed edges (a, b) and (b, a).
Now we choose some arbitrary vertex v in H and count the spanning arbores-
cences that have v as a root. It’s not hard to see that each spanning tree in G
corresponds to a unique v-rooted arborescence in H, and vice-versa. More formally,
there is a bijection between the set of spanning trees in G and v-rooted spanning
arborescences in H: see Figure 7.5. The keen reader might wish to write out a
careful statement of how this bijection acts (that is, which tree gets matched with
which arborescence).
Finally, note that for our directed graph H, which includes the edges (a, b) and
(b, a) whenever the original, undirected graph contains (a, b), we have
    deg_in(v) = deg_out(v) = deg_G(v)    for all v ∈ V
where the in- and out-degrees are in H and deg_G(v) is in G. This means that the
matrix L appearing in Tutte’s theorem is equal, element-by-element, to the graph
Laplacian appearing in Kirchhoff’s theorem. So if we use Tutte’s approach to compute

the number of spanning arborescences in H, the result will be the same numerically
as if we’d used Kirchhoff’s theorem to count spanning trees in G.

Chapter 8

Matrix-Tree Ingredients

This lecture introduces some ideas that we will need for the proof of the Matrix-Tree
Theorem. Many of them should be familiar from Foundations of Pure Mathematics
or Algebraic Structures.
Reading:
The material about permutations and the determinant of a matrix presented here
is pretty standard and can be found in many places: the Wikipedia articles on
Determinant (especially the section on n × n matrices) and Permutation are not
bad places to start.
The remaining ingredient for the proof of the Matrix-Tree theorems is the Prin-
ciple of Inclusion/Exclusion. It is covered in the first year module Foundations of
Pure Mathematics, but it is also a standard technique in Combinatorics and so is
discussed in many introductory books¹. The Wikipedia article is, again, a good place
to start. Finally, as a convenience for students from outside the School of Mathe-
matics, I have included an example in Section 8.4.4 and an Appendix, Section 8.5,
that provides full details of all the proofs.

8.1 Lightning review of permutations


[Video 5.4] First we need a few facts about permutations that you should have
learned earlier in your degree.
Definition 8.1. A permutation on n objects is a bijection σ from the set {1, . . . , n}
to itself.
We’ll express permutations as follows

    σ = (  1     2    . . .   n   )
        ( σ(1)  σ(2)  . . .  σ(n) )

where σ(j) is the image of j under the bijection σ.
Definition 8.2. The set fix(σ) is defined as
fix(σ) = {j | σ(j) = j}.
¹ See, for example, Dossey, Otto, Spence, and Vanden Eynden (2006), Discrete Mathematics or,
for a short, clear account, Anderson (1974), A First Course in Combinatorial Mathematics.

8.1.1 The Symmetric Group Sn
One can turn the set of all permutations on n objects into a group by using composi-
tion (applying one function to the output of another) of permutations as the group
multiplication. The resulting group is called the symmetric group on n objects
or Sn and it has the following properties.
• The identity element is the permutation in which σ(j) = j for all 1 ≤ j ≤ n.
• Sn has n! elements.

8.1.2 Cycles and sign


Definition 8.3 (Cycle permutations). A cycle is a permutation σ specified by a
sequence of distinct integers i1 , i2 , . . . , iℓ ∈ {1, . . . , n} with the properties that
• σ(j) = j if j ∉ {i1, i2, . . . , iℓ}
• σ(ij ) = ij+1 for 1 ≤ j < ℓ
• σ(iℓ ) = i1 .
Here ℓ is the length of the cycle and a cycle with ℓ = 2 is called a transposition.
We’ll express the cycle σ specified by the sequence i1 , i2 , . . . , iℓ with the notation

σ = (i1 , i2 , . . . , iℓ ).

and we’ll say that two cycles σ1 = (i1 , i2 , . . . , iℓ1 ) and σ2 = (j1 , j2 , . . . , jℓ2 ) are dis-
joint if
{i1 , . . . iℓ1 } ∩ {j1 , . . . jℓ2 } = ∅.
The main point about cycles is that they’re like the “prime factors” of permu-
tations in the following sense:
Proposition 8.4. A permutation has a unique (up to reordering of the cycles)
representation as a product of disjoint cycles.
This representation is often referred to as the cycle decomposition of the permutation.
Finally, given the cycle decomposition of a permutation one can define a function
that we will need in the next section.
Definition 8.5 (Sign of a permutation). The function sgn : Sn → {±1} can be
computed as follows:
• If σ is the identity permutation, then sgn(σ) = 1.
• If σ is a cycle of length ℓ then sgn(σ) = (−1)ℓ−1 .
• If σ has a decomposition into k ≥ 2 disjoint cycles whose lengths are ℓ1 , . . . , ℓk
then
    sgn(σ) = (−1)^{L−k}    where    L = ∑_{j=1}^{k} ℓj.

This definition of sgn(σ) is equivalent to one that you may know from other
courses:

    sgn(σ) = 1     if σ is the product of an even number of transpositions,
    sgn(σ) = −1    otherwise.

8.2 Using graphs to find the cycle decomposition


Lest we forget graph theory completely, I’d like to conclude our review of permu-
tations by constructing a certain graph that makes it easy to read off the cycle
decomposition of a permutation. The same construction establishes a bijection that
maps permutations σ ∈ Sn to subgraphs of Kn whose strongly-connected compo-
nents are either isolated vertices or disjoint, directed cycles. This bijection is another
key ingredient in the proof of Tutte’s Matrix Tree Theorem.

Definition 8.6. Given a permutation σ ∈ Sn , define the directed graph Gσ to have


vertex set V = {1, . . . , n} and edge set

E = {(j, σ(j)) | j ∈ V and σ(j) ≠ j}.

The following proposition then makes it easy to find the cycle decomposition of
a permutation σ:

Proposition 8.7. The cycle (i1 , i2 , . . . , il ) appears in the cycle decomposition of σ


if and only if the directed cycle defined by the vertex sequence

(i1 , i2 , . . . , il , i1 )

is a subgraph of Gσ .

Example 8.8 (Graphical approach to cycle decomposition). Consider a permuta-


tion from S6 given by

    σ = ( 1 2 3 4 5 6 )
        ( 2 6 4 3 5 1 )
Then the digraph Gσ has vertex set V = {1, . . . , 6} and edge set

E = {(1, 2), (2, 6), (3, 4), (4, 3), (6, 1)}.

A diagram of this graph shows the vertices 1, . . . , 6 and clearly includes two disjoint
directed cycles, 1 → 2 → 6 → 1 and 3 → 4 → 3, along with the isolated vertex 5.

Thus our permutation has fix(σ) = {5} and its cycle decomposition is

σ = (1, 2, 6)(3, 4) = (3, 4)(1, 2, 6) = (4, 3)(2, 6, 1),

where I have included some versions where the order of the two disjoint cycles is
switched and one where the terms within the cycle are written in a different (but
equivalent) order.
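The walk around Gσ described by Proposition 8.7 translates directly into a small program. In the Python sketch below the permutation is stored as a dictionary mapping j to σ(j); that representation and the function name are my own choices.

    def cycle_decomposition(sigma):
        """Cycle decomposition of a permutation of {1, ..., n}, given as
        a dict mapping j to sigma(j), by walking around G_sigma."""
        remaining = set(sigma)
        cycles = []
        while remaining:
            start = min(remaining)
            cycle, j = [start], sigma[start]
            remaining.discard(start)
            while j != start:         # follow the directed edges j -> sigma(j)
                cycle.append(j)
                remaining.discard(j)
                j = sigma[j]
            if len(cycle) > 1:        # fixed points are not written as cycles
                cycles.append(tuple(cycle))
        return cycles

    sigma = {1: 2, 2: 6, 3: 4, 4: 3, 5: 5, 6: 1}
    print(cycle_decomposition(sigma))   # [(1, 2, 6), (3, 4)]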
With a little more work (that’s left to the reader) one can prove that this graph-
ical approach establishes a bijection between Sn and that family of subgraphs of Kn
which consists of unions of disjoint cycles. The bijection sends the identity permu-
tation to the subgraph consisting of n isolated vertices and sends a permutation σ
that is the product of k ≥ 1 disjoint cycles

σ = (i1,1 , . . . , i1,ℓ1 ) · · · (ik,1 , . . . , ik,ℓk ) (8.1)

to the subgraph Gσ that has vertex set V = {v1 , . . . , vn } and whose edges are those
that appear in the k disjoint, directed cycles C1 , . . . , Ck , where Cj is the cycle
specified by the vertex sequence

    ( v_{i_{j,1}}, . . . , v_{i_{j,ℓ_j}}, v_{i_{j,1}} ).    (8.2)

In Eqns. (8.1) and (8.2) the notation ij,r is the vertex number of the r-th vertex in
the j-th cycle, while ℓj is the length of the j-th cycle.

8.3 The determinant is a sum over permutations


[Video 5.5] The Matrix-Tree theorems relate the number of spanning trees or
arborescences to the determinant of a matrix and so it should not be surprising
that another of our
key ingredients is a fact about determinants. The standard recursive approach to
computing determinants—in which one computes the determinant of an n×n matrix
as a sum over the determinants of (n − 1) × (n − 1) submatrices—is equivalent to a
sum over permutations:
Proposition 8.9. If A is an n × n matrix then
    det(A) = ∑_{σ∈Sn} sgn(σ) ∏_{j=1}^{n} A_{jσ(j)}.    (8.3)

Some of you may have encountered this elsewhere, though most will be meeting
it for the first time. I won’t prove it, as that would be too much of a diversion
from graphs, but Eqn. (8.3) has the blessing of Wikipedia, where it is attributed to
Leibniz, and proofs appear in many undergraduate algebra texts2 . The very keen
reader could also construct an inductive proof herself, starting from the familiar
recursive formula.
Finally, I’ll demonstrate that it works for the two smallest nontrivial examples.
² I found one in I.N. Herstein (1975), Topics in Algebra, 2nd ed., Wiley.

Example 8.10 (2 × 2 matrices). First we need a list of the elements of S2 and their
signs:

    Name    σ            sgn(σ)

    σ1      ( 1 2 )      1
            ( 1 2 )

    σ2      ( 1 2 )      −1
            ( 2 1 )

Then we can compute the determinant of a 2 × 2 matrix in the usual way

    det [ a11 a12 ]  = a11 a22 − a12 a21
        [ a21 a22 ]

and then again using Eqn (8.3)

    det [ a11 a12 ]  = ∑_{k=1}^{2} sgn(σk) ∏_{j=1}^{2} a_{j σk(j)}
        [ a21 a22 ]
                     = sgn(σ1) × a_{1σ1(1)} a_{2σ1(2)} + sgn(σ2) × a_{1σ2(1)} a_{2σ2(2)}
                     = (1) × a11 a22 + (−1) × a12 a21
                     = a11 a22 − a12 a21

Example 8.11 (3 × 3 matrices). Table 8.1 lists the elements of S3 in the form
we’ll need. First, we compute the determinant in the usual way, a tedious but
straightforward business.

    det [ a11 a12 a13 ]
        [ a21 a22 a23 ]
        [ a31 a32 a33 ]
      = a11 × det [ a22 a23 ] − a12 × det [ a21 a23 ] + a13 × det [ a21 a22 ]
                  [ a32 a33 ]             [ a31 a33 ]             [ a31 a32 ]
      = a11 a22 a33 − a11 a23 a32 − a12 a21 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31

Then we calculate again using Eqn (8.3) and the numbering scheme for the elements
of S3 that’s shown in Table 8.1.

    det [ a11 a12 a13 ]
        [ a21 a22 a23 ]  = ∑_{k=1}^{6} sgn(σk) ∏_{j=1}^{3} a_{j σk(j)}
        [ a31 a32 a33 ]
      = sgn(σ1) × a_{1σ1(1)} a_{2σ1(2)} a_{3σ1(3)} + · · · + sgn(σ6) × a_{1σ6(1)} a_{2σ6(2)} a_{3σ6(3)}
      = (1) × a11 a22 a33 + (−1) × a11 a23 a32 + (−1) × a12 a21 a33
        + (1) × a12 a23 a31 + (1) × a13 a21 a32 + (−1) × a13 a22 a31
      = a11 a22 a33 − a11 a23 a32 − a12 a21 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31

    σk                  fix(σk)      Cycle Decomposition    sgn(σk)

    σ1 = ( 1 2 3 )      {1, 2, 3}    –                      1
         ( 1 2 3 )

    σ2 = ( 1 2 3 )      {1}          (2, 3)                 −1
         ( 1 3 2 )

    σ3 = ( 1 2 3 )      {3}          (1, 2)                 −1
         ( 2 1 3 )

    σ4 = ( 1 2 3 )      ∅            (1, 2, 3)              1
         ( 2 3 1 )

    σ5 = ( 1 2 3 )      ∅            (3, 2, 1)              1
         ( 3 1 2 )

    σ6 = ( 1 2 3 )      {2}          (1, 3)                 −1
         ( 3 2 1 )

Table 8.1: The cycle decompositions of all the elements in S3, along with the
associated functions sgn(σ) and fix(σ).
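Both examples can be checked mechanically. The Python sketch below, which is mine, evaluates Eqn. (8.3) by brute force over all n! permutations, computing signs from cycle lengths as in Definition 8.5; it illustrates the formula rather than offering an efficient way to compute determinants.

    from itertools import permutations
    from math import prod

    def sign(perm):
        """Sign of a permutation of (0, ..., n-1) from its cycle lengths."""
        seen, sgn = set(), 1
        for i in range(len(perm)):
            if i not in seen:
                length, j = 0, i
                while j not in seen:   # walk around the cycle containing i
                    seen.add(j)
                    j = perm[j]
                    length += 1
                sgn *= (-1) ** (length - 1)
        return sgn

    def det_leibniz(A):
        """Determinant via the sum over permutations in Eqn. (8.3)."""
        n = len(A)
        return sum(sign(p) * prod(A[j][p[j]] for j in range(n))
                   for p in permutations(range(n)))

    # det_leibniz([[1, 2], [3, 4]]) gives -2, agreeing with 1*4 - 2*3.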

8.4 The Principle of Inclusion/Exclusion


[Video 5.6] The remaining ingredient for the proof of the Matrix-Tree theorems is
the Principle of Inclusion/Exclusion. As it is covered in a core first year module,
none of the proofs in the rest of this lecture are examinable.

8.4.1 A familiar example


Suppose we have some finite “universal” set U and two subsets, X1 ⊆ U and X2 ⊆ U .
If the subsets are disjoint then it’s easy to work out the number of elements in their
union:
X1 ∩ X2 = ∅ ⇒ |X1 ∪ X2 | = |X1 | + |X2 |.
The case where the subsets have a non-empty intersection provides the simplest
instance of the Principle of Inclusion/Exclusion. You may already know a similar
result from Probability.

Lemma 8.12 (Inclusion/Exclusion for two sets). If X1 and X2 are finite sets then

|X1 ∪ X2 | = |X1 | + |X2 | − |X1 ∩ X2 |. (8.4)

Note that this formula, which is illustrated in Figure 8.1, works even when X1 ∩ X2 = ∅,
as then |X1 ∩ X2 | = 0. The proof of this lemma appears in the Appendix, in
Section 8.5.1.

8.4.2 Three subsets


Before moving to the general case, let’s consider one more small example, this
time with three subsets X1 , X2 and X3 : we can handle this case by clever use

Figure 8.1: In the example at left X1 ∩ X2 = ∅, so |X1 ∪ X2| = |X1| + |X2|, but in the
example at right X1 ∩ X2 ≠ ∅ and so |X1 ∪ X2| = |X1| + |X2| − |X1 ∩ X2| < |X1| + |X2|.

of Lemma 8.12 from the previous section. If we regard (X1 ∪ X2 ) as a single set and
X3 as a second set, then Eqn. (8.4) says

|(X1 ∪ X2 ) ∪ X3 | = |(X1 ∪ X2 )| + |X3 | − |(X1 ∪ X2 ) ∩ X3 |


= (|X1 | + |X2 | − |X1 ∩ X2 |) + |X3 | − |(X1 ∪ X2 ) ∩ X3 |
= |X1 | + |X2 | + |X3 | − |X1 ∩ X2 | − |(X1 ∪ X2 ) ∩ X3 |

Focusing on the final term, we can use standard relations about unions and inter-
sections to say
(X1 ∪ X2 ) ∩ X3 = (X1 ∩ X3 ) ∪ (X2 ∩ X3 ).
Then, applying Eqn. (8.4) to the pair of sets (X1 ∩ X3 ) and (X2 ∩ X3 ), we obtain

|(X1 ∪ X2 ) ∩ X3 | = |(X1 ∩ X3 ) ∪ (X2 ∩ X3 )|


= |X1 ∩ X3 | + |X2 ∩ X3 | − |(X1 ∩ X3 ) ∩ (X2 ∩ X3 )|
= |X1 ∩ X3 | + |X2 ∩ X3 | − |X1 ∩ X2 ∩ X3 |

where, in going from the second line to the third, we have used

(X1 ∩ X3 ) ∩ (X2 ∩ X3 ) = X1 ∩ X2 ∩ X3 .

Finally, putting all these results together, we obtain the analogue of Eqn. (8.4)
for three subsets:

    |(X1 ∪ X2) ∪ X3| = |X1| + |X2| + |X3| − |X1 ∩ X2| − |(X1 ∪ X2) ∩ X3|
                     = |X1| + |X2| + |X3| − |X1 ∩ X2|
                       − (|X1 ∩ X3| + |X2 ∩ X3| − |X1 ∩ X2 ∩ X3|)
                     = (|X1| + |X2| + |X3|)
                       − (|X1 ∩ X2| + |X1 ∩ X3| + |X2 ∩ X3|)
                       + |X1 ∩ X2 ∩ X3|.    (8.5)

Figure 8.2 helps make sense of this formula and prompts the following observations:
• Elements of X1 ∪ X2 ∪ X3 that belong to exactly one of the Xj are counted
exactly once by the sum (|X1 | + |X2 | + |X3 |) and do not contribute to any of
the terms involving intersections.

Figure 8.2: In the diagram above all of the intersections appearing in Eqn. (8.5) are
nonempty.

• Elements of X1 ∪ X2 ∪ X3 that belong to exactly two of the Xj are double-


counted by the sum, (|X1 | + |X2 | + |X3 |), but this double-counting is corrected
by the term involving two-fold intersections.
• Finally, elements of X1 ∪ X2 ∪ X3 that belong to all three of the sets are
triple-counted by the initial sum (|X1 | + |X2 | + |X3 |). This triple-counting is
then completely cancelled by the term involving two-fold intersections. Then,
finally, this cancellation is repaired by the final term, which counts each such
element once.

8.4.3 The general case


The Principle of Inclusion/Exclusion generalises the results in Eqns. (8.1) and (8.5)
to unions of arbitrarily many subsets.
Theorem 8.13 (The Principle of Inclusion/Exclusion). If U is a finite set and
{Xj}_{j=1}^{n} is a collection of n subsets, then

    |⋃_{j=1}^{n} Xj| = |X1 ∪ · · · ∪ Xn|
                     = |X1| + · · · + |Xn|
                       − |X1 ∩ X2| − · · · − |X_{n−1} ∩ Xn|
                       + |X1 ∩ X2 ∩ X3| + · · · + |X_{n−2} ∩ X_{n−1} ∩ Xn|
                       ...
                       + (−1)^{m−1} ∑_{1≤i1<···<im≤n} |X_{i1} ∩ · · · ∩ X_{im}|
                       ...
                       + (−1)^{n−1} |X1 ∩ · · · ∩ Xn|    (8.6)

or, more concisely,

    |X1 ∪ · · · ∪ Xn| = ∑_{I⊆{1,...,n}, I≠∅} (−1)^{|I|−1} |⋂_{i∈I} Xi|.    (8.7)

The proof of this result appears in the Appendix, in Section 8.5.2 below.

8.4.4 An example
How many of the integers n with 1 ≤ n ≤ 150 are coprime to 70? This is a job
for the Principle of Inclusion/Exclusion. First note that the prime factorisation of
70 is 70 = 2 × 5 × 7. Now consider a universal set U = {1, . . . , 150} and the three
subsets X1 , X2 and X3 consisting of multiples of 2, 5 and 7, respectively. A member
of U that shares a prime factor with 70 belongs to at least one of the Xj and so the
number we’re after is

|U | − |X1 ∪ X2 ∪ X3 | = |U | − (|X1 | + |X2 | + |X3 |)


+ (X1 ∩ X2 | + |X1 ∩ X3 | + |X2 ∩ X3 |)
− |X1 ∩ X2 ∩ X3 |.
= 150 − (75 + 30 + 21) + (15 + 10 + 4) − 2
= 150 − 126 + 29 − 2
= 51 (8.8)

where I have used the numbers in Table 8.2 which lists the various cardinalities that
we need.

    Set               Description        Cardinality

    X1                multiples of 2     75
    X2                multiples of 5     30
    X3                multiples of 7     21
    X1 ∩ X2           multiples of 10    15
    X1 ∩ X3           multiples of 14    10
    X2 ∩ X3           multiples of 35    4
    X1 ∩ X2 ∩ X3      multiples of 70    2

Table 8.2: The sizes of the various intersections needed for the calculation in
Eqn. (8.8).
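The arithmetic of Eqn. (8.8) generalises to any list of distinct primes, and a short program makes the bookkeeping painless. In the Python sketch below the function name and its interface are mine; it sums over all non-empty subsets of the given primes, exactly as in Eqn. (8.7), using the fact that the members of U divisible by every prime in a subset are the multiples of the subset’s product.

    from itertools import combinations
    from math import prod

    def count_coprime(limit, primes):
        """Count the integers 1 <= n <= limit that are coprime to every
        given prime, by the Principle of Inclusion/Exclusion."""
        total = limit
        for k in range(1, len(primes) + 1):
            for subset in combinations(primes, k):
                # the k-fold intersection consists of the multiples
                # of the product of the primes in the subset
                total += (-1) ** k * (limit // prod(subset))
        return total

    print(count_coprime(150, [2, 5, 7]))   # 51, as in Eqn. (8.8)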

8.5 Appendix: Proofs for Inclusion/Exclusion
The proofs in this section will not appear on the exam, but are provided for those
who are interested or for whom the subject is new.

Figure 8.3: Here X1\X2 and X1 ∩ X2 are shown in shades of blue, while X2\X1 is
in yellow.

8.5.1 Proof of Lemma 8.12, the case of two sets


Recall that X1 and X2 are subsets of some universal set U and that we seek to prove
that |X1 ∪ X2 | = |X1 | + |X2 | − |X1 ∩ X2 |.
Proof. Note that the sum |X1 |+|X2 | counts each member of the intersection X1 ∩X2
twice, once as a member of X1 and then again as a member of X2 . Subtracting
|X1 ∩ X2 | corrects for this double-counting. Alternatively, for those who prefer
proofs that look more like calculations, begin by defining

    X1\X2 = {x ∈ U | x ∈ X1, but x ∉ X2}.

Then, as is illustrated in Figure 8.3, X1 = (X1 \X2 ) ∪ (X1 ∩ X2 ). Further, the sets
X1 \X2 and X1 ∩ X2 are disjoint by construction, so

|X1 | = |X1 \X2 | + |X1 ∩ X2 | or |X1 \X2 | = |X1 | − |X1 ∩ X2 |. (8.9)

Similarly, X1 \X2 and X2 are disjoint and X1 ∪ X2 = (X1 \X2 ) ∪ X2 so

|X1 ∪ X2 | = |X1 \X2 | + |X2 |


= |X1 | − |X1 ∩ X2 | + |X2 |
= |X1 | + |X2 | − |X1 ∩ X2 |

where, in passing from the first line to the second, we have used (8.9). The last line
is the result we were trying to prove, so we are finished.

8.5.2 Proof of Theorem 8.13
One can prove this result in at least two ways:
• by induction, with a calculation that is essentially the same as the one used
to obtain the n = 3 case—Eqn. (8.5)—from the n = 2 one—Eqn. (8.4);
• by showing that each x ∈ X1 ∪ · · · ∪ Xn contributes exactly one to the sum on
the right hand side of Eqn. (8.7).
The first approach is straightforward, if a bit tedious, but the second is more inter-
esting and is the one discussed here.
The key idea is to think of the elements of X1 ∪ · · · ∪ Xn individually and
ask what each one contributes to the sum in Eqn. (8.7). Suppose that an element
x ∈ X1 ∪ · · · ∪ Xn belongs to exactly ℓ of the subsets, with 1 ≤ ℓ ≤ n: we will
prove that x makes a net contribution of 1. For the sake of concreteness, we’ll say
x ∈ Xi1 , . . . , Xiℓ where i1 , . . . , iℓ are distinct elements of {1, . . . , n}.
• As we’ve assumed that x belongs to exactly ℓ of the subsets Xj , it contributes
a total of ℓ to the first row, |X1 | + · · · + |Xn |, of the long sum in Eqn. (8.6).

• Further, x contributes a total of $-\binom{\ell}{2}$ to the sum in the row involving two-way
intersections, −|X1 ∩ X2 | − · · · − |Xn−1 ∩ Xn |. To see this, note that if x ∈ Xj ∩ Xk
then both j and k must be members of the set {i1 , . . . , iℓ }, and there are $\binom{\ell}{2}$ such pairs.

• Similar arguments show that if k ≤ ℓ, then x contributes a total of
$$(-1)^{k-1}\binom{\ell}{k} = (-1)^{k-1}\,\frac{\ell!}{k!\,(\ell-k)!}$$
to the sum in the row of Eqn. (8.6) that involves k-fold intersections.

• Finally, for k > ℓ there are no k-fold intersections that contain x and so x
makes a contribution of zero to the corresponding rows in Eqn. (8.6).

Putting these observations together we see that x makes a net contribution of
$$\ell - \binom{\ell}{2} + \binom{\ell}{3} - \dots + (-1)^{\ell-1}\binom{\ell}{\ell} \tag{8.10}$$
This sum can be made to look more familiar by considering the following application
of the Binomial Theorem:
$$0 = (1-1)^{\ell} = \sum_{j=0}^{\ell} (-1)^{j} (1)^{\ell-j} \binom{\ell}{j} = 1 - \ell + \binom{\ell}{2} - \binom{\ell}{3} + \dots + (-1)^{\ell}\binom{\ell}{\ell}.$$

Thus
$$0 = 1 - \left(\ell - \binom{\ell}{2} + \binom{\ell}{3} - \dots + (-1)^{\ell-1}\binom{\ell}{\ell}\right)$$
or
$$\ell - \binom{\ell}{2} + \binom{\ell}{3} - \dots + (-1)^{\ell-1}\binom{\ell}{\ell} = 1.$$
The left hand side here is the same as the sum in Eqn. (8.10) and so we’ve established
that any x which belongs to exactly ℓ of the subsets Xj makes a net contribution of 1
to the sum on the right hand side of Eqn. (8.7). And as every x ∈ X1 ∪· · ·∪Xn must
belong to at least one of the Xj , this establishes the Principle of Inclusion/Exclusion.
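
The heart of this argument—the fact that the alternating sum of binomial coefficients in Eqn. (8.10) always collapses to 1—is easy to spot-check numerically. A minimal sketch in Python (my addition):

from math import comb

# Check that sum_{k=1}^{l} (-1)^(k-1) * C(l, k) = 1 for a range of l.
for ell in range(1, 11):
    net = sum((-1) ** (k - 1) * comb(ell, k) for k in range(1, ell + 1))
    assert net == 1, (ell, net)
print("net contribution is 1 for each of l = 1, ..., 10")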

8.5.3 Alternative proof


Students who like proofs that look more like calculations may prefer to reformu-
late the arguments from the previous section in terms of characteristic functions
(sometimes also called indicator functions) of sets. If we define 1X : U → {0, 1} by

$$\mathbf{1}_X(s) = \begin{cases} 1 & \text{if } s \in X \\ 0 & \text{otherwise} \end{cases}$$

then we can calculate |X| for a subset X ⊆ U as follows:


$$|X| = \sum_{x \in U} \mathbf{1}_X(x) = \left(\sum_{x \in X} \mathbf{1}_X(x)\right) + \left(\sum_{x \notin X} \mathbf{1}_X(x)\right) = \sum_{x \in X} \mathbf{1}_X(x) \tag{8.11}$$

where, in passing from the second to third lines, I have dropped the second sum
because all its terms are zero.
Then the Principle of Inclusion/Exclusion is equivalent to

$$\begin{aligned}
\sum_{x \in X_1 \cup \cdots \cup X_n} \mathbf{1}_{X_1 \cup \cdots \cup X_n}(x)
 &= \sum_{I \subseteq \{1,\dots,n\},\, I \neq \emptyset} (-1)^{|I|-1} \left|\,\bigcap_{i \in I} X_i\,\right| \\
 &= \sum_{I \subseteq \{1,\dots,n\},\, I \neq \emptyset} (-1)^{|I|-1} \sum_{x \in \bigcap_{i \in I} X_i} \mathbf{1}_{\bigcap_{i \in I} X_i}(x) \\
 &= \sum_{k=1}^{n} (-1)^{k-1} \left(\sum_{I \subseteq \{1,\dots,n\},\, |I|=k} \;\sum_{x \in \bigcap_{i \in I} X_i} \mathbf{1}_{\bigcap_{i \in I} X_i}(x)\right)
\end{aligned}$$

which I have obtained by using Eqn. (8.11) to replace terms in Eqn. (8.7) with
the corresponding sums of values of characteristic functions.

We can then rearrange the expression on the right, first expanding the ranges of
the sums over elements of k-fold intersections (this doesn’t change the result since
1X (x) = 0 for x ∉ X) and then interchanging the order of summation so that
the sum over elements comes first. This calculation proves that the Principle of
Inclusion/Exclusion is equivalent to the following:
$$\begin{aligned}
\sum_{x \in X_1 \cup \cdots \cup X_n} \mathbf{1}_{X_1 \cup \cdots \cup X_n}(x)
 &= \sum_{k=1}^{n} (-1)^{k-1} \left(\sum_{I \subseteq \{1,\dots,n\},\, |I|=k} \;\sum_{x \in X_1 \cup \cdots \cup X_n} \mathbf{1}_{\bigcap_{i \in I} X_i}(x)\right) \\
 &= \sum_{x \in X_1 \cup \cdots \cup X_n} \sum_{k=1}^{n} (-1)^{k-1} \sum_{I \subseteq \{1,\dots,n\},\, |I|=k} \mathbf{1}_{\bigcap_{i \in I} X_i}(x)
\end{aligned} \tag{8.12}$$

Arguments similar to those in Section 8.5.2 then establish the following results, the
last of which, along with Eqn. (8.12), proves Theorem 8.13.

Proposition 8.14. If an element x ∈ X1 ∪ · · · ∪ Xn belongs to exactly ℓ of the sets
X1 , . . . , Xn then for k ≤ ℓ we have
$$\sum_{I \subseteq \{1,\dots,n\},\, |I|=k} \mathbf{1}_{\bigcap_{i \in I} X_i}(x) = \binom{\ell}{k} = \frac{\ell!}{k!\,(\ell-k)!}$$
while if k > ℓ
$$\sum_{I \subseteq \{1,\dots,n\},\, |I|=k} \mathbf{1}_{\bigcap_{i \in I} X_i}(x) = 0.$$

Proposition 8.15. For an element x ∈ X1 ∪ · · · ∪ Xn that belongs to exactly ℓ of
the sets we have
$$\sum_{k=1}^{n} (-1)^{k-1} \sum_{I \subseteq \{1,\dots,n\},\, |I|=k} \mathbf{1}_{\bigcap_{i \in I} X_i}(x) = \sum_{k=1}^{n} (-1)^{k-1} \binom{\ell}{k} = 1,$$
where the middle sum collapses because $\binom{\ell}{k} = 0$ when k > ℓ.

Lemma 8.16. The characteristic function $\mathbf{1}_{X_1 \cup \cdots \cup X_n}$ of the set X1 ∪ · · · ∪ Xn satisfies
$$\mathbf{1}_{X_1 \cup \cdots \cup X_n}(x) = \sum_{k=1}^{n} (-1)^{k-1} \sum_{I \subseteq \{1,\dots,n\},\, |I|=k} \mathbf{1}_{\bigcap_{i \in I} X_i}(x).$$
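
Readers who would like to see the Principle in action can also verify Eqn. (8.7) directly on randomly generated subsets. The following Python sketch (my addition; the universal set and the number of subsets are arbitrary choices) compares the two sides of the identity:

from itertools import combinations
import random

# Spot-check Eqn. (8.7) on random subsets of a small universal set.
random.seed(1)
n = 4
X = [{u for u in range(20) if random.random() < 0.5} for _ in range(n)]

total = 0
for k in range(1, n + 1):  # index sets I with |I| = k
    for I in combinations(range(n), k):
        inter = set.intersection(*(X[i] for i in I))
        total += (-1) ** (k - 1) * len(inter)

assert total == len(set().union(*X))
print(total)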

Chapter 9

Proof of Tutte’s Matrix-Tree Theorem

The proof here is derived from a terse account in the lecture notes from a course on
Algebraic Combinatorics taught by Lionel Levine at MIT in Spring 2011.1 I studied
them with Samantha Barlow, a former Discrete Maths student who did a third-year
project with me in 2011-12.
Reading:
I don’t know of any textbook accounts of the proof given here, but the intrepid reader
might like to look at the following two articles, both of which make the connection
between the Principle of Inclusion/Exclusion and Tutte’s Matrix Tree theorem.
• J.B. Orlin (1978), Line-digraphs, arborescences, and theorems of Tutte and
Knuth, Journal of Combinatorial Theory, Series B, 25(2):187–198. DOI:
10.1016/0095-8956(78)90038-2

• S. Chaiken (1983), A combinatorial proof of the all minors matrix tree the-
orem, SIAM Journal on Algebraic and Discrete Methods, 3:319–329. DOI:
10.1137/0603033

9.1 Single predecessor graphs


[Video 6.1] Before we plunge into the proof itself I’d like to define a certain family of graphs
that includes, but is larger than, the family of spanning arborescences.
Definition 9.1. A single predecessor graph (“spreg”) with distinguished
vertex v in a digraph G(V, E) is a subgraph T (V, E′) (note that T and G have
the same vertex set) in which each vertex other than the distinguished vertex v has
exactly one predecessor, while v itself has no predecessors. Equivalently,
$$\deg_{\mathrm{in}}(v) = 0 \quad\text{and}\quad \deg_{\mathrm{in}}(u) = 1 \;\;\forall\, u \neq v \in V.$$


¹ Dr. Levine has since moved to Cornell, where his notes about the Matrix Tree Theorem are still available.

Figure 9.1: Three examples of single predecessor graphs (spregs). In each, the
distinguished vertex is white, while the other vertices, which all have degin (u) = 1,
are shaded in other colours. The example at left has multiple weakly connected
components, while the other two are arborescences.

Figure 9.1 includes several examples of spregs, including two that are arborescences,
which prompts the following proposition:

Proposition 9.2 (Spanning arborescences are spregs). If T (V, E ′ ) is a spanning


arborescence for G(V, E) with root v, then it is also a spreg with distinguished vertex
v.

Proof. By definition, G and T share the same vertex set, so all we need check is
that the vertices u ̸= v in T have a single predecessor. Recall that an arborescence
rooted at v is a directed graph T (V, E) such that

(i) Every vertex u ̸= v is accessible from v. That is, there is a directed path from
v to every other vertex.

(ii) T becomes an ordinary, undirected tree if we ignore the directedness of the


edges.

The proposition consists of two separate claims: that degin (v) = 0 and that
degin (u) = 1 ∀u ̸= v ∈ V . We’ll prove both by contradiction.
Suppose that degin (v) > 0: it’s then easy to see that T must include a directed
cycle. Consider one of v’s predecessors—call it u0 . It is accessible from v, so there is
a directed path from v to u0 . And u0 is a predecessor of v, so there is also a directed
edge (u0 , v) ∈ E. If we append this edge to the end of the path, we get a directed
path from v back to itself. This contradicts the second property of an arborescence
and so we must have degin (v) = 0.
The proof for the second part of the proposition is illustrated in Figure 9.2. Sup-
pose that ∃ u ̸= v ∈ V such that degin (u) ≥ 2 and choose two distinct predecessors
of u: call them v1 and v2 and note that one of them may be the root vertex v.
Now consider the directed paths from v to v1 and v2 . In the undirected version of
T these paths, along with the edges (v1 , u) and (v2 , u), must include a cycle, which
contradicts the second property of an arborescence.
The examples in Figure 9.1 make it clear that there are other kinds of spregs
besides spanning arborescences, but there aren’t that many kinds:

Figure 9.2: Two examples to illustrate the second part of the proof that an
arborescence is a spreg. If one ignores the directedness of the edges in the graphs
above, both contain cycles.

Proposition 9.3 (Characterising spregs). A spreg with distinguished vertex v consists
of an arborescence rooted at v, plus zero or more disjoint weakly connected
components, each of which contains a single directed cycle.
Note that the arborescence mentioned in the Proposition is not necessarily a span-
ning one: the leftmost graph in Figure 9.1 consists of a small, non-spanning arbores-
cence and a second component that contains a cycle.
The reasoning needed to prove this proposition is similar to that for the previous
one and so is left to the Problem Sets.
This proposition is one of the key ingredients in the proof of Tutte’s Matrix-Tree
Theorem. The idea is to first note that a spanning arborescence is a spreg. We
then count the spanning arborescences contained in a graph by first counting all
the spregs, then using the Principle of Inclusion/Exclusion to count—and subtract
away—those spregs that contain one or more cycles.

9.2 Counting spregs with determinants


[Video 6.2] Recall that we’re trying to prove
Theorem 1 (Tutte’s Directed Matrix-Tree Theorem, 1948). If G(V, E) is a digraph
with vertex set V = {v1 , . . . , vn } and L is an n × n matrix whose entries are given
by
$$L_{ij} = \begin{cases} \deg_{\mathrm{in}}(v_j) & \text{if } i = j \\ -1 & \text{if } i \neq j \text{ and } (v_i, v_j) \in E \\ 0 & \text{otherwise} \end{cases} \tag{9.1}$$
then the number Nj of spanning arborescences with root at vj is

Nj = det(L̂j )

where L̂j is the matrix produced by deleting the j-th row and column from L.
First note that—because we can always renumber the vertices before we apply
the theorem—it is sufficient to prove the result for the case with root vertex v = vn .
Now consider the representation of det(L̂n ) as a sum over permutations:
$$\det(\hat{L}_n) \equiv \det(\mathcal{L}) = \sum_{\sigma \in S_{n-1}} \operatorname{sgn}(\sigma) \prod_{j=1}^{n-1} \mathcal{L}_{j\sigma(j)}. \tag{9.2}$$

Predecessor of v1    Predecessor of v2    Predecessor of v3    Is a spanning arborescence?
v2                   v1                   v2                   No
v2                   v3                   v2                   No
v2                   v4                   v2                   Yes
v4                   v1                   v2                   Yes
v4                   v3                   v2                   No
v4                   v4                   v2                   Yes

Table 9.1: Each row here corresponds to one of the spregs in Figure 9.3.

where I have introduced the notation $\mathcal{L} \equiv \hat{L}_n$ to avoid the confusion of having two
kinds of subscripts on L̂n . This means that $\mathcal{L}$ is an (n − 1) × (n − 1) matrix in which
$$\mathcal{L}_{ij} = L_{ij},$$
where Lij is the i, j entry in the matrix L defined by Eqn. (9.1) in the statement of
Tutte’s theorem.
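
It may help to see Eqn. (9.2) as an algorithm. The Python sketch below (my addition) evaluates a determinant directly from this sum over permutations, computing sgn(σ) from the number of inversions; the example matrix happens to be the minor L̂4 that the worked example of Section 9.2.2 produces.

from itertools import permutations

def sign(p):
    # Sign of the permutation p of {0, ..., m-1}, via its inversion count.
    m = len(p)
    inv = sum(1 for a in range(m) for b in range(a + 1, m) if p[a] > p[b])
    return -1 if inv % 2 else 1

def det_by_permutations(M):
    # det(M) = sum over sigma of sgn(sigma) * prod_j M[j][sigma(j)], as in (9.2).
    total = 0
    for p in permutations(range(len(M))):
        term = sign(p)
        for j, pj in enumerate(p):
            term *= M[j][pj]
        total += term
    return total

M = [[2, -1, 0],
     [-1, 3, -1],
     [0, -1, 1]]
print(det_by_permutations(M))  # 3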

9.2.1 Counting spregs


In this section we’ll explore two examples that illustrate a connection between terms
in the sum for $\det(\mathcal{L})$ and the business of counting various kinds of spregs.

The identity term: counting all spregs


In the case where σ = id, so that σ(j) = j for all j, we have sgn(σ) = 1 and

$$\prod_{j=1}^{n-1} \mathcal{L}_{j\sigma(j)} = \prod_{j=1}^{n-1} \mathcal{L}_{jj} = \prod_{j=1}^{n-1} \deg_{\mathrm{in}}(v_j). \tag{9.3}$$

This product is also equal to the total number of spregs in G(V, E) that have dis-
tinguished vertex vn . To see why, look back at the definition of a spreg and think
about what we’d need to do if we wanted to write down a complete list of these
spregs. We could specify a spreg by listing the single predecessor for each vertex
other than vn in a table like the one below
Vertex v1 v2 v3
Predecessor v2 v1 v2
which describes one of the spregs rooted at v4 contained in the four-vertex graph
shown in Figure 9.3. And if we wanted to list all the four-vertex spregs contained
in this graph we could start by assembling the predecessor lists of all the vertices
other than the distinguished vertex,

P1 = {v2 , v4 }, P2 = {v1 , v3 , v4 } and P3 = {v2 },

Figure 9.3: The graph G(V, E) at upper left contains six spregs with distinguished
vertex v4 , all of which are shown in the two rows below. Three of them are spanning
arborescences rooted at v4 , while the three others contain cycles.

where Pj lists the predecessors of vj . Then, to specify a spreg with distinguished
vertex v4 we would choose one entry from each of the predecessor lists, meaning that
there are
$$|P_1| \times |P_2| \times |P_3| = \deg_{\mathrm{in}}(v_1) \times \deg_{\mathrm{in}}(v_2) \times \deg_{\mathrm{in}}(v_3) = 2 \times 3 \times 1 = 6$$
such spregs in total. All six possibilities are listed in Table 9.1 and illustrated in
Figure 9.3. The equation above also emphasises that |Pj | = degin (vj ) and so makes
the connection with the product in Eqn. (9.3).
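
This counting argument is easy to animate in code. The sketch below (my addition) enumerates the six spregs of the graph in Figure 9.3 by choosing one predecessor for each of v1 , v2 and v3 from the lists P1 , P2 and P3 , and then tests which choices are spanning arborescences by checking that every vertex is reachable from v4 :

from itertools import product

# Predecessor lists for v1, v2, v3 in the graph of Figure 9.3.
P = {1: [2, 4], 2: [1, 3, 4], 3: [2]}

def reaches_all(edges, root=4, vertices=frozenset({1, 2, 3, 4})):
    # True if every vertex is reachable from root along the directed edges.
    seen, stack = {root}, [root]
    while stack:
        u = stack.pop()
        for a, b in edges:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen == vertices

spregs = list(product(P[1], P[2], P[3]))
arbs = [s for s in spregs if reaches_all({(s[0], 1), (s[1], 2), (s[2], 3)})]
print(len(spregs), len(arbs))  # 6 spregs, 3 of them spanning arborescences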

Terms that count spregs containing a single directed cycle


[Video 6.3] In the case where the permutation σ contains a single cycle of length ℓ, so that
$$\sigma = (i_1, \dots, i_\ell),$$
we have sgn(σ) = (−1)^{ℓ−1} and
$$\prod_{j=1}^{n-1} \mathcal{L}_{j\sigma(j)} = \left(\prod_{j \in \operatorname{fix}(\sigma)} \mathcal{L}_{jj}\right) \times \left(\prod_{k=1}^{\ell} \mathcal{L}_{i_k i_{k+1}}\right) = \left(\prod_{j \in \operatorname{fix}(\sigma)} \deg_{\mathrm{in}}(v_j)\right) \times \left(\prod_{k=1}^{\ell} \mathcal{L}_{i_k i_{k+1}}\right)$$
where the indices ik are to be understood periodically, so that $i_{\ell+1} = i_1$. The factors
$\mathcal{L}_{i_k i_{k+1}}$ in the second of the two products above are off-diagonal entries of $\mathcal{L} = \hat{L}_n$
and thus satisfy
$$\mathcal{L}_{i_k i_{k+1}} = \begin{cases} -1 & \text{if } (v_{i_k}, v_{i_{k+1}}) \in E \\ 0 & \text{otherwise.} \end{cases}$$

Thus if one or more of the edges $(v_{i_k}, v_{i_{k+1}})$ is absent from the graph we have
$$\prod_{k=1}^{\ell} \mathcal{L}_{i_k i_{k+1}} = 0,$$
but if all the edges $(v_{i_k}, v_{i_{k+1}})$ are present we can make the following observations:
• the graph contains a directed cycle given by the vertex sequence
$$(v_{i_1}, \dots, v_{i_\ell}, v_{i_1});$$

• $\mathcal{L}_{i_k i_{k+1}} = -1$ for all 1 ≤ k ≤ ℓ and so we have
$$\begin{aligned}
\operatorname{sgn}(\sigma) \prod_{j=1}^{n-1} \mathcal{L}_{j\sigma(j)}
 &= (-1)^{\ell-1} \left(\prod_{j \in \operatorname{fix}(\sigma)} \deg_{\mathrm{in}}(v_j)\right) \times (-1)^{\ell} \\
 &= (-1)^{2\ell-1} \prod_{j \in \operatorname{fix}(\sigma)} \deg_{\mathrm{in}}(v_j) \\
 &= -\prod_{j \in \operatorname{fix}(\sigma)} \deg_{\mathrm{in}}(v_j).
\end{aligned} \tag{9.4}$$

Arguments similar to those in the previous section then show that the product
$\prod_{j \in \operatorname{fix}(\sigma)} \deg_{\mathrm{in}}(v_j)$ in Eqn. (9.4) counts the number of ways to choose predecessors
for those vertices that aren’t part of the cycle. We can summarise all these ideas
with the following pair of results:
Proposition 9.4. For a permutation σ ∈ Sn−1 consisting of a single cycle
$$\sigma = (i_1, \dots, i_\ell)$$
define an associated directed cycle Cσ specified by the vertex sequence $(v_{i_1}, \dots, v_{i_\ell}, v_{i_1})$.
Then the term in $\det(\mathcal{L})$ corresponding to σ satisfies
$$\operatorname{sgn}(\sigma) \prod_{j=1}^{n-1} \mathcal{L}_{j\sigma(j)} = \begin{cases}
-\displaystyle\prod_{j \in \operatorname{fix}(\sigma)} \deg_{\mathrm{in}}(v_j) & \text{if } C_\sigma \subseteq G(V,E) \text{ and } \operatorname{fix}(\sigma) \neq \emptyset \\[1ex]
-1 & \text{if } C_\sigma \subseteq G(V,E) \text{ and } \operatorname{fix}(\sigma) = \emptyset \\[1ex]
0 & \text{if } C_\sigma \not\subseteq G(V,E)
\end{cases}$$

Corollary 9.5. For σ and Cσ as in Proposition 9.4,
$$\left|\,\prod_{j=1}^{n-1} \mathcal{L}_{j\sigma(j)}\,\right| = \bigl|\{\text{spregs containing } C_\sigma\}\bigr|.$$

σ        Cσ                    Cσ ⊆ G?   $\prod_{j=1}^{n-1} \mathcal{L}_{j\sigma(j)}$   Number of spregs containing Cσ
(1,2)    (v1 , v2 , v1 )       Yes       degin (v3 ) = 1                                1
(1,3)    (v1 , v3 , v1 )       No        degin (v2 ) × 0 = 0                            0
(2,3)    (v2 , v3 , v2 )       Yes       degin (v1 ) = 2                                2
(1,2,3)  (v1 , v2 , v3 , v1 )  No        0                                              0
(1,3,2)  (v1 , v3 , v2 , v1 )  No        0                                              0

Table 9.2: The results of using Corollary 9.5 to count spregs containing the various
cycles Cσ associated with the non-identity elements of S3 . The right column lists the
number one gets by direct counting of the spregs shown in Figure 9.3.

9.2.2 An example
Before pressing on to generalise the results of the previous section to arbitrary
permutations, let’s see what Corollary 9.5 allows us to say about the graph in
Figure 9.3. There G(V, E) is a digraph on four vertices, so the determinant that
comes into Tutte’s theorem is that of L̂4 , a three-by-three matrix. We’ve already
seen that if σ = id the product $\prod_{j=1}^{n-1} \deg_{\mathrm{in}}(v_j)$ gives six, the total number of
spregs contained in the graph. The results for the remaining elements of S3 are
listed in Table 9.2 and all are covered by Corollary 9.5, as all non-identity elements
of S3 are single cycles.
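
We can confirm this numerically as well. The Python sketch below (my addition) assembles the matrix L of Eqn. (9.1) for the graph in Figure 9.3—its directed edges are read off from the predecessor lists P1 , P2 and P3—deletes the fourth row and column, and evaluates the determinant, which duly comes out as 3, the number of spanning arborescences rooted at v4 :

import numpy as np

# Directed edges (u, v) of the graph in Figure 9.3 ending at v1, v2, v3.
edges = {(2, 1), (4, 1), (1, 2), (3, 2), (4, 2), (2, 3)}
n = 4

L = np.zeros((n, n), dtype=int)
for u, v in edges:          # off-diagonal entries, as in Eqn. (9.1)
    L[u - 1, v - 1] = -1
for j in range(1, n + 1):   # diagonal entries: in-degrees
    L[j - 1, j - 1] = sum(1 for u, v in edges if v == j)

L_hat = np.delete(np.delete(L, 3, axis=0), 3, axis=1)  # delete row/column 4
print(round(np.linalg.det(L_hat)))  # 3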

9.2.3 Counting spregs in general


Here we generalise the results from Section 9.2.1 to permutations that are the prod-
ucts of arbitrarily many cycles.

Lemma 9.6 (Counting spregs containing cycles). Suppose σ ∈ Sn−1 is the product
of k > 0 disjoint cycles
$$\sigma = (i_{1,1}, \dots, i_{1,\ell_1}) \cdots (i_{k,1}, \dots, i_{k,\ell_k}),$$
where ℓj is the length of the j-th cycle. Associate the directed cycle Cj defined by
the vertex sequence $(v_{i_{j,1}}, \dots, v_{i_{j,\ell_j}}, v_{i_{j,1}})$ with the j-th cycle in the permutation and
define
$$C_\sigma = \bigcup_{j=1}^{k} C_j.$$
Then the term in $\det(\mathcal{L})$ corresponding to σ satisfies
$$\operatorname{sgn}(\sigma) \prod_{j=1}^{n-1} \mathcal{L}_{j\sigma(j)} = \begin{cases}
(-1)^{k} \displaystyle\prod_{j \in \operatorname{fix}(\sigma)} \deg_{\mathrm{in}}(v_j) & \text{if } C_\sigma \subseteq G(V,E) \text{ and } \operatorname{fix}(\sigma) \neq \emptyset \\[1ex]
(-1)^{k} & \text{if } C_\sigma \subseteq G(V,E) \text{ and } \operatorname{fix}(\sigma) = \emptyset \\[1ex]
0 & \text{if } C_\sigma \not\subseteq G(V,E)
\end{cases}$$
Further,
$$\left|\,\prod_{j=1}^{n-1} \mathcal{L}_{j\sigma(j)}\,\right| = \Bigl|\Bigl\{\text{spregs containing } C_\sigma = \bigcup_{j=1}^{k} C_j\Bigr\}\Bigr|. \tag{9.5}$$

The proof of this result requires reasoning much like that used in Section 9.2.1 and
so is left to the reader.

9.3 Proof of Tutte’s theorem


[Video 6.4] Throughout this section I will continue to write $\mathcal{L}$ in place of L̂n to avoid a confusing welter of subscripts.
Proof. As we argued at the beginning of Section 9.2, it is sufficient to prove that
$\det(\hat{L}_n) = \det(\mathcal{L})$ is the number of spanning arborescences rooted at vn . We’ll do
this with the Principle of Inclusion/Exclusion and so, to begin, we need to specify
the universal set U and the subsets Xj . Begin by considering the set C of all possible
directed cycles involving the vertices v1 . . . vn−1 . It’s clearly a finite set and so we
can declare that it has M elements and imagine that we’ve chosen some (arbitrary)
numbering scheme so that we can list the set of cycles as

C = {C1 , . . . , CM }.

We’ll then choose the sets U and Xj as follows:

• U is the set of all spregs with distinguished vertex vn . That is, U is the set of
subgraphs of G(V, E) in which

degin (vn ) = 0 and degin (vj ) = 1 for 1 ≤ j ≤ (n − 1).

• Xj ⊆ U is the subset of U consisting of spregs containing the cycle Cj . This


subset may, of course, be empty.

Proposition 9.3—the one about characterising spregs—tells us that a spreg that
has distinguished vertex vn is either a spanning arborescence rooted at vn or a graph
that contains one or more disjoint cycles. This means that
$$N_n = \bigl|\{\text{spanning arborescences rooted at } v_n\}\bigr| = |U| - \left|\,\bigcup_{j=1}^{M} X_j\,\right|$$

and the Principle of Inclusion/Exclusion then says
$$\begin{aligned}
N_n &= |U| - \left(\sum_{I \subseteq \{1,\dots,M\},\, I \neq \emptyset} (-1)^{|I|-1} \left|\,\bigcap_{j \in I} X_j\,\right|\right) \\
 &= |U| + \sum_{I \subseteq \{1,\dots,M\},\, I \neq \emptyset} (-1)^{|I|} \left|\,\bigcap_{j \in I} X_j\,\right|
\end{aligned} \tag{9.6}$$

As we know that spregs contain only disjoint cycles, we can say
$$|X_j \cap X_k| = 0 \quad\text{unless}\quad C_j \cap C_k = \emptyset$$
and so can eliminate many of the terms in the sum over intersections in Eqn. (9.6),
rewriting it as a sum over collections of disjoint cycles:
$$N_n = |U| + \sum_{\substack{I \subseteq \{1,\dots,M\},\, I \neq \emptyset \\ C_j \cap C_k = \emptyset\; \forall j \neq k \in I}} (-1)^{|I|} \left|\,\bigcap_{j \in I} X_j\,\right|. \tag{9.7}$$

Then we can use the lemma from the previous section—Lemma 9.6, which relates
non-identity permutations to numbers of spregs containing cycles—to rewrite
Eqn. (9.7) in terms of permutations. First note that Eqn. (9.5) allows us to write
$$\left|\,\bigcap_{j \in I} X_j\,\right| = \Bigl|\Bigl\{\text{spregs containing } \bigcup_{j \in I} C_j\Bigr\}\Bigr| = \left|\,\prod_{k=1}^{n-1} \mathcal{L}_{k\sigma_I(k)}\,\right|.$$
Here σI ∈ Sn−1 is the permutation
$$\sigma_I = \prod_{j \in I} \sigma_{C_j}$$
whose cycle representation is the product of the permutations corresponding to the
directed cycles Cj for j ∈ I. In the product above σCj is the cycle permutation
corresponding to the directed cycle Cj . The correspondence here comes from the
bijection between permutations and unions of directed cycles that we discussed in
Section 8.2.
Now, again using Lemma 9.6, we have
$$\begin{aligned}
N_n &= |U| + \sum_{\substack{I \subseteq \{1,\dots,M\},\, I \neq \emptyset \\ C_j \cap C_k = \emptyset\; \forall j \neq k \in I}} (-1)^{|I|} \left|\,\prod_{j=1}^{n-1} \mathcal{L}_{j\sigma_I(j)}\,\right| \\
 &= |U| + \sum_{\substack{I \subseteq \{1,\dots,M\},\, I \neq \emptyset \\ C_j \cap C_k = \emptyset\; \forall j \neq k \in I}} \operatorname{sgn}(\sigma_I) \prod_{j=1}^{n-1} \mathcal{L}_{j\sigma_I(j)}
\end{aligned} \tag{9.8}$$
As the sum in Eqn. (9.8) ranges over all collections of disjoint cycles, the permutations
σI range over all non-identity permutations in Sn−1 and so we have
$$N_n = |U| + \sum_{\sigma \neq \operatorname{id}} \operatorname{sgn}(\sigma) \prod_{j=1}^{n-1} \mathcal{L}_{j\sigma(j)}. \tag{9.9}$$

Finally, from Eqn. (9.3) we know that
$$|U| = \bigl|\{\text{spregs containing all } v \in V \text{ with distinguished vertex } v_n\}\bigr| = \prod_{j=1}^{n-1} \deg_{\mathrm{in}}(v_j),$$

which is the term in $\det(\mathcal{L})$ corresponding to the identity permutation. Combining
this observation with Eqn. (9.9) gives us
$$N_n = \sum_{\sigma \in S_{n-1}} \operatorname{sgn}(\sigma) \prod_{j=1}^{n-1} \mathcal{L}_{j\sigma(j)} = \det(\mathcal{L}),$$
which is the result we sought.

Part III

Eulerian and Hamiltonian Graphs


Chapter 10

Eulerian Multigraphs

This section of the notes revisits the Königsberg Bridge Problem and generalises
it to explore Eulerian multigraphs: those that contain a closed walk that traverses
every edge exactly once.
Reading:
The material in today’s lecture comes from Section 1.3 of

Dieter Jungnickel (2013), Graphs, Networks and Algorithms, 4th edition,


(available online via SpringerLink),

though his proof is somewhat more terse.

10.1 Eulerian tours and trails


[Video 7.1] In Lecture 1 we used a proof by contradiction to demonstrate that there is no
solution to the Königsberg Bridge Problem, which is illustrated in Figure 10.1. That
is, it’s not possible to find a walk that (a) crosses each of the city’s seven bridges
exactly once and (b) starts and finishes in the same place. Today we’ll generalise
the problem, then find a number of equivalent conditions that tell us when the
corresponding closed walk exists.
First, recall that a multigraph G(V, E) has the same definition as a graph, except
that we allow parallel edges. That is, we allow pairs of vertices (u, v) to appear
more than once in E. Because of this, people sometimes speak of the edge list of a
multigraph, as opposed to the edge set.

Figure 10.1: We proved in the first lecture of the term that it is impossible to find
a closed walk that traverses every edge in the graph above exactly once.

The main theorem we’ll prove today relies on the following definitions:
Definition 10.1. An Eulerian trail in a multigraph G(V, E) is a trail that includes
each of the graph’s edges exactly once.
Definition 10.2. An Eulerian tour in a multigraph G(V, E) is an Eulerian trail
that starts and finishes at the same vertex. Equivalently, it is a closed trail that
traverses each of the graph’s edges exactly once.
Definition 10.3. A multigraph that contains an Eulerian tour is said to be an
Eulerian multigraph.
Armed with these, it’s then easy to formulate the following characterisation of Eu-
lerian multigraphs:
Theorem 10.4 (Jungnickel’s Theorem 1.3.1). Let G be a connected multigraph.
Then the following statements are equivalent:
(1) G is Eulerian.
(2) Each vertex of G has even degree.
(3) The edge set of G can be partitioned into cycles.
The last of these characterisations may be new to you: it means that it is possible
to arrange the edges of G into a collection of disjoint cycles. Figure 10.2 shows
an example of such a partition for a graph derived from the Königsberg Bridge
multigraph by adding two extra edges, shown in blue at left. Adding these edges
makes the graph Eulerian, and a decomposition of the edge set into cycles appears
at right. Note that undirected multigraphs can contain cycles of length two that
consist of a pair of parallel edges.
The proof of the theorem is simpler if one has the following lemma, whose proof
I’ll defer until after that of the main result. Note that the lemma, unlike the theorem,
does not require the multigraph to be connected.
Lemma 10.5 (Vertices of even degree and cycles). If G(V, E) is a multigraph with
a nonempty edge set E ̸= ∅ and the property that deg(v) is an even number for all
v ∈ V , then G contains a cycle.

Figure 10.2: The panel at left shows a graph produced by adding two edges (shown
in blue) to the graph from the Königsberg Bridge Problem. These extra edges make
the graph Eulerian and the panel at right illustrates a partition of the edge set into
cycles.

Proof of Theorem 10.4. The theorem says these statements are all “equivalent”,
which encompasses a total of six implications,¹ but we don’t need to prove all of
them: it’s sufficient to prove, say, that (1) =⇒ (2), (2) =⇒ (3) and (3) =⇒ (1).
That is, it’s sufficient to make a directed graph whose vertices are the statements
and whose edges indicate implications. If this graph is strongly connected, so that
one can get from any statement to any other by following a chain of implications,
then the result is proven.
(1) =⇒ (2):
Proof. We know G is Eulerian, so it has a closed trail that includes each edge exactly
once. Imagine that this trail is specified by the following sequence of vertices

v0 , . . . , vm = v0 (10.1)

where |E| = m and the vj are the vertices encountered along the trail, so that
some of them may appear more than once. In particular, v0 = vm because the trail
starts and finishes at the same vertex. As G is a connected multigraph, every vertex
appears somewhere in the sequence (if not, the absent vertices would have degree
zero and not be connected to any of the others).
Consider first some vertex u ̸= v0 . It must appear one or more times in the
sequence above and, each time, it appears in a pair of successive edges: if u = vj
with 0 < j < m, then these edges are (vj−1 , vj ) and (vj , vj+1 ). This means that
deg(u) is a sum of 2’s, with one term in the sum for each appearance of u in the
sequence (10.1). A similar argument applies to v0 , save that the edge that forms a
pair with (v0 , v1 ) is (vm−1 , vm = v0 ).

[Video 7.2] (2) =⇒ (3): The theorem requires this implication to hold for connected multigraphs,
but this particular result is more general and applies to any multigraph in
which all vertices have even degree. We’ll prove this stronger version by induction
on the number of edges. That is, we’ll prove:
¹ (1) =⇒ (2), (2) =⇒ (1), (1) =⇒ (3), . . .

Proposition. If G(V, E) is a multigraph (whether connected or not) in which deg(v)
is an even number for all vertices v ∈ V , then the edge set E can be partitioned into
cycles.
Proof. The base case is a multigraph with |E| = 0. Such a graph consists of one or
more isolated vertices and, as the graph has no edges, deg(v) = 0 (an even number)
for all v ∈ V and the (empty) edge set can clearly be partitioned into a union of
zero cycles.
Now suppose the result is true for every multigraph G(V, E) with |E| ≤ m0 edges
whose vertices all have even degree. Consider such a multigraph with |E| = m0 + 1:
we need to demonstrate that the edge set of such a graph can be partitioned into
cycles. We can use Lemma 10.5 to establish that we can find at least one cycle C
contained in G. And then we can form a new graph G′ (V, E ′ ) = G\C by
removing C from G. This bit of graph surgery either leaves the degree of a vertex
unchanged (if the vertex isn’t part of C) or decreases it by two, but either way, all
vertices in G′ have even degree because the corresponding vertices in G do.
The cycle C will contain at least one edge (and, unless we permit self-loops, two
or more) and so G′ will have at most m0 edges and so the inductive hypothesis will
apply to it. This means that we can partition E ′ = E\C into cycles. But then we
can add C to the partition of E ′ and so get a partition into cycles for E, completing
the inductive step and so proving our result.

(3) =⇒ (1): Here we need to establish that if the edge set of a connected multigraph
G(V, E) consists of a union of cycles, then G contains an Eulerian tour. This result is
trivial unless the partition of E involves at least two cycles, so we’ll restrict attention
to that case from now on.
The key observation is that we can always find two cycles that we can merge to
produce a single, longer closed trail that includes all the edges from the two cycles.
To see why, note that there must be a pair of cycles that share a vertex (if there
weren’t, the cycles would all lie in distinct connected components, contradicting
the connectedness of G). Suppose that the shared vertex is v⋆ and that the cycles
are C1 and C2 given by the vertex sequences
C1 = {v⋆ = v0 , v1 , . . . , vℓ1 = v⋆ } and C2 = {v⋆ = u0 , u1 , . . . , uℓ2 = v⋆ } .
We can combine them, as illustrated in Figure 10.3 to make a closed trail given by
the vertex sequence
{v⋆ = v0 , v1 , . . . , vℓ1 = v⋆ = u0 , u1 , . . . , uℓ2 = v⋆ } .
Scrupulous readers may wish to use this observation as the basis of a proof by
induction (on the number of elements in the partition of E) of the somewhat stronger
result:
Proposition 10.6. If G(V, E) is a connected graph whose edge set can be partitioned
as a union of disjoint, closed trails, then G is Eulerian.
Then, as a cycle is a special case of a closed trail, we get the desired implication
as an immediate corollary.

Figure 10.3: The key step in the proof of the implication (3) =⇒
(1) in the proof of Theorem 10.4. The cycles C1 = (v⋆ = v0 , v1 , v2 , v3 , v0 ),
whose vertices are shown in red, and C2 = (v⋆ = u0 , u1 , u2 , u3 , u4 , u5 , u6 , u7 , u0 ),
whose vertices are shown in yellow, may be merged to create the closed trail
(v⋆ = v0 , v1 , v2 , v3 , v0 = v⋆ = u0 , u1 , u2 , u3 , u4 , u5 , u6 , u7 , u0 ) indicated by the dotted
line.

I’d like to conclude by giving an algorithmic proof of Lemma 10.5. The idea is
to choose some initial vertex u0 and then construct a trail in the graph by following
one of u0 ’s edges, then one of the edges of u0 ’s successor in the trail . . . and so on
until we revisit some vertex and thus discover a cycle. Provided that we can do
as I say—always move on through the graph without ever tracing over some edge
twice—this approach is bound to work because there are only finitely many vertices.
The proof that follows formalises this approach by spelling out an explicit algorithm.
Proof of Lemma 10.5. Consider the following algorithmic process, which finds a cy-
cle in a multigraph G(V, E) for which E ̸= ∅ and deg(v) is even for all v ∈ V .
Algorithm 10.7 (Finding a cycle).
Given a multigraph G(V, E) in which |E| > 0 and all vertices have even degree,
construct a trail T given by a sequence of edges

T = {(u0 , u1 ), (u1 , u2 ), . . . , (uℓ−1 , uℓ )}

that includes a cycle.

(1) Number the vertices, so that V = {v1 , . . . , vn }. This is for the sake of con-
creteness: later in the algorithm, when we need to choose one of a set of
vertices that have a particular property, we can choose the lowest-numbered
one.
(2) Initialize some things
• Set a counter j ← 0.
• Choose the first vertex in the trail, u0 , to be the lowest-numbered vertex
vk that has deg(vk ) > 0. Such vertices exist, as we know |E| > 0.

• Initialise a list A (for “available”) of edges that we have not yet included
in T . At the outset we set A ← E as we haven’t used any edges yet.

(3) Find the edge (uj , w) ∈ A where w is the lowest-numbered neighbour of uj


whose edge we haven’t yet used. The key to the algorithm’s success is that this
step is always possible. We chose u0 with deg(u0 ) > 0, so this step is possible
when j = 0. And when j > 0, the evenness of deg(uj ) means that if the trail
we are constructing can arrive at uj , then it must also be able to depart. The
growing trail T either comes to a stop at uj (see below) or uses a pair of the
vertex’s edges—one to arrive and another to depart—and so leaves an even
number of unused edges behind.
We can thus always extend the trail T by one edge, modifying the list of unused
edges A accordingly.

• T ← T ∪ {(uj , w)}
• A ← A\{(uj , w)} (We’ve used (one copy of) the edge (uj , w)).

(4) Are we finished? Does w already appear in the trail?

• If yes, stop. The trail includes a cycle that starts and finishes at w.
• If no, set uj+1 ← w, then set j ← j + 1 and go to Step 3.

The only way this process can stop is by revisiting a vertex and it must do this
within |V | = n steps. And once we’ve revisited a vertex, we’ve found a cycle and so
are finished.
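
A direct transcription of Algorithm 10.7 into Python might look like the following sketch (my addition). The multigraph is stored as a list of edge pairs so that parallel edges remain distinct, and the function returns the trail T the moment a vertex is revisited:

def find_cycle_trail(n, edges):
    # Algorithm 10.7: edges is a non-empty list of pairs on vertices 1..n,
    # every vertex having even degree; grow a trail until a vertex repeats.
    available = list(edges)                        # the list A of unused edges
    u = min(v for v in range(1, n + 1)             # lowest-numbered vertex
            if any(v in e for e in available))     # with positive degree
    trail, visited = [], [u]
    while True:
        # Step 3: the unused edge to the lowest-numbered neighbour of u.
        w, e = min((b if a == u else a, (a, b))
                   for (a, b) in available if u in (a, b))
        trail.append((u, w))
        available.remove(e)
        if w in visited:                           # Step 4: found a cycle
            return trail
        visited.append(w)
        u = w

# Example: two triangles sharing vertex 1; every degree is even.
print(find_cycle_trail(5, [(1, 2), (2, 3), (3, 1), (1, 4), (4, 5), (5, 1)]))
# [(1, 2), (2, 3), (3, 1)]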

Chapter 11

Hamiltonian graphs and the Bondy-Chvátal Theorem

This lecture introduces the notion of a Hamiltonian graph and proves a lovely the-
orem due to J. Adrian Bondy and Vašek Chvátal that says—in essence—that if a
graph has lots of edges, then it must be Hamiltonian.
Reading:
The material in today’s lecture comes from Section 1.4 of

Dieter Jungnickel (2013), Graphs, Networks and Algorithms, 4th edition,


(available online via SpringerLink),

and is essentially an expanded version of the proof of Jungnickel’s Theorem 1.4.1.

11.1 Hamiltonian graphs


[Video 8.1] In the last lecture we characterised Eulerian graphs, which are those that have a
closed trail that includes every edge exactly once. It’s then natural to wonder about
graphs that have closed trails that include every vertex exactly once. Somewhat
surprisingly, these turn out to be much, much harder to characterise. To begin
with, let’s make some definitions that parallel those for Eulerian graphs:

Definition 11.1. A Hamiltonian path in a graph G(V, E) is a path that includes
all of the graph’s vertices.

Definition 11.2. A Hamiltonian tour or Hamiltonian cycle in a graph
G(V, E) is a cycle that includes every vertex.

Definition 11.3. A graph that contains a Hamiltonian tour is said to be a Hamiltonian
graph. Note that this implies that Hamiltonian graphs have |V | ≥ 3, as
otherwise they would be unable to contain a cycle.

Generally speaking, it’s difficult to decide whether a graph is Hamiltonian—there


are no known efficient algorithms. There are, however, some special cases that are

easy: the cycle graphs Cn consist of nothing except one big Hamiltonian tour, and
the complete graphs Kn with n ≥ 3 obviously contain the Hamiltonian cycle

(v1 , v2 , . . . , vn , v1 )

obtained by numbering the vertices and visiting them in order. We’ll spend most
of the lecture proving results that say, more-or-less, that a graph with a lot of
edges (where the point of the theorem is to make the sense of “a lot” precise) is
Hamiltonian. Two of the simplest results of this kind are:

Theorem 11.4 (Dirac1 , 1952). Let G be a graph with n ≥ 3 vertices. If each vertex
of G has deg(v) ≥ n/2, then G is Hamiltonian.

Theorem 11.5 (Ore, 1960). Let G be a graph with n ≥ 3 vertices. If

deg(u) + deg(v) ≥ n

for every pair of non-adjacent vertices u and v, then G is Hamiltonian.

Dirac’s theorem is a corollary of Ore’s, but we will not prove either of these
theorems directly. Instead, we’ll obtain both as corollaries of a more general result,
the Bondy-Chvátal Theorem. Before we can even formulate this mighty result, we
need a somewhat involved new definition: the closure of a graph.

11.2 The closure of a graph


[Video 8.2] Suppose G is a graph on n vertices. Then the closure of G, written [G], is
constructed by adding edges that connect pairs of non-adjacent vertices u and v for
which
deg(u) + deg(v) ≥ n. (11.1)
One continues recursively, adding new edges according to (11.1) until all non-
adjacent pairs u, v satisfy
deg(u) + deg(v) < n.
The graphs G and [G] have the same vertex set—I’ll call it V —but the edge set
of [G] may contain extra edges. In the next section I’ll give an explicit algorithm
that constructs the closure.

¹ This Dirac, Gabriel Andrew Dirac, was the adopted son of the Nobel prize winning theoretical
physicist Paul A. M. Dirac, and the nephew of another Nobel prize winner, the physicist and
mathematician Eugene Wigner. Wigner’s sister Margit was visiting her brother in Princeton when
she met Paul Dirac.

11.2.1 An algorithm to construct [G]
The algorithm below constructs a finite sequence of graphs

G = G1 (V, E1 ), G2 (V, E2 ), . . . , GK (V, EK ) = [G] (11.2)

that all have the same vertex set V , but different edge sets

E = E1 , E2 , . . . , EK . (11.3)

These edge sets form an increasing sequence in the sense that Ej ⊂ Ej+1 . In
fact, Ej+1 is produced by adding a single edge to Ej .
Algorithm 11.6 (Graph Closure).
Given a graph G(V, E) with vertex set V = {v1 , . . . , vn }, find [G].
(1) Set an index j to one: j ← 1,
Also set E1 to be the edge set of the original graph,
E1 ← E.

(2) Given Ej , construct Ej+1 , which contains, at most, one more edge than Ej .
Begin by setting Ej+1 ← Ej , so that Ej+1 automatically includes every edge in
Ej . Now work through every possible edge in the graph. For each one—let’s
call it e = (vr , vs )—there are three possibilities:

(i) the edge e is already present in Ej .


(ii) The edge e = (vr , vs ) is not in Ej , but the degrees of the vertices vr and
vs are low in the sense that

degGj (vr ) + degGj (vs ) < n,

where the subscript Gj is meant to show that the degree is being calculated
in the graph Gj , whose vertex set is V and whose edge set is Ej . In this
case we do not include e in Ej+1 .
(iii) the edge e = (vr , vs ) is not in Ej , but the degrees of the vertices vr and
vs are high in the sense that

degGj (vr ) + degGj (vs ) ≥ n. (11.4)

Such an edge should be part of the closure, so we set

Ej+1 ← Ej ∪ {e}.

and then jump straight to step 3 below.

(3) Decide whether to stop: ask whether we added an edge during step 2.

• If not, then stop: the closure [G] has vertex set V and edge set Ej .
• Otherwise set j ← j + 1 and go back to step (2) to try to add another
edge.
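
The following Python sketch (my addition) implements the construction. It sweeps repeatedly over all non-adjacent pairs, recomputing degrees after each added edge, which yields the same closure as the edge-at-a-time recipe above. As a small check, the closure of the 4-cycle C4 comes out as the complete graph K4 :

def closure(n, edge_list):
    # Algorithm 11.6: the closure [G] of a graph on vertices 1..n.
    E = {frozenset(e) for e in edge_list}
    added = True
    while added:
        added = False
        deg = {v: sum(1 for e in E if v in e) for v in range(1, n + 1)}
        for u in range(1, n + 1):
            for v in range(u + 1, n + 1):
                if frozenset((u, v)) not in E and deg[u] + deg[v] >= n:
                    E.add(frozenset((u, v)))   # condition (11.4) holds
                    added = True
                    break                      # recompute degrees first
            if added:
                break
    return E

C4 = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(sorted(tuple(sorted(e)) for e in closure(4, C4)))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)] — that is, K4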

Figure 11.1: The results of applying Algorithm 11.6 to the seven-vertex graph G1 .
Each round of the construction (each pass through step 2 of the algorithm) adds a
single new edge—shown with red, dotted curves—to the graph.

11.2.2 An example
Figure 11.1 shows the result of applying Algorithm 11.6 to a graph with 7 vertices.
The details of the process are discussed below.

Making G2 from G1
When constructing E2 from E1 , notice that the vertex with highest degree, v1 , has
degG1 (v1 ) = 4 and all the other vertices have lower degree. Thus, in step 2 of the
algorithm we need only think about edges connecting v1 to vertices of degree three.
There are three such vertices—v2 , v4 and v5 —but two of them are already adjacent
to v1 in G1 , so the only new edge we need to add at this stage is e = (v1 , v2 ).
Making G3 from G2
Now v1 has degG2 (v1 ) = 5, so the closure condition (11.4) says that we should
connect v1 to any vertex whose degree is two or more, which requires us to add the
edge (v1 , v3 ).
Making G4 from G3
Now v1 has degree degG3 (v1 ) = 6, so it is already connected to every other vertex
in the graph and cannot receive any new edges. Vertex v2 has degG3 (v2 ) = 4 and so
should be connected to any vertex vj with degG3 (vj ) ≥ 3. This means we need to
add the edge e = (v2 , v3 ).
Conclusion: [G] = G4
Careful study of G4 shows that the rule (11.4) will not add any further edges, so the
closure of the original graph is G4 .

Figure 11.2: The setup for the proof of the Bondy-Chvátal Theorem: adding the edge
e = (v1 , vn ) to Gj creates the Hamiltonian cycle (v1 , . . . , vn , v1 ) that’s found in Gj+1 .
The dashed lines spraying off into the middles of the diagrams are meant to indicate
that the vertices may have other edges besides those shown in black.

11.3 The Bondy-Chvátal Theorem


[Video 8.3] The point of defining the closure is that it enables us to state the following lovely result:

Theorem 11.7 (Bondy and Chvátal, 1976). A graph G is Hamiltonian if and only
if its closure [G] is Hamiltonian.

Before we prove this, notice that Dirac’s and Ore’s theorems are easy corollaries,
for when deg(v) ≥ n/2 for all vertices (Dirac’s condition) or when deg(u) + deg(v) ≥
n for all non-adjacent pairs (Ore’s condition), it’s clear that [G] is isomorphic to Kn
and, as we’ve seen, Kn is trivially Hamiltonian.
Proof. As the theorem is an if-and-only-if statement, we need to establish two things:
(1) if G is Hamiltonian then [G] is and (2) if [G] is Hamiltonian then G is too. The
first of these is easy in that the closure construction only adds edges to the graph,
so in the sequence of edge sets (11.3) G has edge set E1 and [G] has edge set EK
with K ≥ 1 and EK ⊇ E1 . This means that any edges appearing in a Hamiltonian
tour in G are automatically present in [G] too, so if G is Hamiltonian, [G] is also.
The second implication is harder and depends on an ingenious proof by contra-
diction. Assume for contradiction that G isn’t Hamiltonian, but [G] is. Now notice
that—by an argument similar to the one above—if some graph Gj⋆ in the sequence
(11.2) is Hamiltonian, then so are all the other Gj with j ≥ j⋆ . This means that if
the sequence is to begin with a non-Hamiltonian graph G = G1 and finish with a
Hamiltonian one GK = [G] there must be a single point at which the nature of the
graphs in the sequence changes. That is, there must be some j ≥ 1 such that Gj
isn’t Hamiltonian, but Gj+1 is, even though Gj+1 differs from Gj by only a single
edge. This situation is illustrated in Figure 11.2, where I have numbered the vertices
v1 . . . vn according to their position in the Hamiltonian cycle in Gj+1 and arranged
things so that the single edge whose addition creates the cycle is e = (v1 , vn ).

Figure 11.3: Here the blue vertex, vi , is in X because it is connected indirectly to
vn , through its neighbour vi−1 , while the orange vertex is in Y because it is connected
directly to v1 .

Let’s now focus on Gj and define two interesting sets of vertices

X = {vi | (vi−1 , vn ) ∈ Ej and 2 < i < n}

and
Y = {vi | (v1 , vi ) ∈ Ej and 2 < i < n}.
The first set, X, consists of those vertices vi whose neighbour vi−1 has a direct
connection to vn , while the second set, Y , consists of vertices that have a direct
connection to v1 : both sets are illustrated in Figure 11.3.
Notice that X and Y are defined to be subsets of {v3 , . . . , vn−1 }, so they exclude
v1 , v2 and vn . Thus X has degGj (vn ) − 1 members as it includes one element for
each of the neighbours of vn except for vn−1 , while |Y | = degGj (v1 ) − 1 as Y includes
all neighbours of v1 other than v2 . So then
   
$$|X| + |Y| = \left(\deg_{G_j}(v_n) - 1\right) + \left(\deg_{G_j}(v_1) - 1\right) = \deg_{G_j}(v_n) + \deg_{G_j}(v_1) - 2 \ge n - 2$$

where the inequality follows because we know

degGj (vn ) + degGj (v1 ) ≥ n

as the closure construction is going to add the edge e = (v1 , vn ) when passing
from Gj to Gj+1 . But then, both X and Y are drawn from the set of vertices
{vi | 2 < i < n} which has only n − 3 members and so, by the pigeonhole principle,
there must be some vertex vk that is a member of both X and Y .

Figure 11.4: The vertex vk is a member of X ∩ Y , which implies that there is, as
shown above, a Hamiltonian cycle in Gj .

The existence of such a vk implies the presence of a Hamiltonian tour in Gj . As
is illustrated in Figure 11.4, this tour:
• runs from v1 to vk−1 , in the same order as the tour found in Gj+1 ,
• then jumps from vk−1 to vn : this is possible because vk ∈ X.
• The tour then continues, passing from vn to vn−1 and on to vk , visiting vertices
in the opposite order from the tour in Gj+1
• and concludes with a jump from vk to v1 , which is possible because vk ∈ Y .
The existence of this tour contradicts our initial assumption that Gj is not Hamil-
tonian, but Gj+1 is. This means no such Gj can exist: the sequence of graphs in the
closure construction can never switch from non-Hamiltonian to Hamiltonian and so
if [G] is Hamiltonian, then G must be too.

11.4 Afterword
Students sometimes have trouble remembering the difference between Eulerian and
Hamiltonian graphs and I’m not unsympathetic: after all, both are named after
very famous, long-dead European mathematicians. One way out of this difficulty is
to learn more about the two men. Leonhard Euler, who was born in Switzerland,
lived longer ago (1707–1783) and was tremendously prolific, writing many hundreds
of papers that made fundamental contributions to essentially all of 18th century
mathematics. He also lived in a very alien scientific world in that he wrote his
papers in Latin and relied on royal patronage, first from the Russian emperor Peter
the Great and then, later, from Frederick the Great of Prussia and finally, toward
the end of his life, from Catherine the Great of Russia.

By contrast William Rowan Hamilton, who was Irish, lived much more re-
cently (1805–1865). He also made fundamental contributions across the whole of
mathematics—the distinction between pure and applied maths didn’t really exist
then—but he inhabited a much more recognisable scientific community, first work-
ing as a Professor of Astronomy at Trinity College in Dublin and then, for the rest
of his career, as the director of Dunsink Observatory, just outside the city.
Alternatively, one can remember the distinction between Eulerian and Hamil-
tonian tours by noting that everything about Eulerian multigraphs starts with ‘E’:
Eulerian tours go through every edge and are easy to find when every vertex has
even degree. On the other hand, Hamiltonian tours include every vertex and are
hard to find.

Part IV

Distance in Graphs and Scheduling
Chapter 12

Distance in Graphs

This lecture introduces the notion of a weighted graph and explains how some choices
of weights permit us to define a notion of distance in a graph.
Reading:
The material in this lecture comes from Chapter 3 of

Dieter Jungnickel (2013), Graphs, Networks and Algorithms, 4th edition,
which is available online via SpringerLink.

12.1 Adding weights to edges


[Video 8.4] The ideas and applications that will occupy us for the next few lectures involve
both directed and undirected graphs and will include one of the most important
applications in the course, which involves the scheduling of large complex projects.
To begin with, we introduce the notion of edge weights.

Definition 12.1. Given a graph G(V, E), which may be either directed or undirected,
we can associate edge weights with G by specifying a function w : E → R. We
will write G(V, E, w) to denote the graph G(V, E) with edge weights given by w and
we will call such a graph a weighted graph.

We will write w(a, b) to indicate the weight of the edge e = (a, b) and if G(V, E, w)
is an undirected weighted graph we will require w(a, b) = w(b, a) for all (a, b) ∈ E.
Note that Definition 12.1 allows the weights to be negative or zero. That’s
because, as we’ll see soon, the weights can represent many things. If the vertices
represent places, then we could define a weight function w so that, for an edge
e = (a, b) ∈ E, the weight w(e) is:

• the distance from a to b;

• the time it takes to travel from a to b, in which case it may happen that
w(a, b) ̸= w(b, a);

Figure 12.1: In the graph at left there are no walks from a to b and so, by
convention, we define d(a, b) = ∞. The graph at right, which has edge weights
as indicated, illustrates a more serious problem. The cycle specified by the vertex
sequence (x, y, z, x) has negative weight and so there is no minimal-weight path from
a to b and hence no well-defined distance d(a, b).

• the profit made when we send a shipping container from a to b. This could
easily be negative if we had to bring an empty container back from someplace
we’d sent a shipment.
In any case, once we’ve defined weights for edges, it’s natural to define the weight
of a walk as follows.
Definition 12.2. Given a weighted graph G(V, E, w) and a walk from a to b defined
by the vertex sequence
$$a = v_0, \dots, v_\ell = b,$$
so that its edges are ej = (vj−1 , vj ), then the weight of the walk is
$$\sum_{j=1}^{\ell} w(e_j).$$

12.2 A notion of distance


Given two vertices a and b in a weighted graph G(V, E, w), we might try to define
a distance d(a, b) from a to b as

d(a, b) = min {w(ω) | ω is a walk from a to b} ,

but two issues, both of which are illustrated in Figure 12.1, present themselves
immediately:
(1) What if there aren’t any walks from a to b? In this case, by convention, we
define d(a, b) = ∞.

(2) What if some cycle in G has negative weight? As we will see below, this leads
to insurmountable problems and so we’ll just have to exclude this possibility.
The problem with cycles of negative weight is illustrated at the right in Fig-
ure 12.1. The graph has V = {a, x, y, z, b} and edge weights

w(a, x) = w(x, y) = w(y, z) = w(z, b) = 1 and w(x, z) = −5.

The cycle specified by the vertex sequence (z, x, y, z) thus has weight
w(z, x) + w(x, y) + w(y, z) = −5 + 1 + 1 = −3
and one can see that this presents a problem for our definition of d(a, b) by consid-
ering the sequence of walks:
Walk                                                          Weight
(a, x, y, z, b)                                               1 + 1 + 1 + 1 = 4
(a, x, y, z, x, y, z, b)                                      4 + (−5 + 1 + 1) = 1
(a, x, y, z, x, y, z, x, y, z, b)                             4 − 2 × 3 = −2
⋮                                                             ⋮
(a, x, y, z, . . . , x, y, z, b) (k times around the cycle)   4 − k × 3 = 4 − 3k

There is no walk of minimal weight from a to b: one can always find a walk of lower
weight by tracing over the negative-weight cycle a few more times. We could escape
this problem by defining d(a, b) as the weight of a minimal-weight path1 , but instead
we will exclude the problematic cases explicitly:
Definition 12.3. Suppose G(V, E, w) is a weighted graph that does not contain any
cycles of negative weight. For vertices a and b we define the distance function
d : V × V → R as follows:
• d(a, a) = 0 for all a ∈ V ;
• d(a, b) = ∞ if there is no walk from a to b;
• d(a, b) is the weight of a minimal-weight walk from a to b when such walks
exist.

A warning
The word “distance” in Definition 12.3 is potentially misleading in that it is perfectly
possible to find weighted graphs in which d(a, b) < 0 for some (or even all) a and
b. Further, it’s possible that in a directed graph there may be vertices a and b such
that d(a, b) ̸= d(b, a). If we want our distance function to have the all the properties
that the word “distance” normally suggests, it’s helpful to recall (or learn for the
first time) the definition of a metric on a set X. It’s a function d : X × X → R with
the following properties:
Non-negativity: d(x, y) ≥ 0 ∀x, y ∈ X and d(x, y) = 0 ⇐⇒ x = y;
Symmetry: d(x, y) = d(y, x) ∀x, y ∈ X;
Triangle inequality: d(x, y) + d(y, z) ≥ d(x, z) ∀x, y, z ∈ X.
If d is a metric on X we say that the pair (X, d) constitute a metric space.
It’s not hard to prove (see the Problem Sets) that if G(V, E, w) is a weighted,
undirected graph in which w(e) > 0 ∀e ∈ E, then the function d : V × V → R from
Definition 12.3 is a metric on the vertex set V .
¹ A path cannot revisit a vertex and hence cannot trace over a cycle.

Figure 12.2: A SSSP problem in which all the edge weights are 1 and the source
vertex s is shown in yellow.

12.3 Shortest path problems


Video Once one has a definition for distance in a weighted graph G(V, E, w), two natural
8.5 problems present themselves:
Single-Source Shortest Path (SSSP): Given a vertex s—the so-called source
vertex—compute d(s, v) ∀v ∈ V .

All-Pairs Shortest Path: Compute d(u, v) ∀u, v ∈ V .


We will develop algorithms to solve the first of these, but not the second. Of
course, if one has an algorithm for SSSP, one can also solve the second by applying
the SSSP algorithm with each vertex as the source, though there are more efficient
approaches as well.

12.3.1 Uniform weights & Breadth First Search


The simplest SSSP problems are those in undirected weighted graphs where all
edges have the same weight, say, w(e) = 1 ∀e ∈ E. In this case one can use an
algorithm called Breadth First Search (BFS), which is one of the fundamental tools
of algorithmic graph theory. I’ll present the algorithm twice, once informally, by
way of an example, and then again in an appendix that’s not examinable, in a
sufficiently detailed way that one could, for example, implement it in MATLAB. As
we’re working on a single-source problem, it’s convenient to define

d(v) ≡ d(s, v),

where s is the source vertex. Our goal is then to compute d(v) for all vertices in the
graph.
To illustrate the main ideas, we’ll use BFS to compute d(v) for all the vertices
in the graph pictured in Figure 12.2:
• Set d(s) = 0.

• Set d(v) = 1 for all s’s neighbours. That is, set d(v) = 1 for all vertices
v ∈ As = {u, w}.

• Set d(v) = 2 for those vertices that (a) are adjacent to vertices t with d(t) = 1
and (b) have not yet had a value for d(v) assigned.

Figure 12.3: The leftmost graph shows the result of the first two stages of the
informal BFS algorithm: we set d(s) = 0 and d(v) = 1 for all v ∈ As . In the
second stage we set d(v) = 2 for neighbours of vertices t with d(t) = 1 · · · and so
on.

• Set d(v) = 3 for those vertices that (a) are adjacent to vertices t with d(t) = 2
and (b) have not yet had a value of d(v) assigned.

This process is illustrated in Figure 12.3 and, for the current example, these four
steps assign a value of d(v) to every vertex. In Section 12.4 we will return to this
algorithm and rewrite it in a more general way, but I’d like to conclude this section
by discussing why this approach works.

12.3.2 Bellman’s equations


[Video 8.6] BFS, and indeed all the shortest-path algorithms we’ll study, work because of a
characterisation of minimum-weight walks due to Richard Bellman. Suppose G(V, E, w)
is a weighted graph (either directed or undirected) on |V | = n vertices. Specify a
single-source shortest path problem by numbering the vertices so that the source
vertex s comes first, v1 = s, and assemble the edge weights into an n × n matrix w
whose entries are given by
$$w_{k,j} = \begin{cases} w(v_k, v_j) & \text{if } (v_k, v_j) \in E \\ \infty & \text{otherwise.} \end{cases}$$

Then we have the following theorem, which captures the idea that a minimal-
weight path from v1 to vj consists of a minimal-weight path from v1 to one of
vj ’s neighbours—say vk such that (vk , vj ) ∈ E—followed by a final step from vk to
vj .

Theorem 12.4 (Bellman’s equations). The quantities uj = d(v1 , vj ) satisfy the
equations
$$u_1 = 0 \quad\text{and}\quad u_j = \min_{k \neq j}\,(u_k + w_{k,j}) \;\;\text{for } 2 \le j \le n. \tag{12.1}$$
Further, if all cycles in G(V, E, w) have positive weight, then the equations (12.1)
have a unique solution.
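
Bellman’s equations can be solved by repeated substitution: start from u1 = 0 and uj = ∞ for j ≥ 2, then apply the update in (12.1) until nothing changes. Here is a minimal Python sketch (my addition; the four-vertex weighted digraph is invented purely for illustration):

from math import inf

# w[k][j] is the weight of edge (vk, vj), or inf if the edge is absent.
w = [[inf, 4, 2, inf],
     [inf, inf, inf, 3],
     [inf, 1, inf, 6],
     [inf, inf, inf, inf]]
n = len(w)

u = [0] + [inf] * (n - 1)
for _ in range(n - 1):  # n - 1 sweeps suffice when all cycles have positive weight
    for j in range(1, n):
        u[j] = min([u[k] + w[k][j] for k in range(n) if k != j] + [u[j]])
print(u)  # [0, 3, 2, 6]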

12.4 Appendix: BFS revisited
The last two steps of our informal introduction to BFS had the general form
• Set d(v) = j + 1 for those vertices that (a) are adjacent to vertices t with
d(t) = j and (b) have not yet had a value of d(v) assigned.
The main technical problem in formalising the algorithm is to find a systematic way
of working our way outward from the source vertex. A data structure from computer
science called a queue provides an elegant solution. It’s an ordered list that we’ll
write from left-to-right, so that a queue containing vertices might look like

Q = {x, z, u, w, . . . , a},

where x is the first entry in the queue and a is the last.


Just as in a well-behaved bus or bakery queue, we “serve” the vertices in order
(left to right) and require that any new items added to the queue get added at the
end (the right). There are two operations that one can perform on a queue:
push: add a new entry onto the end of the queue (i.e. at the right);

pop: remove the first (i.e. leftmost) entry and, typically, do something with it.
Thus if our queue is Q = {b, c, a} and we push a vertex x onto it the result is
$$\{b, c, a\} \xrightarrow{\;\text{push } x \text{ onto } Q\;} \{b, c, a, x\},$$
while if we pop the queue we get
$$\{b, c, a\} \xrightarrow{\;\text{pop } Q\;} \{c, a\}.$$
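
In Python, for example, the collections.deque type provides exactly these two operations (a small illustration, my addition):

from collections import deque

Q = deque(["b", "c", "a"])
Q.append("x")           # push x onto the right end
first = Q.popleft()     # pop the left end; first == "b"
print(first, list(Q))   # b ['c', 'a', 'x']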

We can use this idea to organise the order in which we visit the vertices in BFS.
Our goal will be to compute d(v) = d(s, v) for all vertices in the graph and we’ll
start by setting d(s) = 0 and

d(v) = ∞ ∀v ̸= s.

This has two advantages: first, d(v) = ∞ is the correct value for any vertex that
is not reachable from s and second, it serves as a way to indicate that, as far as
our algorithm is concerned, we have yet to visit vertex v and so d(v) is yet-to-be-
determined.
We’ll then work our way through the vertices that lie in the same connected
component as s by
• pushing a vertex v onto the end of the queue whenever we set d(v), beginning
with the source vertex s and

• popping vertices off the queue in turn, working through the adjacency list of
the popped vertex u and examining its neighbours w ∈ Au in turn, setting
d(w) = d(u) + 1 whenever d(w) is currently marked as yet-to-be-determined.

Algorithm 12.5 (BFS for SSSP). Given an undirected graph G(V, E) and a dis-
tinguished source vertex s ∈ V , assume uniform edge weights w(e) = 1 ∀e ∈ E and
find the distances d(v) = d(s, v) for all v ∈ V .

(1) Set things up:

    d(v) ← ∞ ∀v ≠ s ∈ V    (set d(v) = ∞ to indicate that it is yet-to-be-determined)
    d(s) ← 0               (we know d(s, s) = 0)
    Q ← {s}                (get ready to process s’s neighbours)

(2) Main loop: continues until the queue is empty

    While ( Q ≠ ∅ ) {
        Pop a vertex u off the left end of Q.
        For each w ∈ Au {            (examine each of u’s neighbours)
            If ( d(w) = ∞ ) then {   (set d(w) and get ready to process w’s neighbours)
                d(w) ← d(u) + 1
                Push w onto the right end of Q.
            }
        }
    }
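
For concreteness, here is a Python transcription of Algorithm 12.5 (my addition), with the queue realised as a collections.deque; running it on the graph of Figure 12.4 reproduces the distances recorded in Table 12.1:

from collections import deque
from math import inf

def bfs_sssp(adj, s):
    # Algorithm 12.5: d(s, v) for all v, assuming unit edge weights.
    d = {v: inf for v in adj}   # inf marks d(v) as yet-to-be-determined
    d[s] = 0
    Q = deque([s])
    while Q:
        u = Q.popleft()
        for w in adj[u]:
            if d[w] == inf:
                d[w] = d[u] + 1
                Q.append(w)
    return d

# Adjacency lists of the graph in Figure 12.4, as used in Table 12.1.
adj = {"S": ["A", "C", "G"], "A": ["B", "S"], "B": ["A"],
       "C": ["D", "E", "F", "S"], "D": ["C"], "E": ["C", "H"],
       "F": ["C", "G"], "G": ["F", "H", "S"], "H": ["E", "G"]}
print(bfs_sssp(adj, "S"))
# {'S': 0, 'A': 1, 'B': 2, 'C': 1, 'D': 2, 'E': 2, 'F': 2, 'G': 1, 'H': 2}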

Figure 12.4 illustrates the early stages of applying the algorithm to a small graph,
while Table 12.1 provides a complete account of the computation.

Remarks
• If d(u) = ∞ when the algorithm finishes, then u and s lie in separate connected
components.

• Because the computation works through adjacency lists, each edge gets con-
sidered at most twice and so the algorithm requires O(|E|) steps, where a step
consists of checking whether d(u) = ∞ and, if so, updating its value.

• It is possible to prove, by induction on the lengths of the shortest paths, that
BFS really does compute the distance d(s, v): interested readers should see
Jungnickel’s Theorem 3.3.2.

u w ∈ Au Action Resulting Queue
– – Start {S}
S A set d(A) = 1 and push A {A}
S C set d(C) = 1 and push C {A, C}
S G set d(G) = 1 and push G {A, C, G}
A B set d(B) = 2 and push B {C, G, B}
A S none, as d(S) = 0 {C, G, B}
C D set d(D) = 2 and push D {G, B, D}
C E set d(E) = 2 and push E {G, B, D, E}
C F set d(F ) = 2 and push F {G, B, D, E, F }
C S none {G, B, D, E, F }
G F none {B, D, E, F }
G H set d(H) = 2 and push H {B, D, E, F, H}
G S none {B, D, E, F, H}
B A none {D, E, F, H}
D C none {E, F, H}
E C none {F, H}
E H none {F, H}
F C none {H}
F G none {H}
H E none {}
H G none {}

Table 12.1: A complete record of the execution of the BFS algorithm for the graph
in Figure 12.4. Each row corresponds to one pass through the innermost loop of
Algorithm 12.5, those steps that check whether d(w) = ∞ and act accordingly.
The table is divided into sections—separated by horizontal lines—within which the
algorithm works through the adjacency list of the most recently-popped vertex u.

[Figure: two snapshots of the graph on vertices S, A, B, C, D, E, F, G and H used in the BFS example; see the caption below.]

Figure 12.4: In the graphs above the source vertex s is shown in yellow and a
vertex v is shown with a number on it if, at that stage in the algorithm, d(v) has
been determined (that is, if d(v) ̸= ∞). The graph at left illustrates the state of
the computation just after the initialisation: d(s) has been set to d(s) = 0, all other
vertices have d(v) = ∞ and the queue is Q = {s}. The graph at right shows the
state of the computation after we have popped s and processed its neighbours: d(v)
has been determined for A, C and G and they have been pushed onto the queue,
which is now Q = {A, C, G}.

Chapter 13

Tropical Arithmetic and Shortest Paths

This lecture introduces tropical arithmetic¹ and explains how to use it to calculate
the lengths of all the shortest paths in a graph.
Reading:
The material here is not discussed in any of the main references for the course. The
lecture is meant to be self-contained, but if you find yourself intrigued by tropical
mathematics, you might want to look at a recent introductory article

David Speyer and Bernd Sturmfels (2004), Tropical mathematics. Lec-


ture notes from a Clay Mathematics Institute Senior Scholar Lecture,
Park City, Utah, 22 July 2004. Available as preprint 0408099 from the
arXiv preprint repository.

Very keen, mathematically sophisticated readers might also enjoy

Diane Maclagan and Bernd Sturmfels (2015), Introduction to Tropi-


cal Geometry, Vol. 161 of Graduate Studies in Mathematics, American
Mathematical Society, Providence, RI. ISBN: 978-0-8218-5198-2

while those interested in applications might prefer

B. Heidergott, G. Olsder, and J. van der Woude (2006), Max Plus at


Work, Vol. 13 of the Princeton Series in Applied Mathematics, Princeton
Univ Press. ISBN: 978-0-6911-1763-8.

The book by Heidergott et al. includes a tropical model of the Dutch railway network
and is more accessible than either the book by Maclagan and Sturmfels or the latter
parts of the article by Speyer and Sturmfels.

¹ Maclagan & Sturmfels write: The adjective “tropical” was coined by French mathematicians,
notably Jean-Eric Pin, to honour their Brazilian colleague Imre Simon, who pioneered the use of
min-plus algebra in optimisation theory. There is no deeper meaning to the adjective “tropical”.
It simply stands for the French view of Brazil.

13.1 All pairs shortest paths
Video 9.1
In a previous lecture we used Breadth First Search (BFS) to solve the single-source
shortest paths problem in a weighted graph G(V, E, w) where the weights are trivial
in the sense that w(e) = 1 ∀e ∈ E. Today we’ll consider the problem where the
weights can vary from edge to edge, but are constrained so that all cycles have
positive weight. This ensures that Bellman’s equations have a unique solution. Our
approach to the problem depends on two main ingredients: a result about powers
of the adjacency matrix and a novel kind of arithmetic.

13.2 Counting walks using linear algebra


Our main result is very close in spirit to the following, simpler one.
Theorem 13.1 (Powers of the adjacency matrix count walks). Suppose G(V, E) is
a graph (directed or undirected) on n = |V | vertices and that A is its adjacency
matrix. If we define A^ℓ, the ℓ-th matrix power of A, by

A^{ℓ+1} = A^ℓ A   and   A^0 = I_n,

where ℓ ∈ N, then for ℓ > 0,

A^ℓ_{ij} = the number of walks of length ℓ from vertex i to vertex j,        (13.1)

where A^ℓ_{ij} is the i, j entry in A^ℓ.
Proof. We'll prove this by induction on ℓ, the number of edges in the walk. The
base case is ℓ = 1 and so A^ℓ = A^1 = A and A_{ij} certainly counts the number of
one-step walks from vertex i to vertex j: there is either exactly one such walk, or
none.

Now suppose the result is true for all ℓ ≤ ℓ₀ and consider

A^{ℓ₀+1}_{ij} = ∑_{k=1}^{n} A^{ℓ₀}_{ik} A_{kj}.

The only nonzero entries in this sum appear for those values of k for which both
A^{ℓ₀}_{ik} and A_{kj} are nonzero. Now, the only possible nonzero value for A_{kj} is 1, which
happens when the edge (k, j) is present in the graph. Thus we could also think of
the sum above as running over vertices k such that the edge (k, j) is in E:

A^{ℓ₀+1}_{ij} = ∑_{{k | (k,j) ∈ E}} A^{ℓ₀}_{ik}.

By the inductive hypothesis, A^{ℓ₀}_{ik} is the number of distinct, length-ℓ₀ walks from i to
k. And if we add the edge (k, j) to the end of such a walk, we get a walk from i to
j. All the walks produced in this way are clearly distinct (those that pass through
different intermediate vertices k are obviously distinct and even those that have
the same k are, by the inductive hypothesis, different somewhere along the i to k
segment). Further, every walk of length ℓ₀ + 1 from i to j must consist of a length-ℓ₀
walk from i to some neighbour or predecessor k of j, followed by a final step from k
to j, so we have completed the inductive argument and proved the result.

[Figure: G, the path graph with edges {1, 2} and {2, 3} (left), and H, the directed graph with edges (1, 2) and (2, 3) (right).]

Figure 13.1: In G, the graph at left, any walk from vertex 1 to vertex 3 must have
even length while in H, the directed graph at right, there are no walks of length 3 or
greater.

Two examples
The graph at left in Figure 13.1 contains six walks of length 2. If we represent them
with vertex sequences they’re

(1, 2, 1), (1, 2, 3), (2, 1, 2), (2, 3, 2), (3, 2, 1), and (3, 2, 3), (13.2)

while the first two powers of A_G, G's adjacency matrix, are

        ⎛ 0 1 0 ⎞             ⎛ 1 0 1 ⎞
A_G  =  ⎜ 1 0 1 ⎟     A_G² =  ⎜ 0 2 0 ⎟        (13.3)
        ⎝ 0 1 0 ⎠             ⎝ 1 0 1 ⎠

Comparing these we see that the computation based on powers of A_G agrees with
the list of walks, just as Theorem 13.1 leads us to expect:

• A²_{1,1} = 1 and there is a single walk, (1, 2, 1), from vertex 1 to itself;

• A²_{1,3} = 1: counts the single walk, (1, 2, 3), from vertex 1 to vertex 3;

• A²_{2,2} = 2: counts the two walks from vertex 2 to itself, (2, 1, 2) and (2, 3, 2);

• A²_{3,1} = 1: counts the single walk, (3, 2, 1), from vertex 3 to vertex 1;

• A²_{3,3} = 1: counts the single walk, (3, 2, 3), from vertex 3 to itself.
Something similar happens for the directed graph H that appears at right in Fig-
ure 13.1, but it has only a single walk of length two and none at all for lengths three
or greater.

        ⎛ 0 1 0 ⎞          ⎛ 0 0 1 ⎞          ⎛ 0 0 0 ⎞
A_H  =  ⎜ 0 0 1 ⎟   A_H² = ⎜ 0 0 0 ⎟   A_H³ = ⎜ 0 0 0 ⎟        (13.4)
        ⎝ 0 0 0 ⎠          ⎝ 0 0 0 ⎠          ⎝ 0 0 0 ⎠

An alternative to BFS
The theorem we’ve just proved suggests a way to find all the shortest paths in the
special case where w(e) = 1 ∀e ∈ E. Of course, in this case the weight of a path is
the same as its length.
(1) Observe that a shortest path has length at most n − 1.

(2) Compute the sequence of powers of the adjacency matrix A, A2 , · · · , An−1 .

(3) To find the length of a shortest path from vi to vj , look through the sequence
of matrix powers and find the smallest ℓ such that Aℓij > 0. This ℓ is the
desired length.
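As a quick sanity check on this strategy, here is a small NumPy sketch (an illustration of mine; the function name is hypothetical) that applies steps (1)–(3) to the graph G of Figure 13.1:

    import numpy as np

    def unit_weight_distances(A):
        """Shortest-path lengths from powers of the adjacency matrix A."""
        n = A.shape[0]
        D = np.full((n, n), np.inf)
        np.fill_diagonal(D, 0)            # d(v, v) = 0
        P = np.eye(n, dtype=int)          # P holds A^ell as ell increases
        for ell in range(1, n):           # a shortest path has at most n - 1 edges
            P = P @ A
            first_reached = (P > 0) & np.isinf(D)
            D[first_reached] = ell        # smallest ell with (A^ell)_{ij} > 0
        return D

    A_G = np.array([[0, 1, 0],
                    [1, 0, 1],
                    [0, 1, 0]])
    print(unit_weight_distances(A_G))     # [[0. 1. 2.] [1. 0. 1.] [2. 1. 0.]]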

In the rest of the lecture we’ll generalise this strategy to graphs with arbitrary
weights.

13.3 Tropical arithmetic


Video 9.2
Tropical arithmetic acts on the set R ∪ {∞} and has two binary operations, ⊕ and ⊗, defined by

x ⊕ y = min(x, y)   and   x ⊗ y = x + y        (13.5)
where, in the definition of ⊗, x + y means ordinary addition of real numbers sup-
plemented by the extra rule that x ⊗ ∞ = ∞ for all x ∈ R ∪ {∞}. These novel
arithmetic operators have many of the properties familiar from ordinary arithmetic.
In particular, they are commutative. For all a, b ∈ R ∪ {∞} we have both

a ⊕ b = min(a, b) = min(b, a) = b ⊕ a

and
a ⊗ b = a + b = b + a = b ⊗ a.
The tropical arithmetic operators are also associative:

a ⊕ (b ⊕ c) = min(a, min(b, c)) = min(a, b, c) = min(min(a, b), c) = (a ⊕ b) ⊕ c

and
a ⊗ (b ⊗ c) = a + (b + c) = a + b + c = (a + b) + c = (a ⊗ b) ⊗ c.
Also, there are distinct additive and multiplicative identity elements. For all
a ∈ R ∪ {∞} we have

a ⊕ ∞ = min(a, ∞) = a and a ⊗ 0 = 0 + a = a.

Finally, the multiplication is distributive: For all a, b, c ∈ R ∪ {∞}

a ⊗ (b ⊕ c) = a + min(b, c) = min(a + b, a + c) = (a ⊗ b) ⊕ (a ⊗ c).

There are, however, important differences between tropical and ordinary arithmetic.
In particular, there are no additive inverses2 in tropical arithmetic and so one cannot
always solve linear equations. For example, there is no x ∈ R ∪ {∞} such that
(2 ⊗ x) ⊕ 5 = 11. To see why, rewrite the equation as follows:

(2 ⊗ x) ⊕ 5 = (2 + x) ⊕ 5 = min(2 + x, 5) ≤ 5.
² Students who did Algebraic Structures II might recognise that this collection of properties
means that tropical arithmetic over R ∪ {∞} is a semiring.
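A tiny Python sketch (mine, not from the lecture or its references) makes these scalar properties easy to experiment with:

    INF = float('inf')

    def t_add(x, y):    # x ⊕ y = min(x, y)
        return min(x, y)

    def t_mul(x, y):    # x ⊗ y = x + y, with x ⊗ ∞ = ∞ automatically
        return x + y

    # ∞ is the additive identity and 0 the multiplicative one
    assert t_add(7, INF) == 7 and t_mul(7, 0) == 7
    # distributivity: a ⊗ (b ⊕ c) = (a ⊗ b) ⊕ (a ⊗ c)
    a, b, c = 2.0, 5.0, INF
    assert t_mul(a, t_add(b, c)) == t_add(t_mul(a, b), t_mul(a, c))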

13.3.1 Tropical matrix operations
Given two m × n matrices A and B whose entries are drawn from R ∪ {∞}, we’ll
define the tropical matrix sum A ⊕ B by:

(A ⊕ B)ij = Aij ⊕ Bij = min(Aij , Bij )

And for compatibly-shaped tropical matrices A and B we can also define the tropical
matrix product by
(A ⊗ B)_{ij} = ⊕_{k=1}^{n} A_{ik} ⊗ B_{kj} = min_{1≤k≤n} (A_{ik} + B_{kj}).

Finally, if B is an n × n square matrix, we can define tropical matrix powers as follows:

B^{⊗(k+1)} = B^{⊗k} ⊗ B   and   B^{⊗0} = Î_n,        (13.6)

where Î_n is the n × n tropical identity matrix,

        ⎛ 0  ∞  ⋯  ∞ ⎞
        ⎜ ∞  0  ⋯  ∞ ⎟
Î_n  =  ⎜ ⋮  ⋮  ⋱  ⋮ ⎟        (13.7)
        ⎝ ∞  ∞  ⋯  0 ⎠

It has zeroes on the diagonal and ∞ everywhere else. It's easy to check that if A is
an m × n tropical matrix then

Î_m ⊗ A = A = A ⊗ Î_n.

Example 13.2 (Tropical matrix operations). If we define two tropical matrices A and B by

       ⎛ 1  2 ⎞              ⎛ ∞  1 ⎞
  A =  ⎝ 0  ∞ ⎠    and  B =  ⎝ 1  ∞ ⎠

then

         ⎛ 1 ⊕ ∞   2 ⊕ 1 ⎞   ⎛ min(1, ∞)  min(2, 1) ⎞   ⎛ 1  1 ⎞
A ⊕ B =  ⎝ 0 ⊕ 1   ∞ ⊕ ∞ ⎠ = ⎝ min(0, 1)  min(∞, ∞) ⎠ = ⎝ 0  ∞ ⎠

and

         ⎛ (1 ⊗ ∞) ⊕ (2 ⊗ 1)   (1 ⊗ 1) ⊕ (2 ⊗ ∞) ⎞
A ⊗ B =  ⎝ (0 ⊗ ∞) ⊕ (∞ ⊗ 1)   (0 ⊗ 1) ⊕ (∞ ⊗ ∞) ⎠

         ⎛ min(1 + ∞, 2 + 1)   min(1 + 1, 2 + ∞) ⎞
      =  ⎝ min(0 + ∞, ∞ + 1)   min(0 + 1, ∞ + ∞) ⎠

         ⎛ 3  2 ⎞
      =  ⎝ ∞  1 ⎠ .
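Purely as an illustration (the function names are my own), these operations are easy to express in NumPy, and the sketch below reproduces Example 13.2:

    import numpy as np

    inf = np.inf

    def tropical_add(A, B):
        """Tropical matrix sum: the entrywise minimum."""
        return np.minimum(A, B)

    def tropical_matmul(A, B):
        """Tropical matrix product: (A ⊗ B)[i, j] = min_k (A[i, k] + B[k, j])."""
        return np.min(A[:, :, None] + B[None, :, :], axis=1)

    A = np.array([[1.0, 2.0], [0.0, inf]])
    B = np.array([[inf, 1.0], [1.0, inf]])
    print(tropical_add(A, B))     # [[1. 1.] [0. inf]]
    print(tropical_matmul(A, B))  # [[3. 2.] [inf 1.]]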

13.3.2 A tropical version of Bellman’s equations
Video 9.3
Recall Bellman's equations from Section 12.3.2. Given a weighted graph G(V, E, w)
in which all cycles have positive weight, we can find u_j = d(v_1, v_j) by solving the
system

u_1 = 0   and   u_j = min_{k≠j} (u_k + w_{k,j})   for 2 ≤ j ≤ n,

where w_{k,j} is an entry in a weight matrix w given by

w_{k,j} = ⎧ w(v_k, v_j)   if (v_k, v_j) ∈ E
          ⎩ ∞             otherwise.        (13.8)

We can rewrite Bellman's equations using tropical arithmetic:

u_j = min_{k≠j} (u_k + w_{k,j}) = min_{k≠j} (u_k ⊗ w_{k,j}) = ⊕_{k≠j} u_k ⊗ w_{k,j},

which looks almost like the tropical matrix product u ⊗ w: we’ll exploit this obser-
vation in the next section.

13.4 Minimal-weight paths in a tropical style


We’ll now return to the problem of finding the weights of all the minimal-weight
paths in a weighted graph. The calculations are very similar to those in Section 13.2,
but now we'll take tropical powers of a weight matrix W whose entries are given by

          ⎧ 0             if j = k
W_{k,j} = ⎨ w(v_k, v_j)   if (v_k, v_j) ∈ E        (13.9)
          ⎩ ∞             otherwise.

Note that W is very similar to the matrix w defined by Eqn. (13.8): the two differ
only along the diagonal, where w_{ii} = ∞ for all i, while W_{ii} = 0.

Lemma 13.3. Suppose G(V, E, w) is a weighted graph (directed or undirected) on
n vertices. If all the cycles in G have positive weight and a matrix W is defined as
in Eqn. (13.9), then the entries in W^{⊗ℓ}, the ℓ-th tropical power of W, are such that

W^{⊗ℓ}_{ii} = 0   for all i

and, for i ≠ j, either

W^{⊗ℓ}_{ij} = the weight of a minimal-weight walk from v_i to v_j containing at most ℓ edges

when G contains such a walk, or W^{⊗ℓ}_{ij} = ∞ if no such walks exist.

Proof. We proceed by induction on ℓ. The base case is ℓ = 1 and it's clear that the
only length-one walks are the edges themselves, while W_{ii} = 0 by construction.

Now suppose the result is true for all ℓ ≤ ℓ₀ and consider the case ℓ = ℓ₀ + 1.
We will first prove the result for the off-diagonal entries, those for which i ≠ j. For
these entries we have

W^{⊗(ℓ₀+1)}_{i,j} = ⊕_{k=1}^{n} W^{⊗ℓ₀}_{i,k} ⊗ W_{k,j} = min_{1≤k≤n} (W^{⊗ℓ₀}_{i,k} + W_{k,j})        (13.10)

and the inductive hypothesis says that W^{⊗ℓ₀}_{i,k} is either the weight of a minimal-weight
walk from v_i to v_k containing ℓ₀ or fewer edges or, if no such walks exist, W^{⊗ℓ₀}_{i,k} = ∞.
W_{k,j} is given by Eqn. (13.9) and so there are three possibilities for the terms

W^{⊗ℓ₀}_{i,k} + W_{k,j}        (13.11)
that appear in the tropical sum (13.10):
(1) They are infinite for all values of k, and so direct calculation gives W^{⊗(ℓ₀+1)}_{i,j} = ∞.
This happens when, for each k, we have one or both of the following:

    • W^{⊗ℓ₀}_{i,k} = ∞, in which case the inductive hypothesis says that there are
      no walks of length ℓ₀ or less from v_i to v_k, or

    • W_{k,j} = ∞, in which case there is no edge from v_k to v_j.

And since this is true for all k, it implies that there are no walks of length
ℓ₀ + 1 or less that run from v_i to v_j. Thus the lemma holds when i ≠ j and
W^{⊗(ℓ₀+1)}_{i,j} = ∞.
(2) The expression in (13.11) is finite for at least one value of k, but not for k = j.
Then as W_{j,j} = 0 by construction, we know W^{⊗ℓ₀}_{i,j} = ∞ and so there are no
walks of length ℓ₀ or less running from v_i to v_j. Further,

    W^{⊗(ℓ₀+1)}_{i,j} = min_{k≠j} (W^{⊗ℓ₀}_{i,k} + W_{k,j})
                     = min_{{k | (v_k,v_j) ∈ E}} (W^{⊗ℓ₀}_{i,k} + w(v_k, v_j)),        (13.12)

and reasoning such as we used when discussing Bellman's equations (a minimal-weight
walk from v_i to v_j consists of a minimal-weight walk from v_i to some
neighbour or, in a digraph, some predecessor v_k of v_j, plus the edge (v_k, v_j))
means that (13.12) gives the weight of a minimal-weight walk of length
ℓ₀ + 1 and so the lemma holds here too.
(3) The expression in (13.11) is finite for the case k = j and perhaps also for some
k ≠ j. When k = j we have

    W^{⊗ℓ₀}_{i,k} + W_{k,j} = W^{⊗ℓ₀}_{i,j} + W_{j,j} = W^{⊗ℓ₀}_{i,j} + 0 = W^{⊗ℓ₀}_{i,j},

which, by the inductive hypothesis, is the weight of a minimal-weight walk of
length ℓ₀ or less. If there are other values of k for which (13.11) is finite, then
they give rise to a sum over neighbours (or, if G is a digraph, over predecessors)
such as (13.12), that computes the weight of the minimal-weight walk of length
ℓ₀ + 1. The minimum of this quantity and W^{⊗ℓ₀}_{i,j} is then the minimal weight
for any walk involving ℓ₀ + 1 or fewer edges and so the lemma holds in this
case too.

Finally, note that the reasoning above works for W^{⊗ℓ}_{ii} too: W^{⊗ℓ}_{ii} is the weight of a
minimal-weight walk from v_i to itself. And given that any walk from v_i to itself
must contain a cycle and that all cycles have positive weight, we can conclude that
the tropical sum

W^{⊗(ℓ₀+1)}_{i,i} = min_k (W^{⊗ℓ₀}_{i,k} + W_{k,i})

is minimised by k = i, when W^{⊗ℓ₀}_{i,k} = W^{⊗ℓ₀}_{i,i} = 0 (by the inductive hypothesis) and
W_{k,i} = W_{i,i} = 0 (by construction), so

min_k (W^{⊗ℓ₀}_{i,k} + W_{k,i}) = W^{⊗ℓ₀}_{i,i} + W_{i,i} = 0 + 0 = 0

and the theorem is proven for the diagonal entries too.


Finally, we can state our main result:

Theorem 13.4 (Tropical matrix powers and shortest paths). If G(V, E, w) is a
weighted graph (directed or undirected) on n vertices in which all the cycles have
positive weight, then d(v_i, v_j), the weight of a minimal-weight walk from v_i to v_j, is
given by

d(v_i, v_j) = W^{⊗(n−1)}_{i,j}.        (13.13)
Proof. First note that for i ̸= j, any minimal-weight walk from vi to vj must actually
be a minimal weight path. One can prove this by contradiction, as any walk that
isn’t a path must revisit at least one vertex. Say that v⋆ is one of these revisited
vertices. Then the segment of the walk that runs from the first appearance of v⋆ to
the second must have positive weight (it’s a cycle and all cycles in G have positive
weight) and so we can reduce the total weight of the walk by removing this cycle.
But this contradicts our initial assumption that the walk had minimal weight.
Combining this observation with the previous lemma and the observation that
a path in G contains at most n − 1 edges establishes the result.

An example
The graph illustrated in Figure 13.2 is small enough that we can just read off the
weights of its minimal-weight paths. If we assemble these results into a matrix D
whose entries are given by

         ⎧ 0             if i = j
D_{ij} = ⎨ d(v_i, v_j)   if i ≠ j and v_j is reachable from v_i
         ⎩ ∞             otherwise

we get

       ⎛ 0  −2  −1 ⎞
D  =   ⎜ 3   0   1 ⎟ .
       ⎝ 2   0   0 ⎠

To verify Theorem 13.4 we need to write down the weight matrix and compute
its tropical square, which are

       ⎛ 0  −2   1 ⎞                  ⎛ 0  −2  −1 ⎞
W  =   ⎜ ∞   0   1 ⎟   and   W^{⊗2} = ⎜ 3   0   1 ⎟ .
       ⎝ 2   ∞   0 ⎠                  ⎝ 2   0   0 ⎠

[Figure: a digraph on v1, v2 and v3 with edges (v1, v2) of weight −2, (v1, v3) of weight 1, (v2, v3) of weight 1 and (v3, v1) of weight 2.]

Figure 13.2: The graph above contains two directed cycles, (v1, v2, v3, v1), which has
weight 1, and (v1, v3, v1), which has weight 3. Theorem 13.4 thus applies and we
can compute the weights of minimal-weight paths using tropical powers of the weight
matrix.

The graph has n = 3 vertices and so we expect W^{⊗(n−1)} = W^{⊗2} to agree with the
distance matrix D, which it does.
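The verification is easy to mechanise using the tropical_matmul sketch from Section 13.3.1 (again, an illustration of mine rather than anything in the references):

    import numpy as np

    def tropical_matmul(A, B):   # as in the earlier sketch
        return np.min(A[:, :, None] + B[None, :, :], axis=1)

    def tropical_power(W, ell):
        """The ℓ-th tropical power W^{⊗ℓ} of a square matrix W."""
        n = W.shape[0]
        P = np.where(np.eye(n, dtype=bool), 0.0, np.inf)  # tropical identity
        for _ in range(ell):
            P = tropical_matmul(P, W)
        return P

    inf = np.inf
    W = np.array([[0.0, -2.0, 1.0],
                  [inf,  0.0, 1.0],
                  [2.0,  inf, 0.0]])
    print(tropical_power(W, 2))   # agrees with the distance matrix D above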

Chapter 14

Critical Path Analysis

This lecture applies ideas about distance in weighted graphs to solve problems in
the scheduling of large, complex projects.
Reading:
The topic is discussed in Section 3.5 of

Dieter Jungnickel (2013), Graphs, Networks and Algorithms, 4th edition,


which is available online via SpringerLink,

but it is such an important application that it is also treated in many other places.

14.1 Scheduling problems


Video 9.4
Suppose you are planning a dinner that involves a number of dishes, some of which
have multiple components or stages of preparation, say, a roast with sauces or a pie
with pastry and filling that have to be prepared separately. Especially for a novice
cook, it can be difficult to arrange the cooking so that all the food is ready at the
same moment. Somewhat surprisingly, graph theory can help in this, as well as in
many more complex scheduling problems.
The key abstraction is a certain kind of directed graph constructed from a list
of tasks such as the one in Table 14.1, which breaks a large project down into a
list of smaller tasks and, for each one, notes (a) how long it takes to complete and
(b) which other tasks are its immediate prerequisites. Here, for example, task A
might be “wash and peel all the vegetables” while D and E—which have A as a
prerequisite—might be “assemble the salad” and “fry the garlic briefly over very
high heat.”

14.1.1 From tasks to weighted digraphs


Figure 14.1 shows a directed graph associated with the project summarised in Ta-
ble 14.1. It has:

• a vertex for each task;

Task   Time to Complete   Prerequisites
A 1 None
B 2 None
C 3 A&B
D 4 B
E 4 C
F 4 C&D
G 6 E&D
H 6 E&F

Table 14.1: Summary of a modestly-sized project. The first column lists various
tasks required for the completion of the project, while the second column gives the
time (in minutes) needed to complete each task and the third column gives each task’s
immediate prerequisites.

• edges that run from prerequisites to the tasks that depend on them. Thus for
example, there is a directed edge (A, C), as task C has task A as a prerequisite.

• There are also two extra vertices: one called S (for “start”) that requires
no time to complete, but is a predecessor for all the tasks that have no
prerequisites, and another, Z, that corresponds to finishing the project and is
a successor of all tasks that are not prerequisites for any other task.

• There are edge weights that correspond to the time it takes to complete the
task at the tail vertex. Thus—as task B takes 2 minutes to complete—both
edges coming out of the vertex B have weight 2.

Figure 14.1: The digraph associated with the scheduling problem from Table 14.1.

Task                   Prereq’s
P: get a pink form     G
B: get a blue form     P
G: get a green form    B

[Figure: the directed cycle P → B → G → P.]

Figure 14.2: An example showing why the digraph associated with a scheduling
problem shouldn’t contain cycles. It represents a bureaucratic nightmare in which
one needs a pink form P in order to get a blue form B in order to get the green
form G that one needs to get a pink form P .

14.1.2 From weighted digraphs to schedules


Once we have a graph such as the one in Figure 14.1 we can answer a number of
important questions about the project including:

(1) What is the shortest time in which we can complete the work?

(2) What is the earliest time (measured from the start of the project) at which
we can start a given task?

(3) Are there any tasks whose late start would delay the whole project?

(4) For any tasks that don’t need to be started as early as possible, how long can
we delay their starts?

In the discussion that follows, we’ll imagine that we have as many resources as we
need (as many hands to help in the kitchen, as many employees and as much equip-
ment as needed to pursue multiple tasks in parallel · · · ). In this setting, Lemma 14.1,
proved below, provides a tool to answer all of these questions.

14.2 Graph-theoretic details


Video 9.5
A directed graph representing a project that can actually be completed cannot
contain any cycles. To see why, consider the graph in Figure 14.2. It tells us that
we cannot start task G until we have completed its prerequisite, task B, which we
cannot start before we complete its prerequisite, P · · · which we cannot start until
we’ve completed G. This means we can never even start the project, much less finish
it.
Thus any graph that describes a feasible project should be a directed, acyclic
graph (often abbreviated DAG) with non-negative edge weights. From now on
we’ll restrict attention to such graphs and call them task-dependency graphs. We’ll
imagine that they are constructed from a project described by a list of tasks such
as the one in Table 14.1 and that they look like the example in Figure 14.1. In
particular, we'll require our graphs to have a starting vertex S, which is the only
vertex with deg_in(S) = 0, and a terminal vertex Z, which is the only vertex with
deg_out(Z) = 0.

Task   Time Needed   Prereq’s
A      10 hours      None
B      6 hours       None

[Figure: the digraph with edges (S, A) and (S, B) of weight 0, (A, Z) of weight 10 and (B, Z) of weight 6.]

Figure 14.3: The shortest time in which we can complete this project is 10 hours, the
weight of a maximal-weight path from the starting vertex S to the terminal vertex
Z.

14.2.1 Shortest times and maximal-weight paths


Now consider the very simple project illustrated in Figure 14.3. It involves just two
tasks: A, which takes 10 hours to complete and B which takes 6 hours. Even if—as
our assumptions allow—we start both tasks at the same time and work on them in
parallel, the soonest we can possibly finish the project is 10 hours after we start.
This is a special case of the following result, whose proof I’ll only sketch briefly.

Lemma 14.1 (Shortest times are maximal weights). If G(V, E, w) is a task-dependency


graph that describes a scheduling problem, and if we start the work at t = 0, then
the earliest time, tv , at which we can start the task corresponding to vertex v is the
weight of a maximal-weight path from S to v.

The proof of Lemma 14.1 turns on the observation that the times tv satisfy
equations that look similar to Bellman’s Equations, except that they have a max()
where Bellman’s Equations have a min():

t_S = 0   and   t_v = max_{u∈P_v} (t_u + w(u, v))   ∀ v ≠ S.        (14.1)

In the equation at right, Pv is v’s predecessor list and w(u, v) is the weight of the edge
from u to v or, equivalently, the time it takes to complete the task corresponding to
u.
Although the Bellman-like equations above provide an elegant characterisation
of the tv , they aren’t necessarily all that practical as a way to calculate the tv .
The issue is that in order to use Eqn. (14.1) to compute tv , we need tu for all v’s
predecessors u ∈ Pv . And for each of them, we need tw for w ∈ Pu · · · , and so on.
Fortunately this problem has a simple resolution in DAGs, as we’ll see below. The
idea is to find a clever way to organise the computations so that the results we need
when computing tv are certain to be available.

[Figure: the same four-vertex digraph drawn twice, its vertices numbered according to two different topological orderings.]

Figure 14.4: A digraph with two distinct topological orderings.

14.2.2 Topological ordering


Definition 14.2. If G(V, E) is a directed, acyclic graph with |V | = n, then a
topological ordering (sometimes also called a topological sorting) of G is a
map Φ : V → {1, 2, . . . , n} with the properties that

• Φ(v) = Φ(u) ⇒ u = v;

• (u, v) ∈ E ⇒ Φ(u) < Φ(v).

In other words, a topological ordering is a way of numbering the vertices so that the
graph’s directed edges always point from a vertex with a smaller index to a vertex
with a bigger one.
Topological orderings are not, in general, unique, as is illustrated in Figure 14.4,
but as the following results show, a DAG always has at least one.

Lemma 14.3 (DAGs contain sink vertices). If G(V, E) is a directed, acyclic graph
then it contains at least one vertex v with deg_out(v) = 0. Such a vertex is sometimes
called a sink vertex or a sink.

Proof of Lemma 14.3. Construct a walk through G(V, E) as follows. First choose
an arbitrary vertex v₀ ∈ V. If deg_out(v₀) = 0 we are finished, but if not choose an
arbitrary successor of v₀, v₁ ∈ S_{v₀}. If deg_out(v₁) = 0 we are finished, but if not,
choose an arbitrary successor of v₁, v₂ ∈ S_{v₁}, and so on. This construction can
never revisit a vertex as G is acyclic. Further, as G has only finitely many vertices,
the construction must come to a stop after at most |V| − 1 steps. But the only way
for it to stop is to reach a vertex v_j such that deg_out(v_j) = 0, which proves that such
a vertex must exist.

Theorem 14.4 (DAGs have topological orderings). A directed, acyclic graph G(V, E)
always has a topological ordering.

Proof of Theorem 14.4. One can prove this by induction on the number of vertices.
The base case is |V | = 1 and clearly, assigning the number 1 to the sole vertex gives
a topological ordering.

Now suppose the result is true for all DAGs with |V| ≤ n₀ and consider a
DAG G with |V| = n₀ + 1. Lemma 14.3 tells us that G contains a vertex w with
deg_out(w) = 0. Construct G′(V′, E′) = G\w. It is a DAG (because G was one), but

[Figure: the task-dependency digraph of Figure 14.1 with each vertex v relabelled by Φ(v); see Table 14.2.]

Figure 14.5: A topological ordering for the digraph associated with the scheduling
problem from Table 14.1 in which the vertex label v has been replaced by the value
Φ(v) assigned by the ordering that’s listed in Table 14.2.

v S A B C D E F G H Z
Φ(v) 1 2 3 4 5 6 7 8 9 10

Table 14.2: The topological ordering illustrated in Figure 14.5.

has only |V′| = n₀ vertices and so, by the inductive hypothesis, G′ has a topological
ordering Φ′ : V′ → {1, 2, . . . , n₀}. We can extend this to obtain a function
Φ : V → {1, 2, . . . , n₀ + 1} by choosing

Φ(v) = ⎧ Φ′(v)     if v ≠ w
       ⎩ n₀ + 1    if v = w.

Further, this Φ is clearly a topological ordering because all predecessors u ∈ P_w of w have

Φ(u) = Φ′(u) ≤ n₀ < n₀ + 1 = Φ(w)
and, by construction, w has no successors. This concludes the inductive step and
establishes that all DAGs have at least one topological ordering.
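The proof is constructive and translates directly into code. Here is an illustrative Python version (the function name is my own) that repeatedly finds a sink, gives it the largest unused number and deletes it:

    def topological_ordering(succ):
        """A topological ordering of a DAG, built as in Theorem 14.4.

        succ maps each vertex to the set of its successors; Lemma 14.3
        guarantees that a sink always exists while vertices remain.
        """
        remaining = {v: set(ws) for v, ws in succ.items()}
        phi, next_label = {}, len(remaining)
        while remaining:
            sink = next(v for v, ws in remaining.items() if not ws)
            phi[sink] = next_label      # the sink gets the largest unused number
            next_label -= 1
            del remaining[sink]
            for ws in remaining.values():
                ws.discard(sink)        # delete the sink from the graph
        return phi

    # A four-vertex DAG like the one in Figure 14.4
    print(topological_ordering({'a': {'b', 'c'}, 'b': {'d'}, 'c': {'d'}, 'd': set()}))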

14.3 Critical paths


Video 9.6
Figure 14.5 shows a topological ordering for the graph from Figure 14.1. The reason
we're interested in such orderings is that they provide a way to solve Eqns (14.1) in a
task-dependency graph. By construction, the starting vertex S is the only one that
has no predecessors and so any topological ordering must have Φ(S) = 1. Similarly,
the terminal vertex Z is the only one that has no successors, so for a project with n
tasks, Φ(Z) = n + 2.
By convention, we start the project at tS = 0. If we then use Eqns (14.1) to
compute the rest of the tv , working through the vertex list in the order assigned by
the topological ordering, it will always then be true that when we want to compute

tv = max (tu + w(u, v)) ,


u∈Pv

we will have all the tu for u ∈ Pv available.

[Figure: two copies of the task-dependency digraph, vertices labelled with the earliest start times t_j; see the caption below.]

Figure 14.6: In the graphs above the vertex labels have been replaced with values of
tj , the earliest times at which the corresponding task can start. The graph at left
shows the edges that enter into Eqn. (14.1) for the computation of t7 while the graph
at right shows all the tj .

14.3.1 Earliest starts


For the digraph in Figure 14.5, we get
t_S = t₁ = 0
t_A = t₂ = t₁ + w(1, 2) = 0 + 0 = 0
⋮
t_F = t₇ = max(t₄ + w(4, 7), t₅ + w(5, 7)) = max(2 + 3, 2 + 4) = 6        (14.2)
⋮
Figure 14.6 illustrates both the computation of t7 and the complete set of tj . As
tZ = t10 = 16, we can conclude that it takes a minimum of 16 minutes to complete
the project.

14.3.2 Latest starts


The tv that we computed in the previous section is the earliest time at which the task
for vertex v could start, but it may be possible to delay the task without delaying
the whole project. Consider, for example, task G in our main example. It could
start as early as tG = t8 = 9, but since it only takes 6 minutes to complete, we could
delay its start a bit without disrupting the project. If we define Tv to be the time
by which task v must start if the project is not to be delayed, then it’s clear that
TG = TZ − 6 = 10. More generally, the latest time at which a task can start depends
on the latest starts of its successors, so that
T_v = min_{u∈S_v} (T_u − w(v, u)).        (14.3)

This expression, along with the observation that TZ = tZ , allows us to find Tv for all
tasks by working backwards through the DAG. Figure 14.7 illustrates this for our
main example, while Table 14.3 lists tv and Tv for all vertices v ∈ V .
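To tie the two computations together, here is an illustrative Python sketch (the encoding of Table 14.1 and all the names are mine) that computes the earliest starts via Eqn. (14.1) and the latest starts via Eqn. (14.3), reproducing Table 14.3:

    # Table 14.1: task -> (time to complete, immediate prerequisites)
    tasks = {
        'A': (1, []), 'B': (2, []), 'C': (3, ['A', 'B']), 'D': (4, ['B']),
        'E': (4, ['C']), 'F': (4, ['C', 'D']), 'G': (6, ['E', 'D']),
        'H': (6, ['E', 'F']),
    }

    duration = {v: d for v, (d, _) in tasks.items()}
    duration['S'] = 0                                  # S takes no time
    preds = {v: (ps or ['S']) for v, (_, ps) in tasks.items()}
    used = {p for _, (_, ps) in tasks.items() for p in ps}
    preds['Z'] = [v for v in tasks if v not in used]   # Z succeeds G and H

    order = ['S', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'Z']  # topological

    # Earliest starts, Eqn. (14.1): a forward pass through the ordering
    t = {'S': 0}
    for v in order[1:]:
        t[v] = max(t[u] + duration[u] for u in preds[v])

    # Latest starts, Eqn. (14.3): a backward pass, starting from T_Z = t_Z
    succs = {v: [u for u in order if v in preds.get(u, [])] for v in order}
    T = {'Z': t['Z']}
    for v in reversed(order[:-1]):
        T[v] = min(T[u] - duration[v] for u in succs[v])

    print(t)   # earliest starts, as in Table 14.3
    print(T)   # latest starts; vertices with t[v] == T[v] lie on a critical path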

14.3.3 Critical paths


Notice that some of the vertices in Figure 14.7 have tv = Tv . This happens because
they lie on a maximal-weight path from S to Z and so a delay to any one of them will

[Figure: the task-dependency digraph with each vertex labelled by the pair t_v : T_v; see the caption below.]

Figure 14.7: Here a vertex v is labelled with a pair t_v : T_v that shows both t_v,
the earliest time at which the corresponding task could start, and T_v, the latest time
by which the task must start if the whole project is not to be delayed. This project
has only a single critical path, (S, B, D, F, H, Z), which is highlighted in red.

v S A B C D E F G H Z
tv 0 0 0 2 2 5 6 9 10 16
Tv 0 2 0 3 2 6 6 10 10 16

Table 14.3: The earliest starts tv and latest starts Tv for the main example.

delay the whole project. Such maximal-weight paths play a crucial role in project
management and so there is a term to describe them:

Definition 14.5 (Critical path). A maximal-weight path from S to Z in a task-


dependency graph G(V, E, w) is called a critical path and G may contain more
than one of them.

Tasks whose vertices do not lie on a critical path have Tv > tv and so do not require
such keen supervision.

Part V

Planar Graphs
Chapter 15

Planar Graphs

This lecture introduces the idea of a planar graph—one that you can draw in such
a way that the edges don’t cross. Such graphs are of practical importance in, for
example, the design and manufacture of integrated circuits as well as the automated
drawing of maps. They’re also of mathematical interest in that, in a sense we’ll
explore, there are really only two non-planar graphs.
Reading:
The first part of our discussion is based on that found in Chapter 10 of
J. A. Bondy and U. S. R. Murty (2008), Graph Theory, Vol. 244 of
Springer Graduate Texts in Mathematics, Springer Verlag,
but in subsequent sections I’ll also draw on material from Section 1.5 of
Dieter Jungnickel (2013), Graphs, Networks and Algorithms, 4th edition,
which is available online via SpringerLink.

15.1 Drawing graphs in the plane


Video 10.1
A graph G is said to be planar if it is possible to draw it in such a way that the
edges intersect only at their end points (the vertices). Such a drawing is also called
a planar diagram for G or a planar embedding of G. Indeed, it is possible to think of
such a drawing—call it G̃—as a graph isomorphic to G. Recall our original definition
of a graph: it involved only a vertex set V and a set E of pairs of vertices. Take the
vertex set of G̃ to be the set of end points of the arcs in the drawing and say that
the edge set consists of pairs made up of the two end points of each arc.

15.1.1 The topology of curves in the plane


To give a clear treatment of this topic, it’s helpful to use some ideas from plane
topology. That takes us outside the scope of this module and so, in this subsection,
I’ll give some definitions and state one main result without proof. If you find this
material interesting (and it is pretty interesting, as well as beautiful and useful) you
might consider doing MATH31052, Topology.

Definition 15.1. A curve in the plane is a continuous image of the unit interval.
That is, a curve is a set of points

C = { γ(t) ∈ R² | 0 ≤ t ≤ 1 }

traced out as t varies across the closed unit interval. Here γ(t) = (x(t), y(t)), where
x(t) : [0, 1] → R and y(t) : [0, 1] → R are continuous functions. If the curve does
not intersect itself (that is, if γ(t₁) = γ(t₂) ⇒ t₁ = t₂) then it is a simple curve.

Definition 15.2. A closed curve is a continuous image of the unit circle or,
equivalently, a curve in which γ(0) = γ(1). If a closed curve doesn’t intersect itself
anywhere other than at γ(0) = γ(1), then it is a simple closed curve.

Figure 15.1 and Table 15.1 give examples of these two definitions, while the
following one, which is illustrated in Figure 15.2, sets the stage for this section’s key
result.

Figure 15.1: From left to right: γ1 , a simple curve; γ2 , a curve that has an
intersection, so is not simple; γ3 , a simple closed curve and γ4 , a closed curve with
an intersection. Explicit formulae for the curves and their intersections appear in
Table 15.1.

Definition 15.3. A set S ⊂ R2 is arcwise-connected if, for every pair of points


x, y ∈ S, there is a curve γ(t) : [0, 1] → S with γ(0) = x and γ(1) = y.

Theorem 15.4 (The Jordan Curve Theorem). A simple closed curve C in the plane
divides the rest of the plane into two disjoint, arcwise-connected, open sets. These
two open sets are called the interior and exterior of C, often denoted Int(C ) and
Ext(C ), and any curve joining a point x ∈ Int(C ) to a point y ∈ Ext(C ) intersects
C at least once.

This is illustrated in Figure 15.3.

Curve    x(t)                     y(t)
γ₁(t)    2t                       24t³ − 36t² + 14t − 1
γ₂(t)    24t³ − 36t² + 14t − 1    8t² − 8t + 1
γ₃(t)    cos(2πt)                 sin(2πt)
γ₄(t)    sin(4πt)                 sin(2πt)

Table 15.1: Explicit formulae for the curves appearing in Figure 15.1. The
intersection in γ₂ occurs at γ₂(1/2 − √(1/6)) = γ₂(1/2 + √(1/6)) = (0, 1/3), while the one
for γ₄ happens at γ₄(0) = γ₄(1/2) = (0, 0).

Figure 15.2: The two shaded regions above are, individually, arcwise connected,
but their union is not: any curve connecting x to y would have to pass outside the
shaded regions.


Figure 15.3: An illustration of the Jordan Curve Theorem: x ∈ Int(C ), while


y ∈ Ext(C ), so any curve connecting them must cross C at least once.

[Figure: four planar diagrams with, respectively, 1 (infinite) face, 4 vertices and 3 edges; 2 faces, 3 vertices and 3 edges; 3 faces, 4 vertices and 5 edges; and 6 faces, 8 vertices and 12 edges.]

Figure 15.4: Four examples of planar graphs with numbers of faces, vertices and
edges for each.

15.1.2 Faces of a planar graph


The definitions in the previous section allow us to be a bit more formal about the
definition of a planar graph:

Definition 15.5. A planar diagram for a graph G(V, E) with edge set E =
{e1 , . . . , em } is a collection of simple curves {γ1 , . . . , γm } that represent the edges
and have the property that the curves γj and γk corresponding to two distinct edges
ej and ek intersect if and only if the two edges are incident on the same vertex and,
in this case, they intersect only at the endpoints that correspond to their common
vertex.

Definition 15.6. A graph G is planar if and only if it has a planar diagram.

If a planar graph G contains cycles then the curves that correspond to the edges
in the cycles link together to form simple closed curves that divide the plane into
finitely many disjoint open sets called faces. Even if the graph has no cycles, there
will still be one infinite face: see Figure 15.4.

15.2 Euler’s formula for planar graphs


Video 10.2
Our first substantive result about planar graphs is:
Theorem 15.7 (Euler’s formula). If G(V, E) is a connected planar graph with
n = |V | vertices and m = |E| edges, then any planar diagram for G has f = 2+m−n
faces.

Before giving a full proof, we begin with an easy special case:

Lemma 15.8 (Euler’s formula for trees). If G(V, E) is a tree then f = 2 + m − n.

[Figure: a planar graph G (left) and G′ = G\e (right); see the caption below.]

Figure 15.5: Deleting the edge e causes two adjacent faces in G to merge into a
single face in G′ .

Proof of the lemma about trees: As G is a tree we know m = n − 1, so


2 + m − n = 2 + (n − 1) − n = 2 − 1 = 1 = f,
where the last equality follows because every planar diagram for a tree has only a
single, infinite face.
Proof of Euler’s formula in the general case: We’ll prove the result for arbitrary con-
nected planar graphs by induction on m, the number of edges.

Base case The smallest connected planar graph contains only a single vertex, so
has n = 1, m = 0 and f = 1. Thus
2 + m − n = 2 + 0 − 1 = 1 = f
just as Euler’s formula demands.
Inductive step Suppose the result is true for all m ≤ m₀ and consider a connected,
planar graph G(V, E) with |E| = m = m₀ + 1 edges. Suppose further that G
has |V| = n vertices and a planar diagram with f faces. Then one of the following
things is true:
• G is a tree, in which case Euler’s formula is true by the lemma proved
above;
• G contains at least one cycle.
If G contains a cycle, choose an edge e ∈ E that’s part of that cycle and form
G′ = G\e, which has m′ = m0 edges, n′ = n vertices and f ′ = f − 1 faces.
This last follows because breaking a cycle merges two adjacent faces, as is
illustrated in Figure 15.5.
As G′ has only m0 edges, we can use the inductive hypothesis to say that
f ′ = m′ − n′ + 2. Then, again using unprimed symbols for quantities in G, we
have:
f ′ = m′ − n′ + 2
f − 1 = m0 − n + 2
f = (m0 + 1) − n + 2
f = m − n + 2,
which establishes Euler’s formula for graphs that contain cycles.
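Euler's formula is trivially easy to check against the examples of Figure 15.4; the following one-liner (mine, purely for illustration) does so:

    def euler_faces(n, m):
        """Faces of any planar diagram of a connected planar graph: f = 2 + m - n."""
        return 2 + m - n

    # the four examples of Figure 15.4: (vertices, edges)
    for n, m in [(4, 3), (3, 3), (4, 5), (8, 12)]:
        print(n, m, euler_faces(n, m))   # 1, 2, 3 and 6 faces respectively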

Figure 15.6: Planar graphs with the maximal number of edges for a given number
of vertices. The graph with the yellow vertices has n = 5 and m = 9 edges, while
those with the blue vertices have n = 6 and m = 12.

15.3 Planar graphs can’t have many edges


Video 10.3
To set the scene for our next result, consider graphs on n ∈ {1, 2, . . . , 5} vertices
and, for each n, try to draw a planar graph with as many edges as possible. At first
this is easy: it’s possible to find a planar diagram for each of the complete graphs
K1, K2, K3 and K4, but, as we will prove below, K5 is not planar and the best
one can do is to find a planar graph with n = 5 and m = 9. For n = 6 there
are two non-isomorphic planar graphs with m = 12 edges, but none with m > 12.
Figure 15.6 shows examples of planar graphs having the maximal number of edges.
Larger planar graphs (those with n ≫ 5) tend to be even sparser, which means
that they have many fewer edges than they could. The relevant comparison for a
graph on n vertices is n(n − 1)/2, the number of edges in the complete graph Kn ,
so we’ll say that a graph is sparse if

|E| ≪ n(n − 1)/2   or   s ≡ |E| / (n(n − 1)/2) ≪ 1        (15.1)

Table 15.2 makes it clear that when n > 5 the planar graphs become increasingly
sparse¹.

15.3.1 Preliminaries: bridges and girth


The next two definitions will help us to formulate and prove our main result, a
somewhat technical theorem that gives a precise sense to the intuition that a planar
graph can’t have very many edges.

Definition 15.9. An edge e in a connected graph G(V, E) is a bridge if the graph


G′ = G\e formed by deleting e has more than one connected component.

Definition 15.10. If a graph G(V, E) contains one or more cycles then the girth
of G is the length of a shortest cycle.

These definitions are illustrated in Figures 15.7 and 15.8.


¹ I wrote software to compute the first few rows of this table myself, but got the counts for n > 9
from the On-Line Encyclopedia of Integer Sequences, entries A003094 and A001349.

                     Number of non-isomorphic, connected graphs that are . . .
n     mmax   s       planar, with m = mmax    planar         planar or not
5     9      0.9     1                        20             21
6     12     0.8     2                        99             112
7     15     0.714   5                        646            853
8     18     0.643   14                       5,974          11,117
9     21     0.583   50                       71,885         261,080
10    24     0.533   ?                        1,052,805      11,716,571
11    27     0.491   ?                        17,449,299     1,006,700,565
12    30     0.455   ?                        313,372,298    164,059,830,476

Table 15.2: Here mmax is the maximal number of edges appearing in a planar
graph on the given number of vertices, while the column labelled s lists the measure
of sparsity given by Eqn. 15.1 for connected, planar graphs with mmax edges. The
remaining columns list counts of various kinds of graphs and make the point that as
n increases, planar graphs with m = mmax become rare in the set of all connected
planar graphs and that this family itself becomes rare in the family of connected
graphs.

[Figure: three examples: in a tree, every edge is a bridge; the highlighted edge in the middle graph is a bridge; a cycle contains no bridges.]

Figure 15.7: Several examples about bridges.

[Figure: two graphs, one of girth 4 and one of girth 3.]

Figure 15.8: The girth of a graph is the length of a shortest cycle.

Remark 15.11. A graph with n vertices has girth in the range 3 ≤ g ≤ n. The
lower bound arises because all cycles include at least three edges and the upper one
because the longest possible cycle occurs when G is isomorphic to Cn .

15.3.2 Main result: an inequality relating n and m


We are now in a position to state our main result:

Theorem 15.12 (Jungnickel’s 1.5.3). If G(V, E) is a connected planar graph with


n = |V | vertices and m = |E| edges then either:

A: G is acyclic and m = n − 1;

B: G has at least one cycle and so has a well-defined girth g. In this case

m ≤ g(n − 2)/(g − 2).        (15.2)

Video 10.4
Outline of the Proof. We deal first with the case where G is acyclic and then move
on to the harder, more general case:

A: G is connected, so if it has no cycles it’s a tree and we’ve already proved (see
Theorem 6.13) that trees have m = n − 1.

B: When G contains one or more cycles, we’ll prove the inequality 15.2 mainly
by induction on n, but we’ll need several sub-cases. To see why, let’s plan out
the argument.
Base case: n = 3
There is only a single graph on three vertices that contains a cycle, it’s K3 ,

which has girth g = 3 and n = 3, so our theorem says

m ≤ g(n − 2)/(g − 2) = (3 × (3 − 2))/(3 − 2) = 3,

which is obviously true.

Inductive hypothesis:
Assume the result is true for all connected, planar graphs that contain a cycle
and have n ≤ n0 vertices.

Inductive step:
Now consider a connected, planar graph G(V, E) with n0 + 1 vertices that
contains a cycle. We need, somehow, to reduce this graph to one for which we
can exploit the inductive hypothesis and so one naturally thinks of deleting
something. This leads to two main sub-cases, which are illustrated² below.
B.1 G contains at least one bridge. In this case the road to a proof by in-
duction seems clear: we’ll delete the bridge and break G into two smaller
graphs.

B.2 G does not contain any bridges. Equivalently, every edge in G is part
of some cycle. Here it's less clear how to handle the inductive step and
so we will use an altogether different, non-inductive approach.

We’ll deal with these cases in turn, beginning with B.1.


As mentioned above, a natural approach is to delete a bridge and break G into
two smaller graphs—say, G1 (V1 , E1 ) and G2 (V2 , E2 )—then apply the inductive
² The examples illustrating cases B.1 and B.2 are meant to help the reader follow the argument,
but are not part of the logic of the proof.

hypothesis to the pieces. If we define nj = |Vj | to be the number of vertices
in Gj and mj = |Ej | to be the corresponding number of edges, then we know

n1 + n2 = n = n0 + 1 and m1 + m2 = m − 1. (15.3)

But we need to take a little care as deleting a bridge leads to two further sub-
cases and we’ll need a separate argument for each. Given that the original
graph G contained at least one cycle—and noting that removing a bridge
can’t break a cycle—we know that at least one of the two pieces G1 and G2
contains a cycle. Our two sub-cases are thus:

B.1a Exactly one of the two pieces contains a cycle. We can assume without
loss of generality that it’s G2 , so that G1 is a tree.

B.1b Both G1 and G2 contain cycles.

Thus we can complete the proof of Theorem 15.12 by producing arguments


(full details below) that cover the following three possibilities

B.1a G contains a bridge and at least one cycle. Deleting the bridge leaves
two subgraphs, a tree G1 and a graph, G2 , that contains a cycle: we
handle this possibility in Case 15.13 below.
B.1b G contains a bridge and at least two cycles. Deleting the bridge leaves
two subgraphs, G1 and G2 , each of which contains at least one cycle: see
Case 15.14.
B.2 G contains one or more cycles, but no bridges: see Case 15.15.

15.3.3 Gritty details of the proof of Theorem 15.12
Video 10.5
Before we plunge into the Lemmas, it's useful to make a few observations about the
ratio g/(g − 2) that appears in Eqn. (15.2). Recall (from Remark 15.11) that if a
graph on n vertices contains a cycle, then the girth is well-defined and lies in the
range 3 ≤ g ≤ n.

• For g > 2, the ratio g/(g − 2) is a monotonically decreasing function of g and so

    g₁ > g₂  ⇒  g₁/(g₁ − 2) < g₂/(g₂ − 2).        (15.4)

• The monotonicity of g/(g − 2), combined with the fact that g ≥ 3, implies
  that g/(g − 2) is bounded from above by 3:

    g ≥ 3  ⇒  g/(g − 2) ≤ 3/(3 − 2) = 3.        (15.5)

• And at the other extreme, g/(g − 2) is bounded from below (strictly) by 1:

    g ≤ n  ⇒  g/(g − 2) ≥ n/(n − 2) > 1.        (15.6)

The three cases


The cases below are all part of an inductive argument in which G(V, E) is a
connected planar graph with |V| = n₀ + 1 and |E| = m. It also contains at least one
cycle and so has a well-defined girth, g. Finally, we have an inductive hypothesis
saying that Theorem 15.12 holds for all trees and for all connected planar graphs
with |V| ≤ n₀.

Case 15.13 (Case B.1a of Theorem 15.12). Here G contains a bridge and deleting
this bridge breaks G into two connected planar subgraphs, G1(V1, E1) and G2(V2, E2),
one of which is a tree.

Proof. We can assume without loss that G₁ is the tree and then argue that every
cycle that appears in G is also in G₂ (we've only deleted a bridge), so the girth of
G₂ is still g. Also, n₁ ≥ 1, so n₂ ≤ n₀ and, by the inductive hypothesis, we have

m₂ ≤ g(n₂ − 2)/(g − 2).

But then, because G₁ is a tree, we know that m₁ = n₁ − 1. Adding this to both
sides of the inequality yields

m₁ + m₂ ≤ (n₁ − 1) + g(n₂ − 2)/(g − 2)

or, equivalently,

m₁ + m₂ + 1 ≤ n₁ + g(n₂ − 2)/(g − 2).

Finally, noting that m = m₁ + m₂ + 1, we can say

m ≤ n₁ + g(n₂ − 2)/(g − 2)
  ≤ n₁ g/(g − 2) + g(n₂ − 2)/(g − 2)
  ≤ g(n₁ + n₂ − 2)/(g − 2)
  ≤ g(n − 2)/(g − 2),

which is the result we sought. Here the step from the first line to the second follows
because 1 < g/(g − 2) (recall Eqn. (15.6)), so

n₁ < n₁ g/(g − 2),

and the last line follows because n = n₁ + n₂.
Case 15.14 (Case B.1b of Theorem 15.12). This case is similar to the previous one
in that here again G contains a bridge, but in this case deleting the bridge breaks G
into two connected planar subgraphs, each of which contains at least one cycle (and
so has a well-defined girth).
Proof. We’ll say that G1 has girth g1 and G2 has girth g2 and note that, as the girth
is defined as the length of a shortest cycle—and as every cycle that appears in the
original graph G must still be present in one of the Gj —we know that

g ≤ g1 and g ≤ g2 . (15.7)

Now, n = n₀ + 1 and n = n₁ + n₂, so as we know that n_j ≥ 3 (the shortest
possible cycle is of length 3 and the G_j contain cycles), it follows that we have both
n₁ < n₀ and n₂ < n₀. This means that the inductive hypothesis applies to both G_j
and so we have
and so we have
m₁ ≤ g₁(n₁ − 2)/(g₁ − 2)   and   m₂ ≤ g₂(n₂ − 2)/(g₂ − 2).

Adding these together yields

m₁ + m₂ ≤ g₁(n₁ − 2)/(g₁ − 2) + g₂(n₂ − 2)/(g₂ − 2)
        ≤ g(n₁ − 2)/(g − 2) + g(n₂ − 2)/(g − 2)
        ≤ g(n₁ + n₂ − 4)/(g − 2),

where the step from the first line to the second follows from Eqn. 15.7 and the
monotonicity of the ratio g/(g − 2) (recall Eqn. (15.4)). If we again note that
1 < g/(g − 2) we can conclude that

m₁ + m₂ + 1 ≤ g(n₁ + n₂ − 4)/(g − 2) + 1
            ≤ g(n₁ + n₂ − 4)/(g − 2) + g/(g − 2)
            ≤ g(n₁ + n₂ − 3)/(g − 2)

and so

m = m₁ + m₂ + 1 ≤ g(n₁ + n₂ − 3)/(g − 2) ≤ g(n₁ + n₂ − 2)/(g − 2)

or, as n = n₁ + n₂,

m ≤ g(n − 2)/(g − 2),

which is the result we sought.

Video 11.1
Case 15.15 (Case B.2 of Theorem 15.12). In the final case G(V, E) does not
contain any bridges, which implies that every edge in E is part of some cycle. This
makes it harder to see how to use the inductive hypothesis (we’d have to delete two
or more edges to break G into disconnected pieces . . . ) and so we will use an entirely
different argument based on Euler’s Formula (Theorem 15.7).

Proof. First, define f_j to be the number of faces whose boundary has j edges, making
sure to include the infinite face: Figure 15.9 illustrates this definition. Then, as
each edge appears in the boundary of exactly two faces, we have both

∑_{j=g}^{n} f_j = f   and   ∑_{j=g}^{n} j × f_j = 2m.

Note that both sums start at g, the girth, as we know that there are no cycles of
shorter length. But then

2m = ∑_{j=g}^{n} j × f_j ≥ ∑_{j=g}^{n} g × f_j = g ∑_{j=g}^{n} f_j = gf,

where we obtain the inequality by replacing the length of the cycle j in j × f_j with
g, the length of the shortest cycle (and hence the smallest value of j for which f_j is
nonzero). Thus we have

2m ≥ gf   or   f ≤ 2m/g.

If we now use Euler's Formula to say that f = m − n + 2, we have

m − n + 2 ≤ 2m/g   or   m − 2m/g ≤ n − 2.

Figure 15.9: The example used to illustrate case B.2 of Theorem 15.12 has f3 = 2,
f4 = 2, f5 = 1 and f9 = 1 (for the infinite face): all other fj are zero.

And then, finally,

gm/g − 2m/g ≤ n − 2,   so   (g − 2)m/g ≤ n − 2   and   m ≤ g(n − 2)/(g − 2),

which is the result we sought.

15.3.4 The maximal number of edges in a planar graph


Theorem 15.12 has an easy corollary that gives a simple bound on the maximal
number of edges in a graph with |V | = n.

Corollary 15.16. If G(V, E) is a connected planar graph with n = |V | ≥ 3 vertices


and m = |E| edges then m ≤ 3n − 6.

Proof. Either G is a tree, in which case m = n − 1 and the bound in the Corollary
is certainly satisfied, or G contains at least one cycle. In the latter case, say that
the girth of G is g. We know 3 ≤ g ≤ n and our main result says

m ≤ (g/(g − 2)) (n − 2).

Thus, recalling that g/(g − 2) ≤ 3, the result follows immediately.

Figure 15.10: Both K5 and K3,3 are non-planar.

15.4 Two non-planar graphs


Video 11.2
The hard-won inequalities from the previous section—which both say something like
“G planar implies m small”—cannot be used to prove that a graph is planar³, but
can help establish that a graph isn’t. The idea is to use the contrapositives, which
are statements like “If m is too big, then G can’t be planar.”
To illustrate this, we’ll use our inequalities to prove that neither of the graphs in
Figure 15.10—K5 at left and K3,3 at right—is planar. Let’s begin with K5 : it has
n = 5 so Corollary 15.16 says that if it is planar,

m ≤ 3n − 6 = 3 × 5 − 6 = 15 − 6 = 9,

but K5 actually has m = 10 edges, which is one too many for a planar graph. Thus
K5 can’t have a planar diagram.
K3,3 isn’t planar either, but Corollary 15.16 isn’t strong enough to establish
this. K3,3 has n = 6 and m = 3 × 3 = 9. Thus it easily satisfies the bound from
Corollary 15.16, which requires only that m ≤ 3 × 6 − 6 = 12. But if we now apply
our main result, Theorem 15.12, we’ll see that K3,3 can’t be planar. The relevant
inequality is

m ≤ g(n − 2)/(g − 2)
  = (4 × (6 − 2))/(4 − 2)
  = 16/2
  = 8,

where, in passing from the first line to the second, I’ve used the fact that the girth
of K3,3 is g = 4. To see this, first note that any cycle in a bipartite graph has even
³ There is an O(n) algorithm that determines whether a graph on n vertices is planar and, if it
is, finds a planar diagram. We don’t have time to discuss it, but interested readers might like to
look at John Hopcroft and Robert Tarjan (1974), Efficient Planarity Testing, Journal of the ACM,
21(4):549–568. DOI: 10.1145/321850.321852

Figure 15.11: Knowing that K5 and K3,3 are non-planar makes it clear that these
two graphs can’t be planar either, even though neither violates the inequalities from
the previous section (check this).

length, so the shortest possible cycle in K3,3 has length 4, and then find such a cycle
(there are lots).
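Both calculations amount to checking necessary conditions for planarity, which is easy to automate. Here is an illustrative helper of my own (the name and interface are made up for this sketch):

    def may_be_planar(n, m, g=None):
        """Necessary conditions for planarity; False means certainly non-planar.

        n, m: numbers of vertices and edges (n >= 3); g: the girth, if known.
        A True result is inconclusive: the graph may or may not be planar.
        """
        if m > 3 * n - 6:                                 # Corollary 15.16
            return False
        if g is not None and m > g * (n - 2) / (g - 2):   # Theorem 15.12
            return False
        return True

    print(may_be_planar(5, 10))        # K5: False, since 10 > 3*5 - 6 = 9
    print(may_be_planar(6, 9, g=4))    # K3,3: False, since 9 > 4*(6-2)/2 = 8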
Once we know that K3,3 and K5 are nonplanar, we can see immediately that
many other graphs must be non-planar too, even when this would not be detected
by either of our inequalities: Figure 15.11 shows two such examples. The one on the
left has K5 as a subgraph, so even though it satisfies the bound from Theorem 15.12,
it can’t be planar. The example at right is similar in that any planar diagram for this
graph would obviously produce a planar diagram for K3,3 , but the sense in which
this second graph “contains” K3,3 is more subtle: we’ll clarify and formalise this in
the next section, then state a theorem that says, essentially, that every non-planar
graph contains K5 or K3,3 .

15.5 Kuratowski’s Theorem


Video 11.3
We begin with a pair of definitions designed to capture the sense in which the graph
at right in Figure 15.11 contains K3,3.
Definition 15.17. A subdivision of a graph G(V, E) is a graph H(V′, E′) formed
by (perhaps repeatedly) removing an edge e = (a, b) ∈ E from G and replacing it
with a path

{(a, v₁), (v₁, v₂), . . . , (v_k, b)}

containing some number k > 0 of new vertices {v₁, . . . , v_k}, each of which has
degree 2.
Figure 15.12 shows a couple of examples of subdivisions, including one at left that
gives an indication of where the name comes from: the extra vertices can be thought
of as dividing an existing edge into smaller ones.

[Figure: a graph G (left) and a subdivision H of it (right), both on vertices a–j; see the caption below.]

Figure 15.12: H at right is a subdivision of G. The connection between b and d,


which was a single edge in G, becomes a blue path in H: one can imagine that the
original edge (b, d) has had three new, white vertices inserted into it, “sub-dividing”
it. The other deleted edge, (i, j) is shown as a pale grey, dashed line (to indicate
that it’s not part of H), while the new path that replaces it is again shown in blue
and white.

Definition 15.18. Two graphs G1 (V1 , E1 ) and G2 (V2 , E2 ) are said to be homeo-
morphic if they are isomorphic to subdivisions of the same graph.

That is, we say G1 and G2 are homeomorphic if there is some third graph—call
it G0 —such that both G1 and G2 are subdivisions of G0 . Figure 15.13 shows several
graphs that are homeomorphic to K5 . Homeomorphism is an equivalence relation
on graphs4 and so all the graphs in Figure 15.13 are homeomorphic to each other as
well as to K5 .
The notion of homeomorphism allows us to state the following remarkable result:

Theorem 15.19 (Kuratowski’s Theorem (1930)). A graph G is planar if and only


if it does not contain a subgraph homeomorphic to K5 or K3,3 .

Figure 15.13: These three graphs are homeomorphic to K5 , and hence also to each
other.

⁴ The keen reader should check this for herself.

[Figure: a graph G containing the edge e = (u, v) and, at right, G/e with the new vertex w.]

Figure 15.14: Contracting the blue edge e = (u, v) in G yields the graph G/e at
right.

15.6 Wagner’s Theorem


Video 11.4
I'd like to finish our discussion of planarity with a result that's intuitively appealing
and often easier to apply than Kuratowski's Theorem. To begin with, we need one
last piece of graph surgery.
Definition 15.20 (Edge Contraction). Given a graph G(V, E) and an edge e =
(u, v) ∈ E, the operation of edge contraction forms a new graph by:
• deleting the vertices u and v and all their edges;

• introducing a new vertex, w, that is adjacent to all the vertices that are adjacent
to u or v in G. The adjacency lists of u, v and w are related by

Aw = (Au ∪ Av ) \{u, v}.

We will write G/e to indicate the graph formed by contracting the edge e.
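As a small illustration of Definition 15.20 (with a made-up naming convention for the merged vertex), edge contraction is easy to implement on adjacency sets:

    def contract(adj, u, v):
        """Contract the edge (u, v); adj maps each vertex to its neighbour set."""
        w = ('w', u, v)                       # hypothetical name for the new vertex
        Aw = (adj[u] | adj[v]) - {u, v}       # A_w = (A_u ∪ A_v) \ {u, v}
        new = {x: nbrs - {u, v} for x, nbrs in adj.items() if x not in (u, v)}
        for x in Aw:
            new[x].add(w)
        new[w] = Aw
        return new

    # Contracting one edge of the triangle K3 yields K2
    K3 = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
    print(contract(K3, 1, 2))   # {3: {('w', 1, 2)}, ('w', 1, 2): {3}}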
Definition 15.21 (Contractable Graphs). A graph G is contractible to a graph H
if G can be converted into a graph isomorphic to H by a sequence of edge contractions.
Theorem 15.22 (Wagner's Theorem). A graph G is planar if and only if it does
not contain a subgraph that is contractible to K5 or K3,3.
Figure 15.15 illustrates the use of Wagner’s theorem to establish non-planarity.

Figure 15.15: One can use Wagner’s Theorem to prove that the Petersen graph,
illustrated above, is non-planar by contracting all the blue edges.

Figure 15.16: The two-torus cut twice and flattened into a square.

15.7 Afterword
Video 11.5

The fact that there are, in a natural sense, only two non-planar graphs is one of the main reasons we study the topic. But this turns out to be the easiest case of an even
more amazing family of results that I’ll discuss briefly. These other theorems have
to do with drawing graphs on arbitrary surfaces (spheres, tori . . . )—it’s common
to refer to this as embedding the graph in the surface—and the process uses curves
similar to those discussed in Section 15.1.1, except that now we want, for example,
curves γ : [0, 1] → S2 , where S2 is the two-sphere, the surface of a three-dimensional
unit ball.
Embedding a graph in the sphere turns out to be the same as embedding it in
the plane: you can imagine drawing the planar diagram on a large, thin, stretchy
sheet and then smoothing it onto a big ball in such a way that the diagram lies in
the northern hemisphere while the edges of the sheet are all drawn together in a
bunch at the south pole. Similarly, if we had a graph embedded in the sphere we
could get a planar diagram for it by punching a hole in the sphere. Thus a graph
can be embedded in the sphere unless it contains—in the senses of Kuratowski’s and
Wagner’s Theorems—a copy of K5 or K3,3 . For this reason, these two graphs are
called topological obstructions to embedding a graph in the plane or sphere. They
are also sometimes referred to as forbidden subgraphs.
But if we now consider the torus, the situation for K5 and K3,3 is different. To
make drawings, I’ll use a standard representation of the torus as a square: you should
imagine this square to have been peeled off a more familiar torus-as-a-doughnut, as
illustrated in Figure 15.16. Figure 15.17 then shows embeddings of K5 and K3,3 in
the torus—these are analogous to planar diagrams in that the curves representing
the edges don’t intersect except at their endpoints.
There are, however, graphs that one cannot embed in the torus and there is
even an analogue of Kuratowski’s Theorem that says that there are finitely many
forbidden subgraphs and that all non-toroidal5 graphs include at least one of them.
In fact, something even more spectacular is true: early in an epic series6 of papers,
5 By analogy with the term non-planar, a graph is said to be non-toroidal if it cannot be embedded in the torus.
6 The titles all begin with the words “Graph Minors”. The series began in 1983 with “Graph Minors. I. Excluding a forest” (DOI: 10.1016/0095-8956(83)90079-5) and seems to have concluded with “Graph Minors. XXIII. Nash-Williams’ immersion conjecture” in 2010 (DOI: 10.1016/j.jctb.2009.07.003). The result about embedding graphs in surfaces appeared in 1990 in “Graph Minors. VIII. A Kuratowski theorem for general surfaces” (DOI: 10.1016/0095-8956(90)90121-F).
Figure 15.17: Embeddings of K5 (left) and K3,3 (right) in the torus. Edges that
run off the top edge of the square return on the bottom, while those that run off the
right edge come back on the left.

Figure 15.18: Neither of these graphs can be embedded in the two-torus.


These examples come from Andrei Gagarin, Wendy Myrvold and John Chambers
(2009), The obstructions for toroidal graphs with no K3,3 ’s, Discrete Mathematics,
309(11):3625–3631. DOI: 10.1016/j.disc.2007.12.075

Neil Robertson and Paul D. Seymour proved that every surface (the sphere, the
torus, the torus with two holes. . . ) has a Kuratowski-like theorem with a finite list
of forbidden subgraphs: two of those for the torus are shown in Figure 15.18. One
shouldn’t, however, draw too much comfort from the word “finite”. In her recent MSc
thesis7 Ms. Jennifer Woodcock developed a new algorithm for embedding graphs in
the torus and tested it against a database that, although known to be incomplete,
includes 239,451 forbidden subgraphs.

7 Ms. Woodcock’s thesis, A Faster Algorithm for Torus Embedding, is lovely and is the source of much of the material in this section.

