Full Lecture Notes
Mark Muldoon
3 Graph Colouring
    3.1 Notions and notation
    3.2 An algorithm to do colouring
        3.2.1 The greedy colouring algorithm
        3.2.2 Greedy colouring may use too many colours
    3.3 An application: avoiding clashes

4 Efficiency of algorithms
    4.1 Introduction
    4.2 Examples and issues
        4.2.1 Greedy colouring
        4.2.2 Matrix multiplication
        4.2.3 Primality testing and worst-case estimates
    4.3 Bounds on asymptotic growth
    4.4 Analysing the examples
        4.4.1 Greedy colouring
        4.4.2 Matrix multiplication
        4.4.3 Primality testing via trial division
    4.5 Afterword

8 Matrix-Tree Ingredients
    8.1 Lightning review of permutations
        8.1.1 The Symmetric Group Sn
        8.1.2 Cycles and sign
    8.2 Using graphs to find the cycle decomposition
    8.3 The determinant is a sum over permutations
    8.4 The Principle of Inclusion/Exclusion
        8.4.1 A familiar example
        8.4.2 Three subsets
        8.4.3 The general case
        8.4.4 An example
    8.5 Appendix: Proofs for Inclusion/Exclusion
        8.5.1 Proof of Lemma 8.12, the case of two sets
        8.5.2 Proof of Theorem 8.13
        8.5.3 Alternative proof
Part I
This chapter introduces Graph Theory, the main subject of the course, and includes
some basic definitions as well as a number of standard examples.
Reading: Some of the material in this chapter comes from the beginning of Chapter 1 in

    Dieter Jungnickel (2013), Graphs, Networks and Algorithms, 4th edition,

which is available online via SpringerLink.
If you are at the university, either physically or via the VPN, you can download the
chapters of this book as PDFs.
Figure 1.1: The panel at left shows the seven bridges and four land masses
that provide the setting for the Königsberg bridge problem, which asks whether it is
possible to make a circular walking tour of the city that crosses every bridge exactly
once. The panel at right includes a graph-theoretic abstraction that helps one prove
that no such tour exists.
Figure 1.2: Königsberg is a real place—a port on the Baltic—and during Euler’s
lifetime it was part of the Kingdom of Prussia. The panel at left is a bird’s-eye view
of the city that shows the celebrated seven bridges. It was made by Matthäus Merian
and published in 1652. The city is now called Kaliningrad and is part of the Russian
Federation. It was bombed heavily during the Second World War: the panel at right
shows a recent satellite photograph and one can still recognise the two islands and
modern versions of some of the bridges, but very little else appears to remain.
The idea is to shrink each land mass—both islands as well as the north and south banks of the river—down to a single point and then to connect these points with arcs that represent the bridges. The problem then reduces to the question whether it is possible to draw a path that starts and finishes at the same dot, but traces each of the seven arcs exactly once.
One can prove that such a tour is impossible by contradiction. Suppose that
one exists: it must then visit the easternmost island (see Figure 1.3) and we are
free to imagine that the tour actually starts there. To continue we must leave the
island, crossing one of its three bridges. Then, later, because we are required to
Figure 1.3: The Königsberg Bridge graph on its own: it is not possible to trace a
path that starts and ends on the eastern island without crossing some bridge at least
twice.
cross each bridge exactly once, we will have to return to the eastern island via a
different bridge from the one we used when setting out. Finally, having returned
to the eastern island once, we will need to leave again in order to cross the island’s
third bridge. But then we will be unable to return without recrossing one of the
three bridges. And this provides a contradiction: the walk is supposed to start and
finish in the same place and cross each bridge exactly once.
Definition 1.1. A graph is a finite, nonempty set V , the vertex set, along with
a set E, the edge set, whose elements e ∈ E are pairs e = (a, b) with a, b ∈ V .
We will often write G(V, E) to mean the graph G with vertex set V and edge
set E. An element v ∈ V is called a vertex (plural vertices) while an element e ∈ E
is called an edge.
The definition above is deliberately vague about whether the pairs that make
up the edge set E are ordered pairs—in which case (a, b) and (b, a) with a ̸= b are
distinct edges—or unordered pairs. In the unordered case (a, b) and (b, a) are just
two equivalent ways of representing the same edge.
Definition 1.2. An undirected graph is a graph in which the edge set consists of
unordered pairs.
Figure 1.4: Diagrams representing graphs with vertex set V = {a, b} and edge
set E = {(a, b)}. The diagram at left is for an undirected graph, while the one at
right shows a directed graph. Thus the arrow on the right represents the ordered pair
(a, b).
Definition 1.3. A directed graph is a graph in which the edge set consists of
ordered pairs. The term “directed graph” is often abbreviated as digraph.
Although graphs are defined abstractly as above, it's very common to draw diagrams to represent them. These are drawings in which the vertices are shown as points or disks and the edges as line segments or arcs. Figure 1.4 illustrates the graphical convention used to mark the distinction between directed and undirected edges: the former are shown as arrows, while the latter are drawn as line segments or arcs. A directed edge e = (a, b) appears as an arrow that points from a to b.
Sometimes one sees graphs with more than one edge³ connecting the same two
vertices; the Königsberg Bridge graph is an example. Such edges are called multiple
or parallel edges. Additionally, one sometimes sees graphs with edges of the form
e = (v, v). These edges, which connect a vertex to itself, are called loops or self
loops. All these terms are illustrated in Figure 1.5.
Figure 1.5: A graph whose edge set includes the self loop (v1 , v1 ) and two parallel
copies of the edge (v1 , v2 ).
Remark. In this course when we say “graph” we will normally mean an undirected
graph that contains no loops or parallel edges: if you look in other books you may
³ In this case it is a slight abuse of terminology to talk about the edge "set" of the graph, as sets contain only a single copy of each of their elements. Very scrupulous books (and students) might prefer to use the term edge list in this context, but I will not insist on this nicety.
Figure 1.6: Two diagrams for the same graph: the crossed edges in the leftmost
version do not signify anything.
see such objects referred to as simple graphs. By contrast, we will refer to a graph
that contains parallel edges as a multigraph.
The complete graphs Kn

The complete graph Kn has n vertices and an edge between every pair of distinct vertices:

    V = {v1 , v2 , . . . , vn }
    E = {(vj , vk ) | 1 ≤ j ≤ (n − 1), (j + 1) ≤ k ≤ n} .
Figure 1.7: The first five members of the family Kn of complete graphs.
The path graphs Pn

The path graph Pn has vertex set and edge set

    V = {v1 , v2 , . . . , vn }
    E = {(vj , vj+1 ) | 1 ≤ j < n} ,

so its n vertices are joined in a chain by n − 1 edges. [Figure: the path graphs P4 and P5.]
The cycle graphs Cn

The cycle graph Cn has

    V = {v1 , v2 , . . . , vn }
    E = {(v1 , v2 ), (v2 , v3 ), . . . , (vj , vj+1 ), . . . , (vn−1 , vn ), (vn , v1 )} .
Cn has n edges that are often written (vj , vj+1 ), where the subscripts are taken to
be defined periodically so that, for example, vn+1 ≡ v1 . See Figure 1.9 for examples.
Figure 1.9: The first three members of the family Cn of cycle graphs.
The complete bipartite graphs Km,n
The complete bipartite graph Km,n is a graph whose vertex set is the union of a set V1 of m vertices with a second set V2 of n different vertices, and whose edge set includes every possible edge running between these two subsets:
V = V1 ∪ V2
= {u1 , . . . , um } ∪ {v1 , . . . , vn }
E = {(u, v) | u ∈ V1 , v ∈ V2 } .
Km,n thus has |E| = mn edges: see Figure 1.10 for examples.
Figure 1.10: A few members of the family Km,n of complete bipartite graphs.
Here the two subsets of the vertex set are illustrated with colour: the white vertices
constitute V1 , while the red ones form V2 .
Figure 1.11: The first three members of the family Id of cube graphs. Notice
that all the cube graphs are bipartite (the red and white vertices are the two disjoint
subsets from Definition 1.6), but that, for example, I3 has only 12 edges, while the
complete bipartite graph K4,4 has 16.
    v        a  b  c  d  e  f  g  h
    deg(v)   1  1  1  2  1  1  4  1
Figure 1.12: The degrees of the vertices in a small graph. Note that the graph
consists of two “pieces”.
So, for example, every vertex in the complete graph Kn has degree n − 1, while
every vertex in a cycle graph Cn has degree 2; Figure 1.12 provides more examples.
The generalisation of degree to directed graphs is slightly more involved. A vertex v in a digraph has two degrees: an in-degree that counts the number of edges having v at their tip and an out-degree that counts the number of edges having v at their tail.
See Figure 1.13 for an example.
    v   degin (v)   degout (v)
    a       2           0
    b       1           1
    c       1           1
    d       0           2
Once we have the notion of degree, we can formulate our first theorem:

Theorem 1.8 (The Handshaking Lemma). In any graph G(V, E) the sum of the vertex degrees is twice the number of edges:

    ∑_{v∈V} deg(v) = 2|E|.    (1.1)

Proof. Each edge contributes twice to the sum of degrees, once for each of the two vertices on which it is incident.
The following two results are immediate consequences:

Corollary. In any graph the sum of the vertex degrees is an even number.

Corollary. The cube graph Id has d 2^(d−1) edges.

The first is fairly obvious: the right hand side of (1.1) is clearly an even number, so the sum of degrees appearing on the left must be even as well. To get the formula for the number of edges in Id, note that it has 2^d vertices, each of degree d, so the Handshaking Lemma tells us that

    2|E| = ∑_{v∈V} deg(v) = 2^d × d,

and hence |E| = d 2^(d−1).
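The Handshaking Lemma and this edge count are easy to check computationally. Below is a short Python sketch (the helper names `cube_graph` and `degrees` are my own, not from the notes) that builds I_3 as an edge list and confirms that the degree sum equals 2|E|, with |E| = d·2^(d−1) = 12:

```python
from itertools import combinations

def degrees(vertices, edges):
    """Degree of each vertex of an undirected graph given as an edge list."""
    deg = {v: 0 for v in vertices}
    for a, b in edges:          # each edge contributes to two degrees...
        deg[a] += 1
        deg[b] += 1             # ...which is exactly the Handshaking Lemma
    return deg

def cube_graph(d):
    """The cube graph I_d: vertices are d-bit strings and edges join
    strings that differ in exactly one bit."""
    vertices = [format(i, "0{}b".format(d)) for i in range(2 ** d)]
    edges = [(u, v) for u, v in combinations(vertices, 2)
             if sum(a != b for a, b in zip(u, v)) == 1]
    return vertices, edges

V, E = cube_graph(3)
deg = degrees(V, E)
```

Replacing 3 with any d reproduces the general count |E| = d 2^(d−1).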
Chapter 2
Reading: Some of the material in this chapter comes from the beginning of Chap-
ter 1 in
Dieter Jungnickel (2013), Graphs, Networks and Algorithms, 4th edition,
which is available online via SpringerLink.
If you are at the university, either physically or via the VPN, you can download the
chapters of this book as PDFs.
is K3 , the complete graph on three vertices. But if we regard the edges as directed
then G is the graph pictured at the right of Figure 2.1.
Of course, if every vertex in G(V, E) appears in some edge (equivalently, if every
vertex has nonzero degree), then we can dispense with the vertex set and specify
the graph by its edge list alone.
Figure 2.1: If the graph from Example 2.1 is regarded as undirected (our default
assumption) then it is K3 , the complete graph on three vertices, but if it’s directed
then it’s the digraph at right above.
The adjacency matrix of a graph G(V, E) with vertex set V = {v1 , . . . , vn } is the n × n matrix A whose entries are

    Ajk = 1 if (vj , vk ) ∈ E and Ajk = 0 otherwise.    (2.1)

Once again, the directed and undirected cases are different. For the graphs from Example 2.1 we have: if G is regarded as undirected then

        0 1 1
    A = 1 0 1 ,
        1 1 0

but if G is regarded as directed then

        0 1 1
    A = 0 0 1 .
        0 0 0
Remark 2.2. The following properties of the adjacency matrix follow readily from
the definition in Eqn. (2.1).
• If the graph has no loops then Ajj = 0 for 1 ≤ j ≤ n. That is, there are zeroes
down the main diagonal of A.
• One can compute the degree of a vertex by adding up entries in the adjacency
matrix. I leave it as an exercise for the reader to establish that in an undirected
graph,
    deg(vj ) = ∑_{k=1}^{n} Ajk = ∑_{k=1}^{n} Akj ,    (2.2)
where the first sum runs across the j-th row, while the second runs down the
j-th column. Similarly, in a directed graph we have
    degout (vj ) = ∑_{k=1}^{n} Ajk  and  degin (vj ) = ∑_{k=1}^{n} Akj .    (2.3)
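Eqns (2.2) and (2.3) translate directly into code. Here is a minimal Python sketch (the helper names are mine) applied to the adjacency matrix of the directed graph from Example 2.1:

```python
def row_sum(A, j):
    """Sum across the j-th row of A: deg_out(v_j) in a digraph."""
    return sum(A[j])

def col_sum(A, j):
    """Sum down the j-th column of A: deg_in(v_j) in a digraph."""
    return sum(row[j] for row in A)

# Adjacency matrix of the directed graph from Example 2.1.
A = [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]]

out_degrees = [row_sum(A, j) for j in range(len(A))]  # first sum in (2.3)
in_degrees = [col_sum(A, j) for j in range(len(A))]   # second sum in (2.3)
```

In the undirected case the matrix is symmetric, so the row and column sums agree and both give deg(vj), as in Eqn (2.2).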
• Sometimes one sees a modified form of the adjacency matrix used to describe multigraphs (graphs that permit two or more edges between a given pair of vertices). In this case one takes

    Ajk = the number of edges running between vj and vk .    (2.4)
Definition 2.3. In an undirected graph G(V, E) the adjacency list associated with
a vertex v is the set Av ⊆ V defined by
Av = {u ∈ V | (u, v) ∈ E}.
An example appears in Figure 2.2. It follows readily from the definition of degree
that
deg(v) = |Av |. (2.5)
    A1 = {2}
    A2 = {1, 3, 4}
    A3 = {2, 4, 5}
    A4 = {2, 3}
    A5 = {3}
Figure 2.2: The graph at left has adjacency lists as shown at right.
    v   Predecessors   Successors
    1   ∅              {2, 3}
    2   {1}            {3}
    3   {1, 2}         ∅
Figure 2.3: The directed graph at left has the predecessor and successor lists shown
at right.
Similarly, one can specify a directed graph by providing separate lists of succes-
sors or predecessors (these terms were defined in Lecture 1) for each vertex.
Pv = {u ∈ V | (u, v) ∈ E}
Sv = {u ∈ V | (v, u) ∈ E}.
Figure 2.3 gives some examples. The analogues of Eqn. (2.5) for a directed graph
are
degin (v) = |Pv | and degout (v) = |Sv |. (2.6)
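The lists Pv and Sv are easy to build in one pass over the edge list. The sketch below (function name mine) reconstructs the predecessor and successor lists of the graph in Figure 2.3:

```python
def pred_succ(vertices, edges):
    """Build the predecessor lists P_v = {u | (u, v) in E} and the
    successor lists S_v = {u | (v, u) in E} of a directed graph."""
    P = {v: set() for v in vertices}
    S = {v: set() for v in vertices}
    for u, v in edges:
        S[u].add(v)   # v is a successor of u
        P[v].add(u)   # u is a predecessor of v
    return P, S

# The directed graph of Figure 2.3 has edges (1,2), (1,3) and (2,3).
P, S = pred_succ([1, 2, 3], [(1, 2), (1, 3), (2, 3)])
```

Eqn (2.6) then reads off directly: degin(v) = |Pv| and degout(v) = |Sv|.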
Definition 2.5. Two graphs G1 (V1 , E1 ) and G2 (V2 , E2 ) are said to be isomorphic if there exists a bijection¹ α : V1 → V2 such that the edge (α(a), α(b)) ∈ E2 if and only if (a, b) ∈ E1 .
Generally it’s difficult to decide whether two graphs are isomorphic. In particu-
lar, there are no known fast algorithms2 (we’ll learn to speak more precisely about
what it means for an algorithm to be “fast” later in the term) to decide. One can,
¹ Recall that a bijection is a mapping that's one-to-one and onto.
² Algorithms for graph isomorphism are the subject of intense current research: see Erica Klarreich's Jan. 2017 article in Quanta Magazine, Complexity Theory Problem Strikes Back, for a popular account of some recent results.
Figure 2.4: Here are three different graphs that are all isomorphic to the cube
graph I3 , which is the middle one. The bijections that establish the isomorphisms
are listed in Table 2.1.
v 000 001 010 011 100 101 110 111
αL (v) 1 2 3 4 5 6 7 8
αR (v) a b c d e f g h
Table 2.1: If we number the graphs in Figure 2.4 so that the leftmost is G1 (V1 , E1 )
and the rightmost is G3 (V3 , E3 ), then the bijections αL : V2 → V1 and αR : V2 → V3
listed above establish that G2 is isomorphic, respectively, to G1 and G3 .
of course, simply try all possible bijections between the two vertex sets, but there
are n! of these for graphs on n vertices and so this brute force approach rapidly
becomes impractical. On the other hand, it’s often possible to detect quickly that
two graphs aren’t isomorphic. The simplest such tests are based on the following
propositions, whose proofs are left to the reader.
Proposition 2.6. If G1 (V1 , E1 ) and G2 (V2 , E2 ) are isomorphic then |V1 | = |V2 | and
|E1 | = |E2 |.
Proposition 2.7. If G1 (V1 , E1 ) and G2 (V2 , E2 ) are isomorphic via the bijection α : V1 → V2 , then deg(α(v)) = deg(v) for every v ∈ V1 .

Another simple test depends on the following quantity, examples of which appear in Figure 2.5.

Definition 2.8. The degree sequence of a graph is the list of the degrees of its vertices, written in non-decreasing order.
Proposition 2.9. If G1 (V1 , E1 ) and G2 (V2 , E2 ) are isomorphic then they have the
same degree sequence.
[Figure 2.5: graphs with degree sequences (1, 2, 2, 3), (2, 2, 2) and (1, 1, 2, 2).]
Figure 2.6: These two graphs both have degree sequence (1, 2, 2, 2, 3), but they’re
not isomorphic: see Example 2.10 for a proof.
Example 2.10 (Proof that the graphs in Figure 2.6 aren’t isomorphic). Both graphs
in Figure 2.6 have the same degree sequence, (1, 2, 2, 2, 3), so both contain a single
vertex of degree 1 and a single vertex of degree 3. These vertices are adjacent in the
graph at left, but not in the one at right and this observation forms the basis for a
proof by contradiction that the graphs aren’t isomorphic.
Assume, for contradiction, that they are isomorphic and that
α : {v1 , v2 , v3 , v4 , v5 } → {u1 , u2 , u3 , u4 , u5 }
is the bijection that establishes the isomorphism. Then Prop. 2.7 implies that it must
be true that α(v1 ) = u1 (as these are the sole vertices of degree one) and α(v2 ) = u3 .
But then the presence of the edge (v1 , v2 ) on the left would imply the existence of an
edge (α(v1 ), α(v2 )) = (u1 , u3 ) on the right, and no such edge exists. This contradicts
our assumption that α establishes an isomorphism, so no such α can exist and the
graphs aren’t isomorphic.
2.3 Terms for parts of graphs
(Video 2.2) Finally, we'll often want to speak of parts of graphs and the two most useful definitions here are:

Definition 2.11. A graph G′ (V ′ , E ′ ) is a subgraph of G(V, E) if V ′ ⊆ V and E ′ ⊆ E.

and

Definition 2.12. Given a graph G(V, E) and a subset of its vertices V ′ ⊆ V , the subgraph induced by V ′ is the subgraph G′ (V ′ , E ′ ) where

    E ′ = {(a, b) ∈ E | a ∈ V ′ and b ∈ V ′ }.
That is, the subgraph induced by the vertices V ′ consists of V ′ itself and all those
edges in the original graph that involve only vertices from V ′ . Both these definitions
are illustrated in Figure 2.7.
Figure 2.7: The three graphs at right are subgraphs of the one at left. The middle
one is the subgraph induced by the blue shaded vertices.
Chapter 3
Graph Colouring
The material from the first two chapters provides enough background that we can
begin to discuss a problem—graph colouring—that is both mathematically rich and
practically applicable.
Reading:
The material for this chapter appears, in very condensed form, in Chapter 9 of

    Dieter Jungnickel (2013), Graphs, Networks and Algorithms, 4th edition,

which is available online via SpringerLink.
Definition (k-colouring). A k-colouring of a graph G(V, E) is a function ϕ : V → {1, 2, . . . , k} that assigns distinct values to adjacent vertices: that is, (u, v) ∈ E ⇒ ϕ(u) ̸= ϕ(v). If G has a k-colouring then it is said to be k-colourable.
I'll refer to the values assigned by ϕ as "colours" and say that a graph is k-colourable if one can colour its vertices with k colours in such a way that no two adjacent vertices have the same colour. Examples of graphs and colourings include those in Figure 3.1.
Figure 3.1: The complete graphs K4 and K5 as well as the complete bipartite
graphs K2,2 and K3,4 , each coloured using the smallest possible number of colours.
Here the colouring is represented in two ways: as numbers giving ϕ(v) for each vertex
v and with, well, colours (see the electronic version).
18
3.2 An algorithm to do colouring
(Video 2.4) The chromatic number χ(G) is defined as a kind of ideal: it's the minimal k for which we can find a k-colouring. This might make you suspect that it's hard to find
χ(G) for an arbitrary graph—how could you ever know that you’d used the smallest
possible number of colours? And, aside from a few exceptions such as those in the
previous section, you’d be right to think this: there is no known fast (we’ll make
the notion of “fast” more precise soon) algorithm to find an optimal (in the sense of
using the smallest number of colours) colouring.
(1) Initialize:
    Set c(vj ) ← 0 for all 1 ≤ j ≤ n
    c(v1 ) ← 1
    j ← 2

(2) c(vj ) ← min{ k ∈ N | k > 0 and c(u) ̸= k ∀u ∈ Avj }

(3) If j < n then set j ← j + 1 and go to step (2); otherwise stop.
Remarks
• The algorithm above is meant to be explicit enough that one could implement
it in R or MATLAB. It thus includes expressions such as j ← 2 which means
“set j to 2” or “j gets the (new) value 2”. The operator ← is sometimes called
the assignment operator and it appears in some form in all the programming
languages I know. Sometimes it’s expressed with notation like j = j + 1,
but this is a jarring, nonsensical-looking thing for a mathematician and so I’ll
avoid it.
19
• We will discuss several more algorithms in this course, but will not be much
more formal about how they are specified. This is mainly because a truly rig-
orous account of computation would take us into the realms of computability
theory, a part of mathematical logic, and would require much of the rest of
the term, leaving little time for our main subjects.
Finally, to emphasise further the mechanical nature of greedy colouring, we could
rewrite it in a style that looks even closer to MATLAB code:
Algorithm 3.7 (Greedy colouring: as pseudo-code).
Given a graph G with edge set E, vertex set V = {v1 , . . . , vn } and adjacency lists
Av , construct a function c : V → N such that if the edge e = (vi , vj ) ∈ E, then
c(vi ) ̸= c(vj ).
(1) Set c(vj ) ← 0 for 1 ≤ j ≤ n.
(2) c(v1 ) ← 1.
(3) for 2 ≤ j ≤ n {

(4)     Choose a colour k > 0 for vertex vj that differs from those of its neighbours:
        c(vj ) ← min{ k ∈ N | k > 0 and c(u) ̸= k ∀u ∈ Avj }

(5) } End of loop over vertices vj .
Both versions of the algorithm perform exactly the same steps, in the same
order, so comparison of these two examples may clarify the different approaches to
presenting algorithms.
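As a third presentation, here is Algorithm 3.7 rendered as runnable Python (the function and variable names are mine); it performs exactly the steps of the pseudo-code above:

```python
def greedy_colouring(vertices, adjacency):
    """Algorithm 3.7: visit v_1, v_2, ... in order and give each vertex the
    smallest positive colour not already used by one of its neighbours.
    `adjacency` maps each vertex to its adjacency list A_v."""
    c = {v: 0 for v in vertices}   # step (1): mark every vertex uncoloured
    c[vertices[0]] = 1             # step (2)
    for v in vertices[1:]:         # steps (3)-(5): loop over v_2, ..., v_n
        taken = {c[u] for u in adjacency[v]}
        k = 1
        while k in taken:          # smallest k not used by a neighbour
            k += 1
        c[v] = k
    return c

# A 4-cycle v1 - v2 - v3 - v4 - v1: greedy colouring uses two colours.
adj = {"v1": ["v2", "v4"], "v2": ["v1", "v3"],
       "v3": ["v2", "v4"], "v4": ["v1", "v3"]}
colours = greedy_colouring(["v1", "v2", "v3", "v4"], adj)
```

Note that the answer depends on the order in which the vertices are listed: feeding the same graph in a different order can force extra colours.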
For the reasons discussed above, this k provides only an upper bound on the chro-
matic number of G. To drive this point home, consider Figure 3.2, which illustrates
the process of applying the greedy colouring algorithm to two graphs, one in each
column.
For the graph in the left column—call it G1 —the algorithm produces a 3-
colouring, which is actually optimal. To see why, notice that the subgraph induced
by the vertices {v1 , v2 , v3 } is isomorphic to K3 . Thus we need at least 3 colours for
these three vertices and so, using Lemma 3.5, we can conclude that χ(G1 ) ≥ 3. On
the other hand, the greedy algorithm provides an explicit example of a 3-colouring,
which implies that χ(G1 ) ≤ 3, so we have proven that χ(G1 ) = 3.
The graph in the right column—call it G2—is isomorphic to G1 (a very keen reader could write out the isomorphism explicitly), but its vertices are numbered differently and this means that Algorithm 3.7 colours them in a different order and arrives at a sub-optimal k-colouring with k = 4.
Figure 3.2: Two examples of applying Algorithm 3.7: the colouring process runs
from the top of a column to the bottom. The graphs in the right column are the same
as those in the left, save that the labels on vertices 4 and 5 have been switched. As
in Figure 3.1, the colourings are represented both numerically and graphically.
3.3 An application: avoiding clashes
(Video 2.5) I'd like to conclude by introducing a family of applications that involve avoiding some sort of clash—where some things shouldn't be allowed to happen at the same
time or in the same place. A prototypical example is:
One can turn this into a graph-colouring problem by constructing a graph whose
vertices are committees and whose edges connect those that have members in com-
mon: such committees can’t meet simultaneously, or their shared members will
have clashes. A suitable graph appears at left in Figure 3.3, where, for example,
the vertex for the Justice committee (labelled Just) is connected to the one for the
Education committee (Ed) because Gove serves on both.
The version of the graph at right in Figure 3.3 shows a three-colouring and,
as the vertices CMS, Ed and Just form a subgraph isomorphic to K3 , this is the
smallest number of colours one can possibly use and so the chromatic number of the
committee-and-clash graph is 3. This means that three time slots suffice to schedule the meetings, and no fewer will do. To see why three are enough, think of a vertex's colour as a time slot: none of the vertices that receive the same colour are adjacent, so none of the corresponding committees share any members and thus that whole group of committees can be scheduled to meet at the same time. There are variants of this problem that involve,
for example, scheduling exams so that no student will be obliged to be in two places
at the same time or constructing sufficiently many storage cabinets in a lab so that
chemicals that would react explosively if stored together can be housed separately:
see this week’s Problem Set for another example.
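The whole pipeline—build the clash graph from membership lists, then colour it—fits in a few lines of Python. The committee abbreviations below come from Figure 3.3, but the member names p1–p4 and the membership lists are invented for illustration; they are not the real data behind the figure:

```python
# Hypothetical membership lists: CMS, Ed and Just pairwise share a member
# (so they form a triangle in the clash graph) while Def clashes with nobody.
memberships = {
    "CMS": {"p1", "p2"},
    "Ed": {"p1", "p3"},
    "Just": {"p2", "p3"},
    "Def": {"p4"},
}

committees = list(memberships)
clashes = {c: set() for c in committees}
for i, c1 in enumerate(committees):
    for c2 in committees[i + 1:]:
        if memberships[c1] & memberships[c2]:   # a shared member => an edge
            clashes[c1].add(c2)
            clashes[c2].add(c1)

# Greedy colouring: a colour is a time slot, and committees that receive
# the same colour can meet simultaneously.
slot = {}
for c in committees:
    used = {slot[n] for n in clashes[c] if n in slot}
    slot[c] = min(k for k in range(1, len(committees) + 1) if k not in used)
```

With this data the triangle forces three slots, and the clash-free committee shares a slot with one of the others.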
Figure 3.3: The graph at left has vertices labelled with abbreviated committee names
and edges given by shared members. The graph at right is isomorphic, but has been
redrawn for clarity and given a three-colouring, which turns out to be optimal.
Chapter 4
Efficiency of algorithms
though the discussion there mentions some graph-theoretic matters that we have
not yet covered.
4.1 Introduction
(Video 3.1) The aim of today's lecture is to develop some convenient terms for the way in which the amount of work required to solve a problem with a particular algorithm depends
on the “size” of the input. Note that this is a property of the algorithm: it may
be possible to find a better approach that solves the problem with less work. If we
were being very careful about these ideas we would make a distinction between the
quantities I’ll introduce below, which, strictly speaking, describe the time complexity
of an algorithm, and a separate set of bounds that say how much computer memory
(or how many sheets of paper, if we’re working by hand) an algorithm requires. This
latter quantity is called the space complexity of the algorithm, but we won’t worry
about that much in this course.
To get an idea of the kinds of results we’re aiming for, recall the standard algo-
rithm for pencil-and-paper addition of two numbers (write one number above the
other, draw a line underneath . . . ).
      2011
    +   21
    ------
      2032
The basic step in this process is the addition of two decimal digits, for example, in
the first column, 1 + 1 = 2. The calculation here thus requires two basic steps.
More generally, the number of basic steps required to perform an addition a + b
using the pencil-and-paper algorithm depends on the numbers of decimal digits in
a and b. The following proposition (whose proof is left to the reader) explains why
it’s thus natural to think of log10 (a) and log10 (b) as the sizes of the inputs to the
addition algorithm.
Definition 4.1 (Floor and ceiling). For a real number x ∈ R, define ⌊x⌋, which is read as "floor of x", to be the greatest integer less-than-or-equal-to x. Similarly, define ⌈x⌉, "ceiling of x", to be the least integer greater-than-or-equal-to x. In more conventional notation these functions are given by

    ⌊x⌋ = max{n ∈ Z | n ≤ x}  and  ⌈x⌉ = min{n ∈ Z | n ≥ x}.
Proposition 4.2 (Logs and length in decimal digits). The decimal representation
of a number n > 0 ∈ N has exactly d = 1 + ⌊log10 (n)⌋ decimal digits.
In light of these results, one might hope to say something along the lines of
The pencil-and-paper addition algorithm computes the sum a + b
in 1 + min (⌊log10 (a)⌋, ⌊log10 (b)⌋) steps.
This is a bit of a mouthful and, worse, isn’t even right. Quick-witted readers will
have noticed that we haven’t taken proper account of carried digits. The example
above didn’t involve any, but if instead we had computed
      1959
    +   21
    ------
      1980
we would have carried a 1 from the first column to the second and so would have
needed to do three basic steps: 9 + 1, 1 + 5, and 6 + 2.
In general, if the larger of a and b has d decimal digits then computing a + b
could require as many as d − 1 carrying additions. That means our statement above
should be replaced by something like
The number of steps N required for the pencil-and-paper addition algorithm to compute the sum a + b satisfies the following bounds:

    1 + min(⌊log10 (a)⌋, ⌊log10 (b)⌋) ≤ N ≤ min(⌊log10 (a)⌋, ⌊log10 (b)⌋) + ⌊log10 (max(a, b))⌋ + 1.
This introduces an important theme: knowing the size of the input isn’t always
enough to determine exactly how long a computation will take, but may enable one
to place bounds on the running-time.
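One way to make these bounds concrete is to simulate the pencil-and-paper algorithm and count its steps. The function below is my own accounting of the process (one step per column where both numbers have a digit, plus one step whenever a carried digit has to be folded in); it reproduces the two steps counted for 2011 + 21 and the three for 1959 + 21:

```python
def addition_steps(a, b):
    """Count basic digit-additions in pencil-and-paper addition of a + b."""
    steps = 0
    carry = 0
    while a > 0 or b > 0:
        da, db = a % 10, b % 10
        if a > 0 and b > 0:
            steps += 1               # adding the two digits in this column
        if carry:
            steps += 1               # folding in the carried digit
        carry = (da + db + carry) // 10
        a //= 10
        b //= 10
    return steps
```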
The statements above are rather fiddly and not especially useful. In practice,
people want such estimates so they can decide whether a problem is do-able at
all. They want to know whether, say, given a calculator that can add two 5-digit
numbers in one second, it would be possible to work out the University of Manchester's payroll in less than a month. For these sorts of questions one doesn't want
the cumbersome, though precise sorts of statements formulated above, but rather
something semi-quantitative along the lines of:
The remainder of today’s lecture will develop a rigorous framework that we can use
to make such statements.
In the rest of this section I'll discuss the associated algorithms briefly, with an eye to answering three key questions:

(1) What are the details of the algorithm and what should we regard as the basic step?

(2) How should we measure the size of the input?

(3) How does the number of basic steps required depend on the size of the input?
If we take as the basic step the process of looking at a neighbour's colour, then the algorithm requires a number of basic steps given by

    ∑_{v∈V} |Av | = ∑_{v∈V} deg(v) = 2|E|,    (4.1)
where the final equality follows from Theorem 1.8, the Handshaking Lemma. This
suggests that we should measure the size of the problem in terms of the number of
edges.
Here we'll measure the size of the problem with n, the number of rows in the matrices, and take as our basic steps the arithmetic operations, addition and multiplication of two real numbers. The entries of the product C = AB are given by

    Cjk = ∑_{l=1}^{n} Ajl Blk ,

and this formula makes it easy to see that it takes n multiplications and (n − 1) additions to compute a single entry in the product matrix and so, as there are n² such entries, we need n²(2n − 1) = 2n³ − n² basic operations in all.
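The count is easy to confirm by instrumenting the naive algorithm. The sketch below (names mine) multiplies two n × n matrices while tallying the arithmetic: each entry costs n multiplications and n − 1 additions, so for n = 3 it performs 27 multiplications and 18 additions:

```python
def matmul_with_count(A, B):
    """Naive n x n matrix product, tallying multiplications and additions."""
    n = len(A)
    mults = adds = 0
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = A[i][0] * B[0][j]       # first product for this entry
            mults += 1
            for k in range(1, n):       # n - 1 further products and sums
                s += A[i][k] * B[k][j]
                mults += 1
                adds += 1
            C[i][j] = s
    return C, mults, adds

identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
C, mults, adds = matmul_with_count(identity, identity)
```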
We can thus test whether a number is prime with the following simple algorithm:
Algorithm 4.4 (Primality testing via trial division).
Given a natural number n ∈ N, determine whether it is prime.

(1) For each b ∈ N in the range 2 ≤ b ≤ √n {

(2)     If b divides n then n is composite: stop.

(3) }

(4) If no b in this range divides n, report that n is prime.
This problem is more subtle than the previous examples in a couple of ways. A
natural candidate for the basic step here is the computation of (n mod b), which
answers the question “Does b divide n?”. But the kinds of numbers whose primality
one wants to test in, for example, cryptographic applications are large and so we might want to take account of the magnitude of n in our measure of the input size.
If we compute (n mod b) with the standard long-division algorithm, the amount of
work required for a basic step will itself depend on the number of digits in n and so,
as in our analysis of the pencil-and-paper addition algorithm, it’ll prove convenient
to measure the size of the input with d = log10 (n), which is approximately the
number of decimal digits in n.
A further subtlety is that because the algorithm reports an answer as soon as
it finds a factor, the amount of work required varies wildly, even among n with the
same number of digits. For example, half of all 100-digit numbers are even and so
will be revealed as composite by the very first value of b we'll try. Primes, on the other hand, will require ⌊√n⌋ − 1 tests. A standard way to deal with this second
issue is to make estimates about the worst case efficiency: in this case, that’s the
running-time required for primes. A much harder approach is to make an estimate
of the average case running-time obtained by averaging over all inputs with a given
size.
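Algorithm 4.4 translates directly into a few lines of Python. This sketch stops as soon as a factor is found, which is why composite inputs are usually cheap while primes force all ⌊√n⌋ − 1 trial divisions, the worst case discussed above:

```python
def is_prime(n):
    """Primality testing via trial division: try every b with
    2 <= b <= sqrt(n) and report composite at the first divisor found."""
    if n < 2:
        return False
    b = 2
    while b * b <= n:          # equivalent to b <= sqrt(n), without rounding
        if n % b == 0:
            return False       # b divides n, so n is composite
        b += 1
    return True                # no b in the range divides n
```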
• f (n) = O(g(n)) if ∃ c1 > 0 such that, for all sufficiently large n, f (n) ≤ c1 g(n);

• f (n) = Ω(g(n)) if ∃ c2 > 0 such that, for all sufficiently large n, f (n) ≥ c2 g(n);

• f (n) = Θ(g(n)) if both f (n) = O(g(n)) and f (n) = Ω(g(n)).
Notice that the definitions of f (n) = O(g(n)) and f (n) = Ω(g(n)) include the phrase "for all sufficiently large n". This is equivalent to saying, for example,

    f (n) = O(g(n)) if there exist some c1 > 0 and N1 ≥ 0 such that for all n ≥ N1 , f (n) ≤ c1 g(n).
The point is that the definitions are only concerned with asymptotic growth—they’re
all about the limit of large n.
which is a special case of a more general result. One can prove—see the Problem
Sets—that if f : N → R+ is a polynomial in n of degree k, then f = Θ(nk ). Algo-
rithms that are O(nk ) for some k ∈ R are often called polynomial time algorithms.
To get the righthand side in terms of the problem size d = ⌊log10 (n)⌋ + 1, note that, for all x ∈ R, x < ⌊x⌋ + 1, and so log10 (n) < d. Then

    √n = n^(1/2) = (10^(log10 (n)))^(1/2),

which implies that

    f (d) ≤ √n
          = (10^(log10 (n)))^(1/2)
          ≤ (10^d)^(1/2)
          = 10^(d/2),
where, in passing from the second line to the third, I have replaced log10 (n) with d.
This change increases the righthand side of the inequality (which thus still provides
an upper bound on f (d)) because, as we saw above, d > log10 (n). Thus we have
established that primality testing via trial division is O(10d/2 ).
Such algorithms are often called exponential-time or just exponential and they
are generally regarded as impractical for, as one increases d, the computational
requirements can jump abruptly from something modest and doable to something
impossible. Further, we haven’t yet taken any account of the way the sizes of the
numbers n and b affect the amount of work required for the basic step. If we were to
do so—if, say, we choose single-digit arithmetic operations as our basic steps—the
bound on the operation count would only grow larger: trial division is not a feasible
way to test large numbers for primality.
4.5 Afterword
I chose the algorithms discussed above for simplicity, but they are not necessarily
the best known ways to solve the problems. I also simplified the analysis by judi-
cious choice of problem-size measurement and basic step. For example, in practical
matrix multiplication problems the multiplication of two matrix elements is more
computationally expensive than addition of two products, to the extent that people
often just ignore the additions and try to estimate the number of multiplications.
The standard algorithm is still Θ(n3 ), but more efficient algorithms are known. The
basic idea is related to a clever observation about the number of multiplications
required to compute the product of two complex numbers:
    (a + bi)(c + di) = (ac − bd) + (ad + bc)i.    (4.3)
The most straightforward approach requires us to compute the four products ac,
bd, ad and bc. But Gauss noticed that one can instead compute just three products,
ac, bd and
q = (a + b)(c + d) = ac + ad + bc + bd,
and then use the relation
(ad + bc) = q − ac − bd
to compute the imaginary part of the product in Eqn. (4.3). In 1969 Volker Strassen
discovered a similar trick whose simplest application allows one to compute the
product of two 2 × 2 matrices with only 7 multiplications, as opposed to the 8
that the standard algorithm requires. Building on this observation, he found an
algorithm that can compute all the entries in the product of two n × n matrices
using only O(n^{log2 (7)}) ≈ O(n^{2.807}) multiplications.3
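Gauss’s three-multiplication trick is easy to check numerically; here is a sketch (the function name is mine):

```python
def gauss_product(a, b, c, d):
    """Compute (a + bi)(c + di) with three real multiplications
    instead of four: ac, bd and q = (a + b)(c + d)."""
    ac = a * c
    bd = b * d
    q = (a + b) * (c + d)        # q = ac + ad + bc + bd
    real = ac - bd               # real part of the product
    imag = q - ac - bd           # = ad + bc, the imaginary part
    return real, imag
```

Strassen’s matrix trick has the same flavour: it trades one multiplication for a handful of extra additions, which pays off once the recursion is applied to large blocks.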
More spectacularly, it turns out that there is a polynomial-time algorithm for
primality testing. It was discovered in the early years of this century by Agrawal,
Kayal and Saxena (often shortened to AKS)4 . This is particularly cheering in that
two of the authors, Kayal and Saxena, were undergraduate project students when
they did this work.
3
I learned about Strassen’s work in a previous edition of William H. Press, Saul A. Teukolsky,
William T. Vetterling and Brian P. Flannery (2007), Numerical Recipes in C++, 3rd edition, CUP,
Cambridge. ISBN: 978-0-521-88068-8, which is very readable, but for a quick overview of the area
you might want to look at Sara Robinson (2005), Toward an optimal algorithm for matrix
multiplication, SIAM News, 38(9).
4
See: Manindra Agrawal, Neeraj Kayal and Nitin Saxena (2004), PRIMES is in P, Annals of
Mathematics, 160(2):781–793. DOI: 10.4007/annals.2004.160.781. The original AKS paper
is quite approachable, but an even more reader-friendly treatment of their proof appears in An-
drew Granville (2005), It is easy to determine whether a given integer is prime, Bulletin of the
AMS, 42:3–38. DOI: 10.1090/S0273-0979-04-01037-7.
Chapter 5
Reading: Some of the material in this lecture comes from Section 1.2 of
Dieter Jungnickel (2013), Graphs, Networks and Algorithms, 4th edition,
which is available online via SpringerLink.
If you are at the university, either physically or via the VPN, you can download the
chapters of this book as PDFs.
Several of the examples in the previous lectures—for example two of the sub-
graphs in Figure 2.7 and the graph in Figure 1.12—consist of two or more “pieces”.
If one thinks about the definition of a graph as a pair of sets, these multiple pieces
don’t present any mathematical problem, but it proves useful to have precise vocab-
ulary to discuss them.
(v0 , v1 , . . . , vL ). (5.2)
such that ej = (vj−1 , vj ) ∈ E. Note that the vertices don’t have to be distinct. A
walk for which v0 = vL is a closed walk.
This definition makes sense in both directed and undirected graphs; in the directed
case it corresponds to a route that traverses the edges in the sense of the arrows
that represent them.
Definition 5.2. The length of a walk is the number of edges in the sequence. For
the walk in Eqn. 5.1 the length is thus L.
Definition 5.3. A trail is a walk in which all the edges ej are distinct and a closed
trail is a closed walk that is also a trail.
Definition 5.4. A path is a trail in which all the vertices in the sequence in
Eqn (5.2) are distinct.
Definition 5.5. A cycle is a closed trail in which all the vertices are distinct, except
for the first and last, which are identical.
Remark 5.7. As the three terms walk, trail and path mean very similar things in
ordinary speech, it can be hard to keep their graph-theoretic definitions straight, even
though they make useful distinctions. The following observations may help:
• All trails are walks and all paths are trails. In set-theoretic notation:
  {paths} ⊆ {trails} ⊆ {walks}.
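These distinctions can be encoded directly. The following sketch (names and representation are my own) classifies a vertex sequence over an undirected edge set:

```python
def classify(seq, edges):
    """Return 'path', 'trail', 'walk' or None for a vertex sequence seq,
    where edges is a set of frozensets {u, v} (undirected edges)."""
    steps = [frozenset(p) for p in zip(seq, seq[1:])]
    if not all(s in edges for s in steps):
        return None            # some step is not an edge: not even a walk
    if len(set(steps)) < len(steps):
        return "walk"          # an edge is repeated: a walk but not a trail
    if len(set(seq)) < len(seq):
        return "trail"         # a vertex is repeated: a trail but not a path
    return "path"
```

With the graph of Figure 5.2, the sequence (a, b, c, d, e, b, f) comes out as a trail, matching the caption.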
5.2 Connectedness
We want to be able to say that two vertices are connected if we can get from one
to the other by moving along the edges of the graph (Video 3.4). Here’s a definition that builds
on the terms defined in the previous section:
Figure 5.2: The walk specified by the vertex sequence (a, b, c, d, e, b, f ) is a trail as
all the edges are distinct, but it’s not a path as the vertex b is visited twice.
Definition 5.8. In a graph G(V, E), two vertices a and b are said to be connected
if there is a walk given by a vertex sequence (v0 , . . . , vL ) where v0 = a and vL = b.
Additionally, we will say that a vertex is connected to itself.
Definition 5.9. A graph in which each pair of vertices is connected is a connected
graph.
See Figure 5.3 for an example of a connected graph and another that is not con-
nected.
Figure 5.3: The graph at left is connected, but the one at right is not, because
there is no walk connecting the shaded vertices labelled a and b.
Once we have the definitions above, it’s possible to make a precise definition of
the “pieces” of a graph. It depends on the notion of an equivalence relation, which
you should have met earlier in your studies.
Definition 5.10. A relation ∼ on a set S is an equivalence relation if it is:
reflexive: a ∼ a for all a ∈ S;
symmetric: for all a, b ∈ S, a ∼ b implies b ∼ a;
transitive: for all a, b, c ∈ S, a ∼ b and b ∼ c together imply a ∼ c.
5.2.1 Connectedness in undirected graphs
The key idea is that “is-connected-to” is an equivalence relation on the vertex set of
a graph. To see this, we need only check the three properties:
reflexive: This is true by definition, and is the main reason why we say that a
vertex is always connected to itself.
symmetric: If there is a walk from a to b then we can simply reverse the corre-
sponding sequence of edges to get a walk from b to a.
transitive: If (a = u0 , u1 , . . . , uL1 −1 , uL1 = b) is a walk from a to b and
(b = w0 , w1 , . . . , wL2 = c) is a walk from b to c, then traversing the first
walk and then the second gives a walk (a = u0 , . . . , uL1 = b = w0 , . . . , wL2 = c)
from a to c.
The process of traversing one walk after another, as we did in the proof of the
transitive property, is sometimes called concatenation of walks.
The disjointness of equivalence classes means that each vertex belongs to exactly one
connected component and so we will sometimes talk about the connected component
of a vertex.
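Because each vertex lies in exactly one equivalence class, the components can be computed with a disjoint-set (union-find) structure; a minimal sketch, with names of my own choosing:

```python
def components(vertices, edges):
    """Partition the vertex set of an undirected graph into its
    connected components (the equivalence classes of is-connected-to)."""
    parent = {v: v for v in vertices}

    def find(v):                          # representative of v's class
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, w in edges:                    # each edge merges two classes
        parent[find(u)] = find(w)

    classes = {}
    for v in vertices:
        classes.setdefault(find(v), set()).add(v)
    return list(classes.values())
```

Each edge merges the classes of its endpoints, so at the end two vertices share a representative exactly when they are connected.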
Figure 5.4: In a directed graph it’s possible to have a walk from vertex a to vertex
b without having a walk from b to a, as in the digraph at left. In the digraph at right
there are walks from u to v and from v to u so this pair is strongly connected.
Definition 5.13. Two vertices a and b in a directed graph are strongly connected
if b is accessible from a and a is accessible from b. Additionally, we regard a vertex
as strongly connected to itself.
With these definitions it’s easy to show (see the Problem Sets) that “is-strongly-
connected-to” is an equivalence relation on the vertex set of a directed graph and so
the vertex set decomposes into a disjoint union of strongly connected components.
This prompts the following definition:
Definition 5.14. A directed graph G(V, E) is strongly connected if every pair of
its vertices is strongly connected. Equivalently, a digraph is strongly connected if it
contains exactly one strongly connected component.
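Definition 5.14 can be tested directly: a digraph is strongly connected exactly when some fixed vertex s reaches every vertex both in the digraph and in its reversal. A sketch under my own naming:

```python
from collections import deque

def strongly_connected(vertices, edges):
    """True iff every ordered pair of vertices is joined by a walk."""
    def reachable(s, adj):
        seen, queue = {s}, deque([s])
        while queue:                     # breadth-first search from s
            u = queue.popleft()
            for w in adj.get(u, ()):
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        return seen

    fwd, rev = {}, {}
    for u, w in edges:
        fwd.setdefault(u, []).append(w)  # edges as given
        rev.setdefault(w, []).append(u)  # edges reversed
    s = next(iter(vertices))
    return reachable(s, fwd) == set(vertices) == reachable(s, rev)
```

Reaching everything in the reversed digraph is the same as every vertex reaching s, which together with forward reachability gives strong connectedness.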
Finally, there’s one other notion of connectedness applicable to directed graphs,
weak connectedness:
Definition 5.15. A directed graph G(V, E) is weakly connected if, when one
converts all its edges to undirected ones, it becomes a connected, undirected graph.
Figure 5.5 illustrates the difference between strongly and weakly connected graphs.
Finally, I’d like to introduce a piece of notation for the graph that one gets by
ignoring the directedness of the edges in a digraph:
Definition 5.16. If G(V, E) is a directed multigraph then |G| is the undirected
multigraph produced by ignoring the directedness of the edges. Note that if both the
directed edges (a, b) and (b, a) are present in a digraph G(V, E), then two parallel
copies of the undirected edge (a, b) appear in |G|.
Figure 5.5: The graph at the top is weakly connected, but not strongly connected,
while the one at the bottom is both weakly and strongly connected.
Suppose that the vertices a and b are connected, so that there is a walk
    (v0 , . . . , vL )    (5.3)
with v0 = a and vL = b. Either all the vertices in this sequence are distinct, or at
least one of them is repeated.
In the first case the walk is also a path and we are finished. In the second case
it is always possible to find a path from a to b by removing some edges from the
walk in Eqn. (5.3). This sort of “path surgery” is outlined below and illustrated in
Example 5.18.
We are free to assume that the set of repeated vertices doesn’t include a or b
as we can easily make this true by trimming some vertices off the two ends of the
sequence. To be concrete, we can define a new walk by first trimming off everything
before the last appearance of a—say that’s vj —to yield a walk specified by the
vertex sequence
(vj , . . . , vL )
and then, in that walk, remove everything that comes after the first appearance of
b—say that’s vk —so that we end up with a new walk
    (vj , . . . , vk ).    (5.4)
To finish the proof we then need to deal with the case where the walk in (5.4),
which doesn’t contain any repeats of a or b, still contains repeats of one or more
other vertices; call one such repeated vertex c.
q t w
a r s u v b
Figure 5.6: In the graph above the shaded vertices a and b are connected by the path
(a, r, s, u, v, b).
We then remove the part of the walk after vj′ , up to and including vk′ , where vj′ is
the first appearance of c in the sequence in Eqn. (5.4) and vk′ is
the last. There can only be finitely many repeated vertices in the original walk (5.3)
and so, by using the approach sketched above repeatedly, we can eliminate them all,
leaving a path from a to b. Very scrupulous students may wish to rewrite this proof
using induction on the number of repeated vertices.
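The surgery can be condensed into a short loop: scan the walk and, whenever a vertex reappears, cut the sequence back to its earlier appearance. A sketch (the function name is mine):

```python
def walk_to_path(walk):
    """Extract a path from a walk by removing the detours between
    repeated appearances of a vertex."""
    path = []
    for v in walk:
        if v in path:
            path = path[:path.index(v)]  # drop the detour through v
        path.append(v)
    return path
```

Each adjacent pair kept was adjacent at the moment it was appended, so the result is still a walk, and no vertex is repeated: applied to a walk such as (a, r, s, u, t, s, u, v, b) in Figure 5.6, it yields the path (a, r, s, u, v, b) from the caption.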
Example 5.18 (Connected vertices are connected by a path). Consider the graph
in Figure 5.6. The vertices a and b are connected by the walk
Part II
though the discussion there includes a lot of material about counting trees that we’ll
handle in a different way.
Trees play an important role in many applications: see Figure 6.1 for examples.
Figure 6.1: The two graphs at left (white and yellow vertices) are trees, but the two
at right aren’t: the one at upper right (with green vertices) has multiple connected
components (and so it isn’t connected) while the one at lower right (blue vertices)
contains a cycle. The graph at upper right is, however, a forest as each of its
connected components is a tree.
Figure 6.2: In the two trees above the internal nodes are white, while the leaf nodes
are coloured green or yellow.
6.1.2 Kinds of trees
Definition 6.5. A binary tree is a tree in which every internal node has degree
three.
Definition 6.6. A rooted tree is a tree with a distinguished leaf node called the
root node.
Warning to the reader: The definition of rooted tree above is common among
biologists, who use trees to represent evolutionary lineages (see Darwin’s sketch at
right in Figure 6.3). Other researchers, especially computer scientists, use the same
term to mean something slightly different.
Figure 6.3: At left are three examples of rooted binary trees. In all cases the root
node is brown, the leaves are green and the internal nodes are white. At right is a page
from one of Darwin’s notebooks, showing the first known sketch of an evolutionary
tree: here the nodes represent species and the edges indicate evolutionary descent.
Lemma 6.7 (Minimal |E| in a connected graph; Video 4.2). A connected graph
on n vertices has at least (n − 1) edges.
Lemma 6.8 (Maximal |E| in an acyclic graph). An acyclic graph on n vertices has
at most (n − 1) edges.
Lemma 6.10 (Vertices of degree 1). If a graph G(V, E) has n ≥ 2 vertices, none
of which are isolated, and (n − 1) edges then G has at least two vertices of degree 1.
Figure 6.4: A graph G(V, E) and the subgraphs G\v formed by deleting the yellow
vertex v and G\e formed by deleting the red edge e.
• a base case that typically involves a graph with very few vertices or edges
(often just one or two) and for which the result is obvious;
• an inductive hypothesis in which one assumes the result is true for all
graphs with, say, n0 or fewer vertices (or perhaps m0 or fewer edges);
• an inductive step where one starts with a graph that satisfies the hypotheses
of the theorem and has, say, n0 + 1 vertices (or m0 + 1 edges or whatever is
appropriate) and then reduces the theorem as it applies to this larger graph
to something involving smaller graphs (to which the inductive hypothesis ap-
plies), typically by deleting an edge or vertex.
Definition 6.11. If G(V, E) is a graph and v ∈ V is one of its vertices then G\v
is defined to be the subgraph formed by deleting v and all the edges that are incident
on v.
Definition 6.12. If G(V, E) is a graph and e ∈ E is one of its edges then G\e is
defined to be the subgraph formed by deleting e.
Figure 6.5: In the inductive step of the proof of Lemma 6.7 we delete some arbitrary
vertex v ∈ V in a connected graph G(V, E) to form the graph G\v. The result may
still be a connected graph, as in G\v1 at upper right, or may fall into several connected
components, as in G\v2 at lower right.
Base case: There is only one graph with |V | = 1 and it is, by definition, connected
and has |E| = 0, which satisfies the lemma. One could alternatively start from
K2 , which is the only connected graph on two vertices and has |E| = 1.
Inductive hypothesis: Suppose that the lemma is true for all graphs G(V, E) with
1 ≤ |V | ≤ n0 , for some fixed n0 .
|E| ≥ |E ′ | + 1 ≥ (n0 − 1) + 1 ≥ n0
In the second case—where deleting v causes G to fall into k ≥ 2 connected
components—we can call the components G1 (V1 , E1 ), G2 (V2 , E2 ), · · · , Gk (Vk , Ek )
with nj = |Vj |. Then
    ∑_{j=1}^{k} nj = ∑_{j=1}^{k} |Vj | = |V ′ | = |V | − 1 = n0 .
And, as we know that the original graph G was connected, we also know
that the deleted vertex v was connected by at least one edge to each of the k
components of G\v. Combining this observation with Eqn. (6.1) gives us
|E| ≥ |E ′ | + k ≥ (n0 − k) + k ≥ n0 ,
Base case: Either K1 or K2 could serve as the base case: both are acyclic graphs
that have a maximum of |V | − 1 edges.
Inductive hypothesis: Suppose that Lemma 6.8 is true for all acyclic graphs with
|V | ≤ n0 , for some fixed n0 .
If we again define nj = |Vj |, we know that nj ≤ n0 for all j and so the inductive
hypothesis applies to each component separately: |Ej | ≤ nj − 1. Adding these
up yields
    |E ′ | = ∑_{j=1}^{k} |Ej | ≤ ∑_{j=1}^{k} (nj − 1) = ( ∑_{j=1}^{k} nj ) − k = (n0 + 1) − k.
As there are no isolated vertices, we also know that deg(vj ) ≥ 1 for all j. Now
assume—aiming for a contradiction—that there is at most a single vertex with
degree one. That is, assume deg(v1 ) ≥ 1, but deg(vj ) ≥ 2 ∀j ≥ 2. Then
    ∑_{j=1}^{n} deg(vj ) = deg(v1 ) + ∑_{j=2}^{n} deg(vj )
                        ≥ 1 + ∑_{j=2}^{n} 2
                        = 1 + (n − 1) × 2
                        = 2n − 1.
This contradicts Eqn. (6.2), which says that the sum of degrees is 2n − 2. Thus
it must be true that two or more vertices have degree one, which is the result we
sought.
Theorem 6.13 (Jungnickel’s Theorem 1.2.8). For a graph G(V, E) on |V | = n
vertices, any two of the following imply the third:
(a) G is connected.
(b) G is acyclic.
(c) G has exactly n − 1 edges.
Base case: There is only one graph with |V | = 1. It’s acyclic, has |V | − 1 = 0
edges and is connected.
Thus G contains no isolated vertices and so, by the technical lemma from
the previous section (Lemma 6.10), we know that it has at least two vertices
of degree one. Say that one of these two is u ∈ V and delete it to make
G′ (V ′ , E ′ ) = G\u. Then G′ is still acyclic, because G is, and deleting vertices
can’t create cycles. Furthermore G′ has |V ′ | = |V | − 1 = n0 vertices and
|E ′ | = |E| − 1 = n0 − 1 edges. This means that the inductive hypothesis
applies and we can conclude that G′ is connected. But if G′ is connected, so
is G and we are finished.
Chapter 7
This section of the notes introduces a pair of very beautiful theorems that use linear
algebra to count trees in graphs.
Reading:
The next few lectures are not covered in Jungnickel’s book, though a few definitions
in our Section 7.2.1 come from his Section 1.6. But the main argument draws on
ideas that you should have met in Foundations of Pure Mathematics, Linear Algebra
and Algebraic Structures.
Figure 7.1: A graph G(V, E) with V = {v1 , . . . , v4 } and three of its spanning
trees: T1 , T2 and T3 . Note that although T1 and T3 are isomorphic, we regard them
as different spanning trees for the purposes of the Matrix-Tree Theorem.
Example 7.3 (Graph Laplacian). The graph G whose spanning trees are illustrated
in Figure 7.1 has graph Laplacian
L = D − A
    [ 2 0 0 0 ]   [ 0 1 1 0 ]
  = [ 0 2 0 0 ] − [ 1 0 1 0 ]
    [ 0 0 3 0 ]   [ 1 1 0 1 ]
    [ 0 0 0 1 ]   [ 0 0 1 0 ]

    [  2 −1 −1  0 ]
  = [ −1  2 −1  0 ]    (7.1)
    [ −1 −1  3 −1 ]
    [  0  0 −1  1 ]
Once we have these two definitions it’s easy to state the Matrix-Tree theorem
Theorem 7.4 (Kirchhoff’s Matrix-Tree Theorem, 1847). If G(V, E) is an undirected
graph and L is its graph Laplacian, then the number NT of spanning trees contained
in G is given by the following computation.
(1) Choose a vertex vj and eliminate the j-th row and column from L to get a new
matrix L̂j ;
(2) Compute
NT = det(L̂j ). (7.2)
The number NT in Eqn. (7.2) counts spanning trees that are distinct as subgraphs
of G: equivalently, we regard the vertices as distinguishable. Thus some of the trees
that contribute to NT may be isomorphic: see Figure 7.1 for an example.
This result is remarkable in many ways—it seems amazing that the answer
doesn’t depend on which vertex we choose when constructing L̂j —but to begin
with let’s simply use the theorem to compute the number of spanning trees for the
graph in Example 7.3
Example 7.5 (Counting spanning trees). If we take G to be the graph whose Lapla-
cian is given in Eqn. (7.1) and choose vj = v1 we get
        [  2 −1  0 ]
L̂1 =   [ −1  3 −1 ]
        [  0 −1  1 ]
and so the number of spanning trees is
NT = det(L̂1 )
   = 2 × det [ 3 −1 ; −1 1 ] − (−1) × det [ −1 −1 ; 0 1 ]
   = 2 × (3 − 1) + (−1 − 0)
   = 4 − 1 = 3.
I’ll leave it as an exercise for the reader to check that one gets the same result from
det(L̂2 ), det(L̂3 ) and det(L̂4 ).
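The recipe in Theorem 7.4 is easy to automate. The sketch below (function names mine) uses cofactor expansion for the determinant, which is perfectly adequate for small integer matrices:

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j+1:]
                                          for row in M[1:]])
               for j in range(len(M)))

def spanning_trees(L, j):
    """Kirchhoff: delete row and column j of the Laplacian, take det."""
    reduced = [row[:j] + row[j+1:] for i, row in enumerate(L) if i != j]
    return det(reduced)

# Laplacian from Eqn. (7.1); every choice of deleted vertex gives 3.
L = [[ 2, -1, -1,  0],
     [-1,  2, -1,  0],
     [-1, -1,  3, -1],
     [ 0,  0, -1,  1]]
```

Running spanning_trees(L, j) for each j confirms that the answer is independent of the deleted vertex.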
(ii) The graph |T | that one obtains by ignoring the directedness of the edges is a
tree.
See Figure 7.2 for an example. Of course, it’s then natural to define an analogue of
a spanning tree:
1
Proved by Bill Tutte about a century after Kirchhoff’s result in W.T. Tutte (1948), The
dissection of equilateral triangles into equilateral triangles, Math. Proc. Cambridge Phil. Soc.,
44(4):463–482.
Figure 7.2: The graph at left is an arborescence whose root vertex is shaded red,
while the graph at right contains a spanning arborescence whose root is shaded red
and whose edges are blue.
Nj = det(L̂j ) (7.4)
where L̂j is the matrix produced by deleting the j-th row and column from L.
Here again, the number Nj in Eqn. (7.4) counts spanning arborescences that are
distinct as subgraphs of G: equivalently, we regard the vertices as distinguishable.
Thus some of the arborescences that contribute to Nj may be isomorphic, but if they
involve different edges we’ll count them separately.
Example 7.10 (Counting spanning arborescences). First we need to build the matrix
L defined by Eqn. (7.3) in the statement of Tutte’s theorem. If we choose G to be
the graph pictured at upper left in Figure 7.3 then this is L = Din − A where Din is
a diagonal matrix with Djj = degin (vj ) and A is the graph’s adjacency matrix.
L = Din − A
    [ 2 0 0 0 ]   [ 0 1 0 0 ]
  = [ 0 3 0 0 ] − [ 1 0 1 1 ]
    [ 0 0 1 0 ]   [ 0 1 0 1 ]
    [ 0 0 0 2 ]   [ 1 1 0 0 ]

    [  2 −1  0  0 ]
  = [ −1  3 −1 −1 ]
    [  0 −1  1 −1 ]
    [ −1 −1  0  2 ]
Then Table 7.1 summarises the results for the number of rooted trees.
Figure 7.3: The digraph at upper left, on which the vertices are labelled, has three
spanning arborescences rooted at v4 .
j      L̂j                        det(L̂j )

1      [  3 −1 −1 ]
       [ −1  1 −1 ]               2
       [ −1  0  2 ]

2      [  2  0  0 ]
       [  0  1 −1 ]               4
       [ −1  0  2 ]

3      [  2 −1  0 ]
       [ −1  3 −1 ]               7
       [ −1 −1  2 ]

4      [  2 −1  0 ]
       [ −1  3 −1 ]               3
       [  0 −1  1 ]
Table 7.1: The number of spanning arborescences for the four possible roots in the
graph at upper left in Figure 7.3.
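The entries of Table 7.1 can be reproduced mechanically (a sketch; function names are mine):

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j+1:]
                                          for row in M[1:]])
               for j in range(len(M)))

# In-degree Laplacian L = D_in - A of the digraph in Figure 7.3.
L = [[ 2, -1,  0,  0],
     [-1,  3, -1, -1],
     [ 0, -1,  1, -1],
     [-1, -1,  0,  2]]

def arborescences(L, j):
    """Spanning arborescences rooted at vertex j (0-indexed):
    delete row and column j, then take the determinant."""
    reduced = [row[:j] + row[j+1:] for i, row in enumerate(L) if i != j]
    return det(reduced)
```

Taking the roots v1, . . . , v4 in turn reproduces the counts 2, 4, 7 and 3 from Table 7.1.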
Figure 7.5: The undirected graph at left is a spanning tree for G in Figure 7.4,
while the directed graph at right is a spanning arborescence for H (right side of
Fig. 7.4) rooted at the shaded vertex v.
the number of spanning arborescences in H, the result will be the same numerically
as if we’d used Kirchhoff’s theorem to count spanning trees in G.
Chapter 8
Matrix-Tree Ingredients
This lecture introduces some ideas that we will need for the proof of the Matrix-Tree
Theorem. Many of them should be familiar from Foundations of Pure Mathematics
or Algebraic Structures.
Reading:
The material about permutations and the determinant of a matrix presented here
is pretty standard and can be found in many places: the Wikipedia articles on
Determinant (especially the section on n × n matrices) and Permutation are not
bad places to start.
The remaining ingredient for the proof of the Matrix-Tree theorems is the Prin-
ciple of Inclusion/Exclusion. It is covered in the first year module Foundations of
Pure Mathematics, but it is also a standard technique in Combinatorics and so is
discussed in many introductory books1 . The Wikipedia article is, again, a good place
to start. Finally, as a convenience for students from outside the School of Mathe-
matics, I have included an example in Section 8.4.4 and an Appendix, Section 8.5,
that provides full details of all the proofs.
8.1.1 The Symmetric Group Sn
One can turn the set of all permutations on n objects into a group by using composi-
tion (applying one function to the output of another) of permutations as the group
multiplication. The resulting group is called the symmetric group on n objects
or Sn and it has the following properties.
• The identity element is the permutation in which σ(j) = j for all 1 ≤ j ≤ n.
• Sn has n! elements.
8.1.2 Cycles and sign
A cycle of length ℓ is a permutation σ that maps i1 → i2 , i2 → i3 , . . . , iℓ−1 → iℓ
and iℓ → i1 while leaving every other element fixed. We will write such a cycle as
σ = (i1 , i2 , . . . , iℓ )
and we’ll say that two cycles σ1 = (i1 , i2 , . . . , iℓ1 ) and σ2 = (j1 , j2 , . . . , jℓ2 ) are dis-
joint if
{i1 , . . . iℓ1 } ∩ {j1 , . . . jℓ2 } = ∅.
The main point about cycles is that they’re like the “prime factors” of permu-
tations in the following sense:
Proposition 8.4. A permutation has a unique (up to reordering of the cycles)
representation as a product of disjoint cycles.
This representation is often referred to as the cycle decomposition of the permutation.
Finally, given the cycle decomposition of a permutation one can define a function
that we will need in the next section.
Definition 8.5 (Sign of a permutation). The function sgn : Sn → {±1} can be
computed as follows:
• If σ is the identity permutation, then sgn(σ) = 1.
• If σ is a cycle of length ℓ then sgn(σ) = (−1)ℓ−1 .
• If σ has a decomposition into k ≥ 2 disjoint cycles whose lengths are ℓ1 , . . . , ℓk
then
    sgn(σ) = (−1)^{L−k}    where    L = ∑_{j=1}^{k} ℓj .
This definition of sgn(σ) is equivalent to one that you may know from other
courses:
sgn(σ) = 1 if σ is the product of an even number of transpositions, and sgn(σ) = −1
otherwise.
The following proposition then makes it easy to find the cycle decomposition of
a permutation σ:
Given σ ∈ Sn , let Gσ be the directed graph whose vertices are {1, . . . , n} and whose
edges are the pairs (j, σ(j)) with σ(j) ̸= j. Then (i1 , i2 , . . . , iℓ ) is a cycle in the
decomposition of σ if and only if the directed cycle specified by the vertex sequence
    (i1 , i2 , . . . , iℓ , i1 )
is a subgraph of Gσ .
For example, consider the permutation σ ∈ S6 with σ(1) = 2, σ(2) = 6, σ(3) = 4,
σ(4) = 3, σ(5) = 5 and σ(6) = 1, so that Gσ has edge set
    E = {(1, 2), (2, 6), (3, 4), (4, 3), (6, 1)}.
A diagram for this graph appears below and clearly includes two disjoint directed
cycles
[Diagram: the vertices 1, . . . , 6 drawn in a row, with arrows for the edges in E.]
Thus our permutation has fix(σ) = {5} and its cycle decomposition is
    σ = (1, 2, 6)(3, 4) = (3, 4)(1, 2, 6) = (2, 6, 1)(3, 4),
where I have included some versions where the order of the two disjoint cycles is
switched and one where the terms within the cycle are written in a different (but
equivalent) order.
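The follow-the-arrows bookkeeping above is also easy to code. The sketch below (names mine) recovers the cycle decomposition and the sign from Definition 8.5:

```python
def cycle_decomposition(sigma):
    """sigma is a dict mapping each element of {1,..,n} to its image.
    Returns the nontrivial cycles, each starting from its smallest element."""
    seen, cycles = set(), []
    for start in sorted(sigma):
        if start in seen or sigma[start] == start:
            continue                     # already placed, or a fixed point
        cycle, i = [], start
        while i not in seen:             # follow the arrows until we return
            seen.add(i)
            cycle.append(i)
            i = sigma[i]
        cycles.append(tuple(cycle))
    return cycles

def sgn(sigma):
    """sgn(sigma) = (-1)^(L - k), with k cycles of total length L."""
    cycles = cycle_decomposition(sigma)
    total = sum(len(c) for c in cycles)
    return (-1) ** (total - len(cycles))
```

For the example above, {1: 2, 2: 6, 3: 4, 4: 3, 5: 5, 6: 1} yields the cycles (1, 2, 6) and (3, 4), so sgn(σ) = (−1)^{5−2} = −1.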
With a little more work (that’s left to the reader) one can prove that this graph-
ical approach establishes a bijection between Sn and that family of subgraphs of Kn
which consists of unions of disjoint cycles. The bijection sends the identity permu-
tation to the subgraph consisting of n isolated vertices and sends a permutation σ
that is the product of k ≥ 1 disjoint cycles
    σ = (i1,1 , . . . , i1,ℓ1 ) (i2,1 , . . . , i2,ℓ2 ) · · · (ik,1 , . . . , ik,ℓk )    (8.1)
to the subgraph Gσ that has vertex set V = {v1 , . . . , vn } and whose edges are those
that appear in the k disjoint, directed cycles C1 , . . . , Ck , where Cj is the cycle
specified by the vertex sequence
vij,1 , . . . , vij,ℓj , vij,1 . (8.2)
In Eqns. (8.1) and (8.2) the notation ij,r is the vertex number of the r-th vertex in
the j-th cycle, while ℓj is the length of the j-th cycle.
8.3 The determinant is a sum over permutations
If A is an n × n matrix with entries ajk , the determinant can be written as
    det(A) = ∑_{σ∈Sn} sgn(σ) ∏_{j=1}^{n} ajσ(j) .    (8.3)
Some of you may have encountered this elsewhere, though most will be meeting
it for the first time. I won’t prove it, as that would be too much of a diversion
from graphs, but Eqn. (8.3) has the blessing of Wikipedia, where it is attributed to
Leibniz, and proofs appear in many undergraduate algebra texts2 . The very keen
reader could also construct an inductive proof herself, starting from the familiar
recursive formula.
Finally, I’ll demonstrate that it works for the two smallest nontrivial examples.
2
I found one in I.N. Herstein (1975), Topics in Algebra, 2nd ed., Wiley.
Example 8.10 (2 × 2 matrices). First we need a list of the elements of S2 and their
signs:
Name    σ                 sgn(σ)
σ1      1 → 1, 2 → 2      1
σ2      1 → 2, 2 → 1      −1
Then we can compute the determinant of a 2 × 2 matrix in the usual way
    det [ a11 a12 ; a21 a22 ] = a11 a22 − a12 a21
= sgn(σ1 ) × a1σ1 (1) a2σ1 (2) + sgn(σ2 ) × a1σ2 (1) a2σ2 (2)
= (1) × a11 a22 + (−1) × a12 a21
= a11 a22 − a12 a21
Example 8.11 (3 × 3 matrices). Table 8.1 lists the elements of S3 in the form
we’ll need. First, we compute the determinant in the usual way, a tedious but
straightforward business.
det [ a11 a12 a13 ; a21 a22 a23 ; a31 a32 a33 ] =
    a11 × det [ a22 a23 ; a32 a33 ] − a12 × det [ a21 a23 ; a31 a33 ]
        + a13 × det [ a21 a22 ; a31 a32 ]
    = a11 a22 a33 − a11 a23 a32 − a12 a21 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31
Then we calculate again using Eqn (8.3) and the numbering scheme for the elements
of S3 that’s shown in Table 8.1.
det [ a11 a12 a13 ; a21 a22 a23 ; a31 a32 a33 ] = ∑_{k=1}^{6} sgn(σk ) ∏_{j=1}^{3} ajσk (j)
= sgn(σ1 ) × a1σ1 (1) a2σ1 (2) a3σ1 (3) + · · · + sgn(σ6 ) × a1σ6 (1) a2σ6 (2) a3σ6 (3)
= (1) × a11 a22 a33 + (−1) × a11 a23 a32 + (−1) × a12 a21 a33
+ (1) × a12 a23 a31 + (1) × a13 a21 a32 + (−1) × a13 a22 a31
= a11 a22 a33 − a11 a23 a32 − a12 a21 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31
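For small matrices one can evaluate the sum over permutations directly. A brute-force sketch (names mine; the sign is computed by counting inversions, which agrees with the transposition-based definition):

```python
from itertools import permutations

def sign(p):
    """Sign of the tuple p (a permutation of 0..n-1), via inversions."""
    inversions = sum(1 for i in range(len(p))
                       for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def leibniz_det(A):
    """det(A) = sum over permutations of sgn(sigma) * prod_j a[j][sigma(j)]."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for j in range(n):
            term *= A[j][p[j]]
        total += term
    return total
```

This is exponential in n, so it is only a check for small cases, not a practical algorithm.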
σk                           fix(σk )     Cycle Decomposition    sgn(σk )
σ1 : 1 → 1, 2 → 2, 3 → 3     {1, 2, 3}    –                      1
σ2 : 1 → 1, 2 → 3, 3 → 2     {1}          (2, 3)                 −1
σ3 : 1 → 2, 2 → 1, 3 → 3     {3}          (1, 2)                 −1
σ4 : 1 → 2, 2 → 3, 3 → 1     ∅            (1, 2, 3)              1
σ5 : 1 → 3, 2 → 1, 3 → 2     ∅            (3, 2, 1)              1
σ6 : 1 → 3, 2 → 2, 3 → 1     {2}          (1, 3)                 −1
Table 8.1: The cycle decompositions of all the elements in S3 , along with the
associated functions sgn(σ) and fix(σ).
Lemma 8.12 (Inclusion/Exclusion for two sets). If X1 and X2 are finite sets then
    |X1 ∪ X2 | = |X1 | + |X2 | − |X1 ∩ X2 |.    (8.4)
Note that this formula, which is illustrated in Figure 8.1, works even when X1 ∩ X2 = ∅,
as then |X1 ∩ X2 | = 0. The proof of this lemma appears in the Appendix, in
Section 8.5.1.
Figure 8.1: In the example at left X1 ∩ X2 = ∅, so |X1 ∪ X2 | = |X1 | + |X2 |, but in the
example at right X1 ∩X2 ̸= ∅ and so |X1 ∪X2 | = |X1 |+|X2 |−|X1 ∩X2 | < |X1 |+|X2 |.
8.4.2 Three subsets
The analogous result for three subsets follows from two applications of Lemma 8.12
from the previous section. If we regard (X1 ∪ X2 ) as a single set and X3 as a second
set, then Eqn. (8.4) says
    |X1 ∪ X2 ∪ X3 | = |X1 ∪ X2 | + |X3 | − |(X1 ∪ X2 ) ∩ X3 |.
Focusing on the final term, we can use standard relations about unions and inter-
sections to say
(X1 ∪ X2 ) ∩ X3 = (X1 ∩ X3 ) ∪ (X2 ∩ X3 ).
Then, applying Eqn. (8.4) to the pair of sets (X1 ∩ X3 ) and (X2 ∩ X3 ), we obtain
    |(X1 ∪ X2 ) ∩ X3 | = |(X1 ∩ X3 ) ∪ (X2 ∩ X3 )|
                       = |X1 ∩ X3 | + |X2 ∩ X3 | − |(X1 ∩ X3 ) ∩ (X2 ∩ X3 )|
                       = |X1 ∩ X3 | + |X2 ∩ X3 | − |X1 ∩ X2 ∩ X3 |,
where, in going from the second line to the third, we have used
(X1 ∩ X3 ) ∩ (X2 ∩ X3 ) = X1 ∩ X2 ∩ X3 .
Finally, putting all these results together, we obtain the analogue of Eqn. (8.4)
for three subsets:
    |X1 ∪ X2 ∪ X3 | = |X1 | + |X2 | + |X3 |
                      − |X1 ∩ X2 | − |X1 ∩ X3 | − |X2 ∩ X3 |
                      + |X1 ∩ X2 ∩ X3 |.    (8.5)
Figure 8.2 helps make sense of this formula and prompts the following observations:
• Elements of X1 ∪ X2 ∪ X3 that belong to exactly one of the Xj are counted
exactly once by the sum (|X1 | + |X2 | + |X3 |) and do not contribute to any of
the terms involving intersections.
Figure 8.2: In the diagram above all of the intersections appearing in Eqn. (8.5) are
nonempty.
Theorem 8.13 (Principle of Inclusion/Exclusion). If X1 , . . . , Xn are finite sets then
|X1 ∪ · · · ∪ Xn |
    = |X1 | + · · · + |Xn |
      − |X1 ∩ X2 | − · · · − |Xn−1 ∩ Xn |
      + |X1 ∩ X2 ∩ X3 | + · · · + |Xn−2 ∩ Xn−1 ∩ Xn |
      ⋮
      + (−1)^{m−1} ∑_{1≤i1 <···<im ≤n} |Xi1 ∩ · · · ∩ Xim |
      ⋮
      + (−1)^{n−1} |X1 ∩ · · · ∩ Xn |    (8.6)
or, more concisely,
    |X1 ∪ · · · ∪ Xn | = ∑_{I⊆{1,...,n}, I̸=∅} (−1)^{|I|−1} | ∩_{i∈I} Xi |    (8.7)
The proof of this result appears in the Appendix, in Section 8.5.2 below.
8.4.4 An example
How many of the integers n with 1 ≤ n ≤ 150 are coprime to 70? This is a job
for the Principle of Inclusion/Exclusion. First note that the prime factorisation of
70 is 70 = 2 × 5 × 7. Now consider a universal set U = {1, . . . , 150} and the three
subsets X1 , X2 and X3 consisting of multiples of 2, 5 and 7, respectively. A member
of U that shares a prime factor with 70 belongs to at least one of the Xj and so the
number we’re after is
    150 − |X1 ∪ X2 ∪ X3 | = 150 − 99 = 51,    (8.8)
where I have used the numbers in Table 8.2, which lists the various cardinalities that
we need.

Set       Size      Set                 Size
X1        75        X1 ∩ X2             15
X2        30        X1 ∩ X3             10
X3        21        X2 ∩ X3              4
                    X1 ∩ X2 ∩ X3         2

Table 8.2: The sizes of the various intersections needed for the calculation in
Eqn. (8.8).
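The arithmetic is easy to verify by brute force; a quick sketch:

```python
from math import gcd

# Count 1 <= n <= 150 with gcd(n, 70) = 1 directly ...
direct = sum(1 for n in range(1, 151) if gcd(n, 70) == 1)

# ... and via Inclusion/Exclusion over multiples of 2, 5 and 7:
# |X1 u X2 u X3| = 75 + 30 + 21 - 15 - 10 - 4 + 2.
union = (150 // 2 + 150 // 5 + 150 // 7
         - 150 // 10 - 150 // 14 - 150 // 35
         + 150 // 70)
via_pie = 150 - union
```

Both routes give 51, with |X1 ∪ X2 ∪ X3| = 99.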
8.5 Appendix: Proofs for Inclusion/Exclusion
The proofs in this section will not appear on the exam, but are provided for those
who are interested or for whom the subject is new.
Figure 8.3: Here X1 \X2 and X1 ∩ X2 are shown in shades of blue, while X2 \X1 is
in yellow.
X1 \X2 = {x ∈ U | x ∈ X1 , but x ∉ X2 }.
Then, as is illustrated in Figure 8.3, X1 = (X1 \X2 ) ∪ (X1 ∩ X2 ). Further, the sets
X1 \X2 and X1 ∩ X2 are disjoint by construction, so
    |X1 | = |X1 \X2 | + |X1 ∩ X2 |.    (8.9)
Similarly, X1 ∪ X2 is the disjoint union of X1 \X2 and X2 , so
    |X1 ∪ X2 | = |X1 \X2 | + |X2 |
               = (|X1 | − |X1 ∩ X2 |) + |X2 |
where, in passing from the first line to the second, we have used (8.9). The last line
is the result we were trying to prove, so we are finished.
8.5.2 Proof of Theorem 8.13
One can prove this result in at least two ways:
• by induction, with a calculation that is essentially the same as the one used
to obtain the n = 3 case—Eqn. (8.5)—from the n = 2 one—Eqn. (8.4);
• by showing that each x ∈ X1 ∪ · · · ∪ Xn contributes exactly one to the sum on
the right hand side of Eqn. (8.7).
The first approach is straightforward, if a bit tedious, but the second is more inter-
esting and is the one discussed here.
The key idea is to think of the elements of X1 ∪ · · · ∪ Xn individually and
ask what each one contributes to the sum in Eqn. (8.7). Suppose that an element
x ∈ X1 ∪ · · · ∪ Xn belongs to exactly ℓ of the subsets, with 1 ≤ ℓ ≤ n: we will
prove that x makes a net contribution of 1. For the sake of concreteness, we’ll say
x ∈ Xi1 , . . . , Xiℓ where i1 , . . . , iℓ are distinct elements of {1, . . . , n}.
• As we’ve assumed that x belongs to exactly ℓ of the subsets Xj , it contributes
a total of ℓ to the first row, |X1 | + · · · + |Xn |, of the long sum in Eqn. (8.6).
• Further, x contributes a total of −C(ℓ, 2) to the sum in the row involving
two-way intersections,
    − |X1 ∩ X2 | − · · · − |Xn−1 ∩ Xn | ,
where C(ℓ, k) = ℓ!/(k! (ℓ − k)!) denotes the binomial coefficient.
To see this, note that if x ∈ Xj ∩ Xk then both j and k must be members of
the set {i1 , . . . , iℓ }.
• Similar arguments show that if k ≤ ℓ, then x contributes a total of
    (−1)^{k−1} C(ℓ, k) = (−1)^{k−1} ℓ! / (k! (ℓ − k)!)
to the sum in the row of Eqn. (8.6) that involves k-fold intersections.
• Finally, for k > ℓ there are no k-fold intersections that contain x and so x
makes a contribution of zero to the corresponding rows in Eqn. (8.6).
Putting these observations together we see that x makes a net contribution of
$$\ell - \binom{\ell}{2} + \binom{\ell}{3} - \cdots + (-1)^{\ell-1} \binom{\ell}{\ell}. \qquad (8.10)$$
This sum can be made to look more familiar by considering the following application
of the Binomial Theorem:
$$0 = (1-1)^{\ell} = \sum_{j=0}^{\ell} (-1)^{j} (1)^{\ell-j} \binom{\ell}{j} = 1 - \ell + \binom{\ell}{2} - \binom{\ell}{3} + \cdots + (-1)^{\ell} \binom{\ell}{\ell}.$$
Thus
$$0 = 1 - \left[\ell - \binom{\ell}{2} + \binom{\ell}{3} - \cdots + (-1)^{\ell-1}\binom{\ell}{\ell}\right]$$
or
$$\ell - \binom{\ell}{2} + \binom{\ell}{3} - \cdots + (-1)^{\ell-1}\binom{\ell}{\ell} = 1.$$
The left hand side here is the same as the sum in Eqn. (8.10) and so we’ve established
that any x which belongs to exactly ℓ of the subsets Xj makes a net contribution of 1
to the sum on the right hand side of Eqn. (8.7). And as every x ∈ X1 ∪· · ·∪Xn must
belong to at least one of the Xj , this establishes the Principle of Inclusion/Exclusion.
where, in passing from the second to third lines, I have dropped the second sum
because all its terms are zero.
Then the Principle of Inclusion/Exclusion is equivalent to
$$\sum_{x \in X_1 \cup \cdots \cup X_n} 1_{X_1 \cup \cdots \cup X_n}(x) = \sum_{I \subseteq \{1,\dots,n\},\ I \neq \emptyset} (-1)^{|I|-1} \Bigl|\, \bigcap_{i \in I} X_i \Bigr|$$
$$= \sum_{I \subseteq \{1,\dots,n\},\ I \neq \emptyset} (-1)^{|I|-1} \sum_{x \in \bigcap_{i \in I} X_i} 1_{\bigcap_{i \in I} X_i}(x)$$
$$= \sum_{k=1}^{n} (-1)^{k-1} \sum_{I \subseteq \{1,\dots,n\},\ |I| = k} \ \sum_{x \in \bigcap_{i \in I} X_i} 1_{\bigcap_{i \in I} X_i}(x),$$
which I have obtained by using Eqn. (8.11) to replace terms in Eqn. (8.7) with
the corresponding sums of values of characteristic functions.
We can then rearrange the expression on the right, first expanding the ranges of
the sums over elements of k-fold intersections (this doesn’t change the result since
1X (x) = 0 for x ∈/ X) and then interchanging the order of summation so that
the sum over elements comes first. This calculation proves that the Principle of
Inclusion/Exclusion is equivalent to the following:
$$\sum_{x \in X_1 \cup \cdots \cup X_n} 1_{X_1 \cup \cdots \cup X_n}(x) = \sum_{k=1}^{n} (-1)^{k-1} \sum_{I \subseteq \{1,\dots,n\},\ |I| = k} \Bigl( \sum_{x \in X_1 \cup \cdots \cup X_n} 1_{\bigcap_{i \in I} X_i}(x) \Bigr)$$
$$= \sum_{x \in X_1 \cup \cdots \cup X_n} \ \sum_{k=1}^{n} (-1)^{k-1} \sum_{I \subseteq \{1,\dots,n\},\ |I| = k} 1_{\bigcap_{i \in I} X_i}(x). \qquad (8.12)$$
Arguments similar to those in Section 8.5.2 then establish the following results, the
last of which, along with Eqn. (8.12), proves Theorem 8.13.
while if k > ℓ,
$$\sum_{I \subseteq \{1,\dots,n\},\ |I| = k} 1_{\bigcap_{i \in I} X_i}(x) = 0.$$
Lemma 8.16. The characteristic function 1X1 ∪···∪Xn of the set X1 ∪· · ·∪Xn satisfies
$$1_{X_1 \cup \cdots \cup X_n}(x) = \sum_{k=1}^{n} (-1)^{k-1} \sum_{I \subseteq \{1,\dots,n\},\ |I| = k} 1_{\bigcap_{i \in I} X_i}(x).$$
Chapter 9
The proof here is derived from a terse account in the lecture notes from a course on
Algebraic Combinatorics taught by Lionel Levine at MIT in Spring 2011.1 I studied
them with Samantha Barlow, a former Discrete Maths student who did a third-year
project with me in 2011-12.
Reading:
I don’t know of any textbook accounts of the proof given here, but the intrepid reader
might like to look at the following two articles, both of which make the connection
between the Principle of Inclusion/Exclusion and Tutte’s Matrix Tree theorem.
• J.B. Orlin (1978), Line-digraphs, arborescences, and theorems of Tutte and
Knuth, Journal of Combinatorial Theory, Series B, 25(2):187–198. DOI:
10.1016/0095-8956(78)90038-2
• S. Chaiken (1983), A combinatorial proof of the all minors matrix tree the-
orem, SIAM Journal on Algebraic and Discrete Methods, 3:319–329. DOI:
10.1137/0603033
Figure 9.1: Three examples of single predecessor graphs (spregs). In each, the
distinguished vertex is white, while the other vertices, which all have deg_in(u) = 1,
are shaded in other colours. The example at left has multiple weakly connected
components, while the other two are arborescences.
Figure 9.1 includes several examples of spregs, including two that are arborescences,
which prompts the following proposition:
Proposition. Every spanning arborescence T(V, E) rooted at v in a graph G(V, E) is a spreg of G with distinguished vertex v.
Proof. By definition, G and T share the same vertex set, so all we need check is
that the vertices u ̸= v in T have a single predecessor. Recall that an arborescence
rooted at v is a directed graph T (V, E) such that
(i) Every vertex u ̸= v is accessible from v. That is, there is a directed path from
v to every other vertex.
(ii) The undirected version of T contains no cycles: that is, T is a tree once the
directions of the edges are ignored.
The proposition consists of two separate claims: that degin (v) = 0 and that
degin (u) = 1 ∀u ̸= v ∈ V . We’ll prove both by contradiction.
Suppose that degin (v) > 0: it’s then easy to see that T must include a directed
cycle. Consider one of v’s predecessors—call it u0 . It is accessible from v, so there is
a directed path from v to u0 . And u0 is a predecessor of v, so there is also a directed
edge (u0 , v) ∈ E. If we append this edge to the end of the path, we get a directed
path from v back to itself. This contradicts the second property of an arborescence
and so we must have degin (v) = 0.
The proof for the second part of the proposition is illustrated in Figure 9.2. Sup-
pose that ∃ u ̸= v ∈ V such that degin (u) ≥ 2 and choose two distinct predecessors
of u: call them v1 and v2 and note that one of them may be the root vertex v.
Now consider the directed paths from v to v1 and v2 . In the undirected version of
T these paths, along with the edges (v1 , u) and (v2 , u), must include a cycle, which
contradicts the second property of an arborescence.
The examples in Figure 9.1 make it clear that there are other kinds of spregs
besides spanning arborescences, but there aren’t that many kinds:
Figure 9.2: Two examples to illustrate the second part of the proof that an
arborescence is a spreg. If one ignores the directedness of the edges in the graphs
above, both contain cycles.
Nj = det(L̂j )
where L̂j is the matrix produced by deleting the j-th row and column from L.
First note that—because we can always renumber the vertices before we apply
the theorem—it is sufficient to prove the result for the case with root vertex v = vn .
Now consider the representation of det(L̂n ) as a sum over permutations:
$$\det(\hat{L}_n) \equiv \det(L) = \sum_{\sigma \in S_{n-1}} \operatorname{sgn}(\sigma) \prod_{j=1}^{n-1} L_{j\sigma(j)}, \qquad (9.2)$$
Predecessor of              Is a spanning
v1    v2    v3              arborescence?
v2    v1    v2              No
v2    v3    v2              No
v2    v4    v2              Yes
v4    v1    v2              Yes
v4    v3    v2              No
v4    v4    v2              Yes
Table 9.1: Each row here corresponds to one of the spregs in Figure 9.3.
where I have introduced the notation L ≡ L̂n to avoid the confusion of having two
kinds of subscripts on L̂n . This means that L is an (n − 1) × (n − 1) matrix whose
i, j entry is the corresponding entry Lij of the matrix defined by Eqn. (9.1) in the
statement of Tutte’s theorem.
For the identity permutation σ = id, every factor in the product is a diagonal entry,
so
$$\prod_{j=1}^{n-1} L_{j\sigma(j)} = \prod_{j=1}^{n-1} L_{jj} = \prod_{j=1}^{n-1} \deg_{\mathrm{in}}(v_j). \qquad (9.3)$$
This product is also equal to the total number of spregs in G(V, E) that have dis-
tinguished vertex vn . To see why, look back at the definition of a spreg and think
about what we’d need to do if we wanted to write down a complete list of these
spregs. We could specify a spreg by listing the single predecessor for each vertex
other than vn in a table like the one below
Vertex v1 v2 v3
Predecessor v2 v1 v2
which describes one of the spregs rooted at v4 contained in the four-vertex graph
shown in Figure 9.3. And if we wanted to list all the four-vertex spregs contained
in this graph we could start by assembling the predecessor lists of all the vertices
other than the distinguished vertex,
Figure 9.3: The graph G(V, E) at upper left contains six spregs with distinguished
vertex v4 , all of which are shown in the two rows below. Three of them are spanning
arborescences rooted at v4 , while the three others contain cycles.
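The counting just described can be reproduced computationally. The sketch below (mine, not part of the notes) reconstructs the edge list of the graph in Figure 9.3 from the predecessor options in Table 9.1, enumerates the spregs as choices of predecessor, tests which of them are arborescences rooted at v4, and compares the arborescence count with det(L̂4), as Tutte's theorem predicts.

```python
from itertools import product

# Predecessor options read off from Table 9.1:
# v1 <- {v2, v4}, v2 <- {v1, v3, v4}, v3 <- {v2}; the root is v4.
preds = {1: [2, 4], 2: [1, 3, 4], 3: [2]}
root = 4

# A spreg assigns exactly one predecessor to each non-root vertex.
spregs = [dict(zip(preds, choice)) for choice in product(*preds.values())]
assert len(spregs) == 6  # = product of in-degrees, as in Eqn. (9.3)

def is_arborescence(spreg):
    # Every vertex must be reachable from the root along the spreg's
    # edges (predecessor -> vertex).
    reached, frontier = {root}, [root]
    while frontier:
        u = frontier.pop()
        for v, p in spreg.items():
            if p == u and v not in reached:
                reached.add(v)
                frontier.append(v)
    return len(reached) == 1 + len(spreg)

n_arb = sum(is_arborescence(s) for s in spregs)

# Reduced Laplacian L-hat_4: diagonal entries are in-degrees,
# entry (i, j) is -1 when the edge (v_i, v_j) is present.
L = [[2, -1, 0],
     [-1, 3, -1],
     [0, -1, 1]]

def det3(m):
    # Cofactor expansion along the first row of a 3x3 matrix.
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

assert n_arb == det3(L) == 3  # three spanning arborescences rooted at v4
```

The assertion at the end is exactly the statement of Tutte's theorem for this example: the number of spanning arborescences rooted at v4 equals det(L̂4).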
Thus if one or more of the edges (v_{i_k}, v_{i_{k+1}}) is absent from the graph we have
$$\prod_{k=1}^{\ell} L_{i_k i_{k+1}} = 0,$$
but if all the edges (v_{i_k}, v_{i_{k+1}}) are present we can make the following observations:
• the graph contains a directed cycle given by the vertex sequence (v_{i_1}, v_{i_2}, . . . , v_{i_ℓ}, v_{i_1});
Arguments similar to those in the previous section then show that the product
$\prod_{j \in \operatorname{fix}(\sigma)} \deg_{\mathrm{in}}(v_j)$ in Eqn. (9.4) counts the number of ways to choose predecessors
for those vertices that aren’t part of the cycle. We can summarise all these ideas
with the following pair of results:
Proposition 9.4. For a permutation σ ∈ S_{n−1} consisting of a single cycle
σ = (i_1, . . . , i_ℓ),
define an associated directed cycle C_σ specified by the vertex sequence (v_{i_1}, . . . , v_{i_ℓ}, v_{i_1}).
Then the term in det(L) corresponding to σ satisfies
$$\operatorname{sgn}(\sigma) \prod_{j=1}^{n-1} L_{j\sigma(j)} = \begin{cases} -\displaystyle\prod_{j \in \operatorname{fix}(\sigma)} \deg_{\mathrm{in}}(v_j) & \text{if } C_\sigma \subseteq G(V,E) \text{ and } \operatorname{fix}(\sigma) \neq \emptyset, \\ -1 & \text{if } C_\sigma \subseteq G(V,E) \text{ and } \operatorname{fix}(\sigma) = \emptyset, \\ 0 & \text{if } C_\sigma \not\subseteq G(V,E). \end{cases}$$
Corollary 9.5. With σ and C_σ as in Proposition 9.4,
$$\Bigl|\, \prod_{j=1}^{n-1} L_{j\sigma(j)} \Bigr| = \bigl|\{\text{spregs containing } C_\sigma\}\bigr|.$$
σ          Cσ                   Cσ ⊆ G?   Π_{j=1}^{n−1} L_{jσ(j)}   Number of spregs containing Cσ
(1,2)      (v1, v2, v1)         Yes       deg_in(v3) = 1            1
(1,3)      (v1, v3, v1)         No        deg_in(v2) × 0 = 0        0
(2,3)      (v2, v3, v2)         Yes       deg_in(v1) = 2            2
(1,2,3)    (v1, v2, v3, v1)     No        0                         0
(1,3,2)    (v1, v3, v2, v1)     No        0                         0
Table 9.2: The results of using Corollary 9.5 to count spregs containing the various
cycles Cσ associated with the non-identity elements of S3 . The right column lists the
number one gets by direct counting of the spregs shown in Figure 9.3.
9.2.2 An example
Before pressing on to generalise the results of the previous section to arbitrary
permutations, let’s see what Corollary 9.5 allows us to say about the graph in
Figure 9.3. There G(V, E) is a digraph on four vertices, so the determinant that
comes into Tutte’s theorem is that of L̂4 , a three-by-three matrix. We’ve already
seen that if σ = id the product $\prod_{j \in \operatorname{fix}(\sigma)} \deg_{\mathrm{in}}(v_j)$ gives six, the total number of
spregs contained in the graph. The results for the remaining elements of S3 are
listed in Table 9.2 and all are covered by Corollary 9.5, as all non-identity elements
of S3 are single cycles.
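Table 9.2 can also be generated directly from the permutation expansion (9.2). The following sketch (my own check, using the L̂4 of this example) computes sgn(σ) Π L_{jσ(j)} for every σ ∈ S3, representing a permutation as a tuple `perm` with σ(j) = perm[j] on indices 0, 1, 2, and confirms that the terms sum to det(L̂4) = 3.

```python
from itertools import permutations

# Reduced Laplacian L-hat_4 for the graph of Figure 9.3.
L = [[2, -1, 0],
     [-1, 3, -1],
     [0, -1, 1]]

def sign(perm):
    # sgn(sigma) = (-1)^(n - number of cycles in sigma).
    n, seen, cycles = len(perm), set(), 0
    for i in range(n):
        if i not in seen:
            cycles += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return (-1) ** (n - cycles)

terms = {}
for perm in permutations(range(3)):
    prod = 1
    for j in range(3):
        prod *= L[j][perm[j]]
    terms[perm] = sign(perm) * prod

assert terms[(0, 1, 2)] == 6     # identity: product of in-degrees
assert terms[(1, 0, 2)] == -1    # sigma = (1,2): one spreg contains C_sigma
assert terms[(0, 2, 1)] == -2    # sigma = (2,3): two spregs contain C_sigma
assert sum(terms.values()) == 3  # det(L-hat_4) = number of arborescences
```

Note how the magnitudes of the nonzero terms match the spreg counts in the right-hand column of Table 9.2, as Corollary 9.5 says they should.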
where ℓ_j is the length of the j-th cycle. Associate the directed cycle C_j defined by
the vertex sequence (v_{i_{j,1}}, . . . , v_{i_{j,ℓ_j}}, v_{i_{j,1}}) with the j-th cycle in the permutation and
define
$$C_\sigma = \bigcup_{j=1}^{k} C_j.$$
Further,
$$\Bigl|\, \prod_{j=1}^{n-1} L_{j\sigma(j)} \Bigr| = \Bigl| \Bigl\{ \text{spregs containing } C_\sigma = \bigcup_{j=1}^{k} C_j \Bigr\} \Bigr|. \qquad (9.5)$$
The proof of this result requires reasoning much like that used in Section 9.2.1 and
so is left to the reader.
Now let C1 , . . . , CM be the directed cycles contained in G(V, E) and write
C = {C1 , . . . , CM }.
• U is the set of all spregs with distinguished vertex vn . That is, U is the set of
subgraphs of G(V, E) in which every vertex other than vn has exactly one
predecessor.
• Xj is the set of those spregs in U that contain the directed cycle Cj .
A spanning arborescence rooted at vn is precisely a spreg that contains none of the
cycles Cj , so
$$N_n = |\{\text{spanning arborescences rooted at } v_n\}| = |U| - \Bigl|\, \bigcup_{j=1}^{M} X_j \Bigr| = |U| + \sum_{I \subseteq \{1,\dots,M\},\ I \neq \emptyset} (-1)^{|I|} \Bigl|\, \bigcap_{j \in I} X_j \Bigr|, \qquad (9.6)$$
where the second equality uses the Principle of Inclusion/Exclusion.
As we know that spregs contain only disjoint cycles, we can say
|Xj ∩ Xk | = 0 unless Cj ∩ Ck = ∅
and so can eliminate many of the terms in the sum over intersections in Eqn. (9.6),
rewriting it as a sum over collections of disjoint cycles:
$$N_n = |U| + \sum_{\substack{I \subseteq \{1,\dots,M\},\ I \neq \emptyset \\ C_j \cap C_k = \emptyset\ \forall j \neq k \in I}} (-1)^{|I|} \Bigl|\, \bigcap_{j \in I} X_j \Bigr|. \qquad (9.7)$$
Then we can use the lemma from the previous section—Lemma 9.6, which re-
lates non-identity permutations to numbers of spregs containing cycles—to rewrite
Eqn. (9.7) in terms of permutations. First note that Eqn. (9.5) allows us to write
$$\Bigl|\, \bigcap_{j \in I} X_j \Bigr| = \Bigl| \Bigl\{ \text{spregs containing } \bigcup_{j \in I} C_j \Bigr\} \Bigr| = \Bigl|\, \prod_{k=1}^{n-1} L_{k\sigma_I(k)} \Bigr|,$$
where σ_I ∈ S_{n−1} denotes the permutation whose non-trivial cycles are precisely
those corresponding to the C_j with j ∈ I.
$$N_n = |U| + \sum_{\substack{I \subseteq \{1,\dots,M\},\ I \neq \emptyset \\ C_j \cap C_k = \emptyset\ \forall j \neq k \in I}} \operatorname{sgn}(\sigma_I) \prod_{j=1}^{n-1} L_{j\sigma_I(j)}. \qquad (9.8)$$
As the sum in Eqn. (9.8) ranges over all collections of disjoint cycles, the permuta-
tions σI range over all non-identity permutations in Sn−1 and so we have
$$N_n = |U| + \sum_{\sigma \neq \mathrm{id}} \operatorname{sgn}(\sigma) \prod_{j=1}^{n-1} L_{j\sigma(j)}. \qquad (9.9)$$
Finally, recall from Eqn. (9.3) that |U|, the total number of spregs with distinguished
vertex vn , equals $\prod_{j=1}^{n-1} \deg_{\mathrm{in}}(v_j)$, which is the term in det(L) corresponding to the
identity permutation. Combining this observation with Eqn. (9.9) gives us
$$N_n = \sum_{\sigma \in S_{n-1}} \operatorname{sgn}(\sigma) \prod_{j=1}^{n-1} L_{j\sigma(j)} = \det(L),$$
which is exactly the statement of Tutte’s theorem.
Part III
Eulerian Multigraphs
This section of the notes revisits the Königsberg Bridge Problem and generalises
it to explore Eulerian multigraphs: those that contain a closed walk that traverses
every edge exactly once.
Reading:
The material in today’s lecture comes from Section 1.3 of
Figure 10.1: We proved in the first lecture of the term that it is impossible to find
a closed walk that traverses every edge in the graph above exactly once.
The main theorem we’ll prove today relies on the following definitions:
Definition 10.1. An Eulerian trail in a multigraph G(V, E) is a trail that includes
each of the graph’s edges exactly once.
Definition 10.2. An Eulerian tour in a multigraph G(V, E) is an Eulerian trail
that starts and finishes at the same vertex. Equivalently, it is a closed trail that
traverses each of the graph’s edges exactly once.
Definition 10.3. A multigraph that contains an Eulerian tour is said to be an
Eulerian multigraph.
Armed with these, it’s then easy to formulate the following characterisation of Eu-
lerian multigraphs:
Theorem 10.4 (Jungnickel’s Theorem 1.3.1). Let G be a connected multigraph.
Then the following statements are equivalent:
(1) G is Eulerian.
(2) Each vertex of G has even degree.
(3) The edge set of G can be partitioned into cycles.
The last of these characterisations may be new to you: it means that it is possible
to arrange the edges of G into a collection of disjoint cycles. Figure 10.2 shows
an example of such a partition for a graph derived from the Königsberg Bridge
multigraph by adding two extra edges, shown in blue at left. Adding these edges
makes the graph Eulerian, and a decomposition of the edge set into cycles appears
at right. Note that undirected multigraphs can contain cycles of length two that
consist of a pair of parallel edges.
The proof of the theorem is simpler if one has the following lemma, whose proof
I’ll defer until after that of the main result. Note that the lemma, unlike the theorem,
does not require the multigraph to be connected.
Lemma 10.5 (Vertices of even degree and cycles). If G(V, E) is a multigraph with
a nonempty edge set E ̸= ∅ and the property that deg(v) is an even number for all
v ∈ V , then G contains a cycle.
Figure 10.2: The panel at left shows a graph produced by adding two edges (shown
in blue) to the graph from the Königsberg Bridge Problem. These extra edges make
the graph Eulerian and the panel at right illustrates a partition of the edge set into
cycles
Proof of Theorem 10.4. The theorem says these statements are all “equivalent”,
which encompasses a total of six implications1 but we don’t need to prove all of
them: it’s sufficient to prove, say, that (1) =⇒ (2), (2) =⇒ (3) and (3) =⇒ (1).
That is, it’s sufficient to make a directed graph whose vertices are the statements
and whose edges indicate implications. If this graph is strongly connected, so that
one can get from any statement to any other by following a chain of implications,
then the result is proven.
(1) =⇒ (2):
Proof. We know G is Eulerian, so it has a closed trail that includes each edge exactly
once. Imagine that this trail is specified by the following sequence of vertices
v0 , . . . , vm = v0 (10.1)
where |E| = m and the vj are the vertices encountered along the trail, so that
some of them may appear more than once. In particular, v0 = vm because the trail
starts and finishes at the same vertex. As G is a connected multigraph, every vertex
appears somewhere in the sequence (if not, the absent vertices would have degree
zero and not be connected to any of the others).
Consider first some vertex u ̸= v0 . It must appear one or more times in the
sequence above and, each time, it appears in a pair of successive edges: if u = vj
with 0 < j < m, then these edges are (vj−1 , vj ) and (vj , vj+1 ). This means that
deg(u) is a sum of 2’s, with one term in the sum for each appearance of u in the
sequence (10.1). A similar argument applies to v0 , save that the edge that forms a
pair with (v0 , v1 ) is (vm−1 , vm = v0 ).
[Video 7.2] (2) =⇒ (3): The theorem requires this implication to hold for connected
multigraphs, but this particular result is more general and applies to any multigraph
in which all vertices have even degree. We’ll prove this stronger version by induction
on the number of edges. That is, we’ll prove:
¹That is: (1) =⇒ (2), (2) =⇒ (1), (1) =⇒ (3), . . .
Proposition. If G(V, E) is a multigraph (whether connected or not) in which deg(v)
is an even number for all vertices v ∈ V , then the edge set E can be partitioned into
cycles.
Proof. The base case is a multigraph with |E| = 0. Such a graph consists of one or
more isolated vertices and, as the graph has no edges, deg(v) = 0 (an even number)
for all v ∈ V and the (empty) edge set can clearly be partitioned into a union of
zero cycles.
Now suppose the result is true for every multigraph G(V, E) with |E| ≤ m0 edges
whose vertices all have even degree. Consider such a multigraph with |E| = m0 + 1:
we need to demonstrate that the edge set of such a graph can be partitioned into
cycles. We can use Lemma 10.5 to establish that we can find at least one cycle C
contained in G. Then we can form a new graph G′ (V, E ′ ) = G\C by removing C
from G. This bit of graph surgery either leaves the degree of a vertex unchanged (if
the vertex isn’t part of C) or decreases it by two; either way, all vertices in G′ have
even degree because the corresponding vertices in G do.
The cycle C will contain at least one edge (and, unless we permit self-loops, two
or more), so G′ will have at most m0 edges and the inductive hypothesis will apply
to it. This means that we can partition E ′ = E\C into cycles. But then we can
add C to the partition of E ′ to get a partition of E into cycles, completing the
inductive step and so proving our result.
(3) =⇒ (1): Here we need to establish that if the edge set of a connected multigraph
G(V, E) consists of a union of cycles, then G contains an Eulerian tour. This result is
trivial unless the partition of E involves at least two cycles, so we’ll restrict attention
to that case from now on.
The key observation is that we can always find two cycles that we can merge to
produce a single, longer closed trail that includes all the edges from the two cycles.
To see why, note that there must be a pair of cycles that share a vertex (if there
weren’t, the cycles would lie in distinct connected components, contradicting
the connectedness of G). Suppose that the shared vertex is v⋆ and that the cycles
are C1 and C2 given by the vertex sequences
C1 = {v⋆ = v0 , v1 , . . . , vℓ1 = v⋆ } and C2 = {v⋆ = u0 , u1 , . . . , uℓ2 = v⋆ } .
We can combine them, as illustrated in Figure 10.3 to make a closed trail given by
the vertex sequence
{v⋆ = v0 , v1 , . . . , vℓ1 = v⋆ = u0 , u1 , . . . , uℓ2 = v⋆ } .
Scrupulous readers may wish to use this observation as the basis of a proof by
induction (on the number of elements in the partition of E) of the somewhat stronger
result:
Proposition 10.6. If G(V, E) is a connected graph whose edge set can be partitioned
as a union of disjoint, closed trails, then G is Eulerian.
Then, as a cycle is a special case of a closed trail, we get the desired implication
as an immediate corollary.
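The merging step is nothing more than a splice of vertex sequences. Here is a minimal sketch (with names of my choosing), assuming, as in the text, that both closed trails are written so that they start and end at the shared vertex v⋆.

```python
def merge_closed_trails(t1, t2):
    """Splice two closed trails that start and end at the same vertex:
    traverse all of t1, arriving back at the shared vertex, then all of t2."""
    assert t1[0] == t1[-1] == t2[0] == t2[-1], "trails must be closed at v*"
    return t1 + t2[1:]

# The situation of Figure 10.3: C1 = (v0, v1, v2, v3, v0) and
# C2 = (u0, u1, ..., u7, u0) with v0 = u0 = v*.  Here the vertices are
# labelled 0..3 for C1 and 0, 4..10 for C2 (my labelling, not the notes').
C1 = [0, 1, 2, 3, 0]
C2 = [0, 4, 5, 6, 7, 8, 9, 10, 0]
merged = merge_closed_trails(C1, C2)
assert merged == [0, 1, 2, 3, 0, 4, 5, 6, 7, 8, 9, 10, 0]
```

If the two trails do not already start at the shared vertex, one can rotate each closed sequence so that it does before splicing; iterating the merge over a partition into cycles is exactly the induction suggested above.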
Figure 10.3: The key step in the proof of the implication (3) =⇒
(1) in the proof of Theorem 10.4. The cycles C1 = (v⋆ = v0 , v1 , v2 , v3 , v0 ),
whose vertices are shown in red, and C2 = (v⋆ = u0 , u1 , u2 , u3 , u4 , u5 , u6 , u7 , u0 ),
whose vertices are shown in yellow, may be merged to create the closed trail
(v⋆ = v0 , v1 , v2 , v3 , v0 = v⋆ = u0 , u1 , u2 , u3 , u4 , u5 , u6 , u7 , u0 ) indicated by the dotted
line.
I’d like to conclude by giving an algorithmic proof of Lemma 10.5. The idea is
to choose some initial vertex u0 and then construct a trail in the graph by following
one of u0 ’s edges, then one of the edges of u0 ’s successor in the trail . . . and so on
until we revisit some vertex and thus discover a cycle. Provided that we can do
as I say—always move on through the graph without ever tracing over some edge
twice—this approach is bound to work because there are only finitely many vertices.
The proof that follows formalises this approach by spelling out an explicit algorithm.
Proof of Lemma 10.5. Consider the following algorithmic process, which finds a cy-
cle in a multigraph G(V, E) for which E ̸= ∅ and deg(v) is even for all v ∈ V .
Algorithm 10.7 (Finding a cycle).
Given a multigraph G(V, E) in which |E| > 0 and all vertices have even degree,
construct a trail T given by a sequence of edges
(1) Number the vertices, so that V = {v1 , . . . , vn }. This is for the sake of con-
creteness: later in the algorithm, when we need to choose one of a set of
vertices that have a particular property, we can choose the lowest-numbered
one.
(2) Initialize some things
• Set a counter j ← 0.
• Choose the first vertex in the trail, u0 , to be the lowest-numbered vertex
vk that has deg(vk ) > 0. Such vertices exist, as we know |E| > 0.
• Initialise a list A (for “available”) of edges that we have not yet included
in T . At the outset we set A ← E as we haven’t used any edges yet, and
we take the trail T to be empty.
(3) Choose an edge (uj , w) ∈ A that has uj as one of its endpoints (the even-degree
property guarantees that such an edge exists whenever we reach this step) and
add it to the trail:
• T ← T ∪ {(uj , w)}
• A ← A\{(uj , w)} (We’ve used (one copy of) the edge (uj , w)).
(4) Ask whether the new endpoint w already appears in the trail.
• If yes, stop. The trail includes a cycle that starts and finishes at w.
• If no, set uj+1 ← w, then set j ← j + 1 and go to Step 3.
The only way this process can stop is by revisiting a vertex and it must do this
within |V | = n steps. And once we’ve revisited a vertex, we’ve found a cycle and so
are finished.
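Algorithm 10.7 translates directly into code. The sketch below is an illustrative implementation (function and variable names are my own): it walks through a multigraph whose vertices all have even degree, never reusing an edge, and stops as soon as a vertex repeats, at which point the tail of the trail is a cycle.

```python
def find_cycle(vertices, edges):
    """Return a cycle, as a closed list of vertices, in a multigraph in
    which every vertex has even degree and there is at least one edge.
    `edges` is a list of unordered pairs; parallel edges may repeat."""
    available = list(edges)                 # the list A of unused edges
    # Step 2: start at the lowest-numbered vertex with positive degree.
    u = min(v for v in vertices if any(v in e for e in available))
    trail = [u]
    while True:
        # Step 3: follow any unused edge out of the current vertex.
        # Even degree guarantees one exists each time we arrive somewhere new.
        e = next(e for e in available if u in e)
        available.remove(e)
        w = e[1] if e[0] == u else e[0]
        if w in trail:                           # Step 4: revisited a vertex,
            return trail[trail.index(w):] + [w]  # so the tail is a cycle.
        trail.append(w)
        u = w

# Example: a triangle plus a disjoint pair of parallel edges; every
# vertex has even degree, so a cycle must be found.
cycle = find_cycle([1, 2, 3, 4, 5],
                   [(1, 2), (2, 3), (3, 1), (4, 5), (4, 5)])
assert cycle[0] == cycle[-1] and len(cycle) >= 3
```

As in the proof, the process must terminate within |V| steps: each move visits a fresh vertex until the first repeat, and the even-degree condition means an unused exit edge is always available.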
Chapter 11
This lecture introduces the notion of a Hamiltonian graph and proves a lovely the-
orem due to J. Adrian Bondy and Vašek Chvátal that says—in essence—that if a
graph has lots of edges, then it must be Hamiltonian.
Reading:
The material in today’s lecture comes from Section 1.4 of
easy: the cycle graphs Cn consist of nothing except one big Hamiltonian tour, and
the complete graphs Kn with n ≥ 3 obviously contain the Hamiltonian cycle
(v1 , v2 , . . . , vn , v1 )
obtained by numbering the vertices and visiting them in order. We’ll spend most
of the lecture proving results that say, more-or-less, that a graph with a lot of
edges (where the point of the theorem is to make the sense of “a lot” precise) is
Hamiltonian. Two of the simplest results of this kind are:
Theorem 11.4 (Dirac1 , 1952). Let G be a graph with n ≥ 3 vertices. If every vertex
v of G has deg(v) ≥ n/2, then G is Hamiltonian.
Theorem 11.5 (Ore, 1960). Let G be a graph with n ≥ 3 vertices. If
deg(u) + deg(v) ≥ n
for every pair of non-adjacent vertices u and v, then G is Hamiltonian.
Dirac’s theorem is a corollary of Ore’s, but we will not prove either of these
theorems directly. Instead, we’ll obtain both as corollaries of a more general result,
the Bondy-Chvátal Theorem. Before we can even formulate this mighty result, we
need a somewhat involved new definition: the closure of a graph.
1
This Dirac, Gabriel Andrew Dirac, was the adopted son of the Nobel prize winning theoretical
physicist Paul A. M. Dirac, and the nephew of another Nobel prize winner, the physicist and
mathematician Eugene Wigner. Wigner’s sister Margit was visiting her brother in Princeton when
she met Paul Dirac.
11.2.1 An algorithm to construct [G]
The algorithm below constructs a finite sequence of graphs
G = G1 , G2 , . . . , GK = [G] (11.2)
that all have the same vertex set V , but different edge sets
E = E1 , E2 , . . . , EK . (11.3)
These edge sets form an increasing sequence in the sense that Ej ⊆ Ej+1 . In
fact, Ej+1 is produced by adding at most a single edge to Ej .
Algorithm 11.6 (Graph Closure).
Given a graph G(V, E) with vertex set V = {v1 , . . . , vn }, find [G].
(1) Set an index j to one: j ← 1,
Also set E1 to be the edge set of the original graph,
E1 ← E.
(2) Given Ej , construct Ej+1 , which contains, at most, one more edge than Ej .
Begin by setting Ej+1 ← Ej , so that Ej+1 automatically includes every edge in
Ej . Now work through every possible edge in the graph. For each one—let’s
call it e = (vr , vs )—there are three possibilities:
(i) the edge e is already in Ej . In this case we do nothing: Ej+1 already
contains e.
(ii) the edge e = (vr , vs ) is not in Ej and the degrees of the vertices vr and
vs are low in the sense that
degGj (vr ) + degGj (vs ) < n,
where the subscript Gj is meant to show that the degree is being calculated
in the graph Gj , whose vertex set is V and whose edge set is Ej . In this
case we do not include e in Ej+1 .
(iii) the edge e = (vr , vs ) is not in Ej , but the degrees of the vertices vr and
vs are high in the sense that
degGj (vr ) + degGj (vs ) ≥ n. (11.4)
In this case we add e to the new edge set:
Ej+1 ← Ej ∪ {e}.
(3) Decide whether to stop: ask whether we added an edge during step 2.
• If not, then stop: the closure [G] has vertex set V and edge set Ej .
• Otherwise set j ← j + 1 and go back to step (2) to try to add another
edge.
Figure 11.1: The results of applying Algorithm 11.6 to the seven-vertex graph G1 .
Each round of the construction (each pass through step 2 of the algorithm) adds a
single new edge—shown with red, dotted curves—to the graph.
11.2.2 An example
Figure 11.1 shows the result of applying Algorithm 11.6 to a graph with 7 vertices.
The details of the process are discussed below.
Making G2 from G1
When constructing E2 from E1 , notice that the vertex with highest degree, v1 , has
degG1 (v1 ) = 4 and all the other vertices have lower degree. Thus, in step 2 of the
algorithm we need only think about edges connecting v1 to vertices of degree three.
There are three such vertices—v2 , v4 and v5 —but two of them are already adjacent
to v1 in G1 , so the only new edge we need to add at this stage is e = (v1 , v2 ).
Making G3 from G2
Now v1 has degG2 (v1 ) = 5, so the closure condition (11.4) says that we should
connect v1 to any vertex whose degree is two or more, which requires us to add the
edge (v1 , v3 ).
Making G4 from G3
Now v1 has degree degG3 (v1 ) = 6, so it is already connected to every other vertex
in the graph and cannot receive any new edges. Vertex v2 has degG3 (v2 ) = 4 and so
should be connected to any vertex vj with degG3 (vj ) ≥ 3. This means we need to
add the edge e = (v2 , v3 ).
Conclusion: [G] = G4
Careful study of G4 shows that the rule (11.4) will not add any further edges, so the
closure of the original graph is G4 .
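Algorithm 11.6 is straightforward to implement. Here is a sketch (with invented names, and vertices labelled 0 to n−1 rather than 1 to n) that repeatedly adds a single edge joining non-adjacent vertices whose degree sum satisfies the rule (11.4), until no such pair remains.

```python
from itertools import combinations

def closure(n, edges):
    """Compute the closure [G] of a graph on vertices 0..n-1.
    `edges` is a set of frozensets; rule (11.4) adds the edge (u, v)
    whenever u and v are non-adjacent and deg(u) + deg(v) >= n."""
    E = set(edges)
    while True:
        deg = {v: sum(v in e for e in E) for v in range(n)}
        for u, v in combinations(range(n), 2):
            e = frozenset((u, v))
            if e not in E and deg[u] + deg[v] >= n:
                E.add(e)       # add one edge, then recompute degrees
                break
        else:
            return E           # no edge was added: E is the closure

# C4 satisfies Ore's condition (non-adjacent degree sums are 2 + 2 = 4),
# so its closure is the complete graph K4.
C4 = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (3, 0)]}
assert closure(4, C4) == {frozenset(e) for e in combinations(range(4), 2)}

# The path 0-1-2-3 has every non-adjacent degree sum below 4: nothing is added.
P4 = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3)]}
assert closure(4, P4) == P4
```

The C4 example illustrates the remark following the Bondy-Chvátal theorem below: whenever Ore's condition holds, the closure is the complete graph.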
Figure 11.2: The setup for the proof of the Bondy-Chvátal Theorem: adding the edge
e = (v1 , vn ) to Gj creates the Hamiltonian cycle (v1 , . . . , vn , v1 ) that’s found in Gj+1 .
The dashed lines spraying off into the middles of the diagrams are meant to indicate
that the vertices may have other edges besides those shown in black.
Theorem 11.7 (Bondy and Chvátal, 1976). A graph G is Hamiltonian if and only
if its closure [G] is Hamiltonian.
Before we prove this, notice that Dirac’s and Ore’s theorems are easy corollaries,
for when deg(v) ≥ n/2 for all vertices (Dirac’s condition) or when deg(u) + deg(v) ≥
n for all non-adjacent pairs (Ore’s condition), it’s clear that [G] is isomorphic to Kn
and, as we’ve seen, Kn is trivially Hamiltonian.
Proof. As the theorem is an if-and-only-if statement, we need to establish two things:
(1) if G is Hamiltonian then [G] is and (2) if [G] is Hamiltonian then G is too. The
first of these is easy in that the closure construction only adds edges to the graph,
so in the sequence of edge sets (11.3) G has edge set E1 and [G] has edge set EK
with K ≥ 1 and EK ⊇ E1 . This means that any edges appearing in a Hamiltonian
tour in G are automatically present in [G] too, so if G is Hamiltonian, [G] is also.
The second implication is harder and depends on an ingenious proof by contra-
diction. Assume for contradiction that G isn’t Hamiltonian, but [G] is. Now notice
that—by an argument similar to the one above—if some graph Gj⋆ in the sequence
(11.2) is Hamiltonian, then so are all the other Gj with j ≥ j⋆ . This means that if
the sequence is to begin with a non-Hamiltonian graph G = G1 and finish with a
Hamiltonian one GK = [G] there must be a single point at which the nature of the
graphs in the sequence changes. That is, there must be some j ≥ 1 such that Gj
isn’t Hamiltonian, but Gj+1 is, even though Gj+1 differs from Gj by only a single
edge. This situation is illustrated in Figure 11.2, where I have numbered the vertices
v1 . . . vn according to their position in the Hamiltonian cycle in Gj+1 and arranged
things so that the single edge whose addition creates the cycle is e = (v1 , vn ).
Now define the two sets
X = {vi | (vi−1 , vn ) ∈ Ej and 2 < i < n}
and
Y = {vi | (v1 , vi ) ∈ Ej and 2 < i < n}.
The first set, X, consists of those vertices vi whose neighbour vi−1 has a direct
connection to vn , while the second set, Y , consists of vertices that have a direct
connection to v1 : both sets are illustrated in Figure 11.3.
Notice that X and Y are defined to be subsets of {v3 , . . . , vn−1 }, so they exclude
v1 , v2 and vn . Thus X has degGj (vn ) − 1 members, as it includes one element for
each of the neighbours of vn except vn−1 , while |Y | = degGj (v1 ) − 1, as Y includes
all neighbours of v1 other than v2 . So then
|X| + |Y | = degGj (vn ) − 1 + degGj (v1 ) − 1
= degGj (vn ) + degGj (v1 ) − 2
≥n−2
as the closure construction is going to add the edge e = (v1 , vn ) when passing
from Gj to Gj+1 . But then, both X and Y are drawn from the set of vertices
{vi | 2 < i < n} which has only n − 3 members and so, by the pigeonhole principle,
there must be some vertex vk that is a member of both X and Y .
But then Gj contains the Hamiltonian cycle (v1 , v2 , . . . , vk−1 , vn , vn−1 , . . . , vk , v1 ): it uses the edges (vk−1 , vn ) and (v1 , vk ), which exist because vk ∈ X ∩ Y , together with edges of the cycle in Gj+1 , but not the missing edge e = (v1 , vn ). This contradicts the assumption that Gj is not Hamiltonian and so completes the proof.
Figure 11.4: The vertex vk is a member of X ∩ Y , which implies that there is, as
shown above, a Hamiltonian cycle in Gj .
11.4 Afterword
Students sometimes have trouble remembering the difference between Eulerian and
Hamiltonian graphs and I’m not unsympathetic: after all, both are named after
very famous, long-dead European mathematicians. One way out of this difficulty is
to learn more about the two men. Leonhard Euler, who was born in Switzerland,
lived longer ago (1707–1783) and was tremendously prolific, writing many hundreds
of papers that made fundamental contributions to essentially all of 18th century
mathematics. He also lived in a very alien scientific world in that he wrote his
papers in Latin and relied on royal patronage, first from the Russian emperor Peter
the Great and then, later, from Frederick the Great of Prussia and finally, toward
the end of his life, from Catherine the Great of Russia.
By contrast William Rowan Hamilton, who was Irish, lived much more re-
cently (1805–1865). He also made fundamental contributions across the whole of
mathematics—the distinction between pure and applied maths didn’t really exist
then—but he inhabited a much more recognisable scientific community, first work-
ing as a Professor of Astronomy at Trinity College in Dublin and then, for the rest
of his career, as the director of Dunsink Observatory, just outside the city.
Alternatively, one can remember the distinction between Eulerian and Hamil-
tonian tours by noting that everything about Eulerian multigraphs starts with ‘E’:
Eulerian tours go through every edge and are easy to find when every vertex has
even degree. On the other hand, Hamiltonian tours include every vertex and are
hard to find.
Part IV
Distance in Graphs
This lecture introduces the notion of a weighted graph and explains how some choices
of weights permit us to define a notion of distance in a graph.
Reading:
The material in this lecture comes from Chapter 3 of
Definition 12.1. Given a graph G(V, E), which may be either directed or undirected,
we can associate edge weights with G by specifying a function w : E → R. We
will write G(V, E, w) to denote the graph G(V, E) with edge weights given by w and
we will call such a graph a weighted graph.
We will write w(a, b) to indicate the weight of the edge e = (a, b) and if G(V, E, w)
is an undirected weighted graph we will require w(a, b) = w(b, a) for all (a, b) ∈ E.
Note that Definition 12.1 allows the weights to be negative or zero. That’s
because, as we’ll see soon, the weights can represent many things. If the vertices
represent places, then we could define a weight function w so that, for an edge
e = (a, b) ∈ E, the weight w(e) is:
• the time it takes to travel from a to b, in which case it may happen that
w(a, b) ̸= w(b, a);
Figure 12.1: In the graph at left there are no walks from a to b and so, by
convention, we define d(a, b) = ∞. The graph at right, which has edge weights
as indicated, illustrates a more serious problem. The cycle specified by the vertex
sequence (x, y, z, x) has negative weight and so there is no minimal-weight path from
a to b and hence no well-defined distance d(a, b).
• the profit made when we send a shipping container from a to b. This could
easily be negative if we had to bring an empty container back from someplace
we’d sent a shipment.
In any case, once we’ve defined weights for edges, it’s natural to define the weight
of a walk as follows.
Definition 12.2. Given a weighted graph G(V, E, w) and a walk from a to b defined
by the vertex sequence
a = v0 , . . . , vℓ = b,
so that its edges are ej = (vj−1 , vj ), the weight of the walk is
∑_{j=1}^{ℓ} w(ej).
But two issues, both of which are illustrated in Figure 12.1, present themselves
immediately:
(1) What if there aren’t any walks from a to b? In this case, by convention, we
define d(a, b) = ∞.
(2) What if some cycle in G has negative weight? As we will see below, this leads
to insurmountable problems and so we’ll just have to exclude this possibility.
The problem with cycles of negative weight is illustrated at the right in Figure 12.1.
The graph has V = {a, x, y, z, b} and edge weights
w(a, x) = w(x, y) = w(y, z) = w(z, b) = 1 and w(z, x) = −5.
The cycle specified by the vertex sequence (z, x, y, z) thus has weight
w(z, x) + w(x, y) + w(y, z) = −5 + 1 + 1 = −3
and one can see that this presents a problem for our definition of d(a, b) by consid-
ering the sequence of walks:
Walk                                          Weight
(a, x, y, z, b)                               1 + 1 + 1 + 1 = 4
(a, x, y, z, x, y, z, b)                      4 + (−5 + 1 + 1) = 1
(a, x, y, z, x, y, z, x, y, z, b)             4 − 2 × 3 = −2
. . .
(a, x, y, z, x, y, z, . . . , x, y, z, b)     4 − k × 3 = 4 − 3k
(in the last row the walk goes k times around the cycle before finishing at b)
There is no walk of minimal weight from a to b: one can always find a walk of lower
weight by tracing over the negative-weight cycle a few more times. We could escape
this problem by defining d(a, b) as the weight of a minimal-weight path¹, but instead
we will exclude the problematic cases explicitly:
Definition 12.3. Suppose G(V, E, w) is a weighted graph that does not contain any
cycles of negative weight. For vertices a and b we define the distance function
d : V × V → R as follows:
• d(a, a) = 0 for all a ∈ V ;
• d(a, b) = ∞ if there is no walk from a to b;
• d(a, b) is the weight of a minimal-weight walk from a to b when such walks
exist.
A warning
The word “distance” in Definition 12.3 is potentially misleading in that it is perfectly
possible to find weighted graphs in which d(a, b) < 0 for some (or even all) a and
b. Further, it’s possible that in a directed graph there may be vertices a and b such
that d(a, b) ≠ d(b, a). If we want our distance function to have all the properties
that the word “distance” normally suggests, it’s helpful to recall (or learn for the
first time) the definition of a metric on a set X. It’s a function d : X × X → R with
the following properties:
Non-negativity d(x, y) ≥ 0 ∀x, y ∈ X and d(x, y) = 0 ⇐⇒ x = y;
symmetry d(x, y) = d(y, x) ∀x, y ∈ X;
triangle inequality d(x, y) + d(y, z) ≥ d(x, z) ∀x, y, z ∈ X.
If d is a metric on X we say that the pair (X, d) constitute a metric space.
It’s not hard to prove (see the Problem Sets) that if G(V, E, w) is a weighted,
undirected graph in which w(e) > 0 ∀e ∈ E, then the function d : V × V → R from
Definition 12.3 is a metric on the vertex set V .
¹A path cannot revisit a vertex and hence cannot trace over a cycle.
Figure 12.2: An SSSP problem in which all the edge weights are 1 and the source
vertex s is shown in yellow.
Here we write d(v) = d(s, v), where s is the source vertex. Our goal is then to
compute d(v) for all vertices in the graph.
To illustrate the main ideas, we’ll use BFS to compute d(v) for all the vertices
in the graph pictured in Figure 12.2:
• Set d(s) = 0.
• Set d(v) = 1 for all s’s neighbours. That is, set d(v) = 1 for all vertices
v ∈ As = {u, w}.
• Set d(v) = 2 for those vertices that (a) are adjacent to vertices t with d(t) = 1
and (b) have not yet had a value for d(v) assigned.
Figure 12.3: The leftmost graph shows the result of the first two stages of the
informal BFS algorithm: we set d(s) = 0 and d(v) = 1 for all v ∈ As . In the
second stage we set d(v) = 2 for neighbours of vertices t with d(t) = 1, and so
on.
• Set d(v) = 3 for those vertices that (a) are adjacent to vertices t with d(t) = 2
and (b) have not yet had a value of d(v) assigned.
This process is illustrated in Figure 12.3 and, for the current example, these four
steps assign a value of d(v) to every vertex. In Section 12.4 we will return to this
algorithm and rewrite it in a more general way, but I’d like to conclude this section
by discussing why this approach works.
Then we have the following theorem, which captures the idea that a minimal-
weight path from v1 to vj consists of a minimal-weight path from v1 to one of
vj ’s neighbours—say vk such that (vk , vj ) ∈ E—followed by a final step from vk to
vj .
Further, if all cycles in G(V, E, w) have positive weight, then the equations (12.1)
have a unique solution.
12.4 Appendix: BFS revisited
The last two steps of our informal introduction to BFS had the general form
• Set d(v) = j + 1 for those vertices that (a) are adjacent to vertices t with
d(t) = j and (b) have not yet had a value of d(v) assigned.
The main technical problem in formalising the algorithm is to find a systematic way
of working our way outward from the source vertex. A data structure from computer
science called a queue provides an elegant solution. It's an ordered list that we'll
write from left-to-right, so that a queue containing vertices might look like
Q = {x, z, u, w, . . . , a}.
Two operations act on a queue:
push: add a new entry at the end (i.e. rightmost) position of the list;
pop: remove the first (i.e. leftmost) entry and, typically, do something with it.
Thus if our queue is Q = {b, c, a} and we push a vertex x onto it, the result is
{b, c, a, x}.
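Python's collections.deque provides exactly these two operations; a minimal sketch (the variable names here are mine):

```python
from collections import deque

# A queue written left-to-right: pops come from the left, pushes go on the right.
Q = deque(["b", "c", "a"])

Q.append("x")        # push x onto Q: {b, c, a} becomes {b, c, a, x}
first = Q.popleft()  # pop the first (leftmost) entry, here "b"

print(first, list(Q))  # b ['c', 'a', 'x']
```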
We can use this idea to organise the order in which we visit the vertices in BFS.
Our goal will be to compute d(v) = d(s, v) for all vertices in the graph and we’ll
start by setting d(s) = 0 and
d(v) = ∞ ∀v ̸= s.
This has two advantages: first, d(v) = ∞ is the correct value for any vertex that
is not reachable from s and second, it serves as a way to indicate that, as far as
our algorithm is concerned, we have yet to visit vertex v and so d(v) is yet-to-be-
determined.
We’ll then work our way through the vertices that lie in the same connected
component as s by
• pushing a vertex v onto the end of the queue whenever we set d(v), beginning
with the source vertex s and
• popping vertices off the queue in turn, working through the adjacency list of
the popped vertex u and examining its neighbours w ∈ Au in turn, setting
d(w) = d(u) + 1 whenever d(w) is currently marked as yet-to-be-determined.
Algorithm 12.5 (BFS for SSSP). Given an undirected graph G(V, E) and a dis-
tinguished source vertex s ∈ V , assume uniform edge weights w(e) = 1 ∀e ∈ E and
find the distances d(v) = d(s, v) for all v ∈ V .
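In code, the algorithm can be sketched as follows. This is a minimal Python version, not the notes' own pseudocode: the graph is given as a dict of adjacency lists, read off from Table 12.1, and names such as bfs_sssp are my own.

```python
from collections import deque

def bfs_sssp(adj, s):
    """Given adjacency lists adj (a dict mapping each vertex to a list of
    neighbours) and a source s, return d with d[v] = d(s, v), taking every
    edge to have weight 1. Unreachable vertices keep d[v] = infinity."""
    d = {v: float("inf") for v in adj}  # inf marks "yet to be determined"
    d[s] = 0
    Q = deque([s])                      # push the source onto the queue
    while Q:
        u = Q.popleft()                 # pop the next vertex
        for w in adj[u]:                # work through u's adjacency list
            if d[w] == float("inf"):    # w has not been visited yet
                d[w] = d[u] + 1
                Q.append(w)             # push w onto the queue
    return d

# The graph of Figure 12.4, with adjacency lists in the order of Table 12.1:
adj = {"S": ["A", "C", "G"], "A": ["B", "S"], "B": ["A"],
       "C": ["D", "E", "F", "S"], "D": ["C"], "E": ["C", "H"],
       "F": ["C", "G"], "G": ["F", "H", "S"], "H": ["E", "G"]}
print(bfs_sssp(adj, "S"))
```

With the adjacency lists in this order, the pops and pushes reproduce the rows of Table 12.1.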
Figure 12.4 illustrates the early stages of applying the algorithm to a small graph,
while Table 12.1 provides a complete account of the computation.
Remarks
• If d(u) = ∞ when the algorithm finishes, then u and s lie in separate connected
components.
• Because the computation works through adjacency lists, each edge gets con-
sidered at most twice and so the algorithm requires O(|E|) steps, where a step
consists of checking whether d(u) = ∞ and, if so, updating its value.
u w ∈ Au Action Resulting Queue
– – Start {S}
S A set d(A) = 1 and push A {A}
S C set d(C) = 1 and push C {A, C}
S G set d(G) = 1 and push G {A, C, G}
A B set d(B) = 2 and push B {C, G, B}
A S none, as d(S) = 0 {C, G, B}
C D set d(D) = 2 and push D {G, B, D}
C E set d(E) = 2 and push E {G, B, D, E}
C F set d(F ) = 2 and push F {G, B, D, E, F }
C S none {G, B, D, E, F }
G F none {B, D, E, F }
G H set d(H) = 2 and push H {B, D, E, F, H}
G S none {B, D, E, F, H}
B A none {D, E, F, H}
D C none {E, F, H}
E C none {F, H}
E H none {F, H}
F C none {H}
F G none {H}
H E none {}
H G none {}
Table 12.1: A complete record of the execution of the BFS algorithm for the graph
in Figure 12.4. Each row corresponds to one pass through the innermost loop of
Algorithm 12.5, those steps that check whether d(w) = ∞ and act accordingly.
The table is divided into sections—separated by horizontal lines—within which the
algorithm works through the adjacency list of the most recently-popped vertex u.
Figure 12.4: In the graphs above the source vertex s is shown in yellow and a
vertex v is shown with a number on it if, at that stage in the algorithm, d(v) has
been determined (that is, if d(v) ̸= ∞). The graph at left illustrates the state of
the computation just after the initialisation: d(s) has been set to d(s) = 0, all other
vertices have d(v) = ∞ and the queue is Q = {s}. The graph at right shows the
state of the computation after we have popped s and processed its neighbours: d(v)
has been determined for A, C and G and they have been pushed onto the queue,
which is now Q = {A, C, G}.
Chapter 13
This lecture introduces tropical arithmetic1 and explains how to use it to calculate
the lengths of all the shortest paths in a graph.
Reading:
The material here is not discussed in any of the main references for the course. The
lecture is meant to be self-contained, but if you find yourself intrigued by tropical
mathematics, you might want to look at a recent introductory article
The book by Heidergott et al. includes a tropical model of the Dutch railway network
and is more accessible than either the book by Maclagan and Sturmfels or the latter
parts of the article by Speyer and Sturmfels.
¹Maclagan & Sturmfels write: The adjective “tropical” was coined by French mathematicians,
notably Jean-Eric Pin, to honour their Brazilian colleague Imre Simon, who pioneered the use of
min-plus algebra in optimisation theory. There is no deeper meaning to the adjective “tropical”.
It simply stands for the French view of Brazil.
13.1 All pairs shortest paths
In a previous lecture (Video 9.1) we used Breadth First Search (BFS) to solve the
single-source shortest paths problem in a weighted graph G(V, E, w) where the weights are trivial
in the sense that w(e) = 1 ∀e ∈ E. Today we’ll consider the problem where the
weights can vary from edge to edge, but are constrained so that all cycles have
positive weight. This ensures that Bellman’s equations have a unique solution. Our
approach to the problem depends on two main ingredients: a result about powers
of the adjacency matrix and a novel kind of arithmetic.
The only nonzero entries in this sum appear for those values of k for which both
A^{ℓ0}_{ik} and A_{kj} are nonzero. Now, the only possible nonzero value for A_{kj} is 1, which
happens when the edge (k, j) is present in the graph. Thus we could also think of
the sum above as running over vertices k such that the edge (k, j) is in E:
A^{ℓ0+1}_{ij} = ∑_{{k | (k,j) ∈ E}} A^{ℓ0}_{ik}.
By the inductive hypothesis, A^{ℓ0}_{ik} is the number of distinct, length-ℓ0 walks from i to
k. And if we add the edge (k, j) to the end of such a walk, we get a walk from i to
j. All the walks produced in this way are clearly distinct (those that pass through
different intermediate vertices k are obviously distinct and even those that have
the same k are, by the inductive hypothesis, different somewhere along the i to k
segment). Further, every walk of length ℓ0 + 1 from i to j must consist of a length-ℓ0
walk from i to some neighbour or predecessor k of j, followed by a final step from k
to j, so we have completed the inductive argument and proved the result.
Figure 13.1: In G, the graph at left, any walk from vertex 1 to vertex 3 must have
even length while in H, the directed graph at right, there are no walks of length 3 or
greater.
Two examples
The graph at left in Figure 13.1 contains six walks of length 2. If we represent them
with vertex sequences they're
(1, 2, 1), (1, 2, 3), (2, 1, 2), (2, 3, 2), (3, 2, 1), and (3, 2, 3). (13.2)
The adjacency matrix of G and its square are
A_G = [ 0 1 0 ]    A_G^2 = [ 1 0 1 ]    (13.3)
      [ 1 0 1 ]            [ 0 2 0 ]
      [ 0 1 0 ]            [ 1 0 1 ]
Comparing these we see that the computation based on powers of A_G agrees with
the list of walks, just as Theorem 13.1 leads us to expect:
• A21,1 = 1 and there is a single walk, (1, 2, 1), from vertex 1 to itself;
• A21,3 = 1: counts the single walk, (1, 2, 3), from vertex 1 to vertex 3;
• A22,2 = 2: counts the two walks from vertex 2 to itself, (2,1,2) and (2,3,2);
• A23,1 = 1: counts the single walk, (3, 2, 1), from vertex 3 to vertex 1;
• A23,3 = 1: counts the single walk, (3, 2, 3), from vertex 3 to itself.
Something similar happens for the directed graph H that appears at right in Fig-
ure 13.1, but it has only a single walk of length two and none at all for lengths three
or greater.
A_H = [ 0 1 0 ]    A_H^2 = [ 0 0 1 ]    A_H^3 = [ 0 0 0 ]    (13.4)
      [ 0 0 1 ]            [ 0 0 0 ]            [ 0 0 0 ]
      [ 0 0 0 ]            [ 0 0 0 ]            [ 0 0 0 ]
An alternative to BFS
The theorem we’ve just proved suggests a way to find all the shortest paths in the
special case where w(e) = 1 ∀e ∈ E. Of course, in this case the weight of a path is
the same as its length.
(1) Observe that a shortest path has length at most n − 1.
(2) Compute the sequence of powers of the adjacency matrix: A, A^2, . . . , A^{n−1}.
(3) To find the length of a shortest path from vi to vj , look through the sequence
of matrix powers and find the smallest ℓ such that A^{ℓ}_{ij} > 0. This ℓ is the
desired length.
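This three-step procedure can be sketched in a few lines of pure Python. The function names are mine and the implementation is naive, meant only to illustrate the idea; A below is the adjacency matrix of the path graph of Figure 13.1, with vertices 1, 2, 3 stored at indices 0, 1, 2.

```python
def matmul(A, B):
    """Ordinary product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shortest_length(A, i, j):
    """Smallest l with (A^l)[i][j] > 0, searching l = 1, ..., n-1;
    returns None if no power works, i.e. there is no path from i to j."""
    n = len(A)
    P = A
    for l in range(1, n):       # a shortest path has length at most n - 1
        if P[i][j] > 0:
            return l
        P = matmul(P, A)
    return None

# Adjacency matrix of the path graph G of Figure 13.1:
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
print(shortest_length(A, 0, 2))  # vertex 1 to vertex 3: 2
print(matmul(A, A)[1][1])        # length-2 walks from vertex 2 to itself: 2
```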
In the rest of the lecture we’ll generalise this strategy to graphs with arbitrary
weights.
a ⊕ b = min(a, b) = min(b, a) = b ⊕ a
and
a ⊗ b = a + b = b + a = b ⊗ a.
The tropical arithmetic operators are also associative:
a ⊕ (b ⊕ c) = min(a, min(b, c)) = min(a, b, c) = min(min(a, b), c) = (a ⊕ b) ⊕ c
and
a ⊗ (b ⊗ c) = a + (b + c) = a + b + c = (a + b) + c = (a ⊗ b) ⊗ c.
Also, there are distinct additive and multiplicative identity elements. For all
a ∈ R ∪ {∞} we have
a ⊕ ∞ = min(a, ∞) = a and a ⊗ 0 = 0 + a = a.
There are, however, important differences between tropical and ordinary arithmetic.
In particular, there are no additive inverses2 in tropical arithmetic and so one cannot
always solve linear equations. For example, there is no x ∈ R ∪ {∞} such that
(2 ⊗ x) ⊕ 5 = 11. To see why, rewrite the equation as follows:
(2 ⊗ x) ⊕ 5 = (2 + x) ⊕ 5 = min(2 + x, 5) ≤ 5.
²Students who did Algebraic Structures II might recognise that this collection of properties
means that tropical arithmetic over R ∪ {∞} is a semiring.
13.3.1 Tropical matrix operations
Given two m × n matrices A and B whose entries are drawn from R ∪ {∞}, we'll
define the tropical matrix sum A ⊕ B by
(A ⊕ B)_{ij} = A_{ij} ⊕ B_{ij} = min(A_{ij}, B_{ij}).
And for compatibly-shaped tropical matrices A and B we can also define the tropical
matrix product by
(A ⊗ B)_{ij} = ⊕_{k=1}^{n} A_{ik} ⊗ B_{kj} = min_{1≤k≤n} (A_{ik} + B_{kj}).
There is also a tropical identity matrix Î: it has zeroes on the diagonal and ∞
everywhere else. It's easy to check that if A is an m × n tropical matrix then
Î_m ⊗ A = A = A ⊗ Î_n .
For example, taking
A = [ 1  2 ]    and    B = [ ∞  1 ]
    [ 0  ∞ ]               [ 1  ∞ ]
we find
A ⊗ B = [ (1 ⊗ ∞) ⊕ (2 ⊗ 1)    (1 ⊗ 1) ⊕ (2 ⊗ ∞) ]
        [ (0 ⊗ ∞) ⊕ (∞ ⊗ 1)    (0 ⊗ 1) ⊕ (∞ ⊗ ∞) ]
      = [ min(1 + ∞, 2 + 1)    min(1 + 1, 2 + ∞) ]
        [ min(0 + ∞, ∞ + 1)    min(0 + 1, ∞ + ∞) ]
      = [ 3  2 ]
        [ ∞  1 ].
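The tropical product takes only a few lines of Python, using float('inf') for ∞; the function name trop_mul is mine, and the matrices are those of the 2 × 2 calculation just carried out.

```python
def trop_mul(A, B):
    """Tropical matrix product: entry (i, j) is min over k of A[i][k] + B[k][j]."""
    n = len(B)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]

inf = float("inf")
A = [[1, 2], [0, inf]]
B = [[inf, 1], [1, inf]]
print(trop_mul(A, B))  # [[3, 2], [inf, 1]]
```

Note that Python's float('inf') already behaves like the tropical ∞: adding anything to it gives inf, and min against it returns the finite operand.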
13.3.2 A tropical version of Bellman’s equations
Recall Bellman's equations from Section 12.3.2 (Video 9.3). Given a weighted graph
G(V, E, w) in which all cycles have positive weight, we can find uj = d(v1 , vj ) by
solving the system
u1 = 0 and u_j = min_{k≠j} ( u_k + w_{k,j} ) for 2 ≤ j ≤ n,
which looks almost like the tropical matrix product u ⊗ w: we'll exploit this
observation in the next section.
Note that W is very similar to the matrix w defined by Eqn. (13.8): the two differ
only along the diagonal, where wii = ∞ for all i, while Wii = 0.
Proof. We proceed by induction on ℓ. The base case is ℓ = 1 and it’s clear that the
only length-one walks are the edges themselves, while Wii = 0 by construction.
Now suppose the result is true for all ℓ ≤ ℓ0 and consider the case ℓ = ℓ0 + 1.
We will first prove the result for the off-diagonal entries, those for which i ̸= j. For
these entries we have
W^{⊗ℓ0+1}_{i,j} = ⊕_{k=1}^{n} W^{⊗ℓ0}_{i,k} ⊗ W_{k,j} = min_{1≤k≤n} ( W^{⊗ℓ0}_{i,k} + W_{k,j} ) (13.10)
and the inductive hypothesis says that W^{⊗ℓ0}_{i,k} is either the weight of a minimal-weight
walk from vi to vk containing ℓ0 or fewer edges or, if no such walks exist, W^{⊗ℓ0}_{i,k} = ∞.
W_{k,j} is given by Eqn. (13.9) and so there are three possibilities for the terms
W^{⊗ℓ0}_{i,k} + W_{k,j} (13.11)
that appear in the tropical sum (13.10):
(1) They are infinite for all values of k, and so direct calculation gives W^{⊗ℓ0+1}_{i,j} = ∞.
This happens when, for each k, we have one or both of the following:
• W^{⊗ℓ0}_{i,k} = ∞, in which case the inductive hypothesis says that there are
no walks of length ℓ0 or less from vi to vk , or
• W_{k,j} = ∞, in which case there is no edge from vk to vj .
And since this is true for all k, it implies that there are no walks of length
ℓ0 + 1 or less that run from vi to vj . Thus the lemma holds when i ≠ j and
W^{⊗ℓ0+1}_{ij} = ∞.
(2) The expression in (13.11) is finite for at least one value of k, but not for k = j.
Then as W_{jj} = 0 by construction, we know W^{⊗ℓ0}_{ij} = ∞ and so there are no
walks of length ℓ0 or less running from vi to vj . Further,
W^{⊗ℓ0+1}_{i,j} = min_{k≠j} ( W^{⊗ℓ0}_{i,k} + W_{k,j} )
Finally, note that the reasoning above works for W^{⊗ℓ}_{ii} too: W^{⊗ℓ}_{ii} is the weight of a
minimal-weight walk from vi to itself. And given that any walk from vi to itself
must contain a cycle and that all cycles have positive weight, we can conclude that
the tropical sum
W^{⊗ℓ0+1}_{i,i} = min_{k} ( W^{⊗ℓ0}_{i,k} + W_{k,i} )
is minimised by k = i, when W^{⊗ℓ0}_{i,k} = W^{⊗ℓ0}_{i,i} = 0 (by the inductive hypothesis) and
W_{i,i} = 0 (by construction), so
min_{k} ( W^{⊗ℓ0}_{i,k} + W_{k,i} ) = W^{⊗ℓ0}_{i,i} + W_{i,i} = 0 + 0 = 0.
An example
The graph illustrated in Figure 13.2 is small enough that we can just read off the
weights of its minimal-weight paths. If we assemble these results into a matrix D
whose entries are given by
D_{ij} = 0 if i = j; D_{ij} = d(v_i, v_j) if i ≠ j and v_j is reachable from v_i;
and D_{ij} = ∞ otherwise. In this case we get
D = [ 0  −2  −1 ]
    [ 3   0   1 ]
    [ 2   0   0 ].
To verify Theorem 13.4 we need to write down the weight matrix and compute
its tropical square, which are
W = [ 0  −2   1 ]    and    W^{⊗2} = [ 0  −2  −1 ]
    [ ∞   0   1 ]                    [ 3   0   1 ]
    [ 2   ∞   0 ]                    [ 2   0   0 ].
Figure 13.2: The graph above contains two directed cycles, (v1 , v2 , v3 , v1 ), which has
weight 1, and (v1 , v3 , v1 ), which has weight 3. Both weights are positive, so
Theorem 13.4 applies and we can compute the weights of minimal-weight paths using
tropical powers of the weight matrix.
The graph has n = 3 vertices and so we expect W ⊗(n−1) = W ⊗2 to agree with the
distance matrix D, which it does.
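This agreement is easy to confirm in Python; trop_mul is my name for the tropical product, and W and D are the matrices above.

```python
def trop_mul(A, B):
    """Tropical matrix product: entry (i, j) is min over k of A[i][k] + B[k][j]."""
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

inf = float("inf")
# Weight matrix of the graph in Figure 13.2:
W = [[0, -2, 1],
     [inf, 0, 1],
     [2, inf, 0]]
# Distance matrix read off from the graph:
D = [[0, -2, -1],
     [3, 0, 1],
     [2, 0, 0]]

W2 = trop_mul(W, W)  # W^{(x)2} = W^{(x)(n-1)}, since n = 3
print(W2 == D)       # True
```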
Chapter 14
This lecture applies ideas about distance in weighted graphs to solve problems in
the scheduling of large, complex projects.
Reading:
The topic is discussed in Section 3.5 of
but it is such an important application that it is also treated in many other places.
Task    Time to Complete    Prerequisites
A 1 None
B 2 None
C 3 A&B
D 4 B
E 4 C
F 4 C&D
G 6 E&D
H 6 E&F
Table 14.1: Summary of a modestly-sized project. The first column lists various
tasks required for the completion of the project, while the second column gives the
time (in minutes) needed to complete each task and the third column gives each task’s
immediate prerequisites.
• There are edges that run from prerequisites to the tasks that depend on them. Thus,
for example, there is a directed edge (A, C), as task C has task A as a prerequisite.
• There are also two extra vertices, one called S (for “start”) that requires
no time to complete, but is a predecessor for all the tasks that have no
prerequisites, and another, Z, that corresponds to finishing the project and is
a successor of all tasks that are not prerequisites for any other task.
• There are edge weights that correspond to the time it takes to complete the
task at the tail vertex. Thus—as task B takes 2 minutes to complete—both
edges coming out of the vertex B have weight 2.
Figure 14.1: The digraph associated with the scheduling problem from Table 14.1.
Task                   Prereq's
P: get a pink form     G
B: get a blue form     P
G: get a green form    B
Figure 14.2: An example showing why the digraph associated with a scheduling
problem shouldn’t contain cycles. It represents a bureaucratic nightmare in which
one needs a pink form P in order to get a blue form B in order to get the green
form G that one needs to get a pink form P .
(1) What is the shortest time in which we can complete the work?
(2) What is the earliest time (measured from the start of the project) at which
we can start a given task?
(3) Are there any tasks whose late start would delay the whole project?
(4) For any tasks that don’t need to be started as early as possible, how long can
we delay their starts?
In the discussion that follows, we’ll imagine that we have as many resources as we
need (as many hands to help in the kitchen, as many employees and as much equip-
ment as needed to pursue multiple tasks in parallel · · · ). In this setting, Lemma 14.1,
proved below, provides a tool to answer all of these questions.
Figure 14.3: The shortest time in which we can complete this project is 10 hours, the
weight of a maximal-weight path from the starting vertex S to the terminal vertex
Z.
The proof of Lemma 14.1 turns on the observation that the times tv satisfy
equations that look similar to Bellman's Equations, except that they have a max()
where Bellman's Equations have a min():
t_S = 0 and t_v = max_{u ∈ P_v} ( t_u + w(u, v) ). (14.1)
Here P_v is v's predecessor list and w(u, v) is the weight of the edge
from u to v or, equivalently, the time it takes to complete the task corresponding to
u.
Although the Bellman-like equations above provide an elegant characterisation
of the tv , they aren’t necessarily all that practical as a way to calculate the tv .
The issue is that in order to use Eqn. (14.1) to compute tv , we need tu for all v’s
predecessors u ∈ Pv . And for each of them, we need tw for w ∈ Pu · · · , and so on.
Fortunately this problem has a simple resolution in DAGs, as we’ll see below. The
idea is to find a clever way to organise the computations so that the results we need
when computing tv are certain to be available.
• (u, v) ∈ E ⇒ Φ(u) < Φ(v);
• Φ(v) = Φ(u) ⇒ u = v;
In other words, a topological ordering is a way of numbering the vertices so that the
graph’s directed edges always point from a vertex with a smaller index to a vertex
with a bigger one.
Topological orderings are not, in general, unique, as is illustrated in Figure 14.4,
but as the following results show, a DAG always has at least one.
Lemma 14.3 (DAGs contain sink vertices). If G(V, E) is a directed, acyclic graph
then it contains at least one vertex v with degout (v) = 0. Such a vertex is sometimes
called a sink vertex or a sink.
Proof of Lemma 14.3. Construct a walk through G(V, E) as follows. First choose
an arbitrary vertex v0 ∈ V . If degout (v0 ) = 0 we are finished, but if not choose an
arbitrary successor of v0 , v1 ∈ Sv0 . If degout (v1 ) = 0 we are finished, but if not,
choose an arbitrary successor of v1 , v2 ∈ Sv1 · · · and so on. This construction can
never revisit a vertex as G is acyclic. Further, as G has only finitely many vertices,
the construction must come to a stop after at most |V | − 1 steps. But the only way
for it to stop is to reach a vertex vj such that degout (vj ) = 0, which proves that such
a vertex must exist.
Theorem 14.4 (DAGs have topological orderings). A directed, acyclic graph G(V, E)
always has a topological ordering.
Proof of Theorem 14.4. One can prove this by induction on the number of vertices.
The base case is |V | = 1 and clearly, assigning the number 1 to the sole vertex gives
a topological ordering.
Now suppose the result is true for all DAGs with |V | ≤ n0 and consider a
DAG with |V | = n0 + 1. Lemma 14.3 tells us that G contains a vertex w with
degout (w) = 0. Construct G′ (V ′ , E ′ ) = G\w. It is a DAG (because G was one), but
Figure 14.5: A topological ordering for the digraph associated with the scheduling
problem from Table 14.1 in which the vertex label v has been replaced by the value
Φ(v) assigned by the ordering that’s listed in Table 14.2.
v S A B C D E F G H Z
Φ(v) 1 2 3 4 5 6 7 8 9 10
has only |V ′ | = n0 vertices and so, by the inductive hypothesis, G′ has a topological
ordering Φ′ : V ′ → {1, 2, . . . , n0 }. We can extend this to obtain a function
Φ : V → {1, 2, . . . , n0 + 1} by choosing
Φ(v) = Φ′ (v) if v ≠ w and Φ(v) = n0 + 1 if v = w.
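The proof is constructive and translates directly into code: repeatedly find a sink (Lemma 14.3 guarantees one exists), give it the largest unused number, and delete it. A Python sketch follows; the function name and the dict-of-successor-lists representation are my own choices.

```python
def topological_ordering(succ):
    """Given a DAG as a dict mapping each vertex to its list of successors,
    return phi assigning the numbers 1..n so that every edge (u, v) has
    phi[u] < phi[v]. Mirrors the induction in Theorem 14.4."""
    succ = {v: list(ws) for v, ws in succ.items()}  # working copy
    phi = {}
    number = len(succ)
    while succ:
        # Lemma 14.3: a (nonempty) DAG always contains a sink.
        sink = next(v for v, ws in succ.items() if not ws)
        phi[sink] = number          # the sink gets the largest remaining label
        number -= 1
        del succ[sink]              # form G' = G \ sink
        for ws in succ.values():    # remove edges into the deleted sink
            if sink in ws:
                ws.remove(sink)
    return phi

# The scheduling DAG of Table 14.1, with start vertex S and finish vertex Z:
succ = {"S": ["A", "B"], "A": ["C"], "B": ["C", "D"], "C": ["E", "F"],
        "D": ["F", "G"], "E": ["G", "H"], "F": ["H"], "G": ["Z"],
        "H": ["Z"], "Z": []}
phi = topological_ordering(succ)
# Every edge points from a smaller label to a larger one:
assert all(phi[u] < phi[v] for u in succ for v in succ[u])
```

As the notes observe, topological orderings are not unique: the ordering this code finds is valid but need not coincide with the one listed in Table 14.2.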
Figure 14.6: In the graphs above the vertex labels have been replaced with values of
tj , the earliest times at which the corresponding task can start. The graph at left
shows the edges that enter into Eqn. (14.1) for the computation of t7 while the graph
at right shows all the tj .
This expression, along with the observation that TZ = tZ , allows us to find Tv for all
tasks by working backwards through the DAG. Figure 14.7 illustrates this for our
main example, while Table 14.3 lists tv and Tv for all vertices v ∈ V .
Figure 14.7: Here a vertex v is labelled with a pair tv : Tv showing both tv ,
the earliest time at which the corresponding task could start, and Tv , the latest time
by which the task must start if the whole project is not to be delayed. This project
has only a single critical path, (S, B, D, F, H, Z), which is highlighted in red.
v S A B C D E F G H Z
tv 0 0 0 2 2 5 6 9 10 16
Tv 0 2 0 3 2 6 6 10 10 16
Table 14.3: The earliest starts tv and latest starts Tv for the main example.
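The two passes that produce Table 14.3 can be sketched in Python: a forward pass over a topological order computes the earliest starts tv using the max-variant of Bellman's equations, and a backward pass computes the latest starts Tv. The task data come from Table 14.1; the backward-pass formula, Tv = min over successors u of (Tu minus the duration of v), is my reading of the recurrence, checked against Table 14.3.

```python
# Durations (minutes) and immediate prerequisites from Table 14.1; S and Z are
# the artificial start and finish vertices, each of duration zero.
dur = {"S": 0, "A": 1, "B": 2, "C": 3, "D": 4, "E": 4, "F": 4,
       "G": 6, "H": 6, "Z": 0}
pred = {"S": [], "A": ["S"], "B": ["S"], "C": ["A", "B"], "D": ["B"],
        "E": ["C"], "F": ["C", "D"], "G": ["E", "D"], "H": ["E", "F"],
        "Z": ["G", "H"]}
order = ["S", "A", "B", "C", "D", "E", "F", "G", "H", "Z"]  # a topological order

# Forward pass: earliest start t_v = max over predecessors u of (t_u + dur[u]).
t = {}
for v in order:
    t[v] = max((t[u] + dur[u] for u in pred[v]), default=0)

# Backward pass: latest start T_v = min over successors u of (T_u - dur[v]),
# with T_Z = t_Z for the finish vertex.
succ = {v: [u for u in order if v in pred[u]] for v in order}
T = {}
for v in reversed(order):
    T[v] = min((T[u] - dur[v] for u in succ[v]), default=t[v])

print(t)  # earliest starts: matches the t_v row of Table 14.3
print(T)  # latest starts: matches the T_v row of Table 14.3
```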
Tasks along a maximal-weight path from S to Z have Tv = tv , so a late start for any
one of them would delay the whole project. Such maximal-weight paths play a crucial role in project
management and so there is a term to describe them: they are called critical paths.
Tasks whose vertices do not lie on a critical path have Tv > tv and so do not require
such keen supervision.
Part V
Planar Graphs
Chapter 15
Planar Graphs
This lecture introduces the idea of a planar graph—one that you can draw in such
a way that the edges don’t cross. Such graphs are of practical importance in, for
example, the design and manufacture of integrated circuits as well as the automated
drawing of maps. They’re also of mathematical interest in that, in a sense we’ll
explore, there are really only two non-planar graphs.
Reading:
The first part of our discussion is based on that found in Chapter 10 of
J. A. Bondy and U. S. R. Murty (2008), Graph Theory, Vol. 244 of
Springer Graduate Texts in Mathematics, Springer Verlag,
but in subsequent sections I’ll also draw on material from Section 1.5 of
Dieter Jungnickel (2013), Graphs, Networks and Algorithms, 4th edition,
which is available online via SpringerLink.
Definition 15.1. A curve in the plane is a continuous image of the unit interval.
That is, a curve is a set of points
C = {γ(t) ∈ R² | 0 ≤ t ≤ 1}
traced out as t varies across the closed unit interval. Here γ(t) = (x(t), y(t)), where
x(t) : [0, 1] → R and y(t) : [0, 1] → R are continuous functions. If the curve does
not intersect itself (that is, if γ(t1 ) = γ(t2 ) ⇒ t1 = t2 ) then it is a simple curve.
Definition 15.2. A closed curve is a continuous image of the unit circle or,
equivalently, a curve in which γ(0) = γ(1). If a closed curve doesn’t intersect itself
anywhere other than at γ(0) = γ(1), then it is a simple closed curve.
Figure 15.1 and Table 15.1 give examples of these two definitions, while the
following one, which is illustrated in Figure 15.2, sets the stage for this section’s key
result.
Figure 15.1: From left to right: γ1 , a simple curve; γ2 , a curve that has an
intersection, so is not simple; γ3 , a simple closed curve and γ4 , a closed curve with
an intersection. Explicit formulae for the curves and their intersections appear in
Table 15.1.
Theorem 15.4 (The Jordan Curve Theorem). A simple closed curve C in the plane
divides the rest of the plane into two disjoint, arcwise-connected, open sets. These
two open sets are called the interior and exterior of C, often denoted Int(C ) and
Ext(C ), and any curve joining a point x ∈ Int(C ) to a point y ∈ Ext(C ) intersects
C at least once.
Curve     x(t)    y(t)
γ1 (t)    2t      24t³ − 36t² + 14t − 1
Figure 15.2: The two shaded regions above are, individually, arcwise connected,
but their union is not: any curve connecting x to y would have to pass outside the
shaded regions.
4 vertices, 3 edges, 1 (infinite) face;
3 vertices, 3 edges, 2 faces;
4 vertices, 5 edges, 3 faces;
8 vertices, 12 edges, 6 faces.
Figure 15.4: Four examples of planar graphs with numbers of faces, vertices and
edges for each.
Definition 15.5. A planar diagram for a graph G(V, E) with edge set E =
{e1 , . . . , em } is a collection of simple curves {γ1 , . . . , γm } that represent the edges
and have the property that the curves γj and γk corresponding to two distinct edges
ej and ek intersect if and only if the two edges are incident on the same vertex and,
in this case, they intersect only at the endpoints that correspond to their common
vertex.
If a planar graph G contains cycles then the curves that correspond to the edges
in the cycles link together to form simple closed curves that divide the plane into
finitely many disjoint open sets called faces. Even if the graph has no cycles, there
will still be one infinite face: see Figure 15.4.
Figure 15.5: Deleting the edge e causes two adjacent faces in G to merge into a
single face in G′ .
Base case The smallest connected planar graph contains only a single vertex, so
has n = 1, m = 0 and f = 1. Thus
2+m−n=2+0−1=1=f
just as Euler’s formula demands.
Inductive step Suppose the result is true for all m ≤ m0 and consider a connected,
planar graph G(V, E) with |E| = m = m0 + 1 edges. Suppose further that G
has |V | = n vertices and a planar diagram with f faces. Then one of the following
things is true:
• G is a tree, in which case Euler’s formula is true by the lemma proved
above;
• G contains at least one cycle.
If G contains a cycle, choose an edge e ∈ E that’s part of that cycle and form
G′ = G\e, which has m′ = m0 edges, n′ = n vertices and f ′ = f − 1 faces.
This last follows because breaking a cycle merges two adjacent faces, as is
illustrated in Figure 15.5.
As G′ has only m0 edges, we can use the inductive hypothesis to say that
f ′ = m′ − n′ + 2. Then, again using unprimed symbols for quantities in G, we
have:
f ′ = m′ − n′ + 2
f − 1 = m0 − n + 2
f = (m0 + 1) − n + 2
f = m − n + 2,
which establishes Euler’s formula for graphs that contain cycles.
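Euler's formula is easy to check mechanically against the four planar diagrams of Figure 15.4, using the vertex, edge and face counts listed there:

```python
# (vertices n, edges m, faces f) for the four examples of Figure 15.4.
examples = [(4, 3, 1), (3, 3, 2), (4, 5, 3), (8, 12, 6)]

for n, m, f in examples:
    assert f == m - n + 2, (n, m, f)  # Euler's formula: f = m - n + 2
print("Euler's formula holds for all four examples")
```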
Figure 15.6: Planar graphs with the maximal number of edges for a given number
of vertices. The graph with the yellow vertices has n = 5 and m = 9 edges, while
those with the blue vertices have n = 6 and m = 12.
|E| ≪ n(n − 1)/2 or s ≡ |E| / (n(n − 1)/2) ≪ 1 (15.1)
Table 15.2 makes it clear that when n > 5 the planar graphs become increasingly
sparse1 .
Definition 15.10. If a graph G(V, E) contains one or more cycles then the girth
of G is the length of a shortest cycle.
Number of non-isomorphic, connected graphs that are . . .
n    mmax    s       planar, with m = mmax    planar         planar or not
5    9       0.9     1                        20             21
6    12      0.8     2                        99             112
7    15      0.714   5                        646            853
8    18      0.643   14                       5,974          11,117
9    21      0.583   50                       71,885         261,080
10   24      0.533   ?                        1,052,805      11,716,571
11   27      0.491   ?                        17,449,299     1,006,700,565
12   30      0.455   ?                        313,372,298    164,059,830,476
Table 15.2: Here mmax is the maximal number of edges appearing in a planar
graph on the given number of vertices, while the column labelled s lists the measure
of sparsity given by Eqn. 15.1 for connected, planar graphs with mmax edges. The
remaining columns list counts of various kinds of graphs and make the point that as
n increases, planar graphs with m = mmax become rare in the set of all connected
planar graphs and that this family itself becomes rare in the family of connected
graphs.
Remark 15.11. A graph with n vertices has girth in the range 3 ≤ g ≤ n. The
lower bound arises because all cycles include at least three edges and the upper one
because the longest possible cycle occurs when G is isomorphic to Cn .
A: G is acyclic and m = n − 1;
B: G has at least one cycle and so has a well-defined girth g. In this case
m ≤ g(n − 2)/(g − 2). (15.2)
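As a quick sanity check of the bound in case B, consider the cube graph from Figure 15.4: it is planar with n = 8 and m = 12, and its girth is 4 (a fact not stated in the notes, but easy to verify: the shortest cycles are the square faces). The bound is attained exactly.

```python
# Case B of the theorem: a connected planar graph with a cycle satisfies
# m <= g(n - 2)/(g - 2), where g is the girth.
def girth_bound(n, g):
    return g * (n - 2) / (g - 2)

# Cube graph: n = 8, m = 12, g = 4.
m = 12
assert m <= girth_bound(8, 4)
print(girth_bound(8, 4))  # 12.0, so the cube attains the bound exactly
```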
Outline of the Proof (Video 10.4). We deal first with the case where G is acyclic
and then move on to the harder, more general case:
A: G is connected, so if it has no cycles it’s a tree and we’ve already proved (see
Theorem 6.13) that trees have m = n − 1.
B: When G contains one or more cycles, we'll prove the inequality (15.2) mainly
by induction on n, but we'll need several sub-cases. To see why, let's plan out
the argument.
Base case: n = 3
There is only a single graph on three vertices that contains a cycle: it's K3 ,
which has girth g = 3 and n = 3, so our theorem says
m ≤ g(n − 2)/(g − 2) = 3 × (3 − 2)/(3 − 2) = 3,
and K3 , with its m = 3 edges, satisfies this bound exactly.
Inductive hypothesis:
Assume the result is true for all connected, planar graphs that contain a cycle
and have n ≤ n0 vertices.
Inductive step:
Now consider a connected, planar graph G(V, E) with n0 + 1 vertices that
contains a cycle. We need, somehow, to reduce this graph to one for which we
can exploit the inductive hypothesis and so one naturally thinks of deleting
something. This leads to two main sub-cases, which are illustrated2 below.
B.1 G contains at least one bridge. In this case the road to a proof by in-
duction seems clear: we’ll delete the bridge and break G into two smaller
graphs.
B.2 G does not contain any bridges. Equivalently, every edge in G is part
of some cycle. Here it's less clear how to handle the inductive step and
so we will use an altogether different, non-inductive approach.
In case B.1 we delete the bridge, breaking G into two connected pieces, G1 (V1 , E1 ) and G2 (V2 , E2 ), and then try to apply the inductive
hypothesis to the pieces. If we define nj = |Vj | to be the number of vertices
in Gj and mj = |Ej | to be the corresponding number of edges, then we know
n1 + n2 = n = n0 + 1 and m1 + m2 = m − 1. (15.3)
But we need to take a little care as deleting a bridge leads to two further sub-
cases and we’ll need a separate argument for each. Given that the original
graph G contained at least one cycle—and noting that removing a bridge
can’t break a cycle—we know that at least one of the two pieces G1 and G2
contains a cycle. Our two sub-cases are thus:
B.1a Exactly one of the two pieces contains a cycle. We can assume without
loss of generality that it's G2 , so that G1 is a tree and G2 contains a
cycle: we handle this possibility in Case 15.13 below.
B.1b Each of the two pieces, G1 and G2 , contains at least one cycle: see
Case 15.14.
That leaves the bridgeless alternative:
B.2 G contains one or more cycles, but no bridges: see Case 15.15.
15.3.3 Gritty details of the proof of Theorem 15.12
Before we plunge into the Lemmas (Video 10.5), it's useful to make a few observations about the
ratio g/(g − 2) that appears in Eqn. (15.2). Recall (from Remark 15.11) that if a
graph on n vertices contains a cycle, then the girth is well-defined and lies in the
range 3 ≤ g ≤ n.
• The ratio g/(g − 2) = 1 + 2/(g − 2) is a decreasing function of g, so if g ≤ g′ then
g′/(g′ − 2) ≤ g/(g − 2). (15.4)
• The monotonicity of g/(g − 2), combined with the fact that g ≥ 3, implies
that g/(g − 2) is bounded from above by 3:
g ≥ 3 ⇒ g/(g − 2) ≤ 3/(3 − 2) = 3. (15.5)
• Finally, as 2/(g − 2) > 0, the ratio is bounded from below by 1:
1 < g/(g − 2). (15.6)
Case 15.13 (Case B.1a of Theorem 15.12). Here G contains a bridge and deleting
this bridge breaks G into two connected, planar subgraphs, G1 (V1 , E1 ) and G2 (V2 , E2 ),
one of which is a tree.
Proof. We can assume without loss that G1 is the tree and then argue that every
cycle that appears in G is also in G2 (we’ve only deleted a bridge), so the girth of
G2 is still g. Also, n1 ≥ 1, so n2 ≤ n0 and, by the inductive hypothesis, we have
m2 ≤ g(n2 − 2)/(g − 2).
But then, because G1 is a tree, we know that m1 = n1 − 1. Adding this to both
sides of the inequality yields
m1 + m2 ≤ (n1 − 1) + g(n2 − 2)/(g − 2)
or, equivalently,
m1 + m2 + 1 ≤ n1 + g(n2 − 2)/(g − 2).
Finally, noting that m = m1 + m2 + 1, we can say
m ≤ n1 + g(n2 − 2)/(g − 2)
  ≤ n1 × g/(g − 2) + g(n2 − 2)/(g − 2)
  = g(n1 + n2 − 2)/(g − 2)
  = g(n − 2)/(g − 2),
which is the result we sought. Here the step from the first line to the second follows
because 1 < g/(g − 2) (recall Eqn. (15.6)), so
n1 < n1 × g/(g − 2)
and the last line follows because n = n1 + n2 .
Case 15.14 (Case B.1b of Theorem 15.12). This case is similar to the previous one
in that here again G contains a bridge, but in this case deleting the bridge breaks G
into two connected, planar subgraphs, each of which contains at least one cycle (and
so has a well-defined girth).
Proof. We’ll say that G1 has girth g1 and G2 has girth g2 and note that, as the girth
is defined as the length of a shortest cycle—and as every cycle that appears in the
original graph G must still be present in one of the Gj —we know that
g ≤ g1 and g ≤ g2 . (15.7)
By the inductive hypothesis applied to the two pieces,
mj ≤ gj (nj − 2)/(gj − 2)
   ≤ g(nj − 2)/(g − 2) for j = 1, 2,
so that
m1 + m2 ≤ g(n1 + n2 − 4)/(g − 2),
where the step from the first line to the second follows from Eqn. 15.7 and the
monotonicity of the ratio g/(g − 2) (recall Eqn. (15.4)). If we again note that
1 < g/(g − 2) we can conclude that
m1 + m2 + 1 ≤ g(n1 + n2 − 4)/(g − 2) + 1
            ≤ g(n1 + n2 − 4)/(g − 2) + g/(g − 2)
            = g(n1 + n2 − 3)/(g − 2)
and so
m = m1 + m2 + 1 ≤ g(n1 + n2 − 3)/(g − 2) ≤ g(n1 + n2 − 2)/(g − 2)
or, as n = n1 + n2 ,
m ≤ g(n − 2)/(g − 2),
which is the result we sought.
Case 15.15 (Case B.2 of Theorem 15.12; Video 11.1). In the final case G(V, E) does not
contain any bridges, which implies that every edge in E is part of some cycle. This
makes it harder to see how to use the inductive hypothesis (we'd have to delete two
or more edges to break G into disconnected pieces . . . ) and so we will use an entirely
different argument based on Euler's Formula (Theorem 15.7).
Proof. First, define fj to be the number of faces whose boundary has j edges, making
sure to include the infinite face: Figure 15.9 illustrates this definition. Then, as
each edge appears in the boundary of exactly two faces, we have both
∑_{j=g}^{n} fj = f and ∑_{j=g}^{n} j × fj = 2m.
Note that both sums start at g, the girth, as we know that there are no cycles of
shorter length. But then
2m = ∑_{j=g}^{n} j × fj ≥ ∑_{j=g}^{n} g × fj = g ∑_{j=g}^{n} fj = gf,
where we obtain the inequality by replacing the length of the cycle j in j × fj with
g, the length of the shortest cycle (and hence the smallest value of j for which fj is
nonzero). Thus we have
2m ≥ gf or f ≤ 2m/g.
Figure 15.9: The example used to illustrate case B.2 of Theorem 15.12 has f3 = 2,
f4 = 2, f5 = 1 and f9 = 1 (for the infinite face): all other fj are zero.
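As a quick numeric sanity check (mine, not the notes'), we can feed the face counts from Figure 15.9 into the two identities above, recover f and m, use Euler's Formula n − m + f = 2 to recover n, and confirm both the inequality 2m ≥ gf and the bound of Theorem 15.12:

```python
# Face counts from Figure 15.9: j -> number of faces with j boundary edges.
face_counts = {3: 2, 4: 2, 5: 1, 9: 1}

f = sum(face_counts.values())                          # total number of faces
m = sum(j * fj for j, fj in face_counts.items()) // 2  # each edge borders 2 faces
n = 2 - f + m                                          # Euler: n - m + f = 2
g = min(face_counts)   # here the shortest face boundary is a shortest cycle

assert 2 * m >= g * f                 # the key inequality from the proof
assert m <= g * (n - 2) / (g - 2)     # the bound of Theorem 15.12
print(f, m, n, g)  # prints 6 14 10 3
```

So the example graph has 6 faces, 14 edges, 10 vertices and girth 3, and comfortably satisfies m ≤ g(n − 2)/(g − 2) = 24.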
Combining this with Euler's Formula (Theorem 15.7), which says n − m + f = 2 and hence
f = m − n + 2, we get m − n + 2 ≤ 2m/g. Rearranging,
m − 2m/g ≤ n − 2 so ((g − 2)/g) m ≤ n − 2 and m ≤ g(n − 2)/(g − 2),
which is the result we sought.
Corollary 15.16. If G(V, E) is a connected, planar graph with n ≥ 3 vertices and m edges then m ≤ 3n − 6.
Proof. Either G is a tree, in which case m = n − 1 and the bound in the Corollary
is certainly satisfied, or G contains at least one cycle. In the latter case, say that
the girth of G is g. We know 3 ≤ g ≤ n and our main result says
m ≤ (g/(g − 2)) × (n − 2).
Since g/(g − 2) ≤ 3 (recall Eqn. (15.5)), this gives m ≤ 3(n − 2) = 3n − 6 in this case too.
Now apply Corollary 15.16 to K5, which has n = 5 vertices: for a planar graph we would need
Figure 15.10: Both K5 and K3,3 are non-planar.
m ≤ 3n − 6 = 3 × 5 − 6 = 15 − 6 = 9,
but K5 actually has m = 10 edges, which is one too many for a planar graph. Thus
K5 can’t have a planar diagram.
K3,3 isn’t planar either, but Corollary 15.16 isn’t strong enough to establish
this. K3,3 has n = 6 and m = 3 × 3 = 9. Thus it easily satisfies the bound from
Corollary 15.16, which requires only that m ≤ 3 × 6 − 6 = 12. But if we now apply
our main result, Theorem 15.12, we’ll see that K3,3 can’t be planar. The relevant
inequality is
m ≤ g(n − 2)/(g − 2) = 4 × (6 − 2)/(4 − 2) = 16/2 = 8,
where I've used the fact that the girth
of K3,3 is g = 4. To see this, first note that any cycle in a bipartite graph has even
3. There is an O(n) algorithm that determines whether a graph on n vertices is planar and, if it
is, finds a planar diagram. We don’t have time to discuss it, but interested readers might like to
look at John Hopcroft and Robert Tarjan (1974), Efficient Planarity Testing, Journal of the ACM,
21(4):549–568. DOI: 10.1145/321850.321852
Figure 15.11: Knowing that K5 and K3,3 are non-planar makes it clear that these
two graphs can’t be planar either, even though neither violates the inequalities from
the previous section (check this).
length, so the shortest possible cycle in K3,3 has length 4, and then find such a cycle
(there are lots).
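The two computations above can be packaged into a few lines of Python (a sketch, with function names of my own choosing), which makes the division of labour between the two bounds explicit:

```python
# Two edge-count bounds for connected planar graphs:
#   Corollary 15.16:  m <= 3n - 6
#   Theorem 15.12:    m <= g(n - 2)/(g - 2), where g is the girth
def corollary_bound(n: int) -> int:
    return 3 * n - 6

def girth_bound(n: int, g: int) -> float:
    return g * (n - 2) / (g - 2)

# K5: n = 5, m = 10, g = 3. Already the corollary rules it out: 10 > 9.
print(10 > corollary_bound(5))   # True, so K5 is non-planar
# K3,3: n = 6, m = 9, g = 4 (bipartite, so no odd cycles).
print(9 > corollary_bound(6))    # False: the corollary is inconclusive
print(9 > girth_bound(6, 4))     # True, so K3,3 is non-planar
```

Only a violated bound proves anything: a graph that satisfies both bounds may still be non-planar, as the examples in Figure 15.11 show.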
Once we know that K3,3 and K5 are non-planar, we can see immediately that
many other graphs must be non-planar too, even when this would not be detected
by either of our inequalities: Figure 15.11 shows two such examples. The one on the
left has K5 as a subgraph, so even though it satisfies the bound from Theorem 15.12,
it can’t be planar. The example at right is similar in that any planar diagram for this
graph would obviously produce a planar diagram for K3,3 , but the sense in which
this second graph “contains” K3,3 is more subtle: we’ll clarify and formalise this in
the next section, then state a theorem that says, essentially, that every non-planar
graph contains K5 or K3,3 .
[Figure 15.12: two graphs, G (left) and H (right), on vertices a, b, . . . , j.]
Definition 15.18. Two graphs G1 (V1 , E1 ) and G2 (V2 , E2 ) are said to be homeo-
morphic if they are isomorphic to subdivisions of the same graph.
That is, we say G1 and G2 are homeomorphic if there is some third graph—call
it G0 —such that both G1 and G2 are subdivisions of G0 . Figure 15.13 shows several
graphs that are homeomorphic to K5 . Homeomorphism is an equivalence relation
on graphs4 and so all the graphs in Figure 15.13 are homeomorphic to each other as
well as to K5 .
The notion of homeomorphism allows us to state the following remarkable result:
Figure 15.13: These three graphs are homeomorphic to K5 , and hence also to each
other.
Theorem 15.19 (Kuratowski's Theorem). A graph G is planar if and only if it contains no subgraph homeomorphic to K5 or K3,3.
4. The keen reader should check this for herself.
Figure 15.14: Contracting the blue edge e = (u, v) in G yields the graph G/e at
right.
• introducing a new vertex, w, that is adjacent to all the vertices that are adjacent
to u or v in G. Writing N(x) for the set of vertices adjacent to x, the adjacency lists of u, v and w are related by
N(w) = (N(u) ∪ N(v)) \ {u, v}.
We will write G/e to indicate the graph formed by contracting the edge e.
Definition 15.21 (Contractible Graphs). A graph G is contractible to a graph H
if G can be converted into a graph isomorphic to H by a sequence of edge contractions.
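A minimal sketch of a single edge contraction (again my own, using the same dictionary-of-neighbour-sets representation as above):

```python
def contract(adj, u, v, w):
    """Return the adjacency dict of G/e, where e = (u, v) is contracted to w."""
    new_adj = {x: set(ns) for x, ns in adj.items() if x not in (u, v)}
    new_adj[w] = (adj[u] | adj[v]) - {u, v}   # w inherits all other neighbours
    for x in new_adj[w]:
        new_adj[x] -= {u, v}
        new_adj[x].add(w)
    return new_adj

# Contracting any edge of the triangle K3 leaves a single edge, K2.
k3 = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
print(contract(k3, 0, 1, "w"))  # {2: {'w'}, 'w': {2}}
```

Note that parallel edges created by a contraction are merged automatically by the set representation, which matches the simple-graph convention used throughout these notes.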
Theorem 15.22 (Wagner's Theorem). A graph G is planar if and only if it does
not contain a subgraph that is contractible to K5 or K3,3.
Figure 15.15 illustrates the use of Wagner’s theorem to establish non-planarity.
Figure 15.15: One can use Wagner’s Theorem to prove that the Petersen graph,
illustrated above, is non-planar by contracting all the blue edges.
Figure 15.16: The two-torus cut twice and flattened into a square.
15.7 Afterword
The fact that there are, in a natural sense, only two non-planar graphs is one of the
main reasons we study the topic (Video 11.5). But this turns out to be the easiest case of an even
more amazing family of results that I'll discuss briefly. These other theorems have
to do with drawing graphs on arbitrary surfaces (spheres, tori . . . )—it’s common
to refer to this as embedding the graph in the surface—and the process uses curves
similar to those discussed in Section 15.1.1, except that now we want, for example,
curves γ : [0, 1] → S2 , where S2 is the two-sphere, the surface of a three-dimensional
unit ball.
Embedding a graph in the sphere turns out to be the same as embedding it in
the plane: you can imagine drawing the planar diagram on a large, thin, stretchy
sheet and then smoothing it onto a big ball in such a way that the diagram lies in
the northern hemisphere while the edges of the sheet are all drawn together in a
bunch at the south pole. Similarly, if we had a graph embedded in the sphere we
could get a planar diagram for it by punching a hole in the sphere. Thus a graph
can be embedded in the sphere unless it contains—in the senses of Kuratowski’s and
Wagner’s Theorems—a copy of K5 or K3,3 . For this reason, these two graphs are
called topological obstructions to embedding a graph in the plane or sphere. They
are also sometimes referred to as forbidden subgraphs.
But if we now consider the torus, the situation for K5 and K3,3 is different. To
make drawings, I’ll use a standard representation of the torus as a square: you should
imagine this square to have been peeled off a more familiar torus-as-a-doughnut, as
illustrated in Figure 15.16. Figure 15.17 then shows embeddings of K5 and K3,3 in
the torus—these are analogous to planar diagrams in that the curves representing
the edges don’t intersect except at their endpoints.
There are, however, graphs that one cannot embed in the torus and there is
even an analogue of Kuratowski’s Theorem that says that there are finitely many
forbidden subgraphs and that all non-toroidal5 graphs include at least one of them.
In fact, something even more spectacular is true: early in an epic series6 of papers,
5. By analogy with the term non-planar, a graph is said to be non-toroidal if it cannot be
embedded in the torus.
6. The titles all begin with the words "Graph Minors". The series began in 1983 with
"Graph Minors. I. Excluding a forest" (DOI: 10.1016/0095-8956(83)90079-5) and seems
Figure 15.17: Embeddings of K5 (left) and K3,3 (right) in the torus. Edges that
run off the top edge of the square return on the bottom, while those that run off the
right edge come back on the left.
Neil Robertson and Paul D. Seymour proved that every surface (the sphere, the
torus, the torus with two holes. . . ) has a Kuratowski-like theorem with a finite list
of forbidden subgraphs: two of those for the torus are shown in Figure 15.18. One
shouldn’t, however, draw too much comfort from the word “finite”. In her recent MSc
thesis7 Ms. Jennifer Woodcock developed a new algorithm for embedding graphs in
the torus and tested it against a database that, although known to be incomplete,
includes 239,451 forbidden subgraphs.