Agarwal Dissertation 2019
by
Udit Agarwal
2019
The Dissertation Committee for Udit Agarwal
certifies that this is the approved version of the following dissertation:
Committee:
Valerie King
Greg Plaxton
David Zuckerman
Algorithms, Parallelism and Fine-Grained Complexity
for Shortest Path Problems in Sparse Graphs
by
Udit Agarwal
Dissertation
in Partial Fulfillment
of the Requirements
Doctor of Philosophy
December 2019
Acknowledgments
I would like to thank my advisor, Prof. Vijaya Ramachandran, for her guidance
during my PhD. I would also like to thank the rest of my thesis committee: Prof.
Valerie King, Prof. Greg Plaxton, and Prof. David Zuckerman.
I thank the Department of Computer Science for providing me with TA
support for most of my PhD studies, as well as a supplemental Graduate
Calhoun Fellowship for the first three years of my PhD. This research was also
supported in part by the National Science Foundation under grant NSF-CCF-1320675
for the first three years.
Udit Agarwal
Algorithms, Parallelism and Fine-Grained Complexity
for Shortest Path Problems in Sparse Graphs
Publication No.
graph problems whose time complexities have stayed at Õ(mn) over the past several
decades, where m is the number of edges and n is the number of vertices in the input
graph. All of these problems are known to be subcubic equivalent and this shows that
achieving sub-mn running time is hard, but only for dense graphs where m = Θ(n^2).
We introduce the notion of a sparse reduction which preserves the sparsity of graphs,
and we present near linear-time sparse reductions between various pairs of graph
problems in the Õ(mn) class. We also introduce the MWC-hardness conjecture,
which states that the Minimum Weight Cycle problem cannot be solved in sub-mn time.
We establish that several important graph problems in the Õ(mn) class such as
APSP, second simple shortest path (2-SiSP), Radius, and Betweenness Centrality
are MWC-Hard, establishing sub-mn fine-grained hardness for these problems.
A well-known generalization of the shortest path problem is the k-simple
shortest paths (k-SiSP) problem, where we want to find k simple paths from s to t
in non-decreasing order of weight. In this thesis we present a new approach for
computing all pairs k simple shortest paths (k-APSiSP), which is based on forming
suitable path extensions to find simple shortest paths; this method is different from
the ‘detour finding’ technique used in all prior work on computing multiple simple
shortest paths, replacement paths, and distance sensitivity oracles. The Õ(mn) time
bound of our 2-APSiSP algorithm matches the fine-grained time complexity for the
simpler 2-SiSP problem, which is the single source-sink version of this problem.
Computing APSP is one of the most fundamental problems in distributed
computing. We present a simple Õ(n^{3/2}) rounds deterministic algorithm for computing
APSP in the well-known Congest model which is the first o(n^2) round
deterministic algorithm for this problem. We then improve this further by reducing
the round complexity to Õ(n^{4/3}). We also present a faster algorithm for graphs with
moderate integer edge weights. We develop several derandomization techniques for
our deterministic APSP algorithms. These include efficient deterministic distributed
algorithms for computing a small blocker set, which is a set that intersects a desired
collection of shortest paths, and several deterministic pipelined approaches for computing
the shortest path distance values as well as for propagating the messages in
the network. Aside from our deterministic results, all non-trivial distributed algorithms
currently known for computing APSP are randomized.
Contents
Acknowledgments
Abstract
List of Figures
Chapter 1 Introduction
1.1 Shortest Paths
1.2 Overview of Topics
1.2.1 Sequential Results
1.2.1.1 Fine-Grained Complexity for Sparse Graphs
1.2.1.2 k-Simple Shortest Paths and Cycles
1.2.2 Distributed Results
1.2.2.1 Deterministic Distributed All Pairs Shortest Paths in Õ(n^{3/2}) Rounds
1.2.2.2 Deterministic Distributed All Pairs Shortest Paths Through Pipelining
1.2.2.3 Faster Deterministic Distributed All Pairs Shortest Paths
1.3 Organization
I Sequential Results
3.2 Our Results
3.3 The k-APSiSP Algorithm
3.3.1 The Compute-APSiSP Procedure
3.3.2 Computing the Q_k Sets
3.3.2.1 Computing Q_k for k = 2
3.3.2.2 Computing Q_k for k ≥ 3
3.3.3 Generating k Simple Shortest Cycles
3.4 Enumerating Simple Shortest Cycles (k-All-SiSC)
3.5 Generating Simple Shortest Paths (k-All-SiSP)
3.6 Conclusion and Open Problems
II Distributed Results
6.1 Introduction
6.1.1 Other Results
6.2 The Pipelined APSP Algorithm
6.2.1 Our (h, k)-SSP algorithm
6.2.2 Correctness of Algorithm 1
6.2.3 Establishing an Upper Bound on Z.ν
6.2.4 Establishing an Upper Bound on the round r by which an entry Z is sent
6.2.5 Establishing an Upper Bound on the round r by which Algorithm 1 terminates
6.2.6 Simplified Versions of Short-Range Algorithms in [50]
6.3 Faster k-SSP Algorithm using blocker set
6.4 Computing Consistent h-hop trees (CSSSP)
6.4.1 Computing a Blocker Set
6.5 Additional Results
6.5.1 An Õ(n)-Rounds (1+ε) Approximation Algorithm for Weighted APSP with Non-negative Integer Edge-Weights
6.5.2 A Simple Õ(n^{4/3}) Rounds Randomized Algorithm for Weighted APSP with Arbitrary Edge-Weights
6.6 Conclusion
7.3.3 Distributed Computation of Terms ν_{P_i} and ν_{P_{ij}}
7.3.3.1 Computing ν_{P_i}
7.3.3.2 Computing ν_{P_{ij}}
7.4 A Õ(n^{4/3}) Rounds Algorithm for Step 6 of Algorithm 5
7.4.1 Correctness of Step 9 of Algorithm 11
7.5 Helper Algorithms
7.5.1 Helper Algorithms for Randomized Blocker Set Algorithm
7.5.1.1 Algorithm for Computing V_i and P_i
7.5.1.2 Algorithm for Computing P_{ij}
7.5.1.3 Algorithm for Computing |P_{ij}|
7.5.1.4 Remove Subtrees rooted at z ∈ Z
7.5.2 h-hop Shortest Path Extension Algorithm [50]
7.5.3 Helper Algorithms for Algorithm 11
7.5.3.1 Computing Bottleneck Nodes
7.5.3.2 Computing count_{v,c} Values
7.6 Conclusion
Bibliography
Vita
List of Tables
2.1 Our sparse reduction results for undirected graphs. The definitions
for these problems are in Section 2.2. Note that Min-Wt-∆ can be
solved in m^{3/2} time.
2.2 Our sparse reduction results for directed graphs. The definitions for
these problems are in Section 2.2. Note that Min-Wt-∆ can be solved
in m^{3/2} time.
3.1 Our results for directed graphs. All algorithms are deterministic.
(DSO stands for Distance Sensitivity Oracles.)
6.1 Table comparing our approximate APSP results for non-negative edge-weighted
graphs (including zero edge weights) with previously known
results.
6.2 Notations
List of Figures
2.1 Our sparse reductions for weighted directed graphs. The regular edges
represent sparse O(m + n) reductions, the squiggly edges represent
tilde-sparse O(m + n) reductions, and the dashed edges represent
reductions that are trivial. The n^2 label on the dashed edge to APSP
denotes an O(n^2) time reduction.
2.2 Construction of G_{i,j,k}.
2.3 Here Figure (a) represents the MWC C in G. The path π_{y,z} (in bold)
is the shortest path from y to z in G. The path π_{y^1,z^2} (in bold) in
Figure (b) is the shortest path from y^1 to z^2 in G_{i,j,k}, where the edge
(v^1_{p−1}, v^2_p) is absent due to i, j bits. The paths π_{y,z} in G and π_{y^1,z^2} in
2.4 G'' for l = 3. The gray and the bold edges have weight (11/9)M' and
(1/3)M' respectively. All the outgoing (incoming) edges from (to) A have
weight 0 and the outgoing edges from B have weight M'.
2.5 G'' for n = 3 in the reduction: MWC ≤^{sprs}_{m+n} 2-SiSP.
2.6 G' for l = 3 in the reduction: directed s-t Replacement Paths ≤^{sprs}_{m+n} ANSC.
2.7 Known sparse reductions for centrality problems, all from [2]. The regular
edges represent sparse O(m + n) reductions, the squiggly edges
represent tilde-sparse O(m + n) reductions, and the dashed edges represent
reductions that are trivial. BC and Min-Wt-∆ (shaded with
gray) are known to be sub-cubic equivalent to APSP [2, 91].
2.8 Sparse reductions for weighted directed graphs. The regular edges
represent sparse O(m + n) reductions, the squiggly edges represent
tilde-sparse O(m + n) reductions, and the dashed edges represent reductions
that are trivial. All problems except APSP are MWCC-hard.
Eccentricities, BC, ANBC and Pos ANBC are also SETH/k-DSH
hard. ANBC and Pos ANBC (problems inside the dashed circles) are
both not known to be subcubic equivalent to APSP. BC and Eccentricities
(in bold) are MWC-hard, sub-cubic equivalent to APSP and
SETH/k-DSH hard.
2.9 G'' for l = 3 in the reduction: directed 2-SiSP ≲^{sprs}_{m+n} Betweenness
Centrality. The gray and the bold edges have weight M' and (1/3)M'
6.1 This figure gives an example graph G where the union of the edges
on the 2-hop shortest paths from source node b differs from the 2-hop
SSSP constructed by Bellman-Ford (or using our (h, k)-SSP pipelined
algorithm in Chapter 5), and both are different from the 2-hop CSSSP
generated for source node b.
Chapter 1
Introduction
works even if the graph has negative edge weights (but no negative weight cycle). For
sparse graphs with negative edge weights, one can use Johnson’s transformation [54]
to transform the graph into a non-negative edge weighted graph in O(mn + n^2 log n)
time and then compute APSP on it.
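The reweighting step mentioned above can be illustrated with a minimal Python sketch. It assumes a directed graph given as a list of `(u, v, w)` triples on vertices `0..n-1` (a representation chosen here only for illustration), computes potentials with Bellman-Ford, reweights every edge to be non-negative, and then runs Dijkstra from each source. The function names are hypothetical, and the binary heap used here gives O(mn log n) time rather than the O(mn + n^2 log n) bound quoted in the text, which needs a Fibonacci-heap style priority queue.

```python
import heapq

INF = float("inf")

def bellman_ford_potentials(n, edges):
    """Potentials h with w + h[u] - h[v] >= 0 for every edge, or None on a negative cycle."""
    h = [0] * n  # same effect as a virtual source with 0-weight edges to all vertices
    for _ in range(n):
        changed = False
        for u, v, w in edges:
            if h[u] + w < h[v]:
                h[v] = h[u] + w
                changed = True
        if not changed:
            return h
    return None  # still relaxing after n passes: a negative-weight cycle exists

def dijkstra(n, adj, s):
    """Single-source distances for non-negative edge weights."""
    dist = [INF] * n
    dist[s] = 0
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

def johnson_apsp(n, edges):
    """All-pairs distances for a graph with negative edges but no negative cycle."""
    h = bellman_ford_potentials(n, edges)
    if h is None:
        raise ValueError("graph contains a negative-weight cycle")
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w + h[u] - h[v]))  # reweighted edge, now non-negative
    result = []
    for s in range(n):
        d = dijkstra(n, adj, s)
        # undo the reweighting to recover distances in the original graph
        result.append([d[t] - h[s] + h[t] if d[t] < INF else INF for t in range(n)])
    return result
```

For example, `johnson_apsp(4, [(0, 1, 4), (0, 2, 1), (2, 1, -2), (1, 3, 3)])` handles the negative edge `(2, 1)` without any special casing in the Dijkstra phase.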
Since most graphs that occur in practice are sparse, in this work our emphasis
is on algorithms where the input size is parameterized in terms of m and n. In
this thesis we explore the shortest path and cycle problems in both sequential and
distributed/parallel settings. We also look at the fine-grained hardness for some of
these problems. In the next section we give an overview of the topics covered in this
thesis.
ity are MWC-Hard, establishing sub-mn fine-grained hardness for these problems.
Currently the Minimum Weight Cycle problem can be solved in O(mn) time [71],
and obtaining an o(mn) time algorithm for MWC has been an open problem for many
decades.
We also identify Eccentricities and BC as key problems in the Õ(mn) class
which are simultaneously MWC-hard, SETH-hard and k-DSH-hard, where SETH
is the Strong Exponential Time Hypothesis, and k-DSH is the hypothesis that a
dominating set of size k cannot be computed in time polynomially smaller than n^k.
In Chapter 2 we show that k-SiSP does not have a sub-mn time algorithm unless
Minimum Weight Cycle can be solved in sub-mn time. In Chapter 3 we present
several results for the problem of computing k simple shortest paths (k-SiSP), where
we want to find k simple paths from s to t in non-decreasing order of weight.
We present a new approach for computing all pairs k simple shortest paths (k-
APSiSP), which is based on forming suitable path extensions to find simple shortest
paths; this method is different from the ‘detour finding’ technique used in all prior
work on computing multiple simple shortest paths, replacement paths, and distance
sensitivity oracles. The Õ(mn) time bound of our 2-APSiSP algorithm matches
the fine-grained time complexity for the simpler 2-SiSP problem, which is the single
source-sink version of this problem. For k = 3 our algorithm runs in O(mn^2 + n^3 log n)
time, which is almost a factor of n faster than the best previous algorithm.
We also present new results for related paths and cycle problems, such as
enumerating k simple cycles in non-decreasing order of weight, for which we give an
Õ(mn) time algorithm and we also show that it is MWC-hard for any constant k.
1.2.2 Distributed Results
In Chapters 4-7 we consider the shortest path problems in the distributed setting,
specifically the all pairs shortest path (APSP) problem. After giving an initial introduction
to the distributed APSP problem in Chapter 4, in Chapter 5 we present
an Õ(n^{3/2}) rounds algorithm for computing APSP in directed graphs with arbitrary
edge weights in the well-known Congest model (Section 4.2). This was the first
o(n^2) round non-trivial deterministic algorithm for this problem. The most critical
component of this algorithm is a new distributed algorithm for computing a small
blocker set deterministically, which is a set that intersects a desired collection of
shortest paths.
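To make the blocker set notion concrete, here is a minimal sequential sketch in Python. It is not the deterministic distributed algorithm of Chapter 5: it is the classical greedy hitting-set heuristic, shown under the assumption that the collection of paths is given explicitly as lists of vertex ids. The function name and the input format are hypothetical.

```python
def greedy_blocker_set(paths):
    """Greedily pick vertices until every given path contains a picked vertex."""
    remaining = [set(p) for p in paths if p]
    blockers = []
    while remaining:
        # count, for each vertex, how many still-unhit paths it lies on
        counts = {}
        for p in remaining:
            for v in p:
                counts[v] = counts.get(v, 0) + 1
        # take the vertex on the most unhit paths; break ties by smallest id
        best = min(counts, key=lambda v: (-counts[v], v))
        blockers.append(best)
        remaining = [p for p in remaining if best not in p]
    return blockers
```

By construction the returned set intersects every input path. The distributed setting adds the difficulty that no node sees the whole collection of paths, which this sketch does not address.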
sented in Chapter 5. The main component of this algorithm is a new faster technique
for computing a small blocker set deterministically and a new pipelined method for
deterministically propagating distance values from source nodes to the blocker nodes
in the network.
1.3 Organization
Part I
Sequential Results
Chapter 2
2.1 Introduction
In recent years there has been considerable interest in determining the fine-grained
complexity of problems in P, see e.g. [88]. For instance, the 3SUM [35] and OV (Or-
thogonal Vectors) [90, 14] problems have been central to the fine-grained complexity
of several problems with quadratic time algorithms, in computational geometry and
related areas for 3SUM and in edit distance and related areas for OV. APSP (all
pairs shortest paths) has been central to the fine-grained complexity of several path
problems with cubic time algorithms on dense graphs [91].
For several graph problems related to shortest paths that currently have
Õ(n^3)¹ time algorithms, equivalence under sub-cubic reductions has been shown
in work starting with [91] between APSP, finding a minimum weight cycle (MWC),
finding a second simple shortest path from a given vertex u to a given vertex v (2-SiSP)
in a weighted directed graph, finding a minimum weight triangle (Min-Wt-∆),
etc. This gives compelling evidence that a large class of problems on dense graphs
is unlikely to have sub-cubic algorithms as a function of n, the number of vertices,
unless fundamentally new algorithmic techniques are developed.
¹ Õ and Θ̃ can hide sub-polynomial factors; in our new results they only hide polylog factors.
We consider a central collection of graph problems related to APSP, which
refines the subcubic equivalence class. We let n be the number of vertices and m
the number of edges. All of the sub-cubic equivalent graph problems mentioned
above (and several others) have Õ(mn) time algorithms; additionally, many sub-cubic
equivalent problems related to minimum triangle detection and triangle listing
have O(m^{3/2}) time complexities for sparse graphs [52]. When a graph is truly sparse
with m = O(n) the Õ(mn) APSP bound is essentially optimal or very close to
optimal, since the size of the output for APSP is n^2. Thus, a cubic in n bound
for APSP does not fully capture what is currently achievable by known algorithms,
especially since graphs that arise in practice tend to have m close to linear in n or
at least are sparse, i.e., have m = O(n^{1+δ}) for δ < 1. This motivates our study of
the fine-grained complexity of graph path problems in the Õ(mn) class.
Another fundamental problem in the Õ(mn) class is MWC (Minimum Weight
Cycle). In both directed and undirected graphs, MWC can be computed in Õ(mn)
time using an algorithm for APSP. Recently Orlin and Sedeno-Noda [71] gave an
improved O(mn) time algorithm for directed MWC. This is an important result but
the bound still remains Ω(mn). Finding an MWC algorithm that runs polynomially
faster than mn time is a long-standing open problem in graph algorithms.
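The observation that directed MWC follows from all-pairs distances can be sketched as follows: a minimum weight cycle is an edge (u, v) closed by a shortest path from v back to u. The Python sketch below assumes non-negative edge weights and an edge list of `(u, v, w)` triples, both of which are illustrative assumptions rather than conventions from the text, and uses one Dijkstra run per source.

```python
import heapq

INF = float("inf")

def dijkstra(n, adj, s):
    """Single-source distances for non-negative edge weights."""
    dist = [INF] * n
    dist[s] = 0
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

def directed_mwc(n, edges):
    """Weight of a minimum weight cycle in a directed graph, or INF if acyclic."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
    dist = [dijkstra(n, adj, s) for s in range(n)]  # all-pairs distances
    # close each edge (u, v) with a shortest path from v back to u
    return min((w + dist[v][u] for u, v, w in edges), default=INF)
```

The same closing-edge argument does not carry over directly to undirected graphs, because the shortest path from v back to u may simply reuse the edge (u, v), which gives a non-simple walk rather than a cycle.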
We present both fine-grained reductions and hardness results for graph prob-
lems in the Õ(mn) class, most of which are equivalent under sub-cubic reductions
on dense graphs, but now taking sparseness of edges into consideration. We use the
current long-standing upper bound of Õ(mn) for these problems as our reference,
both for our fine-grained reductions and for our hardness results. Our results give a
partial order on hardness of several problems in the Õ(mn) class, with equivalence
within some subsets of problems, and give rise to a hardness conjecture (MWC
hardness) for this class. Most of the sub-cubic reductions in previous work (including
all in [91]) created dense intermediate problems and hence are not fine-grained
reductions for the Õ(mn) class. Fine-grained reductions and hardness results with
respect to time bounds that consider only n or only m, such as bounds of the form
m^2, n^2, n^c, m^{1+δ}, are given in [78, 5, 3, 4, 45]. One exception is in [2] where some
reductions that preserve graph sparsity are given for problems with Õ(mn) time
algorithms, such as diameter and some betweenness centrality problems. However,
these are either reductions that start from a triangle finding problem that can be
solved in Õ(m^{3/2}), hence not fine-grained reductions for the Õ(mn) class, or start
from a problem, such as Diameter, that is not known to be sub-cubic equivalent to
APSP. In contrast, the time bounds in our fine-grained results for the Õ(mn) class
consider both m and n.
In [66], Lincoln et al. formulated the Min-Weight-k-Clique hypothesis, which
states that a minimum weight k-clique cannot be found in time smaller than n^{k−o(1)}
for sufficiently large edge weights. They show that for all sparsities of the form m =
Θ(n^{1+1/l}) where l is a constant, MWC cannot be computed in O(mn^{1−ε} + n^2) time
for any constant ε > 0 under the Min-Weight-k-Clique hypothesis. Our sparse reduction
results for directed graphs in conjunction with their result show that a large class of
problems in the mn class including Second Shortest Path, Replacement Paths, Radius,
and Betweenness Centrality also do not have O(mn^{1−ε} + n^2) time algorithms under the
Min-Weight-k-Clique hypothesis. Similarly, our sparse reduction result from MWC to APSP in
undirected graphs in conjunction with this result also establishes that there is no
O(mn^{1−ε} + n^2) time algorithm for undirected APSP under the Min-Weight-k-Clique
hypothesis.
2.2 Definitions of Graph Problems
All Pairs Shortest Paths (APSP). This is the problem of computing the shortest
path distances for every pair of vertices in G together with a concise representation
of the shortest paths, which in our case is an n × n matrix, Last_G, that contains, in
position (x, y), the predecessor vertex of y on a shortest path from x to y.
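A short Python sketch shows why a predecessor matrix is a concise representation: any shortest path can be read off backwards in time proportional to its length. The encoding below is an assumption made for illustration: `last[x][y]` is the predecessor of `y` on a shortest path from `x`, and `None` marks unreachable pairs and the diagonal.

```python
def reconstruct_path(last, x, y):
    """Recover a shortest x-to-y path from a predecessor matrix, or None if unreachable."""
    if x == y:
        return [x]
    if last[x][y] is None:
        return None
    path = [y]
    while path[-1] != x:
        # step from the current vertex to its predecessor on the path from x
        path.append(last[x][path[-1]])
    path.reverse()
    return path
```

For the directed path 0 → 1 → 2, the row for source 0 is `[None, 0, 1]`, and walking it back from vertex 2 yields the path 0, 1, 2.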
All Pairs Shortest Distances (APSD). Given a graph G = (V, E), the APSD
problem only involves computing the shortest path distances for every pair of vertices
in G. Most of the currently known APSD algorithms, including matrix multiplication
based methods for small integer weights [82, 83, 94], can compute APSP in the same
bound as APSD.
Minimum Weight Cycle (MWC). Given a graph G = (V, E), the minimum
weight cycle problem is to find the weight of a minimum weight cycle in G.
All Nodes Shortest Cycles (ANSC). Given a graph G = (V, E), the ANSC
problem is to find the weight of a shortest cycle through each vertex in G.
k-SiSP. Given a graph G = (V, E) and s, t ∈ V, the k-SiSP problem is to find the k
shortest simple paths from s to t: the i-th path must be different from the first (i − 1)
paths and must have weight greater than or equal to the weight of any of these (i − 1)
paths.
k-SiSC. The corresponding cycle version of k-SiSP is known as k-SiSC, where the
goal is to compute the k shortest simple cycles through a given vertex x, such that
the i-th cycle generated is different from all previously generated (i − 1) cycles and
has weight greater than or equal to the weight of any of these (i − 1) cycles.
Radius. For a given graph G = (V, E), the Radius problem is to compute the value
min_{x∈V} max_{y∈V} d_G(x, y). The center of a graph is the vertex x which minimizes
this value.
Diameter. For a given graph G = (V, E), the Diameter problem is to compute the
value max_{x,y∈V} d_G(x, y).
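Given an all-pairs distance matrix, Radius and Diameter are simple aggregates of the per-vertex values max_y d_G(x, y), which is the standard notion of the eccentricity of x. The Python sketch below assumes `dist[x][y]` holds d_G(x, y) as a list of lists; the function names are illustrative.

```python
def eccentricities(dist):
    """Eccentricity of each vertex x: the largest distance from x to any vertex."""
    return [max(row) for row in dist]

def radius(dist):
    """min over x of max over y of dist[x][y]."""
    return min(eccentricities(dist))

def diameter(dist):
    """max over all pairs (x, y) of dist[x][y]."""
    return max(eccentricities(dist))

def center(dist):
    """A vertex whose eccentricity equals the radius (smallest index on ties)."""
    ecc = eccentricities(dist)
    return ecc.index(min(ecc))
```

This post-processing costs O(n^2) time once the distances are known, so it gives no shortcut around computing the distances themselves.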
the number of shortest paths from s to t and σ_{s,t}(v) is the number of shortest paths
from s to t passing through v.
As in [2] we assume that the graph has unique shortest paths, hence BC(v)
is simply the number of s, t pairs such that the shortest path from s to t passes
through v.
All Nodes Positive Betweenness Centrality (Pos ANBC). The all-nodes version
of Positive Betweenness Centrality.
Reach Centrality (RC). For a given graph G = (V, E) and a node v ∈ V, the
Reach Centrality of v, RC(v), is the value
max_{s,t∈V : d_G(s,v)+d_G(v,t)=d_G(s,t)} min(d_G(s, v), d_G(v, t))
need it for our reductions.
with n vertices and m edges, we can solve P in O(T_Q(m, n) + f(m, n)) time on graphs
with n vertices and m edges, by making a constant number of oracle calls to Q.
P in Õ(T_Q(m, n) + f(m, n)) time (by making polylog oracle calls to Q on graphs
with Õ(n) vertices and Õ(m) edges). We will also use ≡^{sprs}_{f(m,n)} and ≅^{sprs}_{f(m,n)} in place
of ≤^{sprs}_{f(m,n)} and ≲^{sprs}_{f(m,n)} when there are reductions in both directions. In a weighted
graph we allow the Õ term to have a log ρ factor. (Recall that ρ = W_max/W_min.)
We present several sparse reductions for problems that currently have Õ(mn)
time algorithms. This gives rise to a partial order on problems that are known to
be sub-cubic equivalent, and currently have Õ(mn) time algorithms. For the most
part, our reductions take Õ(m + n) time (many are in fact O(m + n) time), except
reductions to APSP take Õ(n^2) time. This ensures that any improvement in the
time bound for the target problem will give rise to the same improvement to the
source problem, to within a polylog factor. Surprisingly, very few of the known
sub-cubic reductions for the problems we consider carry over to the sparse case (and
none from [91]). This is due to one or both of the following features.
1. A central technique used in many of the reductions that show sub-cubic equivalence
to APSP is to reduce from triangle finding problems such as Min-Wt-∆
(e.g., [2, 91]). These reductions start with a triangle finding problem and proceed
by constructing suitable tripartite graphs where the property of the target
problem can be used to detect a desired triangle. However, a cycle can have
Θ(n) vertices, and using such an approach starting with MWC would create
n-partite graphs with Θ(n^2) vertices. Hence this approach does not work in
our sparse setting and this highlights the need to develop new techniques to
reduce from MWC. Further, as noted above, in the sparse setting, all triangle
finding and enumeration problems are in Õ(m^{3/2}) time, which is an asymptotically
smaller bound than mn for graphs with m = o(n^2). Hence reductions to
triangle finding problems (e.g., [79] in the dense case) are not relevant in the
sparse setting (unless the mn time bound can be improved).
2. Many of the known sub-cubic reductions convert a sparse graph into a dense
one (e.g., [79, 91]), and again these are not relevant in the sparse setting.
Tables 2.1 and 2.2 summarize the improvements our reductions achieve over
prior results.
Table 2.1: Our sparse reduction results for undirected graphs. The definitions for
these problems are in Section 2.2. Note that Min-Wt-∆ can be solved in m^{3/2} time.
Table 2.2: Our sparse reduction results for directed graphs. The definitions for these
problems are in Section 2.2. Note that Min-Wt-∆ can be solved in m^{3/2} time.
(a) Undirected Graphs: Finding the weight of a minimum weight cycle (MWC) is
a fundamental problem. A simple sparse O(m + n) reduction from MWC to APSD
is known for directed graphs, but it does not work in the undirected case, mainly
because an edge can be traversed in either direction in an undirected graph, and
known algorithms for the directed case would create non-simple paths when applied
to an undirected graph. Roditty and Williams [79], in a follow-up to [91], pointed
out the challenges of reducing from undirected MWC to APSD in sub-n^ω time,
where ω is the matrix multiplication exponent, and then gave a Õ(n^2) reduction
from undirected MWC to undirected Min-Wt-∆ in a dense bipartite graph. But a
reduction that increases the density of the graph is not helpful in our sparse setting.
Instead, in this chapter we give a sparse Õ(n^2) time reduction from undirected MWC
to APSD. Similar techniques allow us to obtain a sparse Õ(n^2) time reduction from
undirected ANSC (All Nodes Shortest Cycles [93, 81], which asks for a shortest
cycle through every vertex) to APSP. This reduction improves the running time for
unweighted ANSC in dense graphs [93], since we can now solve it in Õ(n^ω) time using
the unweighted APSP algorithm in [82, 12]. Our ANSC reduction and the resulting
improved algorithm are only for unweighted graphs, and extending them to weighted
graphs appears to be challenging.
We introduce a new bit-sampling technique in these reductions. This tech-
nique contains a simple construction with exactly log n hash functions for Color
Coding [13] with 2 colors. Our bit-sampling method also gives the first near-linear
time algorithm for k-SiSC in weighted undirected graphs.
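The 2-coloring property behind bit-sampling can be illustrated in a few lines of Python. The sketch assumes vertices are labeled `0..n-1` and takes the i-th coloring to be the i-th bit of the label; with ⌈log2 n⌉ such colorings, any two distinct vertices receive different colors in at least one of them. How the reductions use these colorings is not reproduced here, and the function names are hypothetical.

```python
def bit_colorings(n):
    """One 2-coloring per bit position: colorings[i][v] is bit i of vertex label v."""
    bits = max(1, (n - 1).bit_length())  # ceil(log2 n) for n >= 2
    return [[(v >> i) & 1 for v in range(n)] for i in range(bits)]

def separating_bit(u, v):
    """Index of the lowest bit position in which the labels u != v differ."""
    diff = u ^ v
    return (diff & -diff).bit_length() - 1
```

Unlike randomized color coding, this family is fixed in advance and has only logarithmically many members, which is what keeps the number of oracle calls in the reductions small.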
The full proofs of Theorems 2.3.2 and 2.3.3 are in Section 2.4.1 and Sec-
tion 2.9.1 respectively.
Theorem 2.3.2. In a weighted undirected n-node m-edge graph with edge weight ratio
ρ = W_max/W_min, where W_max is the largest edge weight and W_min is the smallest
edge weight, MWC can be computed with 2 · log n · log ρ calls to APSD on graphs with
2n nodes, at most 2m edges, and edge weight ratio at most ρ, with O(n + m) cost for
constructing each reduced graph, and with additional O(n^2 · log n · log(nρ)) processing
time. Additionally, every edge in the reduced graph retains its corresponding edge
weight from the original graph. Hence, MWC ≲^{sprs}_{n^2} APSD.
Theorem 2.3.3. In undirected unweighted graphs, ANSC ≲^{sprs}_{n^2} APSP and ANSC
can be computed in Õ(n^ω) time.
Figure 2.1: Our sparse reductions for weighted directed graphs. The regular edges
represent sparse O(m + n) reductions, the squiggly edges represent tilde-sparse
O(m + n) reductions, and the dashed edges represent reductions that are trivial. The n^2
label on the dashed edge to APSP denotes an O(n^2) time reduction.
(b) Directed Graphs. We give several nontrivial sparse reductions starting from
MWC in directed graphs, as noted in the following theorem (also highlighted in
Figure 2.1).
1. MWC ≤^{sprs}_{m+n} 2-SiSP ≤^{sprs}_{m+n} s-t replacement paths ≡^{sprs}_{m+n} ANSC ≲^{sprs}_{m+n} Eccentricities
2. 2-SiSP ≲^{sprs}_{m+n} Radius ≤^{sprs}_{m+n} Eccentricities, and
3. 2-SiSP ≲^{sprs}_{m+n} BC
Definition 2.3.5 (Sub-mn). A function g(m, n) is sub-mn if g(m, n) = O(m^α · n^β),
where α, β are constants such that α + β < 2.
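As a small worked illustration of this definition (the three bounds below are chosen here as examples and are not taken from the text), one can check the exponent sum directly:

```latex
\begin{align*}
g(m,n) = m^{3/2}   &:\quad \alpha = \tfrac{3}{2},\ \beta = 0,
                     \quad \alpha + \beta = \tfrac{3}{2} < 2
                     \quad\Rightarrow\quad \text{sub-}mn, \\
g(m,n) = m\,n^{0.9} &:\quad \alpha = 1,\ \beta = 0.9,
                     \quad \alpha + \beta = 1.9 < 2
                     \quad\Rightarrow\quad \text{sub-}mn, \\
g(m,n) = mn        &:\quad \text{on graphs with } m = \Theta(n),\ mn = \Theta(n^{2})
                     \text{ while } m^{\alpha} n^{\beta} = \Theta(n^{\alpha+\beta}) = o(n^{2})
                     \text{ whenever } \alpha + \beta < 2, \\
                   &\quad\ \text{so no admissible } \alpha, \beta \text{ exist and } mn
                     \text{ is not sub-}mn.
\end{align*}
```

The last line uses the family of graphs with m = Θ(n) only as a witness; any choice of α and β with α + β < 2 already fails on that family.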
Directed MWC is a natural candidate for hardness for the mn class since it is
a fundamental problem for which a simple Õ(mn) time algorithm has been known for
many decades, and very recently, an O(mn) time algorithm [71]. But a sub-mn time
algorithm remains as elusive as ever. Further, through the fine-grained reductions
that we came up with in this work, many other problems in the mn class have MWC
hardness for sub-mn time as noted in the following theorem.
Theorem 2.3.8. The following problems on directed graphs do not have sub-mn
time algorithms: 2-SiSP, 2-SiSC, s-t Replacement Paths, ANSC, Radius, BC, and
Eccentricities assuming that MWC cannot be computed in sub-mn time.
In [66], Lincoln et al. showed that MWC is hard under the Min-Wt-k-Clique
Conjecture. This result in conjunction with Theorem 2.3.8 shows that the prob-
lems in the Õ(mn) class including 2-SiSP, 2-SiSC, Replacement Paths, Radius, BC,
Eccentricities and ANSC are also Min-Wt-k-Clique hard.
We now discuss the SETH and k-DSH hardness of Eccentricities and BC,
which are also MWC and Min-Wt-k-Clique hard.
SETH and k-DSH hardness for Diameter, Eccentricities and BC. Another
fundamental graph problem in the Õ(mn) class is Diameter, which has a trivial sub-mn
reduction to Eccentricities. Even though Diameter is in the Õ(mn) class, we do
not have an MWC-hardness result for it (nor is a sub-cubic reduction from MWC
known). However, computing Diameter in sub-m^2 time in graphs with m = O(n)
edges was shown to be SETH-hard in [78] for both directed and undirected graphs.
The Strong Exponential Time Hypothesis (SETH) [51] states that for every
δ < 1 there exists a k such that there is no 2^{δ·n} time algorithm for k-SAT. On the
other hand, a SETH-based conditional hardness is not known for either MWC or
APSP. The construction in [78] that gives sub-m^2 hardness for Diameter on very
sparse graphs also gives a sub-mn hardness under SETH listed below.
The k Dominating Set Hypothesis (k-DSH) [72] states that there exists k_0
such that for all k ≥ k_0, a dominating set of size k in an undirected graph on n
vertices cannot be found in O(n^{k−ε}) time for any constant ε > 0. It is known that k-DSH
hardness implies SETH-hardness, and a sub-m^2 hardness result for Diameter under
k-DSH for even values of k was shown in [78] for graphs with Õ(n) edges. We show
that Diameter is k-DSH hard for sub-mn time for all values of k, both odd and
even, thus strengthening the SETH and k-DSH hardness results for Diameter and
Eccentricities.
APSP. APSP has a special status in the mn class. Since its output size is $n^2$, it has
near-optimal algorithms [85, 42, 76, 77] for graphs with m = O(n). Also, the $n^2$ size
for the APSP output means that any inference made through sparse reductions to
APSP will not be based on a sub-mn time bound but instead on a sub-mn + $n^2$ time
bound. It also turns out that the SETH and k-DSH hardness results for Diameter
depend crucially on staying with a purely sub-mn bound, and hence even though
Diameter has a simple sparse $n^2$ reduction to APSP, we do not have SETH or k-DSH
hardness for computing APSP in sub-mn + $n^2$ time.
sub-cubic equivalent to APSP. Fig. 2.8 in Section 2.6 is an enhancement of Fig. 2.1
that includes our results for BC variants.
Separation of Time Bounds for Sparse Graphs. It is readily seen that the
Õ($m^{3/2}$) bound for triangle finding problems is a better bound than the Õ(mn)
bound for the mn class. But imposing a total ordering on functions of two variables
requires some care. For example, maximal 2-connected subgraphs of a given directed
graph can be computed in $O(m^{3/2})$ time [24] as well as in $O(n^2)$ time [43], with $m^{3/2}$
a better bound for very sparse graphs and $n^2$ for very dense graphs. In Section 2.8 we
motivate a natural definition of what it means for one time bound to be smaller
than another time bound for sparse graphs. By our definitions, $m^{3/2}$ is a smaller
time bound than both mn and $n^2$ for sparse graphs. Our definitions establish that
the problems related to triangle listing must have provably smaller time bounds for
sparse graphs than the mn class under the hardness conjectures and fine-grained
reductions for the MWC class.
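As a rough illustration of why care is needed, one can parametrize sparsity by writing $m = n^{1+\alpha}$ with $0 \le \alpha \le 1$. This parametrization is an assumption made here for illustration only and is not the formal definition motivated in Section 2.8. Under it, the three bounds compare as follows:

```latex
\begin{align*}
m^{3/2} &= n^{\frac{3}{2}(1+\alpha)}, \qquad mn = n^{2+\alpha}, \\
m^{3/2} \le mn \;&\iff\; \tfrac{3}{2}(1+\alpha) \le 2+\alpha \;\iff\; \alpha \le 1, \\
m^{3/2} \le n^{2} \;&\iff\; \tfrac{3}{2}(1+\alpha) \le 2 \;\iff\; \alpha \le \tfrac{1}{3}.
\end{align*}
```

So under this parametrization $m^{3/2}$ never exceeds mn, and it is below $n^2$ whenever $m \le n^{4/3}$, which includes the case m = O(n).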
Overview of the Chapter. Sections 2.4 and 2.5 present our sparse reductions
for undirected and directed graphs. Section 2.6 presents our sparse reductions for
centrality problems in directed graphs. In Sections 2.7 and 2.8 we present SETH and
k-DSH hardness results, and the resulting provable split of the sub-cubic equivalence
class under these hardness results for sparse time bounds.
In undirected graphs, the only known sub-cubic reduction from MWC to APSD [79]
uses a dense reduction to Min-Wt-∆. Described in [79] for integer edge weights
of value at most $W_{max}$, it first uses an algorithm in [67] to compute, in $O(n^2 \cdot
\log n \log nW_{max})$ time, a 2-approximation W to the weight of a minimum weight
cycle as well as shortest paths between all pairs of vertices with pathlength at most
W/2. The reduced graph for Min-Wt-∆ is constructed as a (dense) bipartite graph
with edges to represent all of these shortest paths, together with the edges of the
original graph in one side of the bipartition. This results in each triangle in the
reduced graph corresponding to a cycle in the original graph, and with a minimum
weight cycle guaranteed to be present as a triangle. An MWC is then constructed
using a version of Color Coding [13] with 2 colors.
The approach in [79] does not work in our case, as we are dealing with sparse
reductions. Instead, we give a sparse reduction directly from MWC to APSD. In
contrast to [79], where finding a minimum weight 3-edge triangle gives the MWC
in the original graph, in our reduction the MWC is constructed as a path P in a
reduced graph followed by a shortest path in the original graph.
One may ask if we can sparsify the dense reduction from MWC to Min-Wt-∆
in [79] but such a reduction, though very desirable, would immediately refute MWCC
and would achieve a major breakthrough by giving an Õ(m3/2 ) time algorithm for
undirected MWC.
We now sketch our sparse reduction from undirected MWC to APSP. We
start by recalling from [79] the notion of a ‘critical edge’ and then present some
additional properties we will use.
$d_C(v_{i+1}, v_1) \le \lfloor w(C)/2 \rfloor$. The edge $(v_i, v_{i+1})$ is called the critical edge of C with respect
y and $\pi^2_{x,y}$ is a second simple shortest path from x to y, i.e. a path from x to y that
is shortest among all paths from x to y that are not identical to $\pi^1_{x,y}$.
Proof. Assume to the contrary that $\pi^3_{x,y}$ is a second simple shortest path from x
to y of weight less than $w(\pi^2_{x,y})$. Let $\pi^3_{x,y}$ deviate from $\pi^1_{x,y}$ at vertex u and then
merge back at vertex v. Then the subpaths from u to v in $\pi^1_{x,y}$ and $\pi^3_{x,y}$ together
Observation 2.4.3 holds since the path P there must be either a shortest path
or a second simple shortest path in G by the above lemma, so in G′ it must be a
shortest path.
In our reduction we construct a collection of graphs Gi,j,k , each with 2n
vertices (containing 2 copies of V ) and O(m) edges, with the guarantee that, for
the minimum weight cycle C, in at least one of the graphs the edge (vp−1 , vp ) (in
Observation 2.4.3) will not connect across the two copies of V and the path P of
Observation 2.4.3 will be present. Then, if a call to APSP computes P as a shortest
path from v1 to vp (across the two copies of V ), we can verify that edge (vp , vp+1 ) is
not the last edge on the computed shortest path from v1 to vp in G, and so we can
form the concatenation of these two paths as a possible candidate for a minimum
weight cycle. The challenge is to construct a small collection of graphs where we
can ensure that the path we identify in one of the derived graphs is in fact the
simple path P in the input graph. We overcome this challenge by using our new
[Figure: construction of the graph $G_{i,j,k}$ on the two copies V1 and V2 of V. All edges from E are present within V1; there are no edges between vertices in V2; an edge $(u^1, v^2)$ is present if (u, v) ∈ E, u's i-th bit is j, and $W_{max}/2^k < w(u, v) \le W_{max}/2^{k-1}$.]
contrast with a similar step in [79], we need to find the path P in the sparse derived
graph, while in [79] it suffices to look for the 2-edge path that represents P in a
triangle in their dense reduced graph.
The second condition, that an edge $(u^1, v^2)$ is present only if
$W_{max}/2^k < w(u, v) \le W_{max}/2^{k-1}$, ensures that there is a graph $G_{i,j,k}$ in which not only is edge
$(v^1_{p+1}, v^2_p)$ present and edge $(v^1_{p-1}, v^2_p)$ absent as noted by the first condition, but also
the shortest path from $v^1_1$ to $v^2_p$ is in fact the path P in Observation 2.4.3, and does
not correspond to a false path where an edge in G is traversed twice. In particular,
we show that this second condition allows us to exclude a shortest path from $v^1_1$
to $v^2_p$ of the following form: take the shortest path from $v_1$ to $v_p$ in G on vertices
in V1, then take an edge $(v^1_p, x^1)$, and then the edge $(x^1, v^2_p)$. Such a path, which
has weight $d_C(v_1, v_p) + 2w(x, v_p)$, could be shorter than the desired path, which has
weight $d_C(v_1, v_{p+1}) + w(v_{p+1}, v_p)$. In our reduction we avoid selecting this ineligible
path by requiring that the weight of the selected path should not exceed $d_G(v_1, v_p)$
by more than $W_{max}/2^{k-1}$. We show that these conditions suffice to ensure that P
is identified in one of the $G_{i,j,k}$, and no spurious path of shorter length is identified.
Notice that, in contrast to [79], we do not estimate the MWC weight by computing
a 2-approximation. Instead, this second condition allows us to identify the critical
edge in the appropriate graph.
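The two edge-sampling conditions can be sketched in Python as follows. This is a minimal illustration rather than the construction as implemented anywhere: the names `weight_class` and `build_g_ijk` are hypothetical, edge weights are assumed to be positive integers, bit positions are 0-indexed (the text indexes them from 1), and the cross edges are assumed to be directed from the first copy of V to the second.

```python
def weight_class(w, w_max):
    """Return the k >= 1 with w_max / 2**k < w <= w_max / 2**(k - 1).

    Assumes 0 < w <= w_max.
    """
    k = 1
    while w * 2 ** k <= w_max:
        k += 1
    return k


def build_g_ijk(edges, i, j, k, w_max):
    """Adjacency lists of a sketch of G_{i,j,k} for an undirected weighted graph.

    Vertices are pairs (v, 1) and (v, 2), one for each copy of V.
    `edges` is a list of (u, v, w) triples.
    """
    g = {}
    for u, v, w in edges:
        for a, b in ((u, v), (v, u)):
            # The first copy of V keeps every edge of E.
            g.setdefault((a, 1), []).append(((b, 1), w))
            # Cross edge (a^1, b^2): bit i of a equals j and w lies in class k.
            if (a >> i) & 1 == j and weight_class(w, w_max) == k:
                g[(a, 1)].append(((b, 2), w))
    return g
```

Since every edge weight falls in exactly one class k, each edge of G contributes cross edges to the graphs of a single value of k.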
In the following two lemmas we identify three key properties of a path π from
$y^1$ to $z^2$ (y ≠ z) in a $G_{i,j,k}$ that (I) will be satisfied by the path P in Observation 2.4.3
for $y^1 = v^1_1$ and $z^2 = v^2_p$ in some $G_{i,j,k}$ (Lemma 2.4.4), and (II) will cause a simple
cycle in G to be contained in the concatenation of π with the shortest path from y
to z computed by APSP (Lemma 2.4.5). Once we have these two Lemmas in hand,
it gives us a method to find a minimum weight cycle in G (described in Algorithm
MWC-to-APSP) by calling APSP on each Gi,j,k and then identifying all pairs y 1 , z 2
in each graph that satisfy these properties. Since the path P is guaranteed to be one
of the pairs, and no spurious path will be identified, the minimum weight cycle can
be identified. We now fill in the details.
(iii) $d_{G_{i,j,k}}(v^1_1, v^2_p) \le d_G(v_1, v_p) + W_{max}/2^{k-1}$
Proof. Let i, j and k be such that: $v_{p-1}$ and $v_{p+1}$ differ on the i-th bit, j is the i-th
bit of $v_{p+1}$, and k is such that $W_{max}/2^k < w(v_p, v_{p+1}) \le W_{max}/2^{k-1}$. Hence, edge $(v^1_{p-1}, v^2_p)$
is not present and the edge $(v^1_{p+1}, v^2_p)$ is present in $G_{i,j,k}$ and so $Last_{G_{i,j,k}}(v^1_1, v^2_p) \ne$
Figure 2.3: Here Figure (a) represents the MWC C in G. The path $\pi_{y,z}$ (in bold) is
the shortest path from y to z in G. The path $\pi_{y^1,z^2}$ (in bold) in Figure (b) is the
shortest path from $y^1$ to $z^2$ in $G_{i,j,k}$, where the edge $(v^1_{p-1}, v^2_p)$ is absent due to the i, j
bits. The paths $\pi_{y,z}$ in G and $\pi_{y^1,z^2}$ in $G_{i,j,k}$ together comprise the MWC C.
Lemma 2.4.5. If there exists i ∈ {1, . . . , ⌈log n⌉}, j ∈ {0, 1}, k ∈ {1, 2, . . . , ⌈log ρ⌉},
and y, z ∈ V such that the following conditions hold:
(iii) $d_{G_{i,j,k}}(y^1, z^2) \le d_G(y, z) + W_{max}/2^{k-1}$
Proof. Let $\pi_{y,z}$ be a shortest path from y to z in G (see Figure 2.3a) and let $\pi_{y^1,z^2}$
be a shortest path from $y^1$ to $z^2$ in $G_{i,j,k}$ (Figure 2.3b). Let $\pi'_{y,z}$ be the path
corresponding to $\pi_{y^1,z^2}$ in G.
Now we need to show that the path $\pi'_{y,z}$ is simple. Assume that $\pi'_{y,z}$ is not
simple. It implies that the path $\pi_{y^1,z^2}$ must contain $x^1$ and $x^2$ for some x ∈ V. Now
if x ≠ z, then we can remove the subpath from $x^1$ to $x^2$ (or from $x^2$ to $x^1$) to obtain
an even shorter path from $y^1$ to $z^2$, a contradiction.
Hence x = z, which implies that the path $\pi_{y^1,z^2}$ contains $z^1$ as an internal vertex. Let $\pi_{z^1,z^2}$
be the subpath of $\pi_{y^1,z^2}$ from vertex $z^1$ to $z^2$. If $\pi_{z^1,z^2}$ contains at least 2 internal
vertices then this would be a simple cycle of weight less than wt, and we are done.
Otherwise, the path $\pi_{z^1,z^2}$ contains exactly one internal vertex (say $x^1$). Hence path
$\pi_{z^1,z^2}$ corresponds to the edge (z, x) traversed twice in graph G. But the weight of the
edge (x, z) must be greater than $W_{max}/2^k$ (as the edge $(x^1, z^2)$ is present in $G_{i,j,k}$). Hence
$w(\pi_{z^1,z^2}) > W_{max}/2^{k-1}$ and hence $d_{G_{i,j,k}}(y^1, z^2) \ge d_G(y, z) + w(\pi_{z^1,z^2}) > d_G(y, z) + W_{max}/2^{k-1}$,
resulting in a contradiction as condition (iii) states otherwise. (It is for this property
that the index k in $G_{i,j,k}$ is used.) Thus path $\pi_{y^1,z^2}$ does not contain $z^1$ as an internal
vertex and hence $\pi'_{y,z}$ is simple.
If the paths $\pi_{y,z}$ and $\pi'_{y,z}$ do not have any internal vertices in common, then
$\pi_{y,z} \circ \pi'_{y,z}$ corresponds to a simple cycle C in G of weight wt that passes through y
and z. Otherwise, we can extract from $\pi_{y,z} \circ \pi'_{y,z}$ a cycle of weight smaller than wt.
This establishes the lemma.
Sparse Reduction to APSD: We now describe how to avoid using the Last matrix
in the reduction. A 2-approximation algorithm for finding a cycle of weight at most
2t, where t is such that the minimum-weight cycle’s weight lies in the range (t, 2t],
as well as distances between pairs of vertices within distance at most t, was given
by Lingas and Lundell [67]. This algorithm can also compute the last edge on each
shortest path it computes, and its running time is Õ($n^2 \log(n\rho)$). For a minimum
weight cycle $C = \langle v_1, v_2, \ldots, v_l \rangle$ where the edge $(v_p, v_{p+1})$ is a critical edge with
respect to the start vertex $v_1$, the shortest path length from $v_1$ to $v_p$ or to $v_{p+1}$ is at
most t. Thus using this algorithm, we can compute the last edge on a shortest path
for each such pair of vertices in Õ($n^2 \log(n\rho)$) time.
In our reduction to APSD, we first run the 2-approximation algorithm on the
input graph G to obtain the Last(y, z) for certain pairs of vertices. Then, in Step
5 we check if $Last_{G_{i,j,k}}(y^1, z^2) \ne Last_G(y, z)$ only if $Last_G(y, z)$ has been computed
(otherwise the current path is not a candidate for computing a minimum weight
cycle). It appears from the algorithm that the Last values are also needed in the
Gi,j,k . However, instead of computing the Last values in each Gi,j,k , we check for the
shortest path from y to z only in those Gi,j,k graphs where the LastG (y, z) has been
computed, and the edge is not present in Gi,j,k . In other words, if Last(y, z) = q,
we will only consider the shortest paths from y 1 to z 2 in those graphs Gi,j,k where
q’s i-th bit is not equal to j. Thus our reduction to APSD goes through without
needing APSP to output the Last matrix. This gives rise to an improved algorithm
for MWC with small integer weights.
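The selection rule just described, which replaces the explicit Last comparison, can be sketched as a small Python helper. The name `eligible_graphs` is hypothetical, vertex labels are assumed to be integers, and bit positions are 0-indexed here.

```python
def eligible_graphs(q, num_bits):
    """Index pairs (i, j) of the graphs consulted when Last(y, z) = q.

    A graph G_{i,j,k} is eligible when bit i of q differs from j, so that the
    cross edge leaving q's copy is absent from it. Every value of k is allowed.
    """
    return [(i, 1 - ((q >> i) & 1)) for i in range(num_bits)]
```

For each bit position exactly one of the two values of j is eligible, so only half of the (i, j) pairs need to be inspected for any given pair y, z.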
For our sparse Õ(n2 ) reduction from ANSC to APSP in unweighted undirected
graphs, we use the graphs from the previous section, but we do not use the index k,
since the graph is unweighted.
Our reduction exploits the fact that in unweighted graphs, every edge in a
cycle is a critical edge with respect to some vertex. Thus we construct 2⌈log n⌉
graphs $G_{i,j}$, and in order to construct a shortest cycle through vertex z in G, we
will set $z = v^2_p$ in the reduction in the previous section. Then, by letting one of the
two edges incident on z in the shortest cycle through z be the critical edge for the
cycle, the construction from the previous section will allow us to find the length of a
minimum length cycle through z, for each z ∈ V , with the post-processing algorithm
ANSC-to-APSP.
ANSC-to-APSP
1: for each vertex z ∈ V do wt[z] ← ∞
2: for 1 ≤ i ≤ ⌈log n⌉, j ∈ {0, 1} do
3:     Compute APSP′ on $G_{i,j}$
4:     for y, z ∈ V do
5:         if $d_{G_{i,j}}(y^1, z^2) \le d_G(y, z) + 1$ then
6:             Check if $Last_{G_{i,j}}(y^1, z^2) \ne Last_G(y, z)$
7:         if both checks in Steps 5-6 hold then
8:             wt[z] ← min(wt[z], $d_{G_{i,j}}(y^1, z^2) + d_G(y, z)$)
9: return wt array
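The post-processing loop above can be exercised on small inputs with the following Python sketch. It is an illustration under stated assumptions, not the reduction itself: breadth-first search from every source stands in for the APSP′ oracle, the names `bfs` and `ansc_unweighted` are hypothetical, vertices are labelled 0 to n − 1, bit positions are 0-indexed, and the cross edges of $G_{i,j}$ are assumed to be directed from the first copy of V to the second.

```python
from collections import deque
from math import inf


def bfs(adj, src):
    """Return (dist, last); last[v] is the predecessor of v on one shortest path."""
    dist, last = {src: 0}, {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                last[v] = u
                queue.append(v)
    return dist, last


def ansc_unweighted(n, edges):
    """Length of a shortest cycle through each vertex of an unweighted undirected graph."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    base = [bfs(adj, y) for y in range(n)]          # distances and Last in G
    wt = [inf] * n
    num_bits = max(1, (n - 1).bit_length())
    for i in range(num_bits):
        for j in (0, 1):
            # G_{i,j}: copy 1 keeps all edges; (u^1, v^2) exists if bit i of u is j.
            g = {(v, 1): [(u, 1) for u in adj[v]] for v in range(n)}
            for u in range(n):
                if (u >> i) & 1 == j:
                    g[(u, 1)].extend((v, 2) for v in adj[u])
            for y in range(n):
                dist_ij, last_ij = bfs(g, (y, 1))
                dist_g, last_g = base[y]
                for z in range(n):
                    if z == y or z not in dist_g or (z, 2) not in dist_ij:
                        continue
                    d1, d2 = dist_ij[(z, 2)], dist_g[z]
                    # Steps 5-6: distance check, then the Last comparison.
                    if d1 <= d2 + 1 and last_ij[(z, 2)][0] != last_g[z]:
                        wt[z] = min(wt[z], d1 + d2)
    return wt
```

Running BFS from every source costs far more than a single APSP call; the sketch is meant only to make the two checks in Steps 5-6 concrete.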
Correctness of the above sparse reduction follows from the following two lem-
mas, which are similar to Lemmas 2.4.4 and 2.4.5.
Lemma 2.4.7. If there exists an i ∈ {1, . . . , ⌈log n⌉} and j ∈ {0, 1} and y, z ∈ V
such that the following conditions hold:
Proof of Theorem 2.3.3: We now show that the entries in the wt array returned
by the above algorithm correspond to the ANSC output for G. Let z ∈ V be an
arbitrary vertex in G and let q = wt[z]. Let y′ be the vertex in Step 5 for which we
obtain this value of q. Hence by Lemma 2.4.7, there exists a simple cycle C passing
through z of length at most q in G. If there were a cycle through z of length q′ < q
then by Lemma 2.4.6, there exists a vertex y″ such that conditions in Step 5 hold
for q′, and the algorithm would have returned a smaller value than wt[z], which is
a contradiction. This is a sparse Õ($n^2$) reduction since it makes O(log n) calls to
APSP, and spends Õ($n^2$) additional time.
It would be interesting to see if we can obtain a reduction from weighted
ANSC to APSD or APSP. The above reduction does not work for the weighted case
since it exploits the fact that for any cycle C through a vertex z, an edge in C that
is incident on z is a critical edge for some vertex in C. However, this property need
not hold in the weighted case.
2.4.3 Bit-Sampling
We use the bit-sampling technique in our reductions for undirected graphs: from
weighted MWC to APSP (Section 2.4.1), unweighted ANSC to unweighted APSP
(Section 2.4.2) and from weighted k-SiSC to k-SiSP (Section 2.9.1). This technique
is crucial to all of these reductions. Using this technique, we obtain a new near-
linear time algorithm for undirected k-SiSC, a new Õ(nω ) algorithm for unweighted
undirected ANSC and a simpler Õ(Wmax · nω ) algorithm for weighted MWC. Here
we describe how this technique is different from the ‘bit-encoding’ technique in [2]
and how it gives an explicit construction for Color Coding for 2 colors.
Color Coding is a method introduced by Alon, Yuster and Zwick [13]. For the special
case of 2 colors, the method constructs a collection C of O(log n) different 2-colorings
on an n-element set V, such that for every pair {x, y} in V, there is a 2-coloring
in C that assigns different colors to x and y. When the elements of V have unique
⌈log n⌉-bit labels, e.g., by numbering them from 0 to n − 1, our bit-sampling method
on index i (ignoring indices j and k) can be viewed as an explicit construction of
exactly ⌈log n⌉ hash functions for the 2-perfect hash family: the i-th hash function
assigns to each element the i-th bit in its label as its color.
In our construction we actually use 2 log n functions (using both i and j)
since we need a stronger version of color coding where, for any pair of vertices x,
y, there is a hash function that assigns color 0 to x and 1 to y and another that
assigns 1 to x and 0 to y. This is needed in order to ensure that when $x = v_{p-1}$ and
$y = v_{p+1}$, the edge $(v^1_{p-1}, v^2_p)$ is absent and the edge $(v^1_{p+1}, v^2_p)$ is present. A different
variant of Color Coding with 2 colors is used in [79] in their dense reduction from
undirected MWC to Min-Wt-∆, and we do not immediately see how to apply our
bit-sampling technique there.
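The stronger separation property described above can be checked directly on small label sets. The Python sketch below is illustrative only: the names `bit_sampling_family` and `color` are hypothetical, labels are the integers 0 to n − 1, and bit positions are 0-indexed.

```python
def bit_sampling_family(n):
    """Index pairs (i, j) of the 2 * ceil(log2 n) colorings for labels 0..n-1."""
    num_bits = max(1, (n - 1).bit_length())
    return [(i, j) for i in range(num_bits) for j in (0, 1)]


def color(x, i, j):
    """Color of label x under coloring (i, j): 0 if bit i of x equals j, else 1."""
    return 0 if (x >> i) & 1 == j else 1


def separates_all_ordered_pairs(n):
    """True if every ordered pair (x, y), x != y, gets colors (0, 1) under some (i, j)."""
    family = bit_sampling_family(n)
    return all(
        any(color(x, i, j) == 0 and color(y, i, j) == 1 for i, j in family)
        for x in range(n) for y in range(n) if x != y
    )
```

Two distinct labels differ in some bit i, and choosing j as that bit of x gives x color 0 and y color 1, which is why both values of j are needed for ordered pairs.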
Our bit-sampling method differs from a ‘bit-encoding’ technique used in some
reductions in [2, 1], where the objective is to preserve sparsity in the constructed
graph while also preserving paths from the original graph G = (V, E). This technique
creates paths between two copies of V by adding Θ(log n) new vertices with O(log n)
bit labels, and using the O(log n) bit labels on these new vertices to induce the desired
paths in the constructed graph. The bit-encoding technique (from [2]) is useful for
certain types of reductions, and we use it in our sparse reduction from 2-SiSP to
Radius in Section 2.5, and from 2-SiSP to BC in Section 2.6.
The bit-sampling technique we use in our reduction here is different from this
bit-encoding technique. Here the objective is to selectively sample the edges from
the original graph to be placed in the reduced graph, based on the bit-pattern of the
end points and the edge weight. In our construction, we create Θ(log n) different
graphs, where in each graph the copies of V are connected by single-edge paths,
without requiring additional intermediate vertices.
In Section 2.9.1, we give another application of our bit-sampling technique
to obtain a new near-linear time algorithm for k-SiSC in undirected graphs (see
definition in Section 2.2). Note that this problem is not in the mn class and this
result is relevant here as an application of our new bit-sampling technique.
Figure 2.4: G″ for l = 3. The gray and the bold edges have weight $\frac{11}{9} M'$ and $\frac{1}{3} M'$
respectively. All the outgoing (incoming) edges from (to) A have weight 0 and the
outgoing edges from B have weight M′.
A sparse O(n2 ) reduction from 2-SiSP to APSP was given in [41]. Our sparse
reduction from 2-SiSP to Radius refines this result and the sub-mn partial order by
plugging the Radius and Eccentricities problems within the sparse reduction chain
from 2-SiSP to APSP. Also, in Section 2.5.2 we show MWC $\le^{sprs}_{m+n}$ 2-SiSP, thus
establishing MWC-hardness for both 2-SiSP and Radius. Our 2-SiSP to Radius
reduction here is unrelated to the sparse reduction in [41] from 2-SiSP to APSP.
The input is G = (V, E, w), with source s and sink t in V , and a shortest
path P ($s = v_0 \to v_1 \leadsto v_{l-1} \to v_l = t$). We need to compute a second simple s-t
shortest path. Figure 2.4 gives an example of our reduction to an input G″ to the
Radius problem for l = 3. Our reduction differs from a sub-cubic reduction from
Min-Wt-∆ to Radius in [2] which transforms minimum weight triangle to Radius by
creating a 4-partite graph. However, since Min-Wt-∆ can be solved in $O(m^{3/2})$ time
this is not relevant to us. Instead, we give a more complex reduction from 2-SiSP
where the second shortest path can have Θ(n) edges and hence we cannot start with
a k-partite graph for some constant k.
In G″ we first map every edge (vj , vj+1 ) lying on P to the vertices zjo and zji
such that the shortest path from zjo to zji corresponds to the shortest path from s
to t avoiding the edge (vj , vj+1 ). We then add vertices yjo and yji in the graph and
connect them to vertices zjo and zji by adding edges (yjo , zjo ) and (yji , zji ), and then
additional edges from yjo to other yko and yki vertices such that the longest shortest
path from yjo is to the vertex yji , which in turn corresponds to the shortest path from
zjo to zji . In order to preserve sparsity, we have an interconnection from each yjo
vertex to all yki vertices (except for k = j) with a sparse construction by using 2 log n
additional vertices Cr,s in a manner similar to a bit-encoding technique used in [2] in
their reduction from Min-Wt-∆ to Betweenness Centrality (this technique however,
is different from the new ‘bit-sampling’ technique used in Section 2.4), and we have
two additional vertices A, B with suitable edges to induce connectivity among the
yjo vertices. In our construction, we ensure that the center is one of the yjo vertices
and hence computing the Radius in the reduced graph gives the minimum among all
the shortest paths from zjo to zji . This corresponds to a shortest replacement path
from s to t.
Lemma 2.5.1. In weighted directed graphs, 2-SiSP $\lesssim^{sprs}_{m+n}$ Radius and s-t Replacement
Paths $\lesssim^{sprs}_{m+n}$ Eccentricities.
Proof. We are given an input graph G = (V, E), a source vertex s and a sink/target
vertex t and we wish to compute the second simple shortest path from s to t. Let P
($s = v_0 \to v_1 \leadsto v_{l-1} \to v_l = t$) be the shortest path from s to t in G.
Constructing the reduced graph G″: We first create the graph G′, which contains G
and l additional vertices $z_0, z_1, \ldots, z_{l-1}$. We remove the edges lying on P from G′.
For each 0 ≤ i ≤ l − 1, we add an edge from zi to vi of weight dG (s, vi ) and an edge
from vi+1 to zi of weight dG (vi+1 , t). Also for each 1 ≤ i ≤ l − 1, we add a zero
weight edge from zi to zi−1 .
Now form G″ from G′. For each 0 ≤ j ≤ l − 1, we replace vertex $z_j$ by
vertices $z_{j_i}$ and $z_{j_o}$ and we place a directed edge of weight 0 from $z_{j_i}$ to $z_{j_o}$, and we
also replace each incoming edge to (outgoing edge from) $z_j$ with an incoming edge
to $z_{j_i}$ (outgoing edge from $z_{j_o}$) in G′.
Let $W_{max}$ be the largest edge weight in G and let $M' = 9nW_{max}$. For each
0 ≤ j ≤ l − 1, we add additional vertices $y_{j_i}$ and $y_{j_o}$ and we place a directed edge of
weight 0 from $y_{j_o}$ to $z_{j_o}$ and an edge of weight $\frac{11}{9} M'$ from $z_{j_i}$ to $y_{j_i}$.
We add 2 additional vertices A and B, and we place a directed edge from A
to B of weight 0. We also add l incoming edges to A (outgoing edges from B) from
(to) each of the $y_{j_o}$'s of weight 0 (M′).
We would also like edges of weight $\frac{2M'}{3}$ from $y_{j_o}$ to $y_{k_i}$ (for each k ≠ j). But
adding these $O(n^2)$ edges directly would make the graph dense. To solve this problem, we instead add
a gadget in our construction that ensures that for all 0 ≤ j ≤ l − 1, we have at least one
path of length 2 and weight equal to $\frac{2M'}{3}$ from $y_{j_o}$ to $y_{k_i}$ (for each k ≠ j) (similar to
[2]). In this gadget, we add 2⌈log n⌉ vertices of the form $C_{r,s}$ for 1 ≤ r ≤ ⌈log n⌉ and
s ∈ {0, 1}. Now for each 0 ≤ j ≤ l − 1, 1 ≤ r ≤ ⌈log n⌉ and s ∈ {0, 1}, we add an
edge of weight $\frac{M'}{3}$ from $y_{j_o}$ to $C_{r,s}$ if j's r-th bit is equal to s. We also add an edge of
weight $\frac{M'}{3}$ from $C_{r,s}$ to $y_{j_i}$ if j's r-th bit is not equal to s. So overall we add 2n log n
edges that are incident to $C_{r,s}$ vertices; for each $y_{j_o}$ we add log n outgoing edges to
$C_{r,s}$ vertices and for each $y_{j_i}$ we add log n incoming edges from $C_{r,s}$ vertices.
We can observe that for 0 ≤ j ≤ l − 1, there is at least one path of weight
$\frac{2M'}{3}$ from $y_{j_o}$ to $y_{k_i}$ (for each k ≠ j) and the gadget does not add any new paths
from $y_{j_o}$ to $y_{j_i}$. The reason is that for every distinct j, k, there is at least one bit
(say r) where j and k differ and let s be the r-th bit of j. Then there must be an
edge from $y_{j_o}$ to $C_{r,s}$ and an edge from $C_{r,s}$ to $y_{k_i}$, resulting in a path of weight $\frac{2M'}{3}$
from $y_{j_o}$ to $y_{k_i}$. And by the same argument we can also observe that this gadget
does not add any new paths from $y_{j_o}$ to $y_{j_i}$.
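The connectivity claimed for the $C_{r,s}$ gadget can be verified mechanically on small instances. The following Python sketch is a hedged illustration: the names `build_gadget` and `has_two_hop_path` are hypothetical, edge weights are omitted since every gadget edge has the same weight, and bit positions are 0-indexed.

```python
def build_gadget(l, num_bits):
    """Edges of the C_{r,s} gadget for path indices 0..l-1.

    Returns (out_edges, in_edges), where (j, (r, s)) stands for the edge
    y_{j_o} -> C_{r,s} and ((r, s), j) stands for the edge C_{r,s} -> y_{j_i}.
    """
    out_edges, in_edges = set(), set()
    for j in range(l):
        for r in range(num_bits):
            s = (j >> r) & 1
            out_edges.add((j, (r, s)))       # j's r-th bit equals s
            in_edges.add(((r, 1 - s), j))    # j's r-th bit differs from 1 - s
    return out_edges, in_edges


def has_two_hop_path(out_edges, in_edges, j, k, num_bits):
    """True if some C_{r,s} has an edge from y_{j_o} and an edge to y_{k_i}."""
    return any(
        (j, (r, s)) in out_edges and ((r, s), k) in in_edges
        for r in range(num_bits) for s in (0, 1)
    )
```

With l indices and b bits the gadget has 2·l·b edges in total, in place of the l(l − 1) direct edges it simulates.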
We call this graph G″. Figure 2.4 depicts the full construction of G″ for
l = 3. We now establish the following three properties.
(i) For each 0 ≤ j ≤ l − 1, the longest shortest path in G″ from $y_{j_o}$ is to the vertex $y_{j_i}$.
It is easy to see that the shortest path from $y_{j_o}$ to any of the vertices in G or any of
the z's has weight at most $nW_{max}$. And the shortest paths from $y_{j_o}$ to the vertices A
and B have weight 0. For k ≠ j, the shortest path from $y_{j_o}$ to $y_{k_o}$ and $y_{k_i}$ has weight
M′ and $\frac{2}{3} M'$ respectively. Whereas the shortest path from $y_{j_o}$ to $y_{j_i}$ has weight at
least $10nW_{max}$ as it includes the last edge $(z_{j_i}, y_{j_i})$ of weight $\frac{11}{9} M' = 11nW_{max}$. It
is easy to observe that the shortest path from $y_{j_o}$ to $y_{j_i}$ corresponds to the shortest
path from $z_{j_o}$ to $z_{j_i}$.
(ii) The shortest path from $z_{j_o}$ to $z_{j_i}$ corresponds to the replacement path for the
edge $(v_j, v_{j+1})$ lying on P. Suppose not and let $P_j$ ($s \leadsto v_h \leadsto v_k \leadsto t$) (where
$v_h$ is the vertex where $P_j$ separates from P and $v_k$ is the vertex where it joins P)
be the replacement path from s to t for the edge $(v_j, v_{j+1})$. But then the path $\pi_j$
($z_{j_o} \to z_{(j-1)_i} \leadsto z_{h_o} \to v_h \circ P_j(v_h, v_k) \circ v_k \to z_{k_i} \to z_{k_o} \leadsto z_{j_i}$) (where $P_j(v_h, v_k)$ is
the subpath of $P_j$ from $v_h$ to $v_k$) from $z_{j_o}$ to $z_{j_i}$ has weight equal to wt($P_j$), resulting
in a contradiction as the shortest path from $z_{j_o}$ to $z_{j_i}$ has weight greater than that
of $P_j$.
(iii) One of the vertices among the $y_{j_o}$'s is a center of G″. It is easy to see that none
of the vertices in G could be a center of the graph G″ as there is no path from any
v ∈ V to any of the $y_{j_o}$'s in G″. Using a similar argument, we can observe that
none of the z's, or the vertices $y_{j_i}$'s could be a potential candidate for the center of
G″. For vertices A and B, the shortest path to any of the $y_{j_i}$'s has weight exactly
$\frac{5}{3} M' = 15nW_{max}$, which is strictly greater than the weight of the largest shortest
path from any of the $y_{j_o}$'s. Thus one of the vertices among the $y_{j_o}$'s is a center of G″.
Thus by computing the radius in G″, from (i), (ii), and (iii), we can compute
the weight of the shortest replacement path from s to t, which by definition of 2-SiSP,
is the second simple shortest path from s to t. This completes the proof of 2-SiSP
$\lesssim^{sprs}_{m+n}$ Radius.
Constructing G″ takes O(m + n log n) time since we add O(n) additional
vertices and O(m + n log n) additional edges, and given the output of Radius (Ec-
centricities), we can compute 2-SiSP (s-t Replacement Paths) in O(1) (O(n)) time
and hence the cost of both reductions is O(m + n log n).
We first describe a sparse reduction from directed MWC to 2-SiSP, which we will
use for reducing ANSC to the s-t replacement paths problem. This reduction is
adapted from a sub-cubic non-sparse reduction from Min-Wt-∆ to 2-SiSP in [91].
The reduction in [91] reduces Min-Wt-∆ to 2-SiSP by creating a tripartite graph.
Since starting from Min-Wt-∆ is not appropriate for our results (as discussed in our
sparse reduction to directed Radius), we start instead from MWC, and instead of
the tripartite graph used in [91] we use the original graph G with every vertex v
replaced with 2 copies, vi and vo .
In this reduction, as in [91], we first create a path of length n with vertices
labeled from p0 to pn , which will be the initial shortest path. We then map every
edge (pi , pi+1 ) to the vertex i in the original graph G such that the replacement
path from $p_0$ to $p_n$ for the edge $(p_i, p_{i+1})$ corresponds to the shortest cycle passing
through i in G. Thus computing 2-SiSP (i.e., the shortest replacement path) from
$p_0$ to $p_n$ in the constructed graph corresponds to the minimum weight cycle in the
original graph. Figure 2.5 gives an example of the constructed graph for n = 3.
Figure 2.5: G″ for n = 3 in the reduction: MWC $\le^{sprs}_{m+n}$ 2-SiSP
Lemma 2.5.2. In weighted directed graphs, MWC $\le^{sprs}_{m+n}$ 2-SiSP.
Proof. To compute MWC in G, we first create the graph G′, where we replace every
vertex z by vertices $z_i$ and $z_o$, and we place a directed edge of weight 0 from $z_i$ to
$z_o$, and we replace each incoming edge to (outgoing edge from) z with an incoming
edge to $z_i$ (outgoing edge from $z_o$). We also add a path P ($p_0 \to p_1 \leadsto p_{n-1} \to p_n$)
of length n and weight 0.
Let $Q = n \cdot W_{max}$, where $W_{max}$ is the maximum weight of any edge in G. For
each 1 ≤ j ≤ n, we add an edge of weight (n − j + 1)Q from $p_{j-1}$ to $j_o$ and an edge
of weight jQ from $j_i$ to $p_j$ in G′ to form G″. Figure 2.5 depicts the full construction
of G″ for n = 3. This is an (m + n) reduction, and it can be seen that the second
Figure 2.6: G′ for l = 3 in the reduction: directed s-t Replacement Paths $\le^{sprs}_{m+n}$ ANSC
We now establish the equivalence between ANSC and the s-t replacement
paths problem under (m+n)-reductions by first showing an (m+n)-sparse reduction
from s-t replacement paths problem to ANSC. We then describe a sparse reduction
from ANSC to the s-t replacement paths problem, which is similar to the reduction
from MWC to 2-SiSP.
Lemma 2.5.3. In weighted directed graphs, s-t replacement paths $\equiv^{sprs}_{m+n}$ ANSC.
Proof. We are given an input graph G = (V, E), a source vertex s and a sink vertex t
and we wish to compute the replacement paths for all the edges lying on the shortest
path from s to t. Let P ($s = v_0 \to v_1 \leadsto v_{l-1} \to v_l = t$) be the shortest path from s
to t in G.
(i) Constructing G′: We first create the graph G′, as described in the proof of
Lemma 2.5.1. Figure 2.6 depicts the full construction of G′ for l = 3.
(ii) We now show that for each 0 ≤ i ≤ l − 1, the replacement path from s to t for
the edge $(v_i, v_{i+1})$ lying on P has weight equal to the shortest cycle passing through
$z_i$. If not, assume that for some i (0 ≤ i ≤ l − 1), the weight of the replacement
path from s to t for the edge $(v_i, v_{i+1})$ is not equal to the weight of the shortest cycle
passing through $z_i$.
Let $P_i$ ($s \leadsto v_j \leadsto v_k \leadsto t$) (where $v_j$ is the vertex where $P_i$ separates from
P and $v_k$ is the vertex where it joins P) be the replacement path from s to t for
the edge $(v_i, v_{i+1})$ and let $C_i$ ($z_i \leadsto z_p \to v_p \leadsto v_q \to z_q \leadsto z_i$) be the shortest cycle
passing through $z_i$ in G′.
If wt($P_i$) < wt($C_i$), then the cycle $C'_i$ ($z_i \to z_{i-1} \leadsto z_j \to v_j \circ P_i(v_j, v_k) \circ v_k \to
z_k \to z_{k-1} \leadsto z_i$) (where $P_i(v_j, v_k)$ is the subpath of $P_i$ from $v_j$ to $v_k$) passing through
$z_i$ has weight equal to wt($P_i$) < wt($C_i$), resulting in a contradiction as $C_i$ is the
shortest cycle passing through $z_i$ in G′.
Now if wt($C_i$) < wt($P_i$), then the path $P'_i$ ($s \leadsto v_p \circ C_i(v_p, v_q) \circ v_q \leadsto v_l$) where
$C_i(v_p, v_q)$ is the subpath of $C_i$ from $v_p$ to $v_q$, is also a path from s to t avoiding the
edge $(v_i, v_{i+1})$, and has weight equal to wt($C_i$) < wt($P_i$), resulting in a contradiction
as $P_i$ is the shortest replacement path from s to t for the edge $(v_i, v_{i+1})$.
We then compute ANSC in G′. And by (ii), the shortest cycles for each of
the vertices $z_0, z_1, \ldots, z_{l-1}$ give us the replacement paths from s to t. This leads to
an (m + n) sparse reduction from s-t replacement paths problem to ANSC.
Now for the other direction, we are given an input graph G = (V, E) and
we wish to compute the ANSC in G. We first create the graph G″, as described
in Lemma 2.5.2. We can see that the shortest path from p0 to pn avoiding edge
(pj−1 , pj ) corresponds to a shortest cycle passing through j in G. This gives us an
(m + n)-sparse reduction from ANSC to s-t replacement paths problem.
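The ANSC direction can be illustrated with the Python sketch below, which builds the graph of Lemma 2.5.2 and reads cycle weights off replacement paths. Several choices here are assumptions made for illustration: the names `dijkstra` and `ansc_via_replacement_paths` are hypothetical, vertices are labelled 1 to n, edge weights are positive integers, and a plain Dijkstra run with one path edge banned stands in for the s-t replacement paths oracle.

```python
import heapq
from math import inf


def dijkstra(adj, src, dst, banned):
    """Weight of a shortest src-dst path that avoids the single edge `banned`."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, inf):
            continue                      # stale heap entry
        for v, w in adj.get(u, ()):
            if (u, v) != banned and d + w < dist.get(v, inf):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return inf


def ansc_via_replacement_paths(n, edges):
    """Shortest cycle weight through each vertex 1..n of a directed graph."""
    q = n * max(w for _, _, w in edges)   # Q = n * W_max
    adj = {}

    def add(u, v, w):
        adj.setdefault(u, []).append((v, w))

    for v in range(1, n + 1):
        add(("i", v), ("o", v), 0)        # split vertex: v_i -> v_o
    for u, v, w in edges:
        add(("o", u), ("i", v), w)        # original edge u -> v
    for j in range(1, n + 1):
        add(("p", j - 1), ("p", j), 0)                  # path P of weight 0
        add(("p", j - 1), ("o", j), (n - j + 1) * q)    # p_{j-1} -> j_o
        add(("i", j), ("p", j), j * q)                  # j_i -> p_j
    result = []
    for j in range(1, n + 1):
        repl = dijkstra(adj, ("p", 0), ("p", n), (("p", j - 1), ("p", j)))
        excess = repl - (n + 1) * q
        # A detour that re-enters P at a later vertex costs more than Q extra.
        result.append(excess if excess <= q else inf)
    return result
```

The minimum of the returned list is the weight of a minimum weight cycle, which corresponds to the 2-SiSP value from $p_0$ to $p_n$ after subtracting (n + 1)Q.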
In this section, we consider sparse reductions for Betweenness Centrality and related
problems. In its full generality, the Betweenness Centrality of a vertex v is the sum,
across all pairs of vertices s, t, of the fraction of shortest paths from s to t that
contain v as an internal vertex. This problem has an Õ(mn) time algorithm due to
Brandes [21]. Since there can be an exponential (in n) number of shortest paths
from one vertex to another, this general problem can deal with very large numbers.
In [2], a simplified variant was considered, where it is assumed that there is a unique
shortest path for each pair of vertices, and the Betweenness Centrality of vertex v,
BC(v), is defined as the number of vertex pairs s, t such that v is an internal vertex
on the unique shortest path from s to t. We will also restrict our attention to this
variant here.
A number of sparse reductions relating to the following problems were given
in [2].
• All Nodes Betweenness Centrality (ANBC): compute, for each v, the value of
BC(v).
• Positive All Nodes Betweenness Centrality (Pos ANBC): determine, for each
v, whether BC(v) > 0.
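For the simplified variant with unique shortest paths, both problems can be illustrated by a brute-force Python sketch. It is not Brandes' algorithm and makes no claim about the methods in [2]: the name `all_nodes_bc` is hypothetical, Floyd-Warshall is used purely for brevity, and positive edge weights with a unique shortest path per pair are assumed.

```python
from math import inf


def all_nodes_bc(n, edges):
    """BC(v) for every vertex v of a weighted directed graph on vertices 0..n-1.

    Assumes positive weights and a unique shortest path for every pair, so v is
    internal to the s-t shortest path exactly when d(s, v) + d(v, t) = d(s, t).
    """
    d = [[0 if s == t else inf for t in range(n)] for s in range(n)]
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
    for k in range(n):                       # Floyd-Warshall relaxation
        for s in range(n):
            for t in range(n):
                if d[s][k] + d[k][t] < d[s][t]:
                    d[s][t] = d[s][k] + d[k][t]
    bc = [0] * n
    for s in range(n):
        for t in range(n):
            if s == t or d[s][t] == inf:
                continue
            for v in range(n):
                if v != s and v != t and d[s][v] + d[v][t] == d[s][t]:
                    bc[v] += 1
    return bc


def pos_anbc(n, edges):
    """Pos ANBC: whether BC(v) > 0, for every vertex v."""
    return [value > 0 for value in all_nodes_bc(n, edges)]
```

This runs in cubic time and is meant only to pin down the definitions of ANBC and Pos ANBC on small inputs.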
Figure 2.7 gives an overview of the previous fine-grained results given in [2]
for Centrality problems. In this figure, BC is the only centrality problem that is
known to be sub-cubic equivalent to APSP, and hence is shaded in the figure (along
with Min-Wt-∆). None of these sparse reductions in [2] implies MWC hardness for
any of the centrality problems since Diameter is not known to be MWC-hard (or even
sub-cubic equivalent to APSP), and Min-Wt-∆ has an Õ(m^{3/2}) time algorithm, and so will
give a sub-mn algorithm for MWC if it is MWC-hard. On the other hand, Diameter
Figure 2.7: Known sparse reductions for centrality problems, all from [2]. The
regular edges represent sparse O(m + n) reductions, the squiggly edges represent
tilde-sparse O(m+n) reductions, and the dashed edges represent reductions that are
trivial. BC and Min-Wt-∆ (shaded with gray) are known to be sub-cubic equivalent
to APSP [2, 91].
Figure 2.8: Sparse reductions for weighted directed graphs. The regular edges repre-
sent sparse O(m + n) reductions, the squiggly edges represent tilde-sparse O(m + n)
reductions, and the dashed edges represent reductions that are trivial. All problems
except APSP are MWCC-hard. Eccentricities, BC, ANBC and Pos ANBC are also
SETH/ k-DSH hard. ANBC and Pos ANBC (problems inside the dashed circles)
are both not known to be subcubic equivalent to APSP. BC and Eccentricities (in
bold) are MWC-hard, sub-cubic equivalent to APSP and SETH/k-DSH hard.
is known to be both SETH-hard [78] and k-DSH Hard (Section 2.7) and hence all
these problems in Figure 2.7 (except Min-Wt-∆) are also SETH and k-DSH hard.
In this section, we give a sparse reduction from 2-SiSP to BC, establishing
MWC-hardness for BC. We also give a tilde-sparse reduction from ANSC to Pos
ANBC, and thus we have MWC-hardness for both Pos ANBC and for ANBC, though
neither problem is known to be in the sub-cubic equivalence class. (Both have Õ(mn)
time algorithms, and have APSP-hardness under sub-cubic reductions.)
Figure 2.8 gives an updated partial order of our sparse reductions for weighted
directed graphs; this figure augments Figure 2.1 by including the sparse reductions
Figure 2.9: G″ for l = 3 in the reduction: directed 2-SiSP ≲^{sprs}_{m+n} Betweenness
Centrality. The gray and the bold edges have weight M′ and (1/3)M′ respectively. All
the outgoing (incoming) edges from (to) A have weight M′ + q (0). Here M′ =
9nWmax where Wmax is the largest edge weight in G and q is some value in the
range from 0 to nWmax.
2.6.1 2-SiSP to BC
Our sparse reduction from 2-SiSP to BC is similar to the reduction from 2-SiSP to
Radius described in Section 2.5. The input is G = (V, E, w), with source s and sink
t in V, and a shortest path P (s = v_0 → v_1 ⇝ v_{l−1} → v_l = t). We need to compute
a second simple s-t shortest path. Figure 2.9 gives an example of our reduction to
an input G″ to the BC problem for l = 3.
In our reduction, we first map every edge (v_j, v_{j+1}) to new vertices y_{j_o} and y_{j_i}
such that the shortest path from y_{j_o} to y_{j_i} corresponds to the replacement path from
s to t for the edge (v_j, v_{j+1}). We then add an additional vertex A and connect it to
the vertices y_{j_o} and y_{j_i}. We also ensure that the only shortest paths passing through
A are from y_{j_o} to y_{j_i}. We then do binary search on the edge weights for the edges
going from A to the y_{j_i}’s with oracle calls to the Betweenness Centrality problem, to
compute the weight of the shortest replacement path from s to t, which by definition
of 2-SiSP, is the second simple shortest path from s to t.
Lemma 2.6.1. In weighted directed graphs, 2-SiSP ≲^{sprs}_{m+n} BC.
Proof. We are given an input graph G = (V, E), a source vertex s and a sink/target
vertex t and we wish to compute the second simple shortest path from s to t. Let P
(s = v_0 → v_1 ⇝ v_{l−1} → v_l = t) be the shortest path from s to t in G.
(i) Constructing G″: We first construct the graph G″, as described in the proof of
Lemma 2.5.1, without the vertices A and B. For each 0 ≤ j ≤ l − 1, we change
the weight of the edge from z_{j_i} to y_{j_i} to M′ (where M′ = 9nWmax and Wmax is the
largest edge weight in G).
We add an additional vertex A and for each 0 ≤ j ≤ l − 1, we add an incoming
(outgoing) edge from (to) y_{j_o} (y_{j_i}). We assign the weight of the edges from the y_{j_o}’s to
A as 0 and from A to the y_{j_i}’s as M′ + q (for some q in the range 0 to nWmax).
Figure 2.9 depicts the full construction of G″ for l = 3.
We observe that for each 0 ≤ j ≤ l − 1, a shortest path from y_{j_o} to y_{j_i} with
(z_{j_i}, y_{j_i}) as the last edge has weight equal to M′ + d_{G″}(z_{j_o}, z_{j_i}).
(ii) We now show that the Betweenness Centrality of A, i.e. BC(A), is equal to l
iff q < d_{G″}(z_{j_o}, z_{j_i}) for each 0 ≤ j ≤ l − 1. The only paths that pass through the
vertex A are from the vertices y_{j_o} to the vertices y_{k_i}. For j ≠ k, as noted in the proof
of Lemma 2.5.1, there exist some r, s such that there is a path from y_{j_o} to y_{k_i} that
goes through C_{r,s} and has weight equal to (2/3)M′. However, a path from y_{j_o} to y_{k_i}
through A has weight M′ + q, which is strictly greater than (2/3)M′, and hence the pairs
(y_{j_o}, y_{k_i}) with j ≠ k do not contribute to the Betweenness Centrality of A.
Now if BC(A) is equal to l, it implies that the shortest paths for all pairs
(y_{j_o}, y_{j_i}) pass through A and there is exactly one shortest path for each such pair.
Hence for each 0 ≤ j ≤ l − 1, M′ + q < M′ + d_{G″}(z_{j_o}, z_{j_i}). Thus q < d_{G″}(z_{j_o}, z_{j_i})
for each 0 ≤ j ≤ l − 1.
On the other hand, if q < d_{G″}(z_{j_o}, z_{j_i}) for each 0 ≤ j ≤ l − 1, then the path
from y_{j_o} to y_{j_i} with (z_{j_i}, y_{j_i}) as the last edge has weight M′ + d_{G″}(z_{j_o}, z_{j_i}). However,
the path from y_{j_o} to y_{j_i} passing through A has weight M′ + q < M′ + d_{G″}(z_{j_o}, z_{j_i}).
Hence every such pair contributes 1 to the Betweenness Centrality of A and thus
BC(A) = l.
Thus using (ii), we just need to find the minimum value of q such that
BC(A) < l in order to compute the value min_{0≤j≤l−1} d_{G″}(z_{j_o}, z_{j_i}). We can find
such q by performing a binary search in the range 0 to nWmax and computing BC(A)
at every step. Thus we make O(log nWmax) calls to the Betweenness Centrality
algorithm.
As observed in the proof of Lemma 2.5.3, we know that the shortest path
from z_{j_o} to z_{j_i} corresponds to the replacement path for the edge (v_j, v_{j+1}) lying on
P. Thus by making O(log nWmax) calls to the Betweenness Centrality algorithm,
we can compute the second simple shortest path from s to t in G. This completes
the proof.
The cost of this reduction is O((m + n log n) · log nWmax ).
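The binary search over q used in the proof can be sketched in isolation. The sketch below assumes integer edge weights and treats the Betweenness Centrality computation as a black-box predicate `bc_equals_l(q)` that returns True exactly when BC(A) = l for the given q (by part (ii), exactly when q is smaller than every d_{G″}(z_{j_o}, z_{j_i})). Both the function name and the stand-in oracle are hypothetical; building G″ and running a real BC algorithm are not shown.

```python
def smallest_q_with_bc_below_l(bc_equals_l, upper):
    """Smallest integer q in [0, upper] with bc_equals_l(q) False, or None if there is none.

    Assumes the predicate is monotone: True for all q below some threshold
    and False from the threshold onwards.
    """
    lo, hi = 0, upper + 1  # the answer, if any, lies in [lo, hi)
    while lo < hi:
        mid = (lo + hi) // 2
        if bc_equals_l(mid):
            lo = mid + 1   # threshold is strictly above mid
        else:
            hi = mid       # threshold is at or below mid
    return lo if lo <= upper else None

# Stand-in oracle: pretend the lightest replacement path has weight 7.
calls = []
def fake_oracle(q):
    calls.append(q)
    return q < 7

print(smallest_q_with_bc_below_l(fake_oracle, upper=20), len(calls))
```

Each probe corresponds to one call to the BC algorithm on G″ with the chosen q, so the number of oracle calls is logarithmic in the size of the search range.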
We now describe a tilde-sparse reduction from the ANSC problem to the All Nodes
Positive Betweenness Centrality problem (Pos ANBC). Sparse reductions from Min-Wt-∆
and from Diameter to Pos ANBC are given in [2]. However, Min-Wt-∆ can
be solved in O(m^{3/2}) time, and Diameter is not known to be MWC-hard, hence
neither of these reductions can be used to show hardness of the All Nodes Positive
Betweenness Centrality problem relative to MWC hardness. (Recall that Pos ANBC
is not known to be subcubic equivalent to APSP.)
Our reduction is similar to the reduction from 2-SiSP to the Betweenness
Centrality problem, but instead of computing betweenness centrality through one
vertex, it computes the positive betweenness centrality values for n different nodes.
The input is G = (V, E, w) and we wish to compute the ANSC in G.
In this reduction, we first split every vertex x into vertices x_o and x_i such that
the shortest path from x_o to x_i corresponds to the shortest cycle passing through x
in the original graph. We then add an additional vertex z_x for each vertex x in the
original graph and connect it to the vertices x_o and x_i such that the only shortest
path passing through z_x is from x_o to x_i. We then perform binary search on the edge
weights for the edges going from z_x to x_i with oracle calls to the Positive Betweenness
Centrality problem, to compute the weight of the shortest cycle passing through x
in G.
Lemma 2.6.2. In weighted directed graphs, ANSC ≲^{sprs}_{m+n} Pos ANBC.
Proof. We are given an input graph G = (V, E) and we wish to compute the ANSC
in G. Let Wmax be the largest edge weight in G.
(i) Constructing G′: Now we construct a graph G′ from G. For each vertex x ∈ V,
we replace x by vertices x_i and x_o and we place a directed edge of weight 0 from x_i
to x_o, and we also replace each incoming edge to (outgoing edge from) x with an
incoming edge to x_i (outgoing edge from x_o) in G′. We can observe that the shortest
path from x_o to x_i in G′ corresponds to the shortest cycle passing through x in G.
For each vertex x ∈ V, we add an additional vertex z_x in G′ and we add an
edge of weight 0 from x_o to z_x and an edge of weight q_x (where q_x lies in the range
from 0 to nWmax) from z_x to x_i.
Figure 2.10 depicts the full construction of G′ for n = 3.
We observe that the shortest path from x_o to x_i for some vertex x ∈ V passes
through z_x only if the shortest cycle passing through x in G has weight greater than
q_x.
Figure 2.10: G′ for n = 3 in the reduction: directed ANSC ≲^{sprs}_{m+n} Pos ANBC.
(ii) We now show that for each vertex x ∈ V, Positive Betweenness Centrality of z_x
is true, i.e., BC(z_x) > 0, iff the shortest cycle passing through x has weight greater
than q_x. It is easy to see that the only path that passes through vertex z_x is from x_o
to x_i (as the only outgoing edge from x_i is to x_o and the only incoming edge to x_o
is from x_i).
Now if BC(z_x) > 0, it implies that the shortest path from x_o to x_i passes
through z_x and hence the path from x_o to x_i corresponding to the shortest cycle
passing through x has weight greater than q_x.
On the other hand, if the shortest cycle passing through x has weight greater
than q_x, then the shortest path from x_o to x_i passes through z_x. And hence BC(z_x) >
0.
Then using (ii), we just need to find the maximum value of q_x such that
BC(z_x) > 0 in order to compute the weight of the shortest cycle passing through
x in the original graph. We can find such q_x by performing a binary search in the
range 0 to nWmax and computing Positive Betweenness Centrality for all nodes at
every step. Thus we make O(log nWmax) calls to the Pos ANBC algorithm. This
completes the proof.
The cost of this reduction is O((m + n) · log nWmax ).
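The vertex-splitting step in part (i) can be checked on a small instance. The sketch below covers only that step: it builds the split graph with nodes (x, 'i') and (x, 'o') and confirms that the distance from x_o to x_i equals the weight of the shortest cycle through x. It omits the vertices z_x and the Pos ANBC oracle, and the function names are illustrative assumptions, not names from this work.

```python
import heapq
from math import inf

def split_graph(vertices, edges):
    """Build G' from G: x becomes (x,'i') -> (x,'o') with a weight-0 edge,
    and each edge (u, v, w) of G becomes (u,'o') -> (v,'i') with weight w."""
    adj = {}
    for x in vertices:
        adj[(x, 'i')] = [((x, 'o'), 0)]
        adj[(x, 'o')] = []
    for u, v, w in edges:
        adj[(u, 'o')].append(((v, 'i'), w))
    return adj

def distance(adj, src, dst):
    """Dijkstra distance from src to dst (inf if unreachable)."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist.get(v, inf):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist.get(dst, inf)

def ansc_via_split(vertices, edges):
    """Weight of the shortest cycle through each vertex, read off G'."""
    adj = split_graph(vertices, edges)
    return {x: distance(adj, (x, 'o'), (x, 'i')) for x in vertices}

# Cycles: 1->2->1 (weight 5) and 1->2->3->1 (weight 6); vertex 4 lies on no cycle.
edges = [(1, 2, 1), (2, 1, 4), (2, 3, 2), (3, 1, 3), (3, 4, 1)]
print(ansc_via_split([1, 2, 3, 4], edges))
```

In the full reduction these distances are not computed directly; they are recovered by the binary search over q_x described above.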
2.7 Conditional Hardness Under k-DSH
Here we improve on a result shown in [78] that a sub-m^2 algorithm for Diameter
would refute k-DSH for even values of k by showing sub-mn hardness for Diameter
for both odd and even values of k. Since Diameter trivially reduces to Eccentricities
and BC [2] and k-DSH hardness implies SETH hardness, this result also holds for
Eccentricities and BC, and relative to both k-DSH and SETH.
Lemma 2.7.1. Suppose for some constant α there is an O(m^α · n^{2−α−ε}) time
algorithm, for some ε > 0, for solving Diameter in an unweighted m-edge n-node graph,
either undirected or directed. Then there exists a k′ > 0 such that for all k ≥ k′, the
k-Dominating Set problem can be solved in O(n^{k−ε}) time.
size k that includes x in G, then for any u, v ∈ V1 , at least one vertex in V2 − Vx is
not covered by both u and v and hence there is a path of length 2 from u to v. If
we now compute the diameter in each of graphs Gx , x ∈ V , we will detect a graph
with diameter greater than 2 if and only if G has a dominating set of size k.
Each graph G_x has N = O(n^r) vertices and M = O(n^{r+1}) edges. If we now
assume that Diameter can be computed in time O(M^α · N^{2−α−ε}), then the above
algorithm for k-Dominating Set runs in time O(n · M^α · N^{2−α−ε}) = O(n^{2r+1−εr+α}),
which is O(n^{k−ε}) time when k ≥ 3 + 2α/ε. The analysis is similar for k even. In the
directed case, we get the same result by replacing every edge in G′ with two directed
edges in opposite directions.
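The exponent arithmetic in the running-time bound above can be written out step by step. The derivation below assumes k = 2r + 1 for the odd case (inferred from the leading term n^{2r+1}; the definition of r is not visible in this passage) and drops constant factors.

```latex
\begin{align*}
n \cdot M^{\alpha} \cdot N^{2-\alpha-\epsilon}
  &= n \cdot n^{(r+1)\alpha} \cdot n^{r(2-\alpha-\epsilon)} \\
  &= n^{\,1 + r\alpha + \alpha + 2r - r\alpha - r\epsilon} \\
  &= n^{\,2r + 1 - \epsilon r + \alpha}. \\
\intertext{With $k = 2r+1$ this is at most $n^{k-\epsilon}$ exactly when}
\alpha - \epsilon r \le -\epsilon
  \;&\Longleftrightarrow\; r \ge 1 + \frac{\alpha}{\epsilon}
  \;\Longleftrightarrow\; k \ge 3 + \frac{2\alpha}{\epsilon}.
\end{align*}
```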
smaller than T′(m, n), for some ε > 0. Further, this requirement is placed only on
sufficiently sparse graphs (and for a weakly smaller time bound, we also require a
certain minimum edge density). The consequence of this definition is that when one
time bound is not dominated by the other for all values of m, the domination needs
to hold for sufficiently sparse graphs in order for the dominated function to be a
smaller time bound for sparse graphs.
Definition 2.8.1 (Comparing Time Bounds for Sparse Graphs). Given two
time bounds T(m, n) and T′(m, n),
(i) T(m, n) is a smaller time bound than T′(m, n) for sparse graphs if there exist
constants γ, ε > 0 such that T(m, n) = O((1/m^ε) · T′(m, n)) for all values of
m = O(n^{1+γ}).
(ii) T(m, n) is a weakly smaller time bound than T′(m, n) for sparse graphs if
there exists a positive constant γ such that for any constant δ with γ > δ > 0,
there exists an ε > 0 such that T(m, n) = O((1/m^ε) · T′(m, n)) for all values of m
With Definition 2.8.1 in hand, the following lemma is straightforward.
Lemma 2.8.2. Let T_1(m, n) = O(m^{α_1} n^{β_1}) and T_2(m, n) = O(m^{α_2} n^{β_2}) be two time
bounds, where α_1, β_1, α_2, β_2 are constants.
(i) T_1(m, n) is a smaller time bound than T_2(m, n) for sparse graphs if α_2 + β_2 >
α_1 + β_1.
(ii) T_1(m, n) is a weakly smaller time bound than T_2(m, n) for sparse graphs if
α_2 + β_2 = α_1 + β_1, and α_2 > α_1.
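The two sufficient conditions of Lemma 2.8.2 amount to comparing exponent sums, which a few lines of code can make concrete. This is a sketch under the assumption that both bounds are pure monomials m^α n^β with polylogarithmic factors ignored; the function name is illustrative. The lemma is stated as an "if", so a result of None only means that neither condition applies.

```python
from fractions import Fraction

def compare_for_sparse(a1, b1, a2, b2):
    """Classify T1 = m^a1 * n^b1 against T2 = m^a2 * n^b2 using Lemma 2.8.2.

    Returns "smaller", "weakly smaller", or None when neither condition holds.
    """
    s1, s2 = a1 + b1, a2 + b2
    if s2 > s1:
        return "smaller"            # condition (i): strictly smaller exponent sum
    if s2 == s1 and a2 > a1:
        return "weakly smaller"     # condition (ii): equal sums, smaller power of m
    return None

# m^{3/2} (the Min-Wt-∆ bound quoted earlier) against mn:
print(compare_for_sparse(Fraction(3, 2), 0, 1, 1))
# n^2 against mn: equal exponent sums, smaller power of m.
print(compare_for_sparse(0, 2, 1, 1))
```

The intuition is that on sufficiently sparse graphs m is close to n, so m^α n^β behaves like n^{α+β}.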
Definition 2.8.1, in conjunction with Lemma 2.8.2 and Theorem 2.3.9, leads
to the following provable separation of time bounds for sparse graph problems in the
sub-cubic equivalence class:
Theorem 2.8.3 (Split of Time Bounds for Sparse Graphs.). Under either
SETH or k-DSH, triangle finding problems in the sub-cubic equivalence class have
algorithms with a smaller time bound for sparse graphs than any algorithm we can
design for Eccentricities.
i − 1 cycles. The corresponding path version of this problem is known as k-SiSP and
is solvable in near linear time in undirected graphs [57].
We now use our bit-sampling technique (described in Section 2.4) to get a
near-linear time algorithm for k-SiSC, which was not previously known. We obtain
this k-SiSC algorithm by giving a tilde-sparse Õ(m + n) time reduction from k-SiSC
to k-SiSP. This reduction uses our bit-sampling technique for sampling the edges
incident to v and creates ⌈log n⌉ different graphs. Here we only use index i of our
bit-sampling method.
Lemma 2.9.1. In undirected graphs, k-SiSC ≲^{sprs}_{(m+n)} k-SiSP.
Proof. Let the input be G = (V, E) and let x ∈ V be the vertex for which we need
to compute k-SiSC. Let N(x) be the neighbor-set of x. We create ⌈log n⌉ graphs
G_i = (V_i, E_i) such that ∀1 ≤ i ≤ ⌈log n⌉, G_i contains two additional vertices x_{0,i}
and x_{1,i} (instead of the vertex x) and ∀y ∈ N(x), the edge (y, x_{0,i}) ∈ E_i if y’s i-th
bit is 0, otherwise the edge (y, x_{1,i}) ∈ E_i. This is our bit-sampling method.
The construction takes O((m + n) · log n) time and we observe that every
cycle through x will appear as a path from x_{0,i} to x_{1,i} in at least one of the G_i.
Hence, the k-th shortest path in the collection of k-SiSPs from x_{0,i} to x_{1,i} in the
graphs G_i, 1 ≤ i ≤ ⌈log n⌉ (after removing duplicates), corresponds to the k-th SiSC passing
through x.
Using the undirected k-SiSP algorithm in [57] that runs in O(k·(m+n log n)),
we obtain an O(k log n · (m + n log n)) time algorithm for k-SiSC in undirected
graphs.
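The bit-sampling step in the proof can be illustrated on its own. The sketch below assumes vertices are labelled 0, ..., n−1 and indexes bits from 0 (the text indexes them from 1). It only computes, for each bit position, which neighbors of x attach to x_{0,i} and which to x_{1,i}; building the full graphs G_i and running a k-SiSP algorithm are not shown. The function names are hypothetical.

```python
def bit_sample_neighbors(neighbors, n):
    """For each bit position i, split N(x) into the neighbors joined to x_{0,i}
    (bit i of the label is 0) and those joined to x_{1,i} (bit i is 1)."""
    bits = max(1, (n - 1).bit_length())  # equals ceil(log2 n) for n >= 2
    split = []
    for i in range(bits):
        to_x0 = {y for y in neighbors if not (y >> i) & 1}
        to_x1 = {y for y in neighbors if (y >> i) & 1}
        split.append((to_x0, to_x1))
    return split

def separated_somewhere(split, y1, y2):
    """True if some G_i attaches y1 and y2 to different copies of x."""
    return any((y1 in a and y2 in b) or (y1 in b and y2 in a) for a, b in split)

split = bit_sample_neighbors([1, 2, 3, 6], n=8)
print(len(split), separated_somewhere(split, 2, 6))
```

Two distinct neighbors have labels that differ in at least one bit, so the two edges of any cycle at x end up on different copies of x in some G_i, which is why every cycle through x appears as an x_{0,i} to x_{1,i} path in at least one graph.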
open problems remain of which we mention two.
Our reduction from ANSC to APSP in undirected graphs only works for the
unweighted case. An open question here is to extend this reduction to the weighted
case or to come up with an altogether different reduction.
The directed and undirected versions of most of these problems are known
to be subcubic equivalent [91]. However such an equivalence is not known for the
sparse case. For path problems, the undirected versions trivially reduce to their
directed versions, though nothing is known for the reverse case. Nothing is known in
either direction for cycle problems. It will be interesting to see if one can establish
sparse reductions for these problems.
Chapter 3
3.1 Introduction
paths (SiSP) and cycles (SiSC) in a weighted directed graph under the following set-
ups: the k simple shortest paths for all pairs of vertices (k-APSiSP), k simple shortest
paths in the overall graph (k-All-SiSP), and the corresponding problem of finding
simple shortest cycles in the overall graph (k-All-SiSC). We obtained significantly
faster algorithms for k-APSiSP for small values of k, and fast algorithms, that also
appear to be the first nontrivial algorithms, for the remaining two problems for all
k ≥ 1. Implicit in our method for k-All-SiSC are new algorithms for finding k
simple shortest cycles through a specified vertex (k-SiSC) and through every vertex
(k-ANSiSC) in weighted directed graphs.
The techniques we use in our algorithms are of special interest: We use two
path extension techniques, a new method for k-APSiSP, and another for k-All-SiSP
that is related to a method used in [25] for fully dynamic APSP, but which is still
new for the context in which we use it.
Related Work For the case when the k shortest paths need not be simple, the
all-pairs version (k-APSP) was considered in the classical papers of Lawler [61, 62]
and Minieka [69]. The most efficient current algorithm for k-APSP runs the k-SSSP
algorithm in [30] on each of the n vertices in turn, leading to a bound of
O(mn + n^2 log n + kn^2). It was noted in Minieka [69] that the all-pairs version of
k shortest paths becomes significantly harder when simple paths are required, i.e.,
that the problem we study here, k-APSiSP, appears to be significantly harder than
k-APSP.
Even for a single source-sink pair, the problem of generating k simple shortest
paths (k-SiSP) is considerably more challenging than the unrestricted version considered
in [30]. Yen’s algorithm [92] finds the k simple shortest paths for a specific
pair of vertices in O(k · (mn + n^2 log n)) time. This time bound was improved slightly [41],
using Pettie’s faster APSP algorithm [76], to O(k(mn + n^2 log log n)). On the other
hand, it is shown in [91] that if the second simple shortest path for a single source-sink
pair (i.e., k = 2 in k-SiSP) can be found in O(n^{3−δ}) time for some δ > 0, then
APSP can also be computed in O(n^{3−α}) time for some α > 0; the latter is a major
open problem. Thus, for dense graphs, where m = Θ(n^2), we cannot expect to improve
the Õ(mn) bound, even for 2-SiSP, unless we solve a major and long-standing
open problem for APSP.
The k-SiSP problem is much simpler in the undirected case and is known to
be solvable in O(k(m + n log n)) time [57]. For unweighted directed graphs, Roditty
and Zwick [80] gave an Õ(km√n) randomized algorithm for directed k-SiSP. They
also showed that k-SiSP can be solved with O(k) executions of an algorithm for the
2-SiSP problem.
A problem related to 2-SiSP is the replacement paths problem. In the s-t
version of this problem, we need to output a shortest path from s to t when an edge
on the shortest path p is removed; the output is a collection of |p| paths, each a
shortest path from s to t when an edge on p is removed. Clearly, given a solution
to the s-t replacement paths problem, the second shortest path from s to t can be
computed as the path of minimum weight in this solution. This is essentially the
method used in all prior algorithms for 2-SiSP (and with modifications, for k-SiSP),
and thus the current fastest algorithms for 2-SiSP and replacement paths have the
same time bound. For the all-pairs case that is of interest to us, the output for the
replacement paths problem would be O(n^3) paths, where each path is shortest for a
specific vertex pair, when a specific edge in its shortest path is removed. In view of
the large space needed for this output, in the all-pairs version of replacement paths,
the problem of interest is distance sensitivity oracles (DSO). Here, the output is a
compact representation from which any specific replacement path can be found in
O(1) time. The first such oracle was developed in Demetrescu et al. [26], and it
has size O(n^2 log n). The current best construction time for an oracle of this size is
O(mn log n + n^2 log^2 n) time for a randomized algorithm, and a log factor slower for
a deterministic algorithm, given in Bernstein and Karger [17]. Given such an oracle,
the output to 2-APSiSP can be computed with O(n) queries for each source-sink
pair, i.e., with O(n^3) queries to the DSO.
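The link between s-t replacement paths and 2-SiSP described above can be sketched naively: remove each edge of the shortest path in turn, recompute a shortest path, and take the minimum. The sketch assumes non-negative weights and no parallel edges, and it runs one Dijkstra per edge of the shortest path, so it is far from the bounds cited in this section. The function names are illustrative.

```python
import heapq
from math import inf

def shortest_path(adj, s, t, banned=None):
    """Dijkstra from s to t that skips the single edge `banned` = (u, v).
    Returns (weight, path) or (inf, None) if t is unreachable."""
    dist, parent, heap = {s: 0}, {s: None}, [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        if u == t:
            break     # dist[t] is final once t is extracted
        for v, w in adj.get(u, []):
            if (u, v) == banned:
                continue
            if d + w < dist.get(v, inf):
                dist[v], parent[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    if t not in dist:
        return inf, None
    path, u = [], t
    while u is not None:
        path.append(u)
        u = parent[u]
    return dist[t], path[::-1]

def second_simple_shortest_weight(adj, s, t):
    """Weight of a second simple shortest s-t path: the lightest replacement path."""
    _, p = shortest_path(adj, s, t)
    if p is None:
        return inf
    return min((shortest_path(adj, s, t, banned=(p[j], p[j + 1]))[0]
                for j in range(len(p) - 1)), default=inf)

g = {'s': [('a', 1), ('b', 3)], 'a': [('t', 1), ('b', 1)], 'b': [('t', 3)], 't': []}
print(second_simple_shortest_weight(g, 's', 't'))
```

This works because any simple s-t path other than the shortest path P must avoid at least one edge of P, and so it is no lighter than the replacement path for that edge.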
To the best of our knowledge, for k > 1 the problem of generating k simple
shortest cycles in the overall graph in non-decreasing order of their weights (k-
All-SiSC) has not been studied before, and neither has k-SiSC (k Simple Shortest
Cycles through a given node) or k-ANSiSC (k All Nodes Simple Shortest Cycles);
for k = 1, 1-All-SiSC asks for a minimum weight cycle and 1-ANSiSC is the ANSC
problem [93], both of which can be found in Õ(mn) time, and 1-SiSC can be solved
in Õ(m + n) time. On the other hand, enumerating simple (or elementary) cycles
in no particular order — which is thus a special case of k-All-SiSC — has been
studied extensively [86, 89, 84, 53]. The first polynomial time algorithm was given
by Tarjan [84], and ran in O(kmn) time for k cycles. This result was improved to
O(k · m + n) by Johnson [53]. We do not expect to match this linear time result for
k-All-SiSC since it includes the minimum weight cycle problem for k = 1.
Definition 3.2.1. Let G = (V, E) be a directed graph with non-negative edge weights.
For k ≥ 2, and a vertex pair x, y, let k ∗ = min{r, k}, where r is the number of simple
paths from x to y in G. Then,
Problem Known Results Our Results
Table 3.1: Our results for directed graphs. All algorithms are deterministic. (DSO
stands for Distance Sensitivity Oracles).
Our algorithm for k-APSiSP first constructs Q_k(x, y) for all pairs of vertices
x, y, and then uses these sets in an efficient algorithm, Compute-APSiSP, to compute
the P*_k(x, y) for all x, y. The latter algorithm runs in time O(k · n^2 + n^2 log n)
for any k, while our method for constructing the Q_k(x, y) depends on k. For k = 2
we present an O(mn + n^2 log n) time method to compute the Q_2(x, y) sets; this gives
a 2-APSiSP algorithm that matches Yen’s bound of O(mn + n^2 log n) for 2-SiSP for
a single pair of vertices. It is also faster (by a poly-logarithmic factor) than the best
algorithm for DSO (distance sensitivity oracles) for the all-pairs replacement paths
problem [17]. In fact, we also show that the Q_2(x, y) sets can be computed in O(n^2)
time using a DSO, and hence 2-APSiSP can be computed in O(n^2 log n) time plus
the time to construct the DSO.
For k ≥ 3 our algorithm to compute the Q_k sets makes calls to an algorithm
for (k − 1)-APSiSP, so we combine the two components together in a single recursive
method, APSiSP, that takes as input G and k, and outputs the P*_k sets for all vertex
pairs. The time bound for APSiSP increases with k: it is faster than Yen’s method
for k = 3 by a factor of n (and hence is faster than the current fastest method by
almost a factor of n), it matches Yen for k = 4, and its performance degrades for
larger k.
If a faster algorithm can be designed to compute the Q_k sets, then we can
run Compute-APSiSP on its output and hence compute k-APSiSP in additional
O(k · n^2 + n^2 log n) time. Thus, a major open problem left by our results is the design
of a faster algorithm to compute the Q_k sets for larger values of k.
ANSiSC) and k-All-SiSP. We consider the problem of generating the k simple
shortest cycles in the graph G in non-decreasing order of their weight (k-All-SiSC). In
Section 3.4 we give an algorithm for k-All-SiSC that runs in Õ(k · mn) time
by generating each successive simple shortest cycle in G in Õ(mn) time. The same
algorithm can be used to enumerate all simple cycles in G in non-decreasing order of
their weights. Recall that the related problem of simply enumerating simple cycles in
a graph in no particular order was a very well-studied classical problem [86, 89, 84, 53]
until an algorithm that generates successive cycles in linear time was obtained [53].
Our algorithm does not match the linear time bound per successive cycle, but it is to
be noted that 1-All-SiSC (i.e., the problem of generating a minimum weight cycle)
is a very fundamental and well-studied problem for which the current best bound is
Õ(mn).
Our algorithm for k-All-SiSC creates an auxiliary graph on which suitable
SiSP computation can be performed to generate the desired output. Using the same
auxiliary graph, we came up with fast algorithms for k-SiSC and k-ANSiSC.
Complementing our result for k-All-SiSC, we present in Section 3.5 an al-
gorithm for k-All-SiSP that generates each successive simple path in Õ(k) time if
k < n, and in Õ(n) time if k > n, after an initial start-up cost of O(m) to find
the first path. This time bound is considerably faster than that for k-All-SiSC.
Our method, All-SiSP, is again one of extending existing paths by an edge (as is
Compute-APSiSP); it is, however, a different path extension method.
Path Extensions. We use two different path extension methods, one for k-APSiSP
and the other for k-All-SiSP. Path extensions have been used before in the hidden
paths algorithm for APSP [55] and more recently, for fully dynamic APSP [25]. Our
path extension method for k-All-SiSP is inspired by a method in [25] to compute
‘locally shortest paths’ for fully dynamic APSP. Our path extension method for
k-APSiSP appears to be new.
Here are the main theorems we establish for our algorithmic results. In all
cases, the input is a directed graph G = (V, E) with nonnegative edge weights.
Theorem 3.2.2. Given an integer k > 1, and the nearly simple shortest paths sets
Q_k(x, y) (Definition 3.2.1) for all x, y ∈ V, Algorithm Compute-APSiSP produces
the k simple shortest paths for every pair of vertices in O(k · n^2 + n^2 log n) time.
(iii) T(m, n, 3), the time bound for algorithm APSiSP for k = 3, is O(m · n^2 + n^3 ·
log n).
Theorem 3.2.5. (k-All-SiSP) After an initial start-up cost of O(m) time to generate
the first path, Algorithm All-SiSP computes each succeeding simple shortest path
with the following bounds:
(i) amortized O(k + log n) time if k = O(n) and O(n + log k) time if k = Ω(n);
(ii) worst-case O(k · log n) time if k = O(n), and O(n · log k) time if k = Ω(n).
In the second step it computes the exact k-SiSP sets P*_k(x, y) for all x, y using the
Q_k(x, y) sets. This second step is the same for any value of k, and we describe this
step first in Section 3.3.1. We then present efficient algorithms to compute the Q_k
sets for k = 2 and k > 2.
In our algorithms we maintain the paths in each P*_k(x, y) and Q_k(x, y) set in
an array in non-decreasing order of edge-weights.
Lemma 3.3.1. Suppose there are k simple shortest paths from x to y, all having
the same first edge (x, a). Then ∀i, 1 ≤ i ≤ k, the right subpath of the i-th simple
shortest path from x to y has weight equal to the weight of the i-th simple shortest
path from a to y.
Proof. By induction on k. Since subpaths of shortest paths are shortest paths, the
statement holds for k = 1. Assume the statement is true for all h ≤ k, and consider
the case when the h + 1 simple shortest paths from x to y all share the same first
edge (x, a). Inductively, the right subpath of each of the first h simple shortest
paths has weight equal to that of the corresponding simple shortest path from a to y.
Suppose the weight of the right subpath π_{a,y} of the (h + 1)-th simple shortest path
from x to y is not equal to the weight of the (h + 1)-th simple shortest path from a
to y. Hence, if π′_{a,y} is the (h + 1)-th simple shortest path from a to y, we must have
wt(π_{a,y}) > wt(π′_{a,y}).
Since π_{xa,y} is the (h + 1)-th simple shortest path from x to y and wt(π_{a,y}) >
wt(π′_{a,y}), there exists at least one path from a to y that contains x and is also the
j-th simple shortest path from a to y, where j ≤ h + 1. Let this path be π″_{a,y}. Let the
subpath of π″_{a,y} from x to y be π″_{xa′,y}. But then wt(π″_{xa′,y}) < wt(π″_{a,y}) ≤ wt(π′_{a,y}) <
wt(π_{a,y}) < wt(π_{xa,y}). But this is a contradiction to our assumption that all the
first h + 1 simple shortest paths from x to y contain (x, a) as the first edge. This
contradiction establishes the induction step and the lemma.
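Lemma 3.3.1 can be sanity-checked by brute force on a small graph. The sketch below enumerates all simple paths by depth-first search (exponential time, for illustration only), assumes x ≠ y and no parallel edges, and compares the right-subpath weights of the k lightest x-y paths with the k lightest a-y paths. The function names and the example graph are made up for this illustration.

```python
def simple_paths(adj, src, dst):
    """Yield (weight, path) for every simple path from src to dst."""
    stack = [(src, 0, [src])]
    while stack:
        u, w, path = stack.pop()
        if u == dst:
            yield w, path
            continue
        for v, wt in adj.get(u, []):
            if v not in path:           # keep the path simple
                stack.append((v, w + wt, path + [v]))

def check_right_subpaths(adj, x, y, k):
    """Return None if the k lightest simple x-y paths do not share a first edge,
    otherwise whether their right-subpath weights equal the weights of the
    k lightest simple a-y paths."""
    xy = sorted(simple_paths(adj, x, y))[:k]
    if len(xy) < k or len({p[1] for _, p in xy}) != 1:
        return None                     # hypothesis of the lemma not met
    a = xy[0][1][1]
    w_xa = next(wt for v, wt in adj[x] if v == a)
    ay = sorted(simple_paths(adj, a, y))[:k]
    return [w - w_xa for w, _ in xy] == [w for w, _ in ay]

g = {'x': [('a', 1), ('y', 10)], 'a': [('b', 1), ('y', 5)], 'b': [('y', 1)], 'y': []}
print(check_right_subpaths(g, 'x', 'y', 2), check_right_subpaths(g, 'x', 'y', 3))
```

In the example the two lightest x-y paths both start with the edge (x, a), so the lemma applies for k = 2; for k = 3 the third path uses the edge (x, y) directly and the hypothesis no longer holds.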
Algorithm 1 Compute-APSiSP(G = (V, E), wt, k, {Q_k(x, y), ∀x, y})
1: Initialize:
2: H ← ∅ {H is a priority queue.}
3: for all x, y ∈ V, x ≠ y do
4:     P*_k(x, y) ← Q_k(x, y)
5:     if the k − 1 shortest paths in P*_k(x, y) have the same first edge then
6:         Let (x, a) be the first edge in the (k − 1) shortest paths in P*_k(x, y)
7:         Add (x, a) to the set Extensions(a, y)
8:         if |Q_k(a, y)| = k then
9:             π ← the path of largest weight in Q_k(a, y)
10:            π′ ← (x, a) ◦ π
11:            Add π′ to H with weight wt(x, a) + wt(π)
12: Main Loop:
13: while H ≠ ∅ do
14:     π ← Extract-min(H)
15:     Let π = (xa, y) and π′ a path of largest weight in P*_k(x, y)
16:     if |P*_k(x, y)| = k − 1 then
17:         add π to P*_k(x, y) and set update flag
18:     else if wt(π) < wt(π′) then
19:         Replace π′ with π in P*_k(x, y) and set update flag
20:     if update flag is set then
21:         for all (x′, x) ∈ Extensions(x, y) do
22:             Add (x′, x) ◦ π to H with weight wt(x′, x) + wt(π)
the first edge on all k − 1 SiSPs from x to y. In addition to adding the common
first edge (x, a) in the (k − 1) SiSPs in P*_k(x, y) to Extensions(a, y) in Step 7, the
algorithm creates the k-LESiP with start edge (x, a) and end vertex y using the k-th
shortest path in the set P*_k(a, y), and adds it to heap H in Steps 8-11. Let U denote
the set of P*_k(x, y) sets which may need to be updated; these are the sets for which
the if condition in Step 5 holds.
In the main while loop in Steps 13-22, a min-weight path is extracted in each
iteration. We establish below that this min-weight path is added to the corresponding
Pk∗ in Step 17 or 19 only if it is the k-th SiSP; in this case, its left extensions are
created and added to the heap H in Step 22, and we note that some of these paths
could be cyclic.
Lemma 3.3.2. Let G = (V, E) be a directed graph with non-negative edge weight
function wt, and ∀x, y ∈ V , let the set Qk (x, y) contain the nearly k-SiSPs from
x to y. Then, algorithm Compute-APSiSP correctly computes the sets Pk∗ (x, y)
∀x, y ∈ V .
Proof. First, we need to show that the paths in sets P*_k(x, y) are indeed simple.
Clearly, the paths added to P*_k from sets Q_k in Step 4 are already simple (from the
definition of Q_k). So we only need to show that the paths added to P*_k in Steps
17 and 19 are simple. To the contrary, assume that some of the paths that are
added to P*_k are non-simple. Clearly these paths must be of length greater than 1.
Let π_{xa,y} = x → a ⇝ y be the first minimum weight path extracted from H that
contains a cycle and was added to P*_k in Step 17 or 19. Clearly, P*_k(x, y) ∈ U and
(x, a) ∈ Extensions(a, y) and the right subpath π_{a,y} must be in P*_k (otherwise the
path π_{xa,y} would never have been added to heap H in Step 11 or 22). The right
subpath π_{a,y} must also be simple (as wt(π_{a,y}) < wt(π_{xa,y})), and it must contain x
in order to create a cycle in π_{xa,y}. Let π_{xa′,y} (a′ ≠ a) be the subpath of π_{a,y} from x
to y. Now there are two cases depending on whether π_{xa,y} was added to P*_k in Step
17 or 19.
If π_{xa,y} was added to P*_k(x, y) in Step 17 and as P*_k(x, y) ∈ U, it implies that
all k − 1 paths in Q_k(x, y) have the same first edge (x, a) and there is no simple path
from x to y in Q_k(x, y) with some first edge (x, a″) ≠ (x, a). This is a contradiction
as the subpath π_{xa′,y} of π_{a,y} contains (x, a′) ≠ (x, a) as its first edge.
Otherwise, let π_{xa″,y} ∈ Q_k(x, y) (a″ ≠ a) be the path that was removed
from P*_k in Step 19 to accommodate π_{xa,y}. Thus, we have wt(π_{xa′,y}) < wt(π_{xa,y}) <
wt(π_{xa″,y}), which is a contradiction as π_{xa″,y} ∈ Q_k(x, y) and is the shortest path
from x to y avoiding edge (x, a) (as the other k − 1 shortest paths in Q_k(x, y) have
(x, a) as the first edge). As path π_{xa,y} is arbitrary, all paths in P*_k are simple.
Now we need to show that Pk∗ (x, y) indeed contains the k ∗ SiSPs from x to
y.
From the definition of Qk (x, y), it is evident that Pk∗ (x, y) indeed contains the
k − 1 SiSPs from x to y. We now need to show that the k-th shortest path in each of
the sets Pk∗ is indeed the corresponding k-th SiSP. To the contrary assume that there
exists a Pk∗ set that does not contain the correct k-th SiSP. Let πxa,y = x → a ⇝ y
be the minimum weight k-th SiSP that is not present in Pk∗ . Clearly, πxa,y ∉ Qk (x, y)
(otherwise it would have been added to Pk∗ (x, y) in Step 4). This implies that πxa,y
has the same first edge as that of the k − 1 SiSPs from x to y and hence Pk∗ (x, y) ∈ U
and (x, a) ∈ Extensions(a, y). By Lemma 3.3.1, the right subpath of πxa,y must
have weight equal to the k-th SiSP from a to y. Thus, there are at least k SiSPs
from a to y and the set Pk∗ (a, y) contains all the k SiSPs from a to y. And as
(x, a) ∈ Extensions(a, y), a path π′xa,y with the k-th SiSP from a to y as the right
subpath and weight equal to wt(πxa,y ) must have been added to H either in Step
11 or 22 and would have been added to Pk∗ (x, y) in Step 17 or 19, resulting in a
contradiction to our assumption that Pk∗ (x, y) does not contain all the k SiSPs.
Thus, Pk∗ (x, y) does contain the k ∗ SiSPs from x to y.
The time bound for Algorithm Compute-APSiSP in Theorem 3.2.2 is es-
tablished with the following sequence of simple lemmas.
Lemma 3.3.3. There are O(kn2 ) paths in Pk∗ , and O(n2 ) elements across all Ex-
tensions sets.
Proof. The Pk∗ sets contain O(kn2 ) paths in total since there are at most k paths in each of the n · (n − 1)
sets Pk∗ (x, y). For the second part, exactly one edge is contributed to an Extensions
set by each Pk∗ (x, y) ∈ U in Step 7.
Lemma 3.3.4. Each Pk∗ (x, y) set is updated at most once in the main while loop.
Proof. A path can be added to Pk∗ (x, y) at most once in Step 17 since its size will
increase to k after the addition. Also, a path is added at most once in either Step 17
or Step 19 since paths are extracted from H in nondecreasing order of their weights.
Proof. For each k-LESiP, the right subpath must be the k-th shortest path in Pk∗ .
For each pair of vertices x, y ∈ V , there is at most one entry across the Extensions
sets (say edge (x, a) ∈ Extensions(a, y)) and hence at most one k-LESiP will be
added to heap H in Step 11 for pair (x, y). By Lemma 3.3.4, we know that the set
Pk∗ (a, y) is updated at most once and hence at most one k-LESiP will be added to
heap H for pair (x, y) in Step 22. Thus, there are only O(n2 ) k-LESiPs that were
added to the heap H in the algorithm.
Proof. A binary heap suffices for H. The initialization for loop in Steps 3-11 takes
O(kn2 ) time to initialize and inspect the Pk∗ sets. It is executed at most n2 times
and, outside of the inspection of Pk∗ (x, y), an iteration costs Θ(log n) time (cost for
insertion in heap), thus contributing O(n2 log n) to the running time. The while
loop is executed O(n2 ) times since, by Lemma 3.3.5, O(n2 ) elements are added to the
heap. The extract-min operation takes Θ(log n) time and hence Step 14 contributes
O(n2 log n) to the running time. Steps 15-19 take constant time per iteration and
hence add O(n2 ) to the total running time. By Lemma 3.3.3, Step 22 is executed
O(n2 ) times and contributes O(n2 log n) to the running time. Thus, the total running
time of the algorithm is O(kn2 + n2 log n).
We now give an O(mn + n2 log n) time algorithm to compute Q2 (x, y) for all pairs
x, y. This method uses the procedure fast-exclude from Demetrescu et al. [26],
which we now describe (full details of this algorithm can be found in [26]).
Given a rooted tree T , edges (u1 , v1 ) and (u2 , v2 ) on T are independent [26] if
the subtree of T rooted at v1 and the subtree of T rooted at v2 are disjoint. Given
the weighted directed graph G = (V, E), the SSSP tree Ts rooted at a source vertex
s ∈ V , and a set S of independent edges in Ts , algorithm fast-exclude in [26]
computes, for each edge e ∈ S, a shortest path from s to every other vertex in
G − {e}. This algorithm runs in time O(m + n log n).
We will compute the second path in each Q2 (x, y) set, for a given x ∈ V ,
by running fast-exclude with x as source, and with the set of outgoing edges
from x in the shortest path tree rooted at x, Tx , as the set S. Clearly, this set S is
independent, and hence algorithm fast-exclude will produce its specified output.
Now consider any vertex y ≠ x, and let (x, a) be the first edge on the shortest path
from x to y in Tx . By its specification, fast-exclude will compute a shortest path
from x to y that avoids edge (x, a) in its output, which is the second path needed
for Q2 (x, y). This holds for every vertex y ∈ V − {x}. Thus we have:
Lemma 3.3.7. The Q2 (x, y) sets for pairs x, y can be computed in O(mn + n2 log n)
time.
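To make the role of the second path in Q2 (x, y) concrete, the following is a minimal Python sketch. It does not implement fast-exclude from [26]; as a stand-in it naively reruns Dijkstra once per outgoing tree edge of x, which is slower than the bound in Lemma 3.3.7. The function names and the adjacency-list format are assumptions made for this illustration only.

```python
import heapq
from math import inf

def dijkstra(adj, src, banned=None):
    """Return (dist, first): first[v] is the vertex following src on the path found to v.
    adj maps every vertex to a list of (neighbor, non-negative weight) pairs."""
    dist = {v: inf for v in adj}
    first = {v: None for v in adj}
    dist[src] = 0
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry
        for v, w in adj[u]:
            if (u, v) == banned:
                continue                  # edge excluded from the graph
            if d + w < dist[v]:
                dist[v] = d + w
                first[v] = v if u == src else first[u]
                heapq.heappush(heap, (d + w, v))
    return dist, first

def second_distances(adj, x):
    """For each y reachable from x, the weight of a shortest x-y path that avoids
    the first edge (x, a) of the tree path to y (naive replacement for fast-exclude)."""
    dist, first = dijkstra(adj, x)
    second = {}
    for a in {f for f in first.values() if f is not None}:
        alt, _ = dijkstra(adj, x, banned=(x, a))
        for y, f in first.items():
            if f == a:
                second[y] = alt[y]
    return dist, second
```

For a source of out-degree d this costs d runs of Dijkstra, so it only illustrates the specification of the output, not the running time claimed in the lemma.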
This leads to the following algorithm for 2-APSiSP. Its time bound in The-
orem 3.2.3, part (i) follows from Lemma 3.3.7 and the time bound for Compute-
APSiSP given in Section 3.3.1.
The space bound is O(n2 ) since the Q2 sets contain O(n2 ) paths and the call
to Compute-APSiSP takes O(n2 ) space.
3.3.2.2 Computing Qk for k ≥ 3
Our algorithm will use the following types of sets. For each vertex x ∈ V , let Ix be
the set of incoming edges to x. Also, for a vertex x ∈ V , and vertices a, y ∈ V − {x},
let Pk∗x (a, y) be the set of k simple shortest paths from a to y in G − Ix , the graph
obtained after removing the incoming edges to x. Recall that we maintain all P ∗
and Q sets as sorted arrays.
Algorithm APSiSP(G, k) first computes the sets $P^{*x}_{k-1}(a, y)$, for all vertices
a, y ∈ V . Then it computes each Qk (x, y) as the set of all paths in the set $P^{*}_{k-1}(x, y)$,
together with a shortest path in $\bigcup_{(x,a) \text{ outgoing from } x} \{(x, a) \circ p \mid p \in P^{*x}_{k-1}(a, y)\}$
(which is not present in $P^{*}_{k-1}(x, y)$).
Lemma 3.3.8. Algorithm APSiSP (G, wt, k) correctly computes the sets Pk∗ (x, y)
∀x, y ∈ V .
Proof of Theorem 3.2.3, part (iii). The for loop starting in Step 4 is executed n
times, and for k = 3 the cost of each iteration is dominated by the call to Algorithm
2-APSiSP in Step 6, which takes O(mn + n2 log n) time. This contributes O(mn2 +
n3 log n) to the total running time. The inner for loop starting in Step 7 is executed
n times per iteration of the outer for loop, and the cost of each iteration is O(k +dx ).
Summing over all x ∈ V , this contributes O(kn2 + mn) to the total running time.
Step 13 runs in O(n2 log n) time as shown in Section 3.3.1. Thus, the total running
time is O(mn2 + n3 log n).
The space bound for APSiSP is O(k 2 · n2 ), as the $P^{*}_{k-1}$ and Qk sets contain
O(kn2 ) paths, and the recursive call to APSiSP(G − Ix , wt, k − 1) needs to maintain
the $P^{*}_{r-1}$ and Qr sets at each level of recursion. The call to Compute-APSiSP
takes O(kn2 ) space as noted earlier.
k-SiSC. This is the problem of generating the k simple shortest cycles through a
specific vertex z in G. We can reduce this problem to k-SiSP by forming G′z , where
we replace vertex z by vertices zi and zo in G′z , we place a directed edge of weight 0
from zi to zo , and we replace each incoming edge to (outgoing edge from) z with an
incoming edge to zi (outgoing edge from zo ) in G′z . Then the k-th simple shortest
path from zo to zi in G′z can be seen to correspond to the k-th simple shortest
cycle through z in G. This gives an O(k · (mn + n2 log log n)) time algorithm for
computing k-SiSC using [41]. We also observe that we can solve k-SiSP from s to t
in G if we have an algorithm for k-SiSC: create G′ by adding a new vertex x∗ and
zero weight edges (x∗ , s), (t, x∗ ), and then call k-SiSC for vertex x∗ . Thus k-SiSP
and k-SiSC are equivalent in complexity in weighted directed graphs.
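The vertex-splitting construction of G′z can be sketched in a few lines of Python. The edge-list format, the labels (z, "in") and (z, "out") standing for zi and zo, and the small Bellman-Ford helper used to check the correspondence are all assumptions of this illustration, not part of the text.

```python
def split_vertex(edges, z):
    """edges: list of (u, v, w). Returns (edges of G'_z, z_out, z_in)."""
    z_in, z_out = (z, "in"), (z, "out")
    new_edges = [(z_in, z_out, 0)]            # zero-weight edge from z_i to z_o
    for u, v, w in edges:
        tail = z_out if u == z else u         # outgoing edges of z now leave z_o
        head = z_in if v == z else v          # incoming edges of z now enter z_i
        new_edges.append((tail, head, w))
    return new_edges, z_out, z_in

def shortest_distance(edges, s, t):
    """Plain Bellman-Ford, used here only to check the construction on small inputs."""
    verts = {u for u, _, _ in edges} | {v for _, v, _ in edges}
    dist = {v: float("inf") for v in verts}
    dist[s] = 0
    for _ in range(len(verts) - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist[t]
```

On the graph with edges 0→1 (weight 2), 1→0 (weight 3) and 1→2 (weight 1), splitting vertex 0 yields a shortest path from zo to zi of weight 5, which matches the only cycle through vertex 0.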
k-ANSiSC. This is the problem of generating k simple shortest cycles that pass
through a given vertex x, for every vertex x ∈ V . For k = 1 this problem can be
solved in O(mn + n2 log log n) time by computing APSP [93]. For k = 2, we can
reduce this problem to k-APSiSP by forming the graph G′ where for each vertex x,
we replace vertex x in G by vertices xi and xo in G′ , we place a directed edge of
weight 0 from xi to xo , and we replace each edge (u, x) in G by an edge (uo , xi ) in
G′ (and hence we also replace each edge (x, v) in G by an edge (xo , vi ) in G′ ). For
k > 2, k-ANSiSC can be computed in O(k ·n·(mn+n2 log log n)) time by computing
k-SiSC for each vertex. This leads to the following theorem.
Theorem 3.3.9. Let G be a directed graph with non-negative edge weights. Then,
(i) k-SiSC can be computed in O(k · (mn + n2 log log n)) time, the same time as
k-SiSP.
(ii) 2-ANSiSC can be computed in O(mn+n2 log n) time, and for k > 2, k-ANSiSC
can be computed in O(k · n · (mn + n2 log log n)) time, the same time as n
applications of k-SiSP.
3.4 Enumerating Simple Shortest Cycles (k-All-SiSC)
In this section we give a method to generate each successive simple shortest cycle in
G (k-All-SiSC) in Õ(m · n) time. For enumerating simple paths in non-decreasing
order of weight (k-All-SiSP), we give a faster method in Section 3.5 that uses again
a path extension method, different from the one used in Section 3.3.1.
Let the input graph be G = (V, E). From G we form the graph G′ = (V′ , E′ )
as in the construction for k-ANSiSC in Section 3.3.3. We then proceed as follows.
We assume the vertices are numbered 1 through n. Our algorithm for k-All-SiSC
maintains an array A[1..n], where each A[j] contains a triple (ptrj , wj , kj ); here ptrj
is a pointer to the shortest cycle, not yet generated, that contains j as the minimum
vertex (if such a cycle exists), wj is the weight of this cycle, and kj is the number
of shortest simple cycles through vertex j that have already been generated. (Note
that any given cycle is assigned to exactly one position in array A.)
Initially, we compute the entry for each A[j] by running Dijkstra’s algorithm
with source jo on the subgraph G′j of G′ induced on V′j = {xi , xo | x ≥ j}, to find a
shortest path p from jo to ji ; we then initialize A[j] with a pointer to the cycle in G
associated with p, and with its weight, and with kj = 0.
For each k ≥ 1, we generate the k-th simple shortest cycle in G by choosing a
minimum weight cycle in array A. Let this entry be in A[r] and let κ = kr . We then
compute the (κ + 1)-th shortest cycle through vertex r by computing the (κ + 1)-th
shortest simple path from vertex ro to vertex ri in G′ using a k-SiSP algorithm. The
entry for A[r] is now updated to a pointer to this newly computed simple cycle and
its weight, and kr is updated to κ + 1.
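The initialization of array A can be sketched as follows. This is a simplified illustration under stated assumptions: it works directly on G rather than on the split graph G′j (running Dijkstra from j on the subgraph induced on vertices x ≥ j and closing the cycle with an edge back into j, which yields the same weights), it stores only the weight wj and counter kj rather than a pointer to the cycle, and the function names are hypothetical.

```python
import heapq
from math import inf

def shortest_cycle_with_min_vertex(n, edges, j):
    """Weight of a shortest cycle whose minimum vertex is j (vertices are 1..n); inf if none."""
    adj = {v: [] for v in range(j, n + 1)}
    closing = []                               # edges (u, j) that can close a cycle at j
    for u, v, w in edges:
        if u >= j and v >= j:                  # keep only the subgraph induced on {j..n}
            if v == j:
                closing.append((u, w))
            else:
                adj[u].append((v, w))
    dist = {v: inf for v in adj}
    dist[j] = 0
    heap = [(0, j)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return min((dist[u] + w for u, w in closing), default=inf)

def init_cycle_table(n, edges):
    """A[j] = (w_j, k_j) with k_j = 0, one entry per vertex j."""
    return {j: (shortest_cycle_with_min_vertex(n, edges, j), 0) for j in range(1, n + 1)}
```

The first cycle output is then the entry of minimum weight, for example `min(A, key=lambda j: A[j][0])`; generating later cycles requires the k-SiSP computation described above and is not covered by this sketch.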
The time bound in Theorem 3.2.4 is seen by noting that the initialization
takes O(mn+n2 log n) for the n calls to Dijkstra’s algorithm. Thereafter, we generate
each new cycle in the slightly faster APSP time bound of O(mn + n2 log log n)
with the k-SiSP algorithm in [41], by maintaining the relevant information from
the computation of earlier cycles. We now show that this time bound is optimal
with respect to the Minimum Weight Cycle problem by showing that the problem of
generating the k-th simple shortest cycle in a graph after the first k − 1 cycles have
been generated is at least as hard as the Min-Wt-Cyc problem.
corresponds to the minimum weight cycle in G.
As the number of vertices and edges in G′ are linear in the number of vertices
and edges, respectively, in G, we get the desired result.
A similar algorithm can generate successive simple shortest paths. But in the
next section, we present a faster algorithm for this problem.
Our algorithm for k-All-SiSP is inspired by the method in [25] for fully dynamic
APSP. With each path π we will associate two sets of paths L(π) and R(π) as
described below. Similar sets are used in [25] for ‘locally shortest paths’ but here
they have a different use.
Left and right extensions. Let P be a collection of simple paths. For a simple path
πxy from x to y in P, its left extension set L(πxy ) is the set of simple paths π′ ∈ P
such that π′ = (x′, x) ◦ πxy , for some x′ ∈ V . Similarly, the right extension set
R(πxy ) is the set of simple paths π″ = πxy ◦ (y, y′) such that π″ ∈ P. For a trivial
path π = ⟨v⟩, L(π) is the set of incoming edges to v, and R(π) is the set of outgoing
edges from v.
Algorithm All-SiSP generates all simple shortest paths in G in non-decreasing
order of weight. To generate the k shortest simple paths in G, we can terminate the
while loop after k iterations. Algorithm All-SiSP initializes a priority queue H
with the edges in G, and it initializes the extension sets for the vertices in G. In
each iteration of the main loop, the algorithm extracts the minimum weight path
π in H as the next simple path in the output sequence. It then generates suitable
extensions of π to be added to H as follows. Let the first edge on π be (x, a) and
the last edge (b, y). Then, All-SiSP left extends π along those edges (x′, x) such
that there is a path πx′b in L(ℓ(π)); it also requires that x′ ≠ y, since extending to
x′ = y would create a cycle in the path. It forms similar extensions to the right in the
for loop starting at Step 15.
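The path extension idea described above can be illustrated with a compact Python sketch. It is not the dissertation's pseudocode: paths are stored as explicit vertex tuples rather than as pointers to subpaths, a binary heap from `heapq` is used, and a `seen` set guards against inserting the same path twice (an addition made for this sketch). Edge weights are assumed non-negative and the graph is assumed to have no parallel edges.

```python
import heapq

def all_sisp(edges, k=None):
    """Return simple paths as (weight, vertex_tuple) in non-decreasing order of weight.
    edges: list of (u, v, w); k: optional cap on the number of paths returned."""
    wt = {(u, v): w for u, v, w in edges}
    L, R = {}, {}                 # left/right extension sets, keyed by vertex tuple
    heap, seen, out = [], set(), []

    def push(path, weight):
        if path in seen:          # dedup guard (assumption of this sketch)
            return
        seen.add(path)
        heapq.heappush(heap, (weight, path))
        L.setdefault(path[1:], set()).add(path)    # path is a left extension of r(path)
        R.setdefault(path[:-1], set()).add(path)   # path is a right extension of l(path)

    for (u, v), w in wt.items():
        if u != v:                # self-loops are not simple paths
            push((u, v), w)

    while heap and (k is None or len(out) < k):
        weight, pi = heapq.heappop(heap)
        out.append((weight, pi))
        x, y = pi[0], pi[-1]
        for p in list(L.get(pi[:-1], ())):         # p = (x', x, ..., b)
            xp = p[0]
            if xp != y:                            # x' = y would close a cycle
                push((xp,) + pi, wt[(xp, x)] + weight)
        for q in list(R.get(pi[1:], ())):          # q = (a, ..., y, y')
            yp = q[-1]
            if yp != x:                            # y' = x would close a cycle
                push(pi + (yp,), weight + wt[(y, yp)])
    return out
```

Storing full tuples costs space proportional to path length per path; the pointer representation mentioned in the text avoids that, so this sketch should be read only as a functional illustration of which extensions are formed.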
Proof. Since edge weights are non-negative, the first path generated by Algorithm 4
is a minimum weight edge inserted in Step 4, which is a simple path. Assume the
algorithm generates a path with a cycle, and let σ be the first path extracted in
Step 8 that contains a cycle. Let (x′, a) and (b, y) be the first and last edges on
σ. Since σ contains a cycle, it contains at least two edges so (x′, a) and (b, y) are
distinct edges.
Consider the step when the non-simple path σ is placed on H. This does not
occur in Step 4 since σ contains at least two edges. So σ is placed on H in some
iteration of the while loop. Let π be the path extracted from H in this iteration; π
is a simple path by assumption since it was extracted from H before σ. Then σ is
added to H either as a left extension of π (in Step 12) or as a right extension of π
in a step complementary to Step 12 in the for loop in Step 15.
Consider the left extension case, and let σ be formed when processing path
πx′b ∈ L(ℓ(π)) with x′ ≠ y in Step 11. Thus σ is formed as (x′, x) ◦ π in Step 12.
But (x′, x) ◦ π = (x′, x) ◦ ℓ(π) ◦ (b, y) = πx′b ◦ (b, y). Since πx′b ∈ L(ℓ(π)), it was also
placed in H in either Step 4 or Step 12. And as wt(πx′b ) < wt(σ), the path πx′b
is simple. Since πx′b is simple, a cycle can be formed in σ only if x′ = y. But this
is specifically forbidden in the condition in Step 11. A similar argument applies to
right extensions added to H in Step 15. Hence σ is a simple path, and Algorithm 4
does not generate any path containing a cycle.
Proof. Clearly the algorithm correctly generates the minimum weight edge in G as
the minimum weight simple path in the output in the first iteration of the while
loop. By Lemma 3.5.1 all generated paths are simple. Also, these simple paths are
generated in non-decreasing order of weight since any path added to H in Steps
12 and 15 has weight at least as large as the weights of the paths that have been
extracted at that time, due to non-negative edge-weights. It remains to show that
no simple path in G is omitted in the sequence of simple paths generated.
Suppose the algorithm fails to generate all simple shortest paths in G and
let π be a simple path of smallest weight that is not generated by Algorithm 4. Let
π be a path with first edge (x, a) and last edge (b, y); (x, a) ≠ (b, y) since all single
edge paths are added to H in Step 4, and will be extracted in a future iteration. Let
πab be the subpath of π from a to b. By assumption, the paths πxb = ℓ(π) and
πay = r(π) are placed in the output by Algorithm 4 since they are simple paths with
weight smaller than the weight of π. Without loss of generality assume that πxb was
extracted from H before πay .
Clearly, πxb was inserted in H before πay was extracted. In the iteration
of the while loop when πxb was added to H, πxb was added to L(πab ) in Step 13
since r(πxb ) = πab . In the later iteration when πay was extracted from H, the
paths in L(ℓ(πay )) are considered in Step 12. But ℓ(πay ) = πab . When the paths in
L(ℓ(πay )) = L(πab ) are considered in Step 11 during the processing of πay , the path
πxb will be one of the paths processed, and in Step 12 the path (x, a) ◦ πay = π will
be formed and added to H. Thus π will be added to H, and hence will be extracted
and added to the output sequence.
Proof of Theorem 3.2.5. We will maintain paths with pointers to their left and right
subpaths, so each path takes O(1) space. For the amortized bound we will implement
H as a Fibonacci heap. The initialization takes O(m) time. Each L and R set can
contain at most n−2 paths, and further, since extensions are formed only with paths
already in H, each of these sets has size at most min{k, n−2}. The k-th iteration of the while
loop takes time O(log |H|) for the extract-min operation, and O(min{k, n}) time for
the processing of the L and R sets. At the start of the k-th iteration, the number of
paths in H is at most O(m+k·min{k, n}), and since m = O(n2 ), log |H| = O(log(n+
k)). Hence the amortized time for the k-th iteration is O(min{k, n} + log(n + k)).
For the worst-case bound we will use a binary heap. Then, the initializa-
tion takes O(m) time to build a heap on the m edges, and the k-th iteration costs
O(min{k, n} · log(n + k)) for the heap operations.
We have presented a new algorithm for the problem of generating k simple short-
est paths for every pair of vertices in a weighted directed graph (k-APSiSP). This
algorithm is of special interest since it is the first algorithm that does not use the
‘detour finding’ technique for computing multiple simple shortest paths. In fact, all
previous algorithms known for finding multiple simple shortest paths, replacement
paths, and distance sensitivity oracles find the solution by computing ‘detours’. In
contrast, we have introduced a novel path extension method. We have then consid-
ered the problem of enumerating simple cycles in the graph in non-decreasing order
of their weights (k-All-SiSC), and we have given an algorithm that generates each
successive simple cycle in Õ(mn) time. Finally, we have used a different path exten-
sion technique to obtain a very efficient algorithm to generate the k simple shortest
paths in the entire graph (k-All-SiSP).
Our k-All-SiSP algorithm is nearly optimal if the paths need to be output. It
is also not difficult to see that our bounds for 2-APSiSP and k-All-SiSC (for constant
k) are the best possible to within a polylog factor for sparse graphs unless the long-
standing Õ(mn) bounds for APSP and minimum weight cycles are improved. In
recent work [6] we give several fine-grained reductions that demonstrate that the
minimum weight cycle problem holds a central position for a class of problems that
currently have an Õ(mn) time bound on sparse graphs, both directed and undirected.
For undirected graphs, our k-All-SiSP algorithm gives an algorithm with the
same bound. Also, our k-APSiSP algorithm works for undirected graphs, and this
gives a faster algorithm for k = 2 and matches the previous best bound for k = 3
(using [57]). However, our algorithms for the three variants of finding simple shortest
cycles do not work for undirected graphs. This is addressed in the work presented
in Chapter 2, where the fine-grained reductions also give new algorithms for finding
shortest cycles in undirected graphs.
We conclude with two avenues for further research.
1. The main open question for k-APSiSP is to come up with faster algorithms
to compute the Qk (x, y) sets for larger values of k. This is the key to a faster
k-APSiSP algorithm using our approach, for k > 2.
2. The space requirements of our algorithms are high. Can we come up with
space-efficient algorithms that match our time bounds?
Part II
Distributed Results
Chapter 4
4.1 Introduction
In the previous chapters (Chapters 2 and 3) we studied the shortest path problems in
the sequential setting. In Chapters 5, 6 and 7, we study the shortest paths problem
in the distributed setting, specifically the weighted all pairs shortest path (APSP)
problem.
The design of distributed algorithms for various network (or graph) problems
such as shortest paths [65, 70, 28, 50] and minimum spanning tree [36, 75, 38, 59]
is a well-studied area of research. The most widely considered model for studying
distributed algorithms is the Congest model [73] (also see [28, 50, 49, 65, 70, 39]),
described in more detail below in Section 4.2. In Chapters 5, 6 and 7, we consider
the problem of computing all pairs shortest paths (APSP) in a weighted directed (or
undirected) graph in this model.
The problem of computing all pairs shortest paths (APSP) in distributed
networks is a very fundamental problem, and there has been a considerable line of
work for the Congest model as described later. However, for a weighted graph no
deterministic algorithm was known in this model other than a trivial method that
runs in n2 rounds. In Chapter 5 we present the first algorithm for this problem in
the Congest model that computes weighted APSP deterministically in less than
n2 rounds. Our algorithm computes APSP deterministically in O(n3/2 · √(log n))
rounds in this model in both directed and undirected graphs. We follow up on this
result with an improved Õ(n4/3 ) rounds deterministic APSP algorithm described in
Chapter 7. In Chapter 6 we present a deterministic APSP algorithm that improves
the round complexity for moderate integer edge weights.
Our APSP algorithms in Chapters 5, 6 and 7 follow the general 3-phase
strategy initiated by Ullman and Yannakakis [87] for parallel computation of path
problems in directed graphs:
1. Compute h-hop shortest paths for each source for a suitable value of h. (An
h-hop path is a path that contains at most h edges.)
2. Find a small blocker set Q that intersects all paths computed in Step 1. (With
randomization, this step is very simple: a random sample of the vertices of size
O((n/h) · log n) satisfies this property w.h.p. in n.)
3. Compute shortest paths between all pairs of vertices in Q, and using this
information and the h-hop trees from Step 1, compute the APSP output at
each node in V .
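The parenthetical claim in Step 2 can be checked with a short calculation. As a sketch, assume the sample S consists of s vertices drawn independently and uniformly at random with replacement (an assumption made here for simplicity, not a detail taken from the text), and fix one path π containing at least h distinct vertices. Here c is a hypothetical constant controlling the sample size.

```latex
\Pr[S \cap \pi = \emptyset] \;\le\; \left(1 - \frac{h}{n}\right)^{s} \;\le\; e^{-sh/n},
\qquad
s = c \cdot \frac{n}{h} \ln n \;\Longrightarrow\; \Pr[S \cap \pi = \emptyset] \;\le\; n^{-c}.
```

A union bound over at most n² such paths (one per source-destination pair) then bounds the failure probability by n^{2−c}, which is polynomially small for any constant c > 2.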
Congest directed APSP algorithms that fall in this framework include the
randomized algorithm in Huang et al. [50] that runs in Õ(n5/4 ) rounds for polynomial
integer edge-weights, the deterministic algorithm presented in Chapter 5 for
arbitrary edge-weights, the deterministic algorithm in Chapter 6 that improves
on the result in Chapter 5 for moderate integer edge-weights, and the algorithm
in Chapter 7 that improves the APSP round complexity for arbitrary edge weights.
We now describe the Congest model for which we propose our APSP algo-
rithms.
We compare our results for distributed APSP with other results in Table 4.1.
Table 4.1: Table comparing our results for non-negative edge-weighted graphs (in-
cluding zero edge weights) with previous known results. Here W is the maximum
edge weight and ∆ is the maximum weight of a shortest path in G. Arb. stands for
arbitrary edge weights and Int. stands for integer edge weights. Rand. stands for
randomized algorithm and Det. stands for deterministic algorithm. Dir. stands for
directed graphs and Undir. stands for undirected graphs.
Prior Work.
pectation for derandomizing an algorithm for computing Maximal Independent Set
(MIS) in the distributed setting. In Section 7.3.2 we instead use a linear-sized sample
space for generating pairwise independent random variables and then use an aggregation
of suitable parameters of sample point values to derandomize our randomized
blocker set algorithm.
Chapter 5
5.1 Introduction
In this chapter we describe our Õ(n3/2 ) rounds deterministic weighted APSP algorithm
in the Congest model. Our distributed APSP algorithm is quite simple and
we give an overview in Section 5.2. It uses the notion of a blocker set introduced
by King [58] in the context of sequential fully dynamic APSP computation. Our de-
terministic distributed algorithm for computing a blocker set is the most nontrivial
component of our algorithm, and is described in Section 5.3.
message is of O(log n)-bit size, which restricts w(e) to be an O(log n) size integer
value. However, outside of this restriction imposed by the Congest model, our
algorithm works for arbitrary edge-weights (even negative edge-weights as long as
there is no negative-weight cycle). Given a path p we will use weight or distance to
denote the sum of the weights of the edges on the path and length (or sometimes
hops) to denote the number of edges on the path. We denote the shortest path
distance from a vertex x to a vertex y in G by δ(x, y). In the following we will
assume that G is directed, but the same algorithm works for undirected graphs as
well.
An h-hop shortest path from a source s to a vertex v is the minimum weight
path from s to v with at most h hops. In the case of multiple paths with the same
weight from s to v we assume that v chooses the path with its parent vertex of
minimum id. We will use h = √(n · log n) in our algorithm.
Our overall APSP algorithm is given in Algorithm 1.
In Step 1 the h-hop SSSPs along with the h-hop shortest path distances,
δh (x, v), are computed at every vertex v for each source x ∈ V . These paths can be
easily converted to form a rooted tree at x by first computing 2h-hop shortest paths
and then just extracting out the first h-hop paths.
Step 2 computes a blocker set Q of q = Θ((n log n)/h) nodes for the collection
of h-hop SSSPs constructed in Step 1. This step is described in detail in Section 5.3,
where we describe a distributed implementation of King’s sequential method [58].
Our method computes the blocker set Q in O(nh + (n2 log n)/h) rounds. We now
give the definition of a blocker set for a collection of rooted h-hop trees.
Definition 5.2.1 (Blocker Set [58]). Let H be a collection of rooted h-hop trees in
a graph G = (V, E). A set Q ⊆ V is a blocker set for H if every root to leaf path
of length h in every tree in H contains a vertex in Q. Each vertex in Q is called a
blocker vertex for H.
Lemma 5.2.2. The δ(x, v) values computed at each v in Step 5 of Algorithm 1 are
the correct shortest path distances.
Proof. Fix vertices x, v and consider a shortest path p from x to v. If p has at most
h edges then w(p) = δh (x, v) and this value is directly computed at v in Step 1.
Otherwise by the property of the blocker set Q we know that there is a vertex c ∈ Q
which lies along p within the h-hop SSSP tree rooted at x that is constructed in
Step 1. Let p1 be the portion of p from x to c and let p2 be the portion from c to v.
So w(p1 ) = δh (x, c), w(p2 ) = δ(c, v) and w(p) = w(p1 ) + w(p2 ).
The value δh (x, c) is received by v in the broadcast step for center c in Step 4.
The value δ(c, v) is computed at v when SSSP with root c is computed in Step 3.
Hence v has the information needed to compute δ(x, v) in Step 5 for each x using
Equation 5.1.
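The combination step argued in the proof can be simulated sequentially. The sketch below is an illustration under assumptions: it takes the rule used at v to be δ(x, v) = min(δh (x, v), min over c ∈ Q of δh (x, c) + δ(c, v)), as read off from the proof; it replaces all distributed communication by direct array access; and it takes the blocker set Q as an input rather than computing it. Function names are hypothetical.

```python
from math import inf

def hop_limited(n, edges, s, h):
    """delta_h(s, .): minimum weight over paths from s with at most h edges
    (h synchronous rounds of Bellman-Ford relaxation)."""
    dist = [inf] * n
    dist[s] = 0
    for _ in range(h):
        nxt = dist[:]                       # values from the previous round only
        for u, v, w in edges:
            if dist[u] + w < nxt[v]:
                nxt[v] = dist[u] + w
        dist = nxt
    return dist

def combine(n, edges, h, Q):
    """Sequential stand-in for Steps 1, 3, 4 and 5, given a blocker set Q."""
    dh = [hop_limited(n, edges, x, h) for x in range(n)]        # Step 1
    dfull = {c: hop_limited(n, edges, c, n - 1) for c in Q}     # Step 3
    # Steps 4-5: each v combines delta_h(x, c) with delta(c, v) for every c in Q
    return [[min([dh[x][v]] + [dh[x][c] + dfull[c][v] for c in Q])
             for v in range(n)] for x in range(n)]
```

On the unit-weight path 0→1→2→3 with h = 2 and Q = [2], the only pair farther than two hops apart is (0, 3), and it is recovered as δh (0, 2) + δ(2, 3) = 3.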
We now bound the number of rounds needed for each step in Algorithm 1
(other than Step 2). For this we first state bounds for some simple primitives that
will be used to execute these steps.
(a) the shortest path distance δ(s, v) can be computed at each v ∈ V in n rounds.
(b) the h-hop shortest path distance δh (s, v) can be computed at each v ∈ V in h
rounds.
Lemma 5.2.4. A node v can broadcast k local values to all other nodes reachable
from it deterministically in O(n + k) rounds.
Proof. We construct a BFS tree rooted at v in at most n rounds and then we pipeline
the broadcast of the k values. The root v sends the i-th value to all its children in
round i for 1 ≤ i ≤ k. In a general round, each node x that received a value in the
previous round sends that value to all its children. It is readily seen that the i-th
value reaches all nodes at hop-length d from v in the BFS tree in round i + d − 1,
and this is the only value that node x receives in this round.
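The pipelining argument can be checked with a small round-by-round simulation. This is a sketch under assumptions: the BFS tree is given as a `children` dictionary, message contents are abstracted to the value index i, and the function name is hypothetical.

```python
def pipelined_broadcast(children, root, k):
    """Simulate the pipelined broadcast of k values down a rooted tree.
    Returns arrival[(node, i)] = round in which value i (1-based) reaches node."""
    arrival = {}
    received_prev = []                    # (node, i) pairs received in the previous round
    rnd = 0
    while True:
        rnd += 1
        sending = []
        if rnd <= k:
            sending.append((root, rnd))   # the root injects value rnd in round rnd
        sending.extend(received_prev)     # every node forwards what it just received
        received_now = []
        for node, i in sending:
            for c in children.get(node, []):
                arrival[(c, i)] = rnd
                received_now.append((c, i))
        received_prev = received_now
        if rnd >= k and not received_now:
            break
    return arrival
```

In the simulation every node at depth d receives value i in round i + d − 1, so the last value reaches the deepest node in round k + depth − 1, consistent with the O(n + k) bound once the BFS tree is available.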
Lemma 5.2.5. All v ∈ V can broadcast a local value to every other node they can
reach in O(n) rounds deterministically.
Proof. This broadcast can be done in O(n) rounds in many ways, for example by
piggy-backing on an O(n) round unweighted APSP algorithm [65, 47] (and also
[49, 74] for undirected graphs) where now each message contains the value sent by
source s in addition to the current shortest path distance estimate for source s.
Lemma 5.2.6. Algorithm 1 runs in O(n · h + (n2 /h) · log n) rounds assuming Step
2 can be implemented to run within this bound.
Proof. Let the size of the blocker set be q = (n/h) · log n. Using part (b) of Lemma 5.2.3
and Lemma 6.4.2, Step 1 can be computed in O(n · h) rounds. Step 3 can be
computed in O(n · q) = O((n2 /h) · log n) rounds by part (a) of Lemma 5.2.3. Step
4 can be computed in O(n · q) = O((n2 /h) · log n) rounds by Lemma 5.2.4 (using
k = n). Finally, Step 5 involves only local computation and no communication. This
establishes the lemma.
The simplest method to find a blocker set is to choose the vertices randomly. An early
use of this method for path problems in graphs was in Ullman and Yannakakis [87]
where a random set of O(√n · log n) distinguished nodes was picked. It is readily seen
that, with high probability, some vertex in this set will intersect any path of O(√n) vertices in the graph
(and so this set would serve as a blocker set of size O((n log n)/h) for our algorithm
if h = √n). Using this observation an improved randomized parallel algorithm (in
the PRAM model) was given in [87] to compute the transitive closure. Since then
this method of using random sampling to choose a suitable blocker set has been used
extensively in parallel and dynamic computation of transitive closure and shortest
paths, and more recently, in distributed computation of APSP [50].
It is not clear if the above simple randomized strategy can be derandomized
in its full generality. However, for our purposes a blocker set only needs to intersect
all paths in the set of hop trees we construct in Step 1 of Algorithm 1. For this, a
deterministic sequential algorithm for computing a blocker set was given in King [58]
in order to compute fully dynamic APSP. This algorithm computes a blocker set of
size O((n/h) ln p) for a collection F of h-hop trees with a total of p leaves across all
trees (and hence p root to leaf paths) in an n-node graph. In our setting p ≤ n2
since we have n trees and each tree could have up to n leaves.
King’s sequential blocker set algorithm uses the following simple observation:
Given a collection of p paths each with exactly h nodes from an underlying set V
of n nodes, there must exist a vertex that is contained in at least ph/n paths. The
algorithm adds one such vertex v to the blocker set, removes all paths that are covered
by this vertex and repeats this process until no path remains in the collection. The
number of paths is reduced from p to at most (1 − h/n) · p when the blocker vertex
v is removed, hence after O((n/h) ln p) removals of vertices, all paths are removed.
Since p is at most n2 the size of the blocker set is O((n log n)/h). King’s sequential
algorithm for finding a blocker set runs in O(n2 log n) deterministic time.
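The greedy rule described above can be sketched directly on an explicit list of paths. This is a naive illustration, not King's O(n2 log n) implementation: it recounts vertex frequencies from scratch in every iteration, and it breaks ties by smallest vertex id, which is an arbitrary choice for this sketch. The function name is hypothetical.

```python
from collections import Counter

def greedy_blocker_set(paths):
    """paths: iterable of vertex sequences. Returns a list of blocker vertices
    such that every input path contains at least one of them."""
    remaining = [tuple(p) for p in paths]
    blockers = []
    while remaining:
        # number of still-uncovered paths that contain each vertex
        counts = Counter(v for p in remaining for v in set(p))
        best = max(counts.values())
        c = min(v for v, cnt in counts.items() if cnt == best)
        blockers.append(c)
        remaining = [p for p in remaining if c not in p]   # drop covered paths
    return blockers
```

Because some vertex always lies on at least a h/n fraction of the remaining paths, each iteration shrinks the collection by a factor of at most (1 − h/n), which is where the O((n/h) ln p) bound on the number of iterations comes from.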
We now describe our distributed algorithm to compute a blocker set. As in
King [58], for each vertex v in a tree Tx in the collection of trees H we define:
• scorex (v) is the number of leaves at depth h in Tx that are in the subtree
rooted at v in Tx ;
• $\mathrm{score}(v) = \sum_{x} \mathrm{score}_x(v)$.
Thus, score(v) is the number of root-to-leaf paths of length h in the collection
of trees H that contain vertex v. Initially, our distributed algorithm computes all
scorex (v) and score(v) for all vertices v ∈ V and all h-hop trees Tx in O(n·h) rounds.
Then through an all-to-all broadcast of score(v) to all other nodes for all v, all nodes
identify the vertex c with maximum score as the next blocker vertex to be removed
from the trees and added to the blocker set Q. (In case there are multiple vertices
with the maximum score the algorithm chooses the vertex of minimum id having this
maximum score. This ensures that all vertices will locally choose the same vertex
as the next blocker vertex once they have received the scores of all vertices.) We
repeat this process until all scores are zeroed out. By the discussion above (and as
observed in [58]) we will identify all the vertices in Q in O((n · log n)/h) repeats of
this process.
What remains is to obtain an O(n) round procedure to update the score and
scorex values at all nodes each time a vertex c is removed so that we have the correct
values at each node for each tree when the leaves covered by c are removed from the
tree.
If a vertex v is a descendant of the removed vertex c in Tx then all paths in
Tx that pass through v are removed when c is removed and hence scorex (v) needs to
go down to zero for each such tree Tx where v is a descendant of the chosen blocker
node c. In order to facilitate an O(n)-round computation of these updated scorex
values in each tree at all nodes that are descendants of c, we initially precompute
at every node v a list Ancx (v) of all its ancestors in each tree Tx . This is computed
in O(n · h) rounds using our Ancestors algorithm (Algorithm 4). Thereafter, each
time a new blocker vertex c is selected to be removed from the trees and added to
Q, it is a local computation at each node v to determine which of the Ancx (v) sets
at v contain c and to zero out scorex (v) for each such x.
The other type of vertices whose scores change after a vertex c is removed are
the ancestors of c in each tree. If v is an ancestor of c in Tx then after c is removed
scorex (v) needs to be reduced by scorex (c) (i.e., c’s score before it was removed and
added to Q) since these paths no longer need to be covered by v. For these ancestor
updates we give an O(n)-round algorithm that runs after the addition of each new
blocker node to Q and correctly updates the scores for these ancestors in every tree
(Algorithm 6). These algorithms together give the overall deterministic algorithm
(Algorithm 2) for the computation of the blocker set Q in $O(n \cdot h + (n^2 \log n)/h)$
rounds. We present an improved blocker set algorithm that runs in $O(nq + \sqrt{\Delta h k})$
rounds in Section 6.3 and another one that runs in Õ(nh) rounds in Section 7.3.
We now give the details of our algorithms. Recall that we use the h-hop
CSSSP algorithm (described in Section 6.4) for Step 1 in Algorithm 1. Hence after
that step, for each tree Ts rooted at s every node v in the tree knows its shortest
path distance from s, δ(s, v), its hop length hs (v) and its parent node in Ts . We also
determine for each node its children in Ts . We can compute this in one round for
each Ts by having each node send its child status to its parent. Thus after n rounds
all nodes know all their children in every tree Ts .
Algorithm 2 Compute-Blocker
Input: h-hop CSSSP Collection of all h-hop trees Tx ; Output: set Q
1: Initialization [lines 2-6]:
2: Run Algorithm 3 to compute scores for all v ∈ V
3: For each Tx compute the ancestors of each vertex v in Tx in Ancx (v) using
Algorithm 4
4: for each v ∈ V do
5: Local Step: $\mathrm{score}(v) \leftarrow \sum_{x \in V} \mathrm{score}_x(v)$
6: broadcast score(v) to all nodes in V (using Lemma 5.2.5)
7: Add blocker vertices to blocker set Q [lines 8-12]:
8: while there is a node c with score(c) > 0 do
9: for each v ∈ V do
10: Local Step: select the node c with max score as next vertex in Q
11: Run Algorithms 5 and 6 to update scorex (v) for each x ∈ V and score(v)
12: broadcast score(v) to all nodes in V and receive score(x) from all other
nodes x
for computing the exact weighted APSP.
Step 2 of Algorithm 2 executes Algorithm 3 to compute all the initial scores
at all nodes v. Step 3 involves running Algorithm 4 for pre-computing ancestors of
each node in every Tx . Step 5 is a local computation (no communication) where
all nodes v compute their total score by summing up the scores for all trees Tx to
which they belong. And in Step 6, each node v broadcasts its score value to all other
nodes.
The while loop in Steps 8-12 of Algorithm 2 runs as long as there is a node
with positive score. In Step 10, the node with maximum score is selected as the
vertex c to be added to Q (and if there are multiple nodes with the maximum score,
then among them the node with the minimum ID is selected, so that the same node
is selected locally at every vertex). In Step 11, after blocker vertex c is selected, each
node v checks whether it is a descendant of c in each Tx and if so updates its score for
that tree using Algorithm 5. This is followed by an execution of Algorithm 6 which
updates the scores at each node v for each tree Tx in which v is an ancestor of c.
Then in Step 12, all the nodes broadcast their score to all other nodes so that they
can all select the next vertex to be added to Q. This leads to the following lemma,
assuming the results shown in the next section.
Lemma 5.3.1. Algorithm 2 correctly computes the blocker set Q in O(n · h + n · |Q|)
rounds.
Proof. Step 2 runs in O(n · h) rounds (by Lemma 5.3.2) and so does Step 3 (see
Lemma 5.3.3). Step 5 is a local computation and the broadcast in Step 6 runs in
O(n) rounds by Lemma 5.2.5.
The while loop starting in Step 8 runs for |Q| iterations since a new blocker
vertex is added to Q in each iteration. In each iteration, Step 10 is a local compu-
tation as is the execution of Algorithm 5 in Step 11. Algorithm 6 in Step 11 runs in
O(n) rounds (Lemma 5.3.6). The all-to-all broadcast in Step 12 is the same as the
initial all-to-all broadcast in Step 6 and runs in O(n) rounds. Hence each iteration
of the while loop runs in O(n) rounds giving the desired bound.
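As a worked illustration (our own arithmetic, not a statement taken from the proof above), substituting the blocker set size |Q| = O((n log n)/h) into the bound of Lemma 5.3.1 and balancing the two terms suggests the choice of h behind the round bound quoted for this chapter. The sketch assumes the remaining steps of the overall APSP algorithm do not dominate.

```latex
\begin{align*}
O(n \cdot h + n \cdot |Q|) &= O\!\left(n h + \frac{n^2 \log n}{h}\right)
  && \text{since } |Q| = O\!\left(\frac{n \log n}{h}\right), \\
n h = \frac{n^2 \log n}{h} &\iff h = \sqrt{n \log n}, \\
\text{giving}\quad O\!\left(n \sqrt{n \log n}\right) &= O\!\left(n^{3/2} \sqrt{\log n}\right) \text{ rounds.}
\end{align*}
```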
In this section we give the details of our algorithms for computing initial scores
(Algorithm 3) and for updating these scores values once a blocker vertex c is selected
and added to the blocker set Q (Algorithms 4-6).
Algorithm 3 gives the procedure for computing the initial scores for a node
v in a tree Tx . In Step 1 each leaf node at depth h initializes its score for Tx to 1
and all other nodes set their initial score to 0. In a general round r > 0, nodes with
hx (v) = h + 1 − r send out their scores to their parents, and each node with hx (v) = h − r
receives the scores from its children in Tx and sets its score equal to the sum
of these received scores (Steps 5-9).
Lemma 5.3.2. Algorithm 3 computes the initial scores for every node v in Tx in
O(h) rounds.
Proof. The leaves at depth h correctly initialize their score to 1 locally in Step 1.
Since we only consider paths of length h from the root x to a leaf, it is readily
seen that a node v that is hx (v) hops away from x in Tx will receive scores from its
children in round h − hx (v) and thus will have the correct scorex (v) value to send
in Step 3.
For every x ∈ V , every node v ∈ Tx will run this algorithm to compute their
score in Tx . Since every run of Algorithm 3 for a given x takes h rounds, all the
initial scores can be computed in O(n · h) rounds.
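The bottom-up computation for a single tree can be simulated sequentially as below. This is a sketch under assumed data structures: the dictionaries `parent` and `depth` are hypothetical stand-ins for the parent pointers and hop lengths each node holds after Step 1, and one loop iteration plays the role of one communication round.

```python
def initial_scores(parent, depth, h):
    """Bottom-up score computation in one h-hop tree (sequential simulation).

    parent: dict node -> parent (root maps to None)
    depth:  dict node -> hop depth in the tree
    Returns score[v] = number of depth-h leaves in the subtree rooted at v.
    """
    # Leaves at depth exactly h start with score 1, every other node with 0.
    score = {v: 1 if depth[v] == h else 0 for v in parent}
    for r in range(1, h + 1):
        # Round r: nodes at depth h + 1 - r send their score to their parent.
        for v in parent:
            if depth[v] == h + 1 - r and parent[v] is not None:
                score[parent[v]] += score[v]
    return score

tree_parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 0}
tree_depth = {0: 0, 1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 1}
print(initial_scores(tree_parent, tree_depth, h=2))
```

Node 6 is a leaf at depth 1 < h, so it contributes nothing: only root-to-leaf paths with exactly h hops are counted.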
Proof. We show that every node v correctly computes all its ancestors in Tx in the
set Ancx (v) using induction on round r. We show that by round r, every node v has
added all its ancestors that are at most r hops away from v.
If r = 1, then v’s parent in Tx (say y) would have sent its ID to v in
Step 5 and v would have added it to Ancx (v) in Step 11.
Assume that every node v has already added all ancestors in the set Ancx (v)
that are at most r − 1 hops away from v.
Let u be the ancestor of v in Tx that is exactly r hops away from v. Then by
induction, u ∈ Ancx (y) since u is exactly r − 1 hops away from y and thus y must
have sent u’s ID to v in round r in Step 8 and hence v would have added u to its
set Ancx (v) in round r in Step 11.
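The induction in this proof can be mirrored by a round-by-round simulation for one tree. The sketch below is an assumption-laden illustration rather than Algorithm 4 itself: `parent` is a hypothetical dictionary of parent pointers, and in round r each node forwards to its children the ancestor it learned in round r − 1 (or its own id in round 1).

```python
def compute_ancestors(parent, h):
    """Round-by-round ancestor propagation in one tree (sequential simulation).

    parent: dict node -> parent (root maps to None)
    Returns anc[v] = list of ancestors of v, nearest first, up to h hops away.
    """
    anc = {v: [] for v in parent}
    for r in range(1, h + 1):
        sent = {}
        for y in parent:
            if r == 1:
                sent[y] = y                  # round 1: send own id
            elif len(anc[y]) >= r - 1:
                sent[y] = anc[y][r - 2]      # ancestor r - 1 hops above y
        # Deliver after all sends are fixed, as in a synchronous round.
        for v, y in parent.items():
            if y is not None and y in sent:
                anc[v].append(sent[y])
    return anc

chain = {0: None, 1: 0, 2: 1, 3: 2}
print(compute_ancestors(chain, h=3))
```

After round r every node knows exactly its ancestors at most r hops away, which is the inductive claim of the proof.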
Once we have pre-computed the Ancx (v) sets for all vertices v and all trees
Tx using Algorithm 4, updating the scores at each node for all trees in which it is a
descendant of the newly chosen blocker node c becomes a purely local computation.
Algorithm 5 describes the algorithm at node v that updates its scores after a vertex c
is added as a blocker node to Q. At node v for each given Tx , v checks if c ∈ Ancx (v)
and if so updates its score values in Steps 4-5.
Lemma 5.3.4. Given a blocker vertex c, Algorithm 5 correctly updates the scores of
all nodes v such that v is a descendant of c in some tree Tx .
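A possible shape of this purely local update at a single node v is sketched below. The exact Steps 4-5 of Algorithm 5 are not reproduced here, so the two assignments in the loop body are our assumption of what they do; `anc` and `score_x` are hypothetical per-tree dictionaries held at v.

```python
def update_descendant(c, anc, score_x, score):
    """Local update at one node v after c joins the blocker set.

    anc[x]:     set of ancestors of v in tree T_x (precomputed)
    score_x[x]: current score of v in T_x (modified in place)
    score:      current total score of v
    Returns the new total score of v.
    """
    for x in score_x:
        if c in anc.get(x, ()):
            # Every path through v in T_x also passes through c: now covered.
            score -= score_x[x]
            score_x[x] = 0
    return score

anc_at_v = {"a": {"a", "c"}, "b": {"b"}}
scores_at_v = {"a": 3, "b": 2}
print(update_descendant("c", anc_at_v, scores_at_v, score=5), scores_at_v)
```

No message is exchanged in this step, which is why it does not contribute to the round count.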
We now move to the last remaining part of the blocker set algorithm: our
method to correctly update scores at ancestors of the newly chosen blocker node c in
each Tx . Recall that if v is an ancestor of c in Tx we need to subtract scorex (c) from
scorex (v). Here, in contrast to Algorithms 4 and 5 for nodes that are descendants of
c in a tree, we do not precompute anything. Instead we give an O(n)-round method
in Algorithm 6 to correctly update scores for each vertex for all trees in which that
vertex is an ancestor of c.
Before we describe Algorithm 6 we establish the following lemma, which is
key to our O(n)-round method.
Lemma 5.3.5. Fix a vertex c. For each root vertex x ∈ V − {c}, let πx,c be the path
from x to c in the h-hop SSSP tree Tx . Let T = ∪x∈V −{c} {e | e lies on πx,c }, i.e., T
is the set of edges that lie on some πx,c . Then T is an in-tree rooted at c.
Proof. If not, there exists some x, y ∈ V − {c} such that πx,c and πy,c coincide first
at some vertex z and the subpaths in πx,c and πy,c from z to c are different.
Let these paths coincide again at some vertex $z'$ (such a vertex exists since
their endpoint is the same) after diverging from z. Let the subpath from z to $z'$ in $\pi_{x,c}$
be $\pi^1_{z,z'}$ and the corresponding subpath in $\pi_{y,c}$ be $\pi^2_{z,z'}$. Similarly let $\pi_{x,z}$ be the
subpath of $\pi_{x,c}$ from x to z and let $\pi_{y,z}$ be the subpath of $\pi_{y,c}$ from y to z.
Clearly both $\pi^1_{z,z'}$ and $\pi^2_{z,z'}$ have equal weight (otherwise one of $\pi_{x,c}$ or $\pi_{y,c}$
cannot be a shortest path). Thus the path $\pi_{x,z} \circ \pi^2_{z,z'}$ is also a shortest path.
the path $\pi^2_{z,z'}$.
Now since the path $\pi_{x,z'}$ has $(a, z')$ as the last edge and we break ties using
the IDs of the vertices, hence ID(a) < ID(b). But then the shortest path $\pi_{y,z'}$ must
also have chosen $(a, z')$ as the last edge and hence $\pi_{y,z} \circ \pi^1_{z,z'}$ must be the subpath
Lemma 5.3.5 allows us to re-cast the task for ancestor nodes to the following
Algorithm 6 Pipelined Algorithm for updating scores at v for all trees Tx in which
v is an ancestor of newly chosen blocker node c
Input: current blocker set Q, newly chosen blocker node c
1: Send [lines 2-3]: (only for c)
2: Local Step at c: create a list listc and for each x ∈ V do add an entry
Z = ⟨x, scorex (c)⟩ to listc if scorex (c) ≠ 0; then set scorex (c) to 0 for each
x ∈ V and set score(c) to 0
3: Round i: let Z = ⟨x, scorex (c)⟩ be the i-th entry in listc ; send ⟨Z⟩ to c’s parent
in Tx
4: In round r > 0: (for vertices v ∈ V − Q − {c})
5: send [lines 6-8]:
6: if v received a message in round r − 1 then
7: let that message be ⟨Z⟩ = ⟨x, scorex (c)⟩.
8: if v ≠ x then send ⟨Z⟩ to v’s parent in Tx
9: receive [lines 10-11]:
10: if v receives a message M of the form ⟨x, scorex (c)⟩ then
11: scorex (v) ← scorex (v) − scorex (c); score(v) ← score(v) − scorex (c)
(where we use the notation in the statement of Lemma 5.3.5): the new blocker node
c needs to send scorex (c) to all nodes on πx,c for each tree Tx . Recall that in the
Congest model for directed graphs the graph edges are bi-directional. Hence this
task can be accomplished by having c send out scorex (c) for each tree Tx (other than
Tc ) in n − 1 rounds, one score per round (in no particular order) along the parent
edge for Tx . Each message ⟨x, scorex (c)⟩ will move along edges in πx,c (in reverse
order) along parent edges in Tx from c to x. Consider any node v. In general it will
be an ancestor of c in some subset of the n − 1 trees Tx . But the characterization
in Lemma 5.3.5 establishes that the incoming edge to v in all of these trees is the
same edge (u, v) and this is the unique edge on the path from c to v in the h-hop
SSSP. In fact, the messages for all of the trees in which v is an ancestor of c will
traverse exactly the same path from c to v. Hence, for the messages sent out by c
for the different trees in n − 1 different rounds (one for each tree other than Tc ), if
each vertex simply forwards any message ⟨x, scorex (c)⟩ it receives to its parent in
tree Tx all messages will be pipelined to all ancestors in n − 1 + h rounds. This is
what is done in Algorithm 6, whose steps we describe below, for completeness.
Step 2 of Algorithm 6 is local computation at the new blocker vertex c where
for each Tx to which c belongs, c adds an entry ⟨x, scorex (c)⟩ to a local list listc .
In round i, c sends the i-th entry in its list, say ⟨y, scorey (c)⟩, to its parent in Ty .
For node v other than c, in a general round r > 0, if v receives a message for some
x ∈ V it updates its score value for x (Steps 10-11) and then forwards this message
to its parent in Tx in round r + 1 (Step 6-8).
Lemma 5.3.6. Given a new blocker vertex c, Algorithm 6 correctly updates the
scores of all nodes v in every tree Tx in which v is an ancestor of c in O(n + h)
rounds.
Proof. Correctness of Algorithm 6 was argued above. For the number of rounds, c
sends out its last message in round n − 1, and if πv,c has length k then v receives all
messages sent to it by round n − 1 + k. Since we only have h-hop trees k ≤ h for all
nodes, and the lemma follows.
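The pipelining argument can be checked on small inputs with a sequential simulation such as the one below. It is a sketch, not Algorithm 6 verbatim: `parent[x][v]` and `score_x[x][v]` are hypothetical nested dictionaries, the per-node total score(v) is omitted, and the in-code assertion reflects our reading of Lemma 5.3.5 that no node receives two messages in the same round.

```python
def pipeline_ancestor_updates(c, parent, score_x):
    """Simulate pipelined forwarding of <x, score_x(c)> messages from c.

    parent[x][v]:  parent of v in tree T_x
    score_x[x][v]: score of v in T_x (modified in place)
    Returns the number of rounds in which some message was sent.
    """
    # Local step at c: list the non-zero scores (skipping T_c), then zero them.
    entries = [(x, t[c]) for x, t in score_x.items() if x != c and t.get(c, 0) != 0]
    for t in score_x.values():
        if c in t:
            t[c] = 0
    in_flight, rounds, i = [], 0, 0
    while True:
        # Forward messages received in the previous round, unless at the root x.
        sends = [(x, s, parent[x][v]) for x, s, v in in_flight if v != x]
        if i < len(entries):
            x, s = entries[i]
            sends.append((x, s, parent[x][c]))   # c injects its next entry
            i += 1
        if not sends:
            break
        rounds += 1
        receivers = [v for _, _, v in sends]
        assert len(receivers) == len(set(receivers)), "two messages at one node"
        for x, s, v in sends:
            score_x[x][v] -= s                   # ancestor subtracts score_x(c)
        in_flight = sends
    return rounds

parent = {"x1": {"u": "x1", "c": "u"}, "x2": {"u": "x2", "c": "u"}, "u": {"c": "u"}}
score_x = {"x1": {"x1": 3, "u": 2, "c": 2},
           "x2": {"x2": 1, "u": 1, "c": 1},
           "u": {"u": 4, "c": 4}}
print(pipeline_ancestor_updates("c", parent, score_x), score_x)
```

In this example all three paths into c share the edge from u, so the three messages leave c in consecutive rounds and never collide further up.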
5.4 Conclusion
We have presented a new distributed algorithm for the exact computation of weighted
all pairs shortest paths in both directed and undirected graphs. This algorithm runs
in $O(n^{3/2} \cdot \sqrt{\log n})$ rounds and is the first $o(n^2)$-round deterministic algorithm for
this problem in the Congest model. At the heart of our algorithm is a deterministic
algorithm for computing a blocker set. Our blocker set construction may have
applications in other distributed algorithms that need to identify a relatively small
set of vertices that intersect all paths in a set of paths with the same (relatively long)
length.
In Chapter 6 we present a deterministic pipelined approach to solve the
weighted all pairs shortest path problem. This approach gives an improvement in
the round complexity for graphs with moderate integer edge weights. In Chapter 7,
we present an $\tilde{O}(n^{4/3})$-round deterministic APSP algorithm that improves on the
$\tilde{O}(n^{3/2})$-round bound presented in this chapter. The main component of this
algorithm is a new faster method for computing a blocker set deterministically and a new
approach to propagate distance values from source nodes to blocker nodes.
Chapter 6
6.1 Introduction
of weight zero.
The presence of zero weight edges creates challenges in the design of dis-
tributed algorithms as observed in [50]. One approach used for positive integer edge
weights is to replace an edge of weight d with d unweighted edges and then run an un-
weighted APSP algorithm such as [65, 47] on this modified graph. This approach is
used in approximate APSP algorithms [70, 63]. However such an approach fails when
zero weight edges may be present. There are a few known algorithms that can handle
zero weights, such as our $\tilde{O}(n^{3/2})$-round deterministic APSP algorithm (described
in Chapter 5) for graphs with arbitrary edge weights, and the randomized weighted
APSP algorithms of Huang et al. [50] (for polynomially bounded non-negative integer
edge weights), and of Elkin [28] and Bernstein and Nanongkai [18] for arbitrary edge
weights. However no previous sub-$n^{3/2}$-round deterministic algorithm was known
for weighted APSP that can handle zero weights.
All of our results hold for both directed and undirected graphs and we will
assume w.l.o.g. that G is directed. Here is a summary of our results.
the bounds in the following theorem.
Theorem 6.1.3. Let G = (V, E) be a directed or undirected edge-weighted graph,
where all edge weights are non-negative integers (with zero edge-weights allowed), and
the shortest path distances are bounded by ∆. The following deterministic bounds
can be obtained in the Congest model.
The range of values for W and ∆ for which our results in Theorems 6.1.2 and
6.1.3 improve on the $\tilde{O}(n^{3/2})$ deterministic APSP bound presented in Chapter 5 are
stated in the following Corollary.
(i) If the edge weights are bounded by $W = n^{1-\epsilon}$, then APSP can be computed in
$O(n^{3/2-\epsilon/4} \log^{1/2} n)$ rounds.
(ii) For shortest path distances bounded by $\Delta = n^{3/2-\epsilon}$, APSP can be computed in
$O(n^{3/2-\epsilon/3} \log^{2/3} n)$ rounds.
Table 6.1: Table comparing our approximate APSP results for non-negative edge-
weighted graphs (including zero edge weights) with previous known results.
dle zero weight edges. In Section 6.2.6 we present simple deterministic algorithms
that match the congest and dilation bounds in [50] for two of the three procedures
used there: the short-range and short-range-extension algorithms. Our simplified al-
gorithms are both obtained using a streamlined single-source version of our pipelined
APSP algorithm (Algorithm 1).
runs in $\tilde{O}(n^{4/3})$ rounds, w.h.p. in n. No nontrivial sub-$n^{3/2}$ round algorithm was
known prior to this result.
The corresponding bound for k-SSP is $\tilde{O}(n + n^{2/3} k^{2/3})$. This result improves
on the prior $\tilde{O}(n^{3/2})$-round (deterministic) bound presented in Chapter 5 but it has
been subsumed by a very recent result in [18] that gives an $\tilde{O}(n)$ rounds randomized
algorithm for weighted APSP.
shortest path values for all sources arrive at any given node v in less than 2n rounds.
For our weighted case, since d(s) is at most ∆ for all s, v, it appears plausible
that the above pipelining method would apply here as well. Unfortunately, this does
not hold since we allow zero weight edges in the graph. The key to the guarantee that
a d(s) value arrives at v before round d(s) + pos(s) in the unweighted case in [47] is
that the predecessor y that sent its dy (s) value to v must have had dy (s) = dv (s) − 1.
(Recall that in the unweighted case, dy (s) is simply the hop-length of the path
taken from s to y.) If we have zero-weight edges this guarantee no longer holds for
the weighted path length, and it appears that the key property of the unweighted
pipelining methodology no longer applies. Since edge weights larger than 1 are also
possible (as long as no shortest path distance exceeds ∆), the hop length of a path
can be either greater than or less than its weighted distance.
Algorithm 1 is our pipelined algorithm for a directed graph G = (V, E) with non-
negative edge-weights. The input is G, together with the subset S of k vertices for
which we need to compute h-hop SSPs. An innovative feature of this algorithm
is that the key κ it uses for a path is not its weighted distance, but a function of
both its hop length l and its weighted distance d. More specifically, κ = d · γ + l,
where $\gamma = \sqrt{kh/\Delta}$. This allows the key to inherit some of the properties from the
algorithm in [47] through the fact that the hop length is part of κ’s value, while also
retaining the weighted distance which is the actual value that needs to be computed.
The new key κ by itself is not sufficient to adapt the algorithm for unweighted
APSP in [47] to the weighted case. In fact, the use of κ can complicate the compu-
tation since one can have two paths from s to v, with weighted distances d1 < d2 ,
and yet for the associated keys one could have κ1 > κ2 (because the path with the
smaller weight can have a larger hop-length). Our algorithm handles this with another
unusual feature: it may maintain several (though not all) of the key values it
receives, and may also send out several key values, even some that it knows cannot
correspond to a shortest distance. These features are incorporated into a carefully
tailored algorithm that terminates in $O(\sqrt{\Delta k h})$ rounds with all h-hop shortest path
distances from the k sources computed.
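The key inversion mentioned above is easy to reproduce numerically. The following sketch computes κ = d · γ + l with γ = √(kh/∆); the parameter values are chosen by us purely for illustration (they give γ = 0.5) and do not come from the text.

```python
import math

def key(d, l, k, h, delta):
    """Key of a path with weighted distance d and hop length l.

    kappa = d * gamma + l, where gamma = sqrt(k * h / delta).
    """
    gamma = math.sqrt(k * h / delta)
    return d * gamma + l

# Illustrative parameters (assumed): k = 1 source, h = 16 hops, delta = 64.
k, h, delta = 1, 16, 64
kappa1 = key(d=2, l=10, k=k, h=h, delta=delta)  # lighter path, many hops
kappa2 = key(d=4, l=3, k=k, h=h, delta=delta)   # heavier path, few hops
print(kappa1, kappa2, kappa1 > kappa2)
```

Here the path with the smaller weighted distance receives the larger key, which is exactly the situation that forces a node to keep more than one entry per source.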
It is not difficult to show that eventually every shortest path distance key
arrives at v for each source from which v is reachable when Algorithm 1 is executed.
In order to establish the bound on the number of rounds, we show that our pipelined
algorithm maintains two important invariants:
Table 6.2: Notations
Global Parameters:
S          set of sources
k          number of sources, or |S|
h          maximum number of hops in a shortest path
∆          maximum weighted distance of a shortest path
n          number of nodes
γ          parameter equal to $\sqrt{hk/\Delta}$
Local Variables at node v:
d∗x        current shortest path distance from x to v; same as d∗x,v
listv      list at v for storing the SP and non-SP entries
Variables/Parameters for entry Z = (κ, d, l, x) in listv :
κ          key for Z; κ = d · γ + l
d          weight (distance) of the path associated with this entry
l          hop-length of the path associated with this entry
x          start node (i.e. source) of the path associated with this entry
p          parent node of v on the path associated with this entry
ν          number of entries for x at or below Z in listv (not stored explicitly)
flag-d∗    flag to indicate if Z is the current SP entry for source x
pos        position of Z in listv in a round r; same as $\mathrm{pos}^r$, $\mathrm{pos}^r_v$
SP         shortest path
on listv are ordered by key value κ, with ties first resolved by the value of d, and
then by the label of the source vertex. We use Z.ν to denote the number of keys
for source x stored on listv at or below Z. The position of an element Z in listv
is given by pos(Z), which gives the number of elements at or below Z on listv . If
the vertex v and the round r are relevant to the discussion we will use the notation
$\mathrm{pos}^r_v(Z)$, but we will remove either the subscript or the superscript (or both) if they
are clear from the context. We also have a flag Z.flag-d∗ which is set if Z has the
smallest (d, κ) value among all entries for source x (so d is the shortest weighted
distance from x to v among all keys for x on listv ). A summary of our notation is
in Table 6.2.
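To make the ordering, pos and ν concrete, the sketch below represents each entry as a plain tuple (κ, d, l, x), which is our own simplification, and recomputes positions by sorting. The example keys assume γ = 0.5.

```python
def sort_key(entry):
    """Order entries by key, then distance, then source label."""
    kappa, d, l, x = entry
    return (kappa, d, x)

def pos(lst, entry):
    """1-based position of entry: number of entries at or below it."""
    return sorted(lst, key=sort_key).index(entry) + 1

def nu(lst, entry):
    """Number of entries for the same source at or below entry."""
    ordered = sorted(lst, key=sort_key)[:pos(lst, entry)]
    return sum(1 for e in ordered if e[3] == entry[3])

# Entries (kappa, d, l, x) with kappa = d * 0.5 + l (gamma = 0.5 assumed).
list_v = [(5.0, 4, 3, "a"), (2.0, 2, 1, "b"), (3.0, 0, 3, "b"), (2.0, 2, 1, "a")]
z = (5.0, 4, 3, "a")
print(pos(list_v, z), nu(list_v, z))
```

The two entries with key 2.0 and distance 2 are separated only by the source label, the last tie-breaking criterion.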
Initially, when round r = 0, listv is empty unless v is in the source set S.
Each source vertex x ∈ S places an element (0, 0, 0, x) on its listx to indicate a path
of weight 0 and hop length 0 from x to x, and Z.flag-d∗ is set to true. In Step 1 of
the Initialization round 0, node v initializes the distance from every source to ∞.
In Step 2 every source vertex initializes the distance from itself to 0 and adds the
corresponding entry in its list. There are no Sends in round 0.
Insert(Z): Procedure for adding Z to listv
1: insert Z in listv in sorted order of (κ, d, x)
2: if ∃ an entry Z′ for x in listv such that Z′.flag-d∗ = false and pos(Z′) > pos(Z)
then
3: find Z′ with smallest pos(Z′) such that pos(Z′) > pos(Z) and Z′.flag-d∗ =
false
4: remove Z′ from listv
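A direct transcription of this procedure into runnable form might look as follows. The `Entry` class and its field names are hypothetical, chosen only to mirror the tuple (κ, d, l, x) and the flag-d∗ bit; the list is kept sorted by (κ, d, x).

```python
from dataclasses import dataclass

@dataclass
class Entry:
    kappa: float
    d: float
    l: int
    x: str
    flag_sp: bool = False   # True if this is the current SP entry for x

def insert(lst, z):
    """Insert z in sorted order, then drop the closest non-SP entry for z.x above it."""
    order = lambda e: (e.kappa, e.d, e.x)
    i = 0
    while i < len(lst) and order(lst[i]) <= order(z):
        i += 1
    lst.insert(i, z)                      # Step 1
    for j in range(i + 1, len(lst)):      # Steps 2-3: scan upwards from z
        if lst[j].x == z.x and not lst[j].flag_sp:
            del lst[j]                    # Step 4
            break
    return lst

list_v = [Entry(2, 2, 1, "a", True), Entry(5, 4, 3, "a"),
          Entry(6, 5, 1, "b"), Entry(7, 6, 1, "a")]
insert(list_v, Entry(3.5, 3, 2, "a"))
print([e.kappa for e in list_v])
```

Because at most one entry is removed per insertion, and only when one was just added below it, the number of entries for a source at or below any fixed key never decreases.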
We now provide proofs for establishing correctness of Alg. 1. The initial Observations
and Lemmas given below establish useful properties of an entry Z in a listv and
of $\mathrm{pos}^r_v(Z)$ and its relation to $\mathrm{pos}^r_y(Z^-)$. We then present the key lemmas. In
Lemma 6.2.11, we show that the collection of entries for a given source x in listv
can be mapped into (d, l) pairs with non-negative l values such that d = d∗ for the
shortest path entry, and the d values for all other entries are distinct and larger than
d∗ . (It turns out that we cannot simply use the d values already present in Z’s entries
for this mapping since we could have two different entries for source x on listv , Z1
and Z2 , that have the same d value.) Once we have Lemma 6.2.11 we are able to
bound the number of entries for a given source at listv by h/γ + 1 in Lemma 6.2.13,
and this establishes Invariant 2 (which is stated in Section 6.2.1). Lemma 6.2.14
establishes Invariant 1. In Lemma 6.2.15 we establish that all shortest path values
reach node v. With these results in hand, the final Lemma 6.2.16 for the round
bound for computing (h, k)-SSP with shortest path distances at most ∆ is readily
established, which then gives Theorem 6.1.1.
Observations and Lemmas 6.2.1-6.2.8: In the following Observations and Lem-
mas we point out the key facts about an entry Z in listv in our Algorithm 1. We
use these in our proofs in this section.
Proof. An entry is removed from listv only in Step 4 of Insert, and this occurs at
most once in round r′ (through a call from either Step 11 or Step 13 of Algorithm 1).
But immediately before that removal an entry Z′ with a smaller value was inserted
in listv in Step 1 of Insert.
Lemma 6.2.2. Let Z be an entry in listv . Then $\mathrm{pos}^{r'}_v(Z) \ge \mathrm{pos}^r_v(Z)$ for all rounds
r′ > r, for which Z exists in v’s list.
Proof. If not, then there exists Z′ such that Z′ was below Z in v’s list
in round r and was replaced by another entry Z″ that was above Z in a round r″
such that r′ ≥ r″ > r, and hence $\mathrm{pos}^{r''}_v(Z'') > \mathrm{pos}^{r''}_v(Z')$. But by Observation 6.2.1
this cannot happen, resulting in a contradiction.
Observation 6.2.3. Let Z be an entry for source x that was added to listv . If there
exists a non-SP entry for x above Z in listv , then the closest non-SP entry above Z
will be removed from listv .
Proof. This is immediate from Steps 8 and 9 of Algorithm 1 where the current entry
Z ∗ with Z ∗ .flag-d∗ = true is verified to have a shorter distance (or a smaller key if
Z and Z ∗ have the same distance), and by the check in Step 13.
The above Observation should be contrasted with the fact that listv could
contain entries Z with Z.l > h, but only if flag-d∗ (Z) = false. In fact it is possible
that listv contains an entry Z′ ≠ Z∗ with Z′.d = d∗ and l > h since such an entry
would fail the check in Step 9 but could then be inserted in Step 13 of Algorithm 1.
Lemma 6.2.6. Let Z be an entry for source x that is present on listv in round r.
Let r′ > r, and let c and c′ be the number of entries for source x on listv that have
key value less than Z’s key value in rounds r and r′ respectively. Then c′ ≥ c.
Proof. If c′ < c then an entry for x that was present below Z in round r must have
been removed without having another entry for x being inserted below Z. But by
Observation 6.2.1 this is not possible since any time an entry for source x is removed
from listv another entry for source x with smaller key value is inserted in listv .
Lemma 6.2.6 holds for every round greater than r, even if Z is removed from
listv . The following stronger lemma holds for rounds greater than r when Z remains
on listv .
Lemma 6.2.7. Let Z be a non-SP entry for source x that is present on listv in
round r. Let r′ > r, and let c and c′ be the number of entries for source x on listv
that have key value less than Z’s key value in rounds r and r′ respectively. Then
c′ = c.
Proof. If a new entry Z′ with key < Z.κ for x is added, then by Observation 6.2.3
the closest non-SP entry for x with key > Z′.κ must be removed from listv and thus
c′ ≤ c. Then using Lemma 6.2.6 we have c′ = c.
Lemma 6.2.8. Let $Z^-$ be an entry for source x sent from y to v and suppose the
corresponding entry Z (Step 7 of Algorithm 1) is added to listv in round r. Then
there are at least $Z^-.\nu$ entries at or below Z in listv for source x.
Proof. Let us assume inductively that this result holds for all entries on listv and
listy with key value at most Z.κ at all previous rounds and at y in round r as well.
(It trivially holds initially.)
Let $Z_1^-$ be the $(Z^-.\nu - 1)$-th entry for source x in listy . Since $Z_1^-$ has a key
value smaller than $Z^-$ it was sent to v in an earlier round r′. If the corresponding
entry Z1 created for possible addition to listv in Step 7 of Algorithm 1 was inserted
in listv then by inductive assumption there were at least $Z_1^-.\nu = Z^-.\nu - 1$ entries
for x at or below Z1 in listv . And by Lemma 6.2.6 this holds for round r as well and
hence the result follows since Z is present above Z1 in listv .
And if Z1 was not added to listv in round r′, then by Observation 6.2.4 there
were already Z − .ν − 1 entries for x with key ≤ Z1 .κ and by Lemma 6.2.6 there are
at least Z − .ν − 1 entries for x with key ≤ Z1 .κ ≤ Z.κ on listv at round r and hence
the result follows.
Establishing $\mathrm{pos}^r_y(Z^-) \le \mathrm{pos}^r_v(Z)$: For an entry $Z^-$ sent from y to v such that
Z is the corresponding entry created for possible addition to listv in Step 7 of
Algorithm 1, in Lemma 6.2.9 and Corollary 6.2.10 we establish that if Z is added to
listv then $\mathrm{pos}^r_y(Z^-) \le \mathrm{pos}^r_v(Z)$, which is an important property of pos.
Lemma 6.2.9. Let Z − be an entry sent from y to v in round r and let Z be the
corresponding entry created for possible addition to listv in Step 7 of Algorithm 1.
For each source xi ∈ S, let there be exactly ci entries for xi at or below Z − in listy .
If Z is added to listv , then for each xi ∈ S, there are at least ci entries for xi at or
below Z in listv .
Proof. If not there exists an xi ∈ S with strictly less than ci entries for xi at or
below Z in listv .
Let $Z_1^-$ be the ci -th entry for xi in listy (if xi is Z’s source, then $Z_1^-$ is $Z^-$). If
$Z_1^-$ is not $Z^-$, it is below $Z^-$ in listy and so was sent in a round r′ < r; if $Z_1^- = Z^-$
then r′ = r. Let Z1 be the corresponding entry created for possible addition to listv
in Step 7 of Algorithm 1.
If Z1 was added to listv and is also present in listv in round r, then by Lemmas
6.2.8 and 6.2.6, there will be at least ci entries for xi at or below Z1 , resulting in
a contradiction. And if Z1 was removed from listv in a round r″ < r, then by
Lemma 6.2.6, the number of entries for xi with key ≤ Z1 .κ should be at least ci .
Now if Z1 was not added to listv in round r′, then by Observation 6.2.4,
we must already have at least ci entries for xi with key ≤ Z1 .κ in round r′ and by
Lemma 6.2.6, this must hold for all rounds r″ > r as well.
Corollary 6.2.10. Let Z − be an entry sent from y to v in round r and let Z be the
corresponding entry created for possible addition to listv in Step 7 of Algorithm 1.
If Z is added to listv , then $\mathrm{pos}^r_y(Z^-) \le \mathrm{pos}^r_v(Z)$.
Lemma 6.2.11. Let C be the entries for a source x ∈ S in listv in round r. Then
the entries in C can be mapped to (d, l) pairs such that each l ≥ 0 and each Z ∈ C
is mapped to a distinct d value with Z.κ = d · γ + l. Also d = d∗x if Z is a current
shortest path entry, otherwise d > d∗x .
we transform these into j distinct values for listv by adding $w'_x(y, v) \cdot \gamma + 1$ to each
of them. For at least one of these $d^-$ values in y, call it $d^-_1$, it must be the case that
$d' = d^-_1 + w'_x(y, v) \cdot \gamma + 1$ is not assigned to any of the j − 1 entries for source x below
Z in listv . Let $Z_1^-$ be the entry in y’s list that is associated with distance $d^-_1$. We
show that the associated l value for $d'$ in Z on listv must be greater than 0.
$= Z^-.\kappa + w'_x(y, v) \cdot \gamma + 1$
Let Z be inserted in position p ≤ j. We assign a d value to Z as in the previous
case, taking care that the d value assigned to Z is different from that for the p − 1
entries below Z. Suppose Z’s d value has been assigned to another entry Z″ in listv
above Z. Then, we consider Z′, the entry that was removed (in Step 4 of Insert)
in order to keep the total number of entries for source x at j. We assign to Z″ the
value d′ that was assigned to Z′. Since Z″ has a larger key value than Z′ we will
need to use an l″ at least as large as that used for Z′ (call it l′) in order to satisfy the
requirement that Z″.κ = d′ · γ + l″. Since l′ must have been non-negative, l″ will
also be non-negative as required, and all d values assigned to the entries for x will
be distinct.
Lemma 6.2.12. Let Z be the current shortest path distance entry for a source x ∈ S
in v’s list. Then the number of entries for x below Z in listv is at most $\gamma \cdot \frac{n}{k}$.
Proof. By Lemma 6.2.11, we know that the keys of all the entries for x can be
mapped to (d, l) pairs such that each entry is mapped to a distinct d value and
l > 0.
We have $Z.\kappa = d^*_x \cdot \gamma + l^*_x$, where $l^*_x$ is the hop-length of the shortest path
from x to v. Let $Z''$ be an entry for x below Z in v’s list. Then $Z''.\kappa \le Z.\kappa$. It
implies
\begin{align*}
d'' \cdot \gamma + l'' &\le d^*_x \cdot \gamma + l^*_x \\
d'' \cdot \gamma &\le d^*_x \cdot \gamma + (l^*_x - l'') \\
d'' &\le d^*_x + \frac{l^*_x - l''}{\gamma} \\
&\le d^*_x + \frac{h - 1}{\gamma} \\
&< d^*_x + \frac{h}{\gamma} \\
&= d^*_x + \frac{h}{\gamma^2} \cdot \gamma \\
&= d^*_x + \frac{n}{k} \cdot \gamma
\end{align*}
Thus $d'' < d^*_x + \frac{n}{k} \cdot \gamma$. Since $d'' \ge d^*_x$, there can be at most $\frac{n}{k} \cdot \gamma$ entries for x
below Z in listv .
Lemma 6.2.13. For each source x ∈ S, v’s list has at most $\frac{n}{k} \cdot \gamma + 1$ entries for x.
Proof. On the contrary, let Z be an entry for source x ∈ S with the smallest key
such that Z is the $(\gamma \cdot \frac{n}{k} + 2)$-th entry for x in $list_v$. Let y be the sender of Z to v
and let the corresponding entry in y’s list be $Z^-$.
If Z was added as a non-SP entry, then by Lemma 6.2.8 there are at least
$\gamma \cdot \frac{n}{k} + 2$ entries for x at or below $Z^-$ in $list_y$, resulting in a contradiction as Z is
the entry with the smallest key that has this ν value.
Otherwise if Z was added as a current shortest path entry, then by
Lemma 6.2.12, Z can have at most $\gamma \cdot \frac{n}{k}$ entries below it in any round and hence
there are at most $\gamma \cdot \frac{n}{k} + 1$ entries at or below Z in $list_v$ in all rounds (and if Z is later
marked as non-SP then by Lemma 6.2.7 Z.ν will stay fixed at that value), again
resulting in a contradiction.
Lemma 6.2.14. If an entry Z is added to $list_v$ in round r then $r < Z.\kappa + pos^r_v(Z)$.
Proof. The lemma holds in the first round since all entries have non-negative κ, any
received entry has hop length at least 1, and the lowest position is 1 so for any entry
Z received by v in round 1, $Z.\kappa + pos^1_v(Z) \ge 1 + 1 > 1$.
Let r be the first round (if any) in which the lemma is violated, and let it
occur when entry Z is added to $list_v$. So $r \ge Z.\kappa + pos^r_v(Z)$. Let $r_1 = Z.\kappa + pos^r_v(Z)$
(so $r_1 \le r$ by assumption).
Since Z was added to $list_v$ in round r, $Z^-$ was sent to v by a node y in round
r. So $r = Z^-.\kappa + pos^r_y(Z^-)$. But $Z.\kappa > Z^-.\kappa$ and $pos^r_v(Z) \ge pos^r_y(Z^-)$, hence r
must be less than $Z.\kappa + pos^r_v(Z)$.
its hop length $l^* - 1$ is the smallest among all shortest paths from x to y. Hence by
the inductive assumption an entry $Z^-$ with $Z^-.\kappa = d^*_{x,y} \cdot \gamma + l^* - 1$ (which is strictly
less than $Z^*.\kappa$) is received by y before round $Z^-.\kappa + pos^{r'}_y(Z^-)$ (by Lemma 6.2.14)
and is then sent to v in round $r' = Z^-.\kappa + pos^{r'}_y(Z^-)$ in Step 1. Thus v adds the
shortest path entry for x, $Z^*$, to $list_v$ by the end of round $r'$.
Lemma 6.2.16. Algorithm 1 correctly computes the h-hop shortest path distances
from each source x ∈ S to each node v ∈ V by round (n − 1)γ + h + n · γ + k.
Proof. An h-hop shortest path has hop-length at most h and weight at most n − 1,
hence a key corresponding to a shortest path entry will have value at most (n−1)γ+h.
Thus by Lemma 6.2.15, for every source x ∈ S every node v ∈ V should have received
the shortest path distance entry, $Z^*$, for source x by round $r = (n-1)\gamma + h + pos^r_v(Z^*)$.
Now we need to bound the value of $pos^r_v(Z^*)$. By Lemma 6.2.13, we know
that there are at most $\gamma \cdot \frac{n}{k} + 1$ entries for each source x ∈ S in a node v’s list. Now
as there are k sources, v’s list has at most $(\gamma \cdot \frac{n}{k} + 1) \cdot k \le \gamma \cdot n + k$ entries, thus
$pos^r_v(Z^*) \le \gamma \cdot n + k$ and hence $r \le (n-1)\gamma + h + \gamma \cdot n + k$.
Since $\gamma = \sqrt{hk/n}$, Lemma 6.2.16 establishes the bounds given in Theorem 6.1.1.
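The arithmetic behind this round bound can be checked numerically. The following is a minimal sketch, assuming γ = sqrt(hk/n) as stated above; the function name and return layout are illustrative only and are not part of Algorithm 1.

```python
import math


def pipelined_round_bound(n, h, k):
    """Evaluate the round bound of Lemma 6.2.16 for given n, h, k.

    Assumes gamma = sqrt(h * k / n); all names here are illustrative.
    """
    gamma = math.sqrt(h * k / n)
    # Lemma 6.2.13: at most gamma * n / k + 1 entries per source in a list.
    per_source = gamma * n / k + 1
    # k sources, so a list holds at most gamma * n + k entries.
    list_size = per_source * k
    # Largest key of a shortest path entry: (n - 1) * gamma + h.
    max_key = (n - 1) * gamma + h
    return gamma, list_size, max_key + list_size


if __name__ == "__main__":
    gamma, list_size, bound = pipelined_round_bound(n=100, h=4, k=25)
    print(gamma, list_size, bound)
```

With n = 100, h = 4 and k = 25 we get γ = 1, so h/γ² = n/k = 4, which is the substitution used in the proof of Lemma 6.2.12.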
Initially every zero edge-weight is increased to a positive value $\alpha = 1/\sqrt{h}$ and then
h-hop SSSP is computed using a BFS variant in $\tilde{O}(n/\alpha) = \tilde{O}(n\sqrt{h})$ rounds. This gives
an approximation to the h-hop SSSP where the additive error is at most $h\alpha = \sqrt{h}$.
This error is then fixed by running the Bellman-Ford algorithm [15] for h rounds.
The total round complexity of this SSSP algorithm is $\tilde{O}(n\sqrt{h})$ and the congestion is
$O(\sqrt{h})$.
If shortest path distances are bounded by ∆, Algorithm 2 runs in $\lceil \Delta \cdot \sqrt{h} + h \rceil$
rounds with congestion at most $\sqrt{h}$. And if ∆ ≤ n − 1 (as in [50]), then we can
compute shortest path distances from x to every node v in $O(n\sqrt{h})$ rounds.
We can similarly simplify the short-range-extension algorithm in [50], where
some nodes already know their distance from source x and the goal is to compute
shortest paths from x by extending these already computed shortest paths to u
by another h hops. To implement this, we only need to modify the initialization
in Algorithm 2 so that each such node u initializes $d^*$ with this already computed
distance. The round complexity is again $O(\Delta\sqrt{h})$ and the congestion per source is
$O(\sqrt{h})$. This gives us the following result.
Lemma 6.2.17. Let G = (V, E) be a directed or undirected graph, where all edge
weights are non-negative distances (and zero-weight edges are allowed), and where
shortest path distances are bounded by ∆. Then by using Algorithm 2, we can compute
h-hop SSSP and h-hop extension in $O(\Delta\sqrt{h})$ rounds with congestion bounded by $\sqrt{h}$.
In this section we give faster APSP and k-SSP algorithms. The overall Algorithm 3
has the same structure as the deterministic $O(n^{3/2} \cdot \sqrt{\log n})$ round weighted APSP
algorithm in Chapter 5 but we use a variant of Algorithm 1 in place of Bellman-Ford,
and we also present new methods within two of the steps.
In our improved Algorithm 3, Steps 3-5 are unchanged from the algorithm in
Chapter 5 (Algorithm 1). However we give an alternate method for Step 1, which
computes h-hop CSSSP, since the method in Chapter 5 (Algorithm 1) takes Θ(n · h)
rounds, which is too large for our purposes. Our new method is very simple and uses
Algorithm 1 and runs in $O(\sqrt{\Delta hk})$ rounds. The following lemma is straightforward
and can be established by replacing Bellman-Ford algorithm with Algorithm 1 in
Lemma 6.4.3.
Lemma 6.3.1. h-hop CSSSPs can be computed in $O(\sqrt{\Delta hk})$ rounds using Algorithm 1.
For Step 2 we use the overall blocker set algorithm from Chapter 5 (Algorithm 2),
which runs in $O(n \cdot h + (n^2 \log n)/h)$ rounds and computes a blocker set
of size q = O((n log n)/h) for the h-hop trees constructed in Step 1 of Algorithm 3.
But this gives only an $\tilde{O}(n^{3/2})$ bound for Step 2 (by setting $h = \tilde{O}(\sqrt{n})$), so it
will not help us to improve the bound on the number of rounds for APSP beyond
Algorithm 1. Instead we modify and improve a key step where that earlier blocker
set algorithm has a Θ(n · h) round preprocessing step. We give the details of our
method for Step 2 in Section 6.4.1.
Lemma 6.3.2. Algorithm 3 computes k-SSP in $O(\frac{n^2 \log n}{h} + \sqrt{\Delta hk})$ rounds.
O(n · q) rounds (Lemma 5.2.6). Step 5 has no communication. Hence the overall
bound for Algorithm 3 is $O(n \cdot q + \sqrt{\Delta hk})$ rounds. Since $q = O(\frac{n \log n}{h})$ this gives
the desired bound.
Proofs of Theorem 6.1.3 and 6.1.2: Using $h = \frac{n^{4/3} \cdot \log^{2/3} n}{(2k \cdot \Delta)^{1/3}}$ in Lemma 6.3.2 we
obtain the bounds in Theorem 6.1.3.
If edge weights are bounded by W, the weight of any h-hop path is at most
hW. Hence by Lemma 6.3.2, the k-SSP algorithm (Algorithm 3) runs in
$O(\frac{n^2 \log n}{h} + h\sqrt{Wk})$ rounds. Setting $h = n \log^{1/2} n/(W^{1/4} k^{1/4})$ we obtain the bounds stated in
Theorem 6.1.2.
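The choice of h for Theorem 6.1.3 can be sanity-checked numerically. The sketch below assumes the two terms of Lemma 6.3.2 are n² log n / h and sqrt(∆hk), where ∆ bounds the shortest path distances; the function name and the use of base-2 logarithms are assumptions made only for illustration.

```python
import math


def kssp_terms(n, k, delta):
    """Return (h, blocker_term, pipeline_term) for the h used for Theorem 6.1.3.

    Assumes the bound O(n^2 log n / h + sqrt(delta * h * k)); names are illustrative.
    """
    log_n = math.log2(n)
    h = n ** (4 / 3) * log_n ** (2 / 3) / (2 * k * delta) ** (1 / 3)
    blocker_term = n * n * log_n / h          # n * q with q = n log n / h
    pipeline_term = math.sqrt(delta * h * k)  # cost of the pipelined step
    return h, blocker_term, pipeline_term


if __name__ == "__main__":
    h, a, b = kssp_terms(n=4096, k=4096, delta=100)
    print(h, a, b, a / b)
```

For this h the two terms differ only by the constant factor sqrt(2), independent of n, k and ∆, so neither term dominates asymptotically.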
We first present our new notion of computing Consistent h-hop trees, which forms an
important component of our blocker set algorithm. We then describe our algorithm
for computing blocker set.
Recall that an h-hop shortest path from a source s to a vertex v in G is a
path of minimum weight from s to v among all paths with at most h hops. If we
consider the graph consisting of an h-hop shortest path from a source s to every
vertex in G reachable from s within h hops, it need not form a tree since the prefix
of an h-hop shortest path may not itself be an h-hop shortest path. The parent
pointers for the h-hop shortest paths computed by Bellman-Ford algorithm [15]
(or our pipelined (h, k)-SSP algorithm in Chapter 5) suffer from a similar problem:
the tree constructed by the parent pointers could have height greater than h (see
Fig. 6.1).
[Figure 6.1, four panels: (i) Example graph G. (ii) Edges on 2-hop shortest paths
from source node b. (iii) 2-hop SSSP for source node b constructed by Bellman-Ford.
(iv) 2-hop CSSSP for source node b.]
Figure 6.1: This figure gives an example graph G where the union of the edges on
the 2-hop shortest paths from source node b differs from the 2-hop SSSP constructed
by Bellman-Ford (or using our (h, k)-SSP pipelined algorithm in Chapter 5), and
both are different from the 2-hop CSSSP generated for source node b.
Within the algorithm for computing blocker set in our APSP algorithm (described
in Section 5.3), there are algorithms for updating the ‘scores’ of the ancestor
and descendant nodes of a newly chosen blocker node in the collection of trees that
contain h-hop shortest paths. The efficient methods used in these algorithms are
based on having a consistent set of paths across all trees in the collection. In order
to create a consistent collection of paths across all sources, we introduce the following
definition of an h-hop Consistent SSSP (CSSSP) collection.
shortest paths: In particular, if every shortest path from source s to a vertex x has
more than h hops, then the h-hop tree for source s in the CSSSP collection is not
required to have x in it (see Fig. 6.1).
Our method to construct an h-hop CSSSP collection is very simple: We
execute Bellman-Ford algorithm to construct 2h-hop SSSPs instead of h-hop SSSPs
(Note that if there are two different shortest paths between a pair of vertices u and
v, then the one with shorter hop-length is preferred and in case of a tie, the one
with smaller last edge ID is preferred). Our CSSSP collection will retain the initial
h hops of each of these 2h-hop SSSPs. In other words, each vertex v will set the
parent pointer p(v) to NIL for a source s if the hop-length of the corresponding path
is greater than h. In [8] we show that this simple construction results in an h-hop
CSSSP collection. Thus we are able to construct h-hop CSSSPs by incurring just a
constant factor overhead in the number of rounds over the bound for constructing
h-hop SSSPs.
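The construction just described can be sketched as a centralized simulation for a single source. This is not the distributed algorithm itself: it assumes non-negative edge weights, simulates synchronous Bellman-Ford rounds, and uses the parent ID as the final tie-breaker (as in the proof of Lemma 6.4.2). The function name `cssp_parents` and the edge-list input format are hypothetical.

```python
import math


def cssp_parents(n, edges, source, h):
    """Simulate 2h rounds of Bellman-Ford from `source`, then keep h-hop paths.

    edges: list of (u, v, w) directed edges with w >= 0, nodes are 0..n-1.
    Returns {v: parent} for nodes whose chosen path has at most h hops.
    Labels are (weight, hops, parent id), compared lexicographically, so a
    lighter path wins, then a shorter hop-length, then a smaller parent ID.
    """
    unreachable = (math.inf, math.inf, -1)
    best = [unreachable] * n
    best[source] = (0, 0, -1)
    for _ in range(2 * h):
        nxt = list(best)  # synchronous round: read old labels, write new ones
        for u, v, w in edges:
            dist, hops, _ = best[u]
            if dist == math.inf:
                continue
            cand = (dist + w, hops + 1, u)
            if cand < nxt[v]:
                nxt[v] = cand
        best = nxt
    # A node whose path has more than h hops sets its parent pointer to NIL,
    # which here means it is simply left out of the returned tree.
    return {v: best[v][2] for v in range(n)
            if v != source and best[v][1] <= h}


if __name__ == "__main__":
    edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (0, 3, 10)]
    print(cssp_parents(4, edges, source=0, h=1))
    print(cssp_parents(4, edges, source=0, h=2))
```

In the example, with h = 2 node 3 is dropped from the tree: its 4-hop shortest path (weight 3) uses 3 hops, so it is not retained even though a heavier 1-hop path exists.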
Lemma 6.4.2. Consider running Bellman-Ford algorithm (or our pipelined (h, k)-
SSP algorithm in Chapter 5) using the hop-length bound 2h. Let C be the collection
of h-hop trees formed by retaining the initial h hops in each of these 2h-hop SSSPs.
Then the collection C forms an h-hop CSSSP collection.
Proof. If not, then there exist vertices u, v and trees $T_x$, $T_y$ such that the paths from
u to v in $T_x$ and $T_y$ are different. Let $\pi^x_{u,v}$ and $\pi^y_{u,v}$ be the corresponding paths in
these trees.
There are three possible cases: (1) when $wt(\pi^x_{u,v}) \neq wt(\pi^y_{u,v})$; (2) when paths
$\pi^x_{u,v}$ and $\pi^y_{u,v}$ have same weight but different hop-lengths; (3) when both $\pi^x_{u,v}$ and
$\pi^y_{u,v}$ have same weight and hop-length.
(1) $wt(\pi^x_{u,v}) \neq wt(\pi^y_{u,v})$: w.l.o.g. assume that $wt(\pi^x_{u,v}) < wt(\pi^y_{u,v})$. Now if
we replace $\pi^y_{u,v}$ in $T_y$ with $\pi^x_{u,v}$, we get a path of smaller weight from y to v of
hop-length at most 2h. But then node v should have picked this lighter path during the
execution of Bellman-Ford with y as the source node, resulting in a contradiction.
(2) paths $\pi^x_{u,v}$ and $\pi^y_{u,v}$ have same weight but different hop-lengths. W.l.o.g.
assume that path $\pi^x_{u,v}$ has smaller hop-length than $\pi^y_{u,v}$. Then $\kappa(\pi^x_{u,v}) < \kappa(\pi^y_{u,v})$
and hence $\kappa(\pi^y_{y,u} \circ \pi^x_{u,v}) < \kappa(\pi^y_{y,u} \circ \pi^y_{u,v})$. And again v would have picked the path
$\pi^y_{y,u} \circ \pi^x_{u,v}$ as the shortest path from y during the execution of Bellman-Ford with
y as the source, since paths with smaller hop-length are preferred even if they have
same weighted distance.
(3) both $\pi^x_{u,v}$ and $\pi^y_{u,v}$ have same weight and hop-length. W.l.o.g. assume that
these two paths have the smallest hop-length for which the paths differ. Let (a, v)
be the last edge on the path $\pi^x_{u,v}$ and let (b, v) be the last edge on the path $\pi^y_{u,v}$.
W.l.o.g. assume that ID(a) < ID(b) (a cannot equal b since the resulting smaller
hops subpaths $\pi^x_{u,a}$ and $\pi^y_{u,a}$ would be different, which is not possible). Then again
during the execution of Bellman-Ford with y as the source, v would have chosen the
path $\pi^y_{y,u} \circ \pi^x_{u,v}$ as the shortest path from y instead of $\pi^y_{y,v}$, since paths with smaller
parent ID are preferred even if they have same weight and hop-length.
Lemma 6.4.3. h-hop CSSSPs can be computed in O(nh) rounds using the Bellman-
Ford algorithm.
We now show two properties of an h-hop CSSSP collection that we will use
in our blocker set algorithm in the next section. In the following, we call a tree T
rooted at a vertex c an out-tree if all the edges incident to c are outgoing edges from
c and we call T an in-tree if all the edges incident to c are incoming edges.
Lemma 6.4.4. Let C be an h-hop CSSSP collection. Let c be a vertex in G and let
T be the union of the edges in the collection of subtrees rooted at c in the trees in C.
Then T forms an out-tree rooted at c.
Proof. If not, there exist nodes u and v and trees Tx and Ty such that the path from
c to u in Tx and path from c to v in Ty first diverge from each other after starting
from c and then coincide again at some vertex z. But since C is an h-hop CSSSP
collection, by Lemma 6.4.2 the path from c to z in the collection C is unique.
Lemma 6.4.5. Let C be an h-hop CSSSP collection. Let c be a vertex in G and let
T be the union of the edges on the tree-path from the root of each tree in C to c (for
the trees that contain c). Then T forms an in-tree rooted at c.
Proof. If not, then there exist nodes x and y such that the path from x to c in Tx
and path from y to c in Ty first coincide at some vertex z and then diverge from each
other. But since C is an h-hop CSSSP collection, by Lemma 6.4.2 the path from z
to c in the collection C is unique.
number of descendant leaves in each tree, since the sum of these counts is precisely
the number of root-to-leaf paths in which v lies. Once all vertices have their overall
score, the new blocker node c can be identified as one with the maximum score.
It now remains for each node v to update its scores to reflect the fact that paths
through c no longer exist in any of the trees. This update computation is divided
into two steps in Algorithm 2 (Chapter 5). In both steps, the main challenge is for a
given node to determine, in each tree Tx , whether it is an ancestor of c, a descendant
of c, or unrelated to c.
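As an illustration of the score definition used here (the number of root-to-leaf paths through v equals the sum, over all trees, of v's descendant-leaf counts), the following centralized sketch computes the scores from explicit tree descriptions. The representation of a tree as a child-list dictionary and both function names are assumptions for this example; the distributed computation in the dissertation proceeds differently.

```python
def descendant_leaves(children, root):
    """Count, for every node in the tree, the leaves in its subtree.

    children: dict mapping a node to the list of its children (leaves absent).
    A leaf counts itself, since it lies on exactly one root-to-leaf path.
    """
    count = {}

    def visit(v):
        kids = children.get(v, [])
        count[v] = 1 if not kids else sum(visit(c) for c in kids)
        return count[v]

    visit(root)
    return count


def total_scores(trees):
    """Sum the per-tree leaf counts. trees: dict root -> children dict."""
    score = {}
    for root, children in trees.items():
        for v, leaves in descendant_leaves(children, root).items():
            score[v] = score.get(v, 0) + leaves
    return score


if __name__ == "__main__":
    trees = {
        "x": {"x": ["a", "b"], "a": ["c", "d"]},
        "y": {"y": ["a"], "a": ["c"]},
    }
    print(total_scores(trees))
```

Note that this sketch also assigns a score to each root; whether roots are eligible as blocker nodes is not decided here.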
Algorithm 4 Pipelined Algorithm for updating scores at v in trees T_x in which v
is a descendant of newly chosen blocker node c
Input: Q: blocker set, c: newly chosen blocker node, S: set of sources
(only for c)
1: Local Step at c: create list_c to store the ID of each source x ∈ S such that
   score_x(c) ≠ 0; for each x ∈ S do set score_x(c) ← 0; set score(c) ← 0
2: Send: Round i: let ⟨x⟩ be the i-th entry in list_c; send ⟨x⟩ to c’s children in
   T_x.
(round r > 0: for vertices v ∈ V − Q − {c})
3: send[lines 3-4]: if v received a message ⟨x⟩ in round r − 1 then
4:     if v ≠ x then send ⟨x⟩ to v’s children in T_x
5: receive[lines 5-6]: if v receives a message ⟨x⟩ then
6:     score(v) ← score(v) − score_x(v); score_x(v) ← 0
round, r + 1, and also sets its score for source x to 0. Similar to the algorithm for
updating ancestors of c [10], it is readily seen that every descendant of c in every
tree Tx receives a message for x by round k + h − 1.
Lemma 6.4.6. Algorithm 4 correctly updates the scores of all nodes v in every tree
Tx in which v is a descendant of c in k + h − 1 rounds.
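The pipelining in Algorithm 4 can be illustrated with a small centralized simulation. This sketch assumes each tree T_x is given as a child-list dictionary, ignores the exclusion of nodes already in Q and the `v ≠ x` check, and returns the last round in which any message was delivered; the function name and data layout are hypothetical.

```python
def update_descendant_scores(trees, c, score_x, score):
    """Simulate Algorithm 4: zero out scores of c's descendants, tree by tree.

    trees[x]: dict node -> list of children in T_x.
    score_x[x][v]: per-tree score of v; score[v]: total score of v.
    Returns the last round in which a message <x> was received.
    """
    # Step 1 (local at c): sources whose tree still has paths through c.
    list_c = [x for x in trees if score_x[x].get(c, 0) != 0]
    for x in trees:
        score_x[x][c] = 0
    score[c] = 0

    in_flight = []  # (node, x): node received <x> in the previous round
    rnd, last = 0, 0
    while rnd < len(list_c) or in_flight:
        rnd += 1
        deliveries = []
        if rnd <= len(list_c):  # Step 2: c sends its rnd-th entry
            x = list_c[rnd - 1]
            deliveries += [(child, x) for child in trees[x].get(c, [])]
        for v, x in in_flight:  # Steps 3-4: forward to children in T_x
            deliveries += [(child, x) for child in trees[x].get(v, [])]
        for v, x in deliveries:  # Steps 5-6: receive and update
            score[v] -= score_x[x].get(v, 0)
            score_x[x][v] = 0
        if deliveries:
            last = rnd
        in_flight = deliveries
    return last


if __name__ == "__main__":
    trees = {10: {0: [1], 1: [2]}, 11: {0: [1, 3]}}
    score_x = {10: {0: 1, 1: 1, 2: 1}, 11: {0: 2, 1: 1, 3: 1}}
    score = {0: 3, 1: 2, 2: 1, 3: 1}
    print(update_descendant_scores(trees, 0, score_x, score), score)
```

In the example, k = 2 sources and depth h = 2 below c, and the last delivery happens in round 2, within the k + h − 1 = 3 rounds of Lemma 6.4.6.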
Here we deal with the problem of finding a (1+ε)-approximate solution to the weighted
APSP problem. If edge-weights are strictly positive, the following result is known.
Theorem 6.5.1 ([70, 63]). There is a deterministic algorithm that computes (1+ε)-
approximate APSP on graphs with positive polynomially bounded integer edge weights
in $O((n/\varepsilon^2) \cdot \log n)$ rounds.
The above result does not hold when zero weight edges are present. Here
we match the deterministic $O((n/\varepsilon^2) \cdot \log n)$-round bound for this problem with an
algorithm that also handles zero edge-weights.
We first compute reachability between all pairs of vertices connected by zero-
weight paths. This is readily computed in O(n) rounds, e.g., using [65, 47] while
considering only the zero weight edges (and ignoring the other edges).
We then consider shortest path distances between pairs of vertices that have
no zero-weight path connecting them. The weight of any such path is at least 1. To
approximate these paths we increase the zero edge-weights to 1 and transform every
non-zero edge weight w(e) to $n^2 \cdot w(e)$. Let this modified graph be G' = (V, E, w').
Thus the weight of an l-hop path p in G', w'(p), satisfies $w'(p) \le w(p) \cdot n^2 + l$. Since
the modified graph G' has polynomially bounded positive edge weights, we can use
the result in Theorem 6.5.1 to compute (1 + ε/3)-approximate APSP on this graph
in $\tilde{O}(9n/\varepsilon^2)$ rounds.
Fix a pair of vertices u, v. Let p be a shortest path from u to v in G, and let
its hop-length be l. Then $w'(p) \le n^2 \cdot w(p) + l$. Let p' be a (1 + ε/3)-approximate
shortest path from u to v, and let its hop-length be l. Then $w'(p') \le (1+\varepsilon/3) \cdot w'(p) \le
(1+\varepsilon/3) \cdot (n^2 \cdot w(p) + l)$. Dividing w'(p') by $n^2$ gives us $w'(p')/n^2 < w(p)(1+\varepsilon/3) +
(l/n^2)(1+\varepsilon/3) < w(p) + \varepsilon w(p)/3 + 2/n \le w(p)(1+\varepsilon/3) + 2\varepsilon/3 \le w(p)(1+\varepsilon)$ (as
long as ε > 3/n and since w(p) ≥ 1), and this establishes Theorem 6.1.5.
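The weight transformation used in this argument is easy to state in code. The sketch below is only an illustration of the transformation and of the inequality bounding the transformed weight of an l-hop path by n² times its original weight plus l; the function names and the edge-list format are assumptions, and no distributed computation is modeled.

```python
def scale_weights(n, edges):
    """Map zero weights to 1 and each non-zero weight w to n^2 * w.

    edges: list of (u, v, w) with non-negative integer weights.
    """
    return [(u, v, 1 if w == 0 else n * n * w) for u, v, w in edges]


def path_weight(path_edges):
    """Total weight of a path given as a list of (u, v, w) edges."""
    return sum(w for _, _, w in path_edges)


if __name__ == "__main__":
    n = 4
    path = [(0, 1, 0), (1, 2, 3), (2, 3, 0)]   # a 3-hop path of weight 3
    scaled = scale_weights(n, path)
    print(scaled, path_weight(scaled), n * n * path_weight(path) + len(path))
```

Dividing the transformed weight by n² recovers the original weight up to an additive term below l/n² ≤ 1/n, which is why the zero-weight edges can be absorbed into the approximation error.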
We adapt the randomized framework of Huang et al. [50] to obtain a simple
randomized algorithm for weighted APSP with arbitrary edge weights. Our randomized
algorithm runs in $\tilde{O}(n^{4/3})$ rounds w.h.p. in n. We describe our randomized algorithm
below.
As described in Section 6.2.6, Huang et al. [50] use two algorithms short-range
and short-range-extension for integer-weighted APSP for which they have randomized
algorithms that run in $\tilde{O}(n\sqrt{h})$ rounds w.h.p. in n. (We presented simplified
versions of these two algorithms in Section 6.2.6.) Since we consider arbitrary edge
weights here, we will instead use h rounds of the Bellman-Ford algorithm [15] for
both steps, which will take O(kh) rounds for k source nodes.
We keep the remaining steps in [50] unchanged: These steps involve having
every ‘center’ c broadcast its estimated shortest distances, δ(c', c), from every other
center c', and each source node x ∈ S sending its correct shortest distance, δ(x, c), to
each center c. (The set of centers is a random subset of vertices in G of size $\tilde{O}(\sqrt{n})$.)
These steps are shown in [50] to take $\tilde{O}(n + \sqrt{nkq})$ rounds in total w.h.p. in n,
where $q = \Theta(\frac{n \log n}{h})$. This gives an overall round complexity $\tilde{O}(kh + n + \sqrt{nkq})$ for
our algorithm. Setting $h = n^{2/3}/k^{1/3}$ and $q = n^{1/3} k^{1/3} \log n$, we obtain the desired
bound of $\tilde{O}(n + n^{2/3} k^{2/3})$ in Theorem 6.1.6.
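For reference, the substitution behind this bound can be written out as follows. This is a worked check of the arithmetic only, using the values of h and q given in the text.

```latex
\begin{align*}
q &= \Theta\!\left(\frac{n \log n}{h}\right)
   = \Theta\!\left(n \log n \cdot \frac{k^{1/3}}{n^{2/3}}\right)
   = \Theta\!\left(n^{1/3} k^{1/3} \log n\right), \\
kh &= k \cdot \frac{n^{2/3}}{k^{1/3}} = n^{2/3} k^{2/3}, \\
\sqrt{nkq} &= \sqrt{n \cdot k \cdot n^{1/3} k^{1/3} \log n}
   = n^{2/3} k^{2/3} \sqrt{\log n}.
\end{align*}
```

Both the kh term and the square-root term are therefore within a polylogarithmic factor of the same quantity, and adding the linear term n gives the stated bound.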
6.6 Conclusion
technique [34]. Our current algorithm assumes that all sources see the same weight
on each edge, while in the scaling algorithm each source sees a different edge weight
on a given edge. While this can be handled with n different SSSP computations in
conjunction with the randomized scheduling result of Ghaffari [39], it will be very
interesting to see if a deterministic pipelined strategy could achieve the same result.
In Chapter 7 we present an $\tilde{O}(n^{4/3})$-round deterministic APSP algorithm that
improves on the results presented in this Chapter and Chapter 5. The main component
of this algorithm is a faster algorithm for computing blocker set deterministically
and a new approach to propagate distance values from source nodes to blocker
nodes.
Chapter 7
7.1 Introduction
In this Chapter we present an $\tilde{O}(n^{4/3})$-round deterministic algorithm for the weighted
APSP problem. Table 4.1 (Chapter 4) compares our result with the earlier results
for this problem. All of these results as well as our new result can handle zero
weight edges, and these algorithms are qualitatively different from algorithms for
unweighted APSP.
Our new deterministic algorithm directly improves on the APSP algorithm
in Chapter 5 (Algorithm 1, Chapter 5). The algorithm in Chapter 5 (Algorithm 1)
computes Step 1 in O(n · h) rounds by running the distributed Bellman-Ford algo-
rithm for h hops from each source. Our algorithm leaves Step 1 unchanged, but it
improves on both Step 2 and Step 3.
For Step 2, in Chapter 5 we described a deterministic algorithm that greedily
chooses vertices to add to Q at the cost of O(n) rounds per vertex added, for the
cleanup cost for removing paths that are covered by this newly chosen vertex; this
is after an initial start-up cost of O(n · h). This gives an overall cost of O(nh + nq)
for Step 2, where q = |Q| = O((n/h) · log n). Our new contribution is to construct
Q in a sequence of polylog(n) steps, where each step adds several vertices to Q.
Our method incurs a cleanup cost of O(|S| · h) rounds per step after an initial
start-up cost of O(|S| · h) rounds for an arbitrary source set S, thereby removing
the dependence on q from this bound. (S = V gives the standard setting used in
previous APSP algorithms.) We achieve this by framing the computation of a small
blocker set as an approximate set cover problem on a related hypergraph. We then
adapt the efficient NC algorithm in Berger et al. [16] for computing an approximate
minimum set cover in a hypergraph to an Õ(|S| · h)-round Congest algorithm. As
in [16] this involves two main parts. We first give a randomized Õ(|S| · h)-round
algorithm that computes a blocker set of expected size Õ(n/h) using only pairwise
independent random variables. We then derandomize this algorithm, again with an
Õ(|S| · h)-round algorithm.
For Step 3, in Chapter 5 we gave a deterministic O(n · q)-round algorithm,
and [50] gave a randomized $\tilde{O}(n \cdot \sqrt{q} + n \cdot \sqrt{h})$-round algorithm. We replace the
$n \cdot \sqrt{h}$ randomized algorithm used in [50] with a simple n · h round algorithm (similar
to Step 1). The randomized $O(n \cdot \sqrt{q})$ method in [50] solves the reversed q-sink
shortest paths problem and appears to use randomization in a crucial manner, by
invoking the randomized scheduling result of Ghaffari [39], which allows multiple
algorithms to run concurrently in O(∆ + κ · log n) rounds, where ∆ bounds the
dilation of any of the concurrent algorithms and κ bounds the congestion on any
edge when considering all algorithms. It is known that this result in [39] cannot
be derandomized in a completely general setting. For Step 3, our contribution is
to give a deterministic $\tilde{O}(n \cdot \sqrt{q})$-round algorithm for the reversed q-sink shortest
paths problem. Our algorithm uses a simple round-robin pipelined approach. To
obtain the desired round bound we rephrase the algorithm to work in frames, which
allows us to establish suitable progress in the pipelining to show that it terminates in
$\tilde{O}(n \cdot \sqrt{q})$ rounds. We note that the standard known results on efficiently broadcasting
multiple values, and on sending or receiving messages using the routing schedule in
an undirected APSP algorithm [48, 64] do not apply to this setting.
Finally we obtain the $\tilde{O}(n^{4/3})$ bound on the number of rounds by balancing
the $\tilde{O}(nh)$ bound for Steps 1 and 2 with the $\tilde{O}(n \cdot \sqrt{q})$ bound for the reversed q-sink
shortest path problem, as stated in the following theorem.
Algorithm 5 gives our overall APSP algorithm. In Step 1 we use the (simple) O(n · h)-
round algorithm in [8] to compute h-hop Consistent SSSP (CSSSP) for the vertex
set V (Definition 6.4.1, Chapter 6). The advantage of using h-hop CSSSPs instead of
other types of h-hop shortest paths is that the CSSSPs create a consistent collection
of paths across all trees in the collection, i.e., a path from u to v is the same in all trees
T in the CSSSP collection C (in which such a path exists). We exploit this useful
property of CSSSPs throughout this chapter.
Step 2 computes a blocker set Q (Definition 5.2.1, Chapter 5). Our deterministic
blocker set algorithm for Step 2 is completely different from the blocker set
algorithms in Chapters 5 and 6 with significant improvement in the round complexity.
We describe this algorithm in Section 7.3. Our blocker set algorithm is based on
the NC approximate Set Cover algorithm of Berger et al. [16] and runs in $\tilde{O}(|S| \cdot h)$
rounds, where S is the set of sources. Previous deterministic blocker set algorithms
in Chapters 5 and 6 have an additional $\tilde{O}(n \cdot |Q|)$ term in the round complexity.
In Step 3 we compute, for each c ∈ Q, the h-hop in-SSSP rooted at c, which
is the set of in-coming h-hop shortest paths ending at node c. We can compute
these h-hop in-SSSPs in O(h) rounds per source using Bellman-Ford algorithm [15].
In Step 4 every blocker node c ∈ Q broadcasts its ID and the corresponding h-hop
shortest path distance values $\delta_h(c, c')$ for every c' ∈ Q. Step 5 is a local computation
step where every node x computes its shortest path distances δ(x, c) to every c ∈ Q
using the shortest path distance values it computed and received in Steps 3 and 4
respectively.
In Step 6 every node x wants to send each shortest path distance value δ(x, c)
it computed in Step 5 to blocker node c ∈ Q. This is the reversed q-sink shortest
path problem, where q = |Q|, and is the other crucial step in our APSP algorithm.
This step requires sending Õ(n5/3 ) different distance values across Õ(n2/3 ) different
sources (using |Q| = Õ(n2/3 )). A trivial solution is to broadcast all these messages in
the network, resulting in a round complexity of Õ(n5/3 ) rounds. However this is the
only method known so far to implement this step deterministically. In Section 7.4
we give a pipelined algorithm for implementing this step more efficiently in Õ(n4/3 )
rounds. After the execution of Step 6 every blocker node c ∈ Q knows its shortest
path distance from every node x ∈ V .
Finally, in Step 7 for every source x ∈ V , we run Bellman-Ford algorithm for
h hops with distance values δ(x, c) used as the initialization values at every blocker
node c ∈ Q. These constructed paths are also known as extended h-hop shortest
paths [50]. After this step, each t ∈ V knows the shortest path distance value δ(x, t)
from every source x ∈ V , which gives the desired APSP output. We describe Step 7
in Section 7.5.2.
Proof of Theorem 7.1.1. Fix a pair of nodes x and t. If the shortest path from x to
t has less than h hops, then δ(x, t) = δh (x, t) and the correctness is straightforward
from Lemma 6.4.2.
Otherwise, we can divide the shortest path from x to t into subpaths x to
$c_1$, $c_1$ to $c_2$, . . ., $c_l$ to t where $c_i \in Q$ for 1 ≤ i ≤ l and each of these subpaths has
hop-length at most h. Since x knows $\delta_h(x, c_1)$ from Step 3 and $\delta_h(c_i, c_{i+1})$ distance
values from Step 4, it can correctly compute $\delta(x, c_l)$ distance value in Step 5. And
from Lemmas 7.4.1 and 7.4.4, $c_l$ knows the distance value $\delta(x, c_l)$ after Step 6. Since
the shortest path from $c_l$ to t has hop-length at most h, from Lemma 7.5.7 t will
compute δ(x, t) in Step 7.
Step 1 runs in $O(nh) = O(n^{4/3})$ rounds (Lemma 6.4.3, Chapter 6). In Section 7.3,
we will give an $\tilde{O}(nh) = \tilde{O}(n^{4/3})$-round algorithm to compute a blocker set
of size $q = \tilde{O}(n/h) = \tilde{O}(n^{2/3})$ (Step 2). Step 3 takes $O(|Q| \cdot h) = \tilde{O}(n)$ rounds using
Bellman-Ford algorithm (Lemma 6.4.3, Chapter 6). Since $|Q|^2 = \tilde{O}(n^2/h^2) = \tilde{O}(n^{4/3})$,
Step 4 takes $\tilde{O}(n^{4/3})$ rounds using Lemma 5.2.5 (Chapter 5). Step 5 is local computation
and has no communication. From Lemmas 7.4.1 and 7.4.5, Step 6 takes $\tilde{O}(n^{4/3})$
rounds and Step 7 can be computed in $O(nh) = O(n^{4/3})$ rounds using Lemma 7.5.7.
Hence the overall algorithm runs in $\tilde{O}(n^{4/3})$ rounds.
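The per-step accounting above can be double-checked with exact rational exponents. The following sketch tracks only the exponent of n in each step's round bound (polylogarithmic factors are ignored), assuming h = n^{1/3} and q = n/h; the step labels and function name are illustrative.

```python
from fractions import Fraction


def step_exponents(h_exp=Fraction(1, 3)):
    """Exponent of n in each step's round bound, for h = n^h_exp and q = n/h."""
    q_exp = 1 - h_exp
    return {
        "step1_cssp": 1 + h_exp,                  # O(n * h)
        "step2_blocker_set": 1 + h_exp,           # ~O(n * h)
        "step3_in_sssp": q_exp + h_exp,           # O(|Q| * h)
        "step4_broadcast": 2 * q_exp,             # ~O(|Q|^2)
        "step6_reversed_q_sink": 1 + q_exp / 2,   # ~O(n * sqrt(q))
        "step7_extension": 1 + h_exp,             # O(n * h)
    }


if __name__ == "__main__":
    exps = step_exponents()
    for name, e in exps.items():
        print(name, e)
    print("overall:", max(exps.values()))
```

With h = n^{1/3} every step except Step 3 has exponent 4/3, which is the balance point between the O(nh) steps and the n·sqrt(q) step.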
In this section we describe our algorithm to compute a small blocker set. We frame
this problem as that of finding a small set cover in an associated hypergraph. We then
adapt the efficient NC algorithm for finding a provably good approximation to this
NP-hard problem given in Berger et al. [16] to obtain our deterministic distributed
algorithm.
As in [16] our algorithm has two parts. We first present a randomized algorithm
to find a blocker set of size $\tilde{O}(n/h)$ in $\tilde{O}(|S| \cdot h)$ rounds using only pairwise
independence. This is described in Section 7.3.1. Then in Section 7.3.2 we describe
how to use the exhaustive search technique of Luby [68] along with the ideas from [16]
to derandomize this algorithm, again in $\tilde{O}(|S| \cdot h)$ rounds. In our overall APSP
algorithm S = V but we will also use this algorithm in Section 7.4 with a different set
for S.
Table 7.1: Notations
Global Parameters:
  C            h-hop CSSSP collection
  S            set of sources in C
  h            number of hops in a path
  n            number of nodes
  ε, δ         positive constants ≤ 1/12
  Q            blocker set (being constructed)
  score(v)     number of root-to-leaf paths in C that contain v (local var. at v)
  V_i          set of nodes v with score(v) ≥ (1 + ε)^{i−1}
  P_i          set of paths in C with at least one node in V_i
  P_ij         set of paths in P_i with at least (1 + ε)^{j−1} nodes in V_i
  score_ij(v)  number of paths in P_ij that contain v (local var. at v)
CSSSP collection C in a graph G = (V, E) to the minimum set cover problem in the
hypergraph H = (V, F ) where V remains the vertex set of G and each edge in F
consists of the vertices in a root-to-leaf path in a tree in C. This hypergraph has
n vertices and at most n · |S| edges, where |S| is the number of sources (i.e., trees)
in C. Each edge in F has exactly h vertices. We now use this mapping to rephrase
the algorithm in [16] in our setting, and we derive an Õ(|S| · h)-round randomized
algorithm to compute a blocker set of expected size within O(log n) of the optimal
size, using only pairwise independent random variables. Since we know there exists
a blocker set of size O((n/h) · log n) the size of the blocker set constructed by this
randomized algorithm is Õ(n/h).
Our randomized blocker set method is in Algorithm 6. Table 7.1 presents the
notation we use for this section. In Step 1 for each node v we compute score(v), the
number of h-hop shortest paths in CSSSP collection C that contain node v. This can
be done in O(|S| · h) rounds for all nodes v ∈ V using Algorithm 3 in Chapter 5. Our
algorithm proceeds in stages from $i = \log_{1+\varepsilon} n^2$ down to 2 (Steps 2-17), where ε is a
small positive constant ≤ 1/12, such that at the start of stage i, all nodes in V have
score value at most $(1+\varepsilon)^i$ and in stage i we focus on $V_i$, the set of nodes v with
score value greater than $(1+\varepsilon)^{i-1}$. (This ensures that the nodes that are added to
the blocker set have their score values near the maximum score value). Let $P_i$ be the
set of paths in C that contain a vertex in $V_i$ and let $P_{iv}$ be the set of paths in $P_i$ with
v as the leaf node. These sets are readily computed in O(n) and O(|S| · h) rounds
respectively (see Section 7.5.1.1).

Algorithm 6 Randomized Blocker Set Algorithm
Input: S: set of source nodes; h: number of hops; C: collection of h-hop CSSSP for
set S; ε, δ: positive constants ≤ 1/12
1: Compute score(v) for all nodes v ∈ V using an algorithm from [10].
2: for stage i = log_{1+ε} n² down to 1 do    ▷ All nodes have score less than (1 + ε)^i
3:     Compute V_i and broadcast it using the algorithm described in Section 7.5.1.1.
Similar to [16], in order to ensure that the average number of paths covered by
the newly chosen blocker nodes is near the maximum score value, we further divide
our algorithm for stage i into a sequence of $\log_{1+\varepsilon} h = \log_{1+\varepsilon} n^{1/3}$ phases, where in
each phase j we focus on the paths in $P_i$ with at least $(1+\varepsilon)^{j-1}$ nodes in $V_i$. We call
this set of paths $P_{ij}$. We maintain that at the start of phase j, every path in $P_i$ has
at most $(1+\varepsilon)^j$ nodes in $V_i$. We now describe our algorithm for phase j (Steps 5-17).
The algorithm for phase j consists of a series of selection steps (Steps 6-17) (similar
to [16]) which are performed until there are no more paths in $P_{ij}$.
Now we describe how we select nodes to add to blocker set Q. Let δ be some
fixed positive constant less than or equal to 1/12. In Step 9 we check if there exists a
node v which covers at least $\delta^3/(1+\varepsilon)$ fraction of paths in $P_{ij}$ and if so, we add this
node to the blocker set in Step 10. In case of multiple such nodes, we pick the one
with the maximum $score_{ij}$ value and break ties using node IDs. Otherwise in Step 12,
we randomly pick every node with probability $\delta/(1+\varepsilon)^j$, pairwise independently,
and form a set A. In Step 15 we check if A is a good set, otherwise we try again and
form a new set A in Step 12. As in [16] we define the notion of a good set as given
below and we will later show that A is a good set with probability at least 1/8.
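A single selection step, as described in the paragraph above, might be sketched as follows. This is a simplified centralized illustration with several assumptions that are not from the source: it samples with fully independent coin flips rather than pairwise-independent ones, it omits the good-set check of Step 15, it breaks ties toward the smaller node ID, and the function name and the path representation (sets of node IDs) are hypothetical.

```python
import random


def selection_step(paths_ij, v_i, eps, delta, j, rng):
    """Return the set of nodes chosen in one selection step of phase j.

    paths_ij: list of sets of node IDs (the paths currently in P_ij).
    v_i: set of candidate node IDs (V_i).
    rng: a random.Random instance (full independence, a simplification).
    """
    score_ij = {v: sum(1 for p in paths_ij if v in p) for v in v_i}
    threshold = (delta ** 3 / (1 + eps)) * len(paths_ij)
    heavy = [v for v in v_i if score_ij[v] >= threshold]
    if heavy:
        # A single node covers a delta^3 / (1 + eps) fraction of P_ij: take
        # the one with the largest score (assumed tie-break: smaller ID).
        best = max(heavy, key=lambda v: (score_ij[v], -v))
        return {best}
    # Otherwise sample every node of V_i with probability delta / (1+eps)^j.
    prob = delta / (1 + eps) ** j
    return {v for v in sorted(v_i) if rng.random() < prob}


if __name__ == "__main__":
    rng = random.Random(0)
    paths = [{1, 2}, {1, 3}]
    print(selection_step(paths, {1, 2, 3}, eps=1 / 12, delta=1 / 12, j=1, rng=rng))
```

In the example, node 1 lies on both paths and exceeds the coverage threshold, so it is selected deterministically and no sampling takes place.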
Before the next selection step, we remove the paths covered by these newly
chosen nodes from the collection C and recompute the score values and the sets
$V_i$ and $P_i$ (Steps 16-17).
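The selection step described above can be sketched sequentially as follows. This is a minimal illustration rather than the distributed algorithm itself: the function name, its arguments, and the rule that ties go to the smaller node ID are assumptions, and the sampling uses fully independent coin flips from Python's `random` module instead of the pairwise independent choices used in the analysis. The good-set check of Step 15 is not modelled.

```python
import random

def selection_step(paths_ij, v_i, delta, eps, j, rng):
    """One selection step of phase j: return the nodes to add to Q.

    paths_ij: list of paths (tuples of node IDs) currently in P_ij.
    v_i: set of candidate node IDs (the set V_i).
    rng: a random.Random instance (hypothetical parameter for reproducibility).
    """
    # score_ij(v): number of paths in P_ij that contain v.
    score = {v: 0 for v in v_i}
    for path in paths_ij:
        for v in set(path) & v_i:
            score[v] += 1

    # Steps 9-10: a node covering a delta^3/(1+eps) fraction is taken directly.
    threshold = (delta ** 3 / (1 + eps)) * len(paths_ij)
    heavy = [v for v in v_i if score[v] >= threshold]
    if heavy:
        # Maximum score first; ties go to the smaller ID (an assumption).
        return {min(heavy, key=lambda v: (-score[v], v))}

    # Step 12: otherwise sample every node with probability delta/(1+eps)^j.
    p = delta / (1 + eps) ** j
    return {v for v in v_i if rng.random() < p}
```

In small inputs the threshold is tiny, so the heavy-node branch almost always fires; the sampling branch only becomes relevant once $P_{ij}$ holds on the order of $1/\delta^3$ paths or more.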
7.3.1.1 Analysis of the Randomized Algorithm
Similar to [16] we get the following Lemmas which give us a bound on the number
of selection steps and a bound on the size of Q. Table 7.2 presents the notation we
use in our analysis in this section.
Lemma 7.3.2. The set Q constructed in Algorithm 6 is a blocker set for the CSSSP
collection C.
Proof. To show that Q is a blocker set, we need to show that the computed blocker
set Q indeed covers all paths in the CSSSP collection C. The while loop in Steps 6-17
runs as long as there is a path in $P_i$ with at least $(1+\epsilon)^{j-1}$ nodes in $V_i$, and since
this loop terminated for i = 1 and j = 1, it implies that there is no path in C which
is not covered by some node in Q.
Lemma 7.3.3. If the check in Step 9 fails, then $|V_i| > \frac{(1+\epsilon)^j}{\delta^3}$.
Proof. Since no node in $V_i$ covers a $\frac{\delta^3}{(1+\epsilon)}$ fraction of paths from $P_{ij}$, the total of the
$score_{ij}$ values (defined in Step 8) over all nodes in $V_i$ is at most $|V_i| \cdot \frac{\delta^3}{(1+\epsilon)} \cdot |P_{ij}|$.
And since every path in $P_{ij}$ has at least $(1+\epsilon)^{j-1}$ nodes in $V_i$,
\[ |V_i| \cdot \frac{\delta^3}{(1+\epsilon)} \cdot |P_{ij}| > |P_{ij}| \cdot (1+\epsilon)^{j-1} \]
This establishes that $|V_i| > \frac{(1+\epsilon)^j}{\delta^3}$.
Lemma 7.3.4. The set A constructed in Step 12 of Algorithm 6 has size at most
$(\delta + 2\delta^2) \cdot \frac{|V_i|}{(1+\epsilon)^j}$ and at least $(\delta - 2\delta^2) \cdot \frac{|V_i|}{(1+\epsilon)^j}$
with probability at least 3/4.
Table 7.2: List of Notations Used in the Analysis of the Randomized Algorithm
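\[ E\Big[\sum_{v \in V_i} X_v\Big] = |V_i| \cdot \frac{\delta}{(1+\epsilon)^j} \tag{7.1} \]
\[ \mathrm{Var}\Big[\sum_{v \in V_i} X_v\Big] = |V_i| \cdot \mathrm{Var}[X_v] \le |V_i| \cdot E[X_v^2] = |V_i| \cdot \frac{\delta}{(1+\epsilon)^j} \tag{7.2} \]
\begin{align*}
\big| |A| - E[|A|] \big| &\le 2 \sqrt{\mathrm{Var}[|A|]} \\
  &\le 2 \sqrt{|V_i| \cdot \frac{\delta}{(1+\epsilon)^j}} \\
  &\le 2 \cdot |V_i| \cdot \frac{\delta^2}{(1+\epsilon)^j} \quad \text{(by Lemma 7.3.3, } \tfrac{1}{|V_i|} < \tfrac{\delta^3}{(1+\epsilon)^j}\text{)}
\end{align*}
\[ |A| \le |V_i| \cdot \frac{\delta}{(1+\epsilon)^j} + 2 \cdot |V_i| \cdot \frac{\delta^2}{(1+\epsilon)^j} \]
Using the above analysis we can also show that $|A| \ge (\delta - 2\delta^2) \cdot \frac{|V_i|}{(1+\epsilon)^j}$ with
probability at least 3/4.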
that even though the lower bound achieved using this term is very weak, improving
it further will not improve the overall bound on the size of A by more than a polylog
factor (Lemma 7.3.4).
Now we show that the value of Y is $\ge |A| \cdot (1+\epsilon)^i \cdot (1 - 3\delta - \epsilon)$ with probability
at least 1/2.
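We first split Y into $Y_1$ and $Y_2$ where $Y_1 = \sum_{p \in P_i} \sum_{v \in V_i \cap p} X_v$ and
$Y_2 = \sum_{p \in P_i} \sum_{v,v' \in V_i \cap p} X_v \cdot X_{v'}$.
\begin{align*}
Y_1 &= \sum_{p \in P_i} \sum_{v \in V_i \cap p} X_v \\
    &= \sum_{v \in V_i} \sum_{\{p \in P_i : v \in p\}} X_v \\
    &\ge (1+\epsilon)^{i-1} \cdot \sum_{v \in V_i} X_v \quad \text{(since every node in $V_i$ lies in $\ge (1+\epsilon)^{i-1}$ paths in $P_i$)} \\
    &= (1+\epsilon)^{i-1} \cdot |A|
\end{align*}
\begin{align*}
E[Y_2] &= \sum_{p \in P_i} \sum_{v,v' \in V_i \cap p} E[X_v \cdot X_{v'}] \\
    &= \sum_{p \in P_i} \sum_{v,v' \in V_i \cap p} E[X_v] \cdot E[X_{v'}] \quad \text{(follows since $X_v$ and $X_{v'}$ are pairwise independent)} \\
    &= \sum_{p \in P_i} \binom{n_{V_i,p}}{2} \left(\frac{\delta}{(1+\epsilon)^j}\right)^2 \\
    &\le (1+\epsilon)^j \cdot \sum_{p \in P_i} \frac{n_{V_i,p}}{2} \left(\frac{\delta}{(1+\epsilon)^j}\right)^2 \quad \text{(since $n_{V_i,p} \le (1+\epsilon)^j$)} \\
    &\le (1+\epsilon)^j \cdot \frac{\sum_{v \in V_i} score(v)}{2} \cdot \left(\frac{\delta}{(1+\epsilon)^j}\right)^2 \\
    &\le (1+\epsilon)^j \cdot \frac{|V_i|}{2} \cdot \max_{v \in V_i} score(v) \cdot \left(\frac{\delta}{(1+\epsilon)^j}\right)^2 \\
    &\le \frac{|V_i|}{2} \cdot (1+\epsilon)^{i-j} \cdot \delta^2 \tag{7.3}
\end{align*}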
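Now using Markov's inequality we get the following upper bound on $Y_2$ with
probability at least 3/4:
\[ Y_2 \le 4 E[Y_2] \le 2 \cdot |V_i| \cdot (1+\epsilon)^{i-j} \cdot \delta^2 \]
Since $|A| \ge (\delta - 2\delta^2) \cdot \frac{|V_i|}{(1+\epsilon)^j}$ with probability at least 3/4 by Lemma 7.3.4,
$Y_2 \le 2\delta^2 \cdot (1+\epsilon)^i \cdot \frac{|A|}{(\delta - 2\delta^2)}$ with probability at least 1/2.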
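Combining the bounds for $Y_1$ and $Y_2$ we get the following lower bound on Y
with probability at least 1/2:
\begin{align*}
Y &= Y_1 - Y_2 \\
  &\ge (1+\epsilon)^{i-1} \cdot |A| - 2\delta^2 \cdot (1+\epsilon)^i \cdot \frac{|A|}{(\delta - 2\delta^2)} \\
  &= (1+\epsilon)^i \cdot |A| \cdot \left(\frac{1}{1+\epsilon} - \frac{2\delta}{1 - 2\delta}\right) \\
  &= (1+\epsilon)^i \cdot |A| \cdot \left(1 - \frac{\epsilon}{1+\epsilon} - \frac{3\delta}{3/2 - 3\delta}\right) \\
  &\ge (1+\epsilon)^i \cdot |A| \cdot (1 - \epsilon - 3\delta)
\end{align*}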
δ/2 fraction of paths in Pij with probability at least 5/8.
Proof. Similar to the proof of Lemma 7.3.5 we can lower bound the number of paths
covered by set A in $P_{ij}$ by the term $\sum_{p \in P_{ij}} \big[ \sum_{v \in V_i \cap p} X_v - \sum_{v,v' \in V_i \cap p} X_v \cdot X_{v'} \big]$. Let
this term be $Y'$, with first term $Y_3$ and the second term $Y_4$. As noted in the proof of
Lemma 7.3.5, this term gives a very weak lower bound (however the sum here is over
the paths in the set $P_{ij}$ instead of $P_i$) and it is sufficient to get our desired bound
on the size of A.
Note that even though the lower bound achieved using this term is very weak,
improving it further will not improve the overall bound on the size of A by more
than a polylog factor (Lemma 7.3.4).
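Now we need to show that $Y' \ge \frac{\delta}{2} \cdot |P_{ij}|$ with probability at least 5/8.
We first give a lower bound on $Y_3$. To get the lower bound, we first compute
a lower bound on $E[Y_3]$ and an upper bound on $\mathrm{Var}[Y_3]$ and then use Chebyshev’s
inequality. (Let $n_{v,P_{ij}}$ represent the number of paths in $P_{ij}$ that contain node v.
Since no node covers at least a $\frac{\delta^3}{(1+\epsilon)}$ fraction of paths in $P_{ij}$, $n_{v,P_{ij}} < \frac{\delta^3}{(1+\epsilon)} \cdot |P_{ij}|$.)
\begin{align*}
E[Y_3] &= E\Big[\sum_{p \in P_{ij}} \sum_{v \in V_i \cap p} X_v\Big] \\
  &\ge E\Big[\sum_{p \in P_{ij}} (1+\epsilon)^{j-1} \cdot X_v\Big] \quad \text{(since every path in $P_{ij}$ has at least $(1+\epsilon)^{j-1}$ nodes from $V_i$)} \\
  &= (1+\epsilon)^{j-1} \cdot |P_{ij}| \cdot \frac{\delta}{(1+\epsilon)^j} \\
  &= |P_{ij}| \cdot \frac{\delta}{(1+\epsilon)}
\end{align*}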
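\begin{align*}
\mathrm{Var}[Y_3] &= \mathrm{Var}\Big[\sum_{p \in P_{ij}} \sum_{v \in V_i \cap p} X_v\Big] \\
  &= \mathrm{Var}\Big[\sum_{v \in V_i} \sum_{\{p \in P_{ij} : v \in p\}} X_v\Big] \\
  &= \mathrm{Var}\Big[\sum_{v \in V_i} n_{v,P_{ij}} X_v\Big] \\
  &= \sum_{v \in V_i} n_{v,P_{ij}}^2 \cdot \mathrm{Var}[X_v] \quad \text{(linearity of variance follows since the $X_v$'s are pairwise independent)} \\
  &\le \frac{\delta}{(1+\epsilon)^j} \cdot \frac{\delta^3}{(1+\epsilon)} \cdot |P_{ij}| \cdot \sum_{v \in V_i} n_{v,P_{ij}} \quad \text{(since $n_{v,P_{ij}} < \tfrac{\delta^3}{(1+\epsilon)} \cdot |P_{ij}|$)} \\
  &\le \frac{\delta^4}{(1+\epsilon)^{j+1}} \cdot |P_{ij}| \cdot |P_{ij}| \cdot (1+\epsilon)^j \quad \text{(since every path in $P_{ij}$ has at most $(1+\epsilon)^j$ nodes from $V_i$)} \\
  &\le \delta^4 \cdot |P_{ij}|^2 \tag{7.4}
\end{align*}
By Chebyshev’s inequality, the following holds with probability at least 7/8:
\begin{align*}
|Y_3 - E[Y_3]| &\le 2\sqrt{2} \sqrt{\mathrm{Var}[Y_3]} \\
Y_3 &\ge E[Y_3] - 2\sqrt{2}\,\delta^2 \cdot |P_{ij}| \\
    &\ge |P_{ij}| \cdot \frac{\delta}{(1+\epsilon)} - 2\sqrt{2}\,\delta^2 \cdot |P_{ij}|
\end{align*}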
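\begin{align*}
E[Y_4] &= \sum_{p \in P_{ij}} \sum_{v,v' \in V_i \cap p} E[X_v \cdot X_{v'}] \\
  &= \sum_{p \in P_{ij}} \sum_{v,v' \in V_i \cap p} E[X_v] \cdot E[X_{v'}] \quad \text{(follows since $X_v$ and $X_{v'}$ are pairwise independent)} \\
  &\le |P_{ij}| \cdot \frac{(1+\epsilon)^{2j}}{2} \cdot \left(\frac{\delta}{(1+\epsilon)^j}\right)^2 \quad \text{(since there are at most $(1+\epsilon)^j$ nodes from $V_i$ in any path in $P_{ij}$)} \\
  &= |P_{ij}| \cdot \frac{\delta^2}{2} \tag{7.5}
\end{align*}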
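Now using Markov's inequality we get the following upper bound on $Y_4$ with
probability at least 3/4:
\[ Y_4 \le 4 E[Y_4] \le 2\delta^2 \cdot |P_{ij}| \]
Combining the bounds for $Y_3$ and $Y_4$ we get the following lower bound on $Y'$
with probability at least 5/8:
\begin{align*}
Y' &= Y_3 - Y_4 \\
   &\ge |P_{ij}| \cdot \frac{\delta}{(1+\epsilon)} - 2\sqrt{2}\,\delta^2 \cdot |P_{ij}| - 2\delta^2 \cdot |P_{ij}| \\
   &\ge |P_{ij}| \cdot \delta \cdot (1 - \epsilon - 5\delta) \\
   &\ge |P_{ij}| \cdot \frac{\delta}{2} \quad \text{(since $\epsilon, \delta \le 1/12$)}
\end{align*}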
Lemma 7.3.7. The set A constructed in Step 12 is a good set with probability at
least 1/8.
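Proof. This is immediate from Lemmas 7.3.5 and 7.3.6: their bounds fail with probability
at most 1/2 and 3/8 respectively, so by a union bound both hold with probability at least 1/8.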
Lemma 7.3.8. The while loop in Steps 6-17 runs for at most $O(\log^3 n/(\delta^3 \cdot \epsilon^2))$
iterations in total.
Proof. The while loop runs as long as $P_{ij}$ is non-empty, i.e. there exists a path in $P_i$ with
at least $(1+\epsilon)^{j-1}$ nodes in $V_i$. In each iteration, the algorithm either covers at least a
$\frac{\delta^3}{(1+\epsilon)}$ fraction of paths in $P_{ij}$ (if node c is added to blocker set Q in Step 10) or at
least a $\frac{\delta}{2}$ fraction of paths from $P_{ij}$ (if set A is added to Q in Step 15). Since there are
at most $n^2$ paths and each iteration of the while loop covers at least a $\frac{\delta^3}{(1+\epsilon)}$ fraction
of $P_{ij}$, there are at most
\[ O\left(\frac{\log n^2}{\log \frac{1}{1 - \frac{\delta^3}{(1+\epsilon)}}}\right) = O\left(\frac{(1+\epsilon) \log n}{\delta^3}\right) = O\left(\frac{\log n}{\delta^3}\right) \]
iterations.
Since both the inner and outer for loops run for $O(\log_{1+\epsilon} n) = O(\frac{\log n}{\epsilon})$ iterations,
this establishes the lemma.
Lemma 7.3.9. Each iteration of the inner for loop (Steps 5-17) in Algorithm 6 takes
$\tilde{O}\left(\frac{|S| \cdot h}{\delta^3}\right)$ rounds in expectation.
Proof. We first show that each iteration of the while loop in Steps 6-17 takes O(|S|·h)
rounds in expectation. Step 7 takes O(|S| · h) rounds by Lemma 7.5.3, and
so does Step 8 (by [10] and Lemma 5.2.5). The check in Step 9 involves no communication,
and neither does Step 10, since every node knows the $score_{ij}$ values for every other
node and also the value of $|P_{ij}|$, i.e. the number of paths that belong to $P_{ij}$. Steps 12
and 15 are also local steps and do not involve any communication. Steps 13 and
14 involve broadcasting at most O(n) messages and hence take O(n) rounds using
Lemma 5.2.5. Since by Lemma 7.3.7 the set A constructed in Step 12 is good with
probability at least 1/8, Steps 12-15 are executed O(1) times in expectation. Step 17
takes O(|S| · h) rounds (by [10] and Lemma 7.5.2). Since the while loop runs for at
most $O\left(\frac{\log n}{\delta^3}\right)$ iterations (by Lemma 7.3.8), this establishes the lemma.
Lemma 7.3.10. The blocker set Q constructed by Algorithm 6 has size $O(\frac{n \log n}{h})$.
Proof. As shown in [58, 10] the size of the blocker set computed by an optimal greedy
algorithm is $\Theta(\frac{n \ln p}{h})$, where p is the number of paths that need to be covered. We
will now argue that the blocker set constructed by Algorithm 6 is at most a factor of
$\frac{1}{(1-3\delta-\epsilon)}$ larger than the greedy solution, thus showing that the constructed blocker
set Q has size at most $O(\frac{n \ln p}{h} \cdot \frac{1}{(1-3\delta-\epsilon)}) = \tilde{O}(\frac{n \log n}{h})$ since $p \le n^2$ and
$0 < \delta, \epsilon \le \frac{1}{12}$.
The blocker set Q constructed by Algorithm 6 has 2 types of nodes: (1) node
c added in Step 10, (2) set of nodes A added in Step 12. Since the while loop in
Steps 6-17 runs for at most $O\left(\frac{\log^3 n}{\delta^3 \cdot \epsilon^2}\right)$ iterations (by Lemma 7.3.8), there are
at most $O\left(\frac{\log^3 n}{\delta^3 \cdot \epsilon^2}\right)$ nodes of type 1. Since $\frac{\log^3 n}{\delta^3 \cdot \epsilon^2} = o(\frac{n}{h})$, we only need to bound
the number of nodes added in Steps 12-15.
Since A is a good set, by Lemma 7.3.7 the number of paths covered by A is
at least $|A| \cdot (1+\epsilon)^i \cdot (1 - 3\delta - \epsilon)$, where $(1+\epsilon)^i$ is the maximum possible score value
across all nodes in V (in the current iteration). Since the maximum possible score value
is $(1+\epsilon)^i$, any greedy solution must add at least $|A| \cdot (1 - 3\delta - \epsilon)$ nodes to the blocker
set to cover these paths. Hence the choice of A is at most a factor of $\frac{1}{(1-3\delta-\epsilon)}$ larger
than the greedy solution. This establishes the lemma.
Lemma 7.3.11. Algorithm 6 computes the blocker set Q in $\tilde{O}(|S| \cdot h/(\epsilon^2 \delta^3))$ rounds,
in expectation.
Proof. Step 1 runs in O(|S| · h) rounds [10]. The for loop in Steps 2-17 runs for
$\log_{1+\epsilon} n^2 = O\left(\frac{\log n}{\epsilon}\right)$ iterations. Each iteration takes $\tilde{O}\left(\frac{|S| \cdot h}{\epsilon \delta^3}\right)$ rounds in expectation:
Step 4 is readily seen to run in O(|S| · h) rounds (Lemma 7.5.2). The inner
for loop in Steps 5-17 runs for $\log_{1+\epsilon} h = O\left(\frac{\log n}{\epsilon}\right)$ iterations, with each iteration
taking $\tilde{O}\left(\frac{|S| \cdot h}{\delta^3}\right)$ rounds in expectation using Lemma 7.3.9.
7.3.2 Deterministic Blocker Set Algorithm
The only place where randomization is used in Algorithm 6 is in Steps 12-15, where
a good set A (see Definition 7.3.1) is chosen. Fortunately, the $X_v$’s are pairwise-independent
random variables, where $X_v = 1$ if v ∈ A and 0 otherwise. We use a
linear-sized sample space [11, 60, 20] for generating pairwise independent random
variables and then find a good sample point (i.e., a good set A) in this O(n)-sized
sample space in O(|S| · h + n) rounds.
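To illustrate why a sample space of linear size can suffice for pairwise independence, the sketch below builds the classic XOR construction: k seed bits yield $2^k - 1$ pairwise independent unbiased bits, so the sample space has only $2^k$ points. This is an illustration under assumptions, not the construction of [11, 60, 20]: the algorithm above needs variables that are 1 with probability $\delta/(1+\epsilon)^j$, which this unbiased version does not provide, and the function names are hypothetical.

```python
from itertools import combinations

def xor_sample_space(k):
    """Return the 2**k sample points; each is a tuple of 2**k - 1 bits.

    Coordinate s (for each non-zero mask s) is the parity of seed & s.
    """
    n_vars = 2 ** k - 1
    space = []
    for seed in range(2 ** k):
        point = tuple(bin(seed & s).count("1") % 2
                      for s in range(1, n_vars + 1))
        space.append(point)
    return space

def is_pairwise_independent(space):
    """Check exhaustively that every pair of coordinates is uniform on {0,1}^2."""
    n_vars = len(space[0])
    for a, b in combinations(range(n_vars), 2):
        counts = {}
        for point in space:
            key = (point[a], point[b])
            counts[key] = counts.get(key, 0) + 1
        # All four outcomes must occur, and equally often.
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True
```

For example, `xor_sample_space(3)` has 8 sample points over 7 variables, and `is_pairwise_independent` confirms that every pair of coordinates takes each of the four value combinations equally often.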
Note that a trivial solution is to run the inner loop (Steps 12-15) in the
randomized blocker set algorithm for each sample point in the O(n) sample space
until a good set is identified. However in the worst case, we may need to run this
loop O(n) times instead of just a constant number of times in expectation, and that
would worsen the round complexity by a factor of n.
Algorithm 7, our derandomized algorithm, works as follows. Recall that $P_i^v$
and $P_{ij}^v$ denote the set of paths in $P_i$ and $P_{ij}$, respectively, that have v as the leaf
node. We start by creating an incoming BFS tree rooted at l (Step 1, Alg. 7).
We assume that the X values are enumerated in order and every node knows this
enumeration. Let $X^{(\mu)}$ refer to the µ-th vector in this enumeration and let $\sigma^{(\mu)}_{P_i,v}$ and
$\sigma^{(\mu)}_{P_{ij},v}$ refer to the number of paths covered by $X^{(\mu)}$ in sets $P_i^v$ and $P_{ij}^v$ respectively.
Similarly let $\nu^{(\mu)}_{P_i}$ and $\nu^{(\mu)}_{P_{ij}}$ refer to the total number of paths covered by $X^{(\mu)}$ in
sets $P_i$ and $P_{ij}$ respectively. In Step 2 (Alg. 7), the leader l receives sums of the
$\nu_{P_i,u}$ and $\nu_{P_{ij},u}$ values for all sample points from the nodes u using the algorithm in
Section 7.3.3. The leader is then able to compute the number of paths covered in
both $P_i$ and $P_{ij}$ for each µ and picks one that satisfies the good set criterion
(Step 3, Alg. 7). It then broadcasts the corresponding X vector to every node in the
network (Step 4, Alg. 7). Algorithm 7 gives the pseudocode of this algorithm.
Lemma 7.3.12. The leader node l can identify a good sample point $X \in \{0, 1\}^{|V_i|}$,
and thus a good set A, in O(|S| · h + n) rounds.
Algorithm 7 Deterministic Algorithm for picking good set A
Input: h: number of hops; S: set of sources; C: h-hop CSSSP collection; $X^{(\mu)}$: µ-th
vector in sample space; $P_i^v$: set of paths in $P_i$ with v as the leaf node; $P_{ij}^v$: set of
paths in $P_{ij}$ with v as the leaf node
1: Compute BFS in-tree T rooted at leader l.
2: Compute the $\sigma^{(\mu)}_{P_i,u}$ and $\sigma^{(\mu)}_{P_{ij},u}$ terms locally at each node u ∈ V, for each sample point µ,
   and then using the pipelined algorithm in Section 7.3.3, send these values to the
   leader l.
3: Local Step at l: For each 1 ≤ µ ≤ n, compute $\nu^{(\mu)}_{P_i}$ and $\nu^{(\mu)}_{P_{ij}}$. Let µ′ be such
   that $X^{(\mu')}$ corresponds to a good set A (in case of ties, pick the highest one).
4: Node l broadcasts the $X^{(\mu')}$ values. (This corresponds to good set A)
Proof. Step 1 computes the incoming BFS tree rooted at leader node l in O(n)
rounds. Step 2 takes O(n) rounds by Lemmas 7.3.14 and 7.3.15. Step 3 is a local
step and involves no communication. Step 4 involves an all-to-all broadcast of at
most n messages and thus takes O(n) rounds using Lemma 5.2.5.
Let Algorithm 2′ be the blocker set algorithm obtained by replacing Steps 12-15
in Algorithm 6 with the deterministic algorithm for generating a good set A
(Algorithm 7). Lemma 7.3.12 together with Lemma 7.3.11 gives us the following
corollary.
In this Section we describe a simple pipelined algorithm to compute the $\nu_{P_i}$ and $\nu_{P_{ij}}$
terms at leader node l. Both algorithms are similar to an algorithm in [10] (for
computing ‘initial scores’). Recall that $\sigma^{(\mu)}_{P_i,v}$ refers to the number of paths in $P_i^v$
covered by the sample point $X^{(\mu)}$ and $\sigma^{(\mu)}_{P_{ij},v}$ refers to the total number of paths in
$P_{ij}^v$ covered by the sample point $X^{(\mu)}$. Let $\nu^{(\mu)}_{P_i,v}$ refer to the sum total of the $\sigma^{(\mu)}_{P_i,w}$
values of all descendant nodes w of v and similarly let $\nu^{(\mu)}_{P_{ij},v}$ refer to the sum total
of the $\sigma^{(\mu)}_{P_{ij},w}$ values of all descendant nodes w of v. Also recall from Section 7.3.2
that $\nu^{(\mu)}_{P_i}$ and $\nu^{(\mu)}_{P_{ij}}$ refer to the total number of paths covered by $X^{(\mu)}$ in sets $P_i$ and
$P_{ij}$ respectively. Table 7.3 presents the notation that we use in this Section.
Table 7.3: List of Notations Used in the Analysis of the Deterministic Algorithm
Lemma 7.3.14. Compute-$\nu_{P_i}$ (Algorithm 8) correctly computes the $\nu^{(\mu)}_{P_i}$ values at
leader node l for all µ in O(n) rounds.
Proof. In Step 1, every node v correctly initializes its contribution to the overall
$\nu_{P_i,v}$ term for each µ locally. Since the height of tree T is at most n − 1, it is readily
seen that a node v that is at depth h(v) in T will receive the $count^{(\mu)}_{P_i}$ values from
its children in round n − h(v) + µ − 2 (Steps 5-9) and thus will have the correct $\nu^{(\mu)}_{P_i}$
value to send in round n − h(v) + µ − 1 in Step 3. Since µ = O(n), Steps 3-9 run in
O(n) rounds. Step 10 is a local step and thus does not involve any communication.
This establishes the lemma.
Lemma 7.3.15. Compute-$\nu_{P_{ij}}$ (Algorithm 9) correctly computes the $\nu^{(\mu)}_{P_{ij}}$ values at
leader node l for all µ in O(n) rounds.
Proof. In Step 1, every node v correctly initializes its contribution to the overall
$\nu_{P_{ij}}$ term for each µ locally. Since the height of tree T is at most n − 1, it is readily
seen that a node v that is at depth h(v) in T will receive the $\nu^{(\mu)}_{P_{ij}}$ values from its
children in round n − h(v) + µ − 2 (Steps 5-9) and thus will have the correct $\nu^{(\mu)}_{P_{ij}}$
Algorithm 9 Compute-$\nu_{P_{ij}}$: Compute sum of $\nu_{P_{ij}}$ values at leader node l
Input: h: number of hops; S: set of sources; C: h-hop CSSSP collection; $X^{(\mu)}$: µ-th
vector in sample space; T: BFS in-tree rooted at leader l
1: Local Step at v ∈ V: Let P be the set of paths in $P_{ij}$ with v as the leaf node.
   For each 1 ≤ µ ≤ n, set $\nu^{(\mu)}_{P_{ij},v} = \sum_{p \in P} \bigvee_{z \in p} X^{(\mu)}_z$
2: In round r > 0:
3:   send: if r = n − h(v) + µ − 1 then send $\langle \nu^{(\mu)}_{P_{ij},v} \rangle$ to parent(v) in T
4:   receive [lines 5-9]:
5:   if r = n − h(v) + µ − 2 then
6:     let I be the set of incoming messages to v
7:     for each M ∈ I do
8:       let the sender be w and let M = $\langle \nu^{(\mu)}_{P_{ij},w} \rangle$ and
9:       if w is a child of v in T then $\nu^{(\mu)}_{P_{ij},v} \leftarrow \nu^{(\mu)}_{P_{ij},v} + \nu^{(\mu)}_{P_{ij},w}$
10: Local Step at leader l: Compute the total sum $\nu^{(\mu)}_{P_{ij}}$ for each sample point µ,
    by summing up the received $\nu^{(\mu)}_{P_{ij},w}$ values from all its children w.
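The pipelined schedule used by Algorithms 8 and 9 can be checked with a small sequential simulation. The sketch below is a hedged illustration, not the distributed implementation: the function name and the dictionary-based tree encoding are assumptions, and messages sent in round r are treated as delivered at the end of that round. Each non-leader node at depth h(v) sends its running sum for sample point µ in round n − h(v) + µ − 1, exactly one round after its children sent theirs.

```python
def pipelined_convergecast(parent, sigma):
    """Simulate the send schedule r = n - h(v) + mu - 1 on a rooted in-tree.

    parent: dict node -> parent (the leader maps to None).
    sigma: dict node -> list of local values, one per sample point (index mu-1).
    Returns (totals accumulated at the leader, last round with a message).
    """
    n = len(parent)
    num_points = len(next(iter(sigma.values())))
    leader = next(v for v, p in parent.items() if p is None)

    def depth(v):
        d = 0
        while parent[v] is not None:
            v, d = parent[v], d + 1
        return d

    h = {v: depth(v) for v in parent}
    nu = {v: list(sigma[v]) for v in parent}    # running subtree sums
    last_round = 0
    for r in range(1, n + num_points):
        sends = []
        for v in parent:
            mu = r - (n - h[v]) + 1             # solves r = n - h(v) + mu - 1
            if v != leader and 1 <= mu <= num_points:
                sends.append((parent[v], mu, nu[v][mu - 1]))
        for target, mu, value in sends:         # delivered at the end of round r
            nu[target][mu - 1] += value
            last_round = r
    return nu[leader], last_round
```

Every node sends at most one value per round, and in this simulation the last message is sent in round n + M − 2 for M sample points, which matches the O(n) bound when M = O(n).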
In Step 6 of Algorithm 5, the goal is to send the distance values δ(x, c) (which
are already computed at node x) from source node x to the corresponding blocker
node c. Since there are n sources and $|Q| = \tilde{O}(n^{2/3})$ blocker nodes, this step can
be implemented in $\tilde{O}(n^{5/3})$ rounds using all-to-all broadcast (Lemma 5.2.5). One
could conjecture that the techniques in [48, 64] could be used to send these $\tilde{O}(n^{5/3})$
messages from the source nodes to the blocker nodes by constructing trees rooted
at each c. However, it is not clear how these methods can distribute the $\tilde{O}(n^{5/3})$
different source-destination messages in $o(n^{5/3})$ rounds.
We now describe a method to implement this step more efficiently in $\tilde{O}(n^{4/3})$
rounds deterministically. A randomized $\tilde{O}(n^{4/3})$-round algorithm for this problem is
given in Huang et al. [50]. Our algorithm uses the concept of bottleneck nodes from
that result but is otherwise quite different.
Our algorithm is divided into two cases: (i) when $hops(x, c) > n^{2/3}$ and, (ii)
when $hops(x, c) \le n^{2/3}$ (hops(x, c) denotes the number of edges on the shortest path
from x to c).
Case (i) $hops(x, c) > n^{2/3}$: Algorithm 10 describes our algorithm for this case.
We first construct an $n^{2/3}$-hop in-CSSSP collection (i.e., CSSSP in-trees) using the
blocker set Q as the source set (Step 1, Alg. 10). In Step 2 (Alg. 10) we construct a
blocker set Q′ of size $\tilde{O}(n^{1/3})$ for this CSSSP collection using deterministic Algorithm
2′ in Sections 6.4.1 and 7.3.2. Then for each c′ ∈ Q′ we construct the incoming and
outgoing shortest path tree rooted at c′ (Step 3, Alg. 10). In Step 4 (Alg. 10), every
source x ∈ V broadcasts the distance value δ(x, c′) for each c′ ∈ Q′. The lemma
below shows that each c ∈ Q can determine the δ(x, c) values for all x for which
$hops(x, c) > n^{2/3}$, and the algorithm runs in $\tilde{O}(n^{4/3})$ rounds.
1: Compute $n^{2/3}$-hop in-CSSSP for source set Q using the algorithm in [8].
2: Compute a blocker set Q′ of size $\tilde{O}(n/n^{2/3}) = \tilde{O}(n^{1/3})$ for the $n^{2/3}$-hop CSSSP
   computed in Step 1 using the blocker set algorithm described in Section 6.4.1.
3: For each c′ ∈ Q′ in sequence: Compute in-SSSP and out-SSSP rooted at c′
   using Bellman-Ford algorithm.
4: For each x ∈ V in sequence: Broadcast ID(x) and the shortest path distance
   values δ(x, c′) for each c′ ∈ Q′.
5: Local Step at node c ∈ Q: For each x ∈ V compute the shortest path distance
   value δ(x, c) using the δ(x, c′) distance values received in Step 4 and the δ(c′, c)
   distance values computed in Step 3.
Lemma 7.4.1. Let V′ be the set of nodes x such that there is a shortest path from x
to a blocker node c ∈ Q with hop-length greater than $n^{2/3}$. Using Algorithm 10 each
blocker node c can correctly compute δ(x, c) for all such x ∈ V′ in $\tilde{O}(n^{4/3})$ rounds.
Proof. Since $hops(x, c) > n^{2/3}$, there exists a blocker node c′ ∈ Q′ (constructed
in Step 2) such that the shortest path from x to c passes through c′. Thus c can
compute the distance value δ(x, c) by adding the δ(x, c′) (received in Step 4) and δ(c′, c)
(computed in Step 3) values in Step 5.
Step 1 takes $O(n^{2/3} \cdot |Q|) = \tilde{O}(n^{4/3})$ rounds using Bellman-Ford algorithm.
Step 2 requires $\tilde{O}(n^{2/3} \cdot n^{2/3}) = \tilde{O}(n^{4/3})$ rounds by Corollary 7.3.13. Since $|Q'| =
\tilde{O}(n/n^{2/3}) = \tilde{O}(n^{1/3})$, Step 3 takes $\tilde{O}(n \cdot n^{1/3}) = \tilde{O}(n^{4/3})$ rounds using Bellman-Ford
algorithm and so does Step 4 using Lemma 5.2.5. Step 5 is a local step and has no
communication.
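The local computation in Step 5 of Algorithm 10 can be sketched as below. This is a minimal sketch under assumptions: the function and parameter names are hypothetical, and taking the minimum over all c′ ∈ Q′ is one natural way to combine the received values. Every sum δ(x, c′) + δ(c′, c) is the length of some walk from x to c, so the minimum never drops below δ(x, c), and it equals δ(x, c) whenever some c′ lies on a shortest path from x to c.

```python
import math

def combine_through_blockers(dist_to_blocker, dist_from_blocker):
    """Local Step 5 at a blocker node c.

    dist_to_blocker[x][c2]: delta(x, c2), broadcast by source x in Step 4.
    dist_from_blocker[c2]:  delta(c2, c), known at c from Step 3.
    Returns, for every source x, the best distance via any c2 in Q'.
    """
    result = {}
    for x, row in dist_to_blocker.items():
        best = math.inf
        for c2, d_x_c2 in row.items():
            # Missing entries are treated as unreachable.
            best = min(best, d_x_c2 + dist_from_blocker.get(c2, math.inf))
        result[x] = best
    return result
```

The returned value is only meaningful as δ(x, c) for the sources covered by Lemma 7.4.1, that is, those whose shortest path to c has more than $n^{2/3}$ hops.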
Case (ii) $hops(x, c) \le n^{2/3}$: This case deals with sending the distance values from
source nodes x to the blocker nodes c when the shortest path between x and c has
hop-length at most $n^{2/3}$. Recall that using an all-to-all broadcast or the techniques
in [48, 64] for sending these $\tilde{O}(n^{5/3})$ messages appears to require at least $\tilde{O}(n^{5/3})$
rounds.
Let $C^Q$ be the $n^{2/3}$-hop in-CSSSP collection for source set Q. A set B ⊂ V is
a set of bottleneck nodes if removing the nodes in B, along with their descendants in
the trees in the collection $C^Q$, reduces the congestion to at most $\tilde{O}(n^{4/3})$, i.e. every
node would need to send at most $\tilde{O}(n^{4/3})$ messages if all nodes x transmitted their
δ(x, c) values along the pruned CSSSP trees in the collection $C^Q$. This notion is
defined in Huang et al. [50], where they present a randomized algorithm using the
randomized scheduling algorithm in Ghaffari [39] to identify such a set of bottleneck
nodes. Here we deterministically identify a set of bottleneck nodes B where $|B| =
\tilde{O}(n^{1/3})$ (Step 1, Alg. 11) using a pipelined strategy (Alg. 17 in Appendix 7.5.3.1).
Clearly, after we remove these bottleneck nodes, any remaining node needs to send
at most $\tilde{O}(n^{4/3})$ messages.
After we identify the set of bottleneck nodes B we run the Bellman-Ford algorithm
[15] for each b ∈ B to compute both the incoming and outgoing shortest path
tree rooted at b (Step 2, Alg. 11). We then broadcast the δ(x, b) distance values
from every source x ∈ V to the corresponding b ∈ B (Step 3, Alg. 11). Thus if x lies
in the subtree rooted at b for a blocker node c, then c can compute the δ(x, c) value by
adding the δ(x, b) and δ(b, c) distance values (Step 4, Alg. 11).
It remains to send the distance value δ(x, c) to blocker node c if x is not part
of a subtree of any bottleneck node b in c’s shortest path tree. Since the maximum
congestion at any node is at most $\tilde{O}(n^{4/3})$ after removing bottleneck nodes in B, we
are able to perform this computation deterministically. In Steps 8-9 (Alg. 11), we use
a simple round-robin strategy to propagate these distance values from each source x
to all blocker nodes c in the network. We show in Section 7.4.1, using the notion of
frames, that this simple strategy achieves the desired $\tilde{O}(n^{4/3})$-round bound.
Algorithm 11 Compute δ(x, c) at c: when $hops(x, c) \le n^{2/3}$
Input: Q: blocker set; $|Q| \le n^{2/3} \log n$; $C^Q$: $n^{2/3}$-hop in-CSSSP collection for set Q
Lemma 7.4.2. If the shortest path from x ∈ V to a blocker node c ∈ Q has hop-length
at most $n^{2/3}$ and there exists a bottleneck node b ∈ B on this path, then after
executing Steps 1-4 of Algorithm 11 blocker node c knows the distance value δ(x, c)
for all such x ∈ V.
Proof. This is immediate from Step 4 (Alg. 11), where c computes δ(x, c) by
adding the distance value δ(x, b) (received in Step 3, Alg. 11) and the δ(b, c) value
(computed at c in Step 2, Alg. 11).
Lemma 7.4.3. If a source node x lies in a blocker node c’s tree in the CSSSP
collection $C^Q$ after the execution of Step 5 of Algorithm 11, then c would have received
the δ(x, c) value by $(n^{4/3} \log n + n^{4/3}) \cdot ((1/3) \cdot \log n/\log\log n - 1)$ rounds of Step 9 of
Algorithm 11.
Lemma 7.4.3 is established below in Section 7.4.1. Lemmas 7.4.2 and 7.4.3
establish the following lemma.
Lemma 7.4.4. If the shortest path from x ∈ V to a blocker node c ∈ Q has hop-length
at most $n^{2/3}$, then after running Algorithm 11 blocker node c knows the distance
value δ(x, c) for all such x ∈ V.
Proof. Step 1 takes $\tilde{O}(n^{4/3})$ rounds by Lemma 7.5.10. Since $|B| = \tilde{O}(n^{1/3})$, Step 2
takes $\tilde{O}(n \cdot n^{1/3}) = \tilde{O}(n^{4/3})$ rounds using Bellman-Ford algorithm and so does Step 3
using Lemma 5.2.5. Step 4 is a local step and involves no communication. Step 5
takes $\tilde{O}(n^{2/3} \cdot |Q|) = \tilde{O}(n^{4/3})$ rounds using Lemma 7.5.6. Step 9 runs for $\tilde{O}(n^{4/3})$
rounds, thus establishing the lemma.
In this section we will establish that the simple round-robin approach used in Steps 8-9
of Algorithm 11 is sufficient to propagate distance values δ(x, c) from source nodes
x ∈ V to blocker nodes c ∈ Q in $\tilde{O}(n^{4/3})$ rounds, when the congestion at any node
is at most $\tilde{O}(n^{4/3})$. While this looks plausible, the issue to resolve is whether a
node could be left idling when there are more messages it needs to pass on from its
descendants to its parents in some of the trees. This could happen because each
node forwards at most one message per round and these descendants might have
forwarded messages for other blocker nodes. The round-robin scheme appears to
only guarantee that a message for a chosen blocker node will be sent from a node to
its parent at least once every |Q| rounds.
We now present and analyze a more structured version of Steps 9-10 to establish
the bound. In this algorithm (Algorithm 12) we divide Step 9 (Alg. 11) into
$(1/3) \cdot (\log n/\log\log n) - 1$ different stages, with each stage running for at most
$n^{4/3} \log^{3/2} n + n^{4/3}$ rounds (we assume $|Q| \le n^{2/3} \log n$). Our key observation (in Lemma 7.4.8) is
that at the start of Stage i, every node v only needs to send the distance values for
at most $n^{2/3}/\log^{i-1/2} n$ different blocker nodes (note that i is not a constant), thus
more messages can be sent by v to each blocker node in later stages.
Let $Q_{v,i}$ be the set of blocker nodes for which node v has messages to send at the
start of stage i. We introduce the notion of a frame, where each frame has a single
round available for each blocker node in $Q_{v,i}$. Stage i is divided into $n^{2/3} \log^{i+1} n + n^{2/3}$
frames (we will show that each frame consists of $\lceil n^{2/3}/\log^{i-1/2} n \rceil$ rounds). In
each frame, node v sends out an unsent message for each $c \in Q_{v,i}$ to its parent in c’s
tree (Step 4, Alg. 12).
1: for i ≥ 0 do
2:   Let $Q_{v,i}$ be the set of nodes in Q for which v contains at least one unsent
     message during Stage i.
3:   for frame j = 1 to $\lceil n^{2/3} \log^{i+1} n + n^{2/3} \rceil$ do
4:     For each $c \in Q_{v,i}$ in sequence: v forwards an unsent message for c to
       its parent in c’s tree.
Lemma 7.4.6. For all blocker nodes $c \in Q_{v,i}$, node v would have sent α messages to
its parent in c’s tree by $\alpha + n^{2/3} - h_c(v)$ frames of Stage i, where $h_c(v) = hops(v, c)$,
provided at least α messages are routed through v in Step 4 of Algorithm 12.
Proof. Fix a blocker node c. Let i′ be the smallest i for which the above statement
does not hold and let v be a node with maximum $h_c(v)$ value for which this statement
is violated in Stage i′. Node v is not a leaf node since α is 0 or 1 for a leaf and a
leaf would have sent its distance value to its parent in the first frame of Stage 0.
So v must be an internal node. Since the statement does not hold for v for
the first time for α, it implies that v has already sent α − 1 messages (including
its own distance value δ(v, c)) by $(\alpha - 1) + n^{2/3} - h_c(v)$ frames and now does not
have any message to send to its parent in c’s tree in the next frame. However since
the statement holds for all of v’s children, v should have received at least α − 1
messages from its children by the $((\alpha - 1) + n^{2/3} - (h_c(v) + 1))$-th frame, resulting in a
contradiction.
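The pipelining bound of Lemma 7.4.6 can be checked empirically on small trees. The sketch below is an illustration under simplifying assumptions, not the full algorithm: it models a single blocker node c (so a frame is one forwarding opportunity per node), ignores stages, and treats a message sent in frame f as usable by the parent from frame f + 1. The function name and the `depth_bound` parameter, which plays the role of $n^{2/3}$, are hypothetical.

```python
def simulate_frames(parent, depth_bound):
    """Forward one message per frame toward the root and check Lemma 7.4.6.

    parent: dict node -> parent in c's tree (the root c maps to None).
    Returns True if every node v has sent min(alpha, total) messages by
    frame alpha + depth_bound - h_c(v), for every alpha >= 1.
    """
    def hops(v):
        d = 0
        while parent[v] is not None:
            v, d = parent[v], d + 1
        return d

    h = {v: hops(v) for v in parent}
    assert max(h.values()) <= depth_bound
    # Every non-root node starts with its own distance value to send.
    pending = {v: (0 if parent[v] is None else 1) for v in parent}
    sent = {v: 0 for v in parent}
    total = dict(pending)                      # messages routed through v
    for v in sorted(parent, key=lambda u: -h[u]):   # deepest nodes first
        if parent[v] is not None and parent[parent[v]] is not None:
            total[parent[v]] += total[v]

    for frame in range(1, len(parent) + depth_bound + 1):
        senders = [v for v in parent
                   if parent[v] is not None and pending[v] > 0]
        for v in senders:                      # each sender forwards one message
            pending[v] -= 1
            sent[v] += 1
        for v in senders:                      # deliveries are usable next frame
            if parent[parent[v]] is not None:
                pending[parent[v]] += 1
        for v in parent:
            if parent[v] is None:
                continue
            alpha = frame - (depth_bound - h[v])
            if alpha >= 1 and sent[v] < min(alpha, total[v]):
                return False
    return True
```

A node never idles while it still has messages to forward in this model, which is the property the contradiction argument in the proof relies on.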
Corollary 7.4.7. After the completion of Stage i, every node v would have sent all
or at least $n^{2/3} \log^{i+1} n$ different distance values for all blocker nodes $c \in Q_{v,i}$.
Lemma 7.4.8. The set $Q_{v,i}$ has size at most $\lceil n^{2/3}/\log^{i-1/2} n \rceil$.
Proof. By Corollary 7.4.7 after the completion of Stage i − 1, every node v would
have sent all or at least $n^{2/3} \log^{i} n$ different distance values for all blocker nodes in
$Q_{v,i-1}$. Thus the set $Q_{v,i}$ will consist of only those nodes from Q for which v needs
to send at least $n^{2/3} \log^{i} n$ different distance values. Since congestion at any node
v is at most $n\sqrt{|Q|} = n^{4/3} \log^{1/2} n$ (using Lemma 7.5.8), the size of $Q_{v,i}$ is at most
$n^{4/3} \log^{1/2} n/(n^{2/3} \log^{i} n) = n^{2/3}/\log^{i-1/2} n$. This establishes the lemma.
Proof of Lemma 7.4.3. Since $|Q_{v,i}| \le n^{2/3}/\log^{i-1/2} n$ (by Lemma 7.4.8), Stage i runs
for $n^{2/3}/\log^{i-1/2} n \cdot (n^{2/3} \log^{i+1} n + n^{2/3}) \le n^{4/3} \log^{3/2} n + n^{4/3}$ rounds. Lemma 7.4.3
is immediately established from Corollary 7.4.7 and the fact that there are
$(1/3) \cdot \log n/\log\log n - 1$ stages.
7.5 Helper Algorithms
Here we describe our algorithm for computing Steps 3 and 4 of Algorithm 6, which
compute the set $V_i$ and identify which paths belong to $P_i$ respectively. Since every
node with score value greater than or equal to $(1+\epsilon)^{i-1}$ belongs to $V_i$, computing
$V_i$ is straightforward. And to determine if a path p belongs to $P_i$, we only need to check
if one of the nodes in p is in $V_i$.
Our algorithm for computing $V_i$ works as follows: Every node v checks if its
score value is greater than or equal to $(1+\epsilon)^{i-1}$ and if so, it broadcasts its ID to every
other node. The set $V_i$ is then constructed by including the IDs of all such nodes.
Since there are at most n messages involved in the broadcast step, this algorithm
takes O(n) rounds. This leads to the following lemma.
Lemma 7.5.1. Given the score(v) values for every v ∈ V , the set Vi can be con-
structed in O(n) rounds.
Proof. Fix a path p from source x to leaf node v. After h rounds, v will know if any
node in p belongs to $V_i$ (using the flag value it received in Steps 5-8).
Algorithm 13 Compute-$P_i$: Algorithm for computing paths in $P_i$ for source x at
node v
Input: $V_i$; h: number of hops; $T_x$: tree for source x
1: (Round 0): if $v \in V_i$ then set flag ← true else flag ← false
2: Round h ≥ r > 0:
3:   Send: if $r = h_x(v) + 1$ then send ⟨flag⟩ to all children
4:   receive [lines 5-8]:
5:   if $r = h_x(v)$ then
6:     let M be the incoming message to v
7:     let the sender be w and let M = ⟨$flag_w$⟩ and
8:     if w is a parent of v in $T_x$ then flag ← flag ∨ $flag_w$
9: Local Step at v: if v is a leaf node and flag = true then the path from x to
   v is in $P_i$.
The algorithm takes h rounds per source x and thus Pi can be computed in
O(|S| · h) rounds in total (since we need to run the algorithm for every source x).
Here we describe our algorithm for computing Step 7(a) of Algorithm 6, which
identifies the paths in $P_i$ that also belong to $P_{ij}$. Since every path in $P_{ij}$ has at least
$(1+\epsilon)^{j-1}$ nodes from $V_i$, for each path p we need to determine the number of nodes
in p that belong to $V_i$. We do this by counting the number of nodes that are in $V_i$,
starting from the root and moving to the leaf node.
Our algorithm works as follows: Fix a source node x ∈ V. In Round 0, x
initializes its β value to 1 if it belongs to $V_i$, and otherwise sets it to 0 (Step 1). It then sends
this β value to its children in the next round (Step 3). In round r ≥ 1, a node v that is
r hops away from x receives the β value from its parent (Steps 5-8), updates
its β value in Step 8 (incrementing it by 1 if $v \in V_i$), and sends it to its children in x’s
tree in round r + 1 (Step 3).
Lemma 7.5.3. Using Compute-Pij (Algorithm 14), Pij can be computed in O(h)
rounds per source node.
Algorithm 14 Compute-$P_{ij}$: Algorithm for computing paths in $P_{ij}$ for source x
at node v
Input: $V_i$; h: number of hops; $T_x$: tree for source x
1: (Round 0): if $v \in V_i$ set β ← 1 else β ← 0
2: Round h ≥ r > 0:
3:   Send: if $r = h_x(v) + 1$ then send ⟨β⟩ to all children
4:   receive [lines 5-8]:
5:   if $r = h_x(v)$ then
6:     let M be the incoming message to v
7:     let the sender be w and let M = ⟨$\beta_w$⟩ and
8:     if w is a parent of v in $T_x$ then β ← β + $\beta_w$
9: Local Step at v: if v is a leaf node and $\beta \ge (1+\epsilon)^{j-1}$ then the path from x
   to v is in $P_{ij}$.
Proof. Fix a path p from source x to leaf node v. After h rounds, v will know the
number of nodes that belong to Vi (using the β values it received in Steps 5-8).
The algorithm takes h rounds per source x and thus Pij can be computed in
O(|S| · h) rounds in total (since we need to run the algorithm for every source x).
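The top-down passes of Algorithms 13 and 14 can be combined in a single sequential sketch for one source tree. This is a hedged illustration, not the round-based distributed version: the function name and the child-list tree encoding are assumptions. Each node's β counts the nodes of $V_i$ on the path from the source down to that node, so a leaf with β ≥ 1 identifies a path in $P_i$ and a leaf with $\beta \ge (1+\epsilon)^{j-1}$ identifies a path in $P_{ij}$.

```python
def classify_paths(children, root, v_i, eps, j):
    """Top-down pass over the tree T_x for source `root`.

    children: dict node -> list of children in T_x.
    v_i: set of nodes in V_i.
    Returns (leaves whose root-to-leaf path is in P_i,
             leaves whose root-to-leaf path is in P_ij).
    """
    threshold = (1 + eps) ** (j - 1)            # at least 1 whenever j >= 1
    in_p_i, in_p_ij = set(), set()
    # Each stack entry is (node, beta), where beta already counts the node.
    stack = [(root, 1 if root in v_i else 0)]
    while stack:
        v, beta = stack.pop()
        kids = children.get(v, [])
        if not kids:                            # leaf: local decision (Step 9)
            if beta >= 1:
                in_p_i.add(v)
            if beta >= 1 and beta >= threshold:
                in_p_ij.add(v)
        for w in kids:
            stack.append((w, beta + (1 if w in v_i else 0)))
    return in_p_i, in_p_ij
```

Running this once per source mirrors the O(|S| · h) total cost stated above, since each distributed pass takes h rounds per source.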
Algorithm 15 describes our algorithm for computing Step 7(b) of Algorithm 6, which
computes the value of $|P_{ij}|$. Let $P_{ij}^v$ represent the set of paths p in $P_{ij}$ with v as
the leaf node. Every node v knows the set $P_{ij}^v$ after running the algorithm described
in the previous section. Our algorithm works as follows: Every node v first computes
$|P_{ij}^v|$ (Step 1) and then broadcasts this value in Step 2. Every node v then computes
$|P_{ij}|$ by summing up the values received in Step 2 (Step 3).
Algorithm 15 Compute-$|P_{ij}|$
Input: $P_{ij}^v$: paths in $P_{ij}$ with v as the leaf node
1: Local Step at v ∈ V: set $\alpha_{P_{ij}^v} \leftarrow |P_{ij}^v|$
2: For each v ∈ V: Broadcast ID(v) and the value $\alpha_{P_{ij}^v}$.
3: Local Step at v ∈ V: $|P_{ij}| \leftarrow \sum_{v' \in V} \alpha_{P_{ij}^{v'}}$
Lemma 7.5.4. Compute-|Pij | (Algorithm 15) computes |Pij | in O(n) rounds.
Proof. Steps 1 and 3 are local steps and involve no communication. Step 2 involves
a broadcast of n messages and takes O(n) rounds using Lemma 5.2.5.
Proof. Since the height of $T_x$ is at most $h$, any node $v \in V$ which lies in the subtree
rooted at a node $z \in Z$ will receive the message from $z$ within $h$ rounds. This
establishes the lemma.
Lemma 7.5.7. The $h$-hop shortest path extensions can be computed in $O(nh)$ rounds
for every source $x \in V$ using the Bellman-Ford algorithm.
Here we describe our deterministic algorithm for computing Step 1 of Algorithm 11,
which identifies a set $B$ of bottleneck nodes such that removing this set of nodes
reduces the congestion in the network from $O(n \cdot |Q|)$ to $O(n \cdot \sqrt{|Q|})$. When
randomization is allowed, there is an $O(n \cdot \sqrt{|Q|})$ randomized algorithm of Huang et
al. [50] that computes this set w.h.p. in $n$. Our deterministic algorithm is, however,
very different from the randomized algorithm given in [50], and it uses ideas from our
blocker set algorithm in [10].
We now give an overview of the randomized algorithm of [50] that computes
this set of bottleneck nodes. For a source $x$ and its incoming shortest path tree $T_x$,
every node in $T_x$ calculates the number of outgoing messages for source $x$. Each
node does this by waiting for messages from all of its children and then sending a
message to its parent in $T_x$. This takes $O(n)$ rounds and can be run across multiple
nodes in $Q$ as the congestion is at most $O(|Q|)$. Thus, using the randomized algorithm
of Ghaffari [39], this algorithm can be run across all nodes in $Q$ concurrently in
$\tilde{O}(n + |Q|) = \tilde{O}(n)$ rounds. After computing these values, a node $b$ with maximum
count is added to the set $B$ and is then removed from the network. The algorithm
repeats this $O(\sqrt{|Q|})$ times, thus eliminating all nodes that needed to send at
least $n\sqrt{|Q|}$ messages (since the removal of every such node eliminates $O(n\sqrt{|Q|})$
nodes across all trees and there are at most $n \cdot |Q|$ nodes).
Our deterministic algorithm for computing bottleneck nodes (Algorithm 17)
works as follows: In Step 1, the algorithm computes the $count_{v,c}$ values (the number
of messages $v$ needs to send to its parent in $c$'s tree) using Algorithm 18 described in
Section 7.5.3.2. Every node $v$ calculates the total number of messages it needs to
send by summing up the values computed in Step 1 (Step 2) and then broadcasts this
value in Step 4. The node with the maximum value is added to the bottleneck node set
$B$ (Step 5), and the values of its ancestors and descendants are updated using the
algorithms in [8]. In Lemma 7.5.10 we establish that the whole algorithm runs in
$O(n\sqrt{|Q|} + h \cdot |Q|)$ rounds deterministically.
Proof. This is immediate since the while loop in Steps 3-6 terminates only when
there is no node $v$ with $total\_count_v > n\sqrt{|Q|}$.
Algorithm 17 Compute-Bottleneck: Compute Bottleneck Nodes Set $B$
Input: $Q$: blocker set; $C^Q$: CSSSP collection for blocker set $Q$
Output: $B$: set of bottleneck nodes
1: For each $c \in Q$ in sequence: Compute $count_{v,c}$ values at every node $v \in V$
   using Algorithm 18 (Section 7.5.3.2).
2: Local Step at $v \in V$: Compute $total\_count_v \leftarrow \sum_{c \in Q} count_{v,c}$
3: while there is a node $v$ with $total\_count_v > n\sqrt{|Q|}$ do
4:   For each $v \in V$: Broadcast $ID(v)$ and the $total\_count_v$ value.
5:   Add node $b$ to $B$ such that $b$ has maximum $total\_count_v$ value (break ties
     using IDs).
6:   Update $total\_count_v$ values for the descendants and ancestors of $b$ across all
     trees in the collection $C^Q$ using Algorithm 6 in Chapter 5 and Algorithm 4 in
     Chapter 6.
Proof. Step 1 takes $O(h \cdot |Q|)$ rounds using Lemma 7.5.11. Step 2 is a local
computation step and involves no communication. Step 4 involves a broadcast of at most
$n$ messages and hence takes $O(n)$ rounds using Lemma 5.2.5. Step 5 again does not
involve any communication. Both Algorithm 6 and Algorithm 4 take $O(n)$ rounds
(Lemmas 5.3.6 and 6.4.6). Since $B$ has size at most $\sqrt{|Q|}$ (by Lemma 7.5.9), the
while loop runs for at most $\sqrt{|Q|}$ iterations, each taking $O(n)$ rounds, thus
establishing the lemma.
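As a rough illustration of the greedy loop in Algorithm 17, the following Python sketch runs it sequentially on one machine. It is not the distributed procedure: it recomputes all counts from scratch after each removal instead of using the update algorithms referenced in Step 6, it assumes each tree is given as a hypothetical child-to-parent map, it takes $count_{v,c}$ to be the size of $v$'s subtree in $c$'s tree, and it breaks ties toward the smaller ID (the algorithm only says that ties are broken using IDs). All function names are illustrative.

```python
import math


def subtree_counts(parent):
    """count[v] = number of nodes in the subtree rooted at v (v included)."""
    depth = {}
    for v in parent:
        path, u = [], v
        while u is not None and u not in depth:
            path.append(u)
            u = parent[u]
        d = -1 if u is None else depth[u]
        for w in reversed(path):  # assign depths from the top of the path down
            d += 1
            depth[w] = d
    count = {v: 1 for v in parent}
    for v in sorted(parent, key=depth.get, reverse=True):  # deepest first
        if parent[v] is not None:
            count[parent[v]] += count[v]
    return count


def remove_subtree(parent, b):
    """Drop b and all of its descendants from one tree."""
    def below_b(v):
        while v is not None:
            if v == b:
                return True
            v = parent[v]
        return False
    return {v: p for v, p in parent.items() if not below_b(v)}


def compute_bottleneck(trees, n):
    """trees: dict mapping each source c in Q to its child-to-parent map."""
    threshold = n * math.sqrt(len(trees))
    trees = dict(trees)
    B = []
    while True:
        total = {}
        for parent in trees.values():  # Steps 1-2, recomputed each iteration
            for v, cnt in subtree_counts(parent).items():
                total[v] = total.get(v, 0) + cnt
        if not total:
            break
        b = min(total, key=lambda v: (-total[v], v))  # max count, smaller ID
        if total[b] <= threshold:  # loop condition of Step 3 fails
            break
        B.append(b)  # Step 5
        trees = {c: remove_subtree(p, b) for c, p in trees.items()}  # Step 6
    return B
```

For instance, with four sources 0..3 on five nodes, where node 4 is the only child of the root in every tree and the remaining nodes hang below node 4, node 4 has a total count of 16, which exceeds the threshold of 10, so it is the single bottleneck node selected.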
7.5.3.2 Computing $count_{v,c}$ Values
Here we describe our algorithm for computing Step 1 of Algorithm 17, which computes
the $count_{v,c}$ values in a given $h$-hop CSSSP collection $C$ for source set $S$. Our
algorithm (Algorithm 18) is quite simple and works as follows: Fix a source $c \in S$
and let $T_c$ be the tree corresponding to source $c$ in $C$. The goal is to compute the
number of messages each node $v \in T_c$ needs to send to its parent. In Step 1 every
node $v \in T_c$ initializes its $count_{v,c}$ value to 1. Every node $v$ that is $h_c(v)$ hops away
from $c$ receives the count values from all its children by round $h - h_c(v)$ (Steps 5-9),
updates its own value (Step 9), and then sends it to its parent in round
$h - h_c(v) + 1$ (Step 3).
Proof. Every leaf node $v$ initializes its $count_{v,c}$ value to 1 in Step 1. Every
internal node $v$ correctly computes its $count_{v,c}$ value after receiving the count
values from all its children by round $h - h_c(v)$ (Steps 5-9) and then sends the correct
$count_{v,c}$ value to its parent in round $h - h_c(v) + 1$ in Step 3.
Since $h_c(v) \ge 0$, this algorithm requires at most $h + 1$ rounds.
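The round schedule described above can be checked with a short sequential simulation in Python. This is a sketch under stated assumptions rather than the distributed algorithm itself: the tree is given as a hypothetical child-to-parent map, the hop distances $h_c(v)$ are supplied in a dictionary `depth`, and a message sent in round $r$ is taken to be delivered in the same round.

```python
def compute_counts(parent, depth, h):
    """Simulate the bottom-up count computation in one tree T_c.

    parent: dict mapping each node to its parent (None for the source c).
    depth:  dict mapping each node v to h_c(v), its hop distance from c.
    h:      hop bound, assumed to be at least the height of the tree.
    Returns (count, send_round), where send_round[v] is the round in which
    v sends its final count to its parent.
    """
    count = {v: 1 for v in parent}  # Step 1: every node starts at 1
    send_round = {}
    for r in range(1, h + 2):  # rounds 1 .. h + 1
        for v in parent:
            # A node at depth h_c(v) sends in round h - h_c(v) + 1; by then it
            # has heard from all children, which sent one round earlier.
            if depth[v] == h - r + 1 and parent[v] is not None:
                count[parent[v]] += count[v]
                send_round[v] = r
    return count, send_round
```

On the tree x -> {a, b}, a -> {c, d} with `h = 2`, the leaves c and d send in round 1, a and b send in round 2, and the final counts are 5 at x, 3 at a and 1 at each of b, c and d.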
7.6 Conclusion
Bibliography
[3] A. Abboud, V. Vassilevska Williams, and H. Yu. Matching triangles and basing
hardness on an extremely popular conjecture. In Proc. STOC, pages 41–50.
ACM, 2015.
[4] A. Abboud and V. V. Williams. Popular conjectures imply strong lower bounds
for dynamic problems. In Proc. FOCS, pages 434–443. IEEE, 2014.
[6] U. Agarwal and V. Ramachandran. Finding k simple shortest paths and cycles.
In Proc. ISAAC, pages 8:1–8:12, 2016.
[8] U. Agarwal and V. Ramachandran. Distributed weighted all pairs shortest paths
through pipelining. In Proc. IPDPS. IEEE, 2019.
[9] U. Agarwal and V. Ramachandran. Faster deterministic all pairs shortest paths
in CONGEST model. Manuscript, 2019.
[11] N. Alon, L. Babai, and A. Itai. A fast and simple randomized parallel algorithm
for the maximal independent set problem. Journal of algorithms, 7(4):567–583,
1986.
[12] N. Alon, Z. Galil, O. Margalit, and M. Naor. Witnesses for boolean matrix
multiplication and for shortest paths. In Proc. FOCS, pages 417–426. IEEE,
1992.
[14] A. Backurs and P. Indyk. Edit distance cannot be computed in strongly sub-
quadratic time (unless SETH is false). In Proc. STOC, pages 51–58. ACM,
2015.
[16] B. Berger, J. Rompel, and P. W. Shor. Efficient NC algorithms for set cover
with applications to learning and geometry. J. Comp. Sys. Sci., 49(3):454–477,
1994.
[17] A. Bernstein and D. Karger. A nearly optimal oracle for avoiding failed vertices
and edges. In Proc. STOC, pages 101–110, 2009.
[18] A. Bernstein and D. Nanongkai. Distributed exact weighted all-pairs shortest
paths in near-linear time. In Proc. STOC. ACM, 2019.
[21] U. Brandes. A faster algorithm for betweenness centrality. Jour. Math. Sociol.,
25(2):163–177, 2001.
[25] C. Demetrescu and G. F. Italiano. A new approach to dynamic all pairs shortest
paths. J. ACM, 51:968–992, 2004.
[28] M. Elkin. Distributed exact shortest paths in sublinear time. In Proc. STOC,
pages 757–770. ACM, 2017.
[29] M. Elkin and O. Neiman. Hopsets with constant hopbound, and applications
to approximate shortest paths. In Proc. FOCS, pages 128–137. IEEE, 2016.
[30] D. Eppstein. Finding the k shortest paths. SIAM Jour. Comput., 28:652–673,
1998.
[32] L. R. Ford Jr. Network flow theory. Technical report, RAND CORP SANTA
MONICA CA, 1956.
[34] H. N. Gabow. Scaling algorithms for network problems. J. Comp. Sys. Sci.,
31(2):148–168, 1985.
[40] M. Ghaffari and J. Li. Improved distributed algorithms for exact shortest paths.
In Proc. STOC, pages 431–444. ACM, 2018.
[41] Z. Gotthilf and M. Lewenstein. Improved algorithms for the k simple shortest
paths and the replacement paths problems. Inf. Proc. Lett., 109(7):352–355,
2009.
[42] T. Hagerup. Improved shortest paths on the word RAM. In Proc. ICALP, pages
61–72. Springer, 2000.
[46] J. Hershberger, S. Suri, and A. Bhosle. On the difficulty of some shortest path
problems. ACM Trans. Alg. (TALG), 3(1):5, 2007.
[49] S. Holzer and R. Wattenhofer. Optimal distributed all pairs shortest paths and
applications. In PODC ’12, pages 355–364, 2012.
[50] C.-C. Huang, D. Nanongkai, and T. Saranurak. Distributed exact weighted
all-pairs shortest paths in $\tilde{O}(n^{5/4})$ rounds. In Proc. FOCS, pages 168–179. IEEE,
2017.
[52] A. Itai and M. Rodeh. Finding a minimum circuit in a graph. SIAM Jour.
Comput., 7(4):413–423, 1978.
[53] D. B. Johnson. Finding all the elementary circuits of a directed graph. SIAM
Jour. Comput., 4(1):77–84, 1975.
[54] D. B. Johnson. Efficient algorithms for shortest paths in sparse networks. JACM,
24(1):1–13, 1977.
[55] D. R. Karger, D. Koller, and S. J. Phillips. Finding the hidden path: Time
bounds for all-pairs shortest paths. SIAM J. Comput., 22(6):1199–1217, 1993.
[57] N. Katoh, T. Ibaraki, and H. Mine. An efficient algorithm for k shortest simple
paths. Networks, 12(4):411–427, 1982.
[58] V. King. Fully dynamic algorithms for maintaining all-pairs shortest paths and
transitive closure in digraphs. In Proc. IEEE FOCS, pages 81–89. IEEE, 1999.
[60] H. O. Lancaster. Pairwise statistical independence. Annals of Mathematical
Statistics, 36(4):1313–1317, 1965.
[61] E. L. Lawler. A procedure for computing the k best solutions to discrete opti-
mization problems and its application to the shortest path problem. Manage-
ment Science, 18(7):401–405, 1972.
[63] C. Lenzen and B. Patt-Shamir. Fast partial distance estimation and applica-
tions. In Proc. PODC, pages 153–162. ACM, 2015.
[65] C. Lenzen and D. Peleg. Efficient distributed source detection with limited
bandwidth. In Proc. PODC, pages 375–382. ACM, 2013.
[66] A. Lincoln, V. V. Williams, and R. Williams. Tight hardness for shortest cycles
and paths in sparse graphs. In Proc. SODA, pages 1236–1252. SIAM, 2018.
[67] A. Lingas and E.-M. Lundell. Efficient approximation algorithms for shortest
cycles in undirected graphs. Inf. Proc. Lett., 109(10):493–498, 2009.
[71] J. B. Orlin and A. Sedeno-Noda. An O(nm) time algorithm for finding the min
length directed cycle in a graph. In Proc. SODA. SIAM, 2017.
[74] D. Peleg, L. Roditty, and E. Tal. Distributed algorithms for network diameter
and girth. In Proc. ICALP, pages 660–672. Springer, 2012.
[75] D. Peleg and V. Rubinovich. A near-tight lower bound on the time complexity
of distributed MST construction. In Proc. FOCS, pages 253–261. IEEE, 1999.
[78] L. Roditty and V. Vassilevska Williams. Fast approximation algorithms for the
diameter and radius of sparse graphs. In Proc. STOC, pages 515–524. ACM,
2013.
[79] L. Roditty and V. V. Williams. Minimum weight cycles and triangles: Equiva-
lences and algorithms. In Proc. FOCS, pages 180–189. IEEE, 2011.
[80] L. Roditty and U. Zwick. Replacement paths and k simple shortest paths in
unweighted directed graphs. ACM Trans. Alg. (TALG), 8(4):33, 2012.
[81] P. Sankowski and K. Węgrzycki. Improved distance queries and cycle counting
by Frobenius Normal Form. In Proc. STACS, pages 56:1–56:14, 2017.
[82] R. Seidel. On the all-pairs-shortest-path problem in unweighted undirected
graphs. Jour. Comput. Sys. Sci., 51(3):400–403, 1995.
[83] A. Shoshan and U. Zwick. All pairs shortest paths in undirected graphs with
integer weights. In Proc. FOCS, pages 605–614. IEEE, 1999.
[85] M. Thorup. Undirected single source shortest paths in linear time. In Proc.
FOCS, pages 12–21. IEEE, 1997.
[89] H. Weinblatt. A new search algorithm to find the elementary circuits of a graph.
JACM, 19:43–56, 1972.
[90] R. Williams. A new algorithm for optimal 2-constraint satisfaction and its
implications. Theoretical Computer Science, 348(2):357–365, 2005.
[93] R. Yuster. A shortest cycle for each vertex of a graph. Inf. Proc. Lett.,
111(21):1057–1061, 2011.
[94] U. Zwick. All pairs shortest paths using bridging sets and rectangular matrix
multiplication. JACM, 49(3):289–317, 2002.
Vita
1 LaTeX 2ε is an extension of LaTeX. LaTeX is a collection of macros for TeX. TeX is a
trademark of the American Mathematical Society. The macros used in formatting this
dissertation were written by Dinesh Das, Department of Computer Sciences, The University
of Texas at Austin, and extended by Bert Kay, James A. Bednar, and Ayman El-Khashab.