
BLG 336E

Analysis of Algorithms II
Lecture 5:
Greedy Algorithms, Interval Scheduling, Interval
Partitioning, Shortest Paths in a Graph (Dijkstra)

Graphs-Last week
• There are a lot of graphs.

• We want to answer questions about them.


• Efficient routing?
• Community detection/clustering?
• Computing Bacon numbers
• Signing up for classes without violating pre-req constraints
• How to distribute fish in tanks so that none of them will fight.
Recap-Last week
• Depth-first search
• Useful for topological sorting
• Also in-order traversals of BSTs
• Breadth-first search
• Useful for finding shortest paths
• Also for testing bipartiteness
• Both DFS, BFS:
• Useful for exploring graphs, finding connected components, etc.
Greedy Algorithms

An algorithm is greedy if it builds a solution in small steps, choosing a decision at each step myopically [= locally, not considering what may happen ahead] to optimize some underlying criterion.

It is easy to design a greedy algorithm for a problem; there may be many different ways to choose the next step locally.

What is challenging is to design a greedy algorithm that produces either an optimal solution, or a solution close to the optimum.
Proving that the Greedy Solution is Optimal

Approaches to prove that the greedy solution is as good as or better than any other solution:

1) Prove that it "stays ahead" of any other algorithm.
   e.g. Interval Scheduling

2) Exchange argument (more general): consider any possible solution to the problem and gradually transform it into the solution found by the greedy algorithm without hurting its quality.
   e.g. Scheduling to Minimize Lateness
Greedy Analysis Strategies

Greedy algorithm stays ahead. Show that after each step of the greedy algorithm, its solution is at least as good as any other algorithm's.

Exchange argument. Gradually transform any solution to the one found by the greedy algorithm without hurting its quality.

Structural. Discover a simple "structural" bound asserting that every possible solution must have a certain value. Then show that your algorithm always achieves this bound.
Example Problems

Interval Scheduling

Interval Partitioning

Scheduling to Minimize Lateness

Shortest Paths in a Graph (Dijkstra)

The Minimum Spanning Tree Problem
   Prim’s Algorithm, Kruskal’s Algorithm

Huffman Codes and Compression
Interval Scheduling

Interval scheduling.
• Job j starts at sj and finishes at fj.
• Two jobs are compatible if they don't overlap.
• Goal: find a maximum subset of mutually compatible jobs.

(Figure: jobs shown as intervals on a timeline from 0 to 11.)
Interval Scheduling Example 1

Optimal solution? What is the maximum number of mutually compatible jobs?
Interval Scheduling Example 2

Interval Scheduling Example 3

Interval Scheduling Example 4

Interval Scheduling: Greedy Algorithms

Greedy template. Consider jobs in some order. Take each job provided it's compatible with the ones already taken.

[Earliest start time] Consider jobs in ascending order of start time sj.

[Earliest finish time] Consider jobs in ascending order of finish time fj.

[Shortest interval] Consider jobs in ascending order of interval length fj − sj.

[Fewest conflicts] For each job j, count the number of conflicting jobs cj. Schedule in ascending order of conflicts cj.
Interval Scheduling: Greedy Algorithms

Greedy template. Consider jobs in some order. Take each job provided
it's compatible with the ones already taken.

(Counterexample figures: instances on which earliest start time, shortest interval, and fewest conflicts each fail.)
Interval Scheduling: Greedy Algorithm

Greedy algorithm. Consider jobs in increasing order of finish time. Take each job provided it's compatible with the ones already taken.

Sort jobs by finish times so that f1 ≤ f2 ≤ ... ≤ fn.

A ← ∅   (jobs selected)
for j = 1 to n {
   if (job j compatible with A)
      A ← A ∪ {j}
}
return A

Implementation. O(n log n), due to the sorting operation.
• Remember job j*, the job that was added last to A.
• Job j is compatible with A if sj ≥ fj*.

(Figure: the algorithm run on the example timeline.)
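As a concrete illustration, here is a minimal runnable sketch of the earliest-finish-time rule in Python (the function name and job list are illustrative, not from the original slides):

def interval_scheduling(jobs):
    # jobs: list of (start, finish) pairs.
    # Greedy earliest-finish-time: returns a maximum set of compatible jobs.
    selected = []
    last_finish = float("-inf")
    for start, finish in sorted(jobs, key=lambda job: job[1]):  # sort by fj
        if start >= last_finish:      # compatible with the last job taken
            selected.append((start, finish))
            last_finish = finish
    return selected

# Example: the rule keeps (0, 3) and (4, 7) and rejects the overlapping jobs.
print(interval_scheduling([(0, 3), (2, 5), (4, 7), (1, 8)]))

Sorting dominates, so the running time is O(n log n), matching the slide.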
Interval Scheduling: Analysis

Theorem. Greedy algorithm is optimal.

Pf. (by contradiction)
• Assume greedy is not optimal, and let's see what happens.
• Let i1, i2, ..., ik denote the set of jobs selected by greedy.
• Let j1, j2, ..., jm denote the set of jobs in the optimal solution with i1 = j1, i2 = j2, ..., ir = jr for the largest possible value of r.

Job ir+1 finishes no later than jr+1.

Greedy: i1 i2 ... ir ir+1
OPT:    j1 j2 ... jr jr+1 ...

Why not replace job jr+1 with job ir+1?
Interval Scheduling: Analysis

Theorem. Greedy algorithm is optimal.

Pf. (by contradiction)
• Assume greedy is not optimal, and let's see what happens.
• Let i1, i2, ..., ik denote the set of jobs selected by greedy.
• Let j1, j2, ..., jm denote the set of jobs in the optimal solution with i1 = j1, i2 = j2, ..., ir = jr for the largest possible value of r.

Job ir+1 finishes no later than jr+1.

Greedy: i1 i2 ... ir ir+1
OPT:    j1 j2 ... jr ir+1 ...

The solution is still feasible and optimal, but it contradicts the maximality of r. ▪
Interval Partitioning

Interval partitioning.
Aim: Schedule all the requests by using as few resources as possible.
Example: Classroom Scheduling
Lecture j starts at sj and finishes at fj.
Goal: find minimum number of classrooms to schedule all lectures so
that no two occur at the same time in the same room.

Ex: This schedule uses 4 classrooms to schedule 10 lectures.

(Figure: 10 lectures a–j on a 9:00–4:30 timeline, spread over 4 classrooms.)
Interval Partitioning

Ex: This schedule uses only 3.

(Figures: the 4-classroom schedule from before, and a rearrangement of the same 10 lectures into 3 classrooms: rooms {a, e, h}, {b, g, i}, {c, d, f, j}.)
Interval Partitioning: Lower Bound on Optimal Solution

Def. The depth of a set of intervals is the maximum number that pass
over any single point on the time-line.

Key observation. Number of classrooms needed ≥ depth.

Ex: Depth of the schedule below = 3 ⇒ the schedule below is optimal. (a, b, c all contain 9:30.)

Q. Does there always exist a schedule equal to the depth of the intervals?
A. Not in general, but for intervals the greedy algorithm below achieves it.

(Figure: the 3-classroom schedule again.)
Depth of previous schedule

(Figure: the 4-classroom schedule from before.)

• Depth = 3 → that schedule is not optimal.
Interval Partitioning: Greedy Algorithm

Greedy algorithm. Consider lectures in increasing order of start time: assign lecture to any compatible classroom.

Sort intervals by starting time so that s1 ≤ s2 ≤ ... ≤ sn.

d ← 0   (number of allocated classrooms)
for j = 1 to n {
   if (lecture j is compatible with some classroom k)
      schedule lecture j in classroom k
   else
      allocate a new classroom d + 1
      schedule lecture j in classroom d + 1
      d ← d + 1
}

Implementation. O(n log n).
• For each classroom k, maintain the finish time of the last job added.
• Keep the classrooms in a priority queue keyed by that finish time.

(Figure: the resulting 3-classroom schedule.)
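A minimal Python sketch of this algorithm, using heapq as the priority queue of classroom finish times (the lecture data and function name are illustrative):

import heapq

def interval_partitioning(lectures):
    # lectures: list of (start, finish) pairs.
    # Greedy by start time: returns the number of classrooms used.
    rooms = []  # min-heap of finish times, one entry per allocated classroom
    for start, finish in sorted(lectures):       # ascending start time
        if rooms and rooms[0] <= start:          # some classroom is free
            heapq.heapreplace(rooms, finish)     # reuse it
        else:
            heapq.heappush(rooms, finish)        # allocate a new classroom
    return len(rooms)

# Example: three lectures overlap around time 10, so 3 classrooms are used.
print(interval_partitioning([(9, 11), (9.5, 10.5), (10, 12), (11, 13)]))  # -> 3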
Interval Partitioning: Greedy Analysis

Observation. Greedy algorithm never schedules two incompatible


lectures in the same classroom.

Theorem. Greedy algorithm is optimal.


Pf.
Let d = number of classrooms that the greedy algorithm allocates.
Classroom d is opened because we needed to schedule a job, say j,
that is incompatible with all d-1 other classrooms.
Since we sorted by start time, all these incompatibilities are caused
by lectures that start no later than sj.
Thus, we have d lectures overlapping at time sj + ε.
Key observation ⇒ all schedules use ≥ d classrooms. ▪
Scheduling to Minimize Lateness

We have a single resource and a set of n requests to use the resource, each for an interval of time. Each request has a deadline dj and requires a contiguous time interval of length tj, but is willing to be scheduled at any time before the deadline.
Aim: minimize the (maximum) lateness.
Scheduling to Minimize Lateness

Minimizing lateness problem.

Single resource processes one job at a time.
• Job j requires tj units of processing time and is due at time dj.
• If j starts at time sj, it finishes at time fj = sj + tj.
• Lateness: ℓj = max{ 0, fj − dj }.
• Goal: schedule all jobs to minimize maximum lateness L = maxj ℓj.

Ex:
job  1  2  3  4  5   6
tj   3  2  1  4  3   2
dj   6  8  9  9  14  15

(Figure: the schedule 3, 2, 6, 1, 5, 4 on a 0–15 timeline; job 1 has lateness 2, job 5 has lateness 0, and job 4 gives max lateness = 6.)
Minimizing Lateness: Greedy Algorithms

Greedy template. Consider jobs in some order.

[Shortest processing time first] Consider jobs in ascending order of processing time tj.

[Earliest deadline first] Consider jobs in ascending order of deadline dj.

[Smallest slack] Consider jobs in ascending order of slack dj − tj.
Minimizing Lateness: Greedy Algorithms

Greedy template. Consider jobs in some order.

[Shortest processing time first] Consider jobs in ascending order of processing time tj.

counterexample:
job  1    2
tj   1    10
dj   100  10

[Smallest slack] Consider jobs in ascending order of slack dj − tj.

counterexample:
job  1   2
tj   1   10
dj   2   10
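These counterexamples are easy to check mechanically; a small Python helper (hypothetical, for illustration only) computes the max lateness of a given job order:

def max_lateness(jobs, order):
    # jobs: dict name -> (t, d); order: list of names, run back to back.
    time, worst = 0, 0
    for name in order:
        t, d = jobs[name]
        time += t                     # the job finishes at `time`
        worst = max(worst, time - d)  # lateness = max(0, finish - deadline)
    return worst

spt = {1: (1, 100), 2: (10, 10)}      # SPT counterexample
print(max_lateness(spt, [1, 2]))      # SPT order: max lateness 1
print(max_lateness(spt, [2, 1]))      # EDF order: max lateness 0

slack = {1: (1, 2), 2: (10, 10)}      # smallest-slack counterexample
print(max_lateness(slack, [2, 1]))    # slack order: max lateness 9
print(max_lateness(slack, [1, 2]))    # EDF order: max lateness 1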
Minimizing Lateness: Greedy Algorithm
Greedy algorithm. Earliest deadline first.

job  1  2  3  4  5   6
tj   3  2  1  4  3   2
dj   6  8  9  9  14  15

Sort the n jobs by deadline so that d1 ≤ d2 ≤ … ≤ dn

t ← 0
for j = 1 to n
   Assign job j to interval [t, t + tj]
   sj ← t, fj ← t + tj
   t ← t + tj
output intervals [sj, fj]

(Figure: the EDF schedule runs jobs 1, 2, 3, 4, 5, 6 back to back on a 0–15 timeline; max lateness = 1.)
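A direct Python translation of this pseudocode (a minimal sketch, assuming jobs are given as (tj, dj) pairs):

def edf_schedule(jobs):
    # jobs: list of (t, d) pairs. Earliest deadline first, no idle time.
    intervals, time, worst = [], 0, 0
    for t, d in sorted(jobs, key=lambda job: job[1]):  # ascending deadline
        intervals.append((time, time + t))             # run this job next
        time += t
        worst = max(worst, time - d)
    return intervals, max(0, worst)

# The slide's example: max lateness 1 (the job with t=4, d=9 finishes at 10).
jobs = [(3, 6), (2, 8), (1, 9), (4, 9), (3, 14), (2, 15)]
print(edf_schedule(jobs)[1])  # -> 1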
Minimizing Lateness: No Idle Time

Observation. There exists an optimal schedule with no idle time (no "gaps" between the scheduled jobs).

(Figure: a schedule with a gap vs. the same jobs, deadlines d = 4, 6, 12, compacted on a 0–11 timeline.)

Observation. The greedy schedule has no idle time.

This is good, since the aggregate execution time cannot be smaller. We must still check that it minimizes the maximum lateness.
Minimizing Lateness: Inversions

Def. An inversion in schedule S is a pair of jobs i and j such that di < dj but j is scheduled before i.

(Figure: before the swap, j runs immediately before i.)

Observation. The greedy schedule has no inversions.

Observation. If a schedule (with no idle time) has an inversion, it has one with a pair of inverted jobs scheduled consecutively.
Minimizing Lateness: Inversions

Def. An inversion in schedule S is a pair of jobs i and j such that di < dj but j is scheduled before i.

(Figure: before the swap, j runs immediately before i, and i finishes at time fi; after the swap, i runs first and j finishes at time f'j = fi.)

Claim. Swapping two adjacent, inverted jobs reduces the number of inversions by one and does not increase the max lateness.

Pf. Let ℓ be the lateness before the swap, and let ℓ' be the lateness afterwards.
• ℓ'k = ℓk for all k ≠ i, j
• ℓ'i ≤ ℓi
• If job j is late:
     ℓ'j = f'j − dj   (definition)
         = fi − dj    (j now finishes at time fi)
         ≤ fi − di    (since di < dj)
         ≤ ℓi         (definition)
Minimizing Lateness: Example

Def. An inversion in schedule S is a pair of jobs i and j such that di < dj but j is scheduled before i.

(The swap argument from the previous slide, applied to the example below.)

job  1  2  3  4  5   6
tj   3  2  1  4  3   2
dj   6  8  9  9  14  15

(Figure: the EDF schedule 1, 2, 3, 4, 5, 6 on a 0–15 timeline; it has no inversions, and max lateness = 1.)
Minimizing Lateness: Analysis of Greedy Algorithm

All schedules with no inversions and no idle time have the same maximum lateness.

Theorem. Greedy schedule S is optimal.


Pf. Define S* to be an optimal schedule that has the fewest number of
inversions, and let's see what happens.
Can assume S* has no idle time.
If S* has no inversions, then S = S*.
If S* has an inversion, let i-j be an adjacent inversion.
– swapping i and j does not increase the maximum lateness and
strictly decreases the number of inversions
– this contradicts definition of S* ▪

Shortest Paths in a Graph

(Figure: the shortest path from the Princeton CS department to Einstein's house.)
Graphs-Recap

Undirected graph. G = (V, E)


V = nodes.
E = edges between pairs of nodes.
Captures pairwise relationship between objects.
Graph size parameters: n = |V|, m = |E|.

V = { 1, 2, 3, 4, 5, 6, 7, 8 }
E = { 1-2, 1-3, 2-3, 2-4, 2-5, 3-5, 3-7, 3-8, 4-5, 5-6, 7-8 }
n = 8
m = 11
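For later examples it helps to fix a concrete representation. A minimal adjacency-list sketch in Python (assuming the vertex and edge sets above; the helper name is illustrative):

from collections import defaultdict

def build_graph(edges):
    # Adjacency list for an undirected graph; edges are (u, v) pairs.
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj

edges = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5),
         (3, 7), (3, 8), (4, 5), (5, 6), (7, 8)]
G = build_graph(edges)
print(len(G), sum(len(nbrs) for nbrs in G.values()) // 2)  # n = 8, m = 11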
Single-Source Shortest Path

Shortest Path Problem

Shortest path network.
• Directed graph G = (V, E).
• Source s, destination t.
• Length ℓe = length of edge e.

Shortest path problem: find the shortest directed path from s to t.

cost of path = sum of edge costs in path

(Figure: a weighted digraph in which the path s-2-3-5-t has cost 9 + 23 + 2 + 16 = 48.)
Example

(Quiz figure: which of these is the shortest path?)

A. 0,1,2,3
B. 0,1,4,7
C. 0,1,4,6
D. 0,1,3,6
Shortest Path
• What if the graphs are weighted?
• All nonnegative weights: Dijkstra!
• If there are negative weights: Bellman-Ford! (if time permits)

(Figure: a campus map with locations Caltrain, Hospital, Stadium, Gates ("you are here"), Packard, CS161, Union, Dish.)
Just the graph

How do I get from Gates to the Union?

(Figure: the campus graph, unweighted.)

Run BFS: "I should go to the Dish and then back to the Union!" That doesn't make sense if I label the edges by walking time.
Just the graph, weighted

How do I get from Gates to the Union?

(Figure: the campus graph with edge weights: w(u,v) = weight of the edge between u and v. For now, edge weights are non-negative.)

If I pay attention to the weights: I should go to Packard, then CS161, then the Union.
Shortest path problem
• What is the shortest path between u and v in a weighted graph?
  • The cost of a path is the sum of the weights along that path.
  • The shortest path is the one with the minimum cost.

(Figure: two s-t paths; one has cost 3 + 20 + 2 = 25, the other has cost 2 + 1 + 1 + 1 = 5 and is shorter.)

• The distance d(u,v) between two vertices u and v is the cost of the shortest path between u and v.
• For this lecture all graphs are directed, but to save on notation I'm just going to draw undirected edges (each standing for directed edges in both directions).
Shortest paths

(Figure: the shortest path from Gates to the Union is Gates–Packard–CS161–Union; it has cost 6.)

Q: What's the shortest path from Packard to the Union?
Warm-up
• A sub-path of a shortest path is also a shortest path.
• Say this is a shortest path from s to t, passing through a vertex x.
• Claim: its prefix is a shortest path from s to x.
• Suppose not, so some other s-x path is shorter.
• But then that gives an even shorter path from s to t, a contradiction!

(Figure: an s-t path through x.)
Single-source shortest-path problem
• I want to know the shortest path from one vertex
(Gates) to all other vertices.

Destination   Cost   To get there
Packard       1      Packard
CS161         2      Packard-CS161
Hospital      10     Hospital
Caltrain      17     Caltrain
Union         6      Packard-CS161-Union
Stadium       10     Stadium
Dish          23     Packard-Dish

(Not necessarily stored as a table; how this information is represented will depend on the application.)
Example
• “what is the
shortest path from
Palo Alto to
[anywhere else]”
using BART, Caltrain,
lightrail, MUNI, bus,
Amtrak, bike,
walking, uber/lyft.
• Edge weights have
something to do
with time, money,
hassle. (They also
change depending on my
mood and traffic…).
Example
• Network routing
• I send information over the internet, from my computer to computers all over the world.
• Each path has a cost which depends on link length, traffic, other costs, etc.
• How should we send packets?
Back to this example

(Figure: the campus graph again.)
Dijkstra’s algorithm

• What are the shortest paths from Gates to everywhere else?

(Figure: the weighted campus graph, with edge weights 1, 1, 4, 20, 22, 25.)
Dijkstra intuition

(Figures, "YOINK!": hang the graph from the start vertex, Gates, with edges as strings whose lengths are the edge weights, and pull Gates up; vertices lift off the ground in order of their distance from Gates.)

A vertex is done when it's not on the ground anymore.

This also creates a tree structure! The shortest paths are the lengths along this tree.
How do we actually implement this?

• Without string and gravity?


Dijkstra by example

How far is each node from Gates?

• x = d[v] is my best over-estimate for dist(Gates, v); each node is either "I'm sure" or "I'm not sure yet".
• Initialize d[v] = ∞ for all non-starting vertices v, and d[Gates] = 0.
• Then repeat:
  • Pick the not-sure node u with the smallest estimate d[u].
  • Update all u’s neighbors v: d[v] = min( d[v] , d[u] + edgeWeight(u,v) )
  • Mark u as sure.

(Figures: successive frames of the run. Starting from d[Gates] = 0 and every other estimate ∞: processing Gates sets d[Packard] = 1 and d[Dish] = 25; processing Packard sets d[CS161] = 2 and improves d[Dish] to 23; processing CS161 sets d[Union] = 6; the remaining picks mark the last nodes sure without changing any estimates.)
Dijkstra’s algorithm
Dijkstra(G,s):
• Set all vertices to not-sure
• d[v] = ∞ for all v in V
• d[s] = 0
• While there are not-sure nodes:
• Pick the not-sure node u with the smallest estimate d[u].
• For v in u.neighbors:
• d[v] ← min( d[v] , d[u] + edgeWeight(u,v))
• Mark u as sure.
• Now d(s, v) = d[v]
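A direct Python translation of this pseudocode (a minimal sketch: the graph is an adjacency dict with weights, and the campus edge weights below are reconstructed from the example frames, so treat them as illustrative):

import math

def dijkstra(graph, s):
    # graph: {u: [(v, weight), ...]} with non-negative weights.
    d = {v: math.inf for v in graph}
    d[s] = 0
    not_sure = set(graph)
    while not_sure:
        u = min(not_sure, key=lambda v: d[v])  # smallest estimate d[u]
        for v, w in graph[u]:
            d[v] = min(d[v], d[u] + w)         # relax edge (u, v)
        not_sure.remove(u)                     # mark u as sure
    return d

# Edge weights reconstructed from the example frames (illustrative).
G = {
    "Gates":   [("Packard", 1), ("Dish", 25)],
    "Packard": [("Gates", 1), ("CS161", 1), ("Dish", 22)],
    "CS161":   [("Packard", 1), ("Union", 4)],
    "Union":   [("CS161", 4), ("Dish", 20)],
    "Dish":    [("Gates", 25), ("Packard", 22), ("Union", 20)],
}
print(dijkstra(G, "Gates"))  # Packard 1, CS161 2, Union 6, Dish 23

Picking the minimum by scanning a set makes this the O(n²) "array" variant discussed below.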
As usual
• Does it work?
• Yes.

• Is it fast?
• Depends on how you implement it.
Why does this work?
• Theorem:
• Run Dijkstra on G =(V,E), starting from s.
• At the end of the algorithm, the estimate d[v] is the actual
distance d(s,v).
Let’s rename “Gates” to
“s”, our starting vertex.
• Proof outline:
• Claim 1: For all v, d[v] ≥ d(s,v).
• Claim 2: When a vertex v is marked sure, d[v] = d(s,v).

• Claims 1 and 2 imply the theorem.


• By the time we are sure about v, d[v] = d(s,v).
• d[v] never increases, so after v is sure, d[v] stops changing.
• All vertices are eventually sure. (Stopping condition in algorithm)
• So all vertices end up with d[v] = d(s,v).
Next let’s prove the claims!
Claim 1
d[v] ≥ d(s,v) for all v.

Informally:
• Every time we update d[v], we have a path in mind:
     d[v] ← min( d[v] , d[u] + edgeWeight(u,v) )
  (either whatever path we had in mind before, or the shortest path to u followed by the edge from u to v)
• d[v] = length of the path we have in mind
       ≥ length of the shortest path
       = d(s,v)

Formally:
• We should prove this by induction.
Claim 2
When a vertex u is marked sure, d[u] = d(s,u)
• For s (the start vertex):
• The first vertex marked sure has d[s] = d(s,s) = 0.
• For all the other vertices:
• Suppose that we are about to add u to the sure list.
• That is, we picked u in the first line here:
• Pick the not-sure node u with the smallest estimate d[u].
• Update all u’s neighbors v:
• d[v] ← min( d[v] , d[u] + edgeWeight(u,v))
• Mark u as sure.
• Repeat

• Want to show that d[u] = d(s,u).


Intuition
When a vertex u is marked sure, d[u] = d(s,u).
• The first path that lifts u off the ground is the shortest one.

But let's actually prove it.

(Figure: the string intuition, "YOINK!", with u about to lift off.)
Temporary definition: v is "good" means that d[v] = d(s,v).

Claim 2
• Want to show that u is good.
• Consider a true shortest path from s to u.

(Figure: a true shortest path from s to u; the vertices in between may or may not be sure.)
Temporary definition: v is "good" means that d[v] = d(s,v).

Claim 2 (BWOC = "by way of contradiction")
• Want to show that u is good. BWOC, suppose u isn't good.
• Say z is the last good vertex on the path before u, and z’ is the vertex after z.
• It may be that z = s, and it may be that z’ = u; but z ≠ u, since u is not good.

(Figure: the true shortest path with z and z’ marked; the vertices in between may or may not be sure.)
Temporary definition: v is "good" means that d[v] = d(s,v).

Claim 2
• Want to show that u is good. BWOC, suppose u isn't good.

d[z] = d(s,z) ≤ d(s,u) ≤ d[u]
(z is good)   (z lies on the shortest path from s to u)   (Claim 1)

• If d[z] = d[u], then u is good (the chain above collapses to equality, so d[u] = d(s,u)).
• If d[z] < d[u], then z is sure: we chose u so that d[u] was smallest among the unsure vertices, so z, with a strictly smaller estimate, must already be sure.

(Figure: the Dijkstra shortest path with z and z’ marked; it may be that z = s or z’ = u.)
Temporary definition: v is "good" means that d[v] = d(s,v).

Claim 2
• Want to show that u is good. BWOC, suppose u isn't good.
• If z is sure, then we've already updated z’:
     d[z’] ← min{ d[z’], d[z] + w(z,z’) }, so
     d[z’] ≤ d[z] + w(z,z’) = d(s,z) + w(z,z’) = d(s,z’) ≤ d[z’]
  (the middle equality holds because a sub-path of a shortest path is a shortest path; the last inequality is Claim 1)
• So everything is equal: d(s,z’) = d[z’], and z’ is good. CONTRADICTION, since z was the last good vertex!

(Figure: the path again; it may be that z = s.)
Temporary definition: v is "good" means that d[v] = d(s,v).

Claim 2
• Want to show that u is good. BWOC, suppose u isn't good.

d[z] = d(s,z) ≤ d(s,u) ≤ d[u]
(def. of z)   (z lies on the shortest path from s to u)   (Claim 1)

• If d[z] < d[u], then z is sure, and the previous slide gives a contradiction.
• So d[z] = d[u], and u is good! aka d[u] = d(s,u). ▪

(Figure: the path again; it may be that z = s or z’ = u.)
Claim 2
When a vertex u is marked sure, d[u] = d(s,u)
• For s (the starting vertex):
• The first vertex marked sure has d[s] = d(s,s) = 0.
• For all other vertices:
• Suppose that we are about to add u to the sure list.
• That is, we picked u in the first line here:
• Pick the not-sure node u with the smallest estimate d[u].
• Update all u’s neighbors v:
• d[v] ← min( d[v] , d[u] + edgeWeight(u,v))
• Mark u as sure.
• Repeat

Then u is good! aka d[u] = d(s,u)


Why does this work?
• Theorem:
• Run Dijkstra on G =(V,E) starting from s.
• At the end of the algorithm, the estimate d[v] is the
actual distance d(s,v).

• Proof outline:
• Claim 1: For all v, d[v] ≥ d(s,v).
• Claim 2: When a vertex is marked sure, d[v] = d(s,v).

• Claims 1 and 2 imply the theorem.


What did we just learn?

• Dijkstra’s algorithm finds shortest paths in weighted graphs with non-negative edge weights.
• Along the way, it constructs a nice tree.

(Figure: the shortest-path tree on the campus graph, "YOINK!".)
As usual
• Does it work?
• Yes.

• Is it fast?
• Depends on how you implement it.
Running time?
Dijkstra(G,s):
• Set all vertices to not-sure
• d[v] = ∞ for all v in V
• d[s] = 0
• While there are not-sure nodes:
• Pick the not-sure node u with the smallest estimate d[u].
• For v in u.neighbors:
• d[v] ← min( d[v] , d[u] + edgeWeight(u,v) )
• Mark u as sure.
• Now dist(s, v) = d[v]

• n iterations (one per vertex)


• How long does one iteration take?
Depends on how we implement it…
We need a data structure that:
• Stores the unsure vertices v
• Keeps track of d[v]
• Can find u with minimum d[u]: findMin()
• Can remove that u: removeMin(u)
• Can update (decrease) d[v]: updateKey(v,d)

Just the inner loop:
• Pick the not-sure node u with the smallest estimate d[u].
• Update all u’s neighbors v: d[v] ← min( d[v] , d[u] + edgeWeight(u,v) )
• Mark u as sure.

Total running time is big-oh of:

Σ_{u ∈ V} ( T(findMin) + T(removeMin) + Σ_{v ∈ u.neighbors} T(updateKey) )

= n( T(findMin) + T(removeMin) ) + m·T(updateKey)
If we use an array
• T(findMin) = O(n)
• T(removeMin) = O(n)
• T(updateKey) = O(1)

• Running time of Dijkstra
  = O(n( T(findMin) + T(removeMin) ) + m·T(updateKey))
  = O(n²) + O(m)
  = O(n²)
If we use a red-black tree
• T(findMin) = O(log(n))
• T(removeMin) = O(log(n))
• T(updateKey) = O(log(n))

• Running time of Dijkstra
  = O(n( T(findMin) + T(removeMin) ) + m·T(updateKey))
  = O(n log(n)) + O(m log(n))
  = O((n + m) log(n))

Better than an array if the graph is sparse!
(aka if m is much smaller than n²)
Heaps support these operations
• T(findMin)
• T(removeMin)
• T(updateKey)

(Figure: a min-heap with root 0, children 2 and 3, and leaves 6, 5, 4, 10.)

• A heap is a tree-based data structure that has the property that every node has a smaller key than its children.
• Not covered in this class – see AoA1!!! (Or CLRS.)
• But! We will use them.
Many heap implementations
Nice chart on Wikipedia:
Say we use a Fibonacci Heap
• T(findMin) = O(1) (amortized time*)
• T(removeMin) = O(log(n)) (amortized time*)
• T(updateKey) = O(1) (amortized time*)
• See CS166 for more! (or CLRS)

• Running time of Dijkstra
  = O(n( T(findMin) + T(removeMin) ) + m·T(updateKey))
  = O(n log(n) + m) (amortized time)

*This means that any sequence of d removeMin calls takes time at most O(d log(n)). But a few of the d calls may take longer than O(log(n)), and some may take less time.
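Python's built-in heapq is a plain binary heap with no updateKey, so a common sketch uses "lazy deletion": push a fresh entry on every successful relaxation and skip stale entries when they are popped (same graph format as the earlier Dijkstra sketch; a minimal illustration, not the slides' implementation):

import heapq, math

def dijkstra_heap(graph, s):
    # graph: {u: [(v, weight), ...]} with non-negative weights.
    d = {v: math.inf for v in graph}
    d[s] = 0
    heap = [(0, s)]        # (estimate, vertex); may contain stale entries
    sure = set()
    while heap:
        du, u = heapq.heappop(heap)
        if u in sure:      # stale entry: u was already marked sure
            continue
        sure.add(u)
        for v, w in graph[u]:
            if du + w < d[v]:                  # relax edge (u, v)
                d[v] = du + w
                heapq.heappush(heap, (d[v], v))
    return d

This runs in O((n + m) log(n)), comparable to the red-black-tree bound above.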
In practice

(Figure: measured runtimes.)
• Dijkstra using a list to keep track of vertices has quadratic runtime.
• Dijkstra using a heap looks a bit more linear (actually n log(n)).
• BFS is really fast by comparison! But it doesn't work on weighted graphs.
Dijkstra is used in practice
• e.g., OSPF (Open Shortest Path First), a routing protocol for IP networks, uses Dijkstra.

But there are


some things it’s
not so good at.
Dijkstra Drawbacks
• Needs non-negative edge weights.
• If the weights change, we need to re-run the
whole thing.
• in OSPF, a vertex broadcasts any changes to the
network, and then every vertex re-runs Dijkstra’s
algorithm from scratch.
Recap: shortest paths
• BFS:
• (+) O(n+m)
• (-) only unweighted graphs
• Dijkstra’s algorithm:
• (+) weighted graphs
• (+) O(nlog(n) + m) if you implement it right.
• (-) no negative edge weights
• (-) very “centralized” (need to keep track of all the vertices to know
which to update).
NEXT LECTURE

• The Minimum Spanning Tree


• Prim’s Algorithm
• Kruskal’s Algorithm

