P NP-merged
Undecidable Languages
What is an efficient algorithm?
For Comparison
● Longest increasing subsequence:
● Naive: O(n · 2^n)
● Fast: O(n^2)
● Shortest path problem:
● Naive: O(n^2 · n!)
● Fast: O(n + m), where n is the number of nodes and m the number of edges. (Take CS161 for details!)
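To make the gap between the two longest-increasing-subsequence runtimes concrete, here is a minimal sketch of both approaches. The function names `lis_naive` and `lis_fast` are illustrative choices, not names from the lecture, and the fast version is one standard O(n^2) dynamic program rather than necessarily the one the slides have in mind.

```python
from itertools import combinations

def lis_naive(seq):
    """Try every subsequence, longest first: roughly O(n * 2^n) time."""
    n = len(seq)
    for size in range(n, 0, -1):
        for idxs in combinations(range(n), size):
            # Check that the chosen elements are strictly increasing.
            if all(seq[a] < seq[b] for a, b in zip(idxs, idxs[1:])):
                return size
    return 0

def lis_fast(seq):
    """Dynamic programming in O(n^2) time.

    best[i] is the length of the longest increasing subsequence ending at i.
    """
    best = []
    for i, x in enumerate(seq):
        longest_before = max((best[j] for j in range(i) if seq[j] < x), default=0)
        best.append(1 + longest_before)
    return max(best, default=0)

numbers = [4, 3, 11, 9, 7, 13, 5, 6, 1, 12, 2, 8, 0, 10]
print(lis_naive(numbers), lis_fast(numbers))
```

Both functions agree on every input, but the naive one examines up to 2^n index sets while the fast one does about n^2 comparisons.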
Polynomials and Exponentials
● An algorithm runs in polynomial time if its runtime is some polynomial in n.
● That is, time O(n^k) for some constant k.
● Polynomial functions “scale well.”
● Small changes to the size of the input do not
typically induce enormous changes to the
overall runtime.
● Exponential functions scale terribly.
● Small changes to the size of the input induce
huge changes in the overall runtime.
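A quick way to see the difference in scaling is to tabulate one polynomial and one exponential side by side. The sketch below uses n^2 and 2^n purely as representative examples. The helper name `growth` is hypothetical.

```python
def growth(n):
    """Return (n^2, 2^n): a polynomial and an exponential step count."""
    return n ** 2, 2 ** n

# Growing n by one barely moves n^2 but doubles 2^n.
for n in (10, 11, 20, 40):
    poly, expo = growth(n)
    print(f"n = {n:>2}   n^2 = {poly:>5}   2^n = {expo}")
```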
The Cobham-Edmonds Thesis
Like the Church-Turing thesis, this is not a theorem! It's an assumption about the nature of efficient computation, and it is somewhat controversial.
The Cobham-Edmonds Thesis
● Efficient runtimes:
● 4n + 13
● n^3 – 2n^2 + 4n
● n log log n
● Inefficient runtimes:
● 2^n
● n!
● n^n
● “Efficient” runtimes:
● n^1,000,000,000,000
● 10^500
● “Inefficient” runtimes:
● n^(0.0001 log n)
● 1.000000001^n
The Complexity Class P
● The complexity class P (for polynomial
time) contains all problems that can be
solved in polynomial time.
● Formally:
P = { L | There is a polynomial-time
decider for L }
● Assuming the Cobham-Edmonds thesis, a
language is in P if it can be decided
efficiently.
Examples of Problems in P
● All regular languages are in P.
● All have linear-time TMs.
● All CFLs are in P.
● Requires a more nuanced argument (the
CYK algorithm or Earley's algorithm.)
● Many other problems are in P.
● More on that in a second.
[Figure: nested classes of languages. Regular Languages inside CFLs, inside P, inside R, with the Undecidable Languages outside R.]
Problems in P
● Graph connectivity:
Given a graph G and nodes s and t,
is there a path from s to t?
● Primality testing:
Given a number p, is p prime? (Best known TM for this takes time O(n^37).)
● Maximum matching:
Given a set of tasks and workers who can perform
those tasks, if each worker performs exactly one
task, can at least n tasks be performed?
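As an illustration of why graph connectivity is in P, here is a minimal breadth-first search sketch. It assumes the graph is given as a dictionary mapping each node to a list of its neighbours. That representation and the name `connected` are choices made here, not something the slides specify.

```python
from collections import deque

def connected(graph, s, t):
    """Return True if there is a path from s to t.

    Each node is enqueued at most once and each edge is scanned at most
    once per endpoint, so the runtime is O(n + m).
    """
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

g = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}
print(connected(g, "A", "C"), connected(g, "A", "D"))
```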
Problems in P
● Remoteness testing:
Given a graph G, are all of the nodes in G
within distance at most k of one another?
● Linear programming:
Given a linear set of constraints
and linear objective function, is the
optimal solution at least n?
● Edit distance:
Given two strings, can the strings be
transformed into one another in at most n
single-character edits?
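The edit distance question can be answered with a standard dynamic program. The sketch below assumes that a "single-character edit" means an insertion, a deletion, or a substitution (Levenshtein distance). The slides do not pin this down, and the function names are illustrative.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b in O(|a| * |b|) time."""
    # prev[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match for free
            ))
        prev = curr
    return prev[-1]

def within_edits(a, b, n):
    """The decision version: can a become b in at most n edits?"""
    return edit_distance(a, b) <= n

print(edit_distance("kitten", "sitting"))
```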
What can't you do in polynomial time?
[Figure: a graph with a start node and an end node.]
How many simple paths are there from the start node to the end node?
[Figure: a set of elements.]
How many subsets of this set are there?
An Interesting Observation
● There are (at least) exponentially many
objects of each of the preceding types.
● However, each of those objects is not very
large.
● Each simple path has length no longer than the
number of nodes in the graph.
● Each subset of a set has no more elements than
the original set.
● This brings us to our next topic...
NP
What if you could magically
guess which element of the
search space was the one
you wanted?
A Sample Problem
4 3 11 9 7 13 5 6 1 12 2 8 0 10
M = “On input ⟨S, k⟩, where S is a sequence of numbers and k is a natural number:
· Nondeterministically guess a subsequence of S.
· If it is an ascending subsequence of length at least k, accept.
· Otherwise, reject.”
Another Problem
[Figure: a graph on the nodes A, B, C, D, E, F.]
M = “On input ⟨G, u, v, k⟩, where G is a graph, u and v are nodes in G, and k ∈ ℕ:
· Nondeterministically guess a permutation of at most k nodes from G.
· If the permutation is a path from u to v, accept.
· Otherwise, reject.”
How do we measure NTM efficiency?
Analyzing NTMs
● When discussing deterministic TMs, the notion of
time complexity is (reasonably)
straightforward.
● Recall: One way of thinking about
nondeterminism is as a tree.
● In a deterministic computation,
the tree is a straight line.
● The time complexity is the
height of that straight line.
Analyzing NTMs
● In a nondeterministic computation, the time complexity is the height of the tree (the length of the longest possible sequence of choices we could make).
● Intuition: If you ran all possible
branches in parallel, how long would
it take before all branches completed?
The Size of the Tree
From NTMs to TMs
● Theorem: For any NTM with time complexity f(n), there is a TM with time complexity 2^O(f(n)).
● It is unknown whether it is possible
to do any better than this in the
general case.
● NTMs are capable of exploring multiple
options in parallel; this “seems”
inherently faster than deterministic
computation.
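A rough sketch of where the exponential bound comes from, under two assumptions that are not stated on the slides: the NTM has at most b ≥ 2 choices at each step (b is a constant determined by the machine), and the deterministic simulation spends O(f(n)) time re-deriving each node of the computation tree.

```latex
\begin{align*}
\#\text{nodes in the computation tree}
  &\le \sum_{i=0}^{f(n)} b^{i} \;\le\; b^{\,f(n)+1}, \\
\text{deterministic simulation time}
  &\le c \cdot f(n) \cdot b^{\,f(n)+1}
   \;=\; c \cdot f(n) \cdot 2^{(f(n)+1)\log_2 b}
   \;=\; 2^{O(f(n))}.
\end{align*}
```

The height of the tree is f(n), so the simulation pays for every branch one after another instead of exploring them in parallel.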
The Complexity Class NP
● The complexity class NP
(nondeterministic polynomial time)
contains all problems that can be solved in
polynomial time by an NTM.
● Formally:
NP = { L | There is an NTM that
decides L in non-deterministic
polynomial time. }
● What types of problems are in NP?
A Problem in NP
● Does an n^2 × n^2 Sudoku grid have a solution?
● M = “On input ⟨S⟩, an encoding of a Sudoku puzzle:
– Nondeterministically guess how to fill in all the squares.
– Deterministically check whether the guess is correct.
– If so, accept; if not, reject.”
[Figure: a completed 9 × 9 Sudoku grid.]
For an arbitrary n^2 × n^2 grid:
Total number of cells in the grid: n^4
Total time to fill in the grid: O(n^4)
Total number of rows, columns, and boxes to check: O(n^2)
Total time required to check each row/column/box: O(n^2)
Total runtime: O(n^4)
A Problem in NP
● A k-coloring of an undirected graph G is a way of
assigning one of k colors to each node in G such that
no two nodes joined by an edge have the same color.
● Applications in compilers, cell phone towers, etc.
● Question: Given a graph G and a number k, is graph
G k-colorable?
● M = “On input ⟨G, k⟩:
● Nondeterministically guess a k-coloring of the nodes of G.
● Deterministically check whether it is legal.
● If so, accept; if not, reject.”
Other Problems in NP
● Subset sum:
Given a set S of natural numbers and a target
number n, is there a subset of S that sums to n?
● Longest path:
Given a graph G, a pair of nodes u and v, and a
number k, is there a simple path from u to v of
length at least k?
● Job scheduling:
Given a set of jobs J, a number of workers k, and
a time limit t, can the k workers, working in
parallel, complete all jobs in J within time t?
Problems and Languages
● Abstract question: does a Sudoku grid
have a solution?
● Formalized as a language:
SUDOKU = { ⟨S⟩ | S is a solvable
Sudoku grid. }
● In other words:
S is solvable iff ⟨S⟩ ∈ SUDOKU
Problems and Languages
● Abstract question: can a graph be
colored with k colors?
● Formalized as a language:
COLOR = { ⟨G, k⟩ | G is an undirected
graph, k ∈ ℕ, and
G is k-colorable. }
● In other words:
G is k-colorable iff ⟨G, k⟩ ∈ COLOR
An Intuition for NP
● Intuitively, a language L is in NP if for
every w ∈ L, there is an efficient way to
prove to someone that w ∈ L.
● Analogous to the verifier intuition for RE,
except that we need to be able to
efficiently prove strings are in the
language.
A Problem in NP
[Figure: a partially filled 9 × 9 Sudoku puzzle.]
A Problem in NP
2 5 7 9 6 4 1 8 3
4 9 1 8 7 3 6 5 2
3 8 6 1 2 5 9 4 7
6 4 5 7 3 2 8 1 9
7 1 9 5 4 8 3 2 6
8 3 2 6 1 9 5 7 4
1 6 3 2 5 7 4 9 8
5 7 8 4 9 6 2 3 1
9 2 4 3 8 1 7 6 5
A Problem in NP
4 3 11 9 7 13 5 6 1 12 2 8 0 10
[Figure: a graph with numbered nodes.]
Is there a simple path that goes
through every node exactly once?
Formal-language framework
Alphabet = finite set of symbols
● M = “On input ⟨S, A⟩, an encoding of a Sudoku puzzle and an alleged solution to it:
– Deterministically check whether A is a solution to S.
– If so, accept; if not, reject.”
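The Sudoku verifier can be sketched in a few lines. The encoding used here is an assumption made for illustration: both the puzzle S and the alleged solution A are lists of rows of integers, with 0 marking a blank cell in S, and S is assumed to be well formed. The name `verify_sudoku` is hypothetical.

```python
def verify_sudoku(puzzle, answer, n=3):
    """Check whether `answer` solves the n^2 x n^2 Sudoku `puzzle`.

    There are O(n^2) rows, columns, and boxes, each checked in O(n^2)
    time, so the whole check takes O(n^4) time.
    """
    size = n * n
    # The alleged solution must have the right shape.
    if len(answer) != size or any(len(row) != size for row in answer):
        return False
    # The alleged solution must agree with every given clue.
    for r in range(size):
        for c in range(size):
            if puzzle[r][c] != 0 and puzzle[r][c] != answer[r][c]:
                return False
    wanted = set(range(1, size + 1))
    rows = [set(row) for row in answer]
    cols = [{answer[r][c] for r in range(size)} for c in range(size)]
    boxes = [
        {answer[br + i][bc + j] for i in range(n) for j in range(n)}
        for br in range(0, size, n)
        for bc in range(0, size, n)
    ]
    # Every row, column, and box must contain each symbol exactly once.
    return all(group == wanted for group in rows + cols + boxes)
```

Note that the verifier never searches for a solution. It only checks the one it is handed.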
A Problem in NP
● M = “On input ⟨⟨G, k⟩, C⟩, where C is an alleged
coloring:
● Deterministically check whether C is a legal k-
coloring of G.
● If so, accept; if not, reject.”
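The coloring verifier is similarly short. In this sketch the graph is assumed to be given as a list of nodes plus a list of edges, colors are the integers 0 through k - 1, and the alleged coloring C is a dictionary from nodes to colors. All of these encodings, and the name `is_legal_coloring`, are illustrative assumptions.

```python
def is_legal_coloring(nodes, edges, k, coloring):
    """Check whether `coloring` is a legal k-coloring in O(n + m) time."""
    # Every node must receive one of the k allowed colors.
    for v in nodes:
        if v not in coloring or coloring[v] not in range(k):
            return False
    # No edge may join two nodes of the same color.
    return all(coloring[u] != coloring[v] for u, v in edges)

triangle_nodes = ["a", "b", "c"]
triangle_edges = [("a", "b"), ("b", "c"), ("a", "c")]
print(is_legal_coloring(triangle_nodes, triangle_edges, 3, {"a": 0, "b": 1, "c": 2}))
```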
The Verifier Definition of NP
● Theorem: If there is a polynomial-time
verifier V for L, then L ∈ NP.
● Proof idea: Build an NTM that
nondeterministically guesses a certificate,
then deterministically runs the verifier to
check it. Then, argue that the NTM runs in
nondeterministic polynomial time. ■
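The proof idea can be mimicked in ordinary code, with one important caveat: a deterministic program has to replace the nondeterministic guess by a loop over every possible certificate, which takes exponential time. The sketch below uses subset sum as the example language and encodes a certificate as one 0/1 flag per number. Both function names and the encoding are assumptions made for illustration.

```python
from itertools import product

def verify_subset_sum(instance, certificate):
    """Polynomial-time verifier: do the flagged numbers sum to the target?"""
    numbers, target = instance
    if len(certificate) != len(numbers):
        return False
    return sum(x for x, bit in zip(numbers, certificate) if bit) == target

def guess_and_check(verifier, instance, cert_len):
    """Deterministic stand-in for the NTM built from a verifier.

    Each certificate tried here plays the role of one branch of the NTM.
    The machine accepts if and only if some branch accepts.
    """
    return any(
        verifier(instance, cert)
        for cert in product((0, 1), repeat=cert_len)
    )

numbers = [3, 34, 4, 12, 5, 2]
print(guess_and_check(verify_subset_sum, (numbers, 9), len(numbers)))
```

On an actual NTM each branch runs only the verifier, so the nondeterministic running time stays polynomial even though this loop does not.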
The Verifier Definition of NP
● Theorem: If L ∈ NP, there is a
polynomial-time verifier for it.
● Proof sketch: Use the general
construction that turns an NTM into a
verifier, and argue that the overall
construction runs in polynomial time. ■
The Most Important Question in Theoretical Computer Science
What is the connection between P and NP?
P = { L | There is a polynomial-time
decider for L }
NP = { L | There is a nondeterministic
polynomial-time decider for L }
P ⊆ NP
Does P = NP?
P ≟ NP
● The P ≟ NP question is the most important question in
theoretical computer science.
● With the verifier definition of NP, one way of phrasing
this question is
A matching, but not a maximum matching.
Maximum Matching
● Given an undirected graph G, a matching in G is a
set of edges such that no two edges share an
endpoint.
● A maximum matching is a matching with the
largest number of edges.
A maximum matching.
Maximum matchings are not necessarily unique.
Maximum Matching
● Jack Edmonds' paper “Paths, Trees, and
Flowers” gives a polynomial-time
algorithm for finding maximum
matchings.
● (This is the same Edmonds as in the “Cobham-Edmonds Thesis.”)
● Using this fact, what other problems can
we solve?
Domino Tiling
A Domino Tiling Reduction
● Let MATCHING be the language defined as
follows:
MATCHING = { ⟨G, k⟩ | G is an undirected graph
with a matching of size at least k }
● Theorem (Edmonds): MATCHING ∈ P.
● Let DOMINO be this language:
DOMINO = { ⟨D, k⟩ | D is a grid and k
nonoverlapping dominoes can be placed on D. }
● We'll use the fact that MATCHING ∈ P to
prove that DOMINO ∈ P.
Solving Domino Tiling
The Setup
● To determine whether you can place at
least k dominoes on a crossword grid, do
the following:
● Convert the grid into a graph: each empty
cell is a node, and any two adjacent empty
cells have an edge between them.
● Ask whether that graph has a matching of
size k or greater.
● Return whatever answer you get.
In Pseudocode
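The steps from the setup can be sketched as follows. Two things here are assumptions rather than the lecture's own method. First, the grid is encoded as a list of strings in which "." is an empty cell and "#" is a blocked one. Second, the matching routine is not Edmonds' general algorithm: a grid graph is bipartite (color the cells like a checkerboard), so a simpler augmenting-path routine suffices and still runs in polynomial time.

```python
def max_dominoes(grid):
    """Size of a maximum matching in the graph of adjacent empty cells."""
    rows = len(grid)
    cols = len(grid[0]) if grid else 0

    def neighbours(r, c):
        # Adjacent empty cells are exactly the edges of the graph.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == ".":
                yield nr, nc

    match = {}  # maps an "odd" cell to the "even" cell it is paired with

    def augment(cell, seen):
        # Try to pair `cell`, re-pairing earlier cells if necessary.
        for nb in neighbours(*cell):
            if nb in seen:
                continue
            seen.add(nb)
            if nb not in match or augment(match[nb], seen):
                match[nb] = cell
                return True
        return False

    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == "." and (r + c) % 2 == 0:
                if augment((r, c), set()):
                    count += 1
    return count

def domino(grid, k):
    """Decide DOMINO: convert to a graph, ask the matching question, return the answer."""
    return max_dominoes(grid) >= k

print(domino(["...", ".#.", "..."], 4))
```

The reduction itself is just the conversion from grid to graph. All of the real work is delegated to the matching routine.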
PATH ∈ P
Polynomial-time verification
Algorithm A verifies language L if
L = {x ∈ {0, 1}* : ∃y ∈ {0, 1}* s.t. A(x, y) = 1}
SUBSET-SUM ∈ NP
P vs. NP
Not much is known, unfortunately
P ⊆ NP
Write as L1 ≤p L2
Reductions show easiness/hardness
To show L1 is easy, reduce it to something we know
is easy (e.g., matrix mult., network flow, etc.)
L1 ≤p easy
Use algorithm for easy language to decide L1
[Figure: an unbounded tape with cells indexed -3, -2, -1, 0, 1, 2, 3.]
SAT ∈ NPC
If ∃ S’ ⊆ S that sums to t
Includes either vi or vi’ for each i = 1,…, n; if vi ∈ S’, set xi = 1
Each clause cj has at least one vi or vi’ set to 1 since slacks add up to only 3; by above clause is satisfied
Implications of P = NP
Ability to verify a solution ⇒ ability to produce one!