

P NP-merged

The document discusses the concepts of decidability and complexity in computer science, focusing on the classes P and NP. It explains efficient algorithms, the Cobham-Edmonds thesis, and provides examples of problems within these complexity classes. The document also highlights the nature of nondeterminism and its implications for problem-solving in NP.

Uploaded by

deysarnabhahope
Copyright
© © All Rights Reserved

P and NP

Arnab Kumar Ghoshal

Course: M.Sc. Semester - I

Paper: CSMC- 102 ( Data Structures & Algorithms)


The Limits of Decidability

● In computability theory, we ask the question
What problems can be solved by a computer?
● In complexity theory, we ask the question
What problems can be solved efficiently by a computer?
● In the remainder of this course, we will explore this question in more detail.
[Figure: nested language classes: Regular Languages ⊂ CFLs ⊂ Efficiently Decidable Languages ⊂ R, with the Undecidable Languages outside R]
What is an efficient algorithm?
For Comparison
● Longest increasing subsequence:
● Naive: O(n · 2^n)
● Fast: O(n^2)
● Shortest path problem:
● Naive: O(n^2 · n!)
● Fast: O(n + m), where n is the number of nodes and m the number of edges. (Take CS161 for details!)
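As a rough illustration of the gap between the two longest-increasing-subsequence bounds, here is a minimal Python sketch. The function names `lis_naive` and `lis_fast` are made up for this example; the naive version tries every subsequence, and the fast version is the standard quadratic dynamic program, which may or may not be the exact algorithm the slide has in mind.

```python
from itertools import combinations

def lis_naive(seq):
    """Try every subsequence, longest first: roughly O(n * 2^n) time."""
    n = len(seq)
    for length in range(n, 0, -1):
        for idx in combinations(range(n), length):
            # idx is an increasing tuple of positions; check the values ascend.
            if all(seq[a] < seq[b] for a, b in zip(idx, idx[1:])):
                return length
    return 0

def lis_fast(seq):
    """Dynamic programming: O(n^2) time."""
    # best[i] = length of the longest increasing subsequence ending at position i
    best = [1] * len(seq)
    for i in range(len(seq)):
        for j in range(i):
            if seq[j] < seq[i]:
                best[i] = max(best[i], best[j] + 1)
    return max(best, default=0)
```

Both functions return the same length on any input; only the running time differs, and the naive one becomes unusable after a few dozen elements.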
Polynomials and Exponentials

An algorithm runs in polynomial time if
its runtime is some polynomial in n.
● That is, time O(n^k) for some constant k.
● Polynomial functions “scale well.”
● Small changes to the size of the input do not
typically induce enormous changes to the
overall runtime.
● Exponential functions scale terribly.
● Small changes to the size of the input induce
huge changes in the overall runtime.
The Cobham-Edmonds Thesis

A language L can be decided efficiently iff there is a TM that decides it in polynomial time.

Equivalently, L can be decided efficiently iff it can be decided in time O(n^k) for some k ∈ ℕ.

Like the Church-Turing thesis, this is not a theorem!
It's an assumption about the nature of efficient computation, and it is somewhat controversial.
The Cobham-Edmonds Thesis
● Efficient runtimes:
● 4n + 13
● n^3 – 2n^2 + 4n
● n log log n
● Inefficient runtimes:
● 2^n
● n!
● n^n
● “Efficient” runtimes:
● n^1,000,000,000,000
● 10^500
● “Inefficient” runtimes:
● n^(0.0001 log n)
● 1.000000001^n
The Complexity Class P
● The complexity class P (for polynomial
time) contains all problems that can be
solved in polynomial time.
● Formally:
P = { L | There is a polynomial-time
decider for L }
● Assuming the Cobham-Edmonds thesis, a
language is in P if it can be decided
efficiently.
Examples of Problems in P
● All regular languages are in P.
● All have linear-time TMs.
● All CFLs are in P.
● Requires a more nuanced argument (the
CYK algorithm or Earley's algorithm.)
● Many other problems are in P.
● More on that in a second.
[Figure: nested language classes: Regular Languages ⊂ CFLs ⊂ P ⊂ R, with the Undecidable Languages outside R]
Problems in P
● Graph connectivity:
Given a graph G and nodes s and t,
is there a path from s to t?
● Primality testing:
Given a number p, is p prime? (Best known TM
for this takes time O(n^37).)
● Maximum matching:
Given a set of tasks and workers who can perform
those tasks, if each worker performs exactly one
task, can at least n tasks be performed?
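To make the first example concrete, graph connectivity can be decided with breadth-first search, which visits each node and edge at most once. The sketch below is one possible implementation, assuming the graph is given as a dictionary mapping each node to its neighbors; the name `connected` and that input format are choices made for this example.

```python
from collections import deque

def connected(graph, s, t):
    """Return True if there is a path from s to t.

    graph maps each node to an iterable of its neighbors (assumed format).
    """
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for v in graph.get(u, ()):
            if v not in seen:      # each node enters the queue at most once
                seen.add(v)
                queue.append(v)
    return False
```

With n nodes and m edges this runs in O(n + m) time, comfortably polynomial in the size of the input.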
Problems in P
● Remoteness testing:
Given a graph G, are all of the nodes in G
within distance at most k of one another?
● Linear programming:
Given a linear set of constraints
and linear objective function, is the
optimal solution at least n?
● Edit distance:
Given two strings, can the strings be
transformed into one another in at most n
single-character edits?
What can't you do in polynomial time?
How many simple paths are there from the start node to the end node?
How many subsets of this set are there?
An Interesting Observation
● There are (at least) exponentially many
objects of each of the preceding types.
● However, each of those objects is not very
large.
● Each simple path has length no longer than the
number of nodes in the graph.
● Each subset of a set has no more elements than
the original set.
● This brings us to our next topic...
NP
What if you could magically
guess which element of the
search space was the one
you wanted?
A Sample Problem

4 3 11 9 7 13 5 6 1 12 2 8 0 10

M = “On input ⟨S, k⟩, where S is a sequence of numbers and k is a natural number:
· Nondeterministically guess a subsequence of S.
· If it is an ascending subsequence of length at least k, accept.
· Otherwise, reject.”
Another Problem
[Figure: a graph with nodes A, B, C, D, E, F]

M = “On input ⟨G, u, v, k⟩, where G is a graph, u and v are nodes in G, and k ∈ ℕ:
· Nondeterministically guess a permutation of at most k nodes from G.
· If the permutation is a path from u to v, accept.
· Otherwise, reject.”
How do we measure NTM efficiency?
Analyzing NTMs
● When discussing deterministic TMs, the notion of
time complexity is (reasonably)
straightforward.
● Recall: One way of thinking about
nondeterminism is as a tree.
● In a deterministic computation,
the tree is a straight line.
● The time complexity is the
height of that straight line.
● In a nondeterministic computation, the time complexity is the height of the tree (the length of the longest possible sequence of choices we could make).
● Intuition: If you ran all possible branches in parallel, how long would it take before all branches completed?
The Size of the Tree
From NTMs to TMs
● Theorem: For any NTM with time
complexity f(n), there is a TM with time
complexity 2^O(f(n)).
● It is unknown whether it is possible
to do any better than this in the
general case.
● NTMs are capable of exploring multiple
options in parallel; this “seems”
inherently faster than deterministic
computation.
The Complexity Class NP
● The complexity class NP
(nondeterministic polynomial time)
contains all problems that can be solved in
polynomial time by an NTM.
● Formally:
NP = { L | There is an NTM that
decides L in non-deterministic
polynomial time. }
● What types of problems are in NP?
A Problem in NP
● Does an n^2 × n^2 Sudoku grid have a solution?
● M = “On input ⟨S⟩, an encoding of a Sudoku puzzle:
– Nondeterministically guess how to fill in all the squares.
– Deterministically check whether the guess is correct.
– If so, accept; if not, reject.”

For an arbitrary n^2 × n^2 grid:
Total number of cells in the grid: n^4
Total time to fill in the grid: O(n^4)
Total number of rows, columns, and boxes to check: O(n^2)
Total time required to check each row/column/box: O(n^2)
Total runtime: O(n^4)
A Problem in NP
● A k-coloring of an undirected graph G is a way of
assigning one of k colors to each node in G such that
no two nodes joined by an edge have the same color.
● Applications in compilers, cell phone towers, etc.
● Question: Given a graph G and a number k, is graph
G k-colorable?
● M = “On input ⟨G, k⟩:
● Nondeterministically guess a k-coloring of the nodes of G.
● Deterministically check whether it is legal.
● If so, accept; if not, reject.”
Other Problems in NP
● Subset sum:
Given a set S of natural numbers and a target
number n, is there a subset of S that sums to n?
● Longest path:
Given a graph G, a pair of nodes u and v, and a
number k, is there a simple path from u to v of
length at least k?
● Job scheduling:
Given a set of jobs J, a number of workers k, and
a time limit t, can the k workers, working in
parallel complete all jobs in J within time t?
Problems and Languages
● Abstract question: does a Sudoku grid
have a solution?
● Formalized as a language:
SUDOKU = { ⟨S⟩ | S is a solvable
Sudoku grid. }
● In other words:
S is solvable iff ⟨S⟩ ∈ SUDOKU
Problems and Languages
● Abstract question: can a graph be
colored with k colors?
● Formalized as a language:
COLOR = { ⟨G, k⟩ | G is an undirected
graph, k ∈ ℕ, and
G is k-colorable. }
● In other words:
G is k-colorable iff ⟨G, k⟩ ∈ COLOR
An Intuition for NP
● Intuitively, a language L is in NP if for
every w ∈ L, there is an efficient way to
prove to someone that w ∈ L.
● Analogous to the verifier intuition for RE,
except that we need to be able to
efficiently prove strings are in the
language.
A Problem in NP

[Figure: a partially filled 9 × 9 Sudoku puzzle]
A Problem in NP

2 5 7 9 6 4 1 8 3

4 9 1 8 7 3 6 5 2

3 8 6 1 2 5 9 4 7

6 4 5 7 3 2 8 1 9

7 1 9 5 4 8 3 2 6

8 3 2 6 1 9 5 7 4

1 6 3 2 5 7 4 9 8

5 7 8 4 9 6 2 3 1

9 2 4 3 8 1 7 6 5
A Problem in NP

4 3 11 9 7 13 5 6 1 12 2 8 0 10

Is there an ascending subsequence of length at least 7?
A Problem in NP

[Figure: a graph on six nodes labeled 1 through 6]
Is there a simple path that goes
through every node exactly once?
Formal-language framework
Alphabet Σ = finite set of symbols

Language L over Σ is any subset of strings in Σ*

We’ll focus on Σ = {0, 1}

L = {10, 11, 101, 111, 1011, …} is language of primes
Verifiers
● Recall that a verifier for L is a
deterministic TM V such that
● V halts on all inputs.
● w ∈ L iff ∃c ∈ Σ*. V accepts ⟨w, c⟩.

● Theorem: L ∈ RE iff there is a verifier for L.
Polynomial-Time Verifiers
● A polynomial-time verifier for L is a
deterministic TM V such that
● V halts on all inputs.
● w ∈ L iff ∃c ∈ Σ*. V accepts ⟨w, c⟩.
● V's runtime is a polynomial in |w|.
● Theorem: L ∈ NP iff there is a
polynomial-time verifier for L.
A Problem in NP
● Does a Sudoku grid have a solution?
● M = “On input ⟨S, A⟩, an encoding of a Sudoku puzzle and an alleged solution to it:
– Deterministically check whether A is a solution to S.
– If so, accept; if not, reject.”
A Problem in NP
● M = “On input ⟨⟨G, k⟩, C⟩, where C is an alleged
coloring:
● Deterministically check whether C is a legal k-
coloring of G.
● If so, accept; if not, reject.”
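The coloring verifier above can be sketched in a few lines of Python. The input format is an assumption made for this example: `nodes` is a list of nodes, `edges` is a list of pairs that only mention listed nodes, and the certificate `coloring` is a dictionary from node to a color in 0, …, k − 1.

```python
def verify_coloring(nodes, edges, k, coloring):
    """Deterministically check whether coloring is a legal k-coloring."""
    # Every node must receive one of the k allowed colors.
    for v in nodes:
        if coloring.get(v) not in range(k):
            return False
    # No edge may join two nodes of the same color.
    return all(coloring[u] != coloring[v] for u, v in edges)
```

The check looks at each node and each edge once, so it runs in time polynomial in the size of ⟨G, k⟩; finding a certificate that passes it is the part with no known efficient method.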
The Verifier Definition of NP
● Theorem: If there is a polynomial-time
verifier V for L, then L ∈ NP.
● Proof idea: Build an NTM that
nondeterministically guesses a certificate,
then deterministically runs the verifier to
check it. Then, argue that the NTM runs in
nondeterministic polynomial time. ■
The Verifier Definition of NP
● Theorem: If L ∈ NP, there is a
polynomial-time verifier for it.
● Proof sketch: Use the general
construction that turns an NTM into a
verifier, and argue that the overall
construction runs in polynomial time. ■
The Most Important Question in Theoretical Computer Science
What is the connection between P and NP?
P = { L | There is a polynomial-time
decider for L }

NP = { L | There is a nondeterministic
polynomial-time decider for L }

P ⊆ NP
Does P = NP?
P ≟ NP
● The P ≟ NP question is the most important question in
theoretical computer science.
● With the verifier definition of NP, one way of phrasing
this question is
If a solution to a problem can be checked efficiently, can that problem be solved efficiently?
● An answer either way will give fundamental insights into the nature of computation.
Why This Matters
● The following problems are known to be efficiently
verifiable, but have no known efficient solutions:
● Determining whether an electrical grid can be built to link up
some number of houses for some price (Steiner tree problem).
● Determining whether a simple DNA strand exists that multiple
gene sequences could be a part of (shortest common
supersequence).
● Determining the best way to assign hardware resources in a
compiler (optimal register allocation).
● Determining the best way to distribute tasks to multiple
workers to minimize completion time (job scheduling).
● And many more.
● If P = NP, all of these problems have efficient solutions.
● If P ≠ NP, none of these problems have efficient solutions.
Why This Matters
● If P = NP:
● A huge number of seemingly difficult problems
could be solved efficiently.
● Our capacity to solve many problems will scale
well with the size of the problems we want to
solve.
● If P ≠ NP:
● Enormous computational power would be required
to solve many seemingly easy tasks.
● Our capacity to solve problems will fail to keep up
with our curiosity.
What We Know
● Resolving P ≟ NP has proven extremely difficult.
● In the past 43 years:
● Not a single correct proof either way has been
found.
● Many types of proofs have been shown to be
insufficiently powerful to determine whether
P ≟ NP.
● A majority of computer scientists believe P ≠ NP,
but this isn't a large majority.
● Interesting read: Interviews with leading thinkers
about P ≟ NP:
● http://web.ing.puc.cl/~jabaier/iic2212/poll-1.pdf
The Million-Dollar Question

The Clay Mathematics Institute has offered a $1,000,000 prize to anyone who proves or disproves P = NP.
Problem Sets
● Problem Set Seven was due at the start of class
today.
● Want to use a late day? Turn it in tomorrow by
12:50PM.
● Want to use two late days? Turn it in on Friday by
12:50PM.
● Problem Set Eight goes out now, is due one
week from today at 12:50PM.
● Explore the limits of RE languages, the P vs. NP
question, and the Big Picture.
● No late days may be used on this assignment.
It's university policy; sorry about that!
What do we know about P ≟ NP?
Reducibility
Maximum Matching
● Given an undirected graph G, a matching in G is a
set of edges such that no two edges share an
endpoint.
● A maximum matching is a matching with the
largest number of edges.

[Figure: a matching, but not a maximum matching.]
[Figure: a maximum matching.]
[Figure: maximum matchings are not necessarily unique.]
Maximum Matching
● Jack Edmonds' paper “Paths, Trees, and
Flowers” gives a polynomial-time
algorithm for finding maximum
matchings.
● (This is the same Edmonds as in the “Cobham-Edmonds thesis.”)
● Using this fact, what other problems can
we solve?
Domino Tiling
A Domino Tiling Reduction
● Let MATCHING be the language defined as
follows:
MATCHING = { ⟨G, k⟩ | G is an undirected graph
with a matching of size at least k }
● Theorem (Edmonds): MATCHING ∈ P.
● Let DOMINO be this language:
DOMINO = { ⟨D, k⟩ | D is a grid and k
nonoverlapping dominoes can be placed on D. }
● We'll use the fact that MATCHING ∈ P to
prove that DOMINO ∈ P.
Solving Domino Tiling
The Setup
● To determine whether you can place at
least k dominoes on a crossword grid, do
the following:
● Convert the grid into a graph: each empty
cell is a node, and any two adjacent empty
cells have an edge between them.
● Ask whether that graph has a matching of
size k or greater.
● Return whatever answer you get.
In Pseudocode

boolean canPlaceDominos(Grid G, int k) {
    return hasMatching(gridToGraph(G), k);
}
Why This Works
● This overall construction gives a polynomial-
time algorithm for the domino tiling problem.
● It takes only polynomial time to convert the grid
into a graph (we'll hand-wave these details
away).
● Once we have that new graph, it takes only
polynomial time to check if there's a
sufficiently large matching.
● Overall, this only requires polynomial time.
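The reduction can be sketched end to end in Python. Several things here are assumptions for the sake of a runnable example: the grid is a list of strings with '.' for an empty cell and '#' for a blocked one, and the matching step uses simple augmenting paths, which works because a grid graph is bipartite (color the cells like a checkerboard). This is a stand-in for the general matching algorithm from Edmonds' paper, not a reproduction of it.

```python
def grid_to_graph(grid):
    """Each empty cell ('.') is a node; adjacent empty cells share an edge."""
    cells = {(r, c) for r, row in enumerate(grid)
             for c, ch in enumerate(row) if ch == "."}
    return {
        (r, c): [n for n in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1))
                 if n in cells]
        for (r, c) in cells
    }

def has_matching(graph, k):
    """True if the bipartite grid graph has a matching with at least k edges."""
    match = {}  # maps a matched "odd" cell to its partner "even" cell

    def augment(u, visited):
        # Try to match even cell u, re-routing earlier matches if needed.
        for v in graph[u]:
            if v not in visited:
                visited.add(v)
                if v not in match or augment(match[v], visited):
                    match[v] = u
                    return True
        return False

    evens = [cell for cell in graph if (cell[0] + cell[1]) % 2 == 0]
    size = sum(augment(u, set()) for u in evens)
    return size >= k

def can_place_dominoes(grid, k):
    # Same shape as the pseudocode: convert, then ask the matching question.
    return has_matching(grid_to_graph(grid), k)
```

Both steps take polynomial time in the number of cells, so the composed procedure does too, which is the whole point of the reduction.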
Tractability
Polynomial time (p-time) = O(n^k), where n is the input size and k is a constant

Problems solvable in p-time are considered tractable

NP-complete problems have no known p-time solution and are considered intractable
Tractability
Difference between tractability and intractability
can be slight

Can find shortest path in graph in O(m + n lg n) time, but finding longest simple path is NP-complete

Can find satisfying assignment for 2-CNF formula in O(n) time, but for 3-CNF is NP-complete
Example of a 2-CNF formula: (x1 ∨ ¬x2) ∧ (¬x1 ∨ x3) ∧ (¬x2 ∨ ¬x3)
Decision problems
A decision problem has a yes/no answer
Different from, but related to, optimization problem, where trying to maximize/minimize a value
Any decision problem Q can be viewed as language: L = {x ∈ {0, 1}* : Q(x) = 1}
Q decides L: every string in L accepted by Q, every string not in L rejected
Example of a decision problem
PATH = {⟨G, u, v, k⟩ : G = (V, E) is an undirected graph, u, v ∈ V, k ≥ 0 is an integer, and ∃ a path from u to v in G with ≤ k edges}
Encoding of input ⟨G, u, v, k⟩ is important! We express running times as function of input size

Corresponding optimization problem is SHORTEST-PATH
Complexity class P
P = {L ⊆ {0, 1}* : ∃ an algorithm A that decides L in p-time}

PATH ∈ P
Polynomial-time verification
Algorithm A verifies language L if
L = {x ∈ {0, 1}* : ∃ y ∈ {0, 1}* s.t. A(x, y) = 1}

Can verify PATH given input ⟨G, u, v, k⟩ and path from u to v
PATH ∈ P, so verifying and deciding take p-time

For some languages, however, verifying is much easier than deciding
SUBSET-SUM: Given finite set S of integers and target t, is there a subset whose sum is exactly t?
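The contrast between verifying and deciding SUBSET-SUM can be illustrated with a small Python sketch. The names `verify_subset_sum` and `decide_subset_sum` are invented for this example, and the decider shown is plain brute force over all candidate certificates, chosen only to show the exponential search that verification avoids.

```python
from itertools import combinations

def verify_subset_sum(S, t, certificate):
    """Polynomial-time check that certificate is a subset of S summing to t."""
    chosen = set(certificate)
    return chosen <= set(S) and sum(chosen) == t

def decide_subset_sum(S, t):
    """Brute force: try all 2^|S| subsets as certificates."""
    items = list(S)
    return any(
        verify_subset_sum(S, t, subset)
        for r in range(len(items) + 1)
        for subset in combinations(items, r)
    )
```

Checking one proposed subset costs time linear in its size, while the decider may call the verifier 2^|S| times before giving an answer.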
Complexity class NP
NP = {L ⊆ {0, 1}* : ∃ a p-time algorithm A and a constant k s.t. L = {x ∈ {0, 1}* : ∃ a certificate y, |y| = O(|x|^k), with A(x, y) = 1}}

SUBSET-SUM ∈ NP
P vs. NP
Not much is known, unfortunately

Can think of NP as the ability to appreciate a solution, P as the ability to produce one

P ⊆ NP

Don’t even know if NP closed under complement, i.e. NP = co-NP?
Does L ∈ NP imply L̄ ∈ NP?
P vs. NP
Comparing hardness
NP-complete problems are the “hardest” in NP:
if any NP-complete problem is p-time solvable,
then all problems in NP are p-time solvable

How to formally compare easiness/hardness of problems?
Reductions
Reduce language L1 to L2 via function f:
1. Convert input x of L1 to instance f(x) of L2
2. Apply decision algorithm for L2 to f(x)

Running time = time to compute f + time to apply decision algorithm for L2

Write as L1 ≤ L2
Reductions show easiness/hardness
To show L1 is easy, reduce it to something we know is easy (e.g., matrix mult., network flow, etc.)
L1 ≤ easy
Use algorithm for easy language to decide L1

To show L1 is hard, reduce something we know is hard to it (e.g., NP-complete problem):
hard ≤ L1
If L1 were easy, hard would be easy too
Polynomial-time reducibility
L1 is p-time reducible to L2, or L1 ≤p L2, if ∃ a p-time computable function f : {0, 1}* → {0, 1}* s.t. for all x ∈ {0, 1}*, x ∈ L1 iff f(x) ∈ L2

Lemma. If L1 ≤p L2 and L2 ∈ P, then L1 ∈ P


Complexity class NPC
A language L ⊆ {0, 1}* is NP-complete if:
1. L ∈ NP, and
2. L’ ≤p L for every L’ ∈ NP, i.e. L is NP-hard

Lemma. If L is a language s.t. L’ ≤p L where L’ ∈ NPC, then L is NP-hard. If also L ∈ NP, then L ∈ NPC.

Theorem. If any NPC problem is p-time solvable, then P = NP.
P, NP, and NPC
NPC reductions
Lemma. If L is a language s.t. L’ ≤p L where L’ ∈ NPC, then L is NP-hard. If also L ∈ NP, then L ∈ NPC.

This gives us a recipe for proving any L ∈ NPC:
1. Prove L ∈ NP
2. Select L’ ∈ NPC
3. Describe algorithm to compute f mapping every input x of L’ to input f(x) of L
4. Prove f satisfies x ∈ L’ iff f(x) ∈ L, for all x ∈ {0, 1}*
5. Prove computing f takes p-time
Bootstrapping
Need one language in NPC to get started

SAT = {⟨φ⟩ : φ is a satisfiable boolean formula}
Can the variables of φ be assigned values in {0, 1} s.t. φ evaluates to 1?
Cook-Levin theorem
Theorem. SAT ∈ NPC.

Proof. SAT ∈ NP since certificate is satisfying assignment of variables. To show SAT is NP-hard, must show every L ∈ NP is p-time reducible to it.

Idea: Use p-time verifier A(x,y) of L to construct input φ of SAT s.t. verifier says “yes” iff φ satisfiable
Verifier: Turing Machine
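[Figure: a Turing machine with a finite control and a read/write head on an unbounded tape. Cells are indexed …, -3, -2, -1, 0, 1, 2, 3, …; from left to right the tape holds blanks, the certificate, a blank (b) at cell 0, the input, and blanks.]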

Church-Turing thesis: Everything computable is computable by a Turing machine
In one step, can write a symbol, move head one
position, change state
What to do is based on state and symbol read
Fixed # of states: start state, “yes” state, (“no”
state); fixed # of tape symbols, including blank
Explicit worst-case p-time bound p(n)
Proof plan
Given L ∈ NP we have Turing machine that implements verifier A(x,y)

Input x, |x| = n, is “yes” instance iff for some certificate y, machine reaches “yes” state within p(n) steps from start state
Loops in “yes” state if gets there earlier

Construct φ = f(x) that is satisfiable iff this happens
x is fixed and used to construct f(x), but y is unspecified
Variables in φ
States: 1,…, w // 1 = start, w = “yes”
Symbols: 1,…, z // 1 = blank, rest input symbols like ‘0’ and ‘1’
Tape cells: -p(n),…, 0,…, p(n)
Time: 0, 1,…, p(n)

Variables:
h_{i,t}: true if head on tape cell i at time t,
-p(n) ≤ i ≤ p(n), 0 ≤ t ≤ p(n)
s_{j,t}: true if state j at time t,
1 ≤ j ≤ w, 0 ≤ t ≤ p(n)
c_{i,k,t}: true if tape cell i holds symbol k at time t,
-p(n) ≤ i ≤ p(n), 1 ≤ k ≤ z, 0 ≤ t ≤ p(n)
What does φ need to say?
At most one state, head position, and symbol per cell at each time:
h_{i,t} ⇒ ¬h_{i’,t}, i ≠ i’, all t
s_{j,t} ⇒ ¬s_{j’,t}, j ≠ j’, all t
c_{i,k,t} ⇒ ¬c_{i,k’,t}, k ≠ k’, all i, all t

Correct initial state, head position, and tape contents:
h_{0,0} ∧ s_{1,0} ∧ c_{0,1,0} ∧ c_{1,k1,0} ∧ c_{2,k2,0} ∧ … ∧ c_{n,kn,0} ∧ c_{n+1,1,0} ∧ … ∧ c_{p(n),1,0}
Input is k1,…, kn, followed by blanks to right

Correct final state:
s_{w,p(n)}
Correct transitions: e.g., if machine in state j reads k, it then writes k’, moves head right, and changes to state j’:
s_{j,t} ∧ h_{i,t} ∧ c_{i,k,t} ⇒ s_{j’,t+1} ∧ h_{i+1,t+1} ∧ c_{i,k’,t+1}, all i, t

Unread tape cells are unaffected:
h_{i,t} ∧ c_{i’,k,t} ⇒ c_{i’,k,t+1}, i ≠ i’, all k, t
Wrapping up
Any proof that gives “yes” execution gives satisfying assignment, and vice versa
Also φ contains O(p(n)^2) variables, O(p(n)^2) clauses

⇒ SAT ∈ NPC

Now that we are bootstrapped, much easier to prove other L ∈ NPC
Recall recipe for NPC proofs
1. Prove L ∈ NP
2. Select L’ ∈ NPC
3. Describe algorithm to compute f mapping every input x of L’ to input f(x) of L
4. Prove f satisfies x ∈ L’ iff f(x) ∈ L, for all x ∈ {0, 1}*
5. Prove computing f takes p-time
3-CNF-SAT ∈ NPC
3-CNF-SAT = {⟨φ⟩ : φ is a satisfiable 3-CNF boolean formula}

φ is 3-CNF if it is AND of clauses, each of which is OR of three literals (variable or negation)
(x1 ∨ ¬x1 ∨ ¬x2) ∧ (x3 ∨ x2 ∨ x4) ∧ (¬x1 ∨ ¬x3 ∨ ¬x4)

Proof. Show SAT ≤p 3-CNF-SAT

Given input of SAT, construct binary parse tree, introduce variable yi for each internal node

E.g., φ = ((x1 → x2) ∨ ¬((¬x1 ↔ x3) ∨ x4)) ∧ ¬x2

Rewrite as AND of root variable and clauses describing operation of each node, e.g. y1 ∧ (y1 ↔ (y2 ∧ ¬x2)) ∧ …

Each clause has at most three literals

Write truth table for each clause, e.g. for φ’1 = (y1 ↔ (y2 ∧ ¬x2))

Write DNF (OR of ANDs) for ¬φ’1:
¬φ’1 = (y1 ∧ y2 ∧ x2) ∨ (y1 ∧ ¬y2 ∧ x2) ∨ …
Use DeMorgan’s laws to convert to CNF:
φ’’1 = (¬y1 ∨ ¬y2 ∨ ¬x2) ∧ (¬y1 ∨ y2 ∨ ¬x2) ∧ …
If any clause has < three literals, augment with dummy variables p, q
(l1 ∨ l2) ⇒ (l1 ∨ l2 ∨ p) ∧ (l1 ∨ l2 ∨ ¬p)

Resulting 3-CNF formula is satisfiable iff original SAT formula is satisfiable
CLIQUE ∈ NPC
CLIQUE = {⟨G, k⟩ : graph G = (V, E) has clique of size k}
Naïve algorithm runs in Ω(k^2 · C(|V|, k)) time

Proof. Show 3-CNF-SAT ≤p CLIQUE

Given formula φ = c1 ∧ c2 ∧ … ∧ ck, construct input of CLIQUE:
For each cr = (l1r ∨ l2r ∨ l3r), place v1r, v2r, v3r in V
Add edge between vir and vjs if r ≠ s and corresponding literals are consistent (not negations of each other)

If φ is satisfiable, at least one literal in each cr is 1 ⇒ set of k vertices that are completely connected

If G has clique of size k, contains exactly one vertex per clause ⇒ φ satisfied by assigning 1 to corresponding literals
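The construction can be sketched in Python under a few assumptions made for this example: a literal is a nonzero integer (3 means x3, -3 means ¬x3), a formula is a list of 3-tuples, and a vertex is a pair (clause index, position in clause). The brute-force `has_clique` helper is included only to sanity-check tiny instances; it is exponential and not part of the reduction itself.

```python
from itertools import combinations

def sat_to_clique(clauses):
    """Build (vertices, edges, k) from a 3-CNF formula."""
    vertices = [(r, i) for r, clause in enumerate(clauses)
                for i in range(len(clause))]
    edges = set()
    for (r, i), (s, j) in combinations(vertices, 2):
        # Consistent literals: one is not the negation of the other.
        consistent = clauses[r][i] != -clauses[s][j]
        if r != s and consistent:
            edges.add(((r, i), (s, j)))
    return vertices, edges, len(clauses)

def has_clique(vertices, edges, k):
    """Brute-force clique check (exponential; for tiny instances only)."""
    def adjacent(a, b):
        return (a, b) in edges or (b, a) in edges
    return any(
        all(adjacent(a, b) for a, b in combinations(group, 2))
        for group in combinations(vertices, k)
    )
```

The reduction itself examines every pair of the 3k vertices once, so it runs in O(k^2) time; all the hardness stays in the clique question that is asked afterwards.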
VERTEX-COVER ∈ NPC
VERTEX-COVER = {⟨G, k⟩ : graph G = (V, E) has vertex cover of size k}

Vertex cover is V’ ⊆ V s.t. if (u, v) ∈ E, then u ∈ V’ or v ∈ V’ or both

Proof. Show CLIQUE ≤p VERTEX-COVER

Given input ⟨G, k⟩ of CLIQUE, construct input of VERTEX-COVER:
⟨Ḡ, |V| − k⟩, where Ḡ = (V, Ē)

If G has clique V’, |V’| = k, then V − V’ is vertex cover of Ḡ:
(u, v) ∈ Ē ⇒ either u or v not in V’, since (u, v) ∉ E
⇒ at least one of u or v in V − V’, so covered

If Ḡ has vertex cover V’ ⊆ V, |V’| = |V| − k, then V − V’ is clique of G of size k:
(u, v) ∈ Ē ⇒ u ∈ V’ or v ∈ V’ or both
if u ∉ V’ and v ∉ V’, then (u, v) ∈ E
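The mapping from ⟨G, k⟩ to ⟨Ḡ, |V| − k⟩ is short enough to write out. In this sketch the graph is assumed to be a list of vertices plus a list of undirected edges given as pairs; the function names are invented for illustration.

```python
from itertools import combinations

def clique_to_vertex_cover(vertices, edges, k):
    """Map <G, k> to <complement of G, |V| - k>."""
    present = {frozenset(e) for e in edges}
    # The complement graph has exactly the edges that G lacks.
    complement = [
        (u, v) for u, v in combinations(vertices, 2)
        if frozenset((u, v)) not in present
    ]
    return vertices, complement, len(vertices) - k

def is_vertex_cover(edges, cover):
    """Check that every edge has at least one endpoint in cover."""
    return all(u in cover or v in cover for u, v in edges)
```

For example, if G has vertices 1, 2, 3, 4 with the triangle {1, 2, 3} plus the edge (3, 4), the complement has edges (1, 4) and (2, 4), and the single vertex 4 (that is, V minus the clique) covers both.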
SUBSET-SUM ∈ NPC
SUBSET-SUM = {⟨S, t⟩ : S ⊂ ℕ, t ∈ ℕ and ∃ a subset S’ ⊆ S s.t. t = Σ_{s ∈ S’} s}

Integers encoded in binary! If t encoded in unary, can solve SUBSET-SUM in p-time, i.e. weakly NPC (vs. strongly NPC)

Proof. Show 3-CNF-SAT ≤p SUBSET-SUM

Given formula φ, assume w.l.o.g. each variable appears in at least one clause, and variable and negation don’t appear in same clause

Construct input of SUBSET-SUM:
2 numbers per variable xi, 1 ≤ i ≤ n, indicates if variable or negation is in a clause
2 numbers per clause cj, 1 ≤ j ≤ k, slack variables
Each digit labeled by variable/clause, total n + k digits
t is 1 for each variable digit, 4 for each clause digit
φ = C1 ∧ C2 ∧ C3 ∧ C4, C1 = (x1 ∨ ¬x2 ∨ ¬x3), C2 = (¬x1 ∨ ¬x2 ∨ ¬x3), C3 = (¬x1 ∨ ¬x2 ∨ x3), and C4 = (x1 ∨ x2 ∨ x3)

Max digit sum is 6, interpret numbers in base ≥ 7

Reduction takes p-time: set S has 2n + 2k values of n + k digits each; each digit takes O(n + k) time to compute

If φ has satisfying assignment
Sum of variable digits is 1, matching t
Each clause digit at least 1 since at least 1 literal satisfied
Fill rest with slack variables sj, sj’

If ∃ S’ ⊆ S that sums to t
Includes either vi or vi’ for each i = 1,…, n; if vi ∈ S’, set xi = 1
Each clause cj has at least one vi or vi’ set to 1 since slacks add up to only 3; by above clause is satisfied
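A sketch of the digit construction in Python may help. The assumptions are all illustrative: literals are nonzero integers (2 means x2, -2 means ¬x2), variables are numbered 1 to n, digits are read in base 10 (any base of at least 7 would do), and the two slack numbers for a clause carry 1 and 2 in that clause's digit so they add up to 3. The brute-force `subset_sum_exists` helper is only there to check small instances.

```python
from itertools import combinations

def sat_to_subset_sum(clauses, n, base=10):
    """Return (S, t) for a 3-CNF formula over variables 1..n.

    Digits are ordered x_1 .. x_n, then c_1 .. c_k, most significant first.
    """
    k = len(clauses)

    def number(digits):
        value = 0
        for d in digits:
            value = value * base + d
        return value

    S = []
    for i in range(1, n + 1):
        for literal in (i, -i):      # v_i, then v_i'
            digits = [1 if j == i else 0 for j in range(1, n + 1)]
            digits += [1 if literal in clause else 0 for clause in clauses]
            S.append(number(digits))
    for j in range(k):               # slack numbers: 1 and 2 in clause digit j
        for slack in (1, 2):
            S.append(number([0] * n + [slack if c == j else 0 for c in range(k)]))
    t = number([1] * n + [4] * k)
    return S, t

def subset_sum_exists(S, t):
    """Brute-force check (exponential; for tiny instances only)."""
    return any(sum(c) == t for r in range(len(S) + 1) for c in combinations(S, r))
```

On the four-clause example formula this yields 14 numbers and the target 1114444, and the brute-force check finds a subset reaching it, matching the fact that the formula is satisfiable.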
Implications of P = NP
Ability to verify a solution ⇒ ability to produce one!

Can automate search of solutions, i.e. creativity!

Can use a p-time algorithm for SAT to find formal proof of any theorem that has a concise proof, because formal proofs can be verified in p-time

⇒ P = NP could very well imply solutions to all the other CMI million-dollar problems!
“If P = NP, then the world would be a profoundly different
place than we usually assume it to be. There would be no
special value in "creative leaps," no fundamental gap
between solving a problem and recognizing the solution once
it's found. Everyone who could appreciate a symphony would
be Mozart; everyone who could follow a step-by-step
argument would be Gauss...”

— Scott Aaronson, MIT
