Lecture 03
We couldn’t use the requirement that t(n) < T(n) or t(n)/T(n) → 0 because of the overhead
associated with more symbols and tapes. Thus, we needed to require that

t(n) log n / T(n) → 0
In addition, it is important to note that t(n) and T(n) must both be time-constructible. The theorem
is simply that if these conditions hold, we achieve the desired goal.

Theorem 1 (Time-Hierarchy Theorem). If t(n) and T(n) are fully time-constructible and t(n) log n / T(n) → 0, then DTIME[t(n)] ⊊ DTIME[T(n)].
The proof is based on a delayed diagonalization argument: we try to kill off every Turing machine
that runs in t(n) time, using a simulation running in T(n) time as before. Consider infinitely many inputs, so
that M1 runs on input x1, then M2 runs on x2, then M1 runs again, this time on input x3. This
continues as:

x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 ...
M1 M2 M1 M2 M3 M1 M2 M3 M4 M1 ...

and it should be clear that one can construct a machine running in time T(n) which kills off every machine that
runs in t(n) time. The log n factor comes from simulating an arbitrary number of tapes using a
fixed number of tapes (2 tapes).
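The interleaving schedule above can be generated mechanically. A minimal sketch (the block structure below is one possible enumeration consistent with the table; the details of the diagonalization itself are omitted):

```python
from itertools import islice

def schedule():
    """Yield the index of the machine to diagonalize against next.
    Block k runs M2, ..., Mk and then returns to M1, matching the
    pattern M1 | M2 M1 | M2 M3 M1 | M2 M3 M4 M1 | ..."""
    block = 1
    while True:
        for m in range(2, block + 1):
            yield m        # run M_m on the next input
        yield 1            # every block ends by revisiting M1
        block += 1

# Machines assigned to the first ten inputs x1..x10:
print(list(islice(schedule(), 10)))  # [1, 2, 1, 2, 3, 1, 2, 3, 4, 1]
```

Because every machine index is yielded infinitely often, each Mi eventually gets an input long enough for the simulation overhead to fit within T(n).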
The Space-Hierarchy Theorem is similar to the time version, except that there is no log n fudge
factor. Also, recall that if the space required is less than log n, the function is uninteresting, so we
only really care about space bounds of at least log n.
Theorem 2 (Space-Hierarchy Theorem). If s(n) and S(n) are space-constructible, s(n) ≥ log n, and s(n)/S(n) → 0, then DSPACE[s(n)] ⊊ DSPACE[S(n)].

The proof of this theorem is similar to the time version.
While both of these theorems are important, there is something unsatisfying about them. They
show the existence of a hierarchy, but they give no specific examples of functions that can’t
be computed in time t(n) or space s(n). It is similar to the proof that transcendental numbers exist.
Cantor’s proof showed that such numbers exist, but his proof did not give any concrete examples
of such numbers. Hermite’s proof that e is transcendental is somehow more satisfying. At some
level, these hierarchy theorems “don’t sing music to your heart.” In this class, we will now attempt
to study actual problems instead of concentrating on existence proofs.
2 Complexity Classes
In order to think about more problems and their relationship to complexity theory, we need to
examine the major classes of this area. Recall the following definitions of the two most prominent
complexity classes:
P = ⋃_{i>0} DTIME[n^i + i]

NP = ⋃_{i>0} NTIME[n^i + i]
where NTIME is non-deterministic time. It is harder to formally define non-deterministic time so
consider the following description. In running a program, the machine can take a variety of different
paths to reach an ending state. For NTIME, we only require that there is some path that accepts
within the time bound. There may be other paths that take much longer, but as long as there
is one that fits the bound, the language is in NTIME. Another way to think about this concept is
to imagine the program trying every path simultaneously, and stopping all paths when the clock
runs out (it has ticked away f (n) instructions). If we reach an accept state, the language is in
NTIME[f (n)].
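This “some path accepts within the bound” definition can be made concrete by a deterministic (exponential-time) search over all guess sequences. A toy sketch, where the one-bit-per-step guess model, the step function, and the subset-sum example are all invented for illustration:

```python
def nd_accepts(config, step, accepting, fuel):
    """Accept iff SOME sequence of nondeterministic guesses (one bit per
    step) reaches an accepting configuration within `fuel` steps."""
    if accepting(config):
        return True
    if fuel == 0:
        return False
    return any(nd_accepts(step(config, b), step, accepting, fuel - 1)
               for b in (0, 1))

# Toy NP-style verifier: guess a subset of nums summing to target.
# A configuration is (next index to decide, running total).
nums, target = [3, 5, 7], 12
step = lambda c, b: (c[0] + 1, c[1] + (nums[c[0]] if b else 0))
accepting = lambda c: c[0] == len(nums) and c[1] == target
print(nd_accepts((0, 0), step, accepting, fuel=len(nums)))  # True: 5 + 7 = 12
```

The recursion tries both guess bits at every step, which is exactly the “run every path, stop when the clock runs out” picture, but sequentially rather than simultaneously.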
It is clear that P ⊆ NP, and one of the greatest outstanding problems in computer science is whether
P = NP. It is widely believed that P ≠ NP, but no one has been able to show this yet.
2.2 s-t Connectivity
Consider the process that starts at the vertex v = s, guesses a new vertex v′, and checks to see if the
edge (v, v′) exists. If it does, we forget v and repeat the process with v = v′. If not, we
choose a new v′ and continue as before. Notice that if v′ = t and (v, v′) ∈ E, we have found the
desired path. The major problem with this algorithm is that if no path exists and there is a cycle,
it will never end. However, we know that any simple path is at most as long as the number of vertices
n. Thus, if we run a clock, we can stop this process after n steps.
Algorithm 1: Space-Efficient Algorithm for s-t Connectivity
Input: The starting vertex s, a bound n, and the starting number of steps (0)
Output: TRUE or FALSE based on whether the algorithm finds a path
FINDPATH(v, n, i)
(1)  if i > n
(2)      return FALSE
(3)  Given the vertex v, guess a new vertex v′
(4)  Check the input tape to see if (v, v′) ∈ E
(5)  if (v, v′) ∈ E
(6)      if v′ = t
(7)          return TRUE
(8)      else
(9)          return FINDPATH(v′, n, i + 1)
(10) else
(11)     return FINDPATH(v, n, i + 1)
In essence, we have a clock that ticks up to n, and for i < n, we guess the next vertex v′, then
check to see if (v, v′) ∈ E. If it is not, we abort; if it is and v′ = t, we accept; otherwise, if it is and
v′ ≠ t, we increment i and repeat.
First, notice that if there is a path from s to t, there will be some execution path (choice of v′’s)
that verifies that s connects to t. Hence, the algorithm is correct. Also notice that the only values
we need to keep track of on the work tape are the current vertex, the guessed vertex, and the clock.
Each of these can be written in c · log n space, because we can encode each vertex as a
binary number up to n, and the clock only needs to count up to n (a binary
counter works fine).
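To see what the nondeterministic algorithm does, one can simulate it deterministically by trying every possible guess for v′. This brute-force search takes exponential time rather than nondeterministic log space, but it follows the same recursion; the graph encoding is an assumption of this sketch:

```python
def findpath(v, t, vertices, edges, n, i=0):
    """Deterministic simulation of FINDPATH: accept iff some sequence of
    guesses v' leads from v to t in at most n steps."""
    if i > n:
        return False
    for vp in vertices:              # simulate the guess of v'
        if (v, vp) in edges:         # "check the input tape"
            if vp == t:
                return True
            if findpath(vp, t, vertices, edges, n, i + 1):
                return True
    return False

vertices = {0, 1, 2, 3}
edges = {(0, 1), (1, 2), (2, 3)}
print(findpath(0, 3, vertices, edges, n=4))  # True: 0 -> 1 -> 2 -> 3
print(findpath(3, 0, vertices, edges, n=4))  # False: no edge leaves 3
```

Note that each recursive call only needs (current vertex, guessed vertex, clock), which is exactly the O(log n) working storage counted above.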
Now that we have shown that s-t connectivity is in NL, we use the idea above to show that any
computation in NL is really just s-t connectivity. Define a graph G = (V, E) so that each vertex in G
represents a configuration or snapshot of the machine. In order to do this, we need to represent the
input tape, input tape head position, current state, finite control, work tape, and work tape head
position in some encoding. It takes n + log n + c1 + c2 + log n + log log n bits, respectively (c1 , c2
are constants) to do this, but if we think about a specific problem with a given fixed (read-only)
input x, we need only remember the input tape head position, current state, work tape, and work
tape head position, for log n + c1 + log n + log log n (logarithmically many) bits. Then given N (x),
the problem with input x, we can construct a graph GN (x) where each vertex is an encoding of
the configuration and a directed edge (u → v) exists iff the configuration represented by v follows
from the one represented by u. Then, the problem of deciding whether x ∈ L for L ∈ NL can be
reduced to the question of whether there exists a path from the starting configuration/vertex s
to an accepting configuration/vertex t in the graph G_{N(x)}.
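The shape of this reduction can be sketched as follows: given the one-step successor relation on configurations, the NL question becomes reachability in the (implicitly given) configuration graph. The breadth-first search below uses far more than log space; the point is only that the computation turns into an s-t connectivity question. The toy “machine”, whose configurations are just integers, is invented for illustration.

```python
from collections import deque

def reaches_accepting(start, successors, is_accepting):
    """s-t connectivity on the configuration graph: is some accepting
    configuration reachable from the start configuration?"""
    seen, queue = {start}, deque([start])
    while queue:
        c = queue.popleft()
        if is_accepting(c):
            return True
        for nxt in successors(c):  # edge u -> v: v follows from u in one step
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Toy machine: configurations are integers; each step may add 2 or 3.
successors = lambda c: [c + 2, c + 3] if c < 20 else []
print(reaches_accepting(0, successors, lambda c: c == 7))  # True: 0,2,4,7
print(reaches_accepting(0, successors, lambda c: c == 1))  # False
```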
Finally, we have the class PSPACE, which contains all languages that can be decided by a
deterministic TM in polynomial space. We note that the class of all languages that can be decided
by a non-deterministic TM in polynomial space is exactly the same as PSPACE (thus
PSPACE = NPSPACE). Specifically, NSPACE[f] ⊆ DSPACE[f²] (Savitch’s Theorem). We know that L ≠ PSPACE by the
Space-Hierarchy Theorem, but that is about all we know about which of the containments in this
chain of complexity classes are strict:

L ⊆ NL ⊆ P ⊆ NP ⊆ PSPACE
3 NP-Completeness
The theory of NP-Completeness owes a lot to the reduction-oriented aspect of recursion theory,
where problems are shown to be undecidable by reducing the Halting Problem to them. It makes things
easier if we can consider a whole bunch of problems that are related in such a way that we
can convert one problem to another and apply similar techniques or results.
For our example of checking whether or not a language is finite, first notice that the input for the
Halting Problem is a machine M and a string w. We wish to know whether M halts given input
w. So construct a machine Mw that will “translate” the problem of halting into the problem of
determining whether a language is finite or not. To do this, this machine will ignore its input,
instead running as if it were input w. In effect, the machine Mw will simulate M on input w.
Notice, however, that L(Mw) is finite only if it is the empty set, since we either accept on all inputs
(because all inputs are treated as if they were w) or reject on all inputs. Thus, M halts on w iff L(Mw)
is infinite. So if we can determine whether a language is finite, we can solve the Halting Problem.
However, since we know that the Halting Problem is undecidable, determining whether a language is
finite is also undecidable.
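The construction of Mw can be sketched in code by treating machines as functions; this executable sketch uses acceptance as a stand-in for halting, and the stand-in machine M below is invented for illustration:

```python
def make_Mw(M, w):
    """Given machine M and string w, build Mw: a machine that ignores its
    own input and simulates M on the fixed input w. Every input gets the
    same answer, so L(Mw) is either empty (finite) or all strings (infinite)."""
    def Mw(x):        # the actual input x is ignored entirely
        return M(w)   # simulate M on w instead
    return Mw

M = lambda s: len(s) >= 3             # a stand-in "machine"
Mw = make_Mw(M, "abc")
print(Mw(""), Mw("anything"))         # True True: L(Mw) is everything
Mw2 = make_Mw(M, "a")
print(Mw2(""), Mw2("anything"))       # False False: L(Mw2) is empty
```

The key point survives the simplification: the behavior of Mw on every input is a single fixed fact about M and w, so a decider for finiteness would reveal that fact.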
In some sense, all you need to know about undecidable problems is the Halting Problem. In a similar
vein, we will see in the next section that there are some problems, the NP-Complete problems, which
completely characterize NP and so are believed to be intractable (though still decidable, of course).
So, if we reduce an NP-Complete problem to some other problem, we will show that this other
problem is also intractable in the same way.
We say that L1 reduces to L2 in polynomial time, written L1 ≤Pm L2 via f, if there is a function f such that

1. x ∈ L1 ⇐⇒ f(x) ∈ L2, and
2. f is computable in polynomial time.
This means that if we wish to reduce Satisfiability (SAT) to Vertex-Cover (VC), we cannot
simply wait until we know whether a given formula is satisfiable and then pick a corresponding
graph for vertex cover. Such a procedure would not be computable in polynomial time, violating
the second condition of the definition. Our goal in the reduction is to take a given formula and produce
an instance of vertex cover such that the formula is satisfiable iff the desired
vertex cover exists. In other words, we simply want to encode the formula as a graph. If we
can find a function f that accomplishes this, we can say that SAT ≤Pm VC via f . This is an
expressibility issue, not a computability issue.
The power of NP-Completeness is that if we have an algorithm for, say, VC, we automatically have
an algorithm for SAT. SAT is connected to thousands of NP-Complete problems, and if we can solve
one in polynomial time, we can solve all of them in polynomial time. There is evidence (since there
are thousands of NP-Complete problems and no poly-time algorithm has been found for any of
them) that there is no such algorithm. It can be shown that SAT is NP-Complete by using an NTM
(non-deterministic Turing Machine), and this is outlined in the text. In effect, we use boolean logic
to express what computation is.
One important reduction is SAT ≤Pm 3-SAT, which puts the variable length clauses of SAT into a
more constrained structure. 3-SAT is the language that consists of all satisfiable boolean expressions
of the form
ϕ = ⋀_{i=1}^{n} ⋁_{j=1}^{3} x̂_{ij}

where each x̂_{ij} is a literal (a variable or its negation). Note that the reduction in the other direction, 3-SAT ≤Pm SAT, is essentially the identity. On the other hand, 2-SAT, which is defined in a similar manner but j
ranges only to 2, is not NP-complete; it is decidable in polynomial time as it can be formulated as a
graph of logical implications and one can find if the implications are consistent or not in polynomial
time.
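That polynomial-time algorithm can be sketched via the implication graph: a clause (a ∨ b) yields the implications ¬a → b and ¬b → a, and the formula is unsatisfiable exactly when some variable lies in the same strongly connected component as its negation. A minimal sketch using Kosaraju's SCC algorithm (the integer literal encoding is an assumption of this example):

```python
from collections import defaultdict

def two_sat(n, clauses):
    """2-SAT over variables 1..n; a literal is +i or -i. Satisfiable iff
    no variable shares a strongly connected component with its negation
    in the implication graph."""
    node = lambda lit: 2 * abs(lit) + (lit < 0)  # literal -> graph node
    adj, radj = defaultdict(list), defaultdict(list)
    for a, b in clauses:
        # (a or b)  ==  (not a -> b) and (not b -> a); node ^ 1 negates
        for u, v in ((node(a) ^ 1, node(b)), (node(b) ^ 1, node(a))):
            adj[u].append(v)
            radj[v].append(u)
    nodes = range(2, 2 * n + 2)
    order, visited = [], set()
    def dfs1(u):                       # pass 1: record postorder
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                dfs1(v)
        order.append(u)
    for u in nodes:
        if u not in visited:
            dfs1(u)
    comp = {}
    def dfs2(u, c):                    # pass 2: label one SCC on the
        comp[u] = c                    # reversed graph
        for v in radj[u]:
            if v not in comp:
                dfs2(v, c)
    for c, u in enumerate(reversed(order)):
        if u not in comp:
            dfs2(u, c)
    return all(comp[2 * i] != comp[2 * i + 1] for i in range(1, n + 1))

print(two_sat(2, [(1, 2), (-1, 2)]))   # True:  x2 = True satisfies both
print(two_sat(1, [(1, 1), (-1, -1)]))  # False: forces x1 and not x1
```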
In order to transform SAT into 3-SAT, we use a construction termed the “butterfly construction”.
The idea is that for any clause with more than three literals, we keep two literals
and represent the rest of the clause by a stand-in variable, whose negation is prepended to the remaining
literals. Consider the following formula:
ϕ = (x1 ∨ x2 ∨ x3 ∨ x4 ∨ x5 ∨ x6)
ϕ′ = (x1 ∨ x2 ∨ y1) ∧ (¬y1 ∨ x3 ∨ x4 ∨ x5 ∨ x6)
   = (x1 ∨ x2 ∨ y1) ∧ (¬y1 ∨ x3 ∨ y2) ∧ (¬y2 ∨ x4 ∨ x5 ∨ x6)
   = (x1 ∨ x2 ∨ y1) ∧ (¬y1 ∨ x3 ∨ y2) ∧ (¬y2 ∨ x4 ∨ y3) ∧ (¬y3 ∨ x5 ∨ x6)
Notice that this ensures there must be some xi that is true: if we simply let each yj be
true, the last clause will never be satisfied. Also, if we set one of the xi to be true, the effect of this
propagates to satisfy the whole chain of clauses. In other words, once one of the literals is true, we
can set the yj to take advantage of this. It looks something like this, where ↑ represents true and ↓
represents false for the whole literal (including the not symbol), in the case where x3 is the true literal:

y1   ¬y1   x3   y2   ¬y2   y3   ¬y3
↑    ↓     ↑    ↓    ↑     ↓    ↑
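The butterfly construction is easy to mechanize. A sketch using integer literals (-v for the negation of variable v, an encoding chosen for this example):

```python
def split_clause(clause, next_var):
    """Split one long clause into 3-literal clauses by chaining fresh
    stand-in variables, as in the derivation above. `clause` is a list
    of int literals (-v = not v); `next_var` is the first unused variable
    index. Returns (list of 3-literal clauses, updated next_var)."""
    out, rest = [], list(clause)
    while len(rest) > 3:
        y = next_var                      # fresh stand-in variable
        next_var += 1
        out.append([rest[0], rest[1], y]) # keep two literals, add y
        rest = [-y] + rest[2:]            # not-y guards the remainder
    out.append(rest)
    return out, next_var

# (x1 v x2 v x3 v x4 v x5 v x6) with fresh variables y1=7, y2=8, y3=9:
clauses, _ = split_clause([1, 2, 3, 4, 5, 6], 7)
print(clauses)  # [[1, 2, 7], [-7, 3, 8], [-8, 4, 9], [-9, 5, 6]]
```

Each iteration peels off two literals and hands the rest to a fresh variable, so the output matches the four clauses derived above.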
With the SAT ≤Pm 3-SAT reduction, we can show many other problems like VC are NP-Complete
by reducing 3-SAT to them. In the next lecture, we will examine more such reductions.