Auto Mat

Download as pdf or txt
Download as pdf or txt
You are on page 1of 6

CSCI 3130: Formal Languages and Automata Theory The Chinese University of Hong Kong, Fall 2011

Homework 6 Solutions

Problem 1
For each of the following problems, say whether it is decidable or not. Justify your answer by describing an appropriate Turing Machine, or by reducing from ALLCF G which was shown undecidable in class. Assume that the alphabet of CFG G contains the symbol a. (a) L1 = { G : CFG G generates at least one string that starts with a}. (b) L2 = { G : CFG G generates all strings that start with a}.

Solution
(a) Decidable. To decide L1 , we want to determine if there is a derivation that begins with the start variable and produce a string that starts with a. It will be convenient to convert the CFG in Chomsky Normal form. Since we only care whether the rst symbol is an a, we may disregard all terminal productions of the form A x where x is anything other than a. For the same reason, we can replace every nonterminal production A BC by A B; this does not aect the rst symbol of derivations. After we do these simplications, we end up with a very simple CFG. For example, if G is the CFG S AA | BA A BB | a B AB | b the following CFG G preserves the property that at least one string starts with a: SA|B AB|a BA Now we can represent this CFG by a directed graph whose nodes are the terminals, as well as the nonterminal a, and there is a directed edge A x for every such production. If there is a path from S to a in this graph, then G has a string that starts with a, and so does G; if not, this cannot be the case. Here is a Turing Machine that decides L1 . In this Turing Machine description the conversion of G into G and the construction of the graph happen in one step.

On input G : Convert G to Chomsky Normal Form. Construct the following graph H: The nodes of H are the variables of G plus the terminal a. For every rule of the form A BC in G, put an edge from A to B. For every rule of the form A a in G, put an edge from A to a. If H has a path from S to a, accept. Otherwise reject. (b) Undecidable. To show L2 is undecidable, we reduce from ALLCF G . In other words, let us assume that L2 is decidable by a TM M . We then show how to construct a TM that decides ALLCF G , contradicting the fact that the latter is undecidable. To achieve this plan, we need to nd a way to convert a G into G so that G generates all strings that start with a if and only if G generates all strings. We do so by rst copying all the rules of G; then we introduce a new start variable S and add the rule S aS, where S is the start variable of G. Notice that G then generates all strings in G preceded by the symbol a. So if G generates all strings, G will generate all strings that start with a. Conversely, if G fails to generate some string w, then G will fail to generate aw. Let us now write down this reduction formally. Assume that M is a TM that decides L2 . Then the following TM will decide ALLCF G : On input G : Construct the following CFG G : Copy all the rules of G. Add a new start variable S and add the production S aS Run M on input G and return its answer. By the above reasoning, G generates all strings that start with a if and only if G generates all strings. So if M decides L2 , this TM will decide ALLCF G which is impossible, since ALLCF G is undecidable. Therefore L2 must be undecidable as well.

Problem 2
For each of the following problems, show that it is NP-complete (namely, (1) it is in NP and (2) some NP-complete language reduces to it.) When showing NP-completeness, you can start from any language that was shown NP-complete in class or tutorial. (a) L1 = { : is a boolean formula that has at least two satisfying assignments}. (b) L2 = { G, k : G is a graph that contains a clique of size k or an independent set of size k}.

Solution
(a) To show L1 is in NP, we give a verier for L1 : On input , a, b , where is a boolean formula and a, b are candidate assignments, reject if a = b. If a is a satisfying assingment for and b is also a satisfying assignnment for , accept, otherwise reject. The running time of this verier is polynomial, as it merely needs to tell if two strings are equal and if each of them is a satisfying assignment for . To show L1 is NP-hard, we give a reduction from SAT. We need a way to turn a boolean formula into a new boolean formula so that if is satisable, then has two satisfying assignments and if not, then has less than two satisfying assignments. For example, the following is like that: = ( y) ( y) ] where y is a new variable that does not appear in . If is satisable and a is its satsifying assignment, then has at least two satisfying assignments: one is a, y = f alse, the other one a, y = true. If is not satisable, then for any assignment will evaluate to false, and so will (regardless of what is assigned to y). The Turing Machine that on input , outputs ( y) ( y) implements this reduction. Since it runs in polynomial time, L1 is NP-hard. (b) To show L2 is in NP, we give a verier for L2 : On input G, k, S , where S is a set of vertices of G, if S has size less than k reject. Otherwise, for every pair of vertices in S check if there is an edge. If all the edges are there (so G has a k-clique), accept. If none of the edges are there (so G has an independent set), accept. Otherwise reject. This verier needs to perform O(k 2 ) checks on pairs of vertices, and so it runs in polynomial time. We now argue that L2 is NP-hard. There are two natural problems we can try to reduce from: clique or independent set. Lets try CLIQU E. So given G, k , we want to come up with G , k so that (1) if G has a k-clique then G has a k -clique or a k -independent set, and (2) if G has no k-clique, then G has neither. Condition (1) is easy to satisfy; we can just take G = G and k = k. But then condition (2) may be false: G could have no k-clique but it could well have a k-independent set, in which case so will G . So we have to do something that will kill all independent sets in G of size k , but make sure that cliques of size k in G become cliques of size k in G . Here is one way to do it. Suppose G has n vertices. Then G will have 2n + 1 vertices. The rst n of these will be a copy of G, ant the other n + 1 will be connected to all the other vertices (the rst n as well as among themselves). Now set k = k + n + 1. Let us describe G a bit more formally. If the vertices of G are {v1 , . . . , vn }, then the vertices of G will be {v1 , . . . , vn , w1 , . . . , wn+1 }. For every edge {vi , vj } in G the same edge will be present in G , but G will have the additional edges from every wi to every vj and for every wi to every wj . If G has a clique S of size k, then G will have a clique of size k = k + n + 1, namely the clique S {w1 , . . . , wn+1 }.

If G has no clique of size k, then G cannot have a clique of size k + n + 1: Any such clique must contain at least k of the vertices {v1 , . . . , vn }, and the restriction to those vertices would be a clique of size k in G. Moreover, G cannot have an independent set of size k = k + n + 1 because any set of k vertices must contain one of the vertices wi , and this vertex is connected to everything so it cannot be part of an independent set. The construction of G and k from G and k requires building a graph with O(n) vertices O(n2 ) edges, so it can be done in polynomial time. Therefore CLIQU E reduces to L2 in polynomial time and so L2 is NP-hard.

Problem 3
Throughout the semester, we looked at various models of computation and we came up with the following hierarchy: regular context-free P NP decidable recognizable

We also gave examples showing that the containments are strict (e.g., a context-free language that is not regular), except for the containment P NP, which is not known to be strict. There is one gap in this picture between NP languages and decidable languages. In this problem you will ll this gap. (a) Show that 3SAT is decidable, and the decider has running time 2O(n) . (Unlike a verier for 3SAT which is given a 3CNF together with a potential satisfying assignment for , a decider for 3SAT is only given a 3CNF but not an assignment for it.) (b) Argue that for every language L in NP there is a constant c such that L is decidable in time c 2O(n ) . (Use the Cook-Levin Theorem.) (c) Let D be the following Turing Machine: D: On input M, w , where M is a Turing Machine, |w| Run M on input M, w for at most 22 steps. If M accepts M, w within this many steps, reject; Otherwise (if M rejects or hasnt halted), accept. Show that the language decided by D cannot be decided in time 2O(n and therefore it is not in NP.
c c)

for any constant c,

Hint: Assume that D can be decided in time 2O(n ) . What happens when D is given input D, w , where w is a suciently long string?

Solution
(a) To decide 3SAT, we do the following. Given a 3CNF formula with k variables, we list all possible 2k assignments to . If any one of these assignments satises we accept, otherwise we reject. The running time of this algorithm is O(n2k ) = 2O(n) , where n is the length of . (b) Let L be any language in NP. By the Cook-Levin theorem, there is a reduction from L to 3SAT that runs in time O(nc ) for some constant c. Consider the following TM for L: On input x, rst reduce x to an instance of 3SAT , then run the decider from part (a) on and return its answer. Since the running time of the reduction is O(nc ), has size at most c c O(nc ), so the running time of the decider for L is O(nc ) + 2O(n ) = 2O(n ) . (c) Suppose that the language of D can be decided in time t(n) = 2O(n ) and let M be a decider for the language of D with this running time. Consider what happens when D is given input M, w , where w is any suciently long string. Specically, assume the length n of w satises n n the condition t(n) 22 . Since D simulates M on input M, w for 22 steps, M will have halted on input w by the end of the simulation. So if M accepts M, w then D will reject M, w and vice-versa. So the language of D and the language of M must dier on input M, w , and therefore M cannot be a decider for the language of D. Since D runs in time at most O(22 ) it is a decider, but the language it decides cannot be c decided in time 2O(n ) for any constant c. By part (b), all NP languages can be decided in c) time 2O(n . Therefore the language of D cannot be in NP.
n c

Problem 4
A heuristic is an algorithm that often works well in practice, but it may not always produce the correct answer. In this problem, we will consider a heuristic for 3SAT. Let be a CNF formula and x a literal in . Suppose we set x to true. The reduced form of is the formula obtained by discarding all clauses of that contain x and removing x from all the other clauses. For example, if = (x1 x2 ) (x1 x3 ) (x2 x3 ), then setting x1 = true gives the reduced form x3 (x2 x3 ), while setting x1 = true gives the reduced form x2 (x2 x3 ). Consider the following heuristic H for 3SAT: On input , where is a 3CNF formula with n variables: For i := 1 to n, repeat the following: If xi appears in more often than xi , set xi = true. Otherwise, set xi = true. Replace with its reduced form. If contains an empty clause, reject. accept. (a) Show that H runs in polynomial time. (b) Show that if H accepts , then 3SAT .

(c) Show that it is possible that H rejects , even though 3SAT .

Solution
(a) Let m be the length of . In each loop of the iteration, H has to count the number of appearances of xi and xi in , which takes O(m) steps. Then it has to substitute a value for xi and reduce the formula, which also takes O(m) steps. (While reducing, it gures out if there is an empty clause or not.) Since there are n iterations, the total number of steps is O(nm), which is polynomial in the input size. (b) If H accepts , then in the process of assigning x1 , x2 , . . . , xn , no empty clause is ever created. Since xi satises all the clauses discarded in the ith iteration, all clauses of are satised by this assignment. (c) Let be the formula (x1 x1 x1 ) (x1 x1 x2 ) (x1 x1 x3 ). This formula is satisable (set all variables to true), but H will choose to assign x1 to false, create an empty clause after reducing, and reject.

You might also like