
Distributed quantum computing with classical communication

2021

Distributed quantum computing with classical communications makes it possible to relieve some of the limitations on the number of qubits and to mitigate the noise in quantum computers. We give an algorithm that transforms a quantum circuit on a single processor to equivalent circuits on distributed processors. We address the quantum advantage of distributed circuits for the Grover search, Simon's and the Deutsch-Jozsa problems. In the case of Grover the quantum advantage of distributed computing remains the same, i.e. $O(\sqrt{N})$. In the case of Simon it remains exponential, but the complexity deteriorates from $O(n)$ to $O(n^2)$, where $n = \log_2 N$. The distributed Deutsch-Jozsa deteriorates to being probabilistic but retains a quantum advantage over classical random sampling: a single quantum query gives the same error as $O(n)$ random samples. In section 5 we describe an experiment with the IBMQ5 machines that illustrates the advantages of distributed Grover search.

J. Avron, Ofer Casper and Ilan Rozen
Department of Physics, Technion, 320000 Haifa, Israel
arXiv:2104.07817v2 [quant-ph] 20 Apr 2021
April 21, 2021

1 Introduction

The IBM quantum experience and Amazon Braket offer the opportunity to implement quantum algorithms on distributed quantum computers. The IBM quantum experience, which is mostly free, has proved to be a popular platform for research [12], games [16] and education [15]. Twenty eight quantum computers have been deployed by IBM. None can communicate quantumly. This is also the case for Amazon Braket. This raises the question [10] of how one can use distributed quantum computing with classical communication to better exploit these resources.

Distributed quantum computing with quantum communications [4, 14, 7] is a technology that does not yet exist and is unavailable for the current NISQ [18] computers.
Distributed quantum computing with classical communication is currently available and offers at least two advantages:

• It makes it possible to redress the limited depth of NISQ machines. Distributing the task among different processors decreases the depth of the quantum circuits, so that each machine is assigned a number of gates it can safely handle.¹
• It offers a way to study algorithms that need more qubits than are currently available on any single machine.

¹ Since the errors in noisy devices grow exponentially with the depth, even a modest gain in the depth can be important.

The two main disadvantages of such distributed hybrid quantum computing are:

• Inefficient usage of qubits: two distributed $n$-qubit quantum processors are needed to emulate a single processor with $n + 1$ connected qubits.
• Degradation of the quantum advantage of certain quantum algorithms.

Both result from the restricted (quantum) connectivity of distributed computing.

Several basic quantum algorithms involve the use of oracles. This raises the question of how one should compare an oracle with its distributed cousin. The distributed oracles we shall consider compute complementary pieces of the function computed by the single oracle.

We address the quantum advantage of distributed computing for a few basic algorithms:² Grover [11], Simon's [21, 20] and Deutsch-Jozsa [9]. Distributed Grover still has a quadratic speedup, and distributed Simon's an exponential speedup, but with higher complexity, see section 4.2. The most dramatic deterioration of the quantum advantage occurs for Deutsch-Jozsa, see section 4.3.

An experiment with Grover search on the IBM 5-qubit quantum computers demonstrates how distributed computing can effectively overcome limitations due to the limited depth of NISQ machines.
2 Distributed computations with classical communication

2.1 Distributed processors

The general (pure) state of Alice's fully connected $n$-qubit processor is the $N \times N$ (rank one) matrix

$$|\psi\rangle\langle\psi| \tag{1}$$

where $N = 2^n$ and $|\psi\rangle$ is a general superposition of the $n$ qubits. Bob has two processors with $m$ qubits each. The (pure) quantum state of Bob's distributed processors is given by the $M^2 \times M^2$ matrix

$$|\phi_e\rangle\langle\phi_e| \otimes |\phi_o\rangle\langle\phi_o| \tag{2}$$

where $M = 2^m$ and $\phi_{e/o}$ is an arbitrary superposition of the $m$ qubits of each processor.³

We are interested in quantum calculations that need $n$ connected qubits, where Alice's computer is either unavailable or too noisy, and we have to make do with the distributed processors of Bob, each with $m < n$ qubits. When $m = n - 1$ Bob can perform the same computation as Alice by assigning the computation of the even arguments to one processor and the odd arguments to a second processor:

$$f_{even}(y_1, \dots, y_m) = f(y_1, \dots, y_m, 0), \qquad f_{odd}(y_1, \dots, y_m) = f(y_1, \dots, y_m, 1) \tag{3}$$

More generally, the arguments can be partitioned by

$$f_{even}(y_1, \dots, y_m) = f(y_1, \dots, y_{j-1}, 0, y_{j+1}, \dots, y_{m+1}), \qquad f_{odd}(y_1, \dots, y_m) = f(y_1, \dots, y_{j-1}, 1, y_{j+1}, \dots, y_{m+1}) \tag{4}$$

² In [10] the more practical but less elementary example of distributed 3SAT is shown to have a quantum advantage.

³ The density matrices of Alice and Bob do not, in general, have the same size. In the case $n = m + 1$ they depend on the same number of parameters (and so describe manifolds of equal dimensions).

Bob's distributed processors then perform the same task as Alice's single processor. Alice's processor has the advantage of higher connectivity, while Bob's distributed processor has the advantage of more qubits⁴ and smaller depth. Alice can make arbitrary superpositions of $N$ states. Bob, on the other hand, can only make superpositions of the $M = N/2$ states of each processor. States in different processors can not be superposed.
This gives Alice a quantum advantage. It leads to the question of how much of the quantum advantage survives in distributed computing. We do not know how to answer this question in general. Instead, we shall answer it in the special cases of the Grover, Simon's and Deutsch-Jozsa algorithms.

⁴ Bob's measurements give more bits of information than Alice's measurements. We do not make use of this advantage here.

2.2 Quantum circuits and DNF

Any Boolean function can be represented in a DNF (disjunctive normal form) [1]. (For an elementary introduction see appendix A.) For example, the DNF of the function that assigns 1 to 101 and 010 and 0 otherwise is

$$f(x_0, x_1, x_2) = x_0 \cdot \bar{x}_1 \cdot x_2 + \bar{x}_0 \cdot x_1 \cdot \bar{x}_2 \tag{5}$$

where $x_j \in \{0, 1\}$ are logical variables (equivalently, binaries) and $\bar{x}_j$ is the (logical) NOT (equivalently, for binaries, $\bar{x}_j = x_j \oplus 1$). The $+$ is the (logical) OR, normally denoted $\vee$. In the general case with $n$ logical variables $x_j$, $j \in \{1, \dots, n\}$, the DNF can be a sum of a large number of terms. Each term is a product over all logical variables, and each variable appears once, either as $x_j$ or as $\bar{x}_j$. The quantum circuit that computes

$$U_f |x\rangle = (-1)^{f(x)} |x\rangle \tag{6}$$

can be read off from the DNF. For example, for the DNF in Eq. 5 the circuit is given in Fig. 1.

[Circuit diagram of $U_f$ on three qubits: two columns of control dots, each decorated with pairs of X gates on the negated variables.]
Figure 1: The circuit corresponding to the DNF in Eq. 5. Each column of control dots is a $C^{\otimes n}Z$ gate, a notation that manifests the symmetry of the gate.

In the general case, a pair of X gates decorates every $\bar{x}$, and the $n$-fold product is represented by $C^nZ$.

Remark 2.1. The DNF is, in general, not the most compact representation of the function $f$ and, similarly, the corresponding quantum circuit need not be the optimal circuit. Optimization of quantum circuits is considered in [2].

The even and odd parts of $f$ are easily constructed from the DNF. If we use $x_0$ as the bit that determines even/odd, then $f_{e/o}$ for the example in Eq. 5 is

$$f_e = x_1 \cdot \bar{x}_2, \qquad f_o = \bar{x}_1 \cdot x_2 \tag{7}$$
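The even/odd split of Eq. 7 can be sketched in a few lines of Python. The sketch assumes that a full DNF is stored as a set of minterms (bit tuples, with bit 1 standing for $x_j$ and bit 0 for $\bar{x}_j$); the function name `split_dnf` and this representation are illustrative assumptions, not part of the paper.

```python
from itertools import product

def split_dnf(minterms, parity=0):
    """Split a full DNF on the variable with index `parity`.

    `minterms` is a set of bit tuples, one per product term of the DNF
    (bit 1 stands for x_j, bit 0 for its negation).
    Returns (even, odd): the minterms of f_e and f_o on the remaining variables.
    """
    even, odd = set(), set()
    for term in minterms:
        rest = term[:parity] + term[parity + 1:]  # delete the parity variable
        (odd if term[parity] else even).add(rest)
    return even, odd

# Eq. 5: f = 1 exactly on the inputs 101 and 010.
f = {(1, 0, 1), (0, 1, 0)}
f_even, f_odd = split_dnf(f, parity=0)

assert f_even == {(1, 0)}  # x1 * not(x2), as in Eq. 7
assert f_odd == {(0, 1)}   # not(x1) * x2, as in Eq. 7

# The two halves together reproduce f on all 8 inputs.
for x in product((0, 1), repeat=3):
    half = f_odd if x[0] else f_even
    assert (x in f) == (x[1:] in half)
```

Choosing a different value of `parity` corresponds to a different notion of even/odd, as in Eq. 4.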
In the general case, all the $\bar{x}_0$ terms make up the even part (with $\bar{x}_0$ deleted) and all the $x_0$ terms make up the odd part (with $x_0$ deleted). The DNF can then be used to construct the $(n-1)$-qubit circuits for the even and odd parts:

[Circuit diagrams of $U_{f_o}^B$ and $U_{f_e}^B$ on two qubits each.]
Figure 2: The circuits corresponding to the DNF of Eq. 7.

In conclusion, given Alice's $n$-qubit circuit corresponding to the DNF, there is a simple procedure that constructs Bob's $(n-1)$-qubit circuits for the even and odd parts of the function. An algorithm for generating the even and odd circuits in the general case is:

Algorithm 2.1: Splitting a Circuit
Result: two $(n-1)$-qubit circuits $U_{f_e}^B$, $U_{f_o}^B$
Divide($U_f^A$)
    parity ← 0;
    $U_{f_o}^B$ ← 1;
    $U_{f_e}^B$ ← 1;
    if $U_f^A$ is not of DNF then
        abort;
    foreach gate G in $U_f^A$, in the order they are executed from top to bottom, do
        if G is an X gate acting on the parity qubit (a) then
            parity ← NOT(parity);
        else if G is a $C^{\otimes(n-1)}Z$ then
            if parity == 1 then
                append $C^{\otimes(n-2)}Z$ to $U_{f_e}^B$;
            else
                append $C^{\otimes(n-2)}Z$ to $U_{f_o}^B$;
        else if G is a single qubit gate not acting on the parity qubit then
            append G to its respective qubit in both $U_{f_o}^B$ and $U_{f_e}^B$;
        else
            abort;
    return $U_{f_o}^B$, $U_{f_e}^B$;
(a) The parity qubit is defined as the qubit that distinguishes between the even and odd subspaces.

The algorithm returns the requisite $U_{f_o}^B$ and $U_{f_e}^B$:

$$U_f^A \, |0\rangle \otimes |\varphi\rangle \equiv |0\rangle \otimes U_{f_e}^B |\varphi\rangle, \qquad U_f^A \, |1\rangle \otimes |\varphi\rangle \equiv |1\rangle \otimes U_{f_o}^B |\varphi\rangle \tag{8}$$

In appendix B we describe the converse procedure whereby, starting from Bob's distributed quantum circuits, Alice can generate her single quantum circuit.

3 Distributed circuits are shallower

Distributed circuits have shallower depths because only part of a function is computed in each node. The shallow depth can be important for noisy devices that can handle only a limited depth. Consider the circuit in Fig. 1 and its distributed cousins in Fig. 2.
To compute $\mathrm{depth}(U_f^A)$ and $\mathrm{depth}(U_f^B)$ we use the following rules [3]:

• Any single qubit gate is allowed
• The only 2-qubit gate allowed is CNOT
• Gates that can be executed in parallel are grouped into columns
• Consecutive single qubit gates are counted as a single qubit gate

We first observe that $C^2Z$ is related to Toffoli by:

[Circuit identity: $C^2Z$ is equivalent to a Toffoli gate whose target qubit is conjugated by Hadamard gates.]

The Toffoli gate reduction to CNOT is given by [19]:

[Circuit identity: decomposition of the Toffoli gate into CNOT, $H$, $T$ and $T^\dagger$ gates.]

Since $H \cdot H = 1$ the Hadamard gates drop from the depth and

$$\mathrm{depth}(C^2Z) = 11 < \mathrm{depth}(C^2X) = 12 \tag{9}$$

It follows from the circuit in Fig. 1 that

$$\mathrm{depth}(U_f^A) = 3 + 2 \cdot \mathrm{depth}(C^2Z) = 25 \tag{10}$$

To compute the depth of the corresponding distributed circuits in Fig. 2 we first recall

[Circuit identity: $CZ$ is equivalent to a CNOT whose target qubit is conjugated by Hadamard gates.] (11)

For the even circuit one combines the X gates with the Hadamard gates. Hence both circuits have

$$\mathrm{depth}(U_{f_{e/o}}^B) = 3$$

The example shows that distributed circuits can lead to a substantial saving in the depth of the quantum circuit.

Remark 3.1. A naive guess is that the gain in depth should be about a factor 2, since the distributed circuits compute half the entries. This is, of course, not the case in the example given above, where the gain is about a factor 8. An interesting open problem is to establish general results about the gain in circuit depth for $n$-qubit circuits and useful functions $f$.

4 The quantum advantage of distributed algorithms

4.1 Grover

Consider a Boolean function $f$ of $n$ bits. Grover search for the pre-image of 1 requires

$$O\big(\sqrt{N/M}\big) \tag{12}$$

queries of $f$, where $M$ is the total number of pre-images of 1. The distributed oracles consist of two Boolean functions of $n - 1$ bits each. Denote by $M_{e,o}$ the number of pre-images of 1 for the two functions.
Evidently

$$M = M_e + M_o \tag{13}$$

The numbers of queries of the two distributed oracles are

$$O\big(\sqrt{N/(2M_e)}\big), \qquad O\big(\sqrt{N/(2M_o)}\big) \tag{14}$$

Since, writing $x = M_e/M$,

$$\sqrt{\frac{1}{x}} + \sqrt{\frac{1}{1-x}} \ge 2\sqrt{2}, \qquad 0 \le x \le 1 \tag{15}$$

the comparison between a local Grover search and a distributed search is given by

$$\frac{1}{2}\left(\sqrt{N/(2M_e)} + \sqrt{N/(2M_o)}\right) \ge \sqrt{N/M} \tag{16}$$

The 1/2 on the left reflects the fact that the (two) distributed oracles are the fair equivalent of the local oracle. We conclude that although distributed oracles are weaker than a single oracle, as expressed by the inequality, the quadratic quantum advantage of Grover remains. This is in agreement with [23], where it is shown that one can not do better than Grover (in the case of distributed computing).

Similar arguments can be applied to quantum counting [6], since we can apply the counting algorithm on each sub-space and the final result is the sum of all the outputs.

4.2 Period finding and the $\mathbb{Z}_2^n$ Fourier transform

[Circuit: $|0\rangle$, followed by $H^{\otimes n}$, the phase oracle $(-1)^f$, $H^{\otimes n}$ and a measurement.]
Figure 3: The Deutsch-Jozsa and the period finding circuit.

Period finding is hard because of the complexity of the Fourier transform. Here we consider period finding of a Boolean phase oracle. The task is to find the periods $s$ under bit-wise addition of $f : \{0, 1\}^n \to \{0, 1\}$,

$$f(x \oplus s) = f(x) \qquad \forall x \in \{0, 1\}^n \tag{17}$$

The trivial solution is $s = 0$. Period finding is the business of Fourier transforms. For the case at hand, this is the $\mathbb{Z}_2^n$ Fourier transform. This problem is related to Simon's problem [21].

The circuit in Fig. 3 reduces period finding to solving a set of linear equations. Chasing the state $|0\rangle$ through the circuit in Fig. 3, using the identity

$$H^{\otimes n} |x\rangle = \frac{1}{\sqrt{N}} \sum_y (-1)^{x \cdot y} |y\rangle \tag{18}$$

one finds

$$\begin{aligned}
H^{\otimes n} |0\rangle &= \frac{1}{\sqrt{N}} \sum_x |x\rangle \\
&\xrightarrow{(-)^f} \frac{1}{\sqrt{N}} \sum_x (-1)^{f(x)} |x\rangle \\
&= \frac{1}{2\sqrt{N}} \sum_x (-1)^{f(x)} \big( |x\rangle + |x \oplus s\rangle \big) \\
&\xrightarrow{H^{\otimes n}} \frac{1}{2N} \sum_y \underbrace{\Big( \sum_x (-1)^{f(x) + x \cdot y} \Big)}_{= g(y)} \big( 1 + (-1)^{s \cdot y} \big) |y\rangle
\end{aligned} \tag{19}$$

$g(y)$ is the Fourier transform of $(-1)^f$ with respect to $\mathbb{Z}_2^n$. It is therefore localized on arguments linearly related to the period.
This is seen from

$$2g(y) = \sum_x (-1)^{f(x) + x \cdot y} + \sum_x (-1)^{f(x \oplus s) + (x \oplus s) \cdot y} = g(y) \big( 1 + (-1)^{s \cdot y} \big) \tag{20}$$

It follows that

$$g(y) = w_y \, \delta(s \cdot y) \tag{21}$$

where $w_y \in \mathbb{Z}$ is the "weight" of the delta function. Inserting this into Eq. 19 gives for the state exiting the circuit

$$\frac{1}{N} \sum_{y_j \cdot s = 0} w_{y_j} |y_j\rangle \tag{22}$$

The outgoing state is then a linear combination of solutions of $s \cdot y = 0 \bmod 2$. The periods $s$ are determined as the solutions of the linear system

$$y_j \cdot s = 0 \bmod 2, \qquad w_{y_j} \ne 0 \tag{23}$$

Each $y_j$ represents a measurement of Eq. 22, and so a different run of the circuit. The unique solution of the system in Eq. 23 is the trivial one, $s = 0$, in the case that there are $n$ independent vectors $y_j$. In the case that $n - 1$ of the vectors $y_j$ are independent, there is a unique non-trivial period $s$, etc. In the case that the sum in Eq. 22 has a single term $w_0$ for $y = 0$, Eq. 23 trivializes and any $s$ is a solution: $f$ is a constant.

Remark 4.1. The weights $\{w_y\}$ are (at most) $N$ integers that satisfy Pythagoras

$$\sum_{y \cdot s = 0} w_y^2 = N^2 \tag{24}$$

This imposes a (Diophantine) constraint on the allowed $\{w_y\}$ which is independent of the function $f$.

Example 4.2. With $N = 4$, the solutions $\{w_y\}$ of Eq. 24 are⁵

$$\{|4|\}, \qquad \{|2|, |2|, |2|, |2|\} \tag{25}$$

Not all of these solutions are realized as Fourier transforms of (the phases of) Boolean functions. In fact, there are 16 Boolean functions of 2 bits:

• 2 constant functions, $f(x_1, x_2) = 0$ and $f(x_1, x_2) = 1$, where $s$ is arbitrary.
• 6 balanced functions where 1 has two pre-images. These have a single non-trivial period.
• 4 functions where 1 has a single pre-image and 4 where 1 has three pre-images. These have no periodicity and $s = 0$.

The Fourier transforms of the phase functions $(-1)^f$ are

• $\pm 4\delta(y)$ for the constant functions.
• $\pm 4\delta(y - j)$, $j \in \{1, 2, 3\}$, for the balanced functions.
• $\pm 2\big(1 - 2\delta(y - j)\big)$, $j \in \{0, 1, 2, 3\}$, for the remaining functions.

Consider now the corresponding distributed algorithm where the non-trivial $s$ is unique. Suppose first that $s$ is even.
The distributed oracle reduces to the problem for $n - 1$ qubits for the even (odd) oracles. This makes it possible to determine $s$ after $O(n)$ queries. In the case that $s$ is odd, $x$ and $x \oplus s$ have different parity. The $n - 2$ queries of the distributed algorithm will give the incorrect trivial result $s = 0$. One then needs to try again with a different notion of even-odd, per Eq. 4. If the new notion of even-odd gives $s$ even, the next $n - 2$ queries will determine $s$ after a total of $3(n - 2)$ queries. If $s$ is odd, we need to repeat the process. The complexity of an algorithm is determined by the worst case, which corresponds to $s$ being odd for all notions of even-odd. This gives

$$O(n^2) \tag{26}$$

Remark 4.3. Similar arguments apply for the standard Simon algorithm for $f : \{0, 1\}^n \to \{0, 1\}^n$ (which is represented by a quantum circuit acting on $2n$ qubits). Consider two quantum circuits, corresponding to the odd and even sub-spaces, defining functions $f_e, f_o : \{0, 1\}^{n-1} \to \{0, 1\}^n$ (each represented by a quantum circuit acting on $2n - 1$ qubits). Following similar steps as in the paragraph above, we find that the complexity in this case is also $O(n^2)$.

In summary: The complexity of the period finding of a phase oracle is:

• Classically, the cost of the $\mathbb{Z}_2^n$ Fourier transform, i.e. $O(N)$
• $O(n)$ for the $n$-qubit quantum circuit
• $O(n^2)$ for the distributed quantum circuit.

⁵ In the sense that, for example, for the first set $w_y = |4|$ means either $w_y = +4$ or $w_y = -4$.

4.3 Deutsch-Jozsa

The Deutsch-Jozsa algorithm uses the circuit in Fig. 3 to tell if the function $f$ is constant or balanced. There are two constant Boolean functions: $f_{c1}(x) \equiv 0$ and $f_{c2}(x) \equiv 1$. But there are many balanced functions. In fact, the space of balanced functions has (super) exponentially many elements:

$$\binom{N}{N/2}, \qquad N = 2^n \tag{27}$$

Classically, to determine if $f$ is balanced or constant with no error, one needs $N/2 + 1$ queries of $f$.
If one is satisfied with distinguishing with high probability, then the probability that $k$ random entries of a balanced function have the same image under $f$ is like that of selecting $k$ stones, all of the same color, from an urn with $N$ stones, half white and half black. The probability for $k$ stones of the same color is given by⁶

$$\mathrm{Prob}(k \text{ identical stones}) = \left(1 - \frac{N}{2N - 1}\right) \cdots \left(1 - \frac{N}{2N - k + 1}\right) \tag{28}$$

When $k \ll N$ one has

$$\mathrm{Prob}(k \text{ identical stones}) \approx \left(\frac{1}{2}\right)^k \tag{29}$$

If one tolerates an error probability $\epsilon$, one needs to sample $f$

$$k = O(\log 1/\epsilon) \tag{30}$$

times.

⁶ The formula follows from repeated application of the fact that in an urn with $W$ white stones and $B$ black stones, the probability for picking a black stone is $B/(B + W)$.

The Deutsch-Jozsa circuit of Fig. 3 outputs

$$\mathrm{Prob}(y = 0) = \left| \frac{1}{N} \sum_x (-1)^{f(x)} \right|^2 = \begin{cases} 1 & f \in \text{const} \\ 0 & f \in \text{balanced} \end{cases} \tag{31}$$

and determines $f$, with no error, with a single query (assuming that the quantum gates are error free and that $f$ is indeed either balanced or constant).

Now consider the corresponding distributed Deutsch-Jozsa. This involves the partitioning of $f$ into its even and odd parts. The even and odd parts of a constant function are still constant functions. But the even and odd parts of a balanced function need not be (two) balanced functions. The distribution of 1 in the even function is the same as randomly drawing $N/2$ stones from two urns with $N/2$ stones each, all 1 or all 0. The probability distribution for finding $k$ 1's in the even sequence is

$$\mathrm{Prob}(k) = \binom{N/2}{k} \binom{N/2}{N/2 - k} \bigg/ \binom{N}{N/2} \tag{32}$$

The average and variance of $\mathrm{Prob}(k)$ are:

$$\text{average} = \frac{N}{4}, \qquad \text{variance} = \frac{N^2}{16(N - 1)} \tag{33}$$

Using this in Eq. 31 gives, for the probability that a single use of the Deutsch-Jozsa circuit will make the error of identifying a balanced function as if it were constant,

$$\mathrm{Prob}(y = 0 \mid \text{balanced, single quantum query}) = O(1/N) \tag{34}$$

Comparing with Eq. 30 we see that one needs $O(n)$ classical queries to get the same margin of error as a single quantum query.
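The $O(1/N)$ estimate of Eq. 34 can be checked numerically. The following Python sketch assumes that the error of a single distributed query is Eq. 31 applied to the even half (which has $N/2$ inputs, $k$ of them mapped to 1, so that $\mathrm{Prob}(y=0) = (1 - 4k/N)^2$), averaged over the distribution of Eq. 32. The function names are illustrative and not taken from the paper. Under this assumption the average equals $16 \cdot \text{variance}/N^2 = 1/(N-1)$ by Eq. 33.

```python
from fractions import Fraction
from math import comb

def even_ones_distribution(N):
    """Eq. 32: probability that the even half of a balanced f has k ones."""
    half = N // 2
    total = comb(N, half)
    return [Fraction(comb(half, k) * comb(half, half - k), total)
            for k in range(half + 1)]

def mean_prob_y0(N):
    """Average over Eq. 32 of Prob(y = 0) for the even half, (1 - 4k/N)^2."""
    probs = even_ones_distribution(N)
    return sum(p * Fraction(N - 4 * k, N) ** 2 for k, p in enumerate(probs))

for n in range(2, 8):
    N = 2 ** n
    # Exact rational arithmetic: the average is 1/(N - 1), i.e. O(1/N).
    assert mean_prob_y0(N) == Fraction(1, N - 1)
    print(n, N, mean_prob_y0(N))
```

Exact fractions are used instead of floating point so that the comparison with $1/(N-1)$ is an identity rather than an approximation.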
In summary: The complexity of the Deutsch-Jozsa problem is

• $1 + N/2$ queries for a deterministic classical computation
• A single query for a deterministic (error-free) circuit with $n$ qubits
• A single query of a distributed circuit, with error $O(1/N)$
• $O(n)$ queries for a probabilistic classical computation with comparable error $O(1/N)$

The first two entries imply an $O(N)$ quantum advantage and the last two entries an $O(n)$ advantage of distributed quantum computing.

5 Distributed computation: An experiment

Distributed algorithms can substantially reduce the depth of quantum circuits. This allows a computation on two noisy machines that can not be carried out on a single noisy machine. The depth the machine can handle is determined by at least two factors:

• The error rate of the individual gates: the final error then scales exponentially with the depth.
• The coherence time of the qubits relative to the computation time. The latter scales linearly with the depth.

In this section we describe an experiment carried out on the IBMQ5 running the Grover algorithm on $n = 4$ qubits. We compared the Grover search for $n = 4$ qubits with the distributed Grover search of the even and odd parts computed on two quantum circuits with $n = 3$ qubits each. The quantum circuit with 4 qubits fails to identify the marked element, see Fig. 5. In contrast, the distributed Grover circuit, with 3 qubits each, succeeds in identifying the marked element, see Fig. 7, albeit with a probability that is well below what one would expect for noiseless machines.

5.1 Splitting the Grover oracle

We denote by Grover(C, U) the Grover quantum search algorithm on computer C with the oracle U. Alice and Bob each get the same $n$-qubit oracle. Alice uses the oracle as is. Bob uses the following algorithm to split the oracle between the two computers $B_1$ and $B_2$:

Algorithm 5.1: Distributed Grover Algorithm
Result: The binary index of a desired element from the whole set.
BobGrover($U_f$)
    $U_{odd}$, $U_{even}$ ← Divide($U_f$);
    res1 ← Grover($B_1$, $U_{even}$);
    res2 ← Grover($B_2$, $U_{odd}$);
    if $U_{even}$(res1) == 1 then
        return res1 + '0';
    return res2 + '1';

5.2 Distributed Grover search

Consider Grover search with $N = 16$ and $M = 1$, i.e. the oracle has a single target. The optimal number of Grover iterations in this case is 3 and the success probability is 0.96.

[Circuit: four qubits $|q_0\rangle, \dots, |q_3\rangle$, each with a Hadamard gate, followed by three Grover iterations and a measurement of every qubit.]
Figure 4: Grover search with $N = 16$ and $M = 1$. A Grover iteration is made of two reflections: the oracle and the "diffuser". The optimal number of iterations is 3 and the success probability in a noiseless machine is 0.96.

In the following experiments we made use of the open-source Qiskit library to create the various Grover search circuits and run them on real (i.e. not simulated) quantum machines. Performing a measurement of the above circuit produced the following histogram:

[Histogram: fraction of measurements (vertical axis) for each of the 16 measured states $|0000\rangle, \dots, |1111\rangle$ (horizontal axis).]
Figure 5: The histogram shows the results from the execution on IBM's ibmq_santiago (a five qubit quantum computer available on the IBM Quantum Experience) of the Grover algorithm with a phase oracle that shifts the phase of the state $|1111\rangle$. The length of the transpiled circuit and the noise of the machine cause Grover to fail. The histogram fails to identify $|1111\rangle$.

Next, we consider the corresponding distributed Grover search with two oracles, one for the 'odd' subspace and the other for the 'even' subspace.
The target $|1111\rangle$ is now encoded in the odd oracle and the even oracle is "empty". The resulting Grover circuits are:

[Two circuits on three qubits $|q_0\rangle, |q_1\rangle, |q_2\rangle$, each starting with Hadamard gates and ending with measurements: the top one with two "Odd Grover" iterations, the bottom one with two "Even Grover" iterations.]
Figure 6: The top circuit runs Grover search on three qubits, with the four-qubit oracle evaluated for odd arguments. Similarly, the bottom circuit evaluates for even arguments. The optimal number of iterations in the odd circuit is 2 and the success probability is 0.95. Grover fails in the even circuit, reflecting the absence of an even pre-image of 1.

Measuring the two circuits (8096 times each) resulted in the following histograms:

[Two histograms: fraction of measurements (vertical axis) for each measured state $|000\rangle, \dots, |111\rangle$ (horizontal axis). Left: measured states in the odd sub-space. Right: measured states in the even sub-space.]
Figure 7: The left histogram presents the percentage of each Z basis state for the Grover algorithm with the $U_{odd}$ oracle. It identifies $|111\rangle$ as it should, albeit with a lower probability than what is expected for an ideal machine. The right histogram is the same for the Grover algorithm with the $U_{even}$ oracle. It does not single out any state, as it should. Both executions were done on IBM's ibmq_santiago.

The left histogram clearly singles out the correct answer $|111\rangle$ (corresponding to $|1111\rangle$), albeit with a lower probability than what one would expect in an ideal machine. The Grover search in the odd sub-space did better than the search involving 4 qubits.

Following the discussion in section 3, we turned to IBM's web dashboard to view additional details regarding the jobs that were sent to IBM's ibmq_santiago computer to produce the aforementioned histograms. In particular, we were interested in the scheme of the actual quantum circuit that was run.
These circuits are also referred to as "transpiled", and they consist of quantum gates that represent basic physical operations that a quantum machine can perform on qubits. We found that the number of computational steps in the transpiled version of the circuit in Figure 4 is approximately 370. The number of computational steps in the 'odd' circuit is around 90, and in the 'even' circuit it is roughly 40. It follows that the number of gates involved in a single Grover iteration is approximately:

• 120 for the undistributed circuit with 4 qubits
• 45 and 20 for the distributed circuits with 3 qubits each

This gives distributed computing an advantage of about a factor 3, due to the significant shortening of the length of a distributed computation, with an impact on the accuracy of the results.

6 Conclusion

We studied distributed quantum computation with classical communication, a setting motivated by the current status of NISQ computing [18]. Distributed computing offers the advantage of smaller depth, which gives an exponential advantage in noisy machines. Distributed computing with classical communication on ideal devices is, in general, inferior to quantum computing in a single node. In order to make a comparison, we proposed "fair criteria" for comparing localized and distributed processors and oracles. We examined the quantum advantage of three basic distributed algorithms: The complexity of distributed Grover is still $O(\sqrt{N})$. In the case of Simon the complexity is $O(n^2)$ versus $O(n)$, both offering an exponential advantage over the classical algorithm. In the case of Deutsch-Jozsa, distributed computing only works probabilistically and offers a modest advantage over sampling.

Acknowledgment

We thank Eyal Bairey and Oded Kenneth for careful reading and helpful comments on the manuscript and E.B. for pointing out [10].
Appendix A The standard form of quantum circuits

The quantum circuit for any Boolean function $f : \{0, 1\}^n \to \{0, 1\}^m$ can be written in terms of $C^{\otimes k}X$ and X gates only. An algorithm for doing so is given below.

[Circuit: $U_f$ maps the inputs $|x_0\rangle, \dots, |x_n\rangle, |0\rangle, \dots, |0\rangle$ to the outputs $|x_0\rangle, \dots, |x_n\rangle, |f_0(x)\rangle, \dots, |f_m(x)\rangle$.]

For the sake of simplicity and concreteness we illustrate the algorithm for an example where $n = 2$ and $m = 1$.

A.1 From the truth table to DNF [1]

The Disjunctive Normal Form [8] of a function can be calculated directly from the truth table of the function. Suppose we are given the truth table of the Boolean function $f : \{0, 1\}^2 \to \{0, 1\}$:

x0   x1   f(x)
0    0    1
0    1    0
1    0    1
1    1    0

The DNF form of $f$ is given by

$$f_{DNF}(x) = \bar{x}_0 \cdot \bar{x}_1 + x_0 \cdot \bar{x}_1 \tag{35}$$

where $\bar{x}$ denotes NOT($x$). The DNF form is constructed as follows: We only need to consider the rows where $f(x) = 1$. In the example the first and third rows get a 1, so we only need to consider them. For every such row we build out of the arguments $x_0, x_1$ a conjunction (a Boolean statement made of only NOT and AND operations) so that an argument that takes the value 1 is written as is, while an argument with value 0 is negated. For the example:

• The first row is described by $\bar{x}_0 \cdot \bar{x}_1$ since both arguments are 0.
• The third row is described by $x_0 \cdot \bar{x}_1$.

Generalization: In the case of a different $n$, do the same with more variables. For a different $m$, repeat the process for each $f_i(x)$.

A.1.1 Remark

The DNF extracted from the truth table is not, in general, the simplest DNF formula of the function. For example, $f$ of Eq. 35 can be written more simply as:

$$f_{DNF} = \bar{x}_1$$

The optimal DNF formula can be calculated using Karnaugh maps [13].

A.2 From DNF to quantum circuit [5]

The quantum circuit for any $f : \{0, 1\}^n \to \{0, 1\}^m$ can be built with X gates and $C^{\otimes n}X$ gates. To see this, observe first that the Toffoli gate [22] gives the conjunction of its arguments:

[Circuit: the Toffoli gate maps $|a\rangle |b\rangle |0\rangle$ to $|a\rangle |b\rangle |a \cdot b\rangle$.]
Figure 8: The Toffoli gate performs the product of its arguments, i.e. their conjunction.

The quantum circuit for calculating $\bar{x}_0 \cdot \bar{x}_1 + x_0 \cdot \bar{x}_1$ (a disjunction of conjunctions) is therefore

[Circuit: two Toffoli gates acting on $|x_0\rangle |x_1\rangle |0\rangle$, each decorated with pairs of X gates on the negated arguments. The target qubit holds $|\bar{x}_0 \cdot \bar{x}_1\rangle$ after the first Toffoli gate and $|\bar{x}_0 \cdot \bar{x}_1 + x_0 \cdot \bar{x}_1\rangle$ at the output.]

The circuit can be simplified using $XX \equiv 1$.

Generalization: For the case of conjunctions of $n$ arguments, a similar construction works with $C^{\otimes n}X$ replacing Toffoli:

[Circuit: $C^{\otimes n}X$ with controls $|x_0\rangle, \dots, |x_n\rangle$ maps the target $|c\rangle$ to $|c + x_0 \cdot x_1 \cdots x_n\rangle$.]

For Boolean functions whose output is a string of $m$ bits one simply adds additional target qubits.

A.2.1 Phase oracles

Since it shows the main idea more clearly, this appendix demonstrates the construction of bit flipping oracles, whereas most of the algorithms in section 2 use phase flipping oracles. To construct a phase oracle circuit for a Boolean function $f : \{0, 1\}^n \to \{0, 1\}$ one can use the same method, converting the $C^{\otimes n}X$ gates to $C^{\otimes(n-1)}Z$ gates. For example, the circuit we constructed in this section would be:

[Circuit: two $CZ$ gates on $|x_0\rangle |x_1\rangle$, decorated with pairs of X gates on the negated arguments, with output $(-1)^{\bar{x}_0 \cdot \bar{x}_1 + x_0 \cdot \bar{x}_1} |x_0\rangle |x_1\rangle$.]

For a function $f : \{0, 1\}^n \to \{0, 1\}^m$ one would append these circuits one after the other for each output bit.

Appendix B Synthesizing Alice's circuit from Bob's

Given the quantum circuits for $U_f$ and $U_g$,

$$U_f |x\rangle = (-1)^{f(x)} |x\rangle, \qquad U_g |x\rangle = (-1)^{g(x)} |x\rangle, \qquad f, g : \{0, 1\}^n \to \{0, 1\} \tag{36}$$

on $n$ qubits each, a quantum circuit for $U$ acting on $n + 1$ qubits that evaluates a function that coincides with $f$ for even arguments and with $g$ for odd arguments is given in Fig. 9.

[Circuit: $U$ is a controlled-$U_g$ followed by a controlled-$U_f$ whose control qubit is conjugated by X gates.]
Figure 9: A circuit that synthesises $U_f$ and $U_g$.

The circuits for $U_{f/g}$ are made from single X gates and $C^nZ$. To construct $CU_{f/g}$ all we need is the construction of $CX$ and $CC^nZ$. Since $CX = CNOT$ is a primitive gate, all one needs is to construct $CC^nZ$, which reduces [17] to Toffoli gates and $CZ$.

References

[1] David Hilbert and Wilhelm Ackermann. Principles of Mathematical Logic. American Mathematical Soc., 1999.
[2] J.-H. Bae, Paul M. Alsing, Doyeol Ahn, and Warner A. Miller. Quantum circuit optimization using quantum Karnaugh map. Scientific Reports, 10(1):15651, Sep 2020.
[3] Adriano Barenco, Charles H. Bennett, Richard Cleve, David P. DiVincenzo, Norman Margolus, Peter Shor, Tycho Sleator, John A. Smolin, and Harald Weinfurter. Elementary gates for quantum computation. Physical Review A, 52(5):3457, 1995.
[4] Robert Beals, Stephen Brierley, Oliver Gray, Aram W. Harrow, Samuel Kutin, Noah Linden, Dan Shepherd, and Mark Stather. Efficient distributed quantum computing. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 469(2153):20120686, 2013.
[5] Yu. I. Bogdanov, N. A. Bogdanova, D. V. Fastovets, and V. F. Lukichev. Representation of Boolean functions in terms of quantum computation. Proc. SPIE 11022, International Conference on Micro- and Nano-Electronics 2018, 110222R, arXiv:1906.06374, 2019.
[6] Gilles Brassard, Peter Høyer, and Alain Tapp. Quantum counting. In International Colloquium on Automata, Languages, and Programming, pages 820–831. Springer, 1998.
[7] J. I. Cirac, A. K. Ekert, S. F. Huelga, and Chiara Macchiavello. Distributed quantum computation over noisy channels. Physical Review A, 59(6):4249, 1999.
[8] B. A. Davey and H. A. Priestley. Introduction to Lattices and Order, page 153. Cambridge University Press, 1990.
[9] David Deutsch and Richard Jozsa. Rapid solution of problems by quantum computation. Proceedings of the Royal Society of London. Series A: Mathematical and Physical Sciences, 439(1907):553–558, 1992.
[10] Vedran Dunjko, Yimin Ge, and J. Ignacio Cirac. Computational speedups using small quantum devices. Phys. Rev. Lett., 121:250501, Dec 2018.
[11] Lov K. Grover. A fast quantum mechanical algorithm for database search. Proceedings, 28th Annual ACM Symposium on the Theory of Computing (STOC), May 1996, pages 212–219. arXiv:quant-ph/9605043, 1996.
[12] Robin Harper and Steven T. Flammia. Fault-tolerant logical gates in the IBM quantum experience. Phys. Rev. Lett., 122:080504, Feb 2019.
[13] M. Karnaugh. The map method for synthesis of combinational logic circuits. Transactions of the American Institute of Electrical Engineers, Part I: Communication and Electronics, 72(5):593–599, 1953.
[14] Yuan Liang Lim, Almut Beige, and Leong Chuan Kwek. Repeat-until-success linear optics distributed quantum computing. Physical Review Letters, 95(3):030505, 2005.
[15] Robert Loredo. Learn Quantum Computing with Python and IBM Quantum Experience: A hands-on introduction to quantum computing and writing your own quantum programs with Python. Packt, 2020.
[16] Soumik Mahanti, Santanu Das, Bikash K. Behera, and Prasanta K. Panigrahi. Quantum robots can fly; play games: an IBM quantum experience. Quantum Information Processing, 18(7):1–10, 2019.
[17] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information: 10th Anniversary Edition. Cambridge University Press, 2010.
[18] John Preskill. Quantum computing in the NISQ era and beyond. Quantum, 2:79, Aug 2018.
[19] Vivek V. Shende and Igor L. Markov. On the CNOT-cost of Toffoli gates. Quantum Info. Comput., 9(5):461, May 2009.
[20] Peter W. Shor. Introduction to quantum algorithms. In Proceedings of Symposia in Applied Mathematics, volume 58, pages 143–160, 2002.
[21] Daniel R. Simon. On the power of quantum computation. SIAM Journal on Computing, 26(5):1474–1483, 1997.
[22] Tommaso Toffoli. Reversible computing. In Jaco de Bakker and Jan van Leeuwen, editors, Automata, Languages and Programming, pages 632–644, Berlin, Heidelberg, 1980. Springer Berlin Heidelberg.
[23] Christof Zalka. Grover's quantum searching algorithm is optimal. Phys. Rev. A 60 (1999) 2746–2751, arXiv:quant-ph/9711070, 1999.