Distributed quantum computing with classical
communication
J. Avron, Ofer Casper and Ilan Rozen
arXiv:2104.07817v2 [quant-ph] 20 Apr 2021
Department of Physics, Technion, 320000 Haifa, Israel
April 21, 2021
Abstract
Distributed quantum computing with classical communication makes it possible to relieve some
of the limitations on the number of qubits and to mitigate the noise in quantum computers.
We give an algorithm that transforms a quantum circuit on a single processor into equivalent
circuits on distributed processors. We address the quantum advantage of distributed circuits
for the Grover search, Simon’s and the Deutsch-Jozsa problems. In the case of Grover the
quantum advantage of distributed computing remains the same, i.e. $O(\sqrt{N})$. In the case of
Simon it remains exponential, but the complexity deteriorates from $O(n)$ to $O(n^2)$, where
$n = \log_2 N$. The distributed Deutsch-Jozsa deteriorates to being probabilistic but retains
a quantum advantage over classical random sampling: a single quantum query gives the
same error as $O(n)$ random samples. In section 5 we describe an experiment with the
IBMQ5 machines that illustrates the advantages of distributed Grover search.
1 Introduction
The IBM quantum experience and Amazon Braket offer the opportunity to implement quantum
algorithms on distributed quantum computers. The IBM quantum experience, which is mostly
free, proved to be a popular platform for research [12], games [16] and education [15]. Twenty
eight quantum computers have been deployed by IBM. None can communicate quantumly. This
is also the case for Amazon Braket. This raises the question [10] of how one can use distributed
quantum computing with classical communication to better exploit these resources.
Distributed quantum computing with quantum communication [4, 14, 7] is a technology
that does not yet exist and is unavailable for the current NISQ [18] computers. Distributed
quantum computing with classical communication is currently available and offers at least two
advantages:
• It makes it possible to redress the limited depth of NISQ machines. Distributing the task among
different processors decreases the depth of the quantum circuits so that each
machine is assigned a number of gates it can safely handle (see footnote 1).
• It offers a way to study algorithms that need more qubits than are currently available on
any single machine.
Footnote 1: Since the errors in noisy devices grow exponentially with the depth, even a modest gain in the depth can be
important.
The two main disadvantages of such distributed hybrid quantum computing are:
• Inefficient usage of qubits: Two distributed n-qubit quantum processors are needed to
emulate a single processor with n + 1 connected qubits.
• Degradation of the quantum advantage of certain quantum algorithms.
Both result from the restricted (quantum) connectivity of distributed computing.
Several basic quantum algorithms involve the use of oracles. This raises the question of
how one should compare an oracle with its distributed cousin. The distributed oracles we shall
consider compute complementary pieces of the function computed by the single oracle.
We address the quantum advantage of distributed computing for a few basic algorithms (see footnote 2):
Grover [11], Simon’s [21, 20] and Deutsch-Jozsa [9]. Distributed Grover still has a quadratic
speedup, and distributed Simon’s an exponential speedup, but with higher complexity, see section
4.2. The most dramatic deterioration of the quantum advantage occurs for Deutsch-Jozsa,
see section 4.3.
An experiment with Grover search on the IBM 5-qubit quantum computers demonstrates
how distributed computing can effectively overcome limitations due to the limited depth of
NISQ machines.
2 Distributed computations with classical communication
2.1 Distributed processors
The general (pure) state of Alice’s fully connected n-qubit processor is the $N \times N$ (rank one)
matrix
$$|\psi\rangle\langle\psi| \tag{1}$$
where $N = 2^n$ and $|\psi\rangle$ is a general superposition of the n qubits. Bob has two processors with m
qubits each. The (pure) quantum state of Bob’s distributed processors is given by the $M^2 \times M^2$
matrix
$$|\phi_e\rangle\langle\phi_e| \otimes |\phi_o\rangle\langle\phi_o| \tag{2}$$
where $M = 2^m$ and $\phi_{e/o}$ is an arbitrary superposition of the m qubits of each processor (see footnote 3).
We are interested in quantum calculations that need n connected qubits, but Alice’s computer
is either unavailable or is too noisy and we have to make do with the distributed processors of
Bob where each processor has $m < n$ qubits.
When $m = n - 1$ Bob can perform the same computation as Alice by assigning the computation of the even arguments to one processor and the odd arguments to a second processor:
$$f_{\mathrm{even}}(y_1,\dots,y_m) = f(y_1,\dots,y_m,0), \qquad f_{\mathrm{odd}}(y_1,\dots,y_m) = f(y_1,\dots,y_m,1) \tag{3}$$
More generally, the arguments can be partitioned by
$$f_{\mathrm{even}}(y_1,\dots,y_m) = f(y_1,\dots,y_{j-1},0,y_{j+1},\dots,y_{m+1}), \qquad f_{\mathrm{odd}}(y_1,\dots,y_m) = f(y_1,\dots,y_{j-1},1,y_{j+1},\dots,y_{m+1}) \tag{4}$$
Footnote 2: In [10] the more practical but less elementary example of distributed 3SAT is shown to have a quantum advantage.
Footnote 3: The density matrices of Alice and Bob do not, in general, have the same size. In the case n = m + 1 they depend
on the same number of parameters (and so describe manifolds of equal dimensions).
Bob’s distributed processors then perform the same task as Alice’s single processor. Alice’s
processor has the advantage of higher connectivity, while Bob’s distributed processors have the
advantage of more qubits (see footnote 4) and smaller depth.
Alice can make arbitrary superpositions of N states. Bob, on the other hand, can only make
superpositions of the $M = N/2$ states of each processor. States in different processors cannot be
superposed. This gives Alice a quantum advantage. This then leads to the question of how much
of the quantum advantage survives in distributed computing. We do not know how to answer
this question in general. Instead, we shall answer it in the special cases of Grover, Simon’s and
Deutsch-Jozsa algorithms.
2.2 Quantum circuits and DNF
Any Boolean function can be represented in a DNF (disjunctive normal form) [1]. (For an
elementary introduction see appendix A.) For example, the DNF of the function that assigns 1
to 101 and 010 and 0 otherwise is
$$f(x_0,x_1,x_2) = x_0\cdot\bar{x}_1\cdot x_2 + \bar{x}_0\cdot x_1\cdot\bar{x}_2 \tag{5}$$
where $x_j \in \{0,1\}$ are logical variables (equivalently, binaries) and $\bar{x}_j$ is the (logical) NOT
(equivalently, for binaries, $\bar{x}_j = x_j \oplus 1$). The + is the (logical) OR, normally denoted ∨.
In the general case with n logical variables $x_j$, $j \in \{1,\dots,n\}$, the DNF could be a sum of a
large number of terms. Each term is a product over all logical variables, and each variable appears
once, either as $x_j$ or as $\bar{x}_j$.
The quantum circuit that computes
$$U_f|x\rangle = (-1)^{f(x)}|x\rangle \tag{6}$$
can be read out from the DNF. For example, for the DNF in Eq. 5 the circuit is given in Fig. 1.
[Circuit diagram of $U_f$ on three qubits: X gates and two columns of control dots.]
Figure 1: The circuit corresponding to the DNF in Eq. 5. Each column of control dots is a $C^{\otimes n}Z$
gate, a notation that manifests the symmetry of the gate.
In the general case, a pair of X gates decorates each negated variable $\bar{x}$ and the n-fold product is represented by
$C^{n}Z$.
Remark 2.1. The DNF is, in general, not the most compact representation of the function f
and similarly, the corresponding quantum circuit need not be the optimal circuit. Optimization
of quantum circuits is considered in [2].
The even and odd parts of f are easily constructed from the DNF. If we use $x_0$ as the bit
that determines even/odd then $f_{e/o}$ for the example in Eq. 5 is
$$f_e = x_1\cdot\bar{x}_2, \qquad f_o = \bar{x}_1\cdot x_2 \tag{7}$$
Footnote 4: Bob’s measurements give more bits of information than Alice’s measurements. We do not make use of this
advantage here.
In the general case, all the $\bar{x}_0$ terms make up the even part (with $\bar{x}_0$ deleted) and all the $x_0$ terms make up
the odd part (with $x_0$ deleted).
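The splitting rule above can be sketched in a few lines of Python. The representation is an assumption made for illustration: a DNF is a list of bit tuples, one tuple per term, where a 1 stands for $x_j$ and a 0 for $\bar{x}_j$. The names `split_dnf` and `show` are hypothetical helpers, not part of the paper.

```python
def split_dnf(terms, parity=0):
    """Split a full DNF into its even and odd parts with respect to one variable.

    Each term is a tuple of bits: 1 means the variable appears as x_j,
    0 means it appears negated. `parity` is the index of the even/odd bit.
    """
    even, odd = [], []
    for term in terms:
        rest = term[:parity] + term[parity + 1:]   # delete the parity variable
        if term[parity] == 0:
            even.append(rest)    # term contained the negated parity variable
        else:
            odd.append(rest)     # term contained the plain parity variable
    return even, odd


def show(terms, names):
    """Render a list of terms as a readable sum of products ('~' marks NOT)."""
    if not terms:
        return "0"
    return " + ".join(
        "·".join(name if bit else "~" + name for name, bit in zip(names, term))
        for term in terms
    )


# Eq. 5: f = x0·~x1·x2 + ~x0·x1·~x2, split on x0
f = [(1, 0, 1), (0, 1, 0)]
f_even, f_odd = split_dnf(f, parity=0)
print("f_e =", show(f_even, ["x1", "x2"]))
print("f_o =", show(f_odd, ["x1", "x2"]))
```

For the example of Eq. 5 this reproduces the two parts of Eq. 7.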
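The DNF can then be used to construct the (n − 1)-qubit circuits for the even and odd parts:
[Circuit diagrams of $U^B_{f_o}$ and $U^B_{f_e}$ on two qubits: each consists of a pair of X gates around a column of two control dots.]
Figure 2: The circuits corresponding to the DNF of Eq. 7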
In conclusion, given Alice’s n-qubit circuit corresponding to the DNF, there is a simple
procedure that constructs Bob’s (n − 1)-qubit circuits for the even and odd parts of the
function.
An algorithm for generating the even and odd circuits in the general case is:
Algorithm 2.1: Splitting a Circuit
Result: two (n−1)-qubit circuits $U^B_{f_e}$, $U^B_{f_o}$
Divide($U^A_f$):
    parity ← 0;
    $U^B_{f_o}$ ← 1;
    $U^B_{f_e}$ ← 1;
    if $U^A_f$ is not of DNF form then
        abort;
    foreach gate G in $U^A_f$, in the order they are executed from top to bottom, do
        if G is an X gate acting on the parity qubit (a) then
            parity ← NOT(parity);
        else if G is a $C^{\otimes(n-1)}Z$ then
            if parity == 1 then
                append $C^{\otimes(n-2)}Z$ to $U^B_{f_e}$;
            else
                append $C^{\otimes(n-2)}Z$ to $U^B_{f_o}$;
        else if G is a single-qubit gate not acting on the parity qubit then
            append G to its respective qubit in both $U^B_{f_o}$ and $U^B_{f_e}$;
        else
            abort;
    return $U^B_{f_o}$, $U^B_{f_e}$;
(a) The parity qubit is defined as the qubit that distinguishes between the even and odd subspaces.
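The algorithm returns the requisite $U^B_{f_o}$ and $U^B_{f_e}$:
$$U^A_f\,|0\rangle\otimes|\varphi\rangle \equiv |0\rangle\otimes U^B_{f_e}|\varphi\rangle, \qquad U^A_f\,|1\rangle\otimes|\varphi\rangle \equiv |1\rangle\otimes U^B_{f_o}|\varphi\rangle \tag{8}$$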
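A minimal Python sketch of Algorithm 2.1, under simplifying assumptions that are not in the paper: a circuit is a list of tuples, `("X", q)` for an X gate on qubit `q` and `("MCZ",)` for the multi-controlled Z acting on all qubits of the circuit, and X is the only single-qubit gate handled. The sketch aborts only on unknown gates and does not separately test that the input is of DNF form. The hypothetical helper `phase` evaluates the sign a circuit gives to a basis state, which is enough to check Eq. 8 on the example of Fig. 1 with $x_0$ as the parity qubit.

```python
from itertools import product


def divide(circuit, parity_qubit=0):
    """Split an n-qubit DNF phase circuit into (n-1)-qubit even and odd circuits."""
    parity = 0
    even, odd = [], []
    for gate in circuit:
        if gate == ("X", parity_qubit):
            parity ^= 1                      # track whether the parity qubit is flipped
        elif gate == ("MCZ",):
            # a flipped parity qubit means the term fires for parity bit 0 (even part)
            (even if parity == 1 else odd).append(("MCZ",))
        elif gate[0] == "X":
            # renumber the remaining qubits after removing the parity qubit
            q = gate[1] - (1 if gate[1] > parity_qubit else 0)
            even.append(("X", q))
            odd.append(("X", q))
        else:
            raise ValueError("circuit is not in DNF form")
    return even, odd


def phase(circuit, bits):
    """Sign (+1 or -1) that a circuit of X and MCZ gates gives to a basis state."""
    bits, sign = list(bits), 1
    for gate in circuit:
        if gate[0] == "X":
            bits[gate[1]] ^= 1
        elif all(bits):                      # MCZ flips the sign only on |11...1>
            sign = -sign
    return sign


# Circuit of Fig. 1 for f = x0·~x1·x2 + ~x0·x1·~x2
u_f = [("X", 1), ("MCZ",), ("X", 1),
       ("X", 0), ("X", 2), ("MCZ",), ("X", 0), ("X", 2)]
u_even, u_odd = divide(u_f, parity_qubit=0)

# Check Eq. 8 on every basis state
for b in product([0, 1], repeat=3):
    part = u_even if b[0] == 0 else u_odd
    print(b, phase(u_f, b), phase(part, b[1:]))
```

The two printed signs agree on all eight basis states, which is the content of Eq. 8 for this example.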
In appendix B we shall describe the converse procedure whereby starting from Bob’s distributed quantum circuits, Alice can generate her single quantum circuit.
3 Distributed circuits are shallower
Distributed circuits have shallower depths because only part of a function is computed in each
node. The shallow depth can be important for noisy devices that handle limited depth.
Consider the circuit in Fig. 1 and its distributed cousins in Fig. 2. To compute depth($U^A_f$) and
depth($U^B_f$) we use the following rules [3]:
• Any single-qubit gate is allowed
• The only 2-qubit gate allowed is CNOT
• Gates that can be executed in parallel are grouped into columns
• Consecutive single-qubit gates are counted as a single-qubit gate
We first observe that $C^2Z$ is related to Toffoli by conjugating the target qubit with Hadamard gates:
$$C^2Z \equiv (1\otimes 1\otimes H)\; C^2X\; (1\otimes 1\otimes H)$$
The Toffoli gate reduction to CNOT is given by [19]:
[Circuit diagram: decomposition of the Toffoli gate into CNOT, H, T and T† gates.]
Since $H\cdot H = 1$ the Hadamard gates drop from the depth and
$$\mathrm{depth}(C^2Z) = 11 < \mathrm{depth}(C^2X) = 12 \tag{9}$$
It follows from the circuit in Fig. 1 that
$$\mathrm{depth}(U^A_f) = 3 + 2\cdot\mathrm{depth}(C^2Z) = 25 \tag{10}$$
To compute the depth of the corresponding distributed circuits in Fig. 2 we first recall
$$CZ \equiv (1\otimes H)\;\mathrm{CNOT}\;(1\otimes H) \tag{11}$$
For the even circuit one combines the X gates with the Hadamard gates. Hence both circuits
have:
$$\mathrm{depth}(U^{B}_{f_{e/o}}) = 3$$
The example shows that distributed circuits can lead to substantial saving in the depth of the
quantum circuit.
Remark 3.1. A naive guess is that the gain in depth should be about a factor 2 since the
distributed circuits compute half the entries. This is, of course, not the case in the example given
above where the gain is about a factor 8. An interesting open problem is to establish general
results about the gain in circuit depth for n qubits circuits and useful functions f .
4 The quantum advantage of distributed algorithms
4.1 Grover
Consider a Boolean function f of n bits. Grover search for a pre-image of 1 requires
$$O\left(\sqrt{N/M}\right) \tag{12}$$
queries of f, where M is the total number of pre-images of 1.
The distributed oracles consist of two Boolean functions of n − 1 bits each. Denote by $M_e$ and $M_o$
the numbers of pre-images of 1 for the two functions. Evidently
$$M = M_e + M_o \tag{13}$$
The numbers of queries of the two distributed oracles are
$$O\left(\sqrt{\frac{N}{2M_e}}\right), \qquad O\left(\sqrt{\frac{N}{2M_o}}\right) \tag{14}$$
Since
$$\sqrt{\frac{1}{x}} + \sqrt{\frac{1}{1-x}} \ge 2\sqrt{2}, \qquad 0 \le x \le 1 \tag{15}$$
the comparison between a local Grover search and a distributed search is given, taking $x = M_e/M$, by
$$\frac{1}{2}\left(\sqrt{\frac{N}{2M_e}} + \sqrt{\frac{N}{2M_o}}\right) \ge \sqrt{\frac{N}{M}} \tag{16}$$
The 1/2 on the left reflects the fact that the (two) distributed oracles are the fair equivalent of
the local oracle.
We conclude that although distributed oracles are weaker than a single oracle, as expressed
by the inequality, the quadratic quantum advantage of Grover remains. This is in agreement
with [23] where it is shown that one can not do better than Grover (in the case of distributed
computing).
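The inequality of Eq. 16 can be checked numerically. The sketch below drops the constants hidden in the O(·) notation and simply compares the two sides for N = 16; the function names are hypothetical, and an oracle with no pre-image of 1 is assigned an infinite query count as a convention of this sketch.

```python
import math


def queries(n_items, n_marked):
    """Grover query scale sqrt(N/M); infinite when nothing is marked."""
    return math.inf if n_marked == 0 else math.sqrt(n_items / n_marked)


def compare(N, M_e, M_o):
    """Return (lhs, rhs) of Eq. 16 for given numbers of even and odd targets."""
    lhs = 0.5 * (queries(N // 2, M_e) + queries(N // 2, M_o))
    rhs = queries(N, M_e + M_o)
    return lhs, rhs


N = 16
for M_e in range(0, 4):
    for M_o in range(0, 4):
        if M_e + M_o == 0:
            continue                      # no target at all: nothing to compare
        lhs, rhs = compare(N, M_e, M_o)
        print(f"M_e={M_e} M_o={M_o}  lhs={lhs:.3f}  rhs={rhs:.3f}")
```

Equality holds when the targets are split evenly, $M_e = M_o$, which corresponds to $x = 1/2$ in Eq. 15.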
Similar arguments could be applied to quantum counting [6], since we could apply the counting
algorithm on each sub-space and the final result would be the sum of all the outputs.
4.2 Period finding and the $\mathbb{Z}_2^n$ Fourier transform
[Circuit diagram: $|0\rangle \to H^{\otimes n} \to (-1)^f \to H^{\otimes n} \to$ measurement.]
Figure 3: The Deutsch-Jozsa and the period finding circuit
Period finding is hard because of the complexity of the Fourier transform. Here we consider
period finding of a Boolean phase oracle. The task is to find the periods s, under bit-wise addition,
of $f:\{0,1\}^n \to \{0,1\}$:
$$f(x\oplus s) = f(x) \quad \forall x \in \{0,1\}^n \tag{17}$$
The trivial solution is s = 0. Period finding is the business of Fourier transforms. For the case
at hand, this is the $\mathbb{Z}_2^n$ Fourier transform. This problem is related to Simon’s problem [21].
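The circuit in Fig. 3 reduces period finding to solving a set of linear equations. Chasing the
state $|0\rangle$ through the circuit in Fig. 3, using the identity
$$H^{\otimes n}|x\rangle = \frac{1}{\sqrt{N}}\sum_y (-1)^{x\cdot y}|y\rangle \tag{18}$$
one finds
$$\begin{aligned}
H^{\otimes n}|0\rangle &= \frac{1}{\sqrt{N}}\sum_x |x\rangle \\
&\xrightarrow{(-1)^f} \frac{1}{\sqrt{N}}\sum_x (-1)^{f(x)}|x\rangle \\
&= \frac{1}{2\sqrt{N}}\sum_x (-1)^{f(x)}\bigl(|x\rangle + |x\oplus s\rangle\bigr) \\
&\xrightarrow{H^{\otimes n}} \frac{1}{2N}\sum_y \underbrace{\Bigl(\sum_x (-1)^{f(x)+x\cdot y}\Bigr)}_{=g(y)} \bigl(1+(-1)^{s\cdot y}\bigr)|y\rangle
\end{aligned} \tag{19}$$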
g(y) is the Fourier transform of $(-1)^f$ with respect to $\mathbb{Z}_2^n$. It is therefore localized on arguments
linearly related to the period. This is seen from
$$2g(y) = \sum_x (-1)^{f(x)+x\cdot y} + \sum_x (-1)^{f(x\oplus s)+(x\oplus s)\cdot y} = g(y)\bigl(1+(-1)^{s\cdot y}\bigr) \tag{20}$$
It follows that:
$$g(y) = w_y\,\delta(s\cdot y) \tag{21}$$
where $w_y \in \mathbb{Z}$ is the “weight” of the delta function. Inserting this into Eq. 19 gives for the state
exiting the circuit
$$\frac{1}{N}\sum_{y_j\cdot s = 0} w_{y_j}|y_j\rangle \tag{22}$$
The outgoing state is then a linear combination of solutions of $s\cdot y = 0 \bmod 2$. The periods s
are determined as the solutions of the linear system
$$y_j\cdot s = 0 \bmod 2, \qquad w_{y_j} \neq 0 \tag{23}$$
Each $y_j$ represents a measurement of Eq. 22, and so a different run of the circuit.
The unique solution of the system in Eq. 23 is the trivial one, s = 0, in the case that there are n
independent vectors $y_j$. In the case that n − 1 of the vectors $y_j$ are independent, there is a
unique non-trivial period s, etc. In the case that the sum in Eq. 22 has a single term $w_0$ for
y = 0, Eq. 23 trivializes and any s is a solution: f is a constant.
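The chain from Eq. 19 to Eq. 23 can be illustrated classically on a small example. The sketch below computes g(y) by brute force and then solves Eq. 23 by enumerating all candidate s instead of using Gaussian elimination; it also uses the full support of g rather than sampled measurement outcomes. The example function, with period s = 110, and all helper names are assumptions chosen for illustration.

```python
from itertools import product


def dot(a, b):
    """Inner product of two bit tuples modulo 2."""
    return sum(x & y for x, y in zip(a, b)) % 2


def fourier(f, n):
    """g(y) = sum over x of (-1)^(f(x) + x.y), the Z_2^n transform of the phase."""
    points = list(product([0, 1], repeat=n))
    return {y: sum((-1) ** (f(x) + dot(x, y)) for x in points) for y in points}


def periods(f, n):
    """All s with y.s = 0 mod 2 for every y that carries nonzero weight (Eq. 23)."""
    g = fourier(f, n)
    support = [y for y, w in g.items() if w != 0]
    return [s for s in product([0, 1], repeat=n)
            if all(dot(y, s) == 0 for y in support)]


def f(x):
    """Example phase function on 3 bits, constructed to have the period s = (1, 1, 0)."""
    return (x[0] ^ x[1]) & x[2]


g = fourier(f, 3)
print("support and weights:", {y: w for y, w in g.items() if w != 0})
print("periods:", periods(f, 3))
```

For this example the weights are nonzero on four vectors y, all orthogonal to s, and the only solutions of Eq. 23 are s = 000 and s = 110.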
Remark 4.1. The weights $\{w_y\}$ are (at most) N integers that satisfy Pythagoras
$$\sum_{y\cdot s=0} w_y^2 = N^2 \tag{24}$$
This imposes a (Diophantine) constraint on the allowed $\{w_y\}$ which is independent of the function
f.
Example 4.2. With N = 4, the solutions $\{w_y\}$ of Eq. 24 are (see footnote 5)
$$\{|4|\}, \qquad \{|2|,|2|,|2|,|2|\} \tag{25}$$
Not all of these solutions are realized as Fourier transforms of (the phases of) Boolean functions.
In fact, there are 16 Boolean functions of 2 bits:
• 2 constant functions $f(x_1,x_2) = 0$ and $f(x_1,x_2) = 1$ where s is arbitrary.
• 6 balanced functions where 1 has two pre-images. These have a single non-trivial period.
• 4 functions where 1 has a single pre-image and 4 where 1 has three pre-images. These have
no periodicity and s = 0.
The Fourier transforms of the phase functions $(-1)^f$ are
• $\pm 4\delta(y)$ for the constant functions.
• $\pm 4\delta(y-j)$, $j \in \{1,2,3\}$, for the balanced functions.
• $\pm 2\bigl(1 - 2\delta(y-j)\bigr)$, $j \in \{0,1,2,3\}$, for the remaining functions.
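The counting in Example 4.2 can be reproduced by enumerating all 16 Boolean functions of 2 bits. The sketch below is illustrative only; it stores each function as a truth table, computes its transform by brute force, checks Eq. 24 for every function, and groups the functions by the absolute values of their weights.

```python
from itertools import product

points = list(product([0, 1], repeat=2))


def dot(a, b):
    """Inner product of two bit tuples modulo 2."""
    return sum(x & y for x, y in zip(a, b)) % 2


def fourier(table):
    """Weights w_y of the phase (-1)^f for a truth table given as a dict x -> f(x)."""
    return tuple(sum((-1) ** (table[x] + dot(x, y)) for x in points) for y in points)


counts = {}
for values in product([0, 1], repeat=4):        # all 16 truth tables
    table = dict(zip(points, values))
    w = fourier(table)
    assert sum(v * v for v in w) == 16          # Eq. 24 with N = 4
    pattern = tuple(sorted(abs(v) for v in w))
    counts[pattern] = counts.get(pattern, 0) + 1

print(counts)   # {(0, 0, 0, 4): 8, (2, 2, 2, 2): 8}
```

The eight functions with a single weight of magnitude 4 are the 2 constant and 6 balanced functions; the other eight are the functions where 1 has one or three pre-images.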
Consider now the corresponding distributed algorithm where the non-trivial s is unique.
Suppose first that s is even. The distributed oracle reduces to the problem for n − 1 qubits for
the even (odd) oracles. This allows one to determine s after O(n) queries.
In the case that s is odd, x and x ⊕ s have different parity. The n − 2 queries of the distributed
algorithm will give the incorrect trivial result s = 0. One then needs to try again with a different
notion of even-odd, per Eq. 4. If the new notion of even-odd gives s even, the next n − 2 queries
will determine s after a total of 3(n − 2) queries. If s is odd, we need to repeat the process. The
complexity of an algorithm is determined by the worst case, which corresponds to all s being
odd. This gives
$$O(n^2) \tag{26}$$
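The role of the even-odd notion can be illustrated with a small classical check. The sketch below assumes a 3-bit example function with period s = 110, finds periods by brute force from the definition in Eq. 17, and keeps the periods shared by the two restricted functions of Eq. 4. Taking the shared periods is one reading of how the two distributed oracles are combined, adopted here only for illustration; all names are hypothetical.

```python
from itertools import product


def periods(f, n):
    """All s with f(x XOR s) == f(x) for every x, found by brute force."""
    points = list(product([0, 1], repeat=n))

    def shift(x, s):
        return tuple(a ^ b for a, b in zip(x, s))

    return [s for s in points if all(f(shift(x, s)) == f(x) for x in points)]


def restrict(f, j, value):
    """Fix bit j of the argument to `value`, giving a function of one bit fewer."""
    return lambda y: f(y[:j] + (value,) + y[j:])


def shared_periods(f, n, j):
    """Periods common to the two parts obtained by splitting on bit j."""
    p0 = periods(restrict(f, j, 0), n - 1)
    p1 = periods(restrict(f, j, 1), n - 1)
    return [s for s in p0 if s in p1]


def f(x):
    """Example function on 3 bits with the single non-trivial period s = (1, 1, 0)."""
    return (x[0] ^ x[1]) & x[2]


print("split on bit 2 (s is even):", shared_periods(f, 3, 2))
print("split on bit 0 (s is odd): ", shared_periods(f, 3, 0))
```

Splitting on a bit where s has a 0 keeps the period (here as 11 on the remaining two bits), while splitting on a bit where s has a 1 leaves only the trivial period, which is why a different notion of even-odd has to be tried.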
Remark 4.3. Similar arguments apply for the standard Simon algorithm for $f:\{0,1\}^n \to \{0,1\}^n$ (which is represented by a quantum circuit acting on 2n qubits). Consider two quantum circuits, corresponding to the odd and even sub-spaces, defining functions $f_e, f_o : \{0,1\}^{n-1} \to \{0,1\}^n$ (each represented by a quantum circuit acting on 2n − 1 qubits). Following
similar steps as in the paragraph above, we find that the complexity in this case is also $O(n^2)$.
In summary: The complexity of the period finding of a phase oracle is:
• Classically, the cost of the $\mathbb{Z}_2^n$ Fourier transform, i.e. $O(N)$
• $O(n)$ for the n-qubit quantum circuit
• $O(n^2)$ for the distributed quantum circuit.
Footnote 5: In the sense that, for example, for the first set $w_y = |4|$ means either $w_y = +4$ or $w_y = -4$.
4.3 Deutsch-Jozsa
The Deutsch-Jozsa algorithm uses the circuit in Fig. 3 to tell if the function f is constant or
balanced. There are two constant Boolean functions: $f_{c1}(x) \equiv 0$ and $f_{c2}(x) \equiv 1$. But there
are many balanced functions. In fact, the space of balanced functions has (super) exponentially
many elements:
$$\binom{N}{N/2}, \qquad N = 2^n \tag{27}$$
Classically, to determine if f is balanced or constant with no error, one needs N/2 + 1 queries
of f. If one is satisfied with distinguishing with high probability, then the probability that k
random entries of a balanced function have the same image under f is like selecting k stones,
all of the same color, from an urn with N stones, half white and half black. The probability for
k stones of the same color is given by (see footnote 6)
$$\mathrm{Prob}(k\ \text{identical stones}) = \left(1-\frac{N}{2N-1}\right)\cdots\left(1-\frac{N}{2N-k+1}\right) \tag{28}$$
When $k \ll N$ one has
$$\mathrm{Prob}(k\ \text{identical stones}) \approx \left(\frac{1}{2}\right)^k \tag{29}$$
If one tolerates an error probability ε, one needs to sample f
$$k = O(\log 1/\epsilon) \tag{30}$$
times. The Deutsch-Jozsa circuit of Fig. 3 outputs
$$\mathrm{Prob}(y=0) = \left|\frac{1}{N}\sum_x (-1)^{f(x)}\right|^2 = \begin{cases} 1 & f \in \text{const} \\ 0 & f \in \text{balanced} \end{cases} \tag{31}$$
and determines f, with no error, with a single query (assuming that the quantum gates are error
free and that f is indeed either balanced or constant).
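The classical sampling estimate of Eqs. 28-30 can be checked numerically. The sketch below evaluates the product of Eq. 28 as printed and searches for the smallest k that meets a tolerance ε; the function names are hypothetical. The product has k − 1 factors, each close to 1/2 when k ≪ N, so it decays geometrically in k like Eq. 29 (up to a constant factor) and the required k grows like log(1/ε), as in Eq. 30.

```python
import math


def prob_identical(N, k):
    """Product of Eq. 28: (1 - N/(2N-1)) ... (1 - N/(2N-k+1))."""
    p = 1.0
    for i in range(1, k):
        p *= 1 - N / (2 * N - i)
    return p


def samples_needed(N, eps):
    """Smallest k for which the probability of k identical stones is at most eps."""
    k = 1
    while prob_identical(N, k) > eps:
        k += 1
    return k


N = 2 ** 10
for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    k = samples_needed(N, eps)
    print(f"eps={eps:g}  k={k}  log2(1/eps)={math.log2(1 / eps):.2f}")
```

The printed k tracks log2(1/ε) closely, which is the logarithmic scaling stated in Eq. 30.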
Now consider the corresponding distributed Deutsch-Jozsa. This involves the partitioning of
f into its even and odd parts. The even and odd parts of a constant function are still constant
functions. But the even and odd parts of a balanced function need not be (two) balanced
functions. The distribution of 1 in the even function is the same as randomly drawing N/2
stones from two urns with N/2 stones each, all 1 or all 0. The probability distribution for finding
k 1’s in the even sequence is
$$\mathrm{Prob}(k) = \binom{N/2}{k}\binom{N/2}{N/2-k}\bigg/\binom{N}{N/2} \tag{32}$$
The average and variance of Prob(k) are:
$$\text{average} = \frac{N}{4}, \qquad \text{variance} = \frac{N^2}{16(N-1)} \tag{33}$$
Using this in Eq. 31 gives for the probability that a single use of the Deutsch-Jozsa will make
the error of identifying a balanced function as if it were constant
$$\mathrm{Prob}(y=0 \mid \text{balanced, single quantum query}) = O(1/N) \tag{34}$$
Footnote 6: The formula follows from repeated application of the fact that in an urn with W white stones and B black stones,
the probability of picking a black stone is B/(B + W).
Comparing with Eq. 30 we see that one needs O(n) classical queries to get the same margin of
error as a single quantum query.
In summary: The complexity of the Deutsch-Jozsa problem is
• 1 + N/2 for a deterministic classical computation
• A single query for a deterministic (error-free) circuit with n qubits
• A single query of a distributed circuit with error O(1/N)
• O(n) queries for a probabilistic classical computation with comparable error O(1/N)
The first two entries imply an O(N) quantum advantage and the last two entries an O(n)
advantage of distributed quantum computing.
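The step from Eqs. 32-33 to Eq. 34 can be made explicit with exact arithmetic. The sketch below assumes one particular reading of that step: if the even half of a balanced f contains k ones, applying Eq. 31 to the N/2 entries of the even oracle gives Prob(y = 0) = (1 − 4k/N)², and the error probability is the average of this quantity over Eq. 32. Under that assumption the average equals 16 · variance / N² = 1/(N − 1). The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def prob_k(N, k):
    """Eq. 32: probability that the even half of a random balanced f has k ones."""
    return Fraction(comb(N // 2, k) * comb(N // 2, N // 2 - k), comb(N, N // 2))


def moments(N):
    """Mean and variance of the distribution in Eq. 32."""
    ks = range(N // 2 + 1)
    mean = sum(k * prob_k(N, k) for k in ks)
    var = sum((k - mean) ** 2 * prob_k(N, k) for k in ks)
    return mean, var


def error_probability(N):
    """Average of (1 - 4k/N)^2 over Eq. 32 (the reading assumed in this sketch)."""
    return sum(prob_k(N, k) * (1 - Fraction(4 * k, N)) ** 2
               for k in range(N // 2 + 1))


for n in (2, 3, 4, 5):
    N = 2 ** n
    mean, var = moments(N)
    print(f"N={N}  mean={mean}  variance={var}  error={error_probability(N)}")
```

The printed mean and variance agree with Eq. 33, and the error falls off as 1/(N − 1), consistent with the O(1/N) of Eq. 34.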
5 Distributed computation: An experiment
Distributed algorithms can substantially reduce the depth of quantum circuits. This allows a
computation on two noisy machines that can not be carried out on a single noisy machine. The
depth the machine can handle is determined by at least two factors:
• The error rate of the individual gates: the final error then scales exponentially with the
depth.
• The coherence time of the qubits relative to the computation time: the latter scales linearly
with the depth.
In this section we describe an experiment carried out on the IBMQ5 running the Grover
algorithm on n = 4 qubits. We compared the Grover search for n = 4 qubits with the distributed
Grover search of the even and odd parts computed on two quantum circuits with n = 3 qubits
each.
The quantum circuit with 4 qubits fails to identify the marked element, see Fig. 5. In contrast,
the distributed Grover circuits with 3 qubits each succeed in identifying the marked element, see
Fig. 7, albeit with a probability that is well below what one would expect for noiseless machines.
5.1 Splitting the Grover oracle
We denote by Grover(C, U) the Grover quantum search algorithm on computer C with the oracle
U. Alice and Bob each get the same n-qubit oracle. Alice uses the oracle as is. Bob uses the
following algorithm to split the oracle between the two computers $B_1$ and $B_2$:
Algorithm 5.1: Distributed Grover Algorithm
Result: The binary index of a desired element from the whole set.
BobGrover($U_f$):
    $U_{odd}$, $U_{even}$ ← Divide($U_f$);
    res1 ← Grover($B_1$, $U_{even}$);
    res2 ← Grover($B_2$, $U_{odd}$);
    if $U_{even}$(res1) == 1 then
        return res1 + ’0’;
    return res2 + ’1’;
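Algorithm 5.1 can be illustrated with an ideal, noiseless simulation in plain Python. This is a sketch under stated assumptions, not the Qiskit code used in the experiment: the oracle is a classical predicate on bit strings, the last bit is taken as the parity bit (as in Eq. 3), each Grover run is replaced by its most likely measurement outcome, and the default iteration count assumes a single target. All function names are hypothetical.

```python
import math


def grover_probabilities(n_qubits, marked, iterations):
    """Outcome probabilities after ideal Grover iterations for a set of marked indices."""
    size = 2 ** n_qubits
    amp = [1 / math.sqrt(size)] * size                   # uniform superposition
    for _ in range(iterations):
        amp = [-a if i in marked else a for i, a in enumerate(amp)]   # phase oracle
        mean = sum(amp) / size
        amp = [2 * mean - a for a in amp]                # diffuser: reflect about the mean
    return [a * a for a in amp]


def grover(n_qubits, oracle, iterations):
    """Most likely measurement outcome of an ideal Grover run, as a bit string."""
    words = [format(i, f"0{n_qubits}b") for i in range(2 ** n_qubits)]
    marked = {i for i, w in enumerate(words) if oracle(w)}
    probs = grover_probabilities(n_qubits, marked, iterations)
    return words[max(range(len(probs)), key=probs.__getitem__)]


def bob_grover(f, n, iterations=None):
    """Algorithm 5.1: search the even and odd halves separately, then verify classically."""
    if iterations is None:                               # assumes a single target
        iterations = round(math.pi / 4 * math.sqrt(2 ** (n - 1)) - 0.5)
    u_even = lambda w: f(w + "0")
    u_odd = lambda w: f(w + "1")
    res1 = grover(n - 1, u_even, iterations)
    res2 = grover(n - 1, u_odd, iterations)
    if u_even(res1):
        return res1 + "0"
    return res2 + "1"


print(bob_grover(lambda w: w == "1111", 4))
```

With the target 1111 the even oracle is empty, so the classical check of `res1` fails and the result of the odd search is returned with the parity bit 1 appended.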
5.2 Distributed Grover search
Consider Grover search with N = 16 and M = 1, i.e. the oracle has a single target. The optimal
number of Grover iterations in this case is 3 and the success probability is 0.96.
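The quoted iteration count and success probability follow from the standard closed form for Grover search: after k iterations the success probability is sin²((2k + 1)θ) with sin θ = √(M/N). The sketch below evaluates it; the function name is hypothetical.

```python
import math


def grover_success(N, M=1):
    """Best iteration count and ideal success probability for M targets among N items."""
    theta = math.asin(math.sqrt(M / N))
    k = round(math.pi / (4 * theta) - 0.5)      # integer closest to the optimum
    return k, math.sin((2 * k + 1) * theta) ** 2


for N in (16, 8):
    k, p = grover_success(N)
    print(f"N={N}: {k} iterations, success probability {p:.3f}")
```

For N = 16 this gives 3 iterations with probability about 0.961, and for the 3-qubit sub-problems (N = 8) it gives 2 iterations with probability about 0.945.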
[Circuit diagram: four qubits $|q_0\rangle,\dots,|q_3\rangle$, each prepared with a Hadamard gate, followed by three Grover blocks and measurement.]
Figure 4: Grover search with N = 16 and M = 1. A Grover iteration is made of two reflections:
the oracle and the “diffuser”. The optimal number of iterations is 3 and the success probability
in a noiseless machine is 0.96.
In the following experiments we made use of the open-source Qiskit library to create the
various Grover search circuits and run them on real (i.e. not simulated) quantum machines.
Performing a measurement of the above circuit produced the following histogram:
[Histogram: fraction of counts (vertical axis) versus measured states $|0000\rangle$ through $|1111\rangle$ (horizontal axis); all labeled bar heights lie between 5.1 · 10^-2 and 7.3 · 10^-2.]
Figure 5: The histogram shows the results from the execution on IBM’s ibmq_santiago (a five-qubit quantum
computer available on the IBM Quantum Experience) of the Grover algorithm with a
phase oracle that shifts the phase of the state $|1111\rangle$. The length of the transpiled circuit and
the noise of the machine cause Grover to fail.
The histogram fails to identify |1111i.
Next, we consider the corresponding distributed Grover search with two oracles, one for the
’odd’ subspace and the other for the ’even’ subspace. The target $|1111\rangle$ is now encoded in the
odd oracle and the even oracle is “empty”. The resulting Grover circuits are:
[Circuit diagrams: two three-qubit circuits on $|q_0\rangle, |q_1\rangle, |q_2\rangle$, each with Hadamard gates followed by two Grover blocks (Odd Grover in the top circuit, Even Grover in the bottom circuit) and measurement.]
Figure 6: The top circuit runs Grover search on three qubits, with the four-qubit oracle evaluated
for odd arguments. Similarly, the bottom circuit evaluates for even arguments. The optimal
number of iterations in the odd circuit is 2 and the success probability is 0.95. Grover fails in
the even circuit, reflecting the absence of an even pre-image of 1.
Measuring the two circuits (8096 times each) resulted in the following histograms:
[Histograms: fraction of counts (vertical axis) versus measured states $|000\rangle$ through $|111\rangle$ (horizontal axis), in two panels titled “measured states in the odd sub-space” (left) and “measured states in the even sub-space” (right).]
Figure 7: The left histogram presents the fraction of each Z-basis state for the Grover algorithm
with the $U_{odd}$ oracle. It identifies $|111\rangle$ as it should, albeit with a lower probability than what is
expected for an ideal machine. The right histogram is the same for the Grover algorithm with
the $U_{even}$ oracle. It does not single out any state, as it should. Both executions were done on
IBM’s ibmq_santiago.
The left histogram clearly singles out the correct answer $|111\rangle$ (corresponding to $|1111\rangle$) albeit
with a lower probability than what one would expect in an ideal machine. The Grover search in
the odd sub-space did better than the search involving 4 qubits.
Following the discussion in section 3, we turned to IBM’s web dashboard to view additional
details regarding the jobs that were sent to IBM’s ibmq_santiago computer to produce the aforementioned histograms. In particular, we were interested in the scheme of the actual quantum
circuit that was run. These circuits are also referred to as “transpiled”, and they consist of
quantum gates that represent basic physical operations that a quantum machine can perform on
qubits. We discovered that the number of computational steps in the transpiled version of the
circuit in Figure 4 is approximately 370. The number of computational steps in the ’odd’ circuit
is around 90, and in the ’even’ circuit it is roughly 40. It follows that the number of
gates involved in a single Grover iteration is approximately:
• 120 for the undistributed circuit with 4 qubits
• 45 and 20 for the distributed circuits with 3 qubits each
This gives an advantage of about a factor 3 to distributed computing, due to the significant shortening of the length of a distributed computation, with impact on the accuracy of the results.
6 Conclusion
We studied distributed quantum computation with classical communication, a setting motivated
by the current status of NISQ computing [18]. Distributed computing offers the advantage of
smaller depth, which gives an exponential advantage in noisy machines. Distributed computing
with classical communication on ideal devices is, in general, inferior to quantum computing
in a single node. In order to make a comparison, we proposed “fair criteria” for comparing
localized and distributed processors and oracles. We examined the quantum advantage of three
basic distributed algorithms: The complexity of distributed Grover is still $O(\sqrt{N})$. In the case of
Simon the complexity is $O(n^2)$ versus $O(n)$, both offering an exponential advantage over the classical
algorithm. In the case of Deutsch-Jozsa, distributed computing only works probabilistically and
offers a modest advantage over sampling.
Acknowledgment
We thank Eyal Bairey and Oded Kenneth for careful reading and helpful comments on the
manuscript and E.B. for pointing out [10].
Appendix A The standard form of quantum circuits
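The quantum circuit for any Boolean function $f:\{0,1\}^n \to \{0,1\}^m$ can be written in terms of $C^{\otimes k}X$ and X
gates only. An algorithm for doing so is given below.
[Circuit diagram: $U_f$ maps the inputs $|x_0\rangle,\dots,|x_n\rangle$ and ancillas $|0\rangle,\dots,|0\rangle$ to $|x_0\rangle,\dots,|x_n\rangle$ and $|f_0(x)\rangle,\dots,|f_m(x)\rangle$.]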
For the sake of simplicity and concreteness we illustrate the algorithm for an example where
n = 2 and m = 1.
A.1 From the truth table to DNF [1]
The Disjunctive Normal Form [8] of a function can be calculated directly from the truth table of
the function. Suppose we are given the truth table of the Boolean function $f:\{0,1\}^2 \to \{0,1\}$:
x0   x1   f(x)
0    0    1
0    1    0
1    0    1
1    1    0
The DNF form of f is given by
$$f_{DNF}(x) = \bar{x}_0\cdot\bar{x}_1 + x_0\cdot\bar{x}_1 \tag{35}$$
where $\bar{x}$ denotes NOT(x).
The DNF form is constructed as follows: We only need to consider the rows where f(x) = 1.
In the example the first and third rows get a 1, so we only need to consider them. For every
such row we build out of the arguments $x_0, x_1$ a conjunction (a Boolean statement made of only
NOT and AND operations) so that an argument that takes the value 1 is written as is, while
an argument with value 0 is negated. For the example:
• The first row is described by $\bar{x}_0\cdot\bar{x}_1$ since both arguments are 0.
• The third row is described by $x_0\cdot\bar{x}_1$.
Generalization: In the case of a different n do the same with more variables. For a different m
repeat the process for each $f_i(x)$.
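The construction above translates directly into code. The sketch below is illustrative: the function is passed as a Python callable, each DNF term is stored as the row of the truth table that produced it, and `~` marks a negated variable in the printed formula. The helper names are hypothetical.

```python
from itertools import product


def truth_table_to_dnf(f, n):
    """Return the full DNF as the list of n-bit rows on which f equals 1."""
    return [row for row in product([0, 1], repeat=n) if f(*row) == 1]


def format_dnf(terms):
    """Write each row as a conjunction: x_j for a 1, ~x_j for a 0."""
    if not terms:
        return "0"
    return " + ".join(
        "·".join(f"x{j}" if bit else f"~x{j}" for j, bit in enumerate(term))
        for term in terms
    )


def f(x0, x1):
    """The example truth table: f = 1 exactly when x1 = 0."""
    return 1 - x1


terms = truth_table_to_dnf(f, 2)
print(format_dnf(terms))   # ~x0·~x1 + x0·~x1
```

The output reproduces Eq. 35; for a function with m output bits the same routine would be applied to each output bit separately.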
A.1.1 Remark
The DNF extracted from the truth table is not, in general, the simplest DNF formula of the
function. For example f of Eq. 35 can be written more simply as:
$$f_{DNF} = \bar{x}_1$$
The optimal DNF formula can be calculated using Karnaugh maps [13].
A.2 From DNF to quantum circuit [5]
The quantum circuit for any $f:\{0,1\}^n \to \{0,1\}^m$ can be built with X gates and $C^{\otimes n}X$ gates.
To see this observe first that the Toffoli gate [22] gives the conjunction of its arguments:
[Circuit diagram: a Toffoli gate with controls $|a\rangle$, $|b\rangle$ and target $|0\rangle$ outputs $|a\rangle$, $|b\rangle$, $|a\cdot b\rangle$.]
Figure 8: The Toffoli gate performs the product of its arguments, i.e. their conjunction.
The quantum circuit for calculating $\bar{x}_0\cdot\bar{x}_1 + x_0\cdot\bar{x}_1$ (a disjunction of conjunctions) is therefore
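[Circuit diagram: X and Toffoli gates acting on $|x_0\rangle$, $|x_1\rangle$ and a target qubit $|0\rangle$, with the labels $|\bar{x}_0\cdot\bar{x}_1\rangle$ and $|\bar{x}_0\cdot\bar{x}_1 + x_0\cdot\bar{x}_1\rangle$ on the target.]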
The circuit can be simplified using XX ≡ 1.
Generalization: For the case of n conjunctions, a similar construction works with $C^{\otimes n}X$ replacing
Toffoli:
[Circuit diagram: a $C^{\otimes n}X$ gate with controls $|x_0\rangle,\dots,|x_n\rangle$ maps the target $|c\rangle$ to $|c + x_0\cdot x_1\cdots x_n\rangle$.]
For Boolean functions whose output is a string of m bits one simply adds additional target qubits.
A.2.1 Phase oracles
Because it shows the main idea more clearly, this appendix demonstrates the construction of bit-flipping
oracles. Most of the algorithms in section 2, however, use phase-flipping oracles. To construct a
phase-flipping quantum circuit for a Boolean
function $f:\{0,1\}^n \to \{0,1\}$ one can use the same method, replacing the $C^{\otimes n}X$ gates
by $C^{\otimes(n-1)}Z$ gates. For example, the circuit we constructed in this section would be:
[Circuit diagram: X gates and controlled-Z gates on $|x_0\rangle$, $|x_1\rangle$, with output $(-1)^{\bar{x}_0\cdot\bar{x}_1 + x_0\cdot\bar{x}_1}|x_0\rangle|x_1\rangle$.]
For a function $f:\{0,1\}^n \to \{0,1\}^m$ one would append these circuits one after the other for
each output bit.
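The phase-oracle construction can be sketched as a gate list and checked on basis states. The representation is an assumption for illustration: `("X", q)` is an X gate on qubit `q`, `("MCZ",)` is the multi-controlled Z on all qubits, and each DNF term is a tuple of bits with 0 marking a negated variable. The helper names are hypothetical.

```python
from itertools import product


def dnf_to_phase_circuit(terms):
    """For each term: X on the negated variables, a multi-controlled Z, then the X gates again."""
    circuit = []
    for term in terms:
        flips = [("X", q) for q, bit in enumerate(term) if bit == 0]
        circuit += flips + [("MCZ",)] + flips
    return circuit


def phase(circuit, bits):
    """Sign (+1 or -1) that a circuit of X and MCZ gates gives to a basis state."""
    bits, sign = list(bits), 1
    for gate in circuit:
        if gate[0] == "X":
            bits[gate[1]] ^= 1
        elif all(bits):                 # MCZ flips the sign only on |11...1>
            sign = -sign
    return sign


# f = ~x0·~x1 + x0·~x1 (Eq. 35)
terms = [(0, 0), (1, 0)]
circuit = dnf_to_phase_circuit(terms)
for x in product([0, 1], repeat=2):
    print(x, phase(circuit, x))
```

Since the terms of a full DNF are mutually exclusive, at most one multi-controlled Z fires for any input, so the sign is −1 exactly on the inputs where f(x) = 1.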
Appendix B Synthesizing Alice’s circuit from Bob’s
Given the quantum circuits for $U_f$ and $U_g$
$$U_f|x\rangle = (-1)^{f(x)}|x\rangle, \qquad U_g|x\rangle = (-1)^{g(x)}|x\rangle, \qquad f, g:\{0,1\}^n \to \{0,1\} \tag{36}$$
on n qubits each, a quantum circuit for U acting on n + 1 qubits that evaluates a function that
coincides with f for even arguments and with g for odd arguments is given in Fig. 9.
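[Circuit diagram: U is equivalent to a controlled-$U_g$ followed by a controlled-$U_f$ whose control qubit is conjugated by X gates.]
Figure 9: A circuit that synthesises $U_f$ and $U_g$.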
The circuits for $U_{f/g}$ are made from single X gates and $C^nZ$. To construct $CU_{f/g}$ all we
need is the construction of CX and $CC^nZ$. Since CX = CNOT is a primitive gate, all one
needs is to construct $CC^nZ$, which reduces [17] to Toffoli gates and CZ.
References
[1] David Hilbert and Wilhelm Ackermann. Principles of Mathematical Logic. American Mathematical Soc, 1999.
[2] J.-H. Bae, Paul M. Alsing, Doyeol Ahn, and Warner A. Miller. Quantum circuit optimization
using quantum karnaugh map. Scientific Reports, 10(1):15651, Sep 2020.
[3] Adriano Barenco, Charles H Bennett, Richard Cleve, David P DiVincenzo, Norman Margolus, Peter Shor, Tycho Sleator, John A Smolin, and Harald Weinfurter. Elementary gates
for quantum computation. Physical review A, 52(5):3457, 1995.
[4] Robert Beals, Stephen Brierley, Oliver Gray, Aram W Harrow, Samuel Kutin, Noah Linden,
Dan Shepherd, and Mark Stather. Efficient distributed quantum computing. Proceedings of
the Royal Society A: Mathematical, Physical and Engineering Sciences, 469(2153):20120686,
2013.
[5] Yu. I. Bogdanov, N.A. Bogdanova, D.V. Fastovets, and V.F. Lukichev. Representation
of boolean functions in terms of quantum computation. Proc. SPIE 11022, International
Conference on Micro- and Nano-Electronics 2018, 110222R, arXiv:1906.06374, 2019.
[6] Gilles Brassard, Peter Høyer, and Alain Tapp. Quantum counting. In International Colloquium on Automata, Languages, and Programming, pages 820–831. Springer, 1998.
[7] JI Cirac, AK Ekert, SF Huelga, and Chiara Macchiavello. Distributed quantum computation
over noisy channels. Physical Review A, 59(6):4249, 1999.
[8] B.A. Davey and H.A. Priestley. Introduction to Lattices and Order, page 153. Cambridge
University Press., 1990.
[9] David Deutsch and Richard Jozsa. Rapid solution of problems by quantum computation.
Proceedings of the Royal Society of London. Series A: Mathematical and Physical Sciences,
439(1907):553–558, 1992.
[10] Vedran Dunjko, Yimin Ge, and J. Ignacio Cirac. Computational speedups using small
quantum devices. Phys. Rev. Lett., 121:250501, Dec 2018.
[11] Lov K. Grover. A fast quantum mechanical algorithm for database search. Proceedings, 28th
Annual ACM Symposium on the Theory of Computing (STOC), May 1996, pages 212-219.
arXiv:quant-ph/9605043, 1996.
[12] Robin Harper and Steven T. Flammia. Fault-tolerant logical gates in the ibm quantum
experience. Phys. Rev. Lett., 122:080504, Feb 2019.
[13] M. Karnaugh. The map method for synthesis of combinational logic circuits. Transactions
of the American Institute of Electrical Engineers, Part I: Communication and Electronics,
72(5):593–599, 1953.
[14] Yuan Liang Lim, Almut Beige, and Leong Chuan Kwek. Repeat-until-success linear optics
distributed quantum computing. Physical review letters, 95(3):030505, 2005.
[15] Robert Loredo. Learn Quantum Computing with Python and IBM Quantum Experience: A
hands-on introduction to quantum computing and writing your own quantum programs with
Python. Packt, 2020.
[16] Soumik Mahanti, Santanu Das, Bikash K Behera, and Prasanta K Panigrahi. Quantum
robots can fly; play games: an ibm quantum experience. Quantum Information Processing,
18(7):1–10, 2019.
[17] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information:
10th Anniversary Edition. Cambridge University Press, 2010.
[18] John Preskill. Quantum computing in the nisq era and beyond. Quantum, 2:79, Aug 2018.
[19] Vivek V. Shende and Igor L. Markov. On the cnot-cost of toffoli gates. Quantum Info.
Comput., 9(5):461, May 2009.
[20] Peter W Shor. Introduction to quantum algorithms. In Proceedings of Symposia in Applied
Mathematics, volume 58, pages 143–160, 2002.
[21] Daniel R Simon. On the power of quantum computation. SIAM journal on computing,
26(5):1474–1483, 1997.
[22] Tommaso Toffoli. Reversible computing. In Jaco de Bakker and Jan van Leeuwen, editors,
Automata, Languages and Programming, pages 632–644, Berlin, Heidelberg, 1980. Springer
Berlin Heidelberg.
[23] Christof Zalka. Grover’s quantum searching algorithm is optimal. Phys.Rev. A60 (1999)
2746-2751 arXiv:quant-ph/9711070, 1999.