Thesis Bjarne Bouwer
Master Thesis 60 EC
Author: Bjarne Bouwer
Supervisors: Prof. dr. Michael Walter, Dr. Jonas Helsen
Second assessor: Dr. Philippe Corboz
Abstract
Noisy intermediate-scale quantum computers are limited in their performance by their low
qubit counts and high error rates. A quantum circuit must be limited in depth so that errors
do not build up to unmanageable levels. However, DMERA, a theoretical proposal by Kim and
Swingle based on the MERA tensor network, provides a method to construct a quantum circuit
that is inherently noise resilient and does not require many qubits [1]. The ansatz can reliably
prepare a desired quantum state by iteratively applying a quantum channel with one stable
fixed point. In a recent paper, Sewell and Jordan studied DMERA by performing experiments
on a quantum computer [2].
In this thesis, we investigate DMERA both formally and experimentally. We reproduce
and verify the results of Sewell and Jordan via experiments on a quantum computer.
Furthermore, we introduce a new quantum algorithm that extracts the scaling dimensions
of a critical model from measurement data using the eigenvalues of DMERA.
Acknowledgments
First, I would like to thank Michael Walter and Jonas Helsen for the great guidance and
discussions, I definitely learned a lot from them. Even though this thesis was done during a
weird time period where we had to meet in hybrid or completely digital fashion, they still made
me feel at home. Any time I had a question they were happy to answer me.
I want to thank the QuSoft quantum information group for letting me be part of the weekly
meetings and being so welcoming. In particular, I would like to thank Freek Witteveen for the
help in the beginning of the project and sharing his knowledge on tensor networks.
Thanks to Amazon we were able to perform experiments on a real quantum computer.
Through their Amazon Web Services we were given access to a wide selection of quantum
computers.
For their continued support during my studies, I want to thank my parents, sister, and
brother, for taking care of me and always showing interest in what I was working on. Lastly, I
am grateful that I got to finish my master’s degree with my dear friends who were a constant
factor in the time I was working on my thesis and who were always there in the library with
me.
Contents

1 Introduction
2 Quantum computing
  2.1 Quantum bits
  2.2 Quantum circuits
  2.3 Quantum channels
  2.4 Entanglement entropy and area law
3 Tensor networks
  3.1 MERA
    3.1.1 Formal definition
    3.1.2 Bond dimension
    3.1.3 Causal cone
    3.1.4 Entanglement entropy scaling
    3.1.5 Scaling superoperators
    3.1.6 Evaluation of observables and correlators
  3.2 DMERA
    3.2.1 Formal definition
    3.2.2 Causal cone
    3.2.3 Noise resilience
4 Implementing DMERA
  4.1 Practicalities
    4.1.1 Native gates
    4.1.2 Noise models
    4.1.3 Classical simulation of DMERA
    4.1.4 Algorithmic cooling
  4.2 Energies
    4.2.1 Classical method
    4.2.2 Quantum method and experiment
    5.2.5 Results
    5.2.6 Discussion and outlook
A Appendices
  A.1 Tensor contraction
  A.2 Telescopic decomposition
Chapter 1
Introduction
In recent years, there has been a rapid development in the world of quantum computing. Many
companies and organizations, such as Amazon [3], IBM [4], and Google [5], have shown interest
in building a quantum computer that can outperform even the fastest and largest classical
computer. The reason for this interest is the long list of potential applications, some of which
can revolutionize the digital world. Among these future applications are usages in medicine
development [6], financial modeling [7], and optimization problems [8]. Another notable entry
is the ability to break internet encryption. When Peter Shor proposed an algorithm that can
efficiently solve the mathematical problems guarding everyone's money, personal details, and
more, people began to take an interest in quantum computing and realized that this technology
could be world changing, for better or worse.
Unfortunately, the current generation of quantum computers is not advanced enough to
realize many of these potential applications. The machines do not have enough computing
power (qubits) yet, and any result obtained is unreliable due to the many errors that occur
during computation. The tasks that are run on quantum computers are limited in their size,
as errors would otherwise keep building up. John Preskill coined the term Noisy Intermediate-Scale
Quantum (NISQ) era for this generation of quantum computers [9]. He argues that even
with 100 qubits we should not expect too many great results yet; it would take orders of
magnitude more qubits to efficiently run many of the known quantum algorithms.
Figure 1.1: An example of DMERA.
systems are both examples of critical systems. MPS does not exhibit the same properties that
these critical systems have, for example scale invariance or entanglement entropy scaling. There
is, however, one type of tensor network that has the same properties as many critical systems
and thus should be able to perform well in simulating criticality.
The multiscale entanglement renormalization ansatz (MERA) is built specifically to preserve
entanglement and is designed with critical systems in mind [13, 14]. Its properties mirror those
of critical systems, such as the correct entanglement entropy scaling and the decay of correlation
functions [15].
Kim and Swingle combined both MERA and quantum computing to obtain DMERA. By
making adjustments to tailor MERA for quantum computers, they developed a method that
can be used efficiently and accurately on NISQ era quantum computers.
One big hurdle to deal with is the fact that quantum computers make many mistakes.
Luckily, DMERA is intrinsically noise resilient due to an inner layer structure. At each layer,
new pure qubits are introduced and old used qubits which contain noise are removed. As a
result, the accumulating noise during the computation stays limited and the approximation
of the ground state becomes more accurate. We will discuss the bound on the noise and a
corresponding mathematical proof in Section 3.2.3.
Furthermore, DMERA can be computed quickly and efficiently on a quantum computer.
With a small enough gate time, a quantum computation of DMERA would only take a few
seconds [1]. If we were to simulate DMERA on a classical computer instead, we would need
to store the information in memory and spend a lot of time computing the gates in DMERA. In
comparison, for a DMERA circuit depth D a quantum computer spends O(D log N) time to
compute a circuit with O(D^d) qubits, where N is the system size and d the number of spatial
dimensions, whereas a classical computer would have to spend time that scales exponentially in
O(D^d).
Although a classical computer cannot efficiently simulate DMERA for many qubits, it is
nonetheless convenient to use a classical simulation to verify the results from a real
quantum computer. A quantum computer will always be prone to errors and noise, while a
classical computer can implement DMERA without any faults. A classical simulation can thus
be used to predict both what should ideally happen and what will actually happen on a quantum
computer: we can simulate a noisy environment by choosing a noise model and applying it to
the classical simulation to replicate the errors that typically occur on a quantum computer.
With a classically simulated and a real quantum computer in hand, we can test the performance
of DMERA by preparing quantum states and measuring their energies. The more DMERA
layers we use, the better the prepared states should approximate the ground state: we should
find convergence to the true ground state energy as a function of the number of applied layers.
In Section 4.2 we discuss the method and results of this experiment.
The same experiment has been performed by Sewell and Jordan on a different quantum
computer [2]. In their results they observed a convergence to a value close to the true ground
state energy and they were able to verify the theoretical proposal of Kim and Swingle. In this
thesis, we have used the methods of Sewell and Jordan to verify and build upon their results.
Contributions
In their paper they briefly discuss the concept of scaling dimensions which are a property
of a critical model, specifically of one that is described by a conformal field theory. Scaling
dimensions can be of physical interest, so finding a way to obtain these quantities from the
measurement data would be useful. The scaling dimensions of a critical model can be corre-
sponded with the scaling dimensions of DMERA, so instead we will try to extract the scaling
dimensions of DMERA using a new quantum algorithm.
The DMERA scaling dimensions can in turn be computed from its eigenvalues. As briefly
mentioned before, DMERA is constructed from several layers, and each layer is an application
of some function Φ, such that DMERA can be described as repeated iterations of Φ. The
function Φ and its eigenvalues dictate the convergence to the ground state, which can easily
be observed in the convergence of the expectation value of an observable O (Figure 1.2). For n
iterations of Φ, the expectation value ⟨O⟩ shows an exponential decay to some final value. The
decay contains information on the eigenvalues and thus on the scaling dimensions.
Figure 1.2: Decay of the expectation value of an observable ⟨O⟩ as a function of the number of
DMERA layers n.
We propose a new quantum algorithm to extract the eigenvalues from the decay. Its key
ingredient is a sub-algorithm called ESPRIT [16], which can estimate the eigenvalues present
in a decay curve such as the one displayed in Figure 1.2. The new algorithm combines ESPRIT
with a statistical test to determine how many eigenvalues, and which ones, are most likely
present in a decay curve.
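The core idea behind the ESPRIT step can be sketched in a few lines of Python/NumPy. This is a minimal illustration, not the thesis's implementation: the function name and the example eigenvalues 0.8 and 0.5 are hypothetical. Given samples of a decay curve y_n = Σ_k c_k λ_k^n, the eigenvalues λ_k are recovered from the shift-invariance of the signal's Hankel matrix.

```python
import numpy as np

def esprit_eigenvalues(signal, num_eigs):
    """Estimate the poles lambda_k of y_n = sum_k c_k * lambda_k**n
    from the shift-invariance of the signal's Hankel matrix (ESPRIT)."""
    n = len(signal)
    m = n // 2
    # Hankel matrix: column i holds the samples y_i, ..., y_{i+m-1}.
    hankel = np.array([signal[i:i + m] for i in range(n - m + 1)]).T
    u, _, _ = np.linalg.svd(hankel)
    u = u[:, :num_eigs]  # signal subspace
    # Dropping the first row of u versus the last row differs by one shift
    # of the samples, i.e. by one factor of lambda_k per pole vector, so
    # the eigenvalues of psi are exactly the poles.
    psi = np.linalg.pinv(u[:-1]) @ u[1:]
    return np.linalg.eigvals(psi)

# Noiseless toy decay curve with two hypothetical eigenvalues 0.8 and 0.5.
steps = np.arange(30)
curve = 1.0 * 0.8**steps + 0.3 * 0.5**steps
eigs = sorted(esprit_eigenvalues(curve, 2).real, reverse=True)
```

On this noiseless rank-2 curve the recovered poles match the eigenvalues essentially to machine precision; the statistical test in the full algorithm is needed precisely because real measurement data is noisy and the number of poles is unknown.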
From Figure 1.2 we can find a few eigenvalues of DMERA, but to find more eigenvalues
and map out the eigenvalue structure we need more decay curves to draw information
from. By picking many observables and starting states, we can cover the whole operator basis
of DMERA and in theory extract every eigenvalue of DMERA. Using the new algorithm,
we are able to extract several eigenvalues from the data and relate them to the scaling
dimensions of DMERA.
To summarize: the current generation of quantum computers makes many mistakes in its
computations, so Kim and Swingle proposed an ansatz for a quantum circuit that is robust
against noise and can prepare states that are useful for the study of critical physical models.
Using the methods described by Sewell and Jordan, we have computed the energies of the
prepared states and verified both their experimental results and the theoretical proposal of
Kim and Swingle. Finally, we propose a new quantum algorithm that can extract scaling
dimensions from measurement data by finding eigenvalues of DMERA.
In this thesis we study the quantum Ising chain with a transverse field as our critical
model. This is a relatively simple model and is easily verifiable, as it has been examined with
both MERA, by Evenbly and White [17], and DMERA, by Sewell and Jordan.
Organization
In Chapter 2 we introduce basic concepts from quantum computing and information theory
which appear frequently in the thesis. In Chapter 3 we discuss the MERA and DMERA
tensor networks, first introducing MERA and its properties and then making the step to
its variant DMERA, which has similar properties to MERA. In Chapter 4 we examine several
concepts that are used for experiments on a quantum computer and classical simulations thereof,
before performing an experiment to measure the energies of states prepared by the DMERA circuit.
Finally, in Chapter 5, after examining the eigenvalue structure of DMERA, we introduce a new
quantum algorithm that can extract the eigenvalues of the DMERA channel from measurement
data, and we test its performance on simulated data.
Chapter 2
Quantum computing
Quantum computers work inherently differently from classical computers; however, many parallels
can be drawn between the two. Both use a form of bits, the building blocks
of the logical computer, and logic gates to control those bits. The gates can be combined into a
circuit which outputs a desired result. In the case of classical computing, this result is always
determined by the input (save for any incidental errors). For quantum computers, however,
the result depends on the probabilities of the possible states as prepared by the circuit.
The prepared state is in a superposition of all possible outcomes, so one trial of the circuit can
return a different output from another trial. The concept of superposition is unique to quantum
computers, and when used properly it can significantly speed up computations compared to
classical computers. Superposition is used in many quantum algorithms, such as Shor's [18]
and Grover's [19], for a great decrease in computation time over classical algorithms.
This section serves as a brief introduction to the basics of quantum computing. For further
and more thorough discussion on quantum computing and information, see Nielsen and Chuang
[20] or Watrous [21].
In this chapter, we introduce the quantum bit and its properties in Section 2.1, and
quantum gates to manipulate quantum bits in Section 2.2. We then introduce quantum
channels, which generalize the idea of sending or manipulating quantum information, in Section
2.3. Lastly, in Section 2.4, we discuss entanglement entropy, which can be used to quantify
entanglement between quantum (sub)systems.
In the computational basis, we often leave out the tensor product sign ⊗, writing |0⟩ ⊗ |1⟩ → |01⟩.
This concept extends trivially to an n-qubit system.
Density matrices
Commonly the density matrix (or density operator) ρ ∈ L(H) is used to describe a quantum state
instead of |ψ⟩. In general,

ρ = Σ_i p_i ρ_i = Σ_i p_i |ψ_i⟩⟨ψ_i|,  (2.3)

where p_i ≥ 0 are the probabilities for the states ρ_i. In addition, the trace of ρ must be one, i.e. Tr[ρ] = 1.
Formally, a density matrix ρ is a positive semidefinite operator: it is Hermitian and its
eigenvalues are non-negative. We denote this as ρ ∈ PSD(H).
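As a concrete numerical sketch (the particular ensemble below is an illustrative choice, not taken from the thesis), one can build a mixed state following Eq. 2.3 and check the defining properties:

```python
import numpy as np

# A mixed state as a convex combination of pure-state projectors,
# rho = sum_i p_i |psi_i><psi_i|  (Eq. 2.3), for an illustrative ensemble.
ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho = 0.25 * np.outer(ket0, ket0.conj()) + 0.75 * np.outer(ket_plus, ket_plus.conj())

trace = np.trace(rho)                  # must equal 1
eigenvalues = np.linalg.eigvalsh(rho)  # must all be non-negative (rho is PSD)
```

The checks mirror the definition above: unit trace, Hermiticity, and non-negative eigenvalues.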
Both the vector and density matrix formulations are equivalent; however, in some
situations one may be more convenient to use than the other. The vector formulation |ψ⟩ is
useful when the quantum system is in a pure state, such as an eigenstate of the system. The
density matrix formulation can represent both pure states and mixed states. Mixed states arise,
for example, when describing a subsystem of an entangled state, which is not possible with the
vector notation.
A unitary U acts on a density matrix as

U[ρ] = U ρ U†.

Furthermore, a unitary U has the property that U†U = I, from which it follows that the inner
product is conserved under a unitary transformation: ⟨Uψ, Uϕ⟩ = ⟨ψ| U†U |ϕ⟩ = ⟨ψ|ϕ⟩ = ⟨ψ, ϕ⟩.
Additionally, by the cyclic property of the trace we have Tr[U[ρ]] = Tr[ρ U†U] = Tr[ρ] = 1, which
implies that unitaries map quantum states to quantum states.
Commonly used quantum gates are

• Pauli spin operators $X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, $Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$

• Hadamard gate $H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$

• Phase gate $P(\phi) = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\phi} \end{pmatrix}$

• Controlled NOT gate $\mathrm{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}$
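These gates are straightforward to write down numerically. A minimal sketch (in Python/NumPy, which the thesis does not prescribe) that also verifies the unitarity property U†U = I discussed above:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def phase(phi):
    """Phase gate P(phi)."""
    return np.array([[1, 0], [0, np.exp(1j * phi)]])

CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

# Every gate satisfies U^dagger U = I, so it maps quantum states to quantum states.
for U in (X, Y, Z, H, phase(0.3), CNOT):
    assert np.allclose(U.conj().T @ U, np.eye(len(U)))
```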
Quantum circuits
Quantum circuits are the bread of quantum computing and quantum states are the butter.
Circuits act on n qubits and are a (tensor) product of quantum gates. For example, to prepare
a so-called Bell state |Φ+ ⟩ we can construct a small quantum circuit which uses only two gates.
In equation form this reads

CNOT(H ⊗ I) |00⟩ = CNOT (1/√2)(|00⟩ + |10⟩)
                 = (1/√2)(|00⟩ + |11⟩),
where H is the Hadamard gate, which brings a |0⟩-state into an equal superposition of |0⟩ and
|1⟩, and CNOT performs a spin flip on the second qubit if the first qubit is in the |1⟩-state.
However, we can also represent this operation with a diagram. In Figure 2.1 we visualize the
qubits as wires running from left to right, passing through the gates. Measuring the prepared
state in the computational basis will result in the |00⟩-state in 50% of the results and the |11⟩-state
in the other 50%. A single measurement will yield only one of the two results; therefore it is crucial
to perform many trials and measurements of the circuit to get a realistic view of the actual
prepared state.
Figure 2.1: A quantum circuit that prepares the Bell state |ψ⟩ = (|00⟩ + |11⟩)/√2 from the
starting state |00⟩. The H-gate denotes the Hadamard gate, which brings the first qubit in an
equal superposition of |0⟩ and |1⟩. The X-gate together with the dot denotes the CNOT gate,
where the first qubit is the control bit. The X-gate is performed only if the first qubit is in the
|1⟩-state.
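The same circuit can be checked numerically. A small sketch (Python/NumPy assumed; the gate matrices are the standard ones listed earlier in this section):

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I = np.eye(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

ket00 = np.array([1.0, 0.0, 0.0, 0.0])  # |00>
bell = CNOT @ np.kron(H, I) @ ket00     # = (|00> + |11>)/sqrt(2)
probabilities = np.abs(bell)**2         # 50% |00>, 50% |11>
```

The resulting probability vector reproduces the 50/50 measurement statistics described above.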
A quantum channel, however, need not be unitary. Unitarity implies reversibility, and there
exist channels that cannot be reversed. Consider a quantum channel Φ that measures a
state and then prepares the state ρ_0 = |0⟩⟨0|. Then for any ρ,

Φ[ρ] = ρ_0 = |0⟩⟨0|,

which is a valid quantum channel but not unitary.
Quantum channels can act consecutively on a state ρ. If the channel Φ is applied to ρ first
and then the channel Ψ, we write this as Ψ[Φ[ρ]] or (Ψ ∘ Φ)[ρ]. In the case that Φ = Ψ we can
alternatively write this as Φ²[ρ].
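As a sketch of both ideas at once (the Kraus-operator representation used here is a standard tool chosen for illustration, not something the text has introduced at this point), the measure-and-prepare channel Φ above can be written with Kraus operators K_0 = |0⟩⟨0| and K_1 = |0⟩⟨1|, and composing it with itself leaves its fixed point |0⟩⟨0| unchanged:

```python
import numpy as np

# Kraus operators of the reset channel: Phi[rho] = sum_k K_k rho K_k^dagger.
K0 = np.array([[1, 0], [0, 0]], dtype=complex)  # |0><0|
K1 = np.array([[0, 1], [0, 0]], dtype=complex)  # |0><1|

def reset(rho):
    return K0 @ rho @ K0.conj().T + K1 @ rho @ K1.conj().T

rho = np.array([[0.3, 0.1], [0.1, 0.7]], dtype=complex)  # an arbitrary state
out = reset(rho)          # equals |0><0| for any input
out2 = reset(reset(rho))  # Phi^2[rho]: iterating the channel, same fixed point
```

A channel with a single stable fixed point like this one is exactly the mechanism DMERA exploits: iterating the channel drives any input toward the fixed point.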
where ρ_AB is the joint state of two qubits A and B [21]. An example of a separable state would
be ρ = (1/2)(|00⟩⟨00| + |11⟩⟨11|), which is analogous to two coins that always land on the same
side together.
Entangled states are states that cannot be written as a separable state. This implies that
there is some correlation between the subsystems that is not classical. Commonly used entan-
gled states are the Bell states,
Consider the Bell state from Eq. 2.5. The entanglement entropy of register A is found by
computing S(ρ_A) = −Tr[ρ_A log ρ_A], with ρ_A = Tr_B[ρ_Bell], which gives S(ρ_A) = log 2. This
is the largest value attainable for a two-dimensional system, which is why the
Bell state is sometimes called a maximally entangled state.
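This computation is easy to reproduce numerically. A sketch (Python/NumPy assumed; the entropy is taken in natural logarithms to match S = log 2):

```python
import numpy as np

bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)  # (|00> + |11>)/sqrt(2)
rho = np.outer(bell, bell)                          # rho_Bell

# Partial trace over register B: reshape to indices (A, B, A', B')
# and trace out B against B'.
rho_a = np.trace(rho.reshape(2, 2, 2, 2), axis1=1, axis2=3)

# Von Neumann entropy S(rho_A) = -Tr[rho_A log rho_A].
p = np.linalg.eigvalsh(rho_a)
p = p[p > 1e-12]
entropy = float(-np.sum(p * np.log(p)))  # equals log 2 for the Bell state
```

The reduced state ρ_A is the maximally mixed state I/2, so its entropy is log 2, as stated above.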
Area law
If we were to pick a random state ρ from a Hilbert space H, there will generally be some
entanglement between the subsystems of ρ, which we can quantify with the von Neumann
entropy. Suppose we pick a random state ρ_rdm ∈ H for L sites of dimensionality d; then
S(ρ_rdm) ≈ L^d, i.e. the entropy scales as the volume of the system. Most of the p_i in ρ_rdm are
non-vanishing, so the entanglement entropy becomes large.

A ground state ρ_gs of a non-critical Hamiltonian H, on the other hand, will generally have only a few
relevant p_i, so the entropy instead scales as the boundary of the system, S(ρ_gs) ≈ L^{d−1}. This
is known as the area law of entanglement entropy [22]. Additionally, ground states of critical
Hamiltonians acquire a logarithmic correction, so S(ρ_gs,crit) ≈ L^{d−1} log L.
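The volume-law behavior of random states can be checked numerically for a small spin chain (d = 2). A sketch (Python/NumPy, with entropy measured in bits; the seed and system size are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def block_entropy(psi, n_qubits, block):
    """Von Neumann entropy (in bits) of the first `block` qubits of |psi>."""
    m = psi.reshape(2**block, 2**(n_qubits - block))
    p = np.linalg.svd(m, compute_uv=False)**2  # squared Schmidt coefficients
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

n = 10
psi = rng.normal(size=2**n) + 1j * rng.normal(size=2**n)
psi /= np.linalg.norm(psi)
# For a random state the block entropy grows roughly linearly with the
# block size: a volume law rather than an area law.
entropies = [block_entropy(psi, n, b) for b in range(1, n // 2 + 1)]
```

For a random 10-qubit state the half-chain entropy comes out close to the maximal 5 bits, in contrast to the bounded (area-law) entropy of non-critical ground states.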
Chapter 3
Tensor networks
Quantum systems can have exponentially large associated Hilbert spaces, which can be difficult
to deal with on a classical computer. Let us consider a joint system of 3 spins. The associated
Hilbert space of this system is H_{012} = H_0 ⊗ H_1 ⊗ H_2, which has dimensionality 2 · 2 · 2 = 2³ = 8.
The dimension of the Hilbert space for an n-spin system would be 2ⁿ, exponential in n. An
exponentially growing Hilbert space dimension is concerning if one wants to classically simulate
a quantum system. Classical computers that are used for quantum physical calculations have
to store that information in memory, which makes it difficult to handle systems with
many particles as the memory requirement quickly grows.
Fortunately, tensor networks, a type of computational method for classical computers, are
designed to tackle many-body quantum systems in a way that avoids the exponential growth
problem. By only considering a select part of the full Hilbert space, tensor networks do not
need to handle an exponentially growing space. In addition, tensor networks trade accuracy
for speed and efficiency by limiting the size of their tensors. This trade works because tensor
networks are generally used to approximate ground states. Ground states have special properties
compared to generic states, such as the entanglement entropy scaling which we discussed in
Section 2.4, which is why tensor networks can approximate ground states in a computationally
favorable way.
One of the most commonly used tensor networks is the so-called matrix product state (MPS)
[23, 24]. In short, an MPS decomposes a large tensor, e.g. a state which describes many
spin particles, into a chain of smaller tensors whose number depends on the number of particles. By
restricting the tensors' size up to a certain value χ, we decrease the amount of information in our
computations, which makes the calculations easier for a classical computer to handle. One
might think that we then lose too much information to accurately approximate the quantum
system, but by keeping the most relevant states during the size restriction, an MPS is still able to
make an accurate approximation if the value χ is chosen appropriately. For more discussion on
MPS, see the review by Schollwöck [24].
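The decomposition itself is a sequence of truncated singular value decompositions. A schematic sketch (an illustration, not the thesis's implementation) that splits an n-site state vector into a chain of tensors with bond dimension at most χ:

```python
import numpy as np

def mps_decompose(psi, n_sites, chi, d=2):
    """Split an n-site state vector into a list of MPS tensors of shape
    (left bond, physical, right bond), keeping at most chi singular
    values at every cut."""
    tensors = []
    rest = psi.reshape(1, -1)  # (bond, remaining sites)
    for _ in range(n_sites - 1):
        bond = rest.shape[0]
        m = rest.reshape(bond * d, -1)
        u, s, vh = np.linalg.svd(m, full_matrices=False)
        k = min(chi, len(s))               # truncate to bond dimension chi
        tensors.append(u[:, :k].reshape(bond, d, k))
        rest = s[:k, None] * vh[:k]        # absorb singular values to the right
    tensors.append(rest.reshape(rest.shape[0], d, 1))
    return tensors

# A GHZ-like state (|00..0> + |11..1>)/sqrt(2) has Schmidt rank 2 at every
# cut, so chi = 2 reproduces it exactly.
n = 6
psi = np.zeros(2**n)
psi[0] = psi[-1] = 1 / np.sqrt(2)
mps = mps_decompose(psi, n, chi=2)
```

For generic states the truncation at χ discards the smallest Schmidt coefficients, which is exactly the "keeping the most relevant states" step described above.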
In this chapter, we will introduce and discuss the multiscale entanglement renormalization
ansatz (MERA), a different type of tensor network. Later, we will introduce the deep MERA
(DMERA), a variant on MERA which is suited to be used on a quantum computer.
3.1 MERA
The matrix product state is not efficient at dealing with critical systems, as its properties (e.g.
entanglement entropy scaling, lack of scale invariance, decay of correlation functions) do not
align with the properties of critical systems. MPS can still approximate the ground state of a
critical system, but cannot capture all the behaviors of such a system.
The multiscale entanglement renormalization ansatz (MERA) is a tensor network that is
based on the concepts of the renormalization group and entanglement, and aims to handle
(one-dimensional) critical Hamiltonians in a more efficient manner than MPS [14, 25, 26, 27]. MERA
has several properties which coincide with those of critical systems, which
we will discuss in detail in the following section.
In this section, we will first formally define a translation and scale invariant one-dimensional
binary MERA, and discuss several aspects such as bond dimension and causal cone. Then we
will look at scale invariance in more detail and define the ascending/descending superoperators
in Section 3.1.5. Lastly, we discuss the evaluation of expectation values of observables in Section
3.1.6.
U = U_{T−1} U_{T−2} ··· U_0

with U_i : H_{L_i} → H_{L_{i+1}}, which are referred to as (MERA) layers. The state then transforms
with each layer,

|ψ_0⟩ → |ψ_1⟩ → ... → |ψ_T⟩,

where |ψ_{i+1}⟩ = U_i |ψ_i⟩. Each U_i is described by two tensors, u and w. The u-tensors, also called
disentanglers, are described by
As MERA is a tensor network, we can represent the network using the diagrammatic tensor
notation, see Appendix A.1. A graphical representation is shown in Figure A.1. The lattice
points in L_i are represented as wires and the tensors as (x_in, x_out)-tensors, where x_in is the
number of wires going into the tensor and x_out the number of wires leaving it. By translational
invariance each isometry w is the same, which implies that all ingoing
wires must have the same dimension d and vector space C^d. By the definitions in Eq. 3.1 and
3.2 we see that the w-tensors are of type (1, 2) and the u-tensors of type (2, 2).
The tensors are laid out in a brick wall structure where the u-tensors entangle neighboring
wires.
Here we have explicitly defined a specific MERA scheme, a binary MERA, by choosing
the isometries to map from C^d to (C^d)^⊗2. One can also define a ternary MERA by letting
the co-domain of the isometry be (C^d)^⊗3 [28]. Other schemes also exist, such as a modified
binary MERA and a branching MERA [26, 27], and even generalizations to higher dimensions [14].
These schemes differ in contraction cost scaling and in how efficiently they reach a given accuracy.
In this thesis we will assume that every MERA is binary, for simplicity.
Figure 3.1: The effects of MERA transformations on a small and large scale. (a) Two MERA
layers. The isometries are represented by the blue tensors and the disentanglers by the green
tensors. The dimension of the tensors is determined by the bond dimension d. (b) Consecutive
MERA layers which show the fine-graining effect of MERA transformations. Each layer
introduces new sites and entangles them with the existing sites. The inner structure of the two
layers is explicitly shown in (a).
Criticality properties
In general, we want MERA to have properties similar to those of the systems we study, namely
critical systems. Two of those properties are translation and scale invariance. Earlier, we
asserted that all isometries and disentanglers within a layer U_i are the same. This choice ensures that
the structure is invariant under shifting the positions of the tensors. One can also let all tensors
be different and freely adjust each tensor to optimize the network. For scale invariance, all U_i
consist of the same tensors {w, u}; each layer U_i is then independent of the scale. By requiring
these invariance properties, for a given local dimension d and system size N,
we lower the number of parameters needed to describe the tensors from O(d^4 N) to O(d^4 log_2 N) by
translational invariance, and even further to O(d^4) by scale invariance. We will return to scale
invariance of MERA in Section 3.1.5.
At this point, we should look at the similarities between quantum circuits and the given
definition of MERA. Quantum circuits implement a transformation from an initial n-qubit state
|ψ_0⟩, by convention |0⟩^⊗n, to a final state |ψ_f⟩ by applying various quantum gates to the qubits.
Similarly, MERA transforms an initial state |ψ0 ⟩ to a final state |ψT ⟩ by repeatedly applying
quantum circuits. If we think of MERA as a particular quantum circuit, then we can interpret
the lattice points as qubits and the {w, u} unitaries as two-qubit quantum gates. Precisely this
interpretation is what makes MERA a particularly good and natural candidate for a tensor
network that can be implemented on a quantum computer.
If we increase d, then the number of parameters increases, and with it the accuracy of MERA.
Note that grouping sites together is purely for computational purposes and, in most cases,
has no physical meaning. However, this does not mean that we lose physical meaning: any
observable that we may be interested in can be transformed to this grouping ansatz. For
example, a d-dimensional observable h can be appropriately embedded with s − 1 identities such
that it becomes a d^s-dimensional observable, that is,

h ↦ h′ = I ⊗ ... ⊗ I ⊗ h ⊗ I ⊗ ... ⊗ I,

where the I are d × d identities.
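This embedding is just a sequence of Kronecker products. A small sketch (the function name is an illustrative choice):

```python
import numpy as np

def embed(h, s, position, d=2):
    """Embed a d-dimensional observable h into the d**s-dimensional
    grouped site, acting nontrivially on one of the s original sites."""
    op = np.eye(1)
    for site in range(s):
        op = np.kron(op, h if site == position else np.eye(d))
    return op

Z = np.diag([1.0, -1.0])
h_prime = embed(Z, s=3, position=1)  # I (x) Z (x) I, an 8x8 observable
```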
The grouping ansatz can be generalized by transitional layers, which are placed before
the scale/translation invariant MERA. The transitional layers are allowed to have arbitrary
isometries and disentanglers, with arbitrary dimensions, that generally differ from the ones
used in the MERA. As such, they can be freely adjusted in a method called tensor optimization,
an iterative process in which each individual tensor is optimized such that the following
MERA better approximates the ground state energy. We can consider a transitional layer that
contains isometries (or identities) that simply group two sites together, that is w : C^d ⊗ C^d →
C^{d²}, which can be extended to allow an arbitrary number of sites to be grouped together.
Besides this specific application of transitional layers, we will not make further use of them.
For more discussion on transitional layers and tensor optimization, see [26].
Figure 3.3: Relative error in ground state energy of a select few critical spin models, for both
MERA and MPS. The energy error scales polynomially with the bond dimension for both tensor
networks. Figure from [26].
at the bottom of the network and compute the expectation value of that observable (which we
will do later in Section 3.1.6), only certain tensors can affect the state that ends up at the
observable. These tensors are part of the causal cone, as shown in Figure 3.4. The notion of
a cone of causality is not unique to MERA; it is similar to the light cone found
in the theory of relativity. Events (tensors) that happen outside the cone cannot influence the
present (observable).
Figure 3.4: The causal cone of a three-site observable (obs) in MERA. The cone has a fixed
width for this operator.
The MERA contains tensors that are not causally connected to an observable O (i.e. not
in the causal cone), so there will be situations where u directly contracts with u† or w with
w†, see Figure 3.6. As defined earlier, u†u = uu† = I and w†w = I, so these
contractions contribute nothing to the computation and we can safely ignore them. We can say
that the tensors that do not annihilate with each other are the elements of the causal cone.
Fixed widths
A practically useful property of the causal cone of a MERA is that the width, in terms of the
number of sites, will converge to a fixed value which depends on the support of the observable.
Let us consider a three-site observable O (Figure 3.6). After the contraction of all the tensors
that do not affect O, there are only three sites left that did not contract to identity, so the
causal cone of a three-site observable also has a width of three sites. This is one of the few
fixed values, but we can write down a general formula to find the other fixed values. We will
investigate this by seeing what the effect of one U_i† transformation is on n sites, or rather, how
the width of the causal cone of n sites changes. We will separate the cases of odd and even
numbers of sites.
For an odd number of sites, n sites will connect to ⌈n/2⌉ disentanglers (u†) which in turn
connect to ⌈n/2⌉ + 1 isometries (w†). The width of a causal cone with n sites as its base is
therefore ⌈n/2⌉ + 1 sites after going up one layer,

    n → ⌈n/2⌉ + 1.    (3.3)
As an example for n = 1, the single site must connect to a single disentangler which connects
to 2 isometries, so the causal cone now has a width of 2 sites. Similarly, n = 3 has a cone
width of 3 sites, n = 5 a cone width of 4 sites and n = 7 a cone width of 5 sites. An interesting
observation we can already make is that a causal cone of 3 sites has a fixed width.
The counting works differently for an even number of sites. Instead, n sites can connect to
either n/2 or n/2 + 1 disentanglers depending on which n sites are being considered. Let us call
an observable aligned if it follows the former case and disaligned for the latter, see Figure
3.5. An aligned observable will then connect to n/2 + 1 isometries and a disaligned observable
to n/2 + 2 isometries. The causal cone of n sites has a width of either n/2 + 1 or n/2 + 2 after
going up one layer,

    n → n/2 + 1    (aligned),    (3.4)
    n → n/2 + 2    (disaligned).    (3.5)
This implies that the cone of n = 2 has a width of 2 or 3 sites, n = 4 a cone width of 3 or 4
sites and n = 6 a cone width of 4 or 5 sites. Again, we observe that 2 and 4 sites can have a
causal cone with a fixed width, depending on the exact positions of those sites in relation to
the MERA.
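The width recursion above is easy to check numerically. The following is a small sketch (the function name next_width and the iteration loop are ours, not code from the thesis):

```python
import math

def next_width(n, aligned=True):
    """Causal-cone width after ascending one MERA layer (Eqs. 3.3-3.5)."""
    if n % 2 == 1:                    # odd: n -> ceil(n/2) + 1
        return math.ceil(n / 2) + 1
    return n // 2 + (1 if aligned else 2)  # even: aligned / disaligned case

# Ascend from a 7-site observable; the width settles at the fixed value 3.
widths, n = [], 7
for _ in range(6):
    n = next_width(n)
    widths.append(n)
print(widths)  # [5, 4, 3, 3, 3, 3]
```

The same function reproduces the even fixed points: next_width(2, aligned=True) returns 2 and next_width(4, aligned=False) returns 4, matching the observation that 2 and 4 sites can have a fixed-width cone depending on alignment.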
Figure 3.5: A four-site operator that is (a) aligned and connected to two disentanglers; or (b)
disaligned and connected to three disentanglers.
We have seen that 2 and 4 sites can have a causal cone with a fixed width in some situations,
whereas 3 sites are guaranteed to have a causal cone of constant width. Given that there
are n_i sites on each lattice L_i, we would otherwise have had to compute O(n_i²) tensors for each
MERA layer. Now, in the worst case, we only have to consider up to 7 tensors per layer that
contribute to the expectation value of an observable. This significantly reduces the number of
tensors that take part in computations, and consequently reduces the computation time.
Figure 3.6: Contraction to identity of tensors that are not causally connected to the three-site
operator O.
Superoperators
To make the concept of a causal cone more precise, we turn to ascending superoperators. Up
until this point we have imagined the MERA to transform an initial state to a final state, |ψf⟩ =
U |ψ0⟩. If we computed the expectation value of an observable O, this would be ⟨ψf| O |ψf⟩.
However, from a mathematical perspective it is equivalent to transform the observable O by
absorbing U and U† to obtain the transformed observable O′, that is

    ⟨ψf| O |ψf⟩ = ⟨ψ0| U† O U |ψ0⟩ = ⟨ψ0| O′ |ψ0⟩.    (3.6)
However, as we know, not all tensors in U are involved with O. Precisely which tensors are
involved depends on the support of O, or rather, on which exact sites support O. As an
example, a two-site operator that is directly connected to two disentanglers will then connect
to three isometries, whereas a two-site operator that is connected to just one disentangler will
connect to two isometries. In the former case, O′ now has a larger support than O; ideally we
want the transformed operators to retain the same support.
For now, let us consider three-site operators, although this argument can be extended to any
odd number of sites. A local operator O_{i+1}^{[r−1, r, r+1]} on lattice L_{i+1} that is supported on adjacent
sites [r − 1, r, r + 1] can be transformed to the lattice L_i by the ascending superoperator S
such that

    O_i^{[r′−1, r′, r′+1]} = S[O_{i+1}^{[r−1, r, r+1]}],    (3.7)

where [r′ − 1, r′, r′ + 1] are adjacent sites on lattice L_i. It can be ambiguous how exactly O is
transformed with respect to the left and right sides of the disentanglers, so instead we will average
over both possibilities such that S = (S_L + S_R)/2. Eq. 3.7 can also be expressed in diagrammatic
notation, see Figure 3.7. We can take this figure to be the definition of how S acts on three-site
operators.
Figure 3.7: The ascending superoperator S acting on a three-site operator. Both possibilities
SL and SR need to be averaged over.
where N_T^{tr} = Σ_{z=0}^{T−1} n_z^{tr}.
From the causal cone discussion in Section 3.1.3, we know that the fixed width of the causal
cone of 3-site observables is 3 sites. N_T^{tr} is proportional to the number of scales that are not at
the fixed width yet. Starting with L sites, it takes roughly log₂ L iterations of D to reduce the
cone width to 3 sites. To be more precise, we take N_T^{tr} = 2 log₂(L − 2). Finally, we obtain the
entropy of ρ_T,

    S(ρ_T) ≤ log₂(d) (3 + 2 log₂(L − 2)).    (3.13)
The entropy scaling in Eq. 3.13 coincides with the entropy scaling for critical ground states.
This implies that MERA is suited for approximating these ground states and capturing the
critical behavior.
The ascending superoperator has a spectral decomposition

    S[ϕ_a] = λ_a ϕ_a,    (3.14)

for eigenvalues λ_a and right eigenoperators ϕ_a [15]. ϕ̂_a are the left eigenoperators of S and by
duality the right eigenoperators of D, the dual of S. We identify ϕ_a as operators/observables
that sit inside the MERA and ϕ̂_a as "starting states" of MERA. The associated eigenvalues λ_a
are the same for both ϕ_a and ϕ̂_a. In general, ϕ̂_a need not be proper quantum states. In fact,
only the first left eigenoperator is a quantum state.
S scales its eigenoperators by a factor λ_a with each application. Sometimes we will talk
about the scaling dimension ∆_a = −log₂ λ_a. We can decompose any operator O in terms of
ϕ_a,

    O = Σ_a ⟨ϕ_a, O⟩ ϕ_a = Σ_a Tr[ϕ̂_a O] ϕ_a,    (3.15)

where ⟨·, ·⟩ is the Hilbert–Schmidt inner product. This then allows us to write the action of S
on an arbitrary operator O as

    S[O] = Σ_a λ_a Tr[ϕ̂_a O] ϕ_a.    (3.16)
Due to the way we have defined MERA, it follows that S is unital. Let us again look
at Figure 3.7, but now consider operator O to be the three-site identity operator I ⊗ I ⊗ I.
Consequently, the tensors can contract with each other to identity and we are left with three
straight wires, so S[I] = I. This implies that the identity is an eigenoperator with eigenvalue
λ_I = 1. For the other eigenvalues λ_a we will assume that they are strictly less than 1.
There must then also be a fixed point of D that has the same eigenvalue λ_I = 1 due to
the dual nature of S and D, shown in Eq. 3.9. There exists a density matrix ρ* such that
D[ρ*] = ρ*, so repeated application of D converges an arbitrary density matrix ρ to the fixed
point ρ*,

    lim_{T→∞} (D ∘ ⋯ ∘ D)[ρ] = ρ*,    (3.17)

with T applications of D on the left-hand side.
The fixed point ρ∗ depends on the MERA and its tensors w, u. By choosing the right construc-
tion for the tensors we can have ρ∗ be the ground state of a (critical) Hamiltonian.
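The convergence in Eq. 3.17 can be illustrated with a toy channel. In this sketch the single-qubit depolarizing channel stands in for D; it is not the MERA descending superoperator, merely a unital channel with a single fixed point:

```python
import numpy as np

# Pauli matrices
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I = np.eye(2, dtype=complex)

def channel(rho, p=0.3):
    """A toy unital channel with a unique fixed point (depolarizing channel),
    standing in for the descending superoperator D."""
    return (1 - p) * rho + (p / 3) * (X @ rho @ X + Y @ rho @ Y + Z @ rho @ Z)

rho = np.array([[1, 0], [0, 0]], dtype=complex)  # arbitrary start state |0><0|
for _ in range(50):
    rho = channel(rho)                            # D ∘ ... ∘ D [rho]
# The iteration converges to the unique fixed point, here rho* = I/2
print(np.allclose(rho, I / 2))  # True
```

The traceless part of the input is multiplied by a factor of modulus less than 1 at each step, so any starting state is driven to the fixed point, exactly the mechanism used in Eq. 3.17.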
Observables
We will keep the assumption that the observable O is a three-site operator. From Section
3.1.5 we know that S[O] = Σ_a λ_a Tr[ϕ̂_a O] ϕ_a, so then S^k[O] = Σ_a λ_a^k Tr[ϕ̂_a O] ϕ_a. The expec-
tation value of an observable O after k layers is ⟨O⟩_{ρ_k} = Tr[O ρ_k] = Tr[S^k[O] ρ_0]. Combining
both expressions for S^k[O] and ⟨O⟩_{ρ_k} we get

    ⟨O⟩_{ρ_k} = Tr[(Σ_a λ_a^k Tr[ϕ̂_a O] ϕ_a) ρ_0]
              = Σ_a λ_a^k Tr[ϕ̂_a O] Tr[ϕ_a ρ_0].    (3.18)
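The λ^k decay in Eq. 3.18 can be checked on a toy superoperator. Here we use a channel that is diagonal in the Pauli basis (our own construction, not the MERA superoperator), so that X is an eigenoperator with a known eigenvalue:

```python
import numpy as np

lam = 0.7                                   # a subleading eigenvalue, |lam| < 1
X = np.array([[0, 1], [1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def S(O):
    """Toy unital superoperator: S[I] = I and S[O] = lam * O on traceless O."""
    c = np.trace(O) / 2                     # identity component of O
    return c * I2 + lam * (O - c * I2)

rho0 = (I2 + 0.9 * X) / 2                   # starting state with Tr[X rho0] = 0.9
O, k = X, 5
for _ in range(k):
    O = S(O)                                # S^k[X] = lam^k X
expval = np.trace(O @ rho0).real            # <X> after k layers
print(np.isclose(expval, lam**k * 0.9))     # True: decays as lam^k Tr[X rho0]
```

The expectation value after k applications is exactly λ^k times the initial overlap, the single-eigenoperator instance of Eq. 3.18.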
where now ∆α = log2 (λα ). There might be an additional factor (λα λβ )−c for c = 0, 1, 2, ... due
to the exact placement of the operators and overestimation of the number of iterations of S.
The most important feature of this expression is the polynomial decay of the correlator with
respect to the scaling dimensions. This polynomial behavior of the correlations is also present
for systems at the critical point and this regime is precisely what we are interested in.
To summarize, due to the way MERA is structured, we can generalize the tensor network to
a repeated application of superoperators, which stems from the fact that the causal cone of an
operator can have a constant width. These superoperators come with a set of eigenoperators
and eigenvalues which describe the behavior of a general operator or state. Only one of these
eigenoperators has eigenvalue λ = 1, which corresponds to the fixed point of MERA. Addition-
ally, we discovered that MERA has similar properties to systems at the critical point, namely
for the entanglement entropy scaling and the decay of the correlator. Based on these properties,
MERA looks like a better candidate for studying critical systems than MPS, which does not
have the same properties.
3.2 DMERA
Now that we have established MERA we can move to DMERA. As hinted in the formal in-
troduction of MERA (Section 3.1.1), we can interpret the network in a quantum manner by
replacing the tensors with quantum gates and the legs with qubits. In this section, we formalize
this concept and discuss several unique features of DMERA.
DMERA was proposed by Kim and Swingle in 2017 as a method to study strongly
interacting quantum many-body states at criticality on a noisy quantum computer [1]. They
suggest that DMERA has several strong qualities – of which we will discuss a few in this section
– namely fast contraction on a QC, approximation of a wide variety of physical ground states,
resilience to noise and good approximation of the ground state energy in the presence of noise.
In this section we will formally introduce DMERA and its differences from MERA, then we
will discuss how the fixed width of the causal cone changes with larger depth D, and finally we
prove that DMERA with noisy gates has an upper bound on the cumulative error.
where U_i is a depth-D quantum circuit (layer) and |0^{2^i}⟩ contains 2^i qubits in the |0⟩-state which
are placed alternating with the qubits from |ψ_i⟩ [1]. Alternatively, we can merge U_i with |0^{2^i}⟩,
Figure 3.8: One DMERA layer with depth D = 4. The wires are qubits and the tensors are
two-qubit quantum gates. At each isometry (blue) a new qubit in the |0⟩-state is inserted.
All layers U_i are constructed from the same set of two-qubit unitaries {w, u_1, u_2, ..., u_{D−1}},
where the w-unitaries are applied first to |ψ_i⟩ (interspersed with |0⟩-qubits), then the u_1-unitaries
and so on, in a brick-wall structure, see Figure 3.8.
So far, we have been more or less repeating the definition of MERA from Section 3.1.1.
Many properties and tools that apply to DMERA can be copied directly from MERA, such as
translation and scale invariance or evaluation of observables. However, there are some subtle
differences. For example, we consider DMERA to contain unitaries instead of tensors because
we exclusively use two-qubit unitary gates. Another difference is that we explicitly introduce a
qubit in the |0⟩-state at every isometry, instead of letting the isometry be general.
Depth
However, there is one major difference: DMERA is allowed to have more than two unitary
layers in each Ui . This is necessary as the DMERA works with qubits, which have a fixed local
dimension d = 2. We cannot group the qubits together to increase the bond dimension as was
the case for MERA, since DMERA needs to be a valid quantum circuit to work on a quantum
computer. Nonetheless, the extra unitary layers in U_i allow us to increase the bond dimension
and accuracy of DMERA.
We will now show that any DMERA can be formulated as a MERA, and vice versa. Consider
a DMERA with n qubits and a depth D = 2. This simply corresponds to a MERA with d = 2.
Now consider a DMERA with depth D = 3. To relate this to a MERA, let us group the
unitaries {w, u1 , u2 } together such that we get new effective unitaries {w∗ , u∗ }, see Figure 3.9
[17]. These unitaries still follow the MERA definition from Eq. 3.1 and 3.2,
so they are a valid set of tensors. This now allows us to write down a MERA with d = 4.
Similarly, we can do this for any D to obtain a MERA with d = 2^{D−1} by continuing this
"triangular" grouping. By going the opposite way, a MERA with d = 2^{D−1} can be formulated
as a DMERA with depth D.
Figure 3.9: Structure with 3 layers transformed to an ordinary MERA with increased local
dimension.
where

    Φ_i[O] = ⟨0^{2^i}| U† O U |0^{2^i}⟩.    (3.30)
Mathematical argument
We assume that the DMERA channel Φ has only one eigenvalue λ such that |λ| = 1.
Furthermore, we assume that the noisy channel Φ̃ is unital but can have more eigenvalues with
modulus 1.
In this section we will make use of the induced operator norm, given by

    ∥Φ∥_{∞→∞} = sup_{O ≠ 0} ∥Φ[O]∥_∞ / ∥O∥_∞,    (3.32)

for any quantum channel Φ. For the rest of this section we will only use the induced operator
norm, so for simplicity we will write ∥Φ∥_{∞→∞} as ∥Φ∥_∞.
To formally analyze the noise resilience, we will first need to introduce the noisy version of
Φ, namely Φ̃. The noisy counterpart is constructed with noisy gates which have some error
given by
∥Γ − Γ̃∥∞ ≤ ϵ, (3.33)
where Γ is a two-qubit gate and ϵ is the error rate.
We would like to express the error between Φ and Φ̃. We do so by replacing a noiseless gate
with a noisy gate and observing the effect on the error. Intuitively, for an N-gate quantum circuit
we might assume that we need to apply Eq. 3.33 once for each gate we replace. As we will see,
this intuitive idea is correct.
First, let us examine a simple circuit with 2 consecutive gates, namely ∥Γ₂Γ₁ − Γ̃₂Γ̃₁∥, and
express the error rate as

    ∥Γ₂Γ₁ − Γ̃₂Γ̃₁∥ = ∥Γ₂Γ₁ − Γ₂Γ̃₁ + Γ₂Γ̃₁ − Γ̃₂Γ̃₁∥
                  ≤ ∥Γ₂(Γ₁ − Γ̃₁)∥ + ∥(Γ₂ − Γ̃₂)Γ̃₁∥
                  ≤ ∥Γ₁ − Γ̃₁∥ + ∥Γ₂ − Γ̃₂∥
                  ≤ 2ϵ,    (3.34)

where in the first line we effectively added 0, from the second to the third line we used the
triangle inequality together with ∥Γ∥_∞ ≤ 1, and finally we used Eq. 3.33. Similar arguments
can be used to show that ∥Γ₂ ⊗ Γ₁ − Γ̃₂ ⊗ Γ̃₁∥ ≤ 2ϵ. For a circuit with N consecutive gates we
can write the error rate as

    ∥Γ_N ⋯ Γ₁ − Γ̃_N ⋯ Γ̃₁∥ ≤ Nϵ.    (3.35)

In particular,

    ∥Φ − Φ̃∥_∞ ≤ O(ϵD²),    (3.36)

as the DMERA channel Φ contains D layers and O(D) gates within each layer, so in total Φ
holds O(D²) gates.
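The triangle-inequality argument can be verified numerically. The sketch below uses the spectral norm on random unitaries as a stand-in for the induced channel norm, and a coherent perturbation as the noise model (both are our illustration choices, not the thesis's noise model):

```python
import numpy as np

rng = np.random.default_rng(1)

def random_unitary(n):
    """Haar-ish random unitary via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
    return q @ np.diag(np.diag(r) / np.abs(np.diag(r)))

def expm_i(h):
    """exp(iH) for Hermitian H via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(h)
    return (vecs * np.exp(1j * vals)) @ vecs.conj().T

def perturb(gate, eps):
    """A noisy gate whose operator-norm error stays below eps."""
    h = rng.normal(size=gate.shape) + 1j * rng.normal(size=gate.shape)
    h = (h + h.conj().T) / 2
    h *= 0.5 * eps / np.linalg.norm(h, 2)   # ||I - e^{iH}|| <= ||H|| = eps/2
    return gate @ expm_i(h)

eps = 1e-3
G1, G2 = random_unitary(4), random_unitary(4)
G1t, G2t = perturb(G1, eps), perturb(G2, eps)
e1 = np.linalg.norm(G1 - G1t, 2)            # per-gate errors, each <= eps
e2 = np.linalg.norm(G2 - G2t, 2)
e12 = np.linalg.norm(G2 @ G1 - G2t @ G1t, 2)
print(e1 <= eps and e2 <= eps and e12 <= e1 + e2)  # True
```

The composed error never exceeds the sum of the individual gate errors, which is the content of Eq. 3.34 and the reason the N-gate bound is linear in N.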
As we are particularly interested in the decaying property of Φ, we are free to eliminate
the stationary part of the operator O, i.e. by removing the identity component which corresponds
to the largest eigenvalue of 1. Let us define the traceless operator Ō = O − (Tr(O)/d) I. Using
this operator, we can write

    Φ̃ⁿ[Ō] − Φⁿ[Ō] = Φ̃ⁿ[O] − Φⁿ[O].    (3.37)
Let us introduce Φ_(1) which is defined as

    Φ_(1) = Σ_{k:|λ_k|=1} λ_k P_k,    (3.38)

where P_k is the projection for λ_k (Eq. 6.13 from [30]). Given that Φ has only one eigenvalue
with modulus 1, there is only one P_k = cI and Φ_(1) projects any operator onto the identity
operator, i.e. Φ_(1)[O] = cI. Using this projector and the traceless operator Ō we can write

    Ψ := Φ − Φ_(1), with Ψⁿ = Φⁿ − Φ_(1),    (3.39)

which has a spectral radius of µ = max{|λ| : λ ∈ spec(Φ), |λ| < 1}. In other words, µ is the
second-largest eigenvalue of Φ in modulus.
Now that we have Eq. 3.39 we can give an upper bound on ∥Ψⁿ∥_∞. Using Theorem 8.23
from [30] we know that for any norm

    ∥Ψⁿ∥ ≤ C µⁿ n^{d_µ − 1},    (3.40)

where C is a constant independent of n and µ, and d_µ is the size of the Jordan block corresponding
to µ. In general, we do not know the size of the Jordan block of an arbitrary eigenvalue of our
channel, and especially not the Jordan block corresponding to µ. However, for any µ < ν < 1
there is an n₀ such that νⁿ ≥ µⁿ n^{d−1} for n ≥ n₀. Indeed,

    νⁿ ≥ µⁿ n^{d−1}
    ⟺ n log ν ≥ n log µ + (d − 1) log n
    ⟺ n log(ν/µ) ≥ (d − 1) log n
    ⟺ n / log n ≥ (d − 1) / log(ν/µ),    (3.41)

and since n/log n grows without bound, such an n₀ exists. Hence, absorbing the constant C,
for n ≥ n₀ we have

    ∥Ψⁿ∥_∞ ≤ νⁿ.    (3.42)
For the proof and discussion of the telescopic decomposition, see Appendix A.2.
We can find the upper bound of the error after n layers as follows,

    ∥Φ̃ⁿ − Φⁿ∥_∞ = ∥ Σ_{k=1}^{n} Φ̃^{k−1} ∘ (Φ̃ − Φ) ∘ Φ^{n−k} ∥_∞
                ≤ Σ_{k=1}^{n} ∥Φ̃^{k−1}∥_∞ · ∥Φ̃ − Φ∥_∞ · ∥Φ^{n−k} − Φ_(1)^{n−k}∥_∞
                ≤ ∥Φ̃ − Φ∥_∞ · Σ_{k=1}^{n} ν^{n−k}
                ≤ O(ϵD²) Σ_{k=1}^{n} ν^{n−k}
                ≤ O(ϵD²) / (1 − ν).    (3.44)

From the second to the third line, we used that Φ̃^{k−1} is norm-nonincreasing. Next, we used Eq.
3.36 and by using Eq. 3.39 we can apply the upper bound from Eq. 3.42. Finally, we notice
that the sum is upper bounded by the geometric series of ν.
The result from Eq. 3.44 tells us that the upper bound on the error of observables is inde-
pendent of n, the number of DMERA layers. Additionally, we expect the error to decrease
with additional DMERA layers, since we discard qubits that have accumulated errors and introduce
new qubits at each layer. Therefore, we should expect that increasing the number of layers is
beneficial to estimating the ground state energy, based on both the scaling dimension analysis
and the upper bound found above. Another interesting consequence of this result is that if
there exists a noiseless DMERA that can approximate the ground state energy, then there
exists a noisy DMERA that approximates the ground state energy up to an error of
O(ϵD²)/(1 − ν).
Experimental argument
Now that we have proven an upper bound on the noise, let us consider a different argument
for the noise resilience of DMERA. With each DMERA layer, or channel iteration, we introduce
pure qubits and throw away qubits that potentially contain errors. As a result, the errors
cannot grow beyond a certain size, which is exactly the result derived in Eq. 3.44. Instead of
considering many (noisy) layers, let us see what happens with only one noisy layer followed by
several noiseless layers. In Figure 3.10 we can see the effect of the error from a single noisy
channel iteration on the ground state energy after several iterations of the noiseless channel.
The difference between the measured ground state energy Ẽ and the exact ground state energy
E₀ decays exponentially for all noise levels. From this we can see that noise tends to spread
out and get eliminated in subsequent layers. The two-qubit gate infidelity noise model, described
later in Section 4.1.2, was used with various levels of noise.
Figure 3.10: Decay of noise after an initial noisy layer followed by noiseless layers. This
experiment was performed for several levels of noise indicated by the noise parameter σ 2 .
Chapter 4
Implementing DMERA
We have laid the groundwork for implementing DMERA on a quantum computer in the previous
chapter. What is left is to discuss the practical side of the implementation, for example the
specific quantum gates that will be used. To test the implementation, we verify
the performance by computing the energy of the prepared states and comparing the results to
those of Sewell and Jordan [2].
For this thesis, we have chosen the trapped ion QC developed by IonQ which currently
contains 11 qubits [31]. The access to this QC is facilitated by the Amazon Braket service.
This particular QC has the advantage that for all qubits the fidelity is relatively high compared
to its competitors. In addition, the QC has all-to-all connectivity so any two qubits can interact
with each other without the use of SWAP gates.
In a recent study in 2021, Sewell and Jordan successfully managed to implement DMERA
to prepare the ground state of the quantum Ising chain model on the Honeywell H0 trapped
ion QC [2]. The machine has six qubits that are also all-to-all connected. The advantage of this
computer over the IonQ machine is that it allows for mid-circuit qubit resets. Any qubit that
has been used earlier in the quantum circuit can be reset and used again in the computation
which is extremely useful for DMERA implementations as old qubits can be reset and inserted
at a new layer. Mid-circuit resets thus allow for an arbitrary number of layers. In this thesis we
will adapt their DMERA implementation and compare our obtained results with their results.
In this chapter, we will discuss the practicalities of running DMERA on both a simulated
and real quantum computer, such as compiling a DMERA circuit in terms of native gates before
performing the circuit on a quantum computer. We then measure the energy of the states that
were prepared by circuits with various number of layers to test the performance of DMERA.
4.1 Practicalities
4.1.1 Native gates
Quantum computers have a preference to work with so-called native gates. These are the basic
operations that a quantum computer uses, depending on its type of design. Any unitary that
is given to a quantum computer will first be compiled to a series of native gates. To avoid any
unnecessary errors, we will want to present a DMERA circuit to the computer that is already
comprised of native gates. Our circuit only consists of the w and u unitaries, so we will want
to decompose them in terms of native gates (for now we do not consider the observable as part
of the circuit). The two native gates we will use for this are the Mølmer–Sørensen XX gate (MS)
and the Z-axis rotation gate (R_z), as they are used on the IonQ quantum computer [31].
The R_z gate is defined as

    R_z(θ) = e^{−i(θ/2)Z} = ( e^{−iθ/2}    0
                              0            e^{iθ/2} ),    (4.1)
where Z is the Pauli Z spin operator. Rz rotates a single qubit around the z-axis. On a quantum
computer this operation is particularly useful since it does not actually have to be performed.
Instead, it can be performed virtually by updating the angles of the following gates, which is
equivalent to rotating the qubit [32]. For this reason the Rz gate adds practically no runtime
or errors.
The MS gate is a two-qubit gate that can be implemented particularly well on an ion-trap
quantum computer and is shown to allow for efficient construction of quantum circuits [33].
The XX MS gate, also known as the Ising XX coupling gate, can be defined as

    XX_{mn}(θ) = e^{−i(θ/2) X_m X_n}
               = ( cos(θ/2)     0            0            −i sin(θ/2)
                   0            cos(θ/2)     −i sin(θ/2)  0
                   0            −i sin(θ/2)  cos(θ/2)     0
                   −i sin(θ/2)  0            0            cos(θ/2) ),    (4.2)

where X_m is the Pauli X spin operator on the m-th qubit. X_m and X_n commute with each
other, so XX_{mn} is invariant under interchange of qubits m and n.
With these two native gates, we can construct a native gate decomposition for the DMERA
gates w and u, as described in [2]:
where Rz,n acts on the n-th qubit. Note that the angle dependence resides only in the rotation
gate which is implemented as a virtual operation.
Sewell and Jordan researched two different native gate decompositions, namely Eq. 4.3
and another with angle dependence in the XX gate [2]. The latter one was found to be more
susceptible to noise than the former one, hence Eq. 4.3 has been chosen for this thesis.
The noisy version of the XX gate is denoted XX̃ and its Kraus operators, which will
be given meaning in Section 4.1.3, are given by

    K₀ = √((1 + e^{−σ²/2})/2) · XX(θ),    K₁ = √((1 − e^{−σ²/2})/2) · XX(θ − π).    (4.4)

The XX̃ gate is equivalent to applying an ideal XX gate followed, with probability
(1 − e^{−σ²/2})/2, by a flip of both qubits. The gate infidelity noise is correlated between the two
qubits.
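A quick way to confirm that Eq. 4.4 defines a valid quantum channel is to check the Kraus completeness relation; this sketch builds the XX matrix following Eq. 4.2 (the values of σ² and θ are arbitrary):

```python
import numpy as np

def xx(theta):
    """Ising XX coupling gate exp(-i (theta/2) X⊗X), cf. Eq. 4.2."""
    c, s = np.cos(theta / 2), -1j * np.sin(theta / 2)
    return np.array([[c, 0, 0, s],
                     [0, c, s, 0],
                     [0, s, c, 0],
                     [s, 0, 0, c]])

sigma2, theta = 0.1, 0.4              # arbitrary noise level sigma^2 and angle
p0 = (1 + np.exp(-sigma2 / 2)) / 2
K0 = np.sqrt(p0) * xx(theta)
K1 = np.sqrt(1 - p0) * xx(theta - np.pi)
# Completeness: K0^† K0 + K1^† K1 = I, since XX(θ) is unitary for any θ
print(np.allclose(K0.conj().T @ K0 + K1.conj().T @ K1, np.eye(4)))  # True
```

Because each Kraus operator is a scaled unitary, K₀†K₀ + K₁†K₁ = p₀ I + (1 − p₀) I = I, so the noisy gate is trace preserving.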
Depolarizing noise
Another source of error we have to deal with is spontaneous errors due to the environment
(e.g. cosmic rays, external electromagnetic fields, internal particle interactions). When a qubit
interacts with the environment in these manners, the qubit may become depolarized with
probability p. We can model spontaneous noise by applying the depolarizing channel Fp [ρ],
given by
    F_p[ρ] = (1 − p)ρ + (p/3)(XρX + YρY + ZρZ)
           = (1 − 4p/3)ρ + (2p/3) I,    (4.5)
3 3
to each qubit before and after applying an XX gate. In this way, the noise is uncorrelated
between the qubits. Note that it is not necessary to apply the depolarizing channel before and
after a phase rotation gate as this is implemented virtually. The parameter p will be used to
indicate the level of noise for depolarizing error.
Kraus operators
In general, we will perform a DMERA layer by doing each unitary layer separately. Any
completely positive trace-preserving map can be written in terms of its Kraus operators Xi
such that

    Ω[M] = Σ_i X_i M X_i†,    (4.6)

for all M ∈ L(H) [21]. For a 3-qubit state ρ, we can then compute the isometry layer as

    (w ⊗ w ⊗ w)[ρ] = Σ_{i,j,k} (X_i ⊗ X_j ⊗ X_k) ρ (X_i ⊗ X_j ⊗ X_k)†,    (4.7)
where X are the Kraus operators of w. We can generalize this statement to N qubits by
allowing N isometries.
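The sum over Kraus-operator combinations in Eq. 4.7 can be sketched as follows. For simplicity this example uses single-qubit Kraus operators (a bit-flip channel of our choosing) rather than the two-qubit w/u operators, but the structure of the triple sum is the same:

```python
import itertools
from functools import reduce
import numpy as np

def apply_product_channel(kraus, rho, n):
    """Apply the same single-site Kraus channel to each of n qubits,
    summing over all Kraus-operator combinations as in Eq. 4.7."""
    out = np.zeros_like(rho)
    for ops in itertools.product(kraus, repeat=n):
        K = reduce(np.kron, ops)                 # X_i ⊗ X_j ⊗ X_k on all qubits
        out += K @ rho @ K.conj().T
    return out

p = 0.1                                          # bit-flip probability per qubit
kraus = [np.sqrt(1 - p) * np.eye(2),
         np.sqrt(p) * np.array([[0.0, 1.0], [1.0, 0.0]])]
rho = np.zeros((8, 8), dtype=complex)
rho[0, 0] = 1.0                                  # |000><000|
rho_out = apply_product_channel(kraus, rho, 3)
print(np.isclose(np.trace(rho_out).real, 1.0))   # True: trace is preserved
```

With two Kraus operators per site, three sites give 2³ = 8 terms in the sum, which is exactly the combinatorial growth that makes the Kraus method computationally heavy.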
For the simplest case, namely the noiseless case with only 1 Kraus operator for w and u,
we use the matrices

    w(θ) = ( cos(θ − π/4)    0       0       sin(θ − π/4)
             0               √2/2    −√2/2   0
             0               √2/2    √2/2    0
             −sin(θ − π/4)   0       0       cos(θ − π/4) ),

    u(θ) = ( cos(θ)    0   0   sin(θ)
             0         1   0   0
             0         0   1   0
             −sin(θ)   0   0   cos(θ) ),    (4.8)

given in [17]. For D = 2 there are only two unitary layers formed by w(θ₁) and u(θ₂) inside
each DMERA layer, where θ₁ = π/12 and θ₂ = −π/6. For D = 4 we instead have four unitary layers
consisting of w(θ₁), u(θ₂), u(θ₃) and u(θ₄). The angles θ₁, θ₂, θ₃, θ₄ are derived in the paper by
Evenbly and White [17].
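The matrices of Eq. 4.8 can be written down and sanity-checked in a few lines. Note that the sign placement of the off-diagonal sine entries (−sin in the bottom-left corner) is our reading of the matrices, chosen so that both gates are orthogonal; the D = 2 angles follow Evenbly and White:

```python
import numpy as np

def w(theta):
    t, r = theta - np.pi / 4, 1 / np.sqrt(2)
    return np.array([[np.cos(t), 0, 0, np.sin(t)],
                     [0,  r, -r, 0],
                     [0,  r,  r, 0],
                     [-np.sin(t), 0, 0, np.cos(t)]])

def u(theta):
    return np.array([[np.cos(theta), 0, 0, np.sin(theta)],
                     [0, 1, 0, 0],
                     [0, 0, 1, 0],
                     [-np.sin(theta), 0, 0, np.cos(theta)]])

theta1, theta2 = np.pi / 12, -np.pi / 6    # D = 2 angles from Evenbly and White
for g in (w(theta1), u(theta2)):
    print(np.allclose(g @ g.T, np.eye(4)))  # True: both gates are real unitaries
```

Since both matrices are real, unitarity reduces to orthogonality, g gᵀ = I, which the check above confirms for the D = 2 angles.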
To add two-qubit gate infidelity from Section 4.1.2 we use the Kraus operators and the native
gate decomposition given by Sewell and Jordan [2]. The gate XX has two Kraus operators K0
and K1 so from Eq. 4.3 we know that each w and u has four Kraus operators, namely two for
each appearance of XX. In total, we have four Kraus operators per w and u gate and Eq. 4.7
still applies.
Figure 4.1: (a) The general contraction scheme for any depth DMERA. Note that not all legs
need have the same dimensions. This scheme can be used for both left and right alignment
by contracting the appropriate leg between U and U † . (b) Example for D = 4 of construction
blocks used in the scheme.
The Kraus operator method is computationally intensive as there are many matrix mul-
tiplications involved. Specifically for the noiseless channel there is an alternative numerical
method that we can use which is more optimized and much faster to compute. Because
we are dealing with tensors, we can use e.g. the einsum() function from the numpy
Python package [34]. This specific function is highly optimized to work with tensors and can
efficiently contract them.
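As a small illustration of the einsum() approach (the gate and state here are random placeholders, not DMERA tensors), contracting a two-qubit gate into a 3-qubit state with an index expression agrees with the equivalent flattened matrix-vector product:

```python
import numpy as np

rng = np.random.default_rng(2)

# A random two-qubit gate acting on the first two qubits of a 3-qubit state
q, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
U = q.reshape(2, 2, 2, 2)                 # legs: (out_a, out_b, in_a, in_b)
psi = rng.normal(size=(2, 2, 2)) + 1j * rng.normal(size=(2, 2, 2))
psi /= np.linalg.norm(psi)

# Contract the gate into legs 0 and 1 of psi with einsum ...
out = np.einsum('abcd,cde->abe', U, psi)
# ... which matches the matrix-vector product on the flattened state
ref = (np.kron(q, np.eye(2)) @ psi.reshape(8)).reshape(2, 2, 2)
print(np.allclose(out, ref))  # True
```

For larger networks, einsum (or its optimized contraction paths) avoids ever forming the full 2ⁿ × 2ⁿ unitary, which is where the speedup over the Kraus-matrix method comes from.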
Additionally, we can construct a general contraction scheme that works for any depth D.
For a DMERA with any depth D > 1 we can group the tensors in such a way that we can form
blocks that fit the scheme shown in Figure 4.1.
The classical simulation code we created and used for this thesis can be found on GitHub
[35].
3B-Comp
As we have no access to a heat bath, we have to find a different method. What we can do
is swap probabilities of states around such that the targeted qubit becomes colder. A method
we can use for 3 qubits is called 3B-Comp [37]. This method performs the swap |011⟩ ↔ |100⟩
and leaves the other states unchanged. The swap |011⟩ ↔ |100⟩ is the only swap that would
increase the bias ϵ of the first qubit, as any other swap would either reduce or not change the
bias, given that all three qubits have a bias towards |0⟩. We can implement the swap with the
small circuit shown in Figure 4.2 which involves the use of a CNOT, X and CSWAP gate. The
CSWAP gate swaps two target qubits if the control bit is 1, else it does nothing.
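The bias increase from a single 3B-Comp step can be sketched with classical populations (coherences are ignored in this toy calculation, which is our simplification):

```python
import itertools
import numpy as np

eps = 0.1                          # each qubit starts with bias eps towards |0>
p0 = (1 + eps) / 2
# Joint (classical) populations of three independent qubits A, B, C
probs = {bits: np.prod([p0 if b == 0 else 1 - p0 for b in bits])
         for bits in itertools.product((0, 1), repeat=3)}

# 3B-Comp: exchange the populations of |011> and |100>
probs[(0, 1, 1)], probs[(1, 0, 0)] = probs[(1, 0, 0)], probs[(0, 1, 1)]

bias_A = 2 * sum(p for bits, p in probs.items() if bits[0] == 0) - 1
print(bias_A > eps)                # True: qubit A is now colder (~0.1495 vs 0.1)
```

Because |100⟩ is more populated than |011⟩ when all qubits lean towards |0⟩, the swap moves probability onto a state where qubit A is 0, raising its bias from ε to (3ε − ε³)/2.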
The 3B-Comp method works for only three qubits; however, we can use it for more than
three qubits by repeatedly applying the swap to different sets of three qubits. Suppose we
have five qubits A, B, C, D and E with equal bias and we want to cool down qubit A. We can
perform the first swap with qubits A, B and C and the second swap with qubits A, D and E.
Assuming that all qubits have a bias towards |0⟩, the extra iteration of 3B-Comp should cool
qubit A down further than a single iteration would.
Figure 4.2: The 3B-Comp circuit. This effectively swaps the states |011⟩ and |100⟩ and leaves
the other states intact. In this configuration qubit A is cooled down and thus its bias towards
|0⟩ is increased while qubits B and C are heated up.
Now that we have introduced a simple method for three qubits, let us put it into practice.
After we have done two DMERA layers, we have used 9 of the available 11 qubits of the
IonQ QC. 3 of the used qubits are kept in the computation and 3 additional qubits need to be
inserted into the circuit as pure |0⟩. As we only have 2 qubits left over, we need to cool one
previously used qubit back to |0⟩ for an additional layer. By using 3B-Comp we can attempt
to cool down one single qubit by using the other qubits that are left over after the previous layers.
Timing of cooling
The first choice to consider is whether the cooling should take place after the first or the
second layer. After the first layer means that there are only three qubits available for algorithmic
cooling, which limits the number of times we can iterate 3B-Comp. However, it would allow
us to insert the cooled qubit back in already at the second layer. Inserting an impure qubit
at the new layer introduces noise; however, this may be balanced out by the noise resilience of
DMERA. Additionally, using only three qubits also requires fewer gates, meaning there is less
error due to gate infidelity.
On the other hand, cooling after the second layer means there are more qubits available
and thus we can construct a larger circuit that can cool down one qubit further than what is
possible with only three qubits. In an ideal scenario, a larger circuit would be able to cool more
efficiently, but in a noisy environment the reverse may hold: a large circuit could generate
more noise and in the end the qubit would be less cool in comparison.
To investigate this issue of cooling timing, we will test several scenarios. First, we cool one
qubit after the first layer for both the noiseless and noisy case. Then, we do the same but
with the qubit cooled after the second layer. In both noiseless and noisy cases, we only have
to use the impure qubit for a DMERA circuit with three layers. For DMERA with fewer than
three layers we have enough pure qubits to use, so the difference in cooling timing will only be
apparent for three layers. We compare the results with three layers that all use pure qubits.
The results are shown in Figure 4.3. In general, the energies perfectly overlap for the first
two layers as no impure qubits are used yet. When preparing the circuit with three layers we
introduce an impure qubit halfway through the circuit depending on the scenario and measure
the energy only after the third layer. For the noiseless case, in Figure 4.3a we see that it is
energetically favorable to cool the qubit after two layers. We argued before that this would be
the case as there are more qubits available to cool with. In a noisy environment, the opposite
is true, as can be seen in Figure 4.3b. Running a large cooling circuit is prone to more noise
than a smaller cooling circuit, thus it is better to cool after one layer.
All scenarios in Figure 4.3 show a common theme. The energy of the third layer is never
lower than the energy of the second layer, regardless of the timing of the cooling algorithm.
[Two plot panels: energy (a.u.) versus number of layers (0 to 3), comparing "3 pure layers",
"After 1st layer" and "After 2nd layer".]
(a) Cooling with no noise. (b) Cooling with depolarizing noise p = 0.05.
Figure 4.3: The energy after three layers with a qubit that is cooled after the first or second
layer.
We expect an exponential decay in the energy with pure qubits, so for algorithmic cooling to
be of use we need to see the same behavior when using cooled qubits. If the energy does not
lower after cooling, we are better off not doing the cooling at all, as it would interfere with the
decaying behavior of DMERA. As this simple experiment shows, it is not beneficial for us to
cool a qubit to perform an extra layer.
There are several reasons why algorithmic cooling does not give the desired results. Firstly,
we are very limited in the number of qubits available for the cooling process. This means we are
also limited in the number of times we can do 3B-Comp or exchange probabilities in any other
manner. Eventually, the qubits besides the targeted qubit are too hot and we hit a natural limit
of cooling. Additionally, we do not have access to a heat bath. This means that our circuit,
or quantum computer, is a closed system and we cannot exchange heat with the outside. If we
did have a heat bath, there would be more possibilities for cooling algorithms.
It is also possible that we did not find the optimal cooling circuit or algorithm for the
specific situation. For the cooling algorithm, we have limited ourselves to using only 3B-Comp
as this can be used in a general situation, which might also not be the optimal choice. We could
implement a specific circuit that cools the targeted qubit down further, however we assumed
that we do not know anything about the qubits except that they have a bias towards |0⟩. This
limits our approach to an algorithmic method, which we have used to no success.
4.2 Energies
One way to benchmark the implementation of DMERA is to compute the energy of the prepared
state and compare the results with the true ground state energy. When the measured energy
converges to the exact energy and follows the expected behavior, i.e. an exponential decay,
we can be confident that the DMERA works as expected. In this section we will first discuss
two general points on the Hamiltonian and left-right alignment of the circuit, then we will
describe the method to find the ground state energy for both classical simulation and quantum
computers and lastly we showcase the results of both methods.
Hamiltonian of model
The critical system of choice is the Ising quantum critical chain, which is described by a conformal field
theory. Conformal field theories and their applications are outside the scope of this thesis, but
the important property we will use is that such a theory is by definition invariant under
angle-preserving transformations. We have already implicitly used this fact in Section 3.1.5.
The associated Hamiltonian for this critical Ising model is given by
H = −∑_r (X_r X_{r+1} + Z_r) .    (4.9)
However, we will use the transformed Hamiltonian for a fully scale-invariant (D)MERA
H = ∑_r (X_r Z_{r+1} X_{r+2} − X_r X_{r+1}) .    (4.10)
The detailed derivation and discussion of Eq. 4.10 is given by Evenbly and White [17].
Effectively, we only prepare three qubits (for depth 2) with DMERA to the ground state, so the
Hamiltonian reduces to
H = XZX − (1/2)(IXX + XXI) ,    (4.11)
where the factor 1/2 accounts for the average of both XX terms. We will explicitly use this
expression as the three-qubit observable to measure the energy of the prepared state.
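As a sanity check, the reduced Hamiltonian of Eq. 4.11 is easy to build explicitly with numpy (a minimal sketch, not the thesis code from [35]):

```python
import numpy as np

# Single-qubit Pauli matrices
I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def kron(*ops):
    """Tensor product of a sequence of single-qubit operators."""
    out = np.eye(1, dtype=complex)
    for op in ops:
        out = np.kron(out, op)
    return out

# Three-qubit observable of Eq. 4.11: H = XZX - (IXX + XXI)/2
H = kron(X, Z, X) - 0.5 * (kron(I, X, X) + kron(X, X, I))

# The energy of a state rho is Tr[H rho]; for the starting state
# |000><000| every term flips at least one qubit, so the energy is 0.
rho = np.zeros((8, 8), dtype=complex)
rho[0, 0] = 1.0
energy = np.real(np.trace(H @ rho))
```

Every term of H flips at least one qubit of |000⟩, so the starting-state energy vanishes, consistent with the first data point of the decay curves below.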
The noise models show convergence after several layers, despite the potential concern that
errors build up in large quantum circuits. This result indicates that DMERA is robust to noise
and supports the findings from Section 3.2.3.
We can also see that a larger depth gives a closer approximation to the true ground state
energy than a small depth. This result is not unexpected, as we have seen before that increasing
the bond dimension of an MPS or MERA better approximates the ground state energy.
Increasing the depth is an alternative way of increasing the bond dimension.
Figure 4.4: The decay of the ground state energies to the converged values. As more layers are
applied to a starting state, the energy converges to a final value. This is shown for the noiseless
case with D = 2 and D = 4 and with added depolarizing and gate infidelity noise.
Table 4.1: The converged energy values compared with the exact value E0 = −4/π. The values
have been computed after applying 20 DMERA layers to the starting state ρ = |000⟩⟨000|. The
code used to compute and generate this table can be found in [35].
Method
Each experiment run on the quantum computer consists of a number of shots. One shot
is a single preparation and measurement of a state after going through the quantum circuit.
We can see this prepared state as a probability distribution over the basis states. Each shot
might give a different value than the previous shot due to this distribution. To get an accurate
estimate of the expectation value of an observable, we want to run experiments with many
shots so that the sample size is large enough.
We cannot implement the Hamiltonian from Eq. 4.11 as it is stated. Instead, we need to
separate the Hamiltonian into the three observables XZX, IXX and XXI and measure each
observable individually. The energy can then be calculated in classical post-processing by
adding up the expectation values of each observable.
On a quantum computer it is not possible to average over the left and right centering of the three qubits after
a layer; instead, the quantum circuit needs to be constructed explicitly for either left or right
centering. As there are two choices for each layer, we need to construct 2^N different circuits for
a DMERA with N layers. In total, we need to perform 3 × 2^N different experiments and bring
the results together in post-processing to find the expectation value of the full Hamiltonian.
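Enumerating these experiments is straightforward; a sketch, with the left/right choices labeled "L"/"R" for illustration:

```python
from itertools import product

def measurement_settings(n_layers):
    """All (observable, left/right alignment) combinations needed to
    estimate <H> after n_layers DMERA layers: 3 * 2^N experiments."""
    observables = ("XZX", "IXX", "XXI")
    # One left-or-right choice per layer gives 2^N alignments
    alignments = list(product("LR", repeat=n_layers))
    return [(obs, align) for obs in observables for align in alignments]
```

For N = 2 this yields 3 × 2² = 12 experiments, matching the 12 tasks run for two layers in the experiment below (and likewise 6 for one layer and 3 for zero layers).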
The error margin can also be found indirectly from the experiment. First, we obtain the
marginal probabilities of the three qubits that are measured from the data. We then artificially
reproduce the experiment on a classical computer by randomly picking an outcome according
to the obtained marginal probability distribution. All outcomes are stored to form a large set of
”measurements” from which we can compute the standard deviation. This standard deviation
is then the error margin for our results.
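This resampling procedure can be sketched as follows; the ±1 parity eigenvalues assigned to the outcomes are an illustrative assumption, not the thesis's actual observables:

```python
import numpy as np

def bootstrap_std(probs, n_shots=1000, n_resamples=500, seed=0):
    """Resample measurement outcomes from the marginal distribution
    `probs` (outcome -> probability) and return the standard deviation
    of the resampled estimates of <O>. Each 3-bit outcome carries a
    +/-1 eigenvalue through its bit parity (an illustrative choice)."""
    rng = np.random.default_rng(seed)
    outcomes = np.array(sorted(probs))
    p = np.array([probs[o] for o in outcomes])
    signs = (-1.0) ** np.array([bin(o).count("1") for o in outcomes])
    estimates = []
    for _ in range(n_resamples):
        draws = rng.choice(len(p), size=n_shots, p=p)
        estimates.append(signs[draws].mean())
    return float(np.std(estimates))

# Example: uniform marginal over the 8 basis states of 3 qubits
sigma = bootstrap_std({k: 1 / 8 for k in range(8)})
```

For a uniform distribution the estimate approaches the shot-noise level 1/√1000 ≈ 0.03, comparable to the error margins quoted below.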
Experiment
The experiment was performed with the IonQ 11-qubit trapped ion quantum computer [31]
which can be accessed through the Amazon Braket service [38]. In total, we have run 21 tasks
with 1000 shots each: 3 tasks for 0 layers, 6 tasks for 1 layer and 12 tasks for 2 layers. The
results from the tasks have been summed to find the energy and averaged according to the
number of possible left/right configurations.
Figure 4.5: Energy results for two layers with standard deviation σ = 0.03. Data is from IonQ
quantum computer with curves from simulations (noiseless and noisy). The code used to send
the quantum tasks to the QC and analyze the data can be found in [35].
The results of the experiment are shown in Figure 4.5. The standard deviation is comparable
for each data point and has a value of 0.03. Additionally, the energies as found by classical
simulations for the noiseless case and the depolarizing noise case (p = 0.105) are shown. It
is immediately clear that the results from the quantum computer differ significantly from the
results for the noiseless case. The energy did not lower between 1 layer and 2 layers which does
not line up with our expectation of an exponential decay. However, we may not have performed
enough shots or experiments to get precise measurement values.
The simulation needs a large amount of noise to fit the QC data. We have chosen depolarizing
noise as the noise model, as it is a commonly occurring error, and the noise parameter
that fits the data well is p = 0.105. This value is much larger than the value that Sewell and
Jordan used in their paper and that we have used in the classical simulations [2]. The specific choice of
noise model is, however, not crucial: regardless of the chosen model, one would need
to add a large amount of noise to fit the QC data.
As we are only able to perform experiments that are limited in terms of channel iterations
and qubit usage, we refer to Sewell and Jordan for a more thorough and detailed experiment
[2]. Their results show that DMERA works well on a real quantum computer and converges
after only five layers (Figure 9 in their paper). They found an approximation of the ground
state energy of E = −1.08 which is 15% higher than the true ground state energy.
Chapter 5
Throughout this thesis we have seen enough reasons to believe that DMERA is robust against
noise, from a mathematical proof in Section 3.2.3 to a convergence of the ground state energy
in Section 4.2. We can conclude that DMERA is an ansatz that can reliably prepare a ground
state when implemented on a noisy quantum computer. Now that we have computed energies,
we look to investigate other physical quantities. An interesting example are the so-called
scaling dimensions, which are briefly discussed in both Kim-Swingle and Sewell-Jordan [1, 2].
At the foundation of preparing and converging to a ground state lie the eigenvalues of
the DMERA channel Φ. The eigenvalues govern the decaying behavior to the ground state,
which corresponds to the largest eigenvalue. These eigenvalues are interesting for investigation
into DMERA, but they can additionally be linked with another physical quantity, namely the
scaling dimensions. This quantity is of interest in the study of conformal field theories, which
can describe critical models such as the quantum Ising chain that we have been using. A scaling
dimension describes how the corresponding operator transforms under a scaling transformation,
i.e. x → αx for some scale factor α [39].
Each layer of DMERA in fact implements a scaling transformation with scale factor α = 2,
as we double the number of qubits at each layer. This hints at the presence of
scaling dimensions in DMERA, which we can then find from the eigenvalues. With a simple
conversion, ∆ = − log2 λ, we can obtain the scaling dimension ∆ from the eigenvalue λ. As the
specific DMERA channel implements a critical model, we can match the DMERA scaling
dimensions to the scaling dimensions of the critical model.
We know that DMERA is noise resilient and that shows in the energy computations as well.
We speculate that the scaling dimensions are in turn also resilient against noise. To investigate
this hypothesis, we will have to compute the eigenvalues of the DMERA channel. In a classical
simulation this is a simple task as we have direct access to the channel which is diagonalizable.
However, on a quantum computer this is not possible. Instead, we can extract the eigenvalues
from the measurement data, as they are directly involved in the convergence
of the expectation value of an observable. For the ground state energy we already observed
that there is a convergence to a fixed value. This behavior is not exclusive to the Hamiltonian,
but can be seen for all observables. By using a new quantum algorithm, we can extract the
eigenvalues from the measurement data and compute the scaling dimensions for DMERA.
The code used in this section for both the classical and quantum algorithm can be found in
[35].
In this section, we will first investigate the robustness of the scaling dimensions in a classical
simulation in Section 5.1 by diagonalizing the DMERA channel Φ. Then in Section 5.2 we
extract the scaling dimensions on a (simulated) quantum computer by first introducing a new
quantum algorithm and then testing the algorithm on simulated QC data to find the scaling
dimensions.
5.1 Classical algorithm
where ⟨· , ·⟩ is the Hilbert-Schmidt inner product, for k, l = 0, 1, . . . , 2^n − 1. Note that (kl), and
likewise (ij), constitutes one index and that there is freedom in choosing how the index is formed
from k and l. For example, one can pick the row-ordering to be (00), (01), . . . , (0d), (10), . . . , (dd),
where d = 2^n − 1, and similarly for the column-ordering. The particular choice of ordering
does not affect further results.
Using Eq. 5.1 for depth D = 2 we obtain the matrix representation M of Φ by computing all
its elements for each combination of i, j, k, l. Concretely, this involves (2^3)^4 = 4096 calculations,
as Φ is a map from 3-qubit states to 3-qubit states. Once the matrix M is determined, we can
diagonalize M and compute its eigenvalues and eigenoperators. Currently, we are interested in
the scaling dimensions ∆a, which can be found from the eigenvalues via ∆a = − log2 λa.
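The diagonalization route can be illustrated on a toy channel; the single-qubit depolarizing channel below is a stand-in for the DMERA channel Φ, which is more involved to construct:

```python
import numpy as np

# Normalized Pauli basis {E_i}: orthonormal for the Hilbert-Schmidt
# inner product <A, B> = Tr[A^dag B] on one qubit.
paulis = [np.eye(2, dtype=complex),
          np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]]),
          np.array([[1, 0], [0, -1]], dtype=complex)]
basis = [P / np.sqrt(2) for P in paulis]

def channel_matrix(channel):
    """Matrix elements M_ij = <E_i, channel(E_j)> (cf. Eq. 5.1)."""
    d = len(basis)
    M = np.zeros((d, d), dtype=complex)
    for j, Ej in enumerate(basis):
        out = channel(Ej)
        for i, Ei in enumerate(basis):
            M[i, j] = np.trace(Ei.conj().T @ out)
    return M

# Toy stand-in for Phi: single-qubit depolarizing noise with p = 0.1
p = 0.1
channel = lambda rho: (1 - p) * rho + p * np.trace(rho) * np.eye(2) / 2

lams = np.sort(np.abs(np.linalg.eigvals(channel_matrix(channel))))[::-1]
deltas = -np.log2(lams)   # lambda_a = 2^{-Delta_a}
```

The largest eigenvalue is 1 (trace preservation), giving ∆ = 0; the three Pauli directions are damped by 1 − p, giving a nonzero scaling dimension.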
Results
The 32 smallest scaling dimensions for both the noiseless and noisy case are plotted in
Figure 5.1. Figure 5.1a contains the scaling dimensions found for two different DMERA depths
in the exact case. Figure 5.1b includes the exact and noisy scaling dimensions for two different
noise models. The latter figure also appears in Sewell-Jordan and in this thesis we managed to
reproduce the figure using the same noise models and parameters [2].
In the exact case, both depths agree on the scaling dimensions until after the 20th eigenvalue.
The reason for this discrepancy could be the larger space of operators for D = 4. The
causal cone for D = 4 is extended to include 7 qubits as opposed to 3 qubits for D = 2. The
matrix representation for D = 4 then has dimensions 2^14 × 2^14, much larger than that
for D = 2. Consequently, there are many more eigenvalues in the space for D = 4. The
distribution of eigenvalues may differ between both depths, which could explain the deviation.
For the noise models, the noisy scaling dimensions are remarkably close to the exact scaling
dimensions. The decaying behavior is dictated by the smallest scaling dimensions, which in
(a) The 32 smallest noiseless scaling dimensions as obtained from the classical simulation of DMERA, for two different depths. (b) The 32 smallest scaling dimensions with and without noise, together with the exact Ising CFT values. The two noise models used in this simulation are described in Section 4.1.2, with depolarizing noise p = 0.0087 and two-qubit gate infidelity σ = 0.112.
Figure 5.1: Scaling dimensions for different scenarios, plotted against the n-th eigenvalue. Code used to produce the figures from [35].
particular are close across the three cases. We know from the energy calculations in Section
4.2 and the bound on the error of the noisy channel in Section 3.2.3 that DMERA is stable in
the presence of noise, and as such it is not surprising that the scaling dimensions are also stable in a
noisy environment. This result gives us confidence that DMERA can yield reliable results
when performed on a real quantum computer.
5.2 Quantum algorithm
The magnitude of ⟨O⟩ρn is determined by the eigenvalues of the DMERA channel Φ. As its
eigenvalues have modulus strictly less than one, except for the largest which has modulus one,
⟨O⟩ρn will be solely determined by the largest eigenvalue for large n. For this reason, the
elements of η typically show a decaying behavior, hence we will call η the observable decay
curve. If O is the Hamiltonian H then η gives the energies for up to k layers, which we
extensively discussed in Section 4.2.
Using Eq. 3.18 we can write η in a more suggestive manner so that we can justify using
ESPRIT,
ηn = ∑_a xa λa^n = ∑_a xa 2^(−n∆a) ,    (5.4)
where xa = Tr[ϕ̂a O] Tr[ϕa ρ0] is a factor independent of n. Each element is then a sum of
exponentials, similar to X, and we see that η is an appropriate input for ESPRIT. Note that
we will have complex frequencies as opposed to real frequencies. In general this will not be an
issue as the ESPRIT algorithm works regardless of the nature of its input.
Hankel matrix
The next ingredient we will need for ESPRIT is the Hankel matrix H(X). The defining
property of a Hankel matrix is that each skew diagonal contains the same entries, that is
h_{i,j} = h_{i−1,j+1}. By specifying the first column and the last row, combined with this property,
the Hankel matrix is completely determined. This approach works not only for square Hankel
matrices, but also for non-square Hankel matrices. A general H(X) ∈ C^{m×n} for X =
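The Hankel construction can be sketched with scipy's `hankel` helper (an illustration, not the thesis code):

```python
import numpy as np
from scipy.linalg import hankel

# Decay curve eta_0 .. eta_k (here k = 4) as an example input
X = np.array([1.0, 0.5, 0.25, 0.125, 0.0625])
L = len(X) // 2                # L = floor((k + 1) / 2) with k = len(X) - 1

# First column X[0..L] and last row X[L..k] fully determine H(X)
H = hankel(X[:L + 1], X[L:])   # shape (L + 1) x (k - L + 1)
```

By construction every skew diagonal of `H` is constant, which is exactly the defining property above.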
ESPRIT algorithm
Now we can describe the sequence of the ESPRIT algorithm. The following description is
based on [45].
1. Choose L = ⌊(k + 1)/2⌋
2. Construct the Hankel matrix H(X)
3. Compute the singular value decomposition
H(X) = [U U⊥] Σ [V V⊥]∗ ,
where
• U ∈ C^((L+1)×q)
• U⊥ ∈ C^((L+1)×(L+1−q))
• Σ ∈ C^(q×q)
• V ∈ C^((k−L+1)×q)
• V⊥ ∈ C^((k−L+1)×(k−L+1−q))
4. Find U0 by taking the first L rows of U and U1 by taking the last L rows of U
5. Compute Ψ = U0^+ U1, where U0^+ is the Moore-Penrose inverse of U0
If the input X is of the form Xk = ∑_{a=1}^q xa λa^k, then the eigenvalues of Ψ are precisely the same
λa as in X. However, in the case that we have a noisy input X̃ = X + η, where η represents a
noise vector, we should still expect to approximately recover the λa [45].
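Steps 1 to 5 translate directly into a short numpy/scipy routine; here it is tested on a noiseless synthetic curve (a sketch, not the thesis implementation):

```python
import numpy as np
from scipy.linalg import hankel, svd, pinv

def esprit(X, q):
    """Recover q decay rates lambda_a from X_k = sum_a x_a lambda_a^k
    following steps 1-5 above (cf. [45])."""
    k = len(X) - 1
    L = (k + 1) // 2                      # step 1
    H = hankel(X[:L + 1], X[L:])          # step 2: Hankel matrix
    U, s, Vh = svd(H)                     # step 3: SVD
    U = U[:, :q]                          # keep q dominant columns
    U0, U1 = U[:-1], U[1:]                # step 4: first / last L rows
    Psi = pinv(U0) @ U1                   # step 5
    return np.linalg.eigvals(Psi)

# Noiseless synthetic curve with eigenvalues 1.0 and 0.5
ks = np.arange(12)
X = 0.7 * 1.0**ks + 0.3 * 0.5**ks
lams = np.sort(np.real(esprit(X, q=2)))
```

On noiseless input the eigenvalues of Ψ reproduce the decay rates essentially to machine precision.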
45
5.2. Quantum algorithm
where SSi is the residual sum of squares of model Mi, pi the number of parameters in model
Mi, and n the number of data points in D. The F-value is then compared to the value of the
F-distribution with degrees of freedom p2 − p1 and n − p1 at α = 0.05. We will
call this value from the F-distribution the critical value Fcrit. If the calculated F-value is larger
than the critical value, then the model with more parameters is the better fit. If it is instead
smaller than the critical value, then the simpler model suffices.
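A sketch of this comparison using the textbook nested-model F-statistic; note that the denominator degrees of freedom here (n − p2) follow the standard convention and may differ slightly from the text:

```python
from scipy.stats import f as f_dist

def prefer_larger_model(ss1, ss2, p1, p2, n, alpha=0.05):
    """Nested-model F-test: model M1 (p1 parameters, residual sum of
    squares ss1) against M2 (p2 > p1 parameters, ss2) on n data points.
    Returns True when the extra parameters significantly improve the fit."""
    F = ((ss1 - ss2) / (p2 - p1)) / (ss2 / (n - p2))
    F_crit = f_dist.ppf(1 - alpha, p2 - p1, n - p2)
    return F > F_crit
```

A large drop in the residual sum of squares relative to the remaining residual makes F exceed the critical value, in which case the extra eigenvalue is kept.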
We can apply the F-test to two models with eigenvalues obtained from the ESPRIT algorithm
with restrictions q and q + 1. We take a model Mq(k) to be
Mq(k) = ∑_{a=1}^q wa λa^k ,    (5.8)
where wa is the weight for each λa and k is the number of DMERA layers. Starting with M1 (k)
and M2 (k), we compute the F-value between these models and compare the value to the Fcrit .
If the model with more parameters is a better fit, then we repeat the same process, now for
M2(k) and M3(k). This procedure is continued until we find an F-value that is smaller than
the critical value. Once this is determined, we conclude that only the q eigenvalues given
by ESPRIT are statistically likely to be eigenvalues of the DMERA channel.
The weights wa in Mq(k) can be determined by fitting the model to η using standard
library routines, e.g. curve_fit() from the scipy Python package [47]. These functions
find an optimized set of weights that minimizes the sum of squares; this method is called
(non-linear) least squares. The optimized weights can then be used in the model Mq(k) for the
F-test.
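The least-squares fit of the weights can be sketched with curve_fit; the eigenvalues below are example values standing in for ESPRIT output:

```python
import numpy as np
from scipy.optimize import curve_fit

lams = np.array([1.0, 0.5])      # eigenvalues fixed by ESPRIT (example)

def model(k, *w):
    """M_q(k) = sum_a w_a lambda_a^k with the lambda_a held fixed."""
    return sum(wa * la**k for wa, la in zip(w, lams))

ks = np.arange(10, dtype=float)
eta = 0.7 * lams[0]**ks + 0.3 * lams[1]**ks   # synthetic decay curve
w_opt, _ = curve_fit(model, ks, eta, p0=np.ones(len(lams)))
```

Since the λa are held fixed, the problem is linear in the weights and the fit recovers them exactly on noiseless data.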
λa are present in X, so the input restriction q is not yet determined. For that reason, we
try different restrictions q and see which model Mq constructed from the q values returned by
ESPRIT has the "best fit" to X, as described in Section 5.2.2. Algorithm 2 provides guidelines
to find a set of eigenvalues that are likely to be in X.
1. Set q = 1
2. Input X into ESPRIT (Algorithm 1) with restriction q to obtain Λq . Repeat for restriction
q+1
3. Construct models Mq , Mq+1 with the values from Λq and Λq+1
4. Find weights wa^(q) and wa^(q+1) by fitting Mq and Mq+1 to X
5. Compute the F -value for Mq and Mq+1
• If F -value ≥ Fcrit ,
then go back to step 2 and increase q by 1
• If F -value < Fcrit ,
then return Λq as computed in step 2
If we use η as an input for Algorithm 2 then we are able to find q likely eigenvalues from
the DMERA channel. The only information required from the (simulated) quantum computer
is measurements for a given observable for up to k layers. The algorithm and computation of
the eigenvalues is done classically and this ensures that the algorithm is as reliable as possible.
Limitations
A single pair of a starting state and an observable cannot yield all eigenvalues of
the channel. The main reason is that a starting state ρ0 and an observable O cannot have
overlap with all left and right eigenoperators of the channel. Recall from Section 3.1.5 that
O has overlap with the right eigenoperator ϕa of Φ if ⟨ϕa, O⟩ ≠ 0. Similarly, we say that
ρ0 has overlap with the left eigenoperator ϕ̂a of Φ if ⟨ϕ̂a, ρ0⟩ ≠ 0. If either ⟨ϕa, O⟩ or ⟨ϕ̂a, ρ0⟩
vanishes for some a, then there will be no contribution from λa in η and thus it cannot be found by
the extraction algorithm.
Another problem inherent to DMERA is that there is a bias in η towards larger eigenvalues.
Large eigenvalues decay much slower than small eigenvalues and are the dominant part in the
signal after many layers. Smaller eigenvalues, however, decay quickly and due to the discrete
nature of layers they will be negligible after only a few layers. As we have access only to the
data from the observable decay, it is difficult to distinguish between the contribution of small
eigenvalues and noise. Both the ESPRIT and the extraction algorithm will then likely classify
these smaller eigenvalues as noise and filter these out.
The first problem can be solved by collecting data from many pairs of different starting
states and observables. Different pairs will have overlap with different eigenoperators and as a
result we can retrieve more eigenvalues of the channel. We will discuss the data collection in
Section 5.2.4 and the results in Section 5.2.5. Another benefit of collecting more data is that
we will find more of the same eigenvalues, so the certainty that a found eigenvalue is correct
increases.
and
|−⟩⟨−| = |0⟩⟨0| + |1⟩⟨1| − |+⟩⟨+| ,
so we can eliminate |−i⟩⟨−i| and |−⟩⟨−| from the set. We can alternatively write the remaining
states as
|0⟩⟨0| = (I + Z)/2 ,   |+⟩⟨+| = (I + X)/2 ,
|1⟩⟨1| = (I − Z)/2 ,   |i⟩⟨i| = (I + Y)/2 .
|0⟩⟨0| = I[|0⟩⟨0|]
|1⟩⟨1| = X[|0⟩⟨0|]
|+⟩⟨+| = H[|0⟩⟨0|]
|i⟩⟨i| = P H[|0⟩⟨0|].
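These preparations are quick to verify numerically; here P is assumed to be the phase gate S = diag(1, i):

```python
import numpy as np

ket0 = np.array([[1.0], [0.0]], dtype=complex)
rho0 = ket0 @ ket0.conj().T                      # |0><0|

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)
Hgate = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
P = np.array([[1, 0], [0, 1j]], dtype=complex)   # phase gate (assumed S)

apply = lambda U, rho: U @ rho @ U.conj().T

rho1 = apply(X, rho0)            # |1><1|
rho_plus = apply(Hgate, rho0)    # |+><+|
rho_i = apply(P @ Hgate, rho0)   # |i><i|
```

Each prepared density matrix matches the corresponding Pauli decomposition (I ± Z)/2, (I + X)/2, (I + Y)/2 listed above.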
For the observables we will pick the set consisting of the Pauli gates {X, Y, Z}. These observables
are the natural choice as quantum computers already use these measurement gates, and any
arbitrary observable is decomposed in terms of the Pauli gates by the quantum computer. For
example, the Hamiltonian that we used in Section 4.2 can be decomposed in terms of Pauli
gates as H = XZX − (1/2)(IXX + XXI).
Data collection
From the set of basis states we can construct 4^3 = 64 starting states and from the set of Pauli
gates 3^3 = 27 observables, so in total we have 1728 pairs for which we can compute the
observable decay curve η. However, most pairs have no overlap at all and return an empty η,
i.e. ηn = 0 for all n. This occurs for roughly 50 to 75% of the pairs, which reduces
the number of useful η to between 400 and 900.
In general, we will want to collect data points until the curve η has converged. The
convergence rate differs from one state-observable pair to another, so we need a method that utilizes
relative changes. We say that the curve is converged when the mean of the relative changes of
the last three data points is less than 0.02. As the curve may sometimes decay too slowly for
this to occur, we also include a forced stop at 30 layers. The value of 0.02 was obtained through
trial and error, by looking at the decay curve of a state-observable pair at different cut-off values,
determining when it looks converged, and cross-checking with other state-observable pairs. The
ESPRIT algorithm performs best on decay curves that are just converged.
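The stopping criterion can be sketched as follows; `curve_fn` stands in for querying the (simulated) quantum computer for one more layer:

```python
import numpy as np

def is_converged(eta, tol=0.02, window=3):
    """True when the mean relative change over the last `window`
    data points drops below `tol`."""
    if len(eta) < window + 1:
        return False
    tail = np.asarray(eta[-(window + 1):], dtype=float)
    rel = np.abs(np.diff(tail) / tail[:-1])
    return rel.mean() < tol

def collect(curve_fn, max_layers=30):
    """Collect eta_n until convergence or the forced stop at 30 layers."""
    eta = []
    for n in range(max_layers + 1):
        eta.append(curve_fn(n))
        if is_converged(eta):
            break
    return eta
```

For a curve such as η_n = 1 + 2^(−n) this stops after a handful of layers, well before the forced cut-off.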
contain the same eigenvalues λa. Then in general an element of η^(1), and similarly of η^(2),
will be of the form
η^(1)_m = ∑_a x^(1)_a λa^m + ϵ^(1)(m) ,
where ϵ^(1)(m) is a noise term. Adding both curves η^(1), η^(2) gives
η^(1)_m + η^(2)_m = ∑_a (x^(1)_a + x^(2)_a) λa^m + (ϵ^(1)(m) + ϵ^(2)(m)) .
Assuming the curves are similar enough, that is x^(1)_a and x^(2)_a have the same sign and similar
magnitude, and that the noise is uncorrelated, the linear sum will give us a better estimation
of λa than the individual curves would. Extending this method to summing more than
two curves improves the results even further.
For this procedure, we assumed that both η (1) and η (2) share the same eigenvalues. In
general, it is unlikely that this assumption holds for any two curves, and if there are more than
two curves involved then it becomes increasingly unlikely. However, the linear combination
method still works if we assume that the curves share some eigenvalues, especially those in
which we are interested. If a set of curves share the same larger eigenvalues, then that is a
good candidate set to use to find those eigenvalues. In practice, we will combine the curves
from the state-observable pairs which share the same observable. From observations by using
Algorithm 2 on all decay curves, we found that pairs which share an observable have a similar
decay curve and yield similar eigenvalues.
The decay curves η^(1) and η^(2) need not have the same length, i.e. k1 ≠ k2, as they may
have converged after a different number of layers. In this case we cannot directly sum both curves;
instead we extend one of the curves until their lengths are equal. For example, if η^(1) is shorter
than η^(2), so k1 < k2, then we append k2 − k1 elements to η^(1) whose values are equal to the
last element of η^(1). We call this procedure padding.
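Padding and summing curves is a few lines of numpy (a sketch of the procedure just described):

```python
import numpy as np

def pad_to(eta, length):
    """Extend a converged decay curve by repeating its last value."""
    eta = list(eta)
    return eta + [eta[-1]] * (length - len(eta))

def combine(curves):
    """Sum decay curves of unequal length after padding them."""
    k_max = max(len(c) for c in curves)
    padded = [pad_to(c, k_max) for c in curves]
    return np.sum(padded, axis=0)

eta1 = [1.0, 0.5, 0.25]
eta2 = [2.0, 1.0, 0.5, 0.25, 0.125]
combined = combine([eta1, eta2])   # length 5
```

The shorter curve is held at its converged value, so the sum still decays towards the combined fixed point.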
5.2.5 Results
The scaling dimensions found using Algorithm 2 are shown in Table 5.1, compared with values
from the exact Ising CFT and from matrix diagonalization in classical simulation. The values
found with the algorithm are expected to correspond more closely with the values from the
diagonalization method, as both methods aim to find the scaling dimensions of the same map,
i.e. the DMERA channel. We can see in the results that this is indeed the case.
Ising CFT Diagonalization (D = 2) Algorithm (error 1σ)
∆0 0 0 0.013 ± 0.009
∆1 0.125 0.140 0.140 ± 0.001
∆2 1 1 1.033 ± 0.011
∆3 1.125 1.136 1.135 ± 0.086
∆4 2 2 1.942 ± 0.047
Table 5.1: Scaling dimensions obtained with Algorithm 2. The error size is one standard
deviation. The code used to generate the data and perform Algorithm 2 can be found in [35].
Although some eigenvalues are abundant in the data, such as λ1 (with corresponding ∆1 =
0.140 ± 0.001), other eigenvalues do not occur often. The eigenvalue λ2 (with corresponding ∆2 =
1.033 ± 0.011) was only encountered in the data of a small number of state-observable pairs,
while the eigenvalues λ0 (with corresponding ∆0 = 0.013 ± 0.009) and λ1 were present in the data
of almost all state-observable pairs. Consequently, the ∆2 from the algorithm deviates 3σ
from the diagonalization (and Ising CFT) value, while both ∆0 and ∆1 are within 1.5σ of the
diagonalization value. This rare occurrence of some eigenvalues makes it difficult to accurately
determine whether a found eigenvalue is real or only noise.
Padding curves allows us to sum curves that do not have the same length, i.e. not the same
convergence rate. However, it can happen that the length of the linear combination becomes too great.
This makes it difficult for the ESPRIT algorithm to find good approximations of the eigenvalues
that are in the data. Normally, we do not want the decay curve η to be unnecessarily long,
hence we only collect data points until η has converged. However, when we pad an η so
it can be used in the linear combination, we extend the η past its convergence, which only
worsens the performance of ESPRIT. To make sure we get the best performance from ESPRIT,
we remove the last several elements from the linear combination before using it as input for the
extraction algorithm.
Outlook
There are several directions for future work. Firstly, the performance of the extraction algorithm
is directly related to the input data, so with more data (decay curves) the algorithm might be
able to extract eigenvalues and scaling dimensions more accurately. One method that could be
deployed is shadow estimation [49]. By using many random unitaries we would be able to
accurately determine the expectation value of an observable O, which in turn results in accurate
decay curves.
Secondly, in this thesis we have used a DMERA circuit that is derived analytically and is
thus both scale and translation invariant [17]. However, this approach might not be optimal for
approaching the ground state energy. By starting from the analytic circuit and using a hybrid
quantum-classical method of optimization, described in Kim-Swingle [1], we can minimize the
energy further.
Another potential next step for DMERA would be to explore different critical models. The
quantum Ising chain is a relatively simple model, such that DMERA does not yield a large benefit
over classical methods. A one-dimensional model proposed by Kim and Swingle is the
so-called D1-D5 system, which arises in string theory [50]. Furthermore, DMERA can
currently only investigate 1D systems; an extension to higher dimensions would be an
interesting point of research. There are many 2D systems that are non-trivial to solve, such as
the Hubbard model [51].
Finally, a recent study proposed a generalized MERA (gMERA) network that is also suited
for a quantum computer [52]. Although gMERA is a different approach to preparing MERA
on a quantum computer, its results are consistent with results obtained from DMERA. A
comparison between DMERA and gMERA would be useful to determine which method has a
better performance and higher accuracy for a given situation.
Appendix A
Appendices
A.2. Telescopic decomposition
where the Ai come with an associative product, i.e. (A1 A2)A3 = A1(A2 A3), and similarly for
the Bi. We will prove Eq. A.3 by induction. First, for the base case m = 1,
A1 − B1 = I(A1 − B1)I = A1 − B1 .
Bibliography
[1] Isaac H. Kim and Brian Swingle. Robust entanglement renormalization on a noisy quan-
tum computer. 2017. arXiv: 1711.07500.
[2] Troy J. Sewell and Stephen P. Jordan. Preparing Renormalization Group Fixed Points on
NISQ Hardware. 2021. arXiv: 2109.09787.
[3] Amazon. Quantum Computing Service. url: https://aws.amazon.com/braket.
[4] IBM. IBM Quantum Computing. url: https://www.ibm.com/quantum.
[5] Google. Google Quantum AI. url: https://quantumai.google.
[6] Matthias Evers, Anna Heid, and Ivan Ostoji. Pharma’s digital Rx: Quantum computing
in drug research and development. 2021. url: https://www.mckinsey.com/industries/
life-sciences/our-insights/pharmas-digital-rx-quantum-computing-in-drug-
research-and-development.
[7] Dylan Herman et al. A Survey of Quantum Computing for Finance. 2022.
[8] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. A Quantum Approximate Opti-
mization Algorithm. 2014.
[9] John Preskill. “Quantum Computing in the NISQ era and beyond”. In: Quantum 2 (Aug.
2018), p. 79.
[10] Xiongjie Yu, David Pekker, and Bryan K. Clark. “Finding Matrix Product State Rep-
resentations of Highly Excited Eigenstates of Many-Body Localized Hamiltonians”. In:
Phys. Rev. Lett. 118 (1 Jan. 2017), p. 017201.
[11] Michele Dolfi et al. “Matrix product state applications for the ALPS project”. In: Com-
puter Physics Communications 185.12 (Dec. 2014), pp. 3430–3440.
[12] Philippe Corboz et al. “Simulation of strongly correlated fermions in two spatial dimen-
sions with fermionic projected entangled-pair states”. In: Physical Review B 81.16 (Apr.
2010).
[13] G. Vidal. “Entanglement Renormalization”. In: Physical Review Letters 99.22 (Nov.
2007).
[14] G. Vidal. “Class of Quantum Many-Body States That Can Be Efficiently Simulated”. In:
Physical Review Letters 101.11 (Sept. 2008). issn: 1079-7114.
[15] Robert N. C. Pfeifer, Glen Evenbly, and Guifré Vidal. “Entanglement renormalization,
scale invariance, and quantum criticality”. In: Physical Review A 79.4 (Apr. 2009). issn:
1094-1622.
[16] Richard H. Roy, Arogyaswami Paulraj, and Thomas Kailath. “Estimation of Signal Parameters via Rotational Invariance Techniques - ESPRIT”. In: MILCOM 1986 - IEEE Military Communications Conference: Communications-Computers: Teamed for the 90’s 3 (1986), pp. 41.6.1–41.6.5.
[17] Glen Evenbly and Steven R. White. “Entanglement Renormalization and Wavelets”. In: Physical Review Letters 116.14 (Apr. 2016). issn: 1079-7114.
[18] P.W. Shor. “Algorithms for quantum computation: discrete logarithms and factoring”. In: Proceedings 35th Annual Symposium on Foundations of Computer Science (1994), pp. 124–134.
[19] Lov K. Grover. “A Fast Quantum Mechanical Algorithm for Database Search”. In: Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing. STOC ’96. Philadelphia, Pennsylvania, USA: Association for Computing Machinery, 1996, pp. 212–219. isbn: 0897917855.
[20] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information. 10th Anniversary Edition. Cambridge University Press, 2010.
[21] John Watrous. The Theory of Quantum Information. Cambridge University Press, 2018.
[22] J. Eisert, M. Cramer, and M. B. Plenio. “Area laws for the entanglement entropy”. In: Reviews of Modern Physics 82.1 (Feb. 2010), pp. 277–306.
[23] J. Ignacio Cirac et al. “Matrix product states and projected entangled pair states: Concepts, symmetries, theorems”. In: Reviews of Modern Physics 93.4 (Dec. 2021), p. 045003.
[24] Ulrich Schollwöck. “The density-matrix renormalization group in the age of matrix product states”. In: Annals of Physics 326.1 (2011), pp. 96–192.
[25] G. Evenbly and G. Vidal. “Algorithms for entanglement renormalization”. In: Physical Review B 79.14 (Apr. 2009). issn: 1550-235X.
[26] G. Evenbly and G. Vidal. Quantum Criticality with the Multi-scale Entanglement Renormalization Ansatz. Springer, Berlin, Heidelberg, 2013.
[27] G. Evenbly and G. Vidal. “Class of Highly Entangled Many-Body States that can be Efficiently Simulated”. In: Physical Review Letters 112.24 (June 2014). issn: 1079-7114.
[28] Ya-Lin Lo et al. “Quantum impurity in a Luttinger liquid: Universal conductance with entanglement renormalization”. In: Physical Review B 90 (Dec. 2014). doi: 10.1103/PhysRevB.90.235124.
[29] G. Evenbly and G. Vidal. “Scaling of entanglement entropy in the (branching) multiscale entanglement renormalization ansatz”. In: Physical Review B 89.23 (June 2014).
[30] Michael M. Wolf. Quantum Channels and Operations: Guided Tour. 2012. url: https://www-m5.ma.tum.de/foswiki/pub/M5/Allgemeines/MichaelWolf/QChannelLecture.pdf.
[31] K. Wright et al. “Benchmarking an 11-qubit quantum computer”. In: Nature Communications 10.1 (Nov. 2019).
[32] David C. McKay et al. “Efficient Z-gates for quantum computing”. In: Physical Review A 96.2 (Aug. 2017).
[33] Dmitri Maslov and Yunseong Nam. “Use of global interactions in efficient quantum circuit constructions”. In: New Journal of Physics 20.3 (Mar. 2018), p. 033018. issn: 1367-2630.
[34] Charles R. Harris et al. “Array programming with NumPy”. In: Nature 585.7825 (Sept. 2020), pp. 357–362.
[35] Bjarne Bouwer. Python scripts for project. url: https://github.com/amsqi/msc-bjarne.
[36] Yuval Elias, Tal Mor, and Yossi Weinstein. “Semioptimal practicable algorithmic cooling”. In: Physical Review A 83.4 (Apr. 2011).
[37] José M. Fernandez et al. “Algorithmic Cooling of Spins: A Practicable Method for Increasing Polarization”. In: International Journal of Quantum Information 02 (2004), pp. 461–477.
[38] Amazon Web Services. Amazon Braket. 2020. url: https://aws.amazon.com/braket/.
[39] John Cardy. Scaling and Renormalization in Statistical Physics. Cambridge University Press, 1996, pp. 52–54.
[40] Walter Rudin. Principles of Mathematical Analysis. McGraw-Hill Book Company, Inc., New York-Toronto-London, 1953, p. 210.
[41] Feifei Gao and Alex B. Gershman. “A generalized ESPRIT approach to direction-of-arrival estimation”. In: IEEE Signal Processing Letters 12.3 (2005), pp. 254–257.
[42] Tukaram Baburao Lavate, V.K. Kokate, and A.M. Sapkal. “Performance analysis of MUSIC and ESPRIT DOA estimation algorithms for adaptive array smart antenna in mobile communication”. In: 2010 Second International Conference on Computer and Network Technology. IEEE, 2010, pp. 308–311.
[43] Bjorn Ottersten and Thomas Kailath. “Direction-of-arrival estimation for wide-band signals using the ESPRIT algorithm”. In: IEEE Transactions on Acoustics, Speech, and Signal Processing 38.2 (1990), pp. 317–327.
[44] Shahram Shahbazpanahi, Shahrokh Valaee, and Mohammad Hasan Bastani. “Distributed source localization using ESPRIT algorithm”. In: IEEE Transactions on Signal Processing 49.10 (2001), pp. 2169–2178.
[45] Weilin Li, Wenjing Liao, and Albert Fannjiang. Super-resolution limit of the ESPRIT algorithm. 2019. url: https://arxiv.org/abs/1905.03782.
[46] Andrew Siegel. Practical Business Statistics. 7th ed. Academic Press, 2016, pp. 367–369.
[47] Pauli Virtanen et al. “SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python”. In: Nature Methods 17 (2020), pp. 261–272.
[48] Scott Aaronson and Daniel Gottesman. “Improved simulation of stabilizer circuits”. In: Physical Review A 70.5 (Nov. 2004).
[49] Hsin-Yuan Huang, Richard Kueng, and John Preskill. “Predicting many properties of a quantum system from very few measurements”. In: Nature Physics 16.10 (June 2020), pp. 1050–1057.
[50] Gautam Mandal. A review of the D1/D5 system and five dimensional black hole from supergravity and brane viewpoint. 2000.
[51] Daniel P. Arovas et al. “The Hubbard Model”. In: Annual Review of Condensed Matter Physics 13 (2022), pp. 239–274.
[52] Sajant Anand et al. Holographic quantum simulation of entanglement renormalization circuits. 2022.
[53] Roger Penrose. “Applications of negative dimensional tensors”. In: Combinatorial Mathematics and its Applications 1 (1971), pp. 221–244.
[54] Jacob C. Bridgeman and Christopher T. Chubb. “Hand-waving and interpretive dance: an introductory course on tensor networks”. In: Journal of Physics A: Mathematical and Theoretical 50 (2016).