
University of Amsterdam

Master Thesis 60 EC

Investigating many-body systems on


quantum computers

Author:
Bjarne Bouwer

Supervisors:
Prof. dr. Michael Walter
Dr. Jonas Helsen

Second assessor:
Dr. Philippe Corboz
Abstract

Noisy intermediate-scale quantum computers are limited in their performance due to the low
number of qubits and high error rate. A quantum circuit is limited in its circuit depth so that
errors do not build up to unmanageable levels. However, DMERA, a theoretical proposal by
Kim and Swingle based on the MERA tensor network, provides a method to construct a
quantum circuit that is inherently noise resilient and does not require many qubits [1]. The
ansatz can reliably prepare a desired quantum state by iteratively applying a quantum channel
with one stable fixed point. In a recent paper, Sewell and Jordan studied DMERA by
performing experiments on a quantum computer [2].
In this thesis, we investigate DMERA in a formal and experimental manner. We reproduce
and verify the results from Sewell and Jordan via experiments on a quantum computer.
Furthermore, we introduce a new quantum algorithm that can extract the scaling dimensions
of a critical model from measurement data using the eigenvalues of DMERA.

Acknowledgments

First, I would like to thank Michael Walter and Jonas Helsen for the great guidance and
discussions, I definitely learned a lot from them. Even though this thesis was done during a
weird time period where we had to meet in hybrid or completely digital fashion, they still made
me feel at home. Any time I had a question they were happy to answer me.
I want to thank the QuSoft quantum information group for letting me be part of the weekly
meetings and being so welcoming. In particular, I would like to thank Freek Witteveen for the
help in the beginning of the project and sharing his knowledge on tensor networks.
Thanks to Amazon we were able to perform experiments on a real quantum computer.
Through their Amazon Web Services we were given access to a wide selection of quantum
computers.
For their continued support during my studies, I want to thank my parents, sister, and
brother, for taking care of me and always showing interest in what I was working on. Lastly, I
am grateful that I got to finish my master’s degree with my dear friends who were a constant
factor in the time I was working on my thesis and who were always there in the library with
me.

Contents

1 Introduction

2 Quantum computing
  2.1 Quantum bits
  2.2 Quantum circuits
  2.3 Quantum channels
  2.4 Entanglement entropy and area law

3 Tensor networks
  3.1 MERA
    3.1.1 Formal definition
    3.1.2 Bond dimension
    3.1.3 Causal cone
    3.1.4 Entanglement entropy scaling
    3.1.5 Scaling superoperators
    3.1.6 Evaluation of observables and correlators
  3.2 DMERA
    3.2.1 Formal definition
    3.2.2 Causal cone
    3.2.3 Noise resilience

4 Implementing DMERA
  4.1 Practicalities
    4.1.1 Native gates
    4.1.2 Noise models
    4.1.3 Classical simulation of DMERA
    4.1.4 Algorithmic cooling
  4.2 Energies
    4.2.1 Classical method
    4.2.2 Quantum method and experiment

5 Extracting scaling dimensions
  5.1 Classical algorithm
  5.2 Quantum algorithm
    5.2.1 ESPRIT algorithm
    5.2.2 F-test for nested models
    5.2.3 Extraction algorithm
    5.2.4 Data collection
    5.2.5 Results
    5.2.6 Discussion and outlook

A Appendices
  A.1 Tensor contraction
  A.2 Telescopic decomposition

Chapter 1

Introduction

In recent years, there has been a rapid development in the world of quantum computing. Many
companies and organizations, such as Amazon [3], IBM [4], and Google [5], have shown interest
in building a quantum computer that can outperform even the fastest and largest classical
computer. The reason for this interest is the long list of potential applications, some of which
can revolutionize the digital world. Among these future applications are usages in medicine
development [6], financial modeling [7], and optimization problems [8]. Another notable entry
is the ability to break internet encryption. When Peter Shor proposed an algorithm that can
relatively easily solve the mathematical problems that guard everyone’s money, personal details
and more, people began to take interest in quantum computing and realized that this technology
could be world changing, for better or worse.
Unfortunately, the current generation of quantum computers is not advanced enough to
realize many of these potential applications. The machines do not have enough computing
power (qubits) yet, and any result obtained is unreliable due to the many errors that occur
during computation. The tasks that are run on quantum computers are limited in their size,
as errors would otherwise keep building up. John Preskill coined the term Noisy Intermediate-
Scale Quantum (NISQ) era for this generation of quantum computers [9]. He argues that even
with 100 qubits we should not expect too many great results yet. It would take orders of
magnitude more qubits to be able to efficiently run many of the quantum algorithms.

Quantum networks on quantum computers


If we have NISQ era quantum computers and they are supposedly poor in quality, what can
we use them for? Kim and Swingle proposed an ansatz called DMERA (Figure 1.1), based on
a known tensor network (MERA), that can in theory function well on a NISQ era computer
[1]. They suggest that the ansatz is robust against noise and does not require many qubits to
function properly. DMERA is used to prepare ground states (lowest energy quantum states) of
a physical model, which can then be used to compute physically interesting things such as the
energy or other quantities.
DMERA is based on tensor networks which are conventionally implemented on classical
computers. Tensor networks are used to simulate quantum mechanical systems, such as a chain
or lattice of strongly coupled spin particles. Ultimately, the goal is to prepare ground states
of such quantum systems and compute values which are of interest. These networks have been
used successfully in many studies and outperform other classical methods [10, 11, 12].
A commonly used tensor network is the matrix product state (MPS) which is very efficient
for computations of one-dimensional systems. However, there is one particular scenario where
it is not efficient: the critical domain. A material at a phase transition or highly entangled

Figure 1.1: An example of DMERA.

systems are both examples of critical systems. MPS does not exhibit the same properties that
these critical systems have, for example scale invariance or entanglement entropy scaling. There
is however one type of tensor network that has the same properties as many critical systems
and thus should be able to perform well in simulating criticality.
The multiscale entanglement renormalization ansatz (MERA) is built specifically to preserve
entanglement and is designed with critical systems in mind [13, 14]. It also has the properties
that follow the critical systems’ properties, such as correct entanglement entropy scaling and
decay of correlation functions [15].
Kim and Swingle combined both MERA and quantum computing to obtain DMERA. By
making adjustments to tailor MERA for quantum computers, they developed a method that
can be used efficiently and accurately on NISQ era quantum computers.
One big hurdle to deal with is the fact that quantum computers make many mistakes.
Luckily, DMERA is intrinsically noise resilient due to an inner layer structure. At each layer,
new pure qubits are introduced and old used qubits which contain noise are removed. As a
result, the accumulating noise during the computation stays limited and the approximation
of the ground state becomes more accurate. We will discuss the bound on the noise and a
corresponding mathematical proof in Section 3.2.3.
Furthermore, DMERA can be computed quickly and efficiently on a quantum computer.
With a small enough gate time, a quantum computation of DMERA would only take a few
seconds [1]. If we were to simulate DMERA on a classical computer instead, we would need
to store the information in memory and use a lot of time to compute the gates in DMERA. In
comparison, for a DMERA circuit depth D a quantum computer spends O(D log N) time to
compute a circuit with O(D^d) qubits, where N is the system size and d the number of spatial
dimensions, whereas a classical computer would have to spend time that scales exponentially
in O(D^d).
Although a classical computer cannot efficiently simulate DMERA for many qubits, it is
nonetheless convenient to use a classical simulation for verification of the results from a real
quantum computer. A quantum computer will always be prone to errors and noise, while a
classical computer can implement DMERA without any faults. A classical simulation can be
used to predict what should happen and also what will happen on a quantum computer. We
can simulate a noisy environment by choosing a noise model and applying it to the classical
simulation to replicate the errors that typically occur on a quantum computer.
With a classically simulated and a real quantum computer in hand, we can test the performance
of DMERA by preparing quantum states and measuring the energy of the states. The
prepared states should approximate the ground state better and better the more DMERA layers
we use, which we can test by measuring their energy. We should find a convergence to
the true ground state energy as a function of the number of applied layers. In Section 4.2 we
discuss the method and results of this experiment.
The same experiment has been performed by Sewell and Jordan on a different quantum
computer [2]. In their results they observed a convergence to a value close to the true ground
state energy and they were able to verify the theoretical proposal of Kim and Swingle. In this
thesis, we have used the methods of Sewell and Jordan to verify and build upon their results.

Contributions
In their paper, Sewell and Jordan briefly discuss the concept of scaling dimensions, which are a property
of a critical model, specifically of one that is described by a conformal field theory. Scaling
dimensions can be of physical interest, so finding a way to obtain these quantities from the
measurement data would be useful. The scaling dimensions of a critical model correspond
to the scaling dimensions of DMERA, so instead we will try to extract the scaling
dimensions of DMERA using a new quantum algorithm.
The DMERA scaling dimensions can in turn be computed from its eigenvalues. As briefly
mentioned before, DMERA is constructed from several layers and each layer is an application
of some function Φ such that the DMERA can be described as repeated iterations of Φ. The
function Φ and its eigenvalues dictate the convergence to the ground state and this can easily
be observed in the convergence of the expectation value of an observable O (Figure 1.2). For n
iterations of Φ, the expectation value of the observable ⟨O⟩ shows an exponential decay to some
final value. The decay contains information on the eigenvalues and thus the scaling dimensions.

Figure 1.2: Decay of the expectation value of an observable ⟨O⟩ as a function of the number of
DMERA layers n.
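The mechanism behind this decay can be illustrated with a toy example (a NumPy sketch; the amplitude-damping channel below is a stand-in chosen for illustration, not the DMERA channel of the thesis). It has a single stable fixed point |0⟩⟨0|, and the residual of ⟨Z⟩ shrinks geometrically with a rate set by a channel eigenvalue:

```python
import numpy as np

# Toy channel with one stable fixed point: amplitude damping toward |0><0|.
# (Illustrative stand-in for the DMERA channel Phi; gamma is an assumed value.)
gamma = 0.3
K = [np.array([[1, 0], [0, np.sqrt(1 - gamma)]]),
     np.array([[0, np.sqrt(gamma)], [0, 0]])]

def phi(rho):
    """One layer: apply the channel in Kraus form."""
    return sum(k @ rho @ k.conj().T for k in K)

Z = np.diag([1.0, -1.0])
rho = np.array([[0.5, 0.5], [0.5, 0.5]])  # start in |+><+|

# <Z> after n layers: the residual 1 - <Z> shrinks by a factor (1 - gamma)
# each layer, i.e. an exponential decay toward the fixed-point value <Z> = 1.
vals = []
for n in range(8):
    vals.append(np.trace(Z @ rho).real)
    rho = phi(rho)
print(vals)  # decays toward 1
```

The decay rate here plays the role of an eigenvalue of the iterated channel; extracting such rates from curves like Figure 1.2 is precisely the task of the algorithm in Chapter 5.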

We propose a new quantum algorithm to extract the eigenvalues from the decay. This
algorithm uses a sub-algorithm called ESPRIT [16] which is the key part of the new algorithm.
The ESPRIT algorithm can estimate the eigenvalues that could be present in a decay curve,
such as the one displayed in Figure 1.2. The new algorithm combines ESPRIT and a statistical
test to determine how many and which eigenvalues are most likely present in a decay curve.

From Figure 1.2 we could find a few eigenvalues of DMERA, but to find more eigenvalues
and map out the eigenvalue structure we would need more decay curves to draw information
from. By picking many observables and starting states, we can cover the whole basis space
of DMERA and theoretically extract every eigenvalue of DMERA. Using the new algorithm,
we are able to extract several eigenvalues from the data and relate them to the scaling
dimensions of DMERA.
To summarize, the current generation of quantum computers makes many mistakes in its
computations, so Kim and Swingle proposed an ansatz for a quantum circuit which is robust
against noise and can prepare states that are useful for research of critical physical models.
Using the methods described by Sewell and Jordan, we have computed the energies of the
prepared states and verified both their experimental results and the theoretical proposal of
Kim and Swingle. Finally, we propose a new quantum algorithm that can extract scaling
dimensions from measurement data by finding eigenvalues of DMERA.
In this thesis we study the quantum Ising chain with transverse field as our critical
model. This is a relatively simple model and easily verifiable, as it has been examined with
both MERA by Evenbly and White [17] and DMERA by Sewell and Jordan.

Organization
In Chapter 2 we introduce basic concepts from quantum computing and information theory
which appear frequently in the thesis. In Chapter 3 we discuss the MERA and DMERA
tensor networks by first introducing MERA and its properties and then making the step to
its variant DMERA, which has similar properties to MERA. In Chapter 4 we examine several
concepts that are used for experiments on a quantum computer and the classical simulation thereof before
performing an experiment to measure the energies of prepared states by the DMERA circuit.
Finally, in Chapter 5, after examining the eigenvalue structure of DMERA, we introduce a new
quantum algorithm that can extract the eigenvalues of the DMERA channel using data from
measurements and we test its performance on simulated data.

Chapter 2

Quantum computing

Quantum computers work inherently differently from classical computers; however, many parallels
can be drawn between the two types of computers. Both use a form of bits, the building blocks
of the logical computer, and logic gates to control the bits. The gates can be combined into a
circuit which outputs a desired result. In the case of classical computing, this result is always
determined by the input (save for any incidental errors). However, for quantum computers,
the result is dependent on the probabilities of the possible states as prepared by the circuit.
The prepared state is in a superposition of all possible outcomes so one trial of the circuit can
return a different output from another trial. The concept of superposition is unique to quantum
computers and when used properly it can significantly speed up computations compared to
classical computers. Superposition is used in many quantum algorithms, such as in Shor’s [18]
and Grover’s [19] algorithm, for a great decrease in computation time over classical algorithms.
This section serves as a brief introduction to the basics of quantum computing. For further
and more thorough discussion on quantum computing and information, see Nielsen and Chuang
[20] or Watrous [21].
In this chapter, we will introduce the quantum bit and its properties in Section 2.1, and
quantum gates to manipulate the quantum bits in Section 2.2. We then introduce quantum
channels, which generalize the idea of sending or manipulating quantum information, in Section
2.3. Lastly, we discuss entanglement entropy which can be used to quantify entanglement
between quantum (sub)systems in Section 2.4.

2.1 Quantum bits


A quantum bit (qubit) is the fundamental building block of quantum computing and is the
quantum counterpart of the classical bit. Formally, we describe it as a quantum state or object
that exists in a two-dimensional Hilbert space H = C². A qubit can be described by two basis
vectors for which we will choose the computational basis {|0⟩ , |1⟩}, where

|0⟩ = (1, 0)ᵀ ,   |1⟩ = (0, 1)ᵀ . (2.1)
In general, a qubit, represented by the vector |ψ⟩ ∈ H, will be in a superposition of |0⟩ and |1⟩,
that is |ψ⟩ = α |0⟩ + β |1⟩. Due to normalization, we require that |α|² + |β|² = 1.
We can combine qubits to represent quantum systems larger than one qubit by taking the
tensor product. To describe a two-qubit system, we join together two single qubits, that is
|ω⟩ = |ψ⟩ ⊗ |ϕ⟩ . (2.2)


In the computational basis, we often leave out the tensor product sign ⊗ to get |0⟩ ⊗ |1⟩ → |01⟩.
This concept can be trivially extended to an n-qubit system.
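Concretely (a NumPy sketch for illustration, not code from the thesis), the basis vectors and the tensor product translate directly into arrays and Kronecker products:

```python
import numpy as np

# Computational basis states of a single qubit (Eq. 2.1).
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

# A superposition |psi> = alpha|0> + beta|1>, normalized so |alpha|^2 + |beta|^2 = 1.
alpha, beta = 1 / np.sqrt(2), 1j / np.sqrt(2)
psi = alpha * ket0 + beta * ket1
assert np.isclose(np.vdot(psi, psi).real, 1.0)

# Two-qubit state |01> = |0> tensor |1| (Eq. 2.2), a vector in C^4.
ket01 = np.kron(ket0, ket1)
assert np.allclose(ket01, [0, 1, 0, 0])
```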

Density matrices
Commonly the density matrix, or operator, ρ ∈ L(H) is used to describe a quantum state
instead of |ψ⟩. In general,

ρ = ∑ᵢ pᵢ ρᵢ = ∑ᵢ pᵢ |ψᵢ⟩⟨ψᵢ| , (2.3)

where pi ≥ 0 are the probabilities for ρi . In addition, the trace of ρ must be one, i.e. Tr[ρ] = 1.
Formally, a density matrix ρ is a positive semidefinite operator as it is Hermitian and its
eigenvalues are non-negative. We denote this as ρ ∈ PSD(H).
Both vector and density matrix formulations are equivalent to each other, however in some
situations one may be more convenient to use than the other. The vector formulation |ψ⟩ is
useful when the quantum system is in a pure state, such as an eigenstate of the system. The
density matrix formulation can represent both pure states and mixed states. Mixed states are
needed, for example, to describe a subsystem of an entangled state, which is otherwise not
possible with the vector notation.
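The pure/mixed distinction can be made concrete with a short NumPy sketch (illustrative; the example states are assumptions). The purity Tr[ρ²] equals 1 exactly for pure states and is smaller for mixed ones:

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

# Pure state |+><+| with |+> = (|0> + |1>)/sqrt(2).
plus = (ket0 + ket1) / np.sqrt(2)
rho_pure = np.outer(plus, plus.conj())

# Mixed state: equal classical mixture of |0><0| and |1><1| (Eq. 2.3, p_i = 1/2).
rho_mixed = 0.5 * np.outer(ket0, ket0.conj()) + 0.5 * np.outer(ket1, ket1.conj())

# Both are valid density matrices: unit trace, non-negative eigenvalues.
for rho in (rho_pure, rho_mixed):
    assert np.isclose(np.trace(rho).real, 1.0)
    assert np.all(np.linalg.eigvalsh(rho) > -1e-12)

# Purity separates them: Tr[rho^2] = 1 (pure) vs 1/2 (maximally mixed qubit).
purity = lambda r: np.trace(r @ r).real
print(purity(rho_pure), purity(rho_mixed))
```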

2.2 Quantum circuits


The goal of quantum computing is to manipulate qubits in such a way that a measurement
returns the answer to a problem. In order to manipulate the qubits, a set of instructions should
be constructed where the instructions are unitary transformations. In quantum computing,
unitary transformations are represented as quantum (logic) gates. Quantum circuits can be
built from these quantum gates by placing them in series and parallel for n qubits.
Operating with a quantum gate U ∈ L(H) on a qubit |ψ⟩ is defined as U |ψ⟩. For density
matrices, a gate U acts both on the left and right of ρ,

U [ρ] = U ρU † .

Furthermore, a unitary U has the property that U † U = I and it follows then that the inner
product is conserved after a unitary transformation: ⟨U ψ, U ϕ⟩ = ⟨ψ| U † U |ϕ⟩ = ⟨ψ|ϕ⟩ = ⟨ψ, ϕ⟩.
Additionally, by the cyclic property of trace we have that Tr[U [ρ]] = Tr[ρ] = 1 which implies
that unitaries map quantum states to quantum states.
Commonly used quantum gates are (matrices written row by row, rows separated by semicolons):

• Pauli spin operators X = [0 1; 1 0] , Y = [0 −i; i 0] , Z = [1 0; 0 −1]

• Hadamard gate H = (1/√2) [1 1; 1 −1]

• Phase gate P(ϕ) = [1 0; 0 e^{iϕ}]

• Controlled NOT gate CNOT = [1 0 0 0; 0 1 0 0; 0 0 0 1; 0 0 1 0]
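These matrices can be written down and checked directly (a NumPy sketch, illustrative only): each gate satisfies the unitarity condition U†U = I, and acting as U[ρ] = UρU† preserves the trace.

```python
import numpy as np

# The gates listed above.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def P(phi):
    """Phase gate P(phi)."""
    return np.array([[1, 0], [0, np.exp(1j * phi)]])

CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

# Unitarity: U^dagger U = I for every gate.
for U in (X, Y, Z, H, P(0.3), CNOT):
    assert np.allclose(U.conj().T @ U, np.eye(U.shape[0]))

# Trace preservation of U[rho] = U rho U^dagger on an (assumed) example state.
rho = np.array([[0.75, 0.25], [0.25, 0.25]], dtype=complex)
assert np.isclose(np.trace(H @ rho @ H.conj().T).real, 1.0)
```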


Quantum circuits
Quantum circuits are the bread of quantum computing and quantum states are the butter.
Circuits act on n qubits and are a (tensor) product of quantum gates. For example, to prepare
a so-called Bell state |Φ+ ⟩ we can construct a small quantum circuit which uses only two gates.
In equation form this reads

CNOT(H ⊗ I) |00⟩ = CNOT (1/√2)(|00⟩ + |10⟩)
                 = (1/√2)(|00⟩ + |11⟩),
where H is the Hadamard gate which brings a |0⟩-state in an equal superposition of |0⟩ and
|1⟩ and CNOT performs a spin flip on the second qubit if the first qubit is in the |1⟩-state.
However, we can also represent this operation with a diagram. In Figure 2.1 we visualize the
qubits as wires running from left to right passing through the gates. Measuring the prepared
state in computational basis will result in the |00⟩-state in 50% of the results and the |11⟩-state
in the other 50%. A single measurement will yield only one of the results, therefore it is crucial
to perform many trials and measurements of the circuit to get a realistic view of the actual
prepared state.


Figure 2.1: A quantum circuit that prepares the Bell state |ψ⟩ = (|00⟩ + |11⟩)/√2 from the
starting state |00⟩. The H-gate denotes the Hadamard gate, which brings the first qubit in an
equal superposition of |0⟩ and |1⟩. The X-gate together with the dot denotes the CNOT gate,
where the first qubit is the control bit. The X-gate is performed only if the first qubit is in the
|1⟩-state.
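The circuit of Figure 2.1 can be simulated in a few lines (a NumPy sketch for illustration; the sampling step mimics repeated measurement trials):

```python
import numpy as np

rng = np.random.default_rng(0)

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

# |Phi+> = CNOT (H tensor I) |00>.
ket00 = np.array([1, 0, 0, 0], dtype=complex)
bell = CNOT @ np.kron(H, np.eye(2)) @ ket00

# Born rule: measurement probabilities are the squared amplitudes.
probs = np.abs(bell) ** 2  # 1/2 for |00>, 1/2 for |11>, 0 otherwise
assert np.allclose(probs, [0.5, 0, 0, 0.5])

# Many trials: roughly half "00" and half "11", never "01" or "10".
outcomes = rng.choice(["00", "01", "10", "11"], size=1000, p=probs)
print({s: int(np.sum(outcomes == s)) for s in ("00", "11")})
```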

2.3 Quantum channels


Quantum circuits are a great tool to effectively implement a set of instructions, but we will want
to use a more general notion. For this, we will consider quantum channels [20]. A quantum
channel is a way to transmit quantum and classical information. Usually, this information is
in the form of qubits.
Formally, a quantum channel Φ is a linear map that is completely positive and trace pre-
serving [21]. Complete positivity means that Φ ⊗ IR sends any positive semidefinite operator
to another positive semidefinite operator for all HR . Trace preserving simply implies that the
trace of an operator is preserved after an application of the channel, i.e. Tr[Φ[X]] = Tr[X] for
X ∈ L(H). These two definitions ensure that any quantum state ρ ∈ PSD(HA) gets sent to
another quantum state σ ∈ PSD(HB).
A quantum circuit follows the definition of a quantum channel, however the opposite is not
always true. In Section 2.2 it was clear that quantum circuits are a combination of quantum
gates which are by definition unitary, and a quantum circuit is then by extension also unitary. A


quantum channel, however, need not be unitary. Unitarity implies reversibility and there
exist some channels that cannot be reversed. Consider a quantum channel Φ that measures a
state and prepares the state ρ0 = |0⟩⟨0|. Then for any ρ,
Φ[ρ] = ρ0 = |0⟩⟨0| ,
which is a valid quantum channel but not unitary.
Quantum channels can act consecutively on a state ρ. If first the channel Φ is applied to ρ
and then the channel Ψ, we write this as Ψ[Φ[ρ]] or (Ψ ◦ Φ)[ρ]. In the case that Φ = Ψ we can
alternatively write this as Φ²[ρ].
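The reset channel above can be written in Kraus form and checked numerically (a NumPy sketch; the Kraus operators K₀ = |0⟩⟨0|, K₁ = |0⟩⟨1| are one standard representation, chosen here for illustration):

```python
import numpy as np

# Kraus operators for the reset channel Phi[rho] = |0><0|.
K = [np.array([[1, 0], [0, 0]], dtype=complex),   # |0><0|
     np.array([[0, 1], [0, 0]], dtype=complex)]   # |0><1|

# Trace preservation: sum_i K_i^dagger K_i = I.
assert np.allclose(sum(k.conj().T @ k for k in K), np.eye(2))

def reset(rho):
    return sum(k @ rho @ k.conj().T for k in K)

# Every input state is mapped to |0><0| -- the channel is not reversible,
# hence not unitary, yet it is a valid (CPTP) quantum channel.
rho = np.array([[0.25, 0.1j], [-0.1j, 0.75]])
assert np.allclose(reset(rho), [[1, 0], [0, 0]])

# Consecutive application (Phi o Phi = Phi^2) changes nothing here.
assert np.allclose(reset(reset(rho)), reset(rho))
```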

2.4 Entanglement entropy and area law


Quantum systems can have either classical or quantum correlations between their subsystems. These
two types of correlations can be distinguished from each other by writing out the joint quantum
state in terms of the individual qubit registers. If a state has quantum correlations then we
speak of entanglement.
Separable states can have classical correlations and are of the form
ρAB = ∑ᵢ pᵢ ρA,i ⊗ ρB,i , (2.4)

where ρAB is the joint state of two qubits A and B [21]. An example of a separable state would
be ρ = (1/2)(|00⟩⟨00| + |11⟩⟨11|), which is analogous to two coins that always land on the same
side together.
Entangled states are states that cannot be written as a separable state. This implies that
there is some correlation between the subsystems that is not classical. Commonly used entan-
gled states are the Bell states,

ρBell = |Φ⁺⟩⟨Φ⁺| ,   |Φ⁺⟩ = (1/√2)(|00⟩ + |11⟩) . (2.5)
This state cannot be written in the form of Eq. 2.4 so it is an entangled state.
The “amount” of entanglement between subsystems can be measured with the von Neumann
entropy, or entanglement entropy, which is defined as
S(ρ) = − Tr[ρ log ρ] = − ∑ᵢ pᵢ log pᵢ . (2.6)

Consider the Bell state from Eq. 2.5. The entanglement entropy of register A is found by
computing S(ρA ) = − Tr[ρA log ρA ], with ρA = TrB [ρBell ], to get S(ρA ) = log 2 which turns
out to be the largest value for a two-dimensional system that one can get. For this reason the
Bell state is sometimes called the maximally entangled state.
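This calculation is easy to reproduce numerically (a NumPy sketch; the reshape-based partial trace is an implementation choice, not from the thesis):

```python
import numpy as np

# Bell state |Phi+> = (|00> + |11>)/sqrt(2) and its density matrix (Eq. 2.5).
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(bell, bell.conj())

# Partial trace over B: view rho with indices (a, b, a', b') and contract b = b'.
rho_A = np.trace(rho.reshape(2, 2, 2, 2), axis1=1, axis2=3)

# rho_A is maximally mixed, I/2.
assert np.allclose(rho_A, np.eye(2) / 2)

# Von Neumann entropy S = -sum_i p_i log p_i (Eq. 2.6) over nonzero eigenvalues.
p = np.linalg.eigvalsh(rho_A)
p = p[p > 1e-12]
S = -np.sum(p * np.log(p))
assert np.isclose(S, np.log(2))  # maximal for a single qubit
```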

Area law
If we were to pick a random state ρ from a Hilbert space H, there will generally be some
entanglement between the subsystems of ρ which we can quantify with the von Neumann
entropy. Suppose we pick a random state ρrdm ∈ H for L sites with dimensionality d; then
S(ρrdm) ≈ L^d, i.e. the entropy scales as the volume of the system. Most of the pᵢ in ρrdm are
non-vanishing so the entanglement entropy becomes large.
Now, a ground state ρgs of a non-critical Hamiltonian H will generally have only a few
relevant pᵢ, so instead the entropy scales as the boundary of the system, S(ρgs) ≈ L^(d−1). This
is known as the area law of entanglement entropy [22]. Additionally, ground states of critical
Hamiltonians acquire a logarithmic correction, so S(ρgs,crit) ≈ L^(d−1) log L.

Chapter 3

Tensor networks

Quantum systems can have exponentially large associated Hilbert spaces, which can be difficult
to deal with for a classical computer. Let us consider a joint system of 3 spins. The associated
Hilbert space of this system is H012 = H0 ⊗ H1 ⊗ H2, which has a dimensionality of 2·2·2 = 2³ = 8.
The dimension of the Hilbert space for an n-spin system would be 2ⁿ, exponential in n. An
exponentially growing Hilbert space dimension is concerning if one wants to classically simulate
a quantum system. Classical computers that are used for quantum physical calculations will
have to store that information in memory which makes it difficult to handle systems with
numerous particles as the memory requirement quickly grows.
Fortunately, tensor networks, a class of computational methods for classical computers, are
designed to tackle many-body quantum systems in a way that avoids the exponential growth
problem. By only considering a select part of the full Hilbert space, tensor networks do not
need to consider an exponentially growing space. In addition, tensor networks trade accuracy
for speed and efficiency by limiting the size of their tensors. This trade works because
tensor networks are generally used to approximate ground states. Ground states
have special properties compared to a general state, such as the entanglement entropy scaling
which we discussed in Section 2.4, which is why tensor networks can approximate ground states
in a computationally favorable way.
One of the most commonly used tensor networks is the so-called matrix product state (MPS)
[23, 24]. In short, MPS works by decomposing a large tensor, e.g. a state which describes many
spin particles, into a chain of smaller tensors whose size depends on the number of particles. By
restricting the tensors’ size up to a certain value χ we decrease the amount of information in our
computations, which makes it easier for a classical computer to deal with the calculations. One
might think that we then lose too much information to accurately approximate the quantum
system, but by keeping the most relevant states during the size restriction, MPS is still able to
make an accurate approximation if the value χ is chosen appropriately. For more discussion on
MPS, see the review of Schollwöck [24].
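The truncation step at the heart of MPS can be sketched with a plain SVD (illustrative NumPy code; the random 8-spin state, the 4/4 cut, and χ = 4 are assumptions made for this example). Keeping the χ largest Schmidt coefficients gives the best rank-χ approximation across the cut, with an error set entirely by the discarded weights:

```python
import numpy as np

rng = np.random.default_rng(1)

# A random 8-spin state, reshaped as a matrix between the left and right 4 spins.
psi = rng.normal(size=(2**4, 2**4))
psi /= np.linalg.norm(psi)

# Schmidt decomposition across the cut = singular value decomposition.
U, s, Vh = np.linalg.svd(psi, full_matrices=False)

# Truncate: keep only the chi largest Schmidt coefficients (bond dimension chi).
chi = 4
psi_trunc = U[:, :chi] @ np.diag(s[:chi]) @ Vh[:chi, :]

# The 2-norm truncation error equals the weight of the discarded coefficients.
err = np.linalg.norm(psi - psi_trunc)
assert np.isclose(err, np.sqrt(np.sum(s[chi:] ** 2)))
```

For a ground state obeying the area law the Schmidt spectrum decays quickly, so a small χ already gives a good approximation; for the random state above it does not.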
In this chapter, we will introduce and discuss the multiscale entanglement renormalization
ansatz (MERA), a different type of tensor network. Later, we will introduce the deep MERA
(DMERA), a variant on MERA which is suited to be used on a quantum computer.

3.1 MERA
The matrix product state is not efficient at dealing with critical systems, as its properties (e.g.
entanglement entropy scaling, lack of scale invariance, decay of correlation functions) do not


align with the properties of critical systems. MPS can still approximate the ground state of a
critical system, but cannot capture all the behaviors of such a system.
The multiscale entanglement renormalization ansatz (MERA) is a tensor network that is
based on the concepts of the renormalization group and entanglement, and aims to handle (one-
dimensional) critical Hamiltonians more efficiently than MPS [14, 25, 26, 27]. MERA
has several properties which coincide with the properties that critical systems also have, which
we will discuss in detail in the following section.
In this section, we will first formally define a translation and scale invariant one-dimensional
binary MERA, and discuss several aspects such as bond dimension and causal cone. Then we
will look at scale invariance in more detail and define the ascending/descending superoperators
in Section 3.1.5. Lastly, we discuss the evaluation of expectation values of observables in Section
3.1.6.

3.1.1 Formal definition


A translation and scale invariant one-dimensional binary MERA (Figure A.1) can be seen as
a transformation from an initial state |ψ0 ⟩ that lives on a lattice L0 to a final state |ψT ⟩ on
a lattice LT . A lattice Li containing ni sites with local dimension d has an attached Hilbert
space HLi = (C^d)^⊗ni, one factor C^d per site; for now we assume ni to be an arbitrary integer. The
transformation takes place in T steps; the number of steps depends on how the MERA is
structured. Let U describe the MERA, then

U = UT −1 UT −2 · · · U0

with Ui : HLi → HLi+1 , which are referred to as (MERA) layers. Then the state transforms
with each layer,
|ψ0 ⟩ → |ψ1 ⟩ → ... → |ψT ⟩ ,
where |ψi+1 ⟩ = Ui |ψi ⟩. Each Ui is described by two tensors u and w. The u-tensors, also called
disentanglers, are described by

u : (C^d)^⊗2 → (C^d)^⊗2 ,   u†u = u u† = I_{d²} , (3.1)

and the w-tensors, called isometries, are described by

w : C^d → (C^d)^⊗2 ,   w†w = I_d . (3.2)

As MERA is a tensor network, we can represent the network using the diagrammatic tensor
notation, see Appendix A.1. A graphical representation is shown in Figure A.1. The lattice
points in Li are represented as wires and the tensors as (xin , xout )-tensors, where xin is the
number of wires going into the tensor and xout the number of wires leaving the tensor. We know
that each isometry w is the same by translational invariance, which implies that all ingoing
wires must have the same dimension d and vector space C^d. By the definitions in Eq. 3.1 and
3.2 we can see that the w-tensors are of the type (1,2) and the u-tensors are of the type (2,2).
The tensors are laid out in a brick wall structure where the u-tensors entangle neighboring
wires.
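Tensors satisfying Eqs. 3.1 and 3.2 are easy to generate numerically (an illustrative NumPy sketch using random tensors from a QR decomposition; actual MERA tensors would be variationally optimized):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 2  # local (bond) dimension, an assumed value

# Disentangler u: a d^2 x d^2 unitary acting on two neighboring sites.
A = rng.normal(size=(d**2, d**2)) + 1j * rng.normal(size=(d**2, d**2))
u, _ = np.linalg.qr(A)
assert np.allclose(u.conj().T @ u, np.eye(d**2))  # u^dag u = u u^dag = I (Eq. 3.1)

# Isometry w: C^d -> (C^d)^{(x)2}, i.e. a d^2 x d matrix with orthonormal columns.
B = rng.normal(size=(d**2, d)) + 1j * rng.normal(size=(d**2, d))
w, _ = np.linalg.qr(B)  # reduced QR: w has shape (d^2, d)
assert np.allclose(w.conj().T @ w, np.eye(d))     # w^dag w = I_d (Eq. 3.2)

# Note: w w^dag is only a projector on the two-site space, not the identity.
assert np.allclose(w @ w.conj().T @ w, w)
```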
Here we have explicitly defined a specific MERA scheme, a binary MERA, by choosing
the isometries to map from C^d to (C^d)^⊗2. One can also define a ternary MERA by letting
the co-domain of the isometry be (C^d)^⊗3 [28]. Different schemes also exist such as a modified
binary MERA and branching MERA [26, 27], or even generalizations to higher dimensions [14].
These schemes differ in contraction cost scaling and efficiency of accuracy. In this thesis we
will assume that every MERA is binary for simplicity.

14
3.1. MERA

(a) Two MERA layers. The isometries are represented by the blue tensors and disentanglers
by the green tensors. The dimension of the tensors is determined by the bond dimension d.
(b) Consecutive MERA layers which show the fine-graining effect of MERA transformations.
Each layer introduces new sites and entangles them with the existing sites. The inner structure
of the two layers is explicitly shown in (a).

Figure 3.1: The effects of MERA transformations on a small and large-scale level.

Criticality properties
In general, we want MERA to have similar properties to the systems we study, namely
critical systems. Two of those properties are translation and scale invariance. Earlier, we
asserted that all isometries and disentanglers are the same in Ui . This choice ensures that
the structure is invariant under shifting positions of the tensors. One can also let all tensors
be different and freely adjust each tensor to optimize the network. For scale invariance, all Ui
consist of the same tensors {w, u}. Each layer Ui is then independent of the scale. By requiring
these invariance properties, for a given local dimension d and a lattice of N sites,
we lower the number of parameters needed to describe the tensors from O(d^4 N) to O(d^4 log2 N) by
translational invariance, and even further to O(d^4) by scale invariance. We will return to scale
invariance of MERA in Section 3.1.5.
At this point, we should look at the similarities between quantum circuits and the given
definition of MERA. Quantum circuits implement a transformation from an initial n-qubit state
|ψ0⟩, by convention |0⟩^⊗n, to a final state |ψf⟩ by applying various quantum gates to the qubits.
Similarly, MERA transforms an initial state |ψ0 ⟩ to a final state |ψT ⟩ by repeatedly applying
quantum circuits. If we think of MERA as a particular quantum circuit, then we can interpret
the lattice points as qubits and the {w, u} unitaries as two-qubit quantum gates. Precisely this
interpretation is what makes MERA a particularly good and natural candidate for a tensor
network that can be implemented on a quantum computer.

3.1.2 Bond dimension


From the current definition of MERA it is not apparent how to adjust the size of the tensors
which can impact the accuracy and efficiency of the ansatz. As it is currently defined, the size
of the tensors depends only on the dimensionality d of the lattice sites, e.g. d = 2 for spins.
However, we can change d into a parameter which we can adjust by grouping sites together.
The performance of MPS is often indicated with a quantity χMPS which serves to put a limit
on the bond dimension, the dimension between the tensors. The larger the bond dimension
is allowed to be, the more accurate MPS will be. This however comes with a (literal) cost of
computational time. We can use a similar notion to quantify the accuracy of MERA. Earlier
we saw that the number of parameters needed to describe the tensors that make up MERA is
O(d4 ), where d is the dimension of the legs that point in- or outwards of the tensor. If we can


increase d, then the number of parameters increases, and therefore the accuracy of MERA.

Increasing bond dimension


Given a lattice L0 with n0 sites that have dimension d, let us group s sites together into
bigger sites with dimension d^s (Figure 3.2). The MERA that acts on this lattice now has to
contain tensors with larger dimensions, namely d^{2s} × d^{2s} for disentanglers and d^{2s} × d^s for
isometries. These bigger tensors contain more elements and thus more parameters, and we
expect the accuracy of MERA to increase accordingly. Figure 3.3 shows the
relative error between the computed ground state energy and the exact ground state energy
as a function of the bond dimension for both MERA and MPS [25]. As expected,
MPS becomes more accurate for larger values of χ_MPS. Similarly, MERA is more accurate for
increasing bond dimension χ_MERA, which is what we have been calling the local dimension d thus far.
At this point it seems appropriate to promote the local dimension d to the bond dimension
χ_MERA, as we have shown that it fulfills a similar role to χ_MPS: it is a parameter that signifies
the accuracy of the corresponding tensor network.

Figure 3.2: Grouping of sites to increase the dimensionality of a "site".

Note that grouping sites together is purely for computational purposes and, in most cases,
has no physical meaning. This does not mean, however, that we lose any physical information. Any
observable that we may be interested in can be transformed to this grouping ansatz. For
example, a d-dimensional observable h can be appropriately embedded by s − 1 identities such
that it becomes a ds -dimensional observable, that is
h 7→ h′ = I ⊗ ... ⊗ I ⊗ h ⊗ I ⊗ ... ⊗ I,
where I are d × d identities.
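The embedding h ↦ h′ above is just a Kronecker product with identities. A minimal numpy sketch (the function name and the choice of position are illustrative):

```python
import numpy as np

def embed(h, s, pos, d=2):
    """Embed a d-dimensional single-site observable h into a grouped site of
    s original sites, acting nontrivially at position pos (0-indexed)."""
    op = np.eye(1)
    for k in range(s):
        op = np.kron(op, h if k == pos else np.eye(d))
    return op

h = np.array([[1.0, 0.0], [0.0, -1.0]])  # Pauli Z, d = 2
hp = embed(h, s=3, pos=1)                # h' = I ⊗ h ⊗ I on a grouped site

assert hp.shape == (8, 8)                # d^s = 2^3 dimensional observable
assert np.allclose(hp, np.kron(np.eye(2), np.kron(h, np.eye(2))))
```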
The grouping ansatz can be generalized by transitional layers, which are placed before
the scale/translational invariant MERA. The transitional layers are allowed to have arbitrary
isometries and disentanglers, with arbitrary dimensions, that generally differ from the ones
used in the MERA. As such, they can be freely adjusted in a method called tensor optimization,
where each individual tensor is iteratively optimized such that the resulting MERA better
approximates the ground state energy. We can consider a transitional layer that contains
isometries (or identities) that simply group two sites together, that is w : C^d ⊗ C^d → C^{d^2},
which can be extended to allow for an arbitrary number of sites to be grouped together.
Besides this specific application of transitional layers, we will not make further use of them.
For more discussion on transitional layers and tensor optimization, see [26].

3.1.3 Causal cone


Up until this point we did not assume anything about the width of the initial lattice L0,
because it does not matter until we contract the network. As it turns out, if we place an observable


Figure 3.3: Relative error in ground state energy of a select few critical spin models, for both
MERA and MPS. The energy error scales polynomially with the bond dimension for both tensor
networks. Figure from [26].

at the bottom of the network and compute the expectation value of that observable (which we
will do later in Section 3.1.6), only certain tensors can affect the state that ends up at the
observable. These tensors are part of the causal cone, as shown in Figure 3.4. The notion of
a cone of causality is not unique to MERA; it is similar to the light cone that can be found
in the theory of relativity. Events (tensors) that happen outside the cone cannot influence the
present (observable).

Figure 3.4: The causal cone of a three-site observable (obs) in MERA. The cone has a fixed
width for this operator.

The MERA contains tensors that are not causally connected to an observable O (i.e. not
in the causal cone), so there will be situations where u directly contracts with u† or w with
w† , see Figure 3.6. As defined earlier, we know that u† u = u u† = I and w† w = I, so these
contractions contribute nothing to the computation and we can safely ignore them. We can say
that the tensors that do not annihilate with each other are elements of the causal cone.


Fixed widths
A practically useful property of the causal cone of a MERA is that the width, in terms of the
number of sites, will converge to a fixed value which depends on the support of the observable.
Let us consider a three-site observable O (Figure 3.6). After the contraction of all the tensors
that do not affect O, there are only three sites left that did not contract to identity, so the
causal cone of a three-site observable also has a width of three sites. This is one of the few
fixed values but we can write down a general formula to find the other fixed values. We will
investigate this by seeing what the effect of one Ui† transformation is on n sites, or rather, how
the width of the causal cone of n sites changes. We will separate the cases of odd and even
number of sites.
For an odd number of sites, n sites will connect to ⌈n/2⌉ disentanglers (u†) which in turn
connect to ⌈n/2⌉ + 1 isometries (w†). The width of a causal cone with n sites as its base is ⌈n/2⌉ + 1
sites after going up one layer,

n → ⌈n/2⌉ + 1.  (3.3)
As an example for n = 1, the single site must connect to a single disentangler which connects
to 2 isometries, so the causal cone now has a width of 2 sites. Similarly, n = 3 has a cone
width of 3 sites, n = 5 a cone width of 4 sites and n = 7 a cone width of 5 sites. An interesting
observation we can already make, is that a causal cone of 3 sites has a fixed width.
The counting works differently for an even number of sites. Instead, n sites can connect to
either n/2 or n/2 + 1 disentanglers depending on which n sites are being considered. Let us call
an observable aligned if it follows the former case and disaligned for the latter case, see Figure
3.5. An aligned observable will then connect to n/2 + 1 isometries and a disaligned one to n/2 + 2
isometries. The causal cone of n sites has a width of either n/2 + 1 or n/2 + 2 after going up one
layer,

n → n/2 + 1  (aligned)  (3.4)

n → n/2 + 2  (disaligned).  (3.5)
This implies that the cone of n = 2 has a width of 2 or 3 sites, n = 4 a cone width of 3 or 4
sites and n = 6 a cone width of 4 or 5 sites. Again, we observe that 2 and 4 sites can have a
causal cone with a fixed width, depending on the exact positions of those sites in relation to
the MERA.
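The width maps above are easy to iterate numerically. A small sketch showing that the odd-case map of Eq. 3.3, n → ⌈n/2⌉ + 1, converges to the fixed width of 3 sites (for simplicity the odd-case map is applied throughout, even at intermediate even widths):

```python
import math

def next_width_odd(n):
    """Causal cone width after one layer for an n-site base (odd case, Eq. 3.3)."""
    return math.ceil(n / 2) + 1

widths = [7]
for _ in range(5):
    widths.append(next_width_odd(widths[-1]))

print(widths)  # [7, 5, 4, 3, 3, 3]
assert widths[-1] == 3
assert next_width_odd(3) == 3  # 3 is indeed a fixed point
```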

Figure 3.5: A four-site operator that is (a) aligned and connected to two disentanglers; or (b)
disaligned and connected to three disentanglers.

We have seen that 2 and 4 sites can have a causal cone with a fixed width in some situations,


whereas 3 sites are guaranteed to have a constant width for their causal cone. Given that there
are ni sites on each lattice Li, we would have had to compute O(n_i^2) tensors for each MERA
layer. Now, in the worst case, we only have to consider up to 7 tensors per layer that contribute
to the expectation value of an observable. This significantly reduces the number of tensors that
take part in computations, and consequently reduces the computation time.

Figure 3.6: Contraction to identity of tensors that are not causally connected to the three-site
operator O.

Superoperators
To make the concept of a causal cone more precise, we turn to ascending superoperators. Up
until this point we have imagined the MERA to transform an initial state to a final state, |ψf ⟩ =
U |ψ0 ⟩. If we computed the expectation value of an observable O, this would be ⟨ψf | O |ψf ⟩.
However, from a mathematical perspective it is equivalent to transform the observable O by
absorbing U and U † to obtain the transformed observable O′ , that is

⟨ψf | O |ψf ⟩ = ⟨ψ0 | U † O U |ψ0 ⟩ = ⟨ψ0 | O′ |ψ0 ⟩ . (3.6)

However, as we know, not all tensors in U are involved with O. Precisely which tensors are
involved depends on the support of O, and more specifically on which exact sites support O. As an
illustration, a two-site operator that is directly connected to two disentanglers will then connect
to three isometries, whereas a two-site operator that is connected to just one disentangler will
connect to two isometries. In the former case, O′ now has a larger support than O; ideally we
want the transformed operators to retain the same support.
For now, let us consider three-site operators, although this argument can be extended to any
odd number of sites. A local operator O_{i+1}^{[r−1, r, r+1]} on lattice L_{i+1} that is supported on adjacent
sites [r − 1, r, r + 1] can be transformed to the lattice Li by the ascending superoperator S
such that

O_i^{[r′−1, r′, r′+1]} = S[O_{i+1}^{[r−1, r, r+1]}],  (3.7)

where [r′ − 1, r′, r′ + 1] are adjacent sites on lattice Li. It can be ambiguous how exactly O is
transformed with respect to the left and right sides of the disentanglers, so instead we will average over
both possibilities such that S = (1/2)(S_L + S_R). Eq. 3.7 can also be expressed in diagrammatic
notation, see Figure 3.7. We can take this figure to be the definition of how S acts on three-site
operators.


Figure 3.7: The ascending superoperator S acting on a three-site operator. Both possibilities
SL and SR need to be averaged over.

Similarly, we can define a descending superoperator D given a density operator ρ on lattice
Li,

ρ_{i+1}^{[r′−1, r′, r′+1]} = D[ρ_i^{[r−1, r, r+1]}],  (3.8)

where again we average such that D = (1/2)(D_L + D_R). The ascending and descending superoperators
are each other's dual, D = S*, as we have the relation

Tr(O_{i+1} D[ρ_i]) = Tr(S[O_{i+1}] ρ_i),  (3.9)

which is equivalent to Eq. 3.6.
Both superoperators can be related to the Schrödinger and Heisenberg picture. The de-
scending superoperator transforms the wavefunction which is precisely the Schrödinger picture,
whereas evolving the operator corresponds to the Heisenberg picture. Both interpretations lead
to the same result, however in certain situations it may be insightful to choose one over the
other.
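The duality in Eq. 3.9 can be checked numerically for any channel written in Kraus form: if the descending map is D[ρ] = Σ_k K_k ρ K_k†, its dual (Heisenberg-picture) map is S[O] = Σ_k K_k† O K_k. A sketch with a random channel standing in for an actual MERA superoperator:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4

# Random Kraus operators, normalized so that sum_k K†_k K_k = I (a valid channel).
A = [rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)) for _ in range(3)]
nrm = sum(a.conj().T @ a for a in A)
w, v = np.linalg.eigh(nrm)
K = [a @ (v @ np.diag(w**-0.5) @ v.conj().T) for a in A]  # A_k nrm^{-1/2}

def descend(rho):   # Schrödinger picture, like D
    return sum(k @ rho @ k.conj().T for k in K)

def ascend(O):      # Heisenberg picture, the dual S
    return sum(k.conj().T @ O @ k for k in K)

B = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = B @ B.conj().T
rho /= np.trace(rho)              # random density matrix
O = rng.normal(size=(d, d))
O = O + O.T                       # random Hermitian observable

assert np.allclose(np.trace(O @ descend(rho)), np.trace(ascend(O) @ rho))  # Eq. 3.9
```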

3.1.4 Entanglement entropy scaling


Ground states of critical one-dimensional Hamiltonians have a logarithmic scaling of their
entanglement entropy, see Section 2.4. This scaling cannot be represented well by MPS, but
MERA does capture it. The following derivation is based on [29] for one spatial dimension.
An important property of the entanglement entropy is that it is upper bounded by S(ρ) ≤
log2 d, where d is the dimension of ρ. Another property is the triangle inequality, which leads
to S(ρA ) ≤ S(ρAB ) + S(ρB ). Combining both properties we get
S(ρA ) ≤ S(ρAB ) + log2 d. (3.10)
In the process of preparing the fixed-point state we have the sequence ρ0 → ρ1 → ... → ρT by
applying D a total of T times. As we go through this sequence, we descend through the causal cone and
trace out sites. The traced sites add entropy recursively, that is

S(ρ_z) ≤ S(ρ_{z−1}) + log2(d) n_z^tr,  (3.11)

where 0 < z ≤ T and n_z^tr is the number of traced sites. At any point the entropy in the sequence
is upper bounded by S(ρ_z) ≤ l_z log2 d, where l_z is the number of sites at scale z. Iterating Eq.
3.11 until we get to ρT gives us

S(ρ_T) ≤ log2(d) (l_T + N_T^tr),  (3.12)


where N_T^tr = Σ_{z=0}^{T−1} n_z^tr.
From the causal cone discussion in Section 3.1.3, we know that the fixed width of the causal
cone of 3-site observables is 3 sites. N_T^tr is proportional to the number of scales that are not at
the fixed width yet. Starting with L sites, it takes roughly log2 L iterations of D to reduce the
cone width to 3 sites. To be more precise, we take N_T^tr = 2 log2(L − 2). Finally, we obtain the
entropy of ρT,

S(ρ_T) ≤ log2(d) (3 + 2 log2(L − 2)).  (3.13)
The entropy scaling in Eq. 3.13 coincides with the entropy scaling for critical ground states.
This implies that MERA is suited for approximating these ground states and capturing the
critical behaviors.
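The two entropy properties used in the derivation above (S(ρ) ≤ log2 d and the bound of Eq. 3.10 following from the triangle inequality) can be sanity-checked numerically on a random tripartite pure state; the subsystem dimensions here are arbitrary choices:

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-np.sum(lam * np.log2(lam)))

rng = np.random.default_rng(2)
dA, dB, dC = 2, 2, 4
psi = rng.normal(size=dA * dB * dC) + 1j * rng.normal(size=dA * dB * dC)
psi /= np.linalg.norm(psi)
t = psi.reshape(dA, dB, dC)

# Reduced states of AB and A by partial trace.
rho_AB = np.einsum('abc,xyc->abxy', t, t.conj()).reshape(dA * dB, dA * dB)
rho_A = np.einsum('abc,xbc->ax', t, t.conj())

assert entropy(rho_A) <= np.log2(dA) + 1e-9                      # S(rho) <= log2 dim
assert entropy(rho_A) <= entropy(rho_AB) + np.log2(dB) + 1e-9    # Eq. 3.10
```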

3.1.5 Scaling superoperators


In Section 3.1.1 we assumed a scale invariant MERA, which implies that all scales have the
same set of tensors as their building blocks. As such, the layers can be interpreted as repeatedly
applying the same function. To be precise, we can refer to the ascending superoperator S that
was introduced in the previous section now as the scaling superoperator. We assume that S is
diagonalizable; then this superoperator has the property that

S[ϕ_a] = λ_a ϕ_a,  ⟨ϕ̂_a, ϕ_b⟩ = δ_ab,  (3.14)

for eigenvalues λ_a and right eigenoperators ϕ_a [15]. The ϕ̂_a are the left eigenoperators of S and, by
duality, the right eigenoperators of D, the dual of S. We identify the ϕ_a as operators/observables
that sit inside the MERA and the ϕ̂_a as "starting states" of the MERA. The associated eigenvalues λ_a
are the same for both ϕ_a and ϕ̂_a. In general, the ϕ̂_a need not be proper quantum states. In fact,
only the first left eigenoperator is a quantum state.
S scales its eigenoperators by a factor λ_a with each application. Sometimes we will talk
about the scaling dimension ∆_a = − log2 λ_a. We can decompose any operator O in terms of
the ϕ_a,

O = Σ_a ⟨ϕ̂_a, O⟩ ϕ_a = Σ_a Tr[ϕ̂_a O] ϕ_a,  (3.15)

where ⟨· , ·⟩ is the Hilbert-Schmidt inner product. This then allows us to write the action of S
on an arbitrary operator O as

S[O] = Σ_a λ_a Tr[ϕ̂_a O] ϕ_a.  (3.16)

Due to the way we have defined MERA, it follows that S is unital. Let us again look
at Figure 3.7, but now consider the operator O to be the three-site identity operator I ⊗ I ⊗ I.
Consequently, the tensors can contract with each other to identity and we are left with three
straight wires, so S[I] = I. This implies that at least one eigenvalue equals 1; we denote it
λ_I = 1. For the other eigenvalues λ_a we will assume that they are strictly less than 1 in modulus.
There must then also be a fixed point of D that has the same eigenvalue λ_I = 1 due to
the dual nature of S and D, shown in Eq. 3.9. There exists a density matrix ρ* such that
D[ρ*] = ρ*, so repeated application of D converges an arbitrary density matrix ρ to the fixed
point ρ*,

lim_{T→∞} (D ∘ ... ∘ D)[ρ] = ρ*,  (3.17)

where D is applied T times.
The fixed point ρ∗ depends on the MERA and its tensors w, u. By choosing the right construc-
tion for the tensors we can have ρ∗ be the ground state of a (critical) Hamiltonian.
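Convergence to a unique fixed point, Eq. 3.17, can be illustrated with a generic channel built from a random Stinespring isometry (a stand-in for an actual MERA descending superoperator, not derived from any particular tensors): two different initial states flow to the same ρ*.

```python
import numpy as np

rng = np.random.default_rng(3)
d, denv = 2, 4

# A random Stinespring isometry V : C^d -> C^d ⊗ C^denv defines a channel
# D[rho] = Tr_env(V rho V†); generically it has a unique fixed point.
z = rng.normal(size=(d * denv, d)) + 1j * rng.normal(size=(d * denv, d))
V, _ = np.linalg.qr(z)  # (d*denv) x d isometry with V†V = I

def D(rho):
    out = (V @ rho @ V.conj().T).reshape(d, denv, d, denv)
    return np.einsum('aebe->ab', out)  # partial trace over the environment

rho1 = np.diag([1.0, 0.0]).astype(complex)   # two different starting states
rho2 = np.eye(d, dtype=complex) / d
for _ in range(3000):
    rho1, rho2 = D(rho1), D(rho2)

assert np.linalg.norm(rho1 - rho2) < 1e-6     # both converged to the same rho*
assert np.linalg.norm(D(rho1) - rho1) < 1e-6  # rho* is (numerically) a fixed point
assert np.isclose(np.trace(rho1).real, 1.0)   # D preserves the trace
```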


3.1.6 Evaluation of observables and correlators


The goal of MERA is to prepare an approximate ground state of a Hamiltonian and find the
associated ground state energy. So far we have laid out the tools and properties of MERA
necessary to do exactly that. Let us now discuss how to evaluate observables and two-point
correlation functions.

Observables
We will keep the assumption that the observable O is a three-site operator. From Section
3.1.5 we know that S[O] = Σ_a λ_a Tr[ϕ̂_a O] ϕ_a, so then S^k[O] = Σ_a λ_a^k Tr[ϕ̂_a O] ϕ_a. The
expectation value of an observable O after k layers is ⟨O⟩_{ρ_k} = Tr[O ρ_k] = Tr[S^k[O] ρ_0]. Combining
both expressions for S^k[O] and ⟨O⟩_{ρ_k} we get

⟨O⟩_{ρ_k} = Tr[ (Σ_a λ_a^k Tr[ϕ̂_a O] ϕ_a) ρ_0 ]
        = Σ_a λ_a^k Tr[ϕ̂_a O] Tr[ϕ_a ρ_0].  (3.18)

In the limit as k goes to infinity, Eq. 3.18 simplifies to

⟨O⟩ρ∗ = Tr[ρ∗ O], (3.19)

where ρ∗ is the fixed point state of the descending superoperator D.


There are three ways to evaluate Eq. 3.19. The first one is to prepare the state by repeatedly
applying the descending superoperator from the top layer until the bottom layer, where the
operator O sits, has been reached, to obtain the state ρ*. The second approach is to apply the ascending
superoperator to the observable O repeatedly until the top layer has been reached. Lastly, in
a general manner, one can use both ascending and descending superoperators to meet at a layer
τ, where 0 ≤ τ ≤ T. All three methods are computationally equivalent as they involve O(log n)
uses of the ascending/descending superoperator and ultimately lead to a cost of O(d^9 log n)
[25].
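Eq. 3.18 is simply the spectral decomposition of the superoperator applied k times. For a generic diagonalizable superoperator written as a d²×d² matrix acting on vectorized operators, the eigen-expansion reproduces repeated application exactly; a sketch with a random matrix standing in for S:

```python
import numpy as np

rng = np.random.default_rng(4)
d = 2

# Random "superoperator" as a d^2 x d^2 matrix acting on vectorized operators.
M = rng.normal(size=(d**2, d**2))
M /= np.max(np.abs(np.linalg.eigvals(M)))  # normalize spectral radius to 1

lam, R = np.linalg.eig(M)   # columns of R: vectorized right eigenoperators phi_a
L = np.linalg.inv(R)        # rows of L: the dual (left) eigenoperators phi_hat_a

O = rng.normal(size=d**2)   # a vectorized observable
k = 7

direct = np.linalg.matrix_power(M, k) @ O
# Analogue of Eq. 3.18: sum_a lam_a^k <phi_hat_a, O> phi_a
spectral = R @ (lam**k * (L @ O))
assert np.allclose(direct, spectral)
```

The same structure underlies the algorithm mentioned in the abstract for extracting scaling dimensions: the eigenvalues λ_a govern how expectation values decay layer by layer.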

Two-point correlation functions


Now that we know how to evaluate observables we can extend this notion to two-point
correlation functions of the form ⟨ϕ_α(x) ϕ_β(y)⟩, where the ϕ_α are the right eigenoperators of S
and x and y indicate the positions of the operators, for x ≠ y. Even though no general closed
expression exists for binary MERA, we can take an intuitive approach to this problem. Note
that ⟨ϕ_α(x) ϕ_β(y)⟩ ≠ ⟨ϕ_α(x)⟩⟨ϕ_β(y)⟩, as eventually their causal cones will connect after a certain
number of iterations of S and the observables are no longer independent.
Suppose x and y are sufficiently far apart, that is, they are not nearest neighbors; then their
causal cones are not yet connected. After a number of layers their causal cones will be connected
and at this point we can simply evaluate ⟨ϕ_α(x) ϕ_β(y)⟩ like Eq. 3.19. The question now is how
many iterations of S are necessary to achieve this.
To gain intuition for this question, let us look at the same situation but for a ternary MERA
[15]. "Ternary" implies that isometries map one site to three sites. In this case ϕ_α(x) and ϕ_β(y)
can be placed at specific sites such that |x − y| = 3^q for q = 1, 2, ... . Then, after q = log3 |x − y|
iterations of S the operators ϕ_α(x) and ϕ_β(y) are nearest neighbors and we
have

⟨ϕ_α(x) ϕ_β(y)⟩ = Tr[(ϕ_α(x) ⊗ ϕ_β(y)) ρ] / |x − y|^{∆_α + ∆_β},  (3.20)


where ∆α = − log3 (λα ) is the scaling dimension of ϕα .


Let us now return to a binary MERA. This is no longer an ideal case, but we can make an
educated guess that we will need roughly log2 |x − y| iterations of S to have ϕ_α(x) and ϕ_β(y)
be nearest neighbors. The evaluation of the two operators will then look similar, namely

⟨ϕ_α(x) ϕ_β(y)⟩ ≈ (1 / (λ_α λ_β)^c) · Tr[(ϕ_α(x) ⊗ ϕ_β(y)) ρ] / |x − y|^{∆_α + ∆_β},  (3.21)

where now ∆_α = − log2(λ_α). There might be an additional factor (λ_α λ_β)^{−c} for c = 0, 1, 2, ... due
to the exact placement of the operators and overestimation of the number of iterations of S.
The most important feature of this expression is the polynomial decay of the correlator with
respect to the scaling dimensions. This polynomial behavior of the correlations is also present
for systems at the critical point and this regime is precisely what we are interested in.
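This polynomial decay is what ultimately lets one read off scaling dimensions from correlator data: on a log-log plot the correlator is a straight line with slope −(∆_α + ∆_β). A sketch with synthetic power-law data (the exponent and prefactor are arbitrary example values, not from any model):

```python
import numpy as np

delta_sum = 0.25  # example value of Delta_alpha + Delta_beta, chosen arbitrarily
r = np.array([2.0**q for q in range(1, 9)])   # separations |x - y|
C = 0.7 * r**(-delta_sum)                     # ideal power-law correlator

# Fit a line in log-log space; the slope estimates -(Delta_alpha + Delta_beta).
slope, intercept = np.polyfit(np.log2(r), np.log2(C), 1)
assert np.isclose(-slope, delta_sum)
```

With measured correlator data the fit would of course carry statistical error, but the extraction principle is the same.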
To summarize, due to the way MERA is structured, we can generalize the tensor network to
a repeated application of superoperators, which stems from the fact that the causal cone of an
operator can have a constant width. These superoperators come with a set of eigenoperators
and eigenvalues which describe the behavior of a general operator or state. Only one of these
eigenoperators has eigenvalue λ = 1, which corresponds to the fixed point of MERA. Additionally,
we discovered that MERA has properties similar to systems at the critical point, namely
the entanglement entropy scaling and the polynomial decay of correlators. Based on these properties,
MERA looks like a better candidate for studying critical systems than MPS, which does not
share them.

3.2 DMERA
Now that we have established MERA we can move to DMERA. As hinted in the formal in-
troduction of MERA (Section 3.1.1), we can interpret the network in a quantum manner by
replacing the tensors with quantum gates and the legs with qubits. In this section, we formalize
this concept and discuss several unique features of DMERA.
DMERA was proposed by Kim and Swingle in 2017 as a method to study strongly
interacting quantum many-body states at criticality on a noisy quantum computer [1]. They
suggest that DMERA has several strong qualities – of which we will discuss a few in this section
– namely fast contraction on a quantum computer, approximation of a wide variety of physical ground states,
resilience to noise, and good approximation of the ground state energy in the presence of noise.
In this section we will formally introduce DMERA and its differences with MERA, then we
will discuss how the fixed width of the causal cone changes with larger depth D and finally we
prove that DMERA with noisy gates has an upper bound on the cumulative error.

3.2.1 Formal definition


The deep MERA (DMERA) (Figure 3.8) is a quantum circuit that transforms an initial n-qubit
state |ψ0⟩ = |0^n⟩ ∈ H0 to a final state |ψf⟩ ∈ Hf in T steps as

|ψ_{i+1}⟩ = U_i (|ψ_i⟩ ⊗ |0^{2^i}⟩),  (3.22)

where U_i is a depth-D quantum circuit (layer) and |0^{2^i}⟩ contains 2^i qubits in the |0⟩-state which
are placed alternating with the qubits from |ψ_i⟩ [1]. Alternatively, we can merge U_i with |0^{2^i}⟩,

U_i′ = U_i (I ⊗ |0⟩ ⊗ I ⊗ ... ⊗ |0⟩),  (3.23)

Figure 3.8: One DMERA layer with depth D = 4. The wires are qubits and the tensors are
two-qubit quantum gates. At each isometry (blue) a new qubit in the |0⟩-state is inserted.

such that U_i′ : H_i → H_{i+1} = H_i^{⊗2}, to get

|ψ_{i+1}⟩ = U_i′ |ψ_i⟩.  (3.24)

All layers Ui are constructed from the same set of two-qubit unitaries {w, u1 , u2 , ..., uD−1 },
where the w-unitaries are applied first to ψi (interspersed with |0⟩), then the u1 -unitaries and
so on, in a brick wall structure, see Figure 3.8.
So far, we have been more or less repeating the definition of MERA from Section 3.1.1.
Many properties and tools that apply to DMERA can be copied directly from MERA, such as
translation and scale invariance or evaluation of observables. However, there are some subtle
differences. For example, we consider DMERA to contain unitaries instead of tensors because
we exclusively use two-qubit unitary gates. Another difference is that we explicitly introduce a
qubit in the |0⟩-state at every isometry, instead of letting the isometry be general.

Depth
However, there is one major difference: DMERA is allowed to have more than two unitary
layers in each U_i. This is necessary as DMERA works with qubits, which have a fixed local
dimension d = 2. We cannot group the qubits together to increase the bond dimension as was
the case for MERA, since DMERA needs to be a valid quantum circuit to run on a quantum
computer. Nonetheless, the extra unitary layers in U_i allow us to increase the bond dimension
and accuracy of DMERA.
We will now show that any DMERA can be formulated as a MERA, and vice versa. Consider
a DMERA with n qubits and a depth D = 2. This simply corresponds to a MERA with d = 2.
Now consider a DMERA with depth D = 3. To relate this to a MERA, let us group the
unitaries {w, u1, u2} together such that we get new effective unitaries {w*, u*}, see Figure 3.9
[17]. These unitaries still follow the MERA definition from Eq. 3.1 and 3.2,

w* : C^4 → (C^4)^⊗2,  u* : (C^4)^⊗2 → (C^4)^⊗2,

so they are a valid set of tensors. This now allows us to write down a MERA with d = 4.
Similarly, we can do this for any D to obtain a MERA with d = 2^{D−1} by continuing this
"triangular" grouping. By going the opposite way, a MERA with d = 2^{D−1} can be formulated
as a DMERA with depth D.


Figure 3.9: Structure with 3 layers transformed to an ordinary MERA with increased local
dimension.

3.2.2 Causal cone


We have discussed the causal cone for MERA in Section 3.1.3 where we have seen that the
causal cone of an operator O can have a fixed width. The addition of more unitary layers
means that the fixed cone widths we derived for MERA generally do not apply to DMERA.
Instead, we will also need to take the depth D into account. Similarly to the earlier derivation,
we will look at the effect of one U† transformation on an n-qubit operator with odd n and later
we will generalize to even n. Note that the case D = 2 is trivial because it is the same as for
MERA.
For odd n, an operator O connects to ⌈n/2⌉ u_2†-unitaries, then to (⌈n/2⌉ + 1) u_1†-unitaries and
finally to (⌈n/2⌉ + 2) w†-unitaries. At the end of the U† transformation we have mapped an
n-qubit operator to an (⌈n/2⌉ + 2)-qubit operator, disregarding any unitaries that would contract
to identity like before. For D = 3, we can see that a 1-qubit operator has a causal cone width
of 3 qubits, a 3-qubit operator a cone width of 4 qubits, a 5-qubit operator a cone width of 5
qubits and a 7-qubit operator a cone width of 6 qubits. The new fixed width of a causal cone
is now 5 qubits, which differs from the previously found values for MERA.
At this point we can already generalize the result by considering the effect of the extra
layer in the D = 3 example. The extra layer adds one extra qubit to the cone width because q
entanglers always connect to q + 1 unitaries from below due to the brick wall structure. For a
given D and odd n the width of the causal cone is transformed as
n → ⌈n/2⌉ + D − 1,  (3.25)

and for even n,

n → n/2 + D − 1  (aligned)  (3.26)

n → n/2 + D  (disaligned),  (3.27)
where observables can be aligned or disaligned, c.f. Section 3.1.3.
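Iterating the odd-case map of Eq. 3.25 reveals the fixed width for a given depth D: solving n = ⌈n/2⌉ + D − 1 for odd n gives n = 2D − 1, consistent with the width of 5 found above for D = 3. A small sketch (applying the odd-case map throughout, even at intermediate even widths, for simplicity):

```python
import math

def next_width(n, D):
    """Odd-case causal cone width map for a DMERA of depth D (Eq. 3.25)."""
    return math.ceil(n / 2) + D - 1

for D in (2, 3, 4, 5):
    n = 51  # start from a wide operator
    for _ in range(20):
        n = next_width(n, D)
    assert n == 2 * D - 1  # fixed width 2D - 1 (3 for D = 2, 5 for D = 3, ...)
```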
We are particularly interested in the fixed widths since this allows us to formulate the
DMERA transformation as a quantum channel which transforms an n-qubit operator to an


n-qubit operator. Recall that

⟨ψ_{i+1}| O |ψ_{i+1}⟩ = (⟨ψ_i| ⊗ ⟨0^{2^i}|) U† O U (|ψ_i⟩ ⊗ |0^{2^i}⟩),  (3.28)

then we can identify a quantum channel Φ_i such that

(⟨ψ_i| ⊗ ⟨0^{2^i}|) U† O U (|ψ_i⟩ ⊗ |0^{2^i}⟩) = ⟨ψ_i| Φ_i[O] |ψ_i⟩,  (3.29)

where

Φ_i[O] = ⟨0^{2^i}| U† O U |0^{2^i}⟩,  (3.30)

for any operator O.


We see that Φ_i has the same function as the scaling superoperator S from Section 3.1.5. Similarly,
we will only consider cases where Φi acts on operators O whose support is equal to the fixed
width of the causal cone of Φi . In other words, Φi [O] and O must have support on the same
number of qubits. Therefore, we can drop the index i for simplicity.
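Eq. 3.30 can be made concrete in a minimal example: a single two-qubit unitary U with one fresh ancilla in |0⟩. Writing V = U(I ⊗ |0⟩), the channel is Φ[O] = V†OV, and unitality Φ[I] = I follows directly from V†V = I. A sketch with a random U (this toy setup is an illustration, not a full DMERA layer):

```python
import numpy as np

rng = np.random.default_rng(5)

# Random 2-qubit unitary via QR.
z = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
U, _ = np.linalg.qr(z)

ket0 = np.array([[1.0], [0.0]])
V = U @ np.kron(np.eye(2), ket0)   # isometry C^2 -> C^4: |psi> -> U(|psi> ⊗ |0>)

def Phi(O):
    """Phi[O] = <0| U† O U |0> on the ancilla register (Eq. 3.30, minimal case)."""
    return V.conj().T @ O @ V

assert np.allclose(V.conj().T @ V, np.eye(2))   # V is an isometry
assert np.allclose(Phi(np.eye(4)), np.eye(2))   # Phi is unital: Phi[I] = I
```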

3.2.3 Noise resilience


In this section we will investigate the noise resilience of DMERA in two ways: first we will prove
that the error of a noisy DMERA channel Φ̃ after n layers is upper bounded in terms of the
depth D and a constant ν with µ < ν < 1, where µ is the second-largest eigenvalue modulus of Φ,
that is

‖Φ̃^n − Φ^n‖_∞ ≤ O(ϵD^2) / (1 − ν),  (3.31)
and secondly we will investigate experimentally how an early error dissipates with consecutive
noiseless layers. The first proof is based on the derivation from Kim-Swingle, supported by
results from Wolf [1, 30].

Mathematical argument
We assume that the DMERA channel Φ has only one eigenvalue λ such that |λ| = 1.
Furthermore, we assume that the noisy channel Φ̃ is unital but can have more eigenvalues with
modulus 1.
In this section we will make use of the induced operator norm, which is given by

‖Φ‖_{∞→∞} = max_{‖O‖_∞ = 1} ‖Φ[O]‖_∞,  ‖O‖_∞ = sup_{‖ψ‖ = ‖ϕ‖ = 1} |⟨ψ| O |ϕ⟩|,  (3.32)

for any quantum channel Φ. For the rest of this section we will only use the induced operator
norm, so for simplicity we will write ∥Φ∥∞→∞ as ∥Φ∥∞ .
To formally analyze the noise resilience, we will first need to introduce the noisy version of
Φ, namely Φ̃. The noisy counterpart is constructed with noisy gates which have some error
given by
∥Γ − Γ̃∥∞ ≤ ϵ, (3.33)
where Γ is a two-qubit gate and ϵ is the error rate.
We would like to express the error between Φ and Φ̃. We do so by replacing a noiseless gate
with a noisy gate and observe the effect on the error. Intuitively, for an N-gate quantum circuit
we might assume that we need to apply Eq. 3.33 for each gate we replace. As we will see, this
intuitive idea is correct.


First, let us examine a simple circuit with 2 consecutive gates, namely ∥Γ2 Γ1 − Γ̃2 Γ̃1 ∥, and
express the error rate as

∥Γ2 Γ1 − Γ̃2 Γ̃1 ∥ = ∥Γ2 Γ1 − Γ̃2 Γ1 − (Γ̃2 Γ̃1 − Γ̃2 Γ1 )∥


= ∥(Γ2 − Γ̃2 )Γ1 − Γ̃2 (Γ̃1 − Γ1 )∥
≤ ∥Γ2 − Γ̃2 ∥ · ∥Γ1 ∥ + ∥Γ̃2 ∥ · ∥Γ1 − Γ̃1 ∥
≤ 2ϵ, (3.34)

where in the first line we effectively added 0, from the second to the third line we used the
triangle inequality and finally we used Eq. 3.33. Similar arguments can be used to show that
∥Γ2 ⊗ Γ1 − Γ̃2 ⊗ Γ̃1 ∥ ≤ 2ϵ. For a circuit with N consecutive gates we can write the error rate as

∥ΓN · · · Γ1 − Γ̃N · · · Γ̃1 ∥ = ∥ΓN · · · Γ1 − Γ̃N ΓN −1 · · · Γ1 − (Γ̃N · · · Γ̃1 − Γ̃N ΓN −1 · · · Γ1 )∥


= ∥(ΓN − Γ̃N )ΓN −1 · · · Γ1 − Γ̃N (Γ̃N −1 · · · Γ̃1 − ΓN −1 · · · Γ1 )∥
≤ ϵ + ∥ΓN −1 · · · Γ1 − Γ̃N −1 · · · Γ̃1 ∥
≤ ϵ + ϵ(N − 1)
= ϵN. (3.35)

Then, by induction it holds that

‖Φ − Φ̃‖_∞ ≤ O(ϵD^2),  (3.36)

as the DMERA channel Φ contains D layers and O(D) gates within each layer, so in total Φ
contains O(D^2) gates.
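The linear accumulation bound of Eq. 3.35 can be checked numerically: perturb each gate by a small unitary rotation and compare the operator-norm error of the full product against the sum of per-gate errors. A sketch with random 4×4 unitaries standing in for two-qubit gates:

```python
import numpy as np

rng = np.random.default_rng(6)
dim, N, delta = 4, 6, 1e-2

def random_unitary(dim):
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q

def perturb(G):
    """Noisy version of G: multiply by exp(i * delta * H) for a random Hermitian H."""
    h = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    h = (h + h.conj().T) / 2
    w, v = np.linalg.eigh(h)
    return (v @ np.diag(np.exp(1j * delta * w)) @ v.conj().T) @ G

gates = [random_unitary(dim) for _ in range(N)]
noisy = [perturb(G) for G in gates]
eps = [np.linalg.norm(G - Gt, 2) for G, Gt in zip(gates, noisy)]  # per-gate errors

ideal = np.eye(dim)
bad = np.eye(dim)
for G, Gt in zip(gates, noisy):
    ideal, bad = G @ ideal, Gt @ bad

# Error of the full circuit is at most the sum of per-gate errors (Eq. 3.35).
assert np.linalg.norm(ideal - bad, 2) <= sum(eps) + 1e-12
```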
As we are particularly interested in the decaying property of Φ, we are free to eliminate
the stationary part of the operator O, i.e. by removing the identity component which corresponds to
the largest eigenvalue 1. Let us define the traceless operator Ō = O − (Tr(O)/d) I. Using this
operator, we can write

Φ̃^n[Ō] − Φ^n[Ō] = Φ̃^n[O] − Φ^n[O].  (3.37)
Let us introduce Φ(1) which is defined as
X
Φ(1) = λk Pk , (3.38)
k:|λk |=1

where Pk is the projection for λk (Eq. 6.13 from [30]). Given that Φ has only one eigenvalue
with modulus 1, then there is only one Pk = cI and Φ(1) projects any operator onto the identity
operator, i.e. Φ(1) [O] = cI. Using this projector and the traceless operator Ō we can write

Φ^n[Ō] = Φ^n[O] − Φ^n_(1)[O], (3.39)

whose decay is governed by µ = max{|λ| : λ ∈ spec(Φ), |λ| < 1}. In other words, µ is the
second-largest eigenvalue modulus of Φ.
Now that we have Eq. 3.39 we can give an upper bound on ||Ψn||∞. Using Theorem 8.23
from [30] we know that for any norm

||Φ^n − Φ^n_(1)|| ≤ C µ^n n^(dµ − 1), (3.40)

where C is a constant independent of n and µ, and dµ is the size of the Jordan block corresponding
to µ. In general, we do not know the size of the Jordan block of an arbitrary eigenvalue of our


channel, and especially not the Jordan block corresponding to µ. However, for any µ < ν < 1
there is an n0 such that ν^n ≥ µ^n n^(d−1) for all n ≥ n0. Indeed,

ν^n ≥ µ^n n^(d−1)
⟺ n log ν ≥ n log µ + (d − 1) log n
⟺ n log(ν/µ) ≥ (d − 1) log n
⟺ n / log n ≥ (d − 1) / log(ν/µ), (3.41)

and since n / log n grows without bound, the last inequality holds for all sufficiently large n.

Therefore, for µ < ν < 1 we can write

||Φ^n − Φ^n_(1)|| = O(ν^n). (3.42)
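The spectral picture behind Eq. 3.42 can be illustrated with a small numerical sketch. Here a toy diagonalizable matrix with a single unit eigenvalue stands in for the DMERA channel; the names and numbers are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)
d, mu = 5, 0.6
V = rng.normal(size=(d, d)) + np.eye(d)            # generic invertible basis
lams = np.array([1.0, mu, 0.3, 0.18, 0.1])         # one unit eigenvalue, gap 1 - mu
Vinv = np.linalg.inv(V)
T = V @ np.diag(lams) @ Vinv                       # toy stand-in for the channel
P1 = V @ np.diag([1.0, 0, 0, 0, 0]) @ Vinv         # projection onto the fixed point

# ||T^n - P1|| / mu^n stays bounded, i.e. T^n approaches P1 as O(mu^n)
ratios = [np.linalg.norm(np.linalg.matrix_power(T, n) - P1, 2) / mu**n
          for n in range(1, 25)]
```

The bound on the ratio is simply the condition number of the eigenbasis, which plays the role of the constant C in Eq. 3.40 (here with trivial Jordan blocks).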

Using the telescopic decomposition, we can write Φ̃^n − Φ^n as

Φ̃^n − Φ^n = Σ_{i=1}^{n} Φ̃^(i−1) ◦ (Φ̃ − Φ) ◦ Φ^(n−i). (3.43)

For the proof and discussion of the telescopic decomposition, see Appendix A.2.
We can find the upper bound on the error after n layers as follows:

||Φ̃^n − Φ^n||∞ = ||Σ_{k=1}^{n} Φ̃^(k−1) ◦ (Φ̃ − Φ) ◦ Φ^(n−k)||∞
              ≤ Σ_{k=1}^{n} ||Φ̃^(k−1)||∞ · ||Φ̃ − Φ||∞ · ||Φ^(n−k) − Φ^(n−k)_(1)||∞
              ≤ ||Φ̃ − Φ||∞ · Σ_{k=1}^{n} ν^(n−k)
              ≤ O(ϵD^2) · Σ_{k=1}^{n} ν^(n−k)
              ≤ O(ϵD^2) / (1 − ν). (3.44)

In the first inequality we used the triangle inequality, that Φ̃^(k−1) is norm-nonincreasing,
and that (Φ̃ − Φ) ◦ Φ^(n−k)_(1) = 0 since both channels fix the identity, which allows us to
replace Φ^(n−k) by Φ^(n−k) − Φ^(n−k)_(1) as in Eq. 3.39. Next, we applied the upper bound
from Eq. 3.42 and then Eq. 3.36, and finally we notice that the sum is upper bounded by the
geometric series in ν.
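To illustrate the noise-resilience statement of Eq. 3.44, consider a minimal toy model of our own construction (not the actual DMERA channel): a single-qubit channel with a unique fixed point, iterated with and without a small depolarizing error. The distance between the noisy and noiseless evolutions saturates rather than growing with the number of iterations:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0]).astype(complex)

rho_star = np.array([[0.9, 0.2], [0.2, 0.1]], dtype=complex)  # toy fixed point
q, p = 0.5, 0.02        # contraction factor (second eigenvalue) and noise level

def phi(rho):
    # toy channel with unique fixed point rho_star
    return (1 - q) * np.trace(rho).real * rho_star + q * rho

def phi_noisy(rho):
    # the same channel followed by a weak depolarizing error
    out = phi(rho)
    return (1 - p) * out + (p / 3) * (X @ out @ X + Y @ out @ Y + Z @ out @ Z)

rho_a = np.diag([1.0, 0.0]).astype(complex)   # identical starting state for both
rho_b = rho_a.copy()
errs = []
for _ in range(30):
    rho_a, rho_b = phi(rho_a), phi_noisy(rho_b)
    errs.append(np.abs(np.linalg.eigvalsh(rho_a - rho_b)).sum())
# the error saturates at a small value instead of growing with the layer count
```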
The result from Eq. 3.44 tells us that the upper bound on the error in observables is
independent of n, the number of DMERA layers. Additionally, we expect the error to decrease
with additional DMERA layers, since we discard qubits that have accumulated errors and introduce
fresh qubits at each layer. Therefore, based on both the scaling-dimensions analysis and the
upper bound derived above, we should expect that increasing the number of layers is beneficial
to estimating the ground state energy. Another interesting consequence of this result is that if
there exists a noiseless DMERA that can approximate the ground state energy, then there exists
a noisy DMERA that approximates the ground state energy up to an error of O(ϵD^2 / (1 − ν)).


Experimental argument
Now that we have proven an upper bound on the noise, let us consider a different argument
for the noise resilience of DMERA. With each DMERA layer, or channel iteration, we introduce
pure qubits and throw away qubits that potentially contain errors. As a result, the errors
cannot grow beyond a certain size, which is exactly the result derived in Eq. 3.44. Instead of
considering many (noisy) layers, let us see what happens with only one noisy layer followed by
several noiseless layers. In Figure 3.10 we can see the effect of the error from a single noisy
channel iteration on the ground state energy after several iterations of the noiseless channel.
The difference between the measured ground state energy Ẽ and the exact ground state energy
E0 decays exponentially for all noise levels. From this we can see that noise tends to spread
out and get eliminated in further layers. The two-qubit gate infidelity noise model, described
later in Section 4.1.2, was used with various levels of noise.

[Figure: "Decay of noise after one noisy layer" — log10(Ẽ − E0) versus the number of noiseless
iterations, for noise levels σ² = 0.02, 0.04, 0.06, 0.08 and 0.1.]

Figure 3.10: Decay of noise after an initial noisy layer followed by noiseless layers. This
experiment was performed for several levels of noise indicated by the noise parameter σ 2 .

Chapter 4

Implementing DMERA

We have laid the groundwork for implementing DMERA on a quantum computer in the previous
chapter. What is left is to discuss the practical side of the implementation, for example the
specific quantum gates that will be used. To test the implementation, we verify the performance
by computing the energy of the prepared states and comparing the results to those of Sewell
and Jordan [2].
For this thesis, we have chosen the trapped ion QC developed by IonQ, which currently
contains 11 qubits [31]. Access to this QC is facilitated by the Amazon Braket service.
This particular QC has the advantage that the fidelity of all its qubits is relatively high compared
to its competitors. In addition, the QC has all-to-all connectivity, so any two qubits can interact
with each other without the use of SWAP gates.
In a recent study from 2021, Sewell and Jordan successfully implemented DMERA to prepare
the ground state of the quantum Ising chain model on the Honeywell H0 trapped ion QC [2].
That machine has six qubits, which are also fully connected. Its advantage over the IonQ
machine is that it allows for mid-circuit qubit resets: any qubit that has been used earlier in
the quantum circuit can be reset and used again in the computation, which is extremely useful
for DMERA implementations, as old qubits can be reset and inserted at a new layer. Mid-circuit
resets thus allow for an arbitrary number of layers. In this thesis we will adapt their DMERA
implementation and compare our obtained results with theirs.
In this chapter, we will discuss the practicalities of running DMERA on both a simulated
and real quantum computer, such as compiling a DMERA circuit in terms of native gates before
performing the circuit on a quantum computer. We then measure the energy of the states that
were prepared by circuits with various numbers of layers to test the performance of DMERA.

4.1 Practicalities
4.1.1 Native gates
Quantum computers have a preference to work with so-called native gates. These are the basic
operations that a quantum computer uses, depending on its type of design. Any unitary that
is given to a quantum computer will first be compiled to a series of native gates. To avoid any
unnecessary errors, we will want to present a DMERA circuit to the computer that is already
composed of native gates. Our circuit consists only of the w and u unitaries, so we will want
to decompose them in terms of native gates (for now we do not consider the observable as part
of the circuit). The two native gates we will use for this are the Mølmer–Sørensen XX gate (MS)


and the Z-axis rotation gate (Rz ) as they are used on the IonQ quantum computer [31].
The Rz gate is defined as

Rz(θ) = e^(−i(θ/2)Z) = diag(e^(−iθ/2), e^(iθ/2)), (4.1)

where Z is the Pauli Z spin operator. Rz rotates a single qubit around the z-axis. On a quantum
computer this operation is particularly useful since it does not actually have to be performed.
Instead, it can be performed virtually by updating the angles of the following gates, which is
equivalent to rotating the qubit [32]. For this reason the Rz gate adds practically no runtime
or errors.
The MS gate is a two-qubit gate that can be implemented particularly well on an ion-trap
quantum computer and is shown to allow for efficient construction of quantum circuits [33].
The XX MS gate, also known as the Ising XX coupling gate, can be defined as

XXmn(θ) = e^(−i(θ/2) Xm Xn)
        = [[cos(θ/2), 0, 0, −i sin(θ/2)],
           [0, cos(θ/2), −i sin(θ/2), 0],
           [0, −i sin(θ/2), cos(θ/2), 0],
           [−i sin(θ/2), 0, 0, cos(θ/2)]], (4.2)

where Xm is the Pauli X spin operator on the m-th qubit. Xm and Xn commute with each
other, so XXmn is invariant under interchange of qubits m and n.
With these two native gates, we can construct a native gate decomposition for the DMERA
gates w and u, as described in [2]:

w(θ) = XX(π/2) · Rz,1(θ) · Rz,2(θ − π/2) · XX(−π/2)
u(θ) = XX(π/2) · Rz,1(θ) · Rz,2(θ) · XX(−π/2), (4.3)

where Rz,n acts on the n-th qubit. Note that the angle dependence resides only in the rotation
gate which is implemented as a virtual operation.
Sewell and Jordan investigated two different native gate decompositions, namely Eq. 4.3
and another with the angle dependence in the XX gate [2]. The latter was found to be more
susceptible to noise than the former, hence Eq. 4.3 has been chosen for this thesis.

4.1.2 Noise models


Before we discuss how to simulate DMERA on a classical computer, we must first consider the
types of noise that are usually present in quantum computations. In this thesis we consider
two sources of noise: two-qubit gate infidelity and spontaneous depolarization. In this section,
we will discuss what these sources are and how they can be modeled.

Two-qubit gate infidelity noise


Two-qubit gate infidelity is the imprecise implementation of a two-qubit gate on the quan-
tum computer. Generally, this error occurs due to imprecise angles of the gate. For this reason,
Sewell and Jordan modeled gate infidelity in their experiment as imprecise rotation angles of
the XX gates given by a Gaussian distribution with standard deviation σ [2]. We use σ as a
parameter to indicate the level of noise for two-qubit gate infidelity.


The noisy version of the XX gate is denoted as XXg, and its Kraus operators, which will
be given meaning in Section 4.1.3, are given by

K0 = sqrt((1 + e^(−σ²/2))/2) · XX(θ),    K1 = sqrt((1 − e^(−σ²/2))/2) · XX(θ − π). (4.4)

The XXg gate is equivalent to applying an ideal XX gate and then, with probability
(1 − e^(−σ²/2))/2, flipping both qubits. The gate infidelity noise is therefore correlated
between the two qubits.
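The equivalence between the Kraus pair of Eq. 4.4 and a Gaussian spread of rotation angles can be checked by Monte Carlo averaging. A sketch (the input state, sample size and tolerance are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)

def XX(t):
    # Ising coupling gate of Eq. 4.2
    return np.cos(t / 2) * np.eye(4) - 1j * np.sin(t / 2) * np.kron(X, X)

theta, sigma = 0.7, 0.3
rho = np.zeros((4, 4), dtype=complex)
rho[0, 0] = 1.0                                  # example input state |00><00|

# channel defined by the Kraus operators of Eq. 4.4
p0 = (1 + np.exp(-sigma**2 / 2)) / 2
K0, K1 = np.sqrt(p0) * XX(theta), np.sqrt(1 - p0) * XX(theta - np.pi)
kraus_out = K0 @ rho @ K0.conj().T + K1 @ rho @ K1.conj().T

# Monte Carlo average over Gaussian-distributed rotation angles
deltas = rng.normal(0.0, sigma, size=50000)
mc_out = np.mean([XX(theta + d) @ rho @ XX(theta + d).conj().T for d in deltas],
                 axis=0)

complete = np.allclose(K0.conj().T @ K0 + K1.conj().T @ K1, np.eye(4))
close = np.allclose(kraus_out, mc_out, atol=2e-2)   # up to Monte Carlo sampling error
```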

Depolarizing noise
Another source of error we have to deal with is spontaneous errors due to the environment
(e.g. cosmic rays, external electromagnetic fields, internal particle interactions). When a qubit
interacts with the environment in these manners, the qubit may become depolarized with
probability p. We can model spontaneous noise by applying the depolarizing channel Fp [ρ],
given by
Fp[ρ] = (1 − p)ρ + (p/3)(XρX + Y ρY + ZρZ)
      = (1 − 4p/3)ρ + (2p/3) I, (4.5)

to each qubit before and after applying an XX gate. In this way, the noise is uncorrelated
between the qubits. Note that it is not necessary to apply the depolarizing channel before and
after a phase rotation gate as this is implemented virtually. The parameter p will be used to
indicate the level of noise for depolarizing error.
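The second line of Eq. 4.5 follows from the identity XρX + Y ρY + ZρZ = 2 Tr(ρ) I − ρ for a unit-trace ρ, which a quick numerical sketch confirms:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0]).astype(complex)

def depolarize(rho, p):
    # first line of Eq. 4.5
    return (1 - p) * rho + (p / 3) * (X @ rho @ X + Y @ rho @ Y + Z @ rho @ Z)

rng = np.random.default_rng(3)
a = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
rho = a @ a.conj().T
rho /= np.trace(rho).real            # random single-qubit density matrix

p = 0.05
same = np.allclose(depolarize(rho, p),
                   (1 - 4 * p / 3) * rho + (2 * p / 3) * np.eye(2))  # second line
```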

4.1.3 Classical simulation of DMERA


Classical simulation can be used both to predict what should ideally happen on a quantum
computer and to predict what will realistically happen in the presence of noise. Simulating
DMERA without any added noise gives us a prediction of how well the method can approach the
desired ground state in an ideal scenario. However, on a NISQ-era quantum computer we will
always have to deal with noise, so ideally we also want to simulate noisy DMERA circuits. In
the previous section (Section 4.1.2) we introduced two noise models which we can now use to
simulate noise on a classical computer.
The advantage of using classical simulations is that one has access to information that is
normally not available on a quantum computer, such as the matrix structure of the DMERA
channel or mid-circuit probing of the qubits. This information is useful for accurately
implementing DMERA in experiments. Another benefit is that the classical simulation used in
this thesis is deterministic, so the end results do not change between successive trials due to
randomness, as is the case for quantum computers. These benefits have made classical
simulations an essential part of this research.
In this section we will discuss the procedure of simulating DMERA on a classical computer
using Kraus operators and how this method can be used to introduce noise in the simulation.
Lastly, we introduce an alternative method to Kraus operators by using an efficient tensor
contraction method that can be used to evaluate noiseless DMERAs.

Kraus operators
In general, we will perform a DMERA layer by doing each unitary layer separately. Any
completely positive trace-preserving map Ω can be written in terms of its Kraus operators Xi
such that

Ω[M] = Σ_{i=1}^{r} Xi M Xi†, (4.6)

for all M ∈ L(H) [21]. For a 3-qubit state ρ, we can then compute the isometry layer as

w ⊗ w ⊗ w[ρ] = Σ_{i,j,k} (Xi ⊗ Xj ⊗ Xk) ρ (Xi ⊗ Xj ⊗ Xk)†, (4.7)

where X are the Kraus operators of w. We can generalize this statement to N qubits by
allowing N isometries.
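Eqs. 4.6 and 4.7 translate directly into a small helper for simulating a channel acting in parallel on several subsystems. A minimal sketch (the bit-flip channel used in the demo is just an illustrative stand-in, not one of the DMERA gates):

```python
import numpy as np
from itertools import product

def apply_kraus(kraus, rho):
    # Eq. 4.6: Omega[rho] = sum_i K_i rho K_i^dag
    return sum(K @ rho @ K.conj().T for K in kraus)

def apply_parallel(kraus, rho, n):
    # Eq. 4.7 generalized: the same channel acting on n subsystems in parallel
    out = np.zeros_like(rho)
    for combo in product(kraus, repeat=n):
        K = combo[0]
        for k in combo[1:]:
            K = np.kron(K, k)
        out = out + K @ rho @ K.conj().T
    return out

# demo: an illustrative single-qubit bit-flip channel on each of two qubits
p = 0.1
X = np.array([[0, 1], [1, 0]], dtype=complex)
kraus = [np.sqrt(1 - p) * np.eye(2), np.sqrt(p) * X]
rho = np.zeros((4, 4), dtype=complex)
rho[0, 0] = 1.0                                  # start in |00><00|
out = apply_parallel(kraus, rho, 2)
# the trace is preserved and the |11> population is p**2
```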
For the simplest case, namely the noiseless case with only 1 Kraus operator for w and u,
we use the matrices

w(θ) = [[cos(θ − π/4), 0, 0, sin(θ − π/4)],
        [0, 1/√2, −1/√2, 0],
        [0, 1/√2, 1/√2, 0],
        [−sin(θ − π/4), 0, 0, cos(θ − π/4)]]

u(θ) = [[cos(θ), 0, 0, sin(θ)],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [−sin(θ), 0, 0, cos(θ)]], (4.8)

given in [17]. For D = 2 there are only two unitary layers inside each DMERA layer, formed
by w(θ1) and u(θ2), where θ1 = π/12 and θ2 = −π/6. For D = 4 we instead have four unitary
layers consisting of w(θ1), u(θ2), u(θ3) and u(θ4). The angles θ1, θ2, θ3, θ4 are derived in the
paper by Evenbly and White [17].
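As a consistency check, the native gate decomposition of Eq. 4.3 can be verified numerically against the closed-form matrices of Eq. 4.8. A sketch with numpy — note that which tensor factor carries Rz,1 versus Rz,2 is a convention, and the kron ordering below is the one we found to reproduce Eq. 4.8:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)

def Rz(t):
    return np.diag([np.exp(-1j * t / 2), np.exp(1j * t / 2)])

def XX(t):
    # Ising coupling gate of Eq. 4.2
    return np.cos(t / 2) * np.eye(4) - 1j * np.sin(t / 2) * np.kron(X, X)

def w_native(theta):
    # Eq. 4.3, with R_{z,1}(theta) placed on the second tensor factor
    return XX(np.pi / 2) @ np.kron(Rz(theta - np.pi / 2), Rz(theta)) @ XX(-np.pi / 2)

def u_native(theta):
    return XX(np.pi / 2) @ np.kron(Rz(theta), Rz(theta)) @ XX(-np.pi / 2)

def w_exact(theta):
    c, s, r = np.cos(theta - np.pi / 4), np.sin(theta - np.pi / 4), 1 / np.sqrt(2)
    return np.array([[c, 0, 0, s], [0, r, -r, 0], [0, r, r, 0], [-s, 0, 0, c]])

def u_exact(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0, 0, s], [0, 1, 0, 0], [0, 0, 1, 0], [-s, 0, 0, c]])

th = np.pi / 12                      # theta_1 of the D = 2 case
ok_w = np.allclose(w_native(th), w_exact(th))
ok_u = np.allclose(u_native(th), u_exact(th))
```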
To add the two-qubit gate infidelity from Section 4.1.2, we use the Kraus operators and the
native gate decomposition given by Sewell and Jordan [2]. The XX gate has two Kraus operators,
K0 and K1, and from Eq. 4.3 each w and u contains two appearances of XX. Each w and u
therefore has 2 × 2 = 4 Kraus operators, one for each combination of Kraus choices, and Eq. 4.7
still applies.

Figure 4.1: (a) The general contraction scheme for any depth DMERA. Note that not all legs
need have the same dimensions. This scheme can be used for both left and right alignment
by contracting the appropriate leg between U and U † . (b) Example for D = 4 of construction
blocks used in the scheme.

The Kraus operator method is computationally intensive, as many matrix multiplications are
involved. Specifically for the noiseless channel, there is an alternative numerical method that
is more optimized and much faster to compute. Since we are dealing with tensors, we can use
e.g. the einsum() function from the numpy Python package [34]. This function is highly
optimized to work with tensors and can contract them efficiently.
Additionally, we can construct a general contraction scheme that works for any depth. For a
DMERA with depth D > 1 we can group the tensors in such a way that they form blocks that
fit the scheme shown in Figure 4.1.
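As an illustration of the einsum() approach, the following sketch applies a two-qubit gate to a three-qubit state once as a dense 8 × 8 matrix and once as a tensor contraction, and checks that both routes agree (a toy example, not the full contraction scheme of Figure 4.1):

```python
import numpy as np

rng = np.random.default_rng(4)
U = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))[0]
psi = rng.normal(size=8) + 1j * rng.normal(size=8)
psi /= np.linalg.norm(psi)           # random 3-qubit state vector

# dense route: (U x I) acting on the full 8-dimensional vector
dense = np.kron(U, np.eye(2)) @ psi

# tensor route: reshape to a (2,2,2) tensor and contract U into the first two legs
Ut = U.reshape(2, 2, 2, 2)
psit = psi.reshape(2, 2, 2)
contracted = np.einsum('abij,ijc->abc', Ut, psit).reshape(8)
agree = np.allclose(dense, contracted)
```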
The classical simulation code we created and used for this thesis can be found on GitHub
[35].

4.1.4 Algorithmic cooling


The quantum computer that we have used in this project is the IonQ quantum computer which
has 11 qubits (at the time of writing). For the simplest case of DMERA, namely depth 2, we
need three new qubits for each layer to insert at the isometries. As we also need three qubits
to start with, we need 3(L + 1) qubits if we want to perform L DMERA layers. With 11 qubits
we are limited to two layers, however if we had one more qubit we would be able to do one
additional layer for a total of three layers.
There exists a class of methods called algorithmic cooling that allows one to cool one qubit
by heating up other qubits [36, 37]. We can understand cooling as making the qubit more likely
to be in one basis state than in the other; heating up is the opposite, making the state more
mixed. Alternatively, we can view algorithmic cooling as a transfer of entropy, where entropy
flows from one qubit to other qubits such that the targeted qubit ends up with a lower entropy.
For algorithmic cooling to work we assume that all qubits involved in the cooling process
start with a bias towards |0⟩, i.e. P(0) = (1 + ϵ)/2 and P(1) = (1 − ϵ)/2, where ϵ is the
polarization bias. If a qubit has a starting bias towards |1⟩ then we can simply flip the spin
so that the bias is towards |0⟩. In general the bias ϵ will not be the same for all qubits, so
instead we say that qubit A has bias ϵA, qubit B has bias ϵB, and so on.
Note that a quantum computer is a closed system of qubits, so we cannot transfer heat to
the outside. This eliminates several algorithmic cooling schemes that make use of a heat
bath. A heat bath would allow us to cool a qubit to an arbitrary temperature, but a closed
system with a fixed number of qubits naturally sets a limit on how much a qubit can be cooled.
In an ideal scenario we approach this cooling limit, but in practice this turns out to be
difficult to achieve.

3B-Comp
As we have no access to a heat bath, we have to find a different method. What we can do
is swap probabilities of states around such that the targeted qubit becomes colder. A method
we can use for 3 qubits is called 3B-Comp [37]. This method performs the swap |011⟩ ↔ |100⟩
and leaves the other states unchanged. The swap |011⟩ ↔ |100⟩ is the only swap that would
increase the bias ϵ of the first qubit, as any other swap would either reduce or not change the
bias, given that all three qubits have a bias towards |0⟩. We can implement the swap with the
small circuit shown in Figure 4.2 which involves the use of a CNOT, X and CSWAP gate. The
CSWAP gate swaps two target qubits if the control bit is 1, else it does nothing.
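The effect of one 3B-Comp round can be computed directly on the classical probability distribution. For three qubits in a product distribution with equal bias ϵ, swapping the probabilities of |011⟩ and |100⟩ boosts the bias of the first qubit to (3ϵ − ϵ³)/2; the sketch below (our own check, for the product-distribution assumption) verifies this:

```python
import numpy as np
from itertools import product

eps = 0.2
p0, p1 = (1 + eps) / 2, (1 - eps) / 2
# product distribution of three qubits, each with bias eps towards |0>
probs = np.array([np.prod([(p0, p1)[b] for b in bits])
                  for bits in product([0, 1], repeat=3)])

cooled = probs.copy()
cooled[3], cooled[4] = probs[4], probs[3]   # 3B-Comp: swap P(|011>) and P(|100>)

# bias of the first qubit after cooling: P(0) - P(1)
new_bias = cooled[:4].sum() - cooled[4:].sum()
# equals (3*eps - eps**3) / 2 for this product distribution
```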
The 3B-Comp method works on exactly three qubits; however, we can use it for more than
three qubits by repeatedly applying the swap to different sets of three qubits. Suppose we
have five qubits A, B, C, D and E with equal bias and we want to cool down qubit A. We can
perform the first swap with qubits A, B and C and the second swap with qubits A, D and E.
Assuming that all qubits have a bias towards |0⟩, the extra iteration of 3B-Comp should cool
qubit A down further than a single iteration would.


Figure 4.2: The 3B-Comp circuit. This effectively swaps the states |011⟩ and |100⟩ and leaves
the other states intact. In this configuration qubit A is cooled down and thus its bias towards
|0⟩ is increased while qubits B and C are heated up.

Now that we have introduced a simple method for three qubits, let us put it into practice.
After two DMERA layers, we have used 9 of the 11 available qubits of the IonQ QC. Three
of the used qubits are kept in the computation, and three additional qubits need to be inserted
into the circuit as pure |0⟩ states. As we only have 2 qubits left over, we need to cool one
previously used qubit back to |0⟩ for an additional layer. Using 3B-Comp, we can attempt to
cool down one qubit with the qubits that are left over after the other layers.

Timing of cooling
The first choice to consider, is whether the cooling should take place after the first or the
second layer. After the first layer means that there are only three qubits available for algorithmic
cooling which limits the number of times we can iterate 3B-Comp. However, it would allow
us to insert the cooled qubit back in already at the second layer. Inserting an impure qubit
at the new layer introduces noise, however this may be balanced out by the noise resilience of
DMERA. Additionally, using only three qubits also uses fewer gates meaning there is less error
due to gate infidelity.
On the other hand, cooling after the second layer means there are more qubits available
and thus we can construct a larger circuit that can cool down one qubit further than what is
possible with only three qubits. In an ideal scenario, a larger circuit would be able to cool more
efficiently, but in a noisy environment it might work reversed; a large circuit could generate
more noise and in the end the qubit would be less cool in comparison.
To investigate this issue of cooling timing, we will test several scenarios. First, we cool one
qubit after the first layer for both the noiseless and noisy case. Then, we do the same but
with the qubit cooled after the second layer. In both noiseless and noisy cases, we only have
to use the impure qubit for a DMERA circuit with three layers. For DMERA with fewer than
three layers we have enough pure qubits available, so the difference in cooling timing will only
be apparent for three layers. We compare the results with a run of three layers that all use
pure qubits.
The results are shown in Figure 4.3. In general, the energies perfectly overlap for the first
two layers as no impure qubits are used yet. When preparing the circuit with three layers we
introduce an impure qubit halfway through the circuit depending on the scenario and measure
the energy only after the third layer. For the noiseless case, in Figure 4.3a we see that it is
energetically favorable to cool the qubit after two layers. We argued before that this would be
the case as there are more qubits available to cool with. In a noisy environment, the opposite
is true, as can be seen in Figure 4.3b. Running a large cooling circuit is prone to more noise
than a smaller cooling circuit, thus it is better to cool after one layer.
All scenarios in Figure 4.3 show a common theme. The energy of the third layer is never
lower than the energy of the second layer, regardless of the timing of the cooling algorithm.


[Figure: "Energy with cooled qubit" — energy versus number of layers (0–3) for three scenarios:
three pure layers, cooling after the 1st layer, and cooling after the 2nd layer; left panel
noiseless, right panel noisy.]

(a) Cooling with no noise. (b) Cooling with depolarizing noise p = 0.05.

Figure 4.3: The energy after three layers with a qubit that is cooled after the first or second
layer.

We expect an exponential decay of the energy with pure qubits, so for algorithmic cooling to
be of use we need to see the same behavior when using cooled qubits. If the energy does not
decrease after cooling, we are better off not doing the cooling at all, as it would interfere
with the decaying behavior of DMERA. As this simple experiment shows, it is not beneficial
to cool a qubit in order to perform an extra layer.
There are several reasons why algorithmic cooling does not give the desired results. Firstly,
we are very limited in the number of qubits available for the cooling process. This means we are
also limited in the number of times we can do 3B-Comp or exchange probabilities in any other
manner. Eventually, the qubits besides the targeted qubit are too hot and we hit a natural limit
of cooling. Additionally, we do not have access to a heat bath. This means that our circuit,
or quantum computer, is a closed system and we cannot exchange heat with the outside. If we
did have a heat bath, there would be more possibilities for cooling algorithms.
It is also possible that we did not find the optimal cooling circuit or algorithm for this
specific situation. For the cooling algorithm, we have limited ourselves to 3B-Comp, as it
applies in a general situation, which might not be the optimal choice. We could implement a
specific circuit that cools the targeted qubit down further; however, we assumed that we know
nothing about the qubits except that they have a bias towards |0⟩. This limits our approach to
an algorithmic method, which we have used without success.

4.2 Energies
One way to benchmark the implementation of DMERA is to compute the energy of the prepared
state and compare the results with the true ground state energy. When the measured energy
converges to the exact energy and follows the expected behavior, i.e. an exponential decay,
we can be confident that the DMERA works as expected. In this section we will first discuss
two general points on the Hamiltonian and left-right alignment of the circuit, then we will
describe the method to find the ground state energy for both classical simulation and quantum
computers and lastly we showcase the results of both methods.


Hamiltonian of model
The critical system of choice is the Ising quantum critical chain, which is a conformal field
theory. Conformal field theories and their applications are outside the scope of this thesis, but
the important property we will use is that such a theory is by definition invariant under angle-
preserving transformations. We have already implicitly used this fact in Section 3.1.5.
The associated Hamiltonian for this critical Ising model is given by

H = −Σr (Xr Xr+1 + Zr). (4.9)

However, we will use the transformed Hamiltonian for a fully scale-invariant (D)MERA,

H = Σr (Xr Zr+1 Xr+2 − Xr Xr+1). (4.10)

The detailed derivation and discussion of Eq. 4.10 is given by Evenbly and White [17]. Ef-
fectively, we only prepare three qubits (for depth 2) with DMERA to the ground state so the
Hamiltonian reduces to

H = XZX − (1/2)(IXX + XXI), (4.11)

where the factor 1/2 accounts for the average of the two XX terms. We will explicitly use this
expression as the three-qubit observable to measure the energy of the prepared state.
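The observable of Eq. 4.11 is easily constructed explicitly as an 8 × 8 matrix; a sketch (the starting state |000⟩⟨000| is just an example input, not a prepared ground state):

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])

def kron3(a, b, c):
    return np.kron(a, np.kron(b, c))

# three-qubit observable of Eq. 4.11
H = kron3(X, Z, X) - 0.5 * (kron3(I2, X, X) + kron3(X, X, I2))

# the energy of a prepared state rho is E = Tr[H rho]
rho = np.zeros((8, 8))
rho[0, 0] = 1.0                      # example input: the starting state |000><000|
E = np.trace(H @ rho)
```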

Left and right centering


In Section 3.1.3 we briefly discussed the issue of left- and right-centered qubits, where we
had to average over both options to get the correct superoperator, that is S = (1/2)(SL + SR).
This issue is also present for DMERA, so similarly we will need to average over both options
in both the classical simulation and the quantum computation by constructing a left and a right
channel, i.e. Φ = (1/2)(ΦL + ΦR), and process the data accordingly in further analysis.

4.2.1 Classical method


Using the methods described in Section 4.1.3 we can compute the expectation value of H to
find the energy of the prepared state. Once a state ρ has been prepared by the DMERA circuit,
the energy can be computed as
E = ⟨H⟩ρ = Tr[Hρ].
To account for the left and right centering of the channel, we compute both centerings.
After each layer we take the average over the three remaining qubits for both centerings, which
then serves as the input for the next layer. Repeating this for all layers gives us the average
over all possible configurations of the DMERA.
The convergence of the ground state energy is shown in Figure 4.4 and the converged values
in Table 4.1. After roughly five layers the starting state has almost converged to the fixed state
in all cases, and thus the energy also does not change much further. The exponentially decaying
convergence occurs because, when viewed in the Heisenberg picture, the decay is characterized
by the eigenoperators ϕa of Φ that are present in the Hamiltonian. Each eigenoperator has
an associated eigenvalue λa which is strictly less than 1 in modulus, except for the fixed
operator which has eigenvalue 1. When a layer is applied, each component gains a factor λa,
and after only a few layers the non-fixed eigenoperators vanish. This point has been discussed
in more detail in Section 3.1.5.


The noise models show convergence after several layers, despite the potential concern that
error builds up in large quantum circuits. This result indicates that DMERA is robust to noise
and supports the findings from Section 3.2.3.
We can also see that a larger depth gives a closer approximation to the true ground state
energy than a smaller depth. This result is not unexpected, as we have seen before that
increasing the bond dimension of an MPS or MERA yields a better approximation of the ground
state energy. Increasing the depth is an alternative way of increasing the bond dimension.

[Figure: "Ground state energies from simulated DMERA" — energy ⟨O⟩ versus number of layers n
(0–5) for the noiseless D = 2 and D = 4 cases, depolarizing error p = 0.0087, and two-qubit gate
infidelity σ² = 0.112, together with the true ground state energy.]

Figure 4.4: The decay of the ground state energies to the converged values. As more layers are
applied to a starting state, the energy converges to a final value. This is shown for the noiseless
case with D = 2 and D = 4 and with added depolarizing and gate infidelity noise.

      Exact      Noiseless (D = 2)   Noiseless (D = 4)   Depolarizing   Gate infidelity
E     -1.27324   -1.24223            -1.26784            -1.08375       -1.08360

Table 4.1: The converged energy values compared with the exact value E0 = −4/π. The values
have been computed after applying 20 DMERA layers to the starting state ρ = |000⟩⟨000|. The
code used to compute these values can be found in [35].

4.2.2 Quantum method and experiment


The quantum computation is limited in its capabilities compared to the classical simulation.
Indeed, the classical simulation can do non-physical things like measuring multiple observables
one after another. Consequently, the method to extract energies from the quantum computer is
more involved than it is for the classical simulation, so we will first describe this method.

Method
Each experiment run on the quantum computer consists of a number of shots. One shot
is a single preparation and measurement of a state after going through the quantum circuit.
We can see the prepared state as a probability distribution over the basis states, so each shot
may give a different outcome than the previous one. To get an accurate estimate of the
expectation value of an observable, we want to run experiments with many shots so that the
sample size is large enough.
We cannot measure the Hamiltonian from Eq. 4.11 as a single observable. Instead, we need to
separate the Hamiltonian into the three observables XZX, IXX and XXI and measure each
observable individually. The energy can then be calculated in classical post-processing by
adding up the expectation values of each observable.
It is not possible to average over the left and right centering of the three qubits after
a layer; instead, the quantum circuit needs to be constructed explicitly for either left or right
centering. As there are two choices for each layer, we need to construct 2^N different circuits
for a DMERA with N layers. In total, we need to perform 3 × 2^N different experiments and bring
the results together in post-processing to find the expectation value of the full Hamiltonian.
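The post-processing step can be sketched as follows. Assuming each qubit has been rotated into the eigenbasis of its Pauli factor before the Z-basis readout, the expectation value of a ±1-valued Pauli string follows from the parity of the measured bits on its support; the counts below are hypothetical, purely for illustration:

```python
def pauli_expectation(counts, support):
    """Estimate the expectation of a +/-1-valued Pauli string from bitstring
    counts; `support` lists the qubits on which the string acts nontrivially."""
    shots = sum(counts.values())
    total = 0
    for bits, n in counts.items():
        parity = sum(int(bits[i]) for i in support) % 2
        total += (-1) ** parity * n
    return total / shots

# hypothetical counts for the three observables of Eq. 4.11 (1000 shots each)
counts_xzx = {'000': 600, '001': 400}
counts_ixx = {'000': 800, '010': 200}
counts_xxi = {'000': 700, '100': 300}

energy = (pauli_expectation(counts_xzx, [0, 1, 2])
          - 0.5 * (pauli_expectation(counts_ixx, [1, 2])
                   + pauli_expectation(counts_xxi, [0, 1])))
```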
The error margin can also be found indirectly from the experiment. First, we obtain the
marginal probabilities of the three measured qubits from the data. We then artificially
reproduce the experiment on a classical computer by randomly picking an outcome according
to the obtained marginal probability distribution. All outcomes are stored to form a large set
of "measurements" from which we can compute the standard deviation. This standard deviation
is then the error margin for our results.

Experiment
The experiment was performed with the IonQ 11-qubit trapped ion quantum computer [31]
which can be accessed through the Amazon Braket service [38]. In total, we have run 21 tasks
with 1000 shots each: 3 tasks for 0 layers, 6 tasks for 1 layer and 12 tasks for 2 layers. The
results from the tasks have been summed up to find the energy and averaged according to the
number of possible left/right configurations.

[Figure: "Energy measurement with IonQ QC (1000 shots)" — measured energy versus number of
layers (0–2), with simulation curves for the noiseless case and for depolarizing noise
p = 0.105.]

Figure 4.5: Energy results for two layers with standard deviation σ = 0.03. Data is from IonQ
quantum computer with curves from simulations (noiseless and noisy). The code used to send
the quantum tasks to the QC and analyze the data can be found in [35].

The results of the experiment are shown in Figure 4.5. The standard deviation is comparable
for each data point and has a value of 0.03. Additionally, the energies as found by classical
simulations for the noiseless case and the depolarizing noise case (p = 0.105) are shown. It
is immediately clear that the results from the quantum computer differ significantly from the
results for the noiseless case. The energy did not decrease between one and two layers, which
does not line up with our expectation of an exponential decay. However, we may not have
performed enough shots or experiments to obtain precise measurement values.
The simulation needs a large amount of noise to fit the QC data. We chose depolarizing
noise as the noise model, as it is a commonly occurring error, and the noise parameter that
fits the data well is p = 0.105. This value is much larger than the value that Sewell and
Jordan used in their paper and that we used in the classical simulations [2]. The choice of
noise model is, however, not that relevant: regardless of the chosen model, one would need a
high amount of noise to fit the QC data.
As we are only able to perform experiments that are limited in terms of channel iterations
and qubit usage, we refer to Sewell and Jordan for a more thorough and detailed experiment
[2]. Their results show that DMERA works well on a real quantum computer and converges
after only five layers (Figure 9 in their paper). They found an approximation of the ground
state energy of E = −1.08 which is 15% higher than the true ground state energy.

Chapter 5

Extracting scaling dimensions

Throughout this thesis we have seen enough reasons to believe that DMERA is robust against
noise, from a mathematical proof in Section 3.2.3 to a convergence of the ground state energy
in Section 4.2. We can conclude that DMERA is an ansatz that can reliably prepare a ground
state when implemented on a noisy quantum computer. Now that we have computed energies,
we look to find and investigate other physical quantities. An interesting quantity is the so-called
scaling dimensions, which are briefly discussed in both Kim-Swingle and Sewell-Jordan [1, 2].
At the foundation of preparing and converging to a ground state lie the eigenvalues of
the DMERA channel Φ. The eigenvalues govern the decaying behavior to the ground state,
which corresponds to the largest eigenvalue. These eigenvalues are interesting for investigation
into DMERA, but they can additionally be linked with another physical quantity, namely the
scaling dimensions. This quantity is of interest in the study of conformal field theories, which
can describe critical models such as the quantum Ising chain that we have been using. A scaling
dimension describes how the corresponding operator transforms under a scaling transformation,
i.e. x → αx for some scale factor α [39].
Each layer of DMERA in fact implements a scaling transformation, namely with a scaling
factor α = 2 as we double the number of qubits at each layer. This hints at a presence of
scaling dimensions in DMERA which we can then find from the eigenvalues. With a simple
conversion, ∆ = − log2 λ, we can obtain the scaling dimension ∆ from the eigenvalue λ. As the
specific DMERA channel targets a critical model, we can identify the DMERA scaling
dimensions with the scaling dimensions of the critical model.
We know that DMERA is noise resilient and that shows in the energy computations as well.
We speculate that the scaling dimensions are in turn also resilient against noise. To investigate
this hypothesis, we will have to compute the eigenvalues of the DMERA channel. In a classical
simulation this is a simple task as we have direct access to the channel which is diagonalizable.
However, on a quantum computer this is not possible. Instead, we can extract the eigenvalues
from the measurement data, as the eigenvalues are directly involved in the convergence
of the expectation value of an observable. For the ground state energy we already observed
that there is a convergence to a fixed value. This behavior is not exclusive to the Hamiltonian,
but can be seen for all observables. By using a new quantum algorithm, we can extract the
eigenvalues from the measurement data and compute the scaling dimensions for DMERA.
The code used in this section for both the classical and quantum algorithm can be found in
[35].
In this section, we will first investigate the robustness of the scaling dimensions in a classical
simulation in Section 5.1 by diagonalizing the DMERA channel Φ. Then in Section 5.2 we
extract the scaling dimensions on a (simulated) quantum computer by first introducing a new
quantum algorithm and then testing the algorithm on simulated QC data to find the scaling
dimensions.

5.1 Classical algorithm


Finding the eigenvalues for the classical simulator is a matter of diagonalizing the DMERA
channel Φ. This can be done explicitly since we have direct access to Φ and can easily compute
the corresponding matrix. For the method using tensor contraction, we can simply contract
the tensors and form a new tensor which can be interpreted as a matrix, at which point the
eigenvalues are easily evaluated.
For eigenvalues in a noisy channel we turn to Kraus operators once again, cf. Section 4.1.3.
As we are modeling quantum gates, we do not have an explicit matrix representation so we will
have to prepare one in order to diagonalize the DMERA channel. Superoperators are linear
maps from operators to operators so we can construct a matrix representation by picking a
basis for the space of operators and computing each matrix element [40].
We will use that Φ maps the space of n-qubit states to itself. For this space, let us pick the
basis |i⟩⟨j| for i, j = 0, 1, . . . , 2^n − 1, where |i⟩ indicates the vector whose i-th element is 1 and
all other elements are 0. Each element of the matrix representation M of Φ is then given by

M_{(kl),(ij)} = ⟨|k⟩⟨l| , Φ[|i⟩⟨j|]⟩
             = Tr[(|k⟩⟨l|)† Φ[|i⟩⟨j|]]
             = ⟨k| Φ[|i⟩⟨j|] |l⟩ ,          (5.1)

where ⟨· , ·⟩ is the Hilbert-Schmidt inner product, for k, l = 0, 1, . . . , 2^n − 1. Note that (kl), and
likewise (ij), constitutes one index and that there is freedom in choosing how the index is formed
from k and l. For example, one can pick the row-ordering to be (0, 0), (0, 1), . . . , (0, d−1),
(1, 0), . . . , (d−1, d−1), where d = 2^n, and similarly for the column-ordering. The particular
choice for the ordering does not affect further results.
Using Eq. 5.1 for depth D = 2 we obtain the matrix representation M of Φ by computing all
its elements for each combination of i, j, k, l. Concretely, this involves (2^3)^4 = 4096 calculations
as Φ is a map from 3-qubit states to 3-qubit states. Once the matrix M is determined, we can
diagonalize M and compute its eigenvalues and eigenoperators. Currently, we are interested in
the scaling dimensions ∆_a, which can be found from the eigenvalues via ∆_a = − log2 λ_a.
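As a minimal sketch of this procedure, the following builds the matrix representation element by element for a single-qubit depolarizing channel, used here only as a stand-in for the DMERA channel (both the channel and the value of p are illustrative):

```python
import numpy as np

def channel_matrix(phi, n_qubits):
    """Matrix representation M of a superoperator phi in the basis |i><j| (Eq. 5.1)."""
    d = 2 ** n_qubits
    M = np.zeros((d * d, d * d), dtype=complex)
    for i in range(d):
        for j in range(d):
            E = np.zeros((d, d), dtype=complex)
            E[i, j] = 1.0                       # basis operator |i><j|
            out = phi(E)
            for k in range(d):
                for l in range(d):
                    # M_{(kl),(ij)} = <k| phi[|i><j|] |l>
                    M[k * d + l, i * d + j] = out[k, l]
    return M

# illustrative stand-in for the DMERA channel: single-qubit depolarizing noise
p = 0.1
depol = lambda rho: (1 - p) * rho + p * np.trace(rho) * np.eye(2) / 2

M = channel_matrix(depol, 1)
lams = np.sort(np.abs(np.linalg.eigvals(M)))[::-1]   # eigenvalue moduli, descending
deltas = -np.log2(lams)                              # scaling dimensions Delta_a = -log2 lambda_a
```

For the depolarizing channel the eigenvalue moduli come out as {1, 1−p, 1−p, 1−p}, so the largest eigenvalue 1 gives ∆ = 0, as expected for the fixed point.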

Results
The 32 smallest scaling dimensions for both the noiseless and noisy case are plotted in
Figure 5.1. Figure 5.1a contains the scaling dimensions found for two different DMERA depths
in the exact case. Figure 5.1b includes the exact and noisy scaling dimensions for two different
noise models. The latter figure also appears in Sewell-Jordan and in this thesis we managed to
reproduce the figure using the same noise models and parameters [2].
In the exact case, both depths agree on the scaling dimensions up to roughly the 20th
eigenvalue. The reason for this discrepancy could be the larger space of operators for D = 4.
The causal cone for D = 4 is extended to include 7 qubits as opposed to 3 qubits for D = 2.
The space of operators for D = 4 then has dimension 2^14 (the matrix M is of size 2^14 × 2^14),
much larger than the space for D = 2. Consequently, there are many more eigenvalues in the
space for D = 4. The distribution of eigenvalues may differ between both depths, which could
explain the deviation.
For the noise models, the noisy scaling dimensions are remarkably close to the exact scaling
dimensions. The decaying behavior is dictated by the smallest scaling dimensions, which in
particular are close across the three cases.

(a) The 32 smallest noiseless scaling dimensions as obtained from the classical simulation of
DMERA, for two different DMERA depths (D = 2 and D = 4).

(b) The 32 smallest scaling dimensions with and without noise, compared to the exact Ising
CFT values. The two noise models used in this simulation are described in Section 4.1.2, with
depolarizing noise p = 0.0087 and two-qubit gate infidelity σ = 0.112.

Figure 5.1: Scaling dimensions for different scenarios. Code used to produce the figures from
[35].

We know from the energy calculations in Section
4.2 and the bound on the error of the noisy channel in Section 3.2.3 that DMERA is stable in
the presence of noise, and as such it is not surprising that the scaling dimensions are also stable in a
noisy environment. This result gives us the confidence that DMERA can yield reliable results
when performed on a real quantum computer.


5.2 Quantum algorithm


In the previous section we obtained the eigenvalues of the DMERA channel by diagonalization.
This method is possible only if one has access to the matrix representation of the channel.
For a quantum computer, this is unfortunately not the case. The amount of information that
we can access via a quantum computer is severely limited compared to a classical simulation.
Nonetheless, it is possible to extract the eigenvalues out of the data returned by the quantum
computer. In this section, we will discuss a new algorithm that estimates the eigenvalues of a
DMERA channel Φ given the expectation values of an observable for up to k layers.

5.2.1 ESPRIT algorithm


The mechanism that enables us to obtain the eigenvalues is the "Estimation of Signal Parame-
ters via Rotational Invariance Techniques" (ESPRIT) algorithm [16]. In short, it allows one
to estimate the λ_a from a series X with elements of the form

X_n = Σ_a x_a λ_a^n                              (5.2)

for n = 0, 1, . . . , k. Conventionally, this algorithm is used in signal processing for direction-of-
arrival estimation, where one estimates real frequencies ω_a by using λ_a = e^{−2πiω_a} [41, 42, 43,
44].
Let us first define our input that we will use for the algorithm. For any 3-qubit observable
O we can obtain η ∈ R^{k+1} for up to k layers, whose elements are given by

η_n = ⟨O⟩_{ρ_n} = Tr[Φ^n[O] ρ_0].          (5.3)

The magnitude of ⟨O⟩ρn is determined by the eigenvalues of the DMERA channel Φ. As its
eigenvalues have modulus strictly less than one, except for the largest which has modulus one,
⟨O⟩ρn will be solely determined by the largest eigenvalue for large n. For this reason, the
elements of η typically show a decaying behavior, hence we will call η the observable decay
curve. If O is the Hamiltonian H then η gives the energies for up to k layers, which we
extensively discussed in Section 4.2.
Using Eq. 3.18 we can write η in a more suggestive manner so that we can justify using
ESPRIT,

η_n = Σ_a x_a λ_a^n = Σ_a x_a 2^{−n∆_a},          (5.4)

where x_a = Tr[ϕ̂_a O] Tr[ϕ_a ρ_0] is a factor independent of n. Each element is then a sum of
exponentials, similar to X, and we see that η is an appropriate input for ESPRIT. Note that
we will have complex frequencies as opposed to real frequencies. In general this will not be an
issue as the ESPRIT algorithm works regardless of the nature of its input.

Hankel matrix
The next ingredient we will need for ESPRIT is the Hankel matrix H(X). The defining
property of a Hankel matrix is that each skew diagonal contains the same entries, that is
h_{i,j} = h_{i−1,j+1}. By specifying the first column and the last row, combined with this property,
the Hankel matrix is completely determined. This approach works not only for square Hankel
matrices, but also for non-square Hankel matrices. A general H(X) ∈ C^{m×n} for X =
(x_1, x_2, . . . , x_{m+n−1}) is then defined as


 
H(X) = [ x_1    x_2      ···  x_n
         x_2    x_3      ···  x_{n+1}
         ⋮      ⋮        ⋱    ⋮
         x_m    x_{m+1}  ···  x_{m+n−1} ] .          (5.5)
As an example, for the array X = (1, 2, 3, 4, 5, 6), the Hankel matrix H(X) ∈ C^{3×4} is given
by

H(X) = [ 1 2 3 4
         2 3 4 5
         3 4 5 6 ] .          (5.6)

Specifically for ESPRIT, where X ∈ C^{k+1}, we will in general construct H(X) ∈ C^{(L+1)×(k−L+1)}
by choosing L = ⌊(k+1)/2⌋.
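The construction of Eq. 5.6 can be reproduced with scipy.linalg.hankel, which takes exactly the first column and last row described above:

```python
import numpy as np
from scipy.linalg import hankel

X = [1, 2, 3, 4, 5, 6]
# first column (x1, x2, x3) and last row (x3, x4, x5, x6) determine H(X)
H = hankel(X[:3], X[2:])
# H == [[1, 2, 3, 4],
#       [2, 3, 4, 5],
#       [3, 4, 5, 6]]
```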

ESPRIT algorithm
Now we can describe the steps of the ESPRIT algorithm. The following description is
based on [45].

Algorithm 1 ESPRIT algorithm

Input: X ∈ C^{k+1}, desired number of eigenvalues q
Output: Λ_q ∈ C^q

1. Choose L = ⌊(k+1)/2⌋

2. Compute the Hankel matrix H(X) ∈ C^{(L+1)×(k−L+1)}

3. Perform the singular value decomposition on H(X),

   H(X) = [U U⊥] Σ [V V⊥]†,

   where
   • U ∈ C^{(L+1)×q}
   • U⊥ ∈ C^{(L+1)×(L+1−q)}
   • Σ ∈ C^{q×q}
   • V ∈ C^{(k−L+1)×q}
   • V⊥ ∈ C^{(k−L+1)×(k−L+1−q)}

4. Find U_0 by taking the first L rows of U and U_1 by taking the last L rows of U

5. Compute Ψ = U_0^+ U_1, where U_0^+ is the Moore-Penrose inverse of U_0

6. Compute the eigenvalues of Ψ, which form Λ_q = {λ_a}_{a=1}^q

7. Return the eigenvalues Λ_q

If the input X is of the form X_n = Σ_{a=1}^q x_a λ_a^n, then the eigenvalues of Ψ are precisely the same
λ_a as in X. However, in the case that we have a noisy input X̃ = X + η, where η represents a
noise vector, we should still expect to recover the λ_a approximately [45].
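The steps of Algorithm 1 map almost line by line onto NumPy. The sketch below (function and variable names are our own) recovers the λ_a exactly for a noiseless sum of two exponentials:

```python
import numpy as np

def esprit(x, q):
    """Estimate q eigenvalues lambda_a from x_n = sum_a w_a lambda_a^n (Algorithm 1)."""
    k = len(x) - 1
    L = (k + 1) // 2                                 # step 1: L = floor((k+1)/2)
    # step 2: Hankel matrix of size (L+1) x (k-L+1)
    H = np.array([[x[i + j] for j in range(k - L + 1)] for i in range(L + 1)])
    U = np.linalg.svd(H)[0][:, :q]                   # step 3: signal subspace
    U0, U1 = U[:-1], U[1:]                           # step 4: first / last L rows
    Psi = np.linalg.pinv(U0) @ U1                    # step 5: Moore-Penrose inverse
    return np.linalg.eigvals(Psi)                    # steps 6-7: eigenvalues of Psi

# noiseless test signal with eigenvalues 0.9 and 0.5
n = np.arange(11)
x = 2.0 * 0.9 ** n + 1.0 * 0.5 ** n
lams = esprit(x, 2)
```

For the noiseless signal above, esprit returns eigenvalues matching 0.9 and 0.5 up to numerical precision; with a noisy input the recovered values are only approximate, as noted above.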


5.2.2 F-test for nested models


Running the ESPRIT algorithm on η in the form of Eq. 5.4 gives us back q eigenvalues of the
DMERA channel Φ, depending on the restriction q. In general, we do not know what value to
choose for restriction q so we need a statistical test to determine what value for q is optimal.
The F-test is suited for this problem. We will use this test to compare two models with q and
q + 1 eigenvalues and see which one of the two fits "best" to η.
Specifically, we consider the F-test for nested models [46]. Consider the sequences y1 =
a + bx1 and y2 = a + bx1 + cx2 , then y1 is nested inside y2 as it has fewer parameters than y2
does. If we were to fit both y1 and y2 to a set of points Y then we expect y2 to have a better
fit than y1 . In general, a sequence with more parameters would fit better to a set of points
than a sequence with fewer parameters. However, by allowing more parameters, we increase
the number of degrees of freedom which we are trying to restrict. The F-test serves to give the
balancing point when adding more parameters is ’too much’.
For two models M_1 and M_2, where M_1 is nested inside M_2, and a data set D, the F-value
(or F-statistic) is given by

F = [(SS_1 − SS_2)/(p_2 − p_1)] / [SS_2/(n − p_2)],          (5.7)

where SSi is the residual sum-of-squares of model Mi , pi the number of parameters in model
M_i and n the number of data points in D. The F-value is then compared to the value of the
F-distribution for p_2 − p_1 and n − p_2 degrees of freedom at significance level α = 0.05. We will
call this value from the F-distribution the critical value F_crit. If the calculated F-value is larger
than the critical value, then the model with more parameters is the better fit. If it is instead
smaller than the critical value, then the simpler model is preferred.
We can apply the F-test to two models with eigenvalues obtained from the ESPRIT algo-
rithm with restrictions q and q + 1. We take a model M_q(k) to be

M_q(k) = Σ_{a=1}^q w_a λ_a^k,          (5.8)

where w_a is the weight for each λ_a and k is the number of DMERA layers. Starting with M_1(k)
and M_2(k), we compute the F-value between these models and compare it to F_crit.
If the model with more parameters is the better fit, then we repeat the same process, now for
M_2(k) and M_3(k). This procedure is continued until we find an F-value which is smaller than
the critical value. Once this happens, only the q eigenvalues given by ESPRIT are statistically
likely to be eigenvalues of the DMERA channel.
The weights w_a in M_q(k) can be determined by fitting the model to η using standard
fitting routines, e.g. curve_fit() from the scipy Python package [47]. These functions find an
optimized set of weights that minimizes the sum of squares, a method known as (non-linear)
least squares. The optimized weights can then be used in the model M_q(k) for the F-test.
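As a sketch, Eq. 5.7 and the comparison to F_crit can be implemented with scipy.stats; the residual sums-of-squares in the toy call below are made up for illustration:

```python
from scipy import stats

def f_test(ss1, ss2, p1, p2, n, alpha=0.05):
    """F-test for nested models (Eq. 5.7); the last return value is True
    when the model with more parameters is the better fit."""
    F = ((ss1 - ss2) / (p2 - p1)) / (ss2 / (n - p2))
    F_crit = stats.f.ppf(1 - alpha, p2 - p1, n - p2)
    return F, F_crit, F > F_crit

# toy example: adding one parameter reduces the residual sum-of-squares drastically,
# so the F-test prefers the larger model
F, F_crit, prefer_larger = f_test(ss1=100.0, ss2=1.0, p1=2, p2=3, n=20)
```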

5.2.3 Extraction algorithm


Now that we have discussed all the necessary parts, we can assemble the extraction algorithm.
In the following section we will introduce the extraction algorithm and discuss how to use it in
order to find the eigenvalues of the DMERA channel.
Using the ESPRIT algorithm (Algorithm 1) we are able to estimate eigenvalues λ_a from X
given that it is of the form X_n = Σ_a x_a λ_a^n. In general, we do not know precisely how many


λa are present in X, so the input restriction q is not yet determined. For that reason, we
try different restrictions q and see which model Mq constructed from the q values returned by
ESPRIT has the ”best fit” to X, as described in Section 5.2.2. Algorithm 2 provides guidelines
to find a set of eigenvalues that are likely to be in X.

Algorithm 2 Eigenvalue extraction algorithm

Input: X ∈ C^{k+1}
Output: Λ_q ∈ C^q

1. Set q = 1

2. Input X into ESPRIT (Algorithm 1) with restriction q to obtain Λ_q. Repeat for restriction
   q + 1

3. Construct models M_q, M_{q+1} with the values from Λ_q and Λ_{q+1}

4. Find weights w_a^(q) and w_a^(q+1) by fitting M_q and M_{q+1} to X

5. Compute the F-value for M_q and M_{q+1}

   • If F-value ≥ F_crit, then go back to step 2 and increase q by 1
   • If F-value < F_crit, then return Λ_q as computed in step 2

If we use η as an input for Algorithm 2 then we are able to find q likely eigenvalues from
the DMERA channel. The only information required from the (simulated) quantum computer
is measurements for a given observable for up to k layers. The algorithm and computation of
the eigenvalues is done classically and this ensures that the algorithm is as reliable as possible.
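Putting the pieces together, Algorithm 2 can be sketched as follows. The function names, the q_max safety cutoff, and the use of a plain linear least-squares fit for the weights (rather than curve_fit) are our own choices:

```python
import numpy as np
from scipy import stats

def esprit(x, q):
    """Algorithm 1: estimate q eigenvalues from a sum-of-exponentials series."""
    k = len(x) - 1
    L = (k + 1) // 2
    H = np.array([[x[i + j] for j in range(k - L + 1)] for i in range(L + 1)])
    U = np.linalg.svd(H)[0][:, :q]
    return np.linalg.eigvals(np.linalg.pinv(U[:-1]) @ U[1:])

def fit_weights(lams, eta):
    """Linear least squares for the weights w_a in M_q(k) = sum_a w_a lam_a^k."""
    V = np.array([[lam ** m for lam in lams] for m in range(len(eta))])
    w, *_ = np.linalg.lstsq(V, eta.astype(complex), rcond=None)
    ss = np.sum(np.abs(eta - V @ w) ** 2)     # residual sum-of-squares
    return w, ss

def extract_eigenvalues(eta, alpha=0.05, q_max=5):
    """Algorithm 2: increase q until the F-test prefers the simpler model."""
    n = len(eta)
    q, lam_q = 1, esprit(eta, 1)
    ss_q = fit_weights(lam_q, eta)[1]
    while q < q_max:
        lam_next = esprit(eta, q + 1)
        ss_next = fit_weights(lam_next, eta)[1]
        # F-value of Eq. 5.7 with p2 - p1 = 1 (one extra weight)
        F = (ss_q - ss_next) / (ss_next / (n - (q + 1)))
        if F < stats.f.ppf(1 - alpha, 1, n - (q + 1)):
            break                              # the simpler model wins
        q, lam_q, ss_q = q + 1, lam_next, ss_next
    return lam_q

# toy decay curve with eigenvalues 0.9 and 0.5 plus a little noise
rng = np.random.default_rng(0)
m = np.arange(21)
eta = 2.0 * 0.9 ** m + 1.0 * 0.5 ** m + 1e-3 * rng.standard_normal(21)
lams = extract_eigenvalues(eta)
```

On this toy curve the two dominant eigenvalues are recovered close to 0.9 and 0.5, regardless of the exact q at which the loop stops.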

Limitations
A single pair of a starting state and an observable cannot yield all eigenvalues of
the channel. The main reason is that a starting state ρ_0 and an observable O cannot have
overlap with all left and right eigenoperators of the channel. Recall from Section 3.1.5 that
O has overlap with the right eigenoperator ϕ_a of Φ if ⟨ϕ_a, O⟩ ≠ 0. Similarly, we say that
ρ_0 has overlap with the left eigenoperator ϕ̂_a of Φ if ⟨ϕ̂_a, ρ_0⟩ ≠ 0. If either ⟨ϕ_a, O⟩ or ⟨ϕ̂_a, ρ_0⟩
vanishes for some a, then there will be no contribution from λ_a in η and thus it cannot be found
by the extraction algorithm.
Another problem inherent to DMERA is that there is a bias in η towards larger eigenvalues.
Large eigenvalues decay much slower than small eigenvalues and are the dominant part in the
signal after many layers. Smaller eigenvalues, however, decay quickly and due to the discrete
nature of layers they will be negligible after only a few layers. As we have access only to the
data from the observable decay, it is difficult to distinguish between the contribution of small
eigenvalues and noise. Both the ESPRIT and the extraction algorithm will then likely classify
these smaller eigenvalues as noise and filter these out.
The first problem can be solved by collecting data from many pairs of different starting
states and observables. Different pairs will have overlap with different eigenoperators and as a
result we can retrieve more eigenvalues of the channel. We will discuss the data collection in
Section 5.2.4 and the results in Section 5.2.5. Another benefit of collecting more data is that


we will find more of the same eigenvalues, so the certainty that a found eigenvalue is correct
increases.

5.2.4 Data collection


As discussed in Section 5.2.3, a single pair of starting state and observable cannot find all
eigenvalues of the DMERA channel Φ. To find as many eigenvalues as possible, we need to
consider many pairs of starting states and observables. In addition, the starting states and
observables must be simple so they can easily be prepared on a quantum computer. For
these reasons, we consider the stabilizer states as the starting states and Pauli gates as the
observables. In this section, we will explore the reasoning for these choices and discuss how the
data is collected and analyzed.

Stabilizer states and observables


Our goal is to choose a set of starting states which forms a basis for 3-qubit states so that
all left eigenoperators of Φ are contained in this set. Trivially, this can be done by taking the
set of left eigenoperators and transforming them to states, i.e. ρ = I/d + ϕ̂_a for all but the fixed
point state. This set would yield every eigenvalue when analyzed with the eigenvalue extraction
algorithm. However, we assume that we do not know the eigenoperators, and in addition they
would be difficult to prepare on a quantum computer. Instead, we need to find a different
set of states which are easy to prepare and which form a basis of quantum states.
Let us consider stabilizer states. We say that |ψ⟩ is stabilized by a unitary U if U |ψ⟩ =
+1 |ψ⟩. For example, |0⟩ is stabilized by Z and |1⟩ is stabilized by −Z. Stabilizer states can
be generated by a circuit of stabilizer gates (CNOT, Hadamard H and phase gate P ) starting
from |0 . . . 0⟩⟨0 . . . 0| [48]. Normally, the states are used in quantum error-correction, but they
also have uses outside this area.
Starting from |0⟩⟨0|, we can create the single qubit stabilizer states

|0⟩⟨0| = [1 0; 0 0] ,   |+⟩⟨+| = ½ [1 1; 1 1] ,    |i⟩⟨i| = ½ [1 −i; i 1] ,
|1⟩⟨1| = [0 0; 0 1] ,   |−⟩⟨−| = ½ [1 −1; −1 1] ,  |−i⟩⟨−i| = ½ [1 i; −i 1] .
One can check that the given states are stabilized by either ±X, ±Y or ±Z, and trivially by I.
One of our requirements is that this set of single qubit stabilizer states forms a basis for
3-qubit density matrices. First, let us see if this set forms a basis for 1-qubit density matrices.
In fact, these six states form an overcomplete basis as two states can be written as a linear
combination of other states in this set, that is

|−i⟩⟨−i| = |0⟩⟨0| + |1⟩⟨1| − |i⟩⟨i|

and
|−⟩⟨−| = |0⟩⟨0| + |1⟩⟨1| − |+⟩⟨+| ,
so we can eliminate |−i⟩⟨−i| and |−⟩⟨−| from the set. We can alternatively write the remaining
states as

|0⟩⟨0| = ½ (I + Z) ,   |+⟩⟨+| = ½ (I + X) ,
|1⟩⟨1| = ½ (I − Z) ,   |i⟩⟨i| = ½ (I + Y ) .

48
5.2. Quantum algorithm

Then for a general state ρ we have

ρ = α |+⟩⟨+| + β |i⟩⟨i| + γ |0⟩⟨0| + δ |1⟩⟨1|
  = (α/2)(I + X) + (β/2)(I + Y) + (γ/2)(I + Z) + (δ/2)(I − Z)
  = ½ [(α + β + γ + δ)I + αX + βY + (γ − δ)Z] ,          (5.9)

where we recognize this expression as the Bloch sphere picture if we require α + β + γ + δ = 1.
As the Pauli matrices (together with I) form a basis for 1-qubit operators, the set
{|0⟩⟨0| , |1⟩⟨1| , |+⟩⟨+| , |i⟩⟨i|} does as well. By extension, we can then construct a basis for
3 qubits, i.e. {|0⟩⟨0| , |1⟩⟨1| , |+⟩⟨+| , |i⟩⟨i|}^⊗3.
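This can be verified numerically as a quick sanity check (not part of the thesis pipeline): the 64 product states below have full rank, hence they span the space of 3-qubit operators.

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)
keti = (ket0 + 1j * ket1) / np.sqrt(2)
single = [np.outer(v, v.conj()) for v in (ket0, ket1, plus, keti)]

# all 4^3 = 64 three-fold tensor products, flattened to vectors of length 64
basis = np.array([np.kron(np.kron(a, b), c).flatten()
                  for a in single for b in single for c in single])
rank = np.linalg.matrix_rank(basis)   # rank 64 means the set spans all 3-qubit operators
```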
Now that we have shown that this set of stabilizer states forms a basis for 3-qubit states, we
also note that stabilizer states are particularly easy to prepare on a quantum computer. They
are generated by a circuit consisting of stabilizer gates, which themselves are basic operations
that a quantum computer can easily perform. In particular, the phase gate P is not applied
physically but rather performed virtually, thus any usage of P cannot introduce errors. Starting
from |0⟩ we can construct the stabilizer states as

|0⟩⟨0| = I[|0⟩⟨0|]
|1⟩⟨1| = X[|0⟩⟨0|]
|+⟩⟨+| = H[|0⟩⟨0|]
|i⟩⟨i| = P H[|0⟩⟨0|].

For the observables we will pick the set consisting of Pauli gates {X, Y, Z}. These observables
are the natural choice as quantum computers already use these measurement gates and any
arbitrary observable is converted in terms of the Pauli gates by the quantum computer. For
example, the Hamiltonian that we used in Section 4.2 can be decomposed in terms of Pauli
gates as H = XZX − ½(IXX + XXI).

Data collection
From the basis states set we can construct 4^3 = 64 starting states, and from the Pauli
gates set 3^3 = 27 observables; in total we have 1728 pairs for which we can compute the
observable decay curve η. However, most pairs have no overlap at all and return an empty η,
i.e. η_n = 0 for all n. This occurs in roughly 50 to 75% of the pairs, which reduces the number
of distinct η to between roughly 400 and 900.
In general, we want to collect data points until the curve η has converged. The convergence
rate differs from one state-observable pair to another, so we need a method based on
relative changes. We say that the curve has converged when the mean of the relative changes of
the last three data points is less than 0.02. As the curve may sometimes decay too slowly for
this to occur, we also include a forced stop at 30 layers. The value of 0.02 was obtained through
trial and error, by looking at the decay curve of a state-observable pair at different cut-off values,
determining when it looks converged, and cross-checking with other state-observable pairs. The
ESPRIT algorithm performs best on decay curves that are just converged.
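The stopping rule can be sketched as follows; the function name is ours, while the thresholds 0.02 and 30 are the values quoted above:

```python
def converged(eta, tol=0.02, max_layers=30):
    """True once the mean relative change over the last three steps is below tol,
    or after a forced stop at max_layers data points."""
    if len(eta) >= max_layers:
        return True
    if len(eta) < 4:
        return False
    rel = [abs((eta[-i] - eta[-i - 1]) / eta[-i - 1]) for i in (1, 2, 3)]
    return sum(rel) / 3 < tol
```

A decay curve is then grown layer by layer until converged(eta) first returns True.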

Combining decay curves


Algorithm 2 is best used with a combined sample of multiple η, instead of individual η.
Combining multiple decay curves decreases the susceptibility to errors and deviations, and
emphasizes the decay from the eigenvalues. Consider two decay curves η^(1), η^(2) ∈ R^{k+1} which
contain the same eigenvalues λ_a. Then in general an element of η^(1), and similarly of η^(2),
will be of the form

η^(1)_m = Σ_a x^(1)_a λ_a^m + ϵ^(1)(m),

where ϵ^(1)(m) is a noise term. Adding both curves η^(1), η^(2) gives

η^(1)_m + η^(2)_m = Σ_a (x^(1)_a + x^(2)_a) λ_a^m + (ϵ^(1)(m) + ϵ^(2)(m)).

Assuming the curves are similar enough, that is, x^(1)_a and x^(2)_a have the same sign and similar
magnitude, and that the noise is uncorrelated, the linear sum will give us a better estimation
of λ_a than the individual curves would. Extending this method to summing more than
two curves improves the results even further.
For this procedure, we assumed that both η (1) and η (2) share the same eigenvalues. In
general, it is unlikely that this assumption holds for any two curves, and if there are more than
two curves involved then it becomes increasingly unlikely. However, the linear combination
method still works if we assume that the curves share some eigenvalues, especially those in
which we are interested. If a set of curves share the same larger eigenvalues, then that is a
good candidate set to use to find those eigenvalues. In practice, we will combine the curves
from the state-observable pairs which share the same observable. From observations by using
Algorithm 2 on all decay curves, we found that pairs which share an observable have a similar
decay curve and yield similar eigenvalues.
The decay curves η^(1) and η^(2) need not have the same length, i.e. k_1 ≠ k_2, as they may
have converged at different lengths. In this case we cannot directly sum both curves, so we
extend the shorter curve until their lengths are equal. For example, if η^(1) is shorter
than η^(2), so k_1 < k_2, then we append k_2 − k_1 elements to η^(1) whose values are equal to the
value of the previously last element of η^(1). We call this procedure padding.
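A sketch of the padding-and-summing step (the function name is ours):

```python
import numpy as np

def pad_and_sum(curves):
    """Pad each decay curve with its own last value up to the longest length, then sum."""
    k = max(len(c) for c in curves)
    padded = [np.concatenate([c, np.full(k - len(c), c[-1])]) for c in curves]
    return np.sum(padded, axis=0)

combined = pad_and_sum([np.array([1.0, 0.5]),
                        np.array([2.0, 1.0, 0.5, 0.25])])
# combined == [3.0, 1.5, 1.0, 0.75]
```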

5.2.5 Results
The scaling dimensions found using Algorithm 2 are shown in Table 5.1, compared with values
from the exact Ising CFT and matrix diagonalization from the classical simulation. The values
found with the algorithm are expected to correspond more closely with the values from the
diagonalization method, as both methods aim to find the scaling dimensions of the same object,
i.e. the DMERA channel. We can see in the results that this is indeed the case.
        Ising CFT   Diagonalization (D = 2)   Algorithm (error 1σ)
∆_0     0           0                         0.013 ± 0.009
∆_1     0.125       0.140                     0.140 ± 0.001
∆_2     1           1                         1.033 ± 0.011
∆_3     1.125       1.136                     1.135 ± 0.086
∆_4     2           2                         1.942 ± 0.047

Table 5.1: Scaling dimensions obtained with Algorithm 2. The error size is one standard
deviation. The code used to generate the data and perform Algorithm 2 can be found in [35].

Although some eigenvalues are abundant in the data, such as λ_1 (with corresponding ∆_1 =
0.140 ± 0.001), other eigenvalues do not occur often. The eigenvalue λ_2 (with corresponding
∆_2 = 1.033 ± 0.011) was encountered in the data of only a small number of state-observable
pairs, while the eigenvalues λ_0 (with corresponding ∆_0 = 0.013 ± 0.009) and λ_1 were present
in the data of almost all state-observable pairs. Consequently, the ∆_2 from the algorithm is 3σ
removed from the diagonalization (and Ising CFT) value, while both ∆_0 and ∆_1 are within
1.5σ of the diagonalization value. This rare occurrence of some eigenvalues makes it difficult to
accurately determine whether a found eigenvalue is real or only noise.
Padding curves allows us to sum curves that do not have the same length, i.e. not the same
convergence rate. However, it can happen that the length of the linear combination becomes
too great, which makes it difficult for the ESPRIT algorithm to find good approximations of
the eigenvalues that are in the data. Normally, we do not want the decay curve η to be
unnecessarily long, which is why we only collect data points until η has converged. However,
when we pad an η so it can be used in the linear combination, we extend the η past its
convergence, which only worsens the performance of ESPRIT. To make sure we get the best
performance from ESPRIT, we remove the last several elements from the linear combination
before using it as input for the extraction algorithm.

5.2.6 Discussion and outlook


To conclude, based on the results shown in Table 5.1, the extraction algorithm seems to perform
well, as we were able to recover most of the larger DMERA scaling dimensions within one or
two σ. The results show a bias towards larger eigenvalues, as small eigenvalues decay after only
a few channel iterations, which makes them difficult to recover with the extraction algorithm.
In this thesis this bias was not an issue, as we were not focused on recovering specific
eigenvalues; however, the algorithm might not be suited for research that is interested in many
smaller eigenvalues.

Outlook
There are several directions for future work. Firstly, the performance is directly related
to the input data, so with more data (decay curves) the algorithm might be able to extract
eigenvalues/scaling dimensions more accurately. One method that could be deployed is shadow
estimation [49]. By using many random unitaries we would be able to accurately determine
the expectation value of an observable O, which in turn results in accurate decay curves.
Secondly, in this thesis we have used a DMERA circuit that is derived analytically and is
thus both scale and translation invariant [17]. However, this approach might not be optimal for
approaching the ground state energy. By starting from the analytic circuit and using a hybrid
quantum-classical method of optimization, described in Kim-Swingle [1], we can minimize the
energy further.
Another potential next step for DMERA would be to explore different critical models. The
quantum Ising chain is a relatively simple model, such that DMERA does not yield a large benefit
over a classical method. A one-dimensional model proposed by Kim and Swingle is the
so-called D1-D5 system, which arises in string theory [50]. Furthermore, DMERA can
currently only investigate 1D systems; an extension to higher dimensions would be an
interesting point of research. There are many 2D systems that are non-trivial to solve, such as
the Hubbard model [51].
Finally, a recent study proposed a generalized MERA (gMERA) network that is also suited
for a quantum computer [52]. Although gMERA is a different approach to preparing MERA
on a quantum computer, its results are consistent with results obtained from DMERA. A
comparison between DMERA and gMERA would be useful to determine which method has a
better performance and higher accuracy for a given situation.

Appendix A

Appendices

A.1 Tensor contraction


Tensors are a generalization of scalars, vectors, matrices and higher multidimensional arrays,
or simply speaking a mathematical object with indices. A tensor T is characterized by the way
its components transform under a change of basis, for example
T^ρ_σ = (∂x^µ/∂x^σ)(∂x^ρ/∂x^ν) T^ν_µ .          (A.1)
We can identify a tensor in terms of its rank, or type, which describes how many lower and
upper indices a tensor has. For example, the tensor T in Eq. A.1 is a (1,1)-rank tensor. A
general (m, n)-rank tensor has m upper indices and n lower indices.
A new tensor can be formed by combining tensors and summing over the connected indices,
which we call tensor contraction. Using the Einstein summation convention, an example of
tensor contraction is
$$ v_l = A_{kl} \, w_k, \qquad \text{(A.2)} $$
where we have summed over the index $k$ and contracted $A_{kl}$ and $w_k$ to obtain $v_l$.
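As an illustration (a small NumPy sketch, not part of the thesis code; the array names and sizes are chosen for this example only), the contraction of Eq. A.2 can be computed with `np.einsum`, whose subscript string mirrors the index notation:

```python
import numpy as np

# Contract A_{kl} with w_k over the shared index k to obtain v_l (Eq. A.2).
# The subscript string "kl,k->l" mirrors the Einstein summation convention.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))  # two-index tensor A_{kl}
w = rng.normal(size=3)       # vector w_k

v = np.einsum("kl,k->l", A, w)

# The same contraction written as an explicit sum over the index k:
v_explicit = np.array([sum(A[k, l] * w[k] for k in range(3)) for l in range(3)])
assert np.allclose(v, v_explicit)
```

Larger tensor networks are contracted in the same way, with one subscript letter per index and repeated letters summed over.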
Alternatively, there is a graphical notation for tensors, sometimes called the Penrose graph-
ical notation [53, 54]. A tensor is represented as a shape with a number of legs which is equal to
the number of indices, see Figure A.1a. Tensor contraction can then be visualized as connecting
the legs of the tensors that are to be contracted. Eq. A.2 and Figure A.1b express the same
contraction, differing only in notation.

Figure A.1: Tensors. (a) Different tensors. (b) Tensor contraction.


A.2 Telescopic decomposition


In this appendix we will prove that we can express $A^m - B^m$ as
$$ A^m - B^m = \sum_{i=1}^{m} A^{i-1} (A_i - B_i) \, B^{m-i}, \qquad \text{(A.3)} $$
which we call the telescopic decomposition. Here $A^{i-1} = A_1 \cdots A_{i-1}$ denotes the product
of the first $i-1$ factors and $B^{m-i} = B_{i+1} \cdots B_m$ the product of the last $m-i$ factors,
with empty products equal to the identity $I$.


Suppose $A^m$ is a product of $A_i$,
$$ A^m = \prod_{i=1}^{m} A_i = A_1 \cdots A_m, \qquad \text{(A.4)} $$
where the $A_i$ come with an associative product, i.e. $(A_1 A_2) A_3 = A_1 (A_2 A_3)$, and similarly for $B^m$.
We will prove Eq. A.3 by induction. First, for the base case $m = 1$,
$$ A^1 - B^1 = I (A_1 - B_1) I = A_1 - B_1, $$
and then for the induction step,
$$ \begin{aligned}
A^{m+1} - B^{m+1} &= A^m A_{m+1} - A^m B_{m+1} + A^m B_{m+1} - B^m B_{m+1} \\
&= A^m (A_{m+1} - B_{m+1}) + (A^m - B^m) B_{m+1} \\
&= A^m (A_{m+1} - B_{m+1}) + \left( \sum_{i=1}^{m} A^{i-1} (A_i - B_i) B^{m-i} \right) B_{m+1} \\
&= \sum_{i=1}^{m+1} A^{i-1} (A_i - B_i) B^{m+1-i},
\end{aligned} $$
which concludes our proof by induction.
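The identity can also be checked numerically. The following NumPy sketch (not part of the thesis code; matrix sizes and names are chosen for this example only) verifies Eq. A.3 for products of random matrices, with Python's 0-based indices shifted relative to the 1-based indices above:

```python
import numpy as np

# Verify the telescopic decomposition A^m - B^m = sum_i A^{i-1}(A_i - B_i)B^{m-i}
# for products of random matrices, where A^{i-1} = A_1...A_{i-1} and
# B^{m-i} = B_{i+1}...B_m, and empty products are the identity.
rng = np.random.default_rng(1)
m, d = 4, 3
As = [rng.normal(size=(d, d)) for _ in range(m)]
Bs = [rng.normal(size=(d, d)) for _ in range(m)]

def prod(mats):
    """Ordered matrix product; the empty product is the identity."""
    out = np.eye(d)
    for M in mats:
        out = out @ M
    return out

lhs = prod(As) - prod(Bs)
rhs = sum(prod(As[:i]) @ (As[i] - Bs[i]) @ prod(Bs[i + 1:]) for i in range(m))
assert np.allclose(lhs, rhs)
```

Note that only associativity of the product is used, so the same check applies when the factors are quantum channels rather than matrices.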

Bibliography

[1] Isaac H. Kim and Brian Swingle. Robust entanglement renormalization on a noisy quan-
tum computer. 2017. arXiv: 1711.07500.
[2] Troy J. Sewell and Stephen P. Jordan. Preparing Renormalization Group Fixed Points on
NISQ Hardware. 2021. arXiv: 2109.09787.
[3] Amazon. Quantum Computing Service. url: https://aws.amazon.com/braket.
[4] IBM. IBM Quantum Computing. url: https://www.ibm.com/quantum.
[5] Google. Google Quantum AI. url: https://quantumai.google.
[6] Matthias Evers, Anna Heid, and Ivan Ostojic. Pharma’s digital Rx: Quantum computing in drug research and development. 2021. url: https://www.mckinsey.com/industries/life-sciences/our-insights/pharmas-digital-rx-quantum-computing-in-drug-research-and-development.
[7] Dylan Herman et al. A Survey of Quantum Computing for Finance. 2022.
[8] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. A Quantum Approximate Opti-
mization Algorithm. 2014.
[9] John Preskill. “Quantum Computing in the NISQ era and beyond”. In: Quantum 2 (Aug.
2018), p. 79.
[10] Xiongjie Yu, David Pekker, and Bryan K. Clark. “Finding Matrix Product State Rep-
resentations of Highly Excited Eigenstates of Many-Body Localized Hamiltonians”. In:
Phys. Rev. Lett. 118 (1 Jan. 2017), p. 017201.
[11] Michele Dolfi et al. “Matrix product state applications for the ALPS project”. In: Com-
puter Physics Communications 185.12 (Dec. 2014), pp. 3430–3440.
[12] Philippe Corboz et al. “Simulation of strongly correlated fermions in two spatial dimen-
sions with fermionic projected entangled-pair states”. In: Physical Review B 81.16 (Apr.
2010).
[13] G. Vidal. “Entanglement Renormalization”. In: Physical Review Letters 99.22 (Nov.
2007).
[14] G. Vidal. “Class of Quantum Many-Body States That Can Be Efficiently Simulated”. In:
Physical Review Letters 101.11 (Sept. 2008). issn: 1079-7114.
[15] Robert N. C. Pfeifer, Glen Evenbly, and Guifré Vidal. “Entanglement renormalization,
scale invariance, and quantum criticality”. In: Physical Review A 79.4 (Apr. 2009). issn:
1094-1622.


[16] Richard H. Roy, Arogyaswami Paulraj, and Thomas Kailath. “Estimation of Signal Pa-
rameters via Rotational Invariance Techniques - ESPRIT”. In: MILCOM 1986 - IEEE
Military Communications Conference: Communications-Computers: Teamed for the 90’s
3 (1986), pp. 41.6.1–41.6.5.
[17] Glen Evenbly and Steven R. White. “Entanglement Renormalization and Wavelets”. In:
Physical Review Letters 116.14 (Apr. 2016). issn: 1079-7114.
[18] P.W. Shor. “Algorithms for quantum computation: discrete logarithms and factoring”.
In: Proceedings 35th Annual Symposium on Foundations of Computer Science (1994),
pp. 124–134.
[19] Lov K. Grover. “A Fast Quantum Mechanical Algorithm for Database Search”. In: Pro-
ceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing. STOC
’96. Philadelphia, Pennsylvania, USA: Association for Computing Machinery, 1996, pp. 212–
219. isbn: 0897917855.
[20] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Informa-
tion. 10th. Cambridge University Press, 2000.
[21] John Watrous. The Theory of Quantum Information. Cambridge University Press, 2018.
[22] J. Eisert, M. Cramer, and M. B. Plenio. “Area laws for the entanglement entropy”. In:
Reviews of Modern Physics 82.1 (Feb. 2010), pp. 277–306.
[23] J. Ignacio Cirac et al. “Matrix product states and projected entangled pair states: Con-
cepts, symmetries, theorems”. In: Rev. Mod. Phys. 93 (4 Dec. 2021), p. 045003.
[24] Ulrich Schollwöck. “The density-matrix renormalization group in the age of matrix prod-
uct states”. In: Annals of physics 326.1 (2011), pp. 96–192.
[25] G. Evenbly and G. Vidal. “Algorithms for entanglement renormalization”. In: Physical
Review B 79.14 (Apr. 2009). issn: 1550-235X.
[26] G. Evenbly and G. Vidal. Quantum Criticality with the Multi-scale Entanglement Renor-
malization Ansatz. Springer, Berlin, Heidelberg, 2013.
[27] G. Evenbly and G. Vidal. “Class of Highly Entangled Many-Body States that can be
Efficiently Simulated”. In: Physical Review Letters 112.24 (June 2014). issn: 1079-7114.
[28] Ya-Lin Lo et al. “Quantum impurity in a Luttinger liquid: Universal conductance with
entanglement renormalization”. In: Physical Review B 90 (Dec. 2014). doi: 10.1103/
PhysRevB.90.235124.
[29] G. Evenbly and G. Vidal. “Scaling of entanglement entropy in the (branching) multiscale
entanglement renormalization ansatz”. In: Physical Review B 89.23 (June 2014).
[30] Michael M. Wolf. Quantum Channels & Operations: Guided Tour. 2012. url: https://www-m5.ma.tum.de/foswiki/pub/M5/Allgemeines/MichaelWolf/QChannelLecture.pdf.
[31] K. Wright et al. “Benchmarking an 11-qubit quantum computer”. In: Nature Communi-
cations 10.1 (Nov. 2019).
[32] David C. McKay et al. “Efficient Z-gates for quantum computing”. In: Physical Review
A 96.2 (Aug. 2017).
[33] Dmitri Maslov and Yunseong Nam. “Use of global interactions in efficient quantum circuit
constructions”. In: New Journal of Physics 20.3 (Mar. 2018), p. 033018. issn: 1367-2630.
[34] Charles R. Harris et al. “Array programming with NumPy”. In: Nature 585.7825 (Sept.
2020), pp. 357–362.


[35] Bjarne Bouwer. Python scripts for project. url: https://github.com/amsqi/msc-bjarne.
[36] Yuval Elias, Tal Mor, and Yossi Weinstein. “Semioptimal practicable algorithmic cooling”.
In: Physical Review A 83.4 (Apr. 2011).
[37] José M. Fernandez et al. “Algorithmic Cooling of Spins: A Practicable Method for Increas-
ing Polarization”. In: International Journal of Quantum Information 02 (2004), pp. 461–
477.
[38] Amazon Web Services. Amazon Braket. 2020. url: https://aws.amazon.com/braket/.
[39] John Cardy. Scaling and Renormalization in Statistical Physics. Cambridge University
Press, 1996, pp. 52–54.
[40] Walter Rudin. Principles of mathematical analysis. McGraw-Hill Book Company, Inc.,
New York-Toronto-London, 1953, p. 210.
[41] Feifei Gao and Alex B Gershman. “A generalized ESPRIT approach to direction-of-arrival
estimation”. In: IEEE signal processing letters 12.3 (2005), pp. 254–257.
[42] Tukaram Baburao Lavate, VK Kokate, and AM Sapkal. “Performance analysis of MUSIC
and ESPRIT DOA estimation algorithms for adaptive array smart antenna in mobile
communication”. In: 2010 Second International Conference on Computer and Network
Technology. IEEE. 2010, pp. 308–311.
[43] Bjorn Ottersten and Thomas Kailath. “Direction-of-arrival estimation for wide-band sig-
nals using the ESPRIT algorithm”. In: IEEE Transactions on Acoustics, Speech, and
Signal Processing 38.2 (1990), pp. 317–327.
[44] Shahram Shahbazpanahi, Shahrokh Valaee, and Mohammad Hasan Bastani. “Distributed
source localization using ESPRIT algorithm”. In: IEEE Transactions on Signal Processing
49.10 (2001), pp. 2169–2178.
[45] Weilin Li, Wenjing Liao, and Albert Fannjiang. Super-resolution limit of the ESPRIT
algorithm. 2019. url: https://arxiv.org/abs/1905.03782.
[46] Andrew Siegel. Practical Business Statistics. 7th ed. Academic Press, 2016, pp. 367–369.
[47] Pauli Virtanen et al. “SciPy 1.0: Fundamental Algorithms for Scientific Computing in
Python”. In: Nature Methods 17 (2020), pp. 261–272.
[48] Scott Aaronson and Daniel Gottesman. “Improved simulation of stabilizer circuits”. In:
Physical Review A 70.5 (Nov. 2004).
[49] Hsin-Yuan Huang, Richard Kueng, and John Preskill. “Predicting many properties of a
quantum system from very few measurements”. In: Nature Physics 16.10 (June 2020),
pp. 1050–1057.
[50] Gautam Mandal. A review of the D1/D5 system and five dimensional black hole from
supergravity and brane viewpoint. 2000.
[51] Daniel P. Arovas et al. “The Hubbard Model”. In: Annual Review of Condensed Matter Physics 13 (2022), pp. 239–274.
[52] Sajant Anand et al. Holographic quantum simulation of entanglement renormalization
circuits. 2022.
[53] Roger Penrose. “Applications of negative dimensional tensors”. In: Combinatorial math-
ematics and its applications 1 (1971), pp. 221–244.


[54] Jacob C. Bridgeman and Christopher T. Chubb. “Hand-waving and interpretive dance:
an introductory course on tensor networks”. In: Journal of Physics A: Mathematical and
Theoretical 50 (2016).
