Academia.eduAcademia.edu

Continuous-time Monte Carlo methods for quantum impurity models

2011, Reviews of Modern Physics

Quantum impurity models describe an atom or molecule embedded in a host material with which it can exchange electrons. They are basic to nanoscience as representations of quantum dots and molecular conductors and play an increasingly important role in the theory of "correlated electron" materials as auxiliary problems whose solution gives the "dynamical mean field" approximation to the self energy and local correlation functions. These applications require a method of solution which provides access to both high and low energy scales and is effective for wide classes of physically realistic models. The continuous-time quantum Monte Carlo algorithms reviewed in this article meet this challenge. We present derivations and descriptions of the algorithms in enough detail to allow other workers to write their own implementations, discuss the strengths and weaknesses of the methods, summarize the problems to which the new methods have been successfully applied and outline prospects for future applications.

1 Continuous-time Monte Carlo methods for quantum impurity models Emanuel Gull and Andrew J. Millis arXiv:1012.4474v2 [cond-mat.str-el] 5 May 2011 Department of Physics, Columbia University, New York, NY 10027, USA Alexander I. Lichtenstein Institute of Theoretical Physics, University of Hamburg, 20355 Hamburg, Germany Alexey N. Rubtsov Department of Physics, Moscow State University, 119992 Moscow, Russia Matthias Troyer and Philipp Werner Theoretische Physik, ETH Zurich, 8093 Zurich, Switzerland Quantum impurity models describe an atom or molecule embedded in a host material with which it can exchange electrons. They are basic to nanoscience as representations of quantum dots and molecular conductors and play an increasingly important role in the theory of “correlated electron” materials as auxiliary problems whose solution gives the “dynamical mean field” approximation to the self energy and local correlation functions. These applications require a method of solution which provides access to both high and low energy scales and is effective for wide classes of physically realistic models. The continuous-time quantum Monte Carlo algorithms reviewed in this article meet this challenge. We present derivations and descriptions of the algorithms in enough detail to allow other workers to write their own implementations, discuss the strengths and weaknesses of the methods, summarize the problems to which the new methods have been successfully applied and outline prospects for future applications. CONTENTS I. Introduction A. Overview B. Quantum impurity models: definitions and examples C. State of the art prior to continuous time QMC D. Why continuous time? II. Diagrammatic Monte Carlo in continuous time A. Basic ideas B. Monte Carlo basics: sampling, errors, Markov chains and the Metropolis algorithm C. Diagrammatic Monte Carlo – the sampling of path integrals and other diagrammatic expansions D. The negative sign problem III. Interaction Expansion algorithm CT-INT A. Partition function expansion B. Updates C. Measurements D. Generalization to clusters, multi-orbital problems and retarded interactions 2 2 2 4 5 7 7 8 9 10 11 11 13 13 13 IV. Continuous-time auxiliary field algorithm CT-AUX A. Partition function expansion B. Updates C. Measurements 1. Measurement of the Green’s function 2. Role of the parameter K – potential energy V. Hybridization expansion solvers CT-HYB A. The hybridization expansion representation B. Density - density interactions C. Formulation for general interactions D. Krylov implementation E. Updates F. Measurements 14 14 15 16 16 16 16 16 17 17 18 19 19 VI. Infinite-U method CT-J A. Overview B. Partition function expansion C. Updates D. The Kondo model 20 20 21 21 22 VII. Phonons and retarded interactions A. Models B. CT-HYB 23 23 23 2 C. CT-INT VIII. Expansion on the Keldysh contour: Real-time and non-equilibrium physics A. Introduction B. Keldysh formalism C. Real-time CT-AUX D. Sign problem IX. Comparison of the efficiency of the different methods 1. Average expansion orders and matrix sizes 2. Accuracy for constant CPU time 25 26 26 26 27 29 29 29 30 X. Technical aspects 31 A. Inverse matrix formulas 31 B. Spin-flip updates 32 1. Delayed spin-flip updates 32 C. Efficient measurements in the CT-AUX and CT-INT formalism 33 1. Self energy binning measurement 33 D. Green’s function (“worm”) –sampling 34 E. Wang Landau sampling 35 F. Computation of the trace for general interactions in the hybridization expansion 36 1. Block diagonalization 36 2. Basis truncation 37 3. Binning and tree algorithms for the hybridization expansion 37 G. Use of symmetries, global updates 38 H. Vertex functions 38 I. High frequency expansions of the self energy 39 XI. Applications I: DMFT 39 A. Overview 39 B. Single-site DMFT approximation to the single orbital Hubbard model 40 C. Cluster dynamical mean field theory of the single orbital Hubbard model. 42 D. Dual-fermion calculations for the single-orbital Hubbard model 44 XII. Applications II: DMFT for multi-orbital and Kondo models A. Heavy Fermion compounds and the Kondo Lattice Model B. Dynamical mean field theory for realistic models of correlated materials 46 47 49 XIII. Applications III: Nanoscience A. Transport through quantum dots: linear response and quantum phase transitions B. Metal atom clusters on surfaces 51 51 XIV. Applications IV: non-equilibrium impurity models and nanoscale transport A. Overview B. Results: Current-voltage characteristics 1. Real-time CT-HYB 2. Real-time CT-AUX 3. Nonequilibrium DMFT 53 53 53 53 53 54 XV. Prospects and open issues 51 55 Acknowledgments 56 References 57 I. INTRODUCTION A. Overview This article aims to provide a comprehensive overview of recent developments which have made continuoustime quantum Monte Carlo (CT-QMC) approaches the method of choice for the solution of broad classes of quantum impurity models. We present derivations and descriptions of the algorithms in enough detail to allow other workers to write their own codes, and give a general introduction to diagrammatic Monte Carlo methods on which these algorithms are based. We discuss the strengths and weaknesses of the methods, and their range of applicability. We summarize the problems to which the new methods have been successfully applied, and outline prospects for future applications. We hope that readers will come away from the review with an appreciation of the power and flexibility of the techniques and with the knowledge needed to apply them to new generations of problems in nanoscience, correlated electron physics, nonequilibrium systems and other areas. But before entering into specifics it is worth asking: ‘what are quantum impurity models?’ and also ‘why study them with continuous time -methods?’ B. Quantum impurity models: definitions and examples Quantum impurity models were introduced to describe the properties of a nominally magnetic transition metal ion embedded in a non-magnetic host metal. A magnetic transition metal atom such as Fe and Co has a partly filled d shell, and the intra-d Coulomb interactions act to organize the electrons in the d-shell into a high-spin local moment configuration. Hopping from the d shell to the metal or vice versa favors non-magnetic configurations and thus competes with the local interactions. In 1961 P. W. Anderson (Anderson, 1961), following important earlier work of Friedel (1951, 1956), wrote down a mathematical model (now referred to as the Anderson Impurity Model) which encodes this competition. Anderson’s concept has proven enormously fruitful, with implications extending far beyond its original context of impurity magnetism. Quantum impurity models are basic to nanoscience as representations of quantum dots and molecular conductors (Hanson et al., 2007) and have been used to understand the adsorption of atoms onto surfaces (Brako and Newns, 1981; Langreth and Nordlander, 1991). They are of theoretical interest as solvable examples of nontrivial quantum field theories (Affleck, 2008; Wilson, 1975) and in recent years have played an increasingly important role in condensed matter physics as auxiliary problems whose solution gives the “dynamical mean field” (DMFT) approximation to the properties of correlated electron materi- 3 als such as high temperature copper-oxide and pnictide superconductors (Georges et al., 1996; Held et al., 2006; Kotliar et al., 2006). A quantum impurity model (see e.g. (Mahan, 2000)) may be represented as a Hamiltonian with three basic terms: Hloc which describes the “impurity”: a system with a finite (typically small) number of degrees of freedom, Hbath which describes the noninteracting but infinite (continuous spectrum) system to which the impurity is coupled, and Hhyb which gives the coupling between the impurity and bath. Thus HQI = Hloc + Hbath + Hhyb . (1) It is sometimes convenient to represent the partition function Z of the impurity model as an imaginary time path integral (Negele and Orland, 1988). In this representation it is easy to formally eliminate the bath degrees of freedom (a technique pioneered by Feynman and Vernon (1963)), obtaining an action which for Hamiltonians involving a hybridization of the form of Eq. (6) is Z Z = D[d† , d]e−S , (8) Z Z β h X  S= dτ dτ ′ d†a (τ ) ∂τ + E ab δ(τ − τ ′ ) (9) ab The physics represented by HQI is in general nontrivial because [Hloc , Hhyb ] 6= 0 (in physical terms, coupling to the bath mixes the impurity eigenstates). In the situation of primary physical interest Hloc may be represented in terms of a set of single-particle fermion states labeled by quantum numbers a = 1, . . . , N (including both spatial and spin degrees of freedom) and created by operators d†a as 0 I Hloc = Hloc + Hloc , X 0 ab † Hloc = E da db , (2) (3) ab I Hloc = X I pqrs d†p d†q dr ds + . . . . (4) 0 Z i + ∆ab (τ − τ ′ ) db (τ ′ ) + 0 kα The most commonly used form of the mixing term is characterized by a hybridization matrix V X Hhyb = Vkαb c†kα db + H.c., (6) kαb although exchange couplings of the form X exchange Hhyb = c† c d† d Jkabcd 1 k2 k1 a k2 b c d (7) k1 k2 abcd also arise, most famously in the “Kondo problem” of a spin exchange-coupled to a bath of conduction electrons (Kondo, 1964). Coupling of impurity models to oscillators (representing for example phonons in a solid) has also been considered. A discussion in the CT-QMC context is presented in Section VII. I dτ Hloc (τ ). In this formulation the hybridization function X aα −1 ∆ab (iωn ) = V ∗ k (iωn − εkα ) Vkαb (10) kα compactly encapsulates those aspects of the bath that are relevant to the impurity model physics. It will play a crucial role in our subsequent discussions. It is also often useful to define the noninteracting impurity model Green’s function G 0 via G 0 = −(∂τ + E + ∆)−1 . pqrs The ab components of the matrix E describe the bare level structure, I parametrizes electron electron interactions and the ellipsis denotes terms with 6 or more fermion operators. Hbath may be thought of as describing bands of itinerant electrons, each labeled by a one-dimensional momentum coordinate k or band energy εk and an index (spin and orbital) α. One usually writes X Hbath = εkα c†kα ckα . (5) β (11) The paradigmatic quantum impurity model is the single-impurity single-orbital Anderson model (Anderson, 1961). In this model, Hloc describes a single orbital, so the label a is spin up or down, E ab is (in the absence of magnetic fields) just a level energy ε0 , and the interaction term collapses to U n↑ n↓ . Thus X HAIM = ε0 d†σ dσ + U n↑ n↓ (12) σ  X X + Vk c†kσ dσ + H.c. + εk c†kσ ckσ . kσ kσ Impurity models with more degrees of freedom are assuming increasing importance. More degrees of freedom means a richer variety of physical phenomena, implying a more complicated structure for the interactions. For example, in transition metal oxide materials with partially filled d shells or in compounds involving rare earth or actinide atoms with partially filled f shells, the interactions express not only the energy cost of multiply occupying the atom but also the Hund’s rule physics that states of maximal spin and orbital angular momentum are preferred. Thus the interaction Hamiltonian describing the energetics of different configurations of electrons in the d orbitals which play an important role in the physics of transition metal oxides with cubic perovskite structures is normally written in the “Slater-Kanamori (SK)” form (Imada et al., 1998; Mizokawa and Fujimori, 1995): 4 C. State of the art prior to continuous time QMC I Hloc = HSK ≡ U + (U − 3J) −J X a6=b X a X na↑ na↓ + (U − 2J) X na↑ nb↓ a6=b naσ nbσ a>b,σ  d†a↑ d†a↓ db↑ db↓ + d†a↑ d†b↓ db↑ da↓ . (13) In nanoscience applications the impurity typically represents the highest occupied (HOMO) and lowest unoccupied (LUMO) molecular orbitals and the interactions are computed from Coulomb matrix elements involving these orbitals. In “cluster” dynamical mean field applications the “impurity” is thought of as a (typically small) number of sites with the E ab representing an intersite hopping Hamiltonian, thus in a two-site approximation to the Hubbard model  X  † (14) Hloc = Hcl = ε0 d1σ d1σ + d†2σ d2σ σ  X  † + t d1σ d2σ + d†2σ d1σ + U (n1↑ n1↓ + n2↑ n2↓ ) . σ Solving the quantum impurity model means computing the correlation functions of the d operators. Of these the most important is the d Green function (Tτ denotes timeordering). D E † Gab (τ ) = − T d (τ )d (0) . (15) τ a d b 0,ab (iωn ) ≡ In the absence of interactions, Gab d (iωn ) = Gd −1 [(iωn − E − ∆) ]ab . The effect of interactions may be −1 parametrized by the self energy Σ(iωn ) = G 0 −G−1 . Solving the quantum impurity model is conceptually and algorithmically challenging. As Eq. (9) demonstrates, a quantum impurity model is a quantum field theory in 0 space + 1 time dimension. While 0 + 1 dimensional quantum field theories are easier to solve than higher dimensional ones, they are still (in the general case) nontrivial. Only in a few cases are exact solutions known, and while in many more cases the form of the “universal” low energy behavior has been determined, the dynamical mean field and nanoscience applications require information about behavior beyond the universal limit, as well as quantitative information about the parameters describing the universal limit. A further complication is that impurity models typically involve several energy scales, including an interaction scale, often high, a hybridization scale, typically intermediate, and one or more dynamically generated energy scales, which in many cases are very low relative to the basic interaction and hybridization scales. A robust method which works for general models over a range of energy scales is required. Quantum impurity models have been of longstanding interest and a wide range of approximate techniques have been developed to solve them, including perturbative expansions in coupling constant (Yosida and Yamada, 1975) and in flavor degeneracy (Coleman, 1984; Read and Newns, 1983), perturbative (Anderson et al., 1970; Solyom and Zawadoswki, 1974) and functional (Hedden et al., 2004) renormalization group, as well as “X-operator” techniques (Gunnarsson and Schönhammer, 1983; Hubbard, 1964; Keiter and Kimball, 1970) and different formulations of auxiliary (“slave”) particle methods (Abrikosov, 1965; Barnes, 1976; Coleman, 1984; Florens and Georges, 2004; Read and Newns, 1983). An important subclass of analytical methods is based on the resummation to all orders of a particular subset of diagrams. In the impurity model context the most important of these are the noncrossing approximation (NCA) (Bickers, 1987) and its generalizations (Haule et al., 2001; Pruschke and Grewe, 1989). The terminology refers to the structure of diagrams in an expansion in the hybridization. In this expansion contractions of bath operators are represented as lines. If one uses a time-ordered perturbation theory diagrams may be classified by the number of times that lead operator lines cross and all diagrams with zero or one crossing may be analytically summed by solving an integral equation. While uncontrolled, these approximations capture many aspects of the physics of impurity models and can be formulated on the real frequency axis. They have therefore been used as inexpensive solvers for impurity models and to study nonequilibrium phenomena in nano-contacts (Wingreen and Meir, 1994). Very recently, techniques closely related to those described here have been used to formulate a numerically exact solution based on an expansion around the NCA (Gull et al., 2010). It is found that the approximations are accurate in the Mott insulating phase but do not capture the important diagrams in the metallic phase of the Anderson impurity model. Powerful field-theoretical and Betheansatz-based analytical methods have been developed to classify and compute exactly the universal low energy behavior (Affleck, 2008; Andrei et al., 1983; Wilson, 1975) of broad classes of quantum impurity models. However, the dynamical mean field and nanoscience applications require an approach that works for wide classes of physically relevant impurity models and gives access to physics beyond the universal limit. Thus, while analytical methods provide very valuable insights, they do not provide the comprehensive solutions, valid over a wide range of frequencies, that are needed for modern applications. Starting from work of Wilson (1975) and subsequently of White (1992), (see (Schollwöck, 2005) for a review), an important set of numerical methods has been developed based on intelligently chosen truncations of the 5 Hilbert space of the many-body problem in question. The “numerical renormalization group” (NRG) methods are based on iterative diagonalization using a logarithmic discretization of the energy spectrum of the lead states and are reviewed for example in (Bulla et al., 2008) while the “density matrix renormalization group” (DMRG) techniques involve an isolation of the relevant low-lying states. These methods are complementary to the methods discussed here: CT-QMC methods are most naturally formulated in imaginary time and very efficiently handle a wide range of energy scales and relatively general classes of models, but require analytical continuation to obtain real-time information and have difficulty resolving subtle low energy features such as the fine structure of quantum criticality. On the other hand, the NRG and DMRG methods can be formulated directly on the real frequency axis or in real time and are particularly powerful in resolving ground states and low-lying levels but encounter difficulties in providing information over a wide range of frequencies and the difficulties increase rapidly as one moves beyond the simple Anderson/Hubbard models. Both DMRG (Hallberg, 2006; Nishimoto et al., 2006) and NRG (Bulla et al., 2008) methods have been implemented as “solvers” for the quantum impurity models of dynamical mean field theory. But except for the single-orbital Anderson model, where NRG methods have proven to be useful, especially in situations where a precise understanding of the very low energy behavior is crucial, NRG and DMRG solvers are not in widespread and general use in the DMFT community. A more widely applied class of techniques is based on the “exact diagonalization” (ED) idea introduced in the early days of dynamical mean field theory by Caffarel and Krauth (1994). These authors approximated the continuum of bath energies and values of the hybridization by a small number of variationally chosen eigenstates and hybridization functions. HQI then becomes a finite system, which is exactly diagonalized, leading to a Gd characterized by a delta function spectrum. The cost scales exponentially with the number of sites considered. The largest systems which are typically studied contain on the order of 15 sites with one non-degenerate orbital on each site. Thus, in the singleimpurity Anderson model, Eq. (12), the continuum of bath states ckσ may be approximated by 7 or 8 (×2 for spin) orbitals while for say a three orbital model only two or three bath orbitals per impurity state can be accommodated. With the development of more modern algorithms and computers, enough bath sites can be included that for the single-orbital Anderson impurity model the temperature dependence can be computed, the convergence of results with bath size can be studied (Capone et al., 2007), and systematic comparisons to other methods can be made (Comanac et al., 2008; Werner et al., 2006). Recently, results on small clusters (Civelli et al., 2005; Kancharla et al., 2008; Kyung et al., 2006a; Liebsch and Tong, 2009) and single-impurity, multiorbital models (Liebsch, 2005; Liebsch et al., 2008) have also been obtained, although here the number of bath sites per orbital is limited and the convergence with bath site number cannot yet be addressed rigorously (Koch et al., 2008). Quantum Monte Carlo techniques provide a general method for solving quantum field theories, and prior to the development of CT-QMC methods the principal impurity solver was the Hirsch-Fye quantum Monte Carlo method (Hirsch and Fye, 1986). This method is based on writing an imaginary-time functional integral, discretizing the interval [0, β) into M equally spaced “time-slices” ∆τ = β/M and then on each time-slice i applying a discrete Hubbard-Stratonovich transformation which for the single-orbital Anderson model is   n +n 1 X λsi (n↑ −n↓ ) −∆τ U n↑ n↓ − ↑ 2 ↓ e , (16) = e 2 s =±1 i    1 λ = arcosh exp ∆τ U . (17) 2 For a fixed choice of Ising variables {si } the problem thus becomes a noninteracting fermion model in a time dependent z-oriented magnetic field h(τi ) = si which may be formally solved, so one is left with the problem of sampling the trace over the 2M dimensional space of the si . The Hirsch-Fye method was for almost two decades the method of choice, but ultimately three difficulties limit its power. The first is that it requires an equally spaced time discretization. (A linear-in-β method for impurity models (Khatami et al., 2010), while fast, similarly requires discretization both of the bath and the imaginary time axis.) The second is that at large interactions and low temperatures equilibration may become an issue. While techniques have been developed to ameliorate these problems (Blümer, 2008; Gorelik and Blümer, 2009) and new update techniques have been proposed (Alvarez et al., 2008; Nukala et al., 2009), the difficulties of managing the discretization and equilibration issues within HirschFye are real. It appears that the CT-QMC methods discussed here are now preferred by most practitioners. The third, and most fundamental difficulty is that for interactions other than the simple one-orbital Hubbard model the Hubbard-Stratonovich fields required to decouple the interactions proliferate and may have to be chosen complex so sampling the space of auxiliary fields becomes prohibitively difficult (Sakai et al., 2006b). D. Why continuous time? Imaginary-time path integral representations of quantum problems such as Eq. (9) are mathematically defined (see, e.g. Negele and Orland, 1988 and references therein) in terms of the result of a limiting process in which one rewrites the partition function, Z = 6 exp(−βH), of a system described by a Hamiltonian H at temperature T = 1/β by defining ∆τ = β/N , τj = j∆τ as Z = e−(τN −τN −1 )H e−(τN −1 −τN −2 )H . . . e−(τ1 −τ0 )H . (18) The path integral is defined by inserting complete sets of states between every pair of exponentials and then taking the limit ∆τ → 0. This mathematical definition motivates a numerical approach (Suzuki, 1976) in which one approximates the path integral by (i) retaining a non-zero ∆τ and (ii) using a Monte Carlo method to estimate the sums over all intermediate states. The exact partition function is recovered after the twin steps of converging the Monte Carlo and extrapolating the results to ∆τ = 0. While clever and efficient methods (for example, the Hirsch-Fye procedure (Hirsch and Fye, 1986) mentioned in the previous subsection) have been devised for performing the Monte Carlo, the time step extrapolation remains an issue. The difficulties are particularly severe for the quantum impurity problems of interest here because the basic object in the theory is the Green function, which drops rapidly as τ is increased from 0 and has discontinuous derivatives at τ = 0, β which need to be correctly evaluated (see e.g. Fig. 2 in (Werner et al., 2006)). The discretization errors are large, and a very small ∆τ and a precise extrapolation to ∆τ = 0 are required to obtain accurate results. However, the low energy behavior of interest is carried by times τ ∼ β/2, so that simulations on a homogeneous grid require many points. Methods which do not involve an explicit time discretization) would therefore appear to be advantageous. The basic idea behind all of the continuous time methods discussed in this review is to avoid the time discretization entirely by sampling the terms in a diagrammatic expansion, instead of sampling the configurations in a complete set of states. One of the first important methods to do this is Handscomb’s method (Handscomb, 1962, 1964). This method and its generalization, the stochastic series expansion (SSE) algorithm (Sandvik and Kurkijärvi, 1991) are based on a Taylor expansion of the partition function in powers of βH and have been successful for quantum magnets. However, they require that the spectrum of the Hamiltonian is bounded from above so applications to boson problems require a truncation of the Hilbert space while applications to fermion problems are limited by a bad sign problem. The continuous time methods in use now stem from work of (Prokof’ev et al., 1996) and (Beard and Wiese, 1996), who showed that simulations of bosonic lattice models can be implemented simply and efficiently in continuous-time by a stochastic sampling of a diagrammatic perturbation theory for the partition function. The general scheme for treating diagrams with continuous variables of arbitrary nature – diagrammatic Monte Carlo – is formulated in (Prokof’ev and Svistunov, 1998; Prokof’ev et al., 1998). In these methods the systematic errors associated with time discretization and the Suzuki-Trotter decomposition were eliminated. The gain in computational efficiency is so large that the problem of simulating unfrustrated bosonic lattice models can now be considered as solved, although special cases, for example bosons coupled to a gauge field (rotating atomic gases, charged bosons in a magnetic field) remain challenging. The success of CT-QMC methods for bosons stimulated efforts to adapt the technique to fermionic problems (Rombouts et al., 1999). However, in contrast to standard (unfrustrated) bosonic systems, where diagrams all have the same signs, in fermionic models individual diagrams may have positive or negative signs, so that the sampling of individual diagrams suffers from a severe sign problem. This sign problem may be reduced by combining classes of diagrams analytically into determinants. Unfortunately, Rombouts and collaborators found that a prohibitively severe sign problem remained in the parameter regimes relevant to strong correlation physics. This, and the fact that the lattice algorithm given in (Rombouts et al., 1999) was restricted to density-density interactions, caused many researchers to abandon the approach – except for the special case of sign-problemfree models with an attractive interaction, where CTQMC methods have successfully been used to investigate the BEC-to-BCS crossover in ultracold atomic gases (Burovski et al., 2008, 2006a,b). The increasing importance of impurity models has motivated a reexamination of CT-QMC methods. Impurity models turn out to have a much less severe sign problem than the full lattice problem (indeed in some cases the sign problem is absent). The reduction in severity of the sign problem has allowed the development of flexible and powerful continuous-time quantum Monte Carlo impurity solvers, first in a weakcoupling formulation (Rubtsov and Lichtenstein, 2004; Rubtsov et al., 2005), soon thereafter in a complementary hybridization expansion formulation (Werner et al., 2006), and more recently in an auxiliary field formulation (Gull et al., 2008a). These methods have quickly been extended in many directions and applied to numerous dynamical mean field studies of model Hamiltonians. They enabled accurate simulations of the Kondo lattice model (Otsuki et al., 2009a), the first quantitative studies of multi-orbital models with realistic rotationally invariant (non-diagonal) interactions (Chan et al., 2009; Haule, 2007; Rubtsov et al., 2005; Werner et al., 2008; Werner and Millis, 2006, 2007c) and allowed much more efficient simulations of the multi-site clusters needed to study spatial correlation effects within dynamical mean field theory (Ferrero et al., 2009a; Gull et al., 2009, 2008b; Haule and Kotliar, 2007b; Mikelsons et al., 2009a; Park et al., 2008a; Sordi et al., 7 2010; Werner et al., 2009a). They have also enabled more realistic “LDA+DMFT” studies of materials (Marianetti et al., 2007). Continuous time quantum Monte Carlo methods can also be used to efficiently compute four-point correlation functions, which are important for susceptibilities, phase boundaries, and in connection with recently developed extensions of dynamical mean field theory (Kusunose, 2006; Rubtsov et al., 2008; Slezak et al., 2009; Toschi et al., 2007). The methods have been applied to nanoscience topics including the properties of transition metal clusters on metal surfaces (Gorelov, 2007; Savkin et al., 2005). Previously inaccessible physics questions such as the quasiparticle dynamics and thermal crossovers in heavy fermion materials are being addressed (Haule and Kotliar, 2007a; Park et al., 2008b; Shim et al., 2007) and applications to questions motivated by experiments on fermions in optical lattices have begun to appear (Dao et al., 2008; De Leo et al., 2008). Extensions to nonequilibrium problems are now under development (Mühlbacher and Rabani, 2008; Schiró and Fabrizio, 2009; Werner et al., 2010, 2009b). While the new CT-QMC methods have been transformative, opening wide classes of problems to systematic study, they have not solved the fermion sign problem. As far as is known, sign problems are physical and unavoidable, at least in itinerant phases with unpaired fermions (Troyer and Wiese, 2005) and indeed set the ultimate limits on the problems and parameter regimes which can be studied by the continuous time methods discussed here. Further discussion of sign problems will be given in Sec. II.D and in the context of the discussion of specific algorithms. II. DIAGRAMMATIC MONTE CARLO IN CONTINUOUS TIME A. Basic ideas The basic idea of the CT-QMC methods is very simple. One begins from a Hamiltonian H = Ha + Hb which is split into two parts labeled by a and b, writes the partition function Z = e−βH in the interaction representation with respect to Ha and expands in powers of Hb , thus (Tτ is the time ordering operator) # " Z Z =Tr Tτ e−βHa exp − = X k k (−1) Z 0 β dτ1 . . . β dτ Hb (τ ) 0 Z β dτk τk−1   × Tr e−βHa Hb (τk )Hb (τk−1 ) . . . Hb (τ1 ) . (19) The trace evaluates to a number and diagrammatic Monte Carlo methods (Prokof’ev and Svistunov, 1998) enable a sampling over all orders k, all topologies of the paths/diagrams and all times τ1 , · · · , τk in the same calculation. Because the method is formulated in continuous time from the beginning, time discretization errors do not have to be controlled and the simulation can be arranged to ensure that the method focuses attention on the time regions which are most important to the process under study. Provided the spectrum of the perturbation term is bounded from above the contributions of very 1 large orders are exponentially suppressed by the factor k! originating from the expansion of an exponential. Thus the sampling process does not run off to infinite order and no truncation of the diagram order is needed. (Note that for bosonic operators a perturbation in the interaction would be divergent since the spectrum cannot be bounded from above unless a cutoff in bosonic occupation number is introduced (Itzykson and Zuber, 2006), so an expansion in the hybridization is usually employed.) The method does not rely on an auxiliary field decomposition although it may be advantageously combined with one (Gull et al., 2008a). Further, the method does not rely on a particular partitioning into “interacting” and “noninteracting” parts; in principle the only requirement is that one may decompose the Hamiltonian in such a way that the time evolution associated with Ha and the contractions of operators Hb may easily be evaluated. In practice, the sign associated with interchanges of fermion operators means that the expansion must be arranged such that terms differing only in the contractions of fermion operators are combined; for example into determinants. In the impurity model context four types of expansion have been formulated, which we refer to as CT-HYB I , Eq. (4)), CT(Hb = Hhyb , Eq. (6)), CT-INT (Hb = Hloc I AUX (Hb = Hloc but with an additional auxiliary field decomposition) and CT-J (an expansion for Kondo-like exchange problems with Hb = Hhyb , Eq. (7)). The advantage of the hybridization expansion is that arbitrarily complicated impurity interactions can easily be treated; the disadvantage is that because [Hhyb , Hloc ] 6= 0 at least one of the operators is non-diagonal so the expansion generically requires the manipulation of matrix blocks whose size grows exponentially with the number of impurity orbitals. The present state of the art is that 5 spin-degenerate orbitals can be treated. Various truncation and approximation schemes provide limited access to larger problems but as the number of orbitals is increased the difficulties rapidly become insurmountable. CT-INT and CT-AUX are variations of an “interaction expansion”. They are sometimes referred to as “weak coupling” expansions, but this is a misnomer – the expansion is in powers of the interaction but is not (in principle) restricted to small interactions. The series is always convergent for nonzero temperature and finite number of orbitals. In CT-INT and CT-AUX the scaling with number of impurity orbitals is not exponential, so much 8 larger systems can be treated. However, the methods are most suited to Hubbard-like models with a single local density-density interaction. More complicated interactions typically require multiple expansions in the several vertices and if the interactions do not commute (as is the case for the components of the spin exchange) the difficulties increase. CT-J is an expansion organized for Kondo-like models where the interaction vertex also creates particle-hole pairs in the conduction bands. It combines aspects of both the interaction and hybridization expansion. While all of the expansions are based on the same general idea, there are significant differences in the specifics of how the expansion is arranged, the measurements are done and the errors are controlled. We therefore devote a separate section to each expansion. In the remainder of this section we provide an overview of general aspects of continuous time Monte Carlo methods. ability p(x)/Z and averaging the contributions A(xi ): hAip ≈ hAiMC ≡ In this subsection we recall some basic results pertaining to the Monte Carlo evaluation of high dimensional integrals. For reader unfamiliar with Monte Carlo, the books by Landau and Binder (2000) and Krauth (2006) give an extensive introduction to the technique. In the CT-QMC methods, as in many other classical or quantum many-body problems, one is faced with the issue of evaluating sums over phase spaces or configuration spaces which we denote generically by C. C is typically of a very high dimension, so Monte Carlo techniques are the only practical methods of evaluation. A crucial quantity is the partition function, Z, which we will write formally as an integral over configurations x ∈ C with weight p(x): Z= Z dxp(x). (20) C In a classical system x might be a point in phase space with a Boltzmann weight p(x) = exp(−βE(x)), where E(x) is the energy of the configuration x. In the quantum problems described here x will represent a particular term in a diagrammatic partition function expansion. The expectation value of a quantity A is given by the average, over the configuration space C with weight p, of a quantity A(x): hAip = 1 Z Z dxA(x)p(x), (21) C The auxiliary quantity A(x) depends on the specific representation chosen in a particular algorithm. The average (21) can be estimated in a Monte Carlo procedure by selecting M configurations xi with a prob- (22) According to the central limit theorem, if the number of configurations is large enough the estimate (22) will be normally distributed around the exact value hAip with variance VarA (∆A)2 ≡ h(AMC − Ap )2 i = . (23) M It will sometimes be advantageous or necessary to sample configurations xi with a distribution ρ(x) different from p(x). The expectation value hAiρ in the ensemble then has to be reweighed: Z 1 hAi = dxA(x)p(x) Z C R dxA(x) p(x) hA pρ iρ ρ(x) ρ(x) C ≡ = R . (24) h pρ iρ dx p(x) ρ(x) C B. Monte Carlo basics: sampling, errors, Markov chains and the Metropolis algorithm M 1 X A(xi ). M i=1 ρ(x) To estimate this expectation value one needs to sample both the numerator and denominator and collect averages of A(xi )p(xi )/ρ(xi ) and p(xi )/ρ(xi ). Care must be taken in estimating the statistical errors of such ratios, since cross-correlations will make naı̈ve error propagation unreliable. A jackknife or bootstrap procedure (see, e.g. (Vetterling et al., 1992)) is needed. Integrals with general distributions such as Eqs. (20), (24) are best sampled by generating configurations using a Markov process. A Markov process is fully characterized by a transition matrix Wxy specifying the probability to go from state x to state y in one step of the Markov process. PNormalization (conservation of probabilities) requires y Wxy = 1. Starting from an arbitrary distribution the Markov process will converge exponentially to a stationary distribution p(x) if two conditions are satisfied. • Ergodicity: It has to be possible to reach any configuration x from any other configuration y in a finite number of Markov steps: for all x and y there exists an integer N < ∞ such that for all n ≥ N the probability (W n )xy 6= 0. • Balance: Stationarity implies that the distribution p(x) fulfills the balance condition Z dx p(x)Wxy = p(y), (25) C that is p(x) is a left eigenvector of the transition matrix Wxy . A sufficient but not necessary condition usually used instead of the balance condition is the detailed balance condition p(y) Wxy = , (26) Wyx p(x) which we will use below. 9 The first, and still most widely used, algorithm that satisfies detailed balance is the Metropolis-Hastings algorithm (Hastings, 1970; Metropolis et al., 1953). There, an update from a configuration x to a new configuration prop y is proposed with a probability Wxy but accepted only acc with probability Wxy . If the proposal is rejected the old configuration x is used again. The transition matrix is prop acc Wxy Wxy = Wxy (27) and the detailed balance condition (26) is satisfied by using the Metropolis-Hastings acceptance rate acc Wxy = min [1, Rxy ] . (28) with the acceptance ratio Rxy given by Rxy = prop p(y)Wyx prop p(x)Wxy (29) and Ryx = 1/Rxy . To simplify the notation we will often quote just Rxy , and imply that min[1, Rxy ] is the actual acceptance probability. Note that the acceptance ratio Rxy includes both the weights and the proposal probabilities. In the following sections we will always prop specify both the proposal probabilities Wxy and the acceptance ratios Rxy . b) a) 0 c) τ1 0 β β d) 0 τ1 τ2 β 0 τ1 τ2 τ3 β FIG. 1 Diagrammatic representation of configurations x = {(k; τ1 , . . . τk )} ∈ C showing examples with orders k = 0, 1, 2, 3 and vertices (represented by dots) at times τ1 , . . . , τ3 . could cause difficulties with proposal and acceptance probabilities in the random walk in configuration space. As (Beard and Wiese, 1996; Prokof’ev and Svistunov, 1998; Prokof’ev et al., 1996, 1998) showed, this is not the case. The various algorithms reviewed here differ in the representations, weights, and updates, as well as in the most convenient representation for the measurement of observables, but all express the partition function in the general form (30). To illustrate the Monte-Carlo sampling of such continuous-time partition function expansions and in particular to demonstrate that the infinitesimal does not cause problems, we consider the very simple partition function Z β Z β ∞ Z β X w(k) , (33) dτ1 dτk dτ2 · · · Z= k! 0 0 0 k=0 which using time ordering can be rewritten as C. Diagrammatic Monte Carlo – the sampling of path integrals and other diagrammatic expansions The partition function Eq. (19) may be expressed as a sum of integrals originating from a diagrammatic expansion: Z β ∞ X Z β X Z= dτk w(k, γ, τ1 , . . . , τk ), dτ1 . . . k=0 γ∈Γk 0 Z= ∞ Z X k=0 0 β dτ1 Z β τ1 dτ2 · · · Z β dτk w(k). (34) τk−1 The distribution describing the probability of a diagram of order k with vertices at times {τj } is (here we make the times explicit) τk−1 (30) which has the form of Eq. (20). The individual configurations are of the form x = (k, γ, (τ1 , . . . , τk )), (31) where k is the expansion or diagram order and τ1 , . . . , τk ∈ [0, β) are the times of the k vertices in the configuration. The parameter γ ∈ Γk includes all discrete variables, such as the topology of the diagram and spin, orbital, lattice site, and auxiliary spin indices associated with the interaction vertices. A configuration x has a weight p(x) = w(k, γ, τ1 , . . . , τk )dτ1 · · · dτk , (32) which we will assume to be non-negative for now. The case of negative weights is discussed in Sec. II.D. Although these weights are well-defined probability densities they involve infinitesimals dτ , which one might worry p((k, τ1 , . . . , τk )) = w(k) k Y dτi . (35) i=1 In the following we will always assume time-ordering τ1 ≤ τ2 ≤ . . . ≤ τk and visualize the configurations using a diagrammatic representation as in Fig. 1. Transitions between configurations x and y are realized by updates. Updates in diagrammatic Monte Carlo codes typically involve (i) updates that increase the order k by inserting an additional vertex at a time τ and (ii) updates that decrease the order k by removing a vertex τj . These insertion and removal updates are necessary to satisfy the ergodicity requirement and are often sufficient: we can reach any configuration from another one by removing all the existing vertices and then inserting new ones. Additional updates keeping the order k constant are typically not required for ergodicity but may speed up equilibration and improve the sampling efficiency. In some special circumstances, for example if all odd order 10 initialize simulation 0 τ1 τ2 e ov m re ert s in 0 τ1 β choose update τ3 τ2 β FIG. 2 An insertion update (top to bottom) inserting a vertex at time τ3 and the corresponding removal update (bottom to top), removing the vertex at time τ3 . diagrams have zero weight, updates which insert or remove multiple vertices are required. In the following we will focus on the insertion and removal updates, illustrated in Fig. 2. For the insertion let us start from a configuration (k, ~τ ) = (k, τ1 , . . . , τk ) of order k. We propose to insert a new vertex at a time τ uniformly chosen in the interval [0, β), to obtain a new time-ordered configuration (k + 1, ~τ ′ ) = (k + ′ 1, τ1 , . . . , τ, . . . , τk ) ≡ (k + 1, τ1′ , . . . , τk+1 ). The proposal rate for this insertion is given by the probability density prop W(k,~ τ ),(k+1,~ τ ′) dτ . = β prop = p((k + 1, ~τ ′ )) W(k+1,~τ ′ ),(k,~τ ) prop p((k, ~τ )) W(k,~ τ ),(k+1,~ τ ′) propose removal update measure observables compute proposal probability of update and inverse update, compute probability ratio of new to old configuration no Metropolis reject change, leave old configuration yes set configuration to new configuration (36) The reverse move is the removal of a randomly chosen vertex. The probability of removing a particular vertex to go back from (k + 1, ~τ ′ ) to (k, ~τ ) is just one over the number of available vertices: 1 prop (37) W(k+1,~ τ ′ ),(k,~ τ) = k + 1 . To obtain the Metropolis acceptance rates we first calculate the acceptance ratio R(k,~τ ),(k+1,~τ ′ ) = propose insertion update (38) ′ w(k + 1)dτ1′ · · · dτk+1 1/(k + 1) w(k + 1) β = . w(k)dτ1 · · · dτk dτ /β w(k) k + 1 Observe that all infinitesimals cancel: the additional infinitesimal in the weight p((k + 1, ~τ ′ )) is canceled by the infinitesimal of the proposal rate for insertions. Equation 38 implies that the acceptance rates W acc are well defined finite numbers given by   acc (39) W(k,~ τ ),(k+1,~ τ ′) , τ ),(k+1,~ τ ′ ) = min 1, R(k,~   acc W(k+1,~τ ′ ),(k,~τ ) = min 1, 1/R(k,~τ ),(k+1,~τ ′ ) . (40) Acceptance rates for updates that preserve the order k, such as shifting some of the τi times or updating the discrete parameters γ are straightforward to evaluate since there all infinitesimals cancel trivially. The general scheme of diagrammatic Monte Carlo algorithms is illustrated in Fig. 3. One cannot stress often enough that measurements are performed again on the old configuration if the proposed update has been rejected. FIG. 3 Continuous-time Quantum Monte Carlo flow diagram. D. The negative sign problem Until now we have tacitly assumed that the expansion coefficients of our partition function expansion are always positive or zero. This has allowed us to interpret the weights as probability densities on the configuration space and the stochastic sampling of these configurations in a Monte Carlo simulation. If the weights p(x) become negative, as is often the case in fermionic simulations due to the anti-commutation relations between fermionic operators, they can no longer be regarded as probabilities. The common solution is to sample with respect to the absolute value of the weight ρ(x) = |p(x))| and reweight the measurements according to Eq. (24). The ratio p(x)/ρ(x) is then just sign(p(x)) = p(x)/|p(x)|. This gives for the average (21) hAi = hA · signi|p| , hsigni|p| (41) which can be evaluated by sampling numerator and denominator separately with respect to the positive weight |p(x)|. While sampling with the absolute value and reweighing allows Monte Carlo simulations of systems with negative weights, it does not solve the “sign problem”. Sampling Eq. (41) suffers from exponentially growing errors. To 11 see this let us consider the average sign R dx sign(x)|p(x)| Z , = hsigni = C R Z|p| dx|p(x)| C (42) which is just the ratio of the partion function Z and the partition function of a “bosonic” system with positive weights |p(x)|. This ratio can be expressed through the difference ∆F in free energies of these two systems Z = exp(−β∆F ), hsigni = Z|p| (43) and decreases exponentially as the temperature is lowered or the volume of the system increased. The sign problem is thus the accurate measurement of this near-zero sign from individual measurements that are +1 or −1 , a cancellation problem. The variance of the sign is Var sign = hsign2 i − hsigni2 = 1 − exp(−2β∆F ) ≈ 1 (44) and the relative error after M measurements p Var sign/M exp(β∆F ) √ ∆sign = ≈ (45) hsigni M grows exponentially with decreasing temperature and increasing system size. The sign problem has been proven to be nondeterminstic polynomial (NP) hard, and hence in general no polynomial time solution is believed to exist (Troyer and Wiese, 2005). However, the severity of the sign problem (in the notation of Eq. (45) the magnitude of the coefficient exp(β∆F )) depends both on the model considered and on the representation chosen for the model. Impurity models tend to have less severe sign problems than comparable finite-sized lattice models (‘turning off’ the coupling to the bath often makes the sign problem worse). In special cases the sign problem is absent. For example, Yoo and coworkers proved that there is no sign problem in Hirsch-Fye simulations of the single impurity single orbital Anderson impurity model (Yoo et al., 2005), and this proof can be easily extended to some multi-orbital models and adapted to the continuous time algorithms presented in this review. A trivial sign problem arises if the operator −Hb is negative and odd perturbation orders are allowed. A simple example is the weak coupling expansion of the repulsive (positive U ) Hubbard model. In this particular case the sign problem may be avoided by a trick discussed in Section III.A. In the hybridization expansion a severe sign problem may occur if the hybridization function and the   bare level energy do not commute ∆ab , E ab 6= 0 (Wang and Millis, 2010). An apparently related difficulty occurs in the weak coupling approach if the hybridization function and interaction are not diagonal in the orbital/spin occupation number basis (Gorelov et al., 2009). In the larger systems dealt with in cluster dynamical mean field theory, fermion loops occur and produce a sign problem. Because the sign problem is model and representation dependent, further discussion is postponed to the sections pertaining to specific algorithms. III. INTERACTION EXPANSION ALGORITHM CT-INT The interaction expansion algorithm CT-INT was the first continuous-time impurity solver to be introduced (Rubtsov and Lichtenstein, 2004). It proceeds from I Eq. (19), with Hb taken to be the interaction part Hloc of 0 Eq. (4), and Ha = Hbath + Hhyb + Hloc (see Eqs. (1) and (2)). It has a better scaling with system size than the hybridization algorithm and can treat more general interactions than CT-AUX. A “trivial” sign problem arises for repulsive interactions, where terms of the form (−U )k appear. Elimination of this sign problem is an important issue in the design of the algorithm. A. Partition function expansion We illustrate the method by considering the simplest model, the one orbital single site Anderson impurity model Eq. (12) which, for this expansion, is most conveniently formulated in terms of the action S = S0 + SU with S0 = − SU = U X ZZ 0 σ Z β dτ dτ ′ d†σ (τ )Gσ0 (τ − τ ′ )−1 dσ (τ ′ ), (46) β dτ n↑ (τ )n↓ (τ ), (47) 0 where Gσ0 = (iωn − ǫ0 − ∆σ )−1 , and ǫ0 is the impurity energy level. We consider more general models in Sec. III.D. The expansion of the partition function in powers of U reads Z/Z0 = 1 + (−U ) 1! (48) Z (−U )2 2! + ··· , + β dτ1 hn↑ (τ1 )n↓ (τ1 )i0 0 ZZ 0 β dτ1 dτ2 hn↑ (τ1 )n↓ (τ1 )n↑ (τ2 )n↓ (τ2 )i0 R where the notation h. . .i0 = Z10 D[d† , d]e−S0 [. . .] denotes an average in the non-interacting ensemble with quadratic R action S0 (see low order terms in Fig. 4), and Z0 = D[d† , d]e−S0 . Employing Wick’s theorem (Wick, 1950) we may express the expectation value in terms of determinants of the non-interacting Green’s function 12 0 d†↑ (τ1 ) d↑ (τ1 ) d†↓ (τ1 ) d↓ (τ1 ) β G↑0 G↑0 G↑0 G↓0 0 G↓0 β G↓0 FIG. 4 Depiction of a third order term in the weak coupling expansion. Upper panel: Hubbard interaction vertices denoted by circles. Each U n↑ (τ )n↓ (τ ) - vertex has four operators. Lower panel: one possible contraction of the interaction vertices. −hT d(τi )d† (τj )i0 = G 0 (τi − τj ): hn↑ (τ1 )n↓ (τ1 )n↑ (τ2 )n↓ (τ2 ) · · · n↑ (τk )n↓ (τk )i0 = det D↑k det D↓k , (Dσk )ij = Gσ0 (τi − τj ). (49) (50) Summing the contractions into a determinant instead of sampling them individually reduces the size of the configuration space and avoids a sign problem coming from the fermionic exchange. We thus arrive at the following series for the partition function: ! Z ∞ Y X (−U )k β dτ1 . . . dτk det Dσk . (51) Z/Z0 = k! 0 σ k=0 Two “sign problems” may potentially occur in this expansion: an “intrinsic” sign problem arising from fermion exchange because the determinants might become negative and a “trivial” sign problem, arising for U > 0 from the (−U )k factor. The arguments of Yoo et al. (2005) prove that for the single impurity Anderson model each of the determinants is no-negative, so there is no intrinsic sign problem. This is not necessarily the case for the more general models considered in subsection III.D. The “trivial” sign problem arising for U > 0 can be managed in several ways. For the single band, single-impurity Anderson model (Rubtsov, 2003) showed that the replacement d†↓ → d˜↓ , d↓ → d˜†↓ leads to following changes in the parameters of the effective action: ǫ0↓ → −ǫ0↓ , ǫ0↑ → ǫ0↑ + U, ∆↓ (τ ) → −∆↓ (−τ ) U → −U. (52) The repulsive interaction becomes attractive and the “trivial” sign problem due to the interaction term vanishes. This approach performs a particle-hole transformation on the down spins only such that up and down spins are treated inequivalently. While the entire series formally maintains spin inversion symmetry (in the absence of a magnetic field), restoring it dynamically by Monte Carlo sampling is challenging in practice. It is better to avoid the symmetry breaking as follows. First, observe that the transformed Hamiltonian can be viewed in the original variables as an expansion in U n↑ (n↓ − 1); this leads to a down-spin determinant with diagonal elements replaced by G↓0 (0) − 1. The absence of a sign problem means that the down spin determinant must generate a minus sign that compensates the (−U ) factor. This approach may be generalized: expanding in powers of SU = U Z β 0 dτ (n↑ (τ ) − α↑ ) (n↓ (τ ) − α↓ ) , (53) with the corresponding change ǫ0σ → ǫ0σ − U α−σ , G 0 → G̃ 0 in S0 , leads to D E det Dσk = Tτ [nσ (τ1 ) − ασ ] · · · [nσ (τk ) − ασ ] 0 G̃σ0 (0) − ασ G̃σ0 (τ1 − τ2 ) · · · G̃σ0 (τ1 − τk ) .. . . G̃ 0 (τ − τ ) G̃σ0 (0) − ασ . . = σ 2. 1 .. .. .. .. . . . G̃σ0 (τk − τ1 ) ··· . (54) · · · G̃σ0 (0) − ασ Rubtsov (2003) showed that for α↑ + α↓ = 1, α↑ α ↓≤ 0 the trivial sign problem is absent. Finally, it is advantageous to avoid this explicit symmetry breaking at the cost of introducing an auxiliary field s =↑, ↓ and expanding in powers of Z U β X SU = dτ (n↑ (τ ) − αsτ ↑ ) (n↓ (τ ) − αsτ ↓ ) , (55) 2 0 s τ Expanding this action we get an additional random variable si =↑, ↓ at each vertex that needs to be sampled over. In practice this does not introduce any difficulties: all expressions remain the unchanged, apart from an an additional index αsi σ instead of ασ in the determinants of Eq. (54). In the actual calculation it is useful to take the parameter αsσ = 0.5 + δ for s = σ and αsσ = −δ otherwise. In principle, δ can be taken to be zero but setting it to a small positive value δ ≈ 0.01 allows to avoid numerical instabilities due to nearly-singular matrices. An interaction expansion has also been derived in (Assaad and Lang, 2007) for retarded interactions such as XZ β Sret = dτ dτ ′ Oa (τ )W ab (τ − τ ′ )Ob (τ ′ ), (56) ab 0 where O denotes a fermion bilinear. This formalism will be discussed in Sec. VII.C. 13 is estimated by Gτ1 τ1 ,...,τk τk (τ, τ ′ ) (corresponding to A(x) in Eq. (21)): B. Updates The series (51) and the corresponding one for (55) are of the type (30), and we can employ continuous-time sampling as described in Sec. II.C. We insert and remove interaction vertices on the imaginary time axis, corresponding to the terms U (n↑ (τ )−αsτ ↑ )(n↓ (τ )−αsτ ↓ ) (see Figure 5). Proposing a vertex insertion update with probability dτ /(2β) (for the imaginary time location and the orientation of the auxiliary spin sτ ) and a removal update with probability 1/(k + 1) we obtain R= βU Y det Dσk+1 . (k + 1) σ det Dσk (57) Note that for the interaction defined in Eq. (55) the prefactor 21 is compensated by the factor 2 in the ratio of proposal probabilities, which comes from the two possible values of sτ , so that the acceptance ratio is the same as in the straightforward approach. This update and its inverse are sufficient to be ergodic. In evaluating the determinant ratios the fast-update technique described in Sec. X.A should be used, since it allows to calculate the ratio R in O(k 2 ) operations, substantially faster than the naı̈ve evaluation of determinants with O(k 3 ) operations. b) a) 0 β 0 β 0 β c) 0 β Gσ (τ − τ ′ ) = hGτ1 τ1 ,...,τk τk (τ, τ ′ )iMC , Gτ1 τ1 ,...,τk τk (τ, τ ′ ) = − (59) hTτ dσ (τ )d†σ (τ ′ )n1σ n2σ · · · nkσ i0 . hn1σ n2σ · · · nkσ i0 (60) The h· · · iMC denotes a Monte Carlo average, while the h· · · i0 denotes all possible Wick’s contractions of one particular Monte Carlo configuration. The denominator is a determinant that cancels the Wick’s contraction of a partition function configuration p, and the numerator determinant consists of a matrix with an additional row [Gσ0 (τ − τ ′ ), Gσ0 (τ − τ1 ), Gσ0 (τ − τ2 ), . . . , Gσ0 (τ − τk )] and column (Gσ0 (τ − τ ′ ), Gσ0 (τ1 − τ ′ ), Gσ0 (τ2 − τ ′ ), . . . , Gσ0 (τk − τ ′ )]. The configuration Gτ1 τ1 ,...τk τk (τ, τ ′ ) in Eq. (60) depends on two independent arguments τ, τ ′ , while the observable average Eq. (58) is time-translation invariant. This symmetry of the effective action is restored only after the averaging in Eq. (59). It will be shown in Sec. X.C that it is best either to measure a quantity corresponding to ΣG or to perform a Fourier transform to Matsubara frequencies analytically, so that the Green’s function is calculated directly in the frequency domain. There is one particular observable estimate that can be obtained just from the properties of the random walk itself, without any additional calculation: the average value of the perturbation operator. One can see from a term-to-term comparison of the respective series that the average perturbation order hki is proportional to the inverse temperature and the average value of the interaction operator, d) FIG. 5 Local updates for the CT-INT algorithm. (a): starting configuration (b): Insertion of a vertex, (c): removal of a vertex, (d): shift of a vertex in imaginary time. hkiMC = hSU i. (61) Therefore, the expecation value of the interaction operator U n↑ n↓ is hkiMC /β. D. Generalization to clusters, multi-orbital problems and retarded interactions C. Measurements Monte Carlo averages are calculated using Eq. (21) and (22), where the distribution p of Eq. (21) is given by the coefficients of Eq. (51). In particular, the Green’s function Gσ (τ − τ ′ ) = − × D Z ∞ Z0 X (−U )k dτ1 . . . dτk Z k! k=0 E Tτ dσ (τ )d†σ (τ ′ )n1↑ (τ1 )n1↓ (τ1 ) · · · nk↓ (τk ) 0 (58) In the case of the Hubbard model on a cluster, the only difference to the single orbital case is that creation and annihilation operators acquire an additional site index. We can absorb all quadratic hopping terms in G 0 and perform the interaction expansion in X (ni↑ − αi↑ )(ni↓ − αi↓ ), (62) SU = U i where i runs over the sites of the cluster. The αiσ -terms are chosen as in the single site case; optionally with an auxiliary spin si at each site. 0 The Green’s functions Gij (τi − τj ) are site-dependent, but the spin up and spin down contributions still factor 14 into separate determinants: ∞ X Z = Z0 k=0 Z dτ1 . . . dτk IV. CONTINUOUS-TIME AUXILIARY FIELD ALGORITHM CT-AUX (−U )k Y det Dσk , (63) k! σ =±1 X s1 ···sk i1 ···ik 0 where (Dσk )ij = Gij,σ (τi − τj ) − δij αiσ . It follows immediately that there is no sign problem in the half-filled case, where the determinants of the up- and down matrices are identical. However, away from half filling a sign problem occurs in general, see e.g. Fig. 5 in Ref. (Gull et al., 2008a). For the updates a generalization of Eq. (57) should be used, where β is replaced by the factor βNc , with Nc the number of sites in the cluster. In the general case of multiple orbital problems intrinsic sign problems typically occur, and management even of the trivial sign problem becomes more involved. The I basic idea is to express the interaction Hloc [Eq. (2)] in action form as XZ Z Sloc = dτ dτ ′ I pqrs (d†p ds − αps )(d†q dr − αqr ) (64) A first continuous-time auxiliary field method for fermionic lattice models was developed by Rombouts et al. (1998, 1999), and applied to the nuclear Hamiltonian and small Hubbard lattices. We present here a different formulation (Gull et al., 2008a) that is also applicable to (cluster) impurity problems. This continuous-time auxiliary field (CT-AUX) algorithm is based on an interaction expansion combined with an auxiliary field decomposition of the interaction vertices. One may view CT-AUX as an “optimal” Hirsch-Fye algorithm, on a non-uniform time grid and with a varying number of “time slices” that are chosen automatically for given parameters. The approach allows the combination of numerical techniques developed for the Hirsch-Fye algorithm (see e.g. Sec. X.B.1) with the advantages of a continuous-time method. It was shown to be equivalent to the weak coupling algorithm in the case of the single-band Hubbard model (Mikelsons et al., 2009b). Currently the CT-AUX impurity solver is the method of choice for large cluster simulations. pqrs and then perform a multiple expansion in the interactions I pqrs . In multi-oribtal systems the number of terms proliferates; for N orbitals there are of order N 4 terms, although in practice some of them vanish by symmetry. Denoting the tuple (pqrs) at vertex i by ξi we have Nξ Z ∞ X X (−1)k Iξ1 . . . Iξk Z = hvξ1 · · · vξk i0 , dτ1 . . . dτk Z0 k! k=0 ξ1 ...ξk vξ ≡ (d†pξ dsξ − αps )(d†qξ drξ − αqr ). (65) If we insert random (nonzero) matrix elements at random times in [0, β), the prefactor of the acceptance probability ratios in Eq. (57), βU/(k + 1), is modified by a factor Nξ , becoming βINξ /(k + 1). Wick’s theorem of Eq.(65) yields a determinant similar 0 to (54). If the Green’s function matrix Gij for the different orbitals is diagonal in the orbital indices, the determinant factorizes into smaller-size determinants, each However, in general there is no reason for the determinant of D to have the same sign for all configurations. The choice of α-terms has an influence on the sign statistics, and they need to be adjusted for each problem such that the expansion is sign - free or at least has an average sign that is as large as possible. How this is best done is still an open question. An ansatz has been presented in Ref. (Gorelov, 2007). The basic principle is to treat the off-diagonal interaction terms with small but non-zero α, whereas the symmetrized form (55) is used for the density-density part. A. Partition function expansion We present the derivation for the case of a cluster impurity problem with Nc cluster sites. The generalization to multiorbital models with density density interactions is straightforward. The application to more general multiorbital models would involve techniques similar to Sakai et al. (2006a,b) and has not yet been attempted. Starting from the partition function Z = Tre−β(H0 +HU ) we add a non-zero constant K to HU : HU = U Nc  X i ni↑ ni↓ − ni↑ + ni↓ 2  − K , β (66) H0 = HAIM − HU + K/β, (67) such that  Z = Tr e −βH0 Tτ e R dτ  K β −U   PN c  n +n ni↑ ni↓ − i↑ 2 i↓ i . (68) Expanding the exponential in powers of HU and applying the auxiliary field decomposition (Rombouts et al., 1999)  Nc  1 ni↑ + ni↓ βU X = ni↑ ni↓ − 1− K i 2 2Nc X eγsi (ni↑ −ni↓ ) , i,si =±1 U βNc cosh(γ) = 1 + , 2K (69) (70) 15 0 τ1 τ2 β τ1 0 τ1 0 τ2 τ3 β we obtain ∞ X X Z k=0 s1 ,···sk =±1 β dτ1 ... 0 Zk ({si , τi , xi }) ≡ Tr Z β dτk τk−1  K 2βNc k Zk ({sk , τk , xk }), (71) 1 Y e−∆τi H0 esi γ(nxi ↑ −nxi ↓ ) , τ1 β FIG. 6 Pictorial representation of configurations {k; (sj , τj )} ∈ C which are sampled by the CT-AUX algorithm. Diagrams for orders 0, 1, 2, and 3. In this algorithm, an auxiliary spin sj (represented here by the red and blue vertices and the direction of the arrows) needs to be sampled in addition to the imaginary time location τj of a vertex. Z= 0 (72) i=k with ∆τi ≡ τi+1 − τi for i < k and ∆τk ≡ β − τk + τ1 . Equation (72) is very similar to the equations for the BSS (Blankenbecler et al., 1981) or HirschFye (Hirsch and Fye, 1986) algorithms (see also (Georges et al., 1996), appendix B1). Using the identity P P P † † † Trd† ,di {e− ij di Aij dj e− ij di Bij dj e− ij di Cij dj } = det i  1 + e−A e−B e−C and following the derivation in (Gull et al., 2008a), we obtain Y Zk ({si , τi , xi }) = det N−1 (73) σ ({si , τi , xi }), Z0 σ=↑,↓  {si }  {s } {τ ,x } Vσ i N−1 − G0σi i eVσ − 1 , (74) σ ({si , τi , xi }) ≡ e   {si } σ σ eVσ ≡ diag eγ(−1) s1 , . . . , eγ(−1) sk , (75) with the notations (−1)↑ ≡ 1, (−1)↓ ≡ −1 and {τ ,x } {τ ,x } (G0σi i )i,j = Gx0i xj ,σ (τi − τj ) for i 6= j, (G0σi i )i,i = Gx0i xi ,σ (0+ ) (we assume in this section that Gx0i xi ,σ (0+ ) > 0). As we handle a variable number of time slices at constantly shifting imaginary time locations, it is advantageous to formulate the algorithm in terms of a matrix Nσ , defined by Gσ = Nσ G0σ instead of G. With Eq. (74) we express the weight of any (auxiliary spin, time, site) configuration in terms of the bath Green’s function Gσ0 , the constant γ defined in Eq. (69), and the determinant of two matrices Nσ . The contribution of such a configuration to the whole partition function is given by Eq. (73). B. Updates In the CT-AUX-algorithm the partition function Eq. (71) consists of a sum over all expansion orders k 0 τ1 e ov m re ert s in 0 β τ3 τ2 τ2 β β FIG. 7 An insertion update and its corresponding removal update within the CT-AUX algorithm. up to infinity, another discrete sum over auxiliary fields s and sites x, and a k-dimensional time-ordered integral from zero to β, so we can employ the sampling scheme of chapter II.C. In addition to the imaginary time locations of the interaction vertices we also need to sample auxiliary spins sj associated with each vertex. Thus, the configuration space C (Eq. 20) is given by the set C ={{}, {(s1, τ1 , x1 )}, {(s1 , τ1 , x1 ), (s2 , τ2 , x2 )}, · · · , {(s1 , τ1 , x1 ), · · · , (sk , τk , xk )}, · · · }, (76) where the sj are auxiliary Ising spins that take values ±1, k is the expansion order, xj denotes cluster sites and the τj are continuous variables between 0 and β, which we assume to be time-ordered, i.e. τ1 < τ2 < · · · < τk . Note that this representation is different from the one proposed in (Rombouts et al., 1999), where the configuration space consists of a number Nmax of fixed “slots” at which interaction operators can be inserted into an operator chain (a “fixed length” representation). This leads to additional combinatorial factors in the acceptance probabilities. Although they are not sufficient for an ergodic sampling, we first consider spinflip updates at constant order which are fast to compute and very useful for reducing autocorrelation times: ((s1 , τ1 , x1 ), · · · , (sj , τj , xj ), · · · , (sk , τk , xk )) → ((s1 , τ1 , x1 ), · · · , (−sj , τj , xj ), · · · , (sk , τk , xk )), (77) the probability density ratios of the two configurations are computed from Eq. (73) as: Q ′ ′ ′ det N−1 p(x′ ) σ ({si , τi , xi }) . (78) = Qσ R= −1 p(x) σ det Nσ ({si , τi , xi }) Vertex insertion updates from configuration x = {si , τi , xi } to configuration y = {s′i , τi′ , x′i }, on the other hand, have to be balanced by a removal updates (Fig. 7). The proposal probability accounts for choosing a random time between 0 and β, a random site, and a random spin direction: prop Wxy = 1 dτ . 2Nc β (79) 16 The proposal probability of removing a spin, going from order k + 1 to order k consists of choosing one of k + 1 spins: prop Wyx 1 . = k+1 (80) Therefore we obtain, following Eq. (28): Rxy = K det N↑ (y) det N↓ (y) . k + 1 det N↑ (x) det N↓ (x) (81) The efficient numerical computation of these expressions is discussed in Sec. X.A and X.B. 2. Role of the parameter K – potential energy Similar to the weak-coupling expansion parameter α of Sec. III, the parameter K of Eq. (66) can be freely adjusted. The average perturbation order hkctaux i is related to K, the potential energy and filling by hkctaux iMC = K − βU hn↑ n↓ − (n↑ + n↓ )/2i, (86) and hence the perturbation order in the continuous-time auxiliary-field method grows linearly with K. C. Measurements 1. Measurement of the Green’s function The main observable of interest is the Green’s function Gpq,σ (τ, τ ′ ) for cluster sites p and q and spin σ. First, let us note that we are free to add two additional “noninteracting” spins s = s′ = 0 to Eq. (72) at any arbitrary time τ and τ ′ (we denote the corresponding matrices of size n + 2 with a tilde). ZGpq,σ (τ, τ ′ ) is then given by an expression similar to Eq. (72), with an insertion of dσ (τ ) and d†σ (τ ′ ) at the corresponding times. Using the same linear algebra as in Hirsch-Fye (Eq. (118) of Ref. (Georges et al., 1996)) we obtain Z β Z β 1 X  K k X Gpq,σ (τ, τ ′ ) = dτ1 . . . dτk Z 2βNc 0 τk−1 s =±1 k≥0 i 1≤i≤k i ,τi ,xi } × Zk ({si , τi , xi })G̃{s (τ, τ ′ ), pq,σ (82) {τ } {s ,τ ,x } i i i i . Since s = = Ñσ,pr ({si , τi , xi })G̃0,rq,σ with G̃pq,σ s′ = 0, a block calculation yields 0 i ,τi ,xi } G̃{s (τ, τ ′ ) = Gpq,σ (τ, τ ′ ) pq,σ + k X l,m=1 0 Gpx (τ, τl )Mlm Gx0m q,σ (τm , τ ′ ). l ,σ {si } Mlm = [(eVσ (83) − 1)Nσ ({si , τi , xi })]lm , , (84) and Gpq,σ (τ, τ ′ ) = hG̃pq,σ (τ, τ ′ )iMC . As in the CT-INT algorithm we may Fourier transform the above expression to obtain a measurement formula in frequency space: 0 G̃pq (iωn ) = Gpq (iωn ) (85) 0 0 X Gpl (iωn )Gmq (iωn ) eiωn τl Mlm e−iωn τm . − β lm By accumulating the Fourier coefficients directly, we avoid many of the discretization and related high frequency expansion problems (Sec. X.C). A closer analysis of Eq. (83) shows that it is possible and advantageous to measure S = MG 0 = ΣG directly, as will be discussed in Section X.C.1. V. HYBRIDIZATION EXPANSION SOLVERS CT-HYB A. The hybridization expansion representation A complementary approach to the CT-INT and CTAUX solvers described in chapters III and IV is the hybridization expansion algorithm (CT-HYB) developed by Werner, Millis, Troyer, and collaborators (Werner et al., 2006; Werner and Millis, 2006). It proceeds from Eq. (19) with Hb taken to be the hybridization term Hhyb and Ha = Hbath + Hloc . An advantage of this approach is that the average expansion order for a typical problem near the Mott transition is much smaller than in the interaction expansion methods and therefore lower temperatures are accessible (Gull et al., 2007). General interactions can easily be treated as long as the local Hilbert space is not too large. The first paper, (Werner et al., 2006), presented an algorithm and applications for the single impurity Anderson model. A generalization to multi-orbital models with complex interactions and the Kondo model soon followed (Werner and Millis, 2006), and this formalism was later extended by Haule (2007) who introduced the ideas of basis truncation and sector statistics, and implemented the algorithm for models with off-diagonal hybridization functions. P † Since Hhyb = pj (Vpj c†p dj + Vpj∗ d†j cp ) = H̃hyb + H̃hyb contains two terms which create and annihilate electrons on the impurity, respectively, only even powers of the expansion and contributions with equal numbers of H̃hyb † and H̃hyb can yield a non-zero trace. The partition function therefore becomes Z= ∞ Z X k=0 β dτ1 . . . 0 Z β τk−1 dτk Z 0 β dτ1′ . . . Z β τk′ −1 dτk′ (87) i h † † (τk′ ) . . . H̃hyb (τ1 )H̃hyb (τ1′ ) . × Tr Tτ e−βHa H̃hyb (τk )H̃hyb 17 † Inserting the H̃hyb and H̃hyb operators explicitly yields Z= ∞ Z β X k=0 X dτ1 . . . 0 X j1 ,···jk p1 ,···pk ′ ′ j ′ ,···j ′ p1 ,···pk 1 k Z β dτk τk−1 j′ ∗ Vpj11 Vp′1 1 β Z 0 dτ1′ . . . β Z ′ τk−1 dτk′ j′ ∗ · · · Vpjkk Vp′k k Separating the bath and impurity operators we obtain Z β Z β Z β ∞ Z β X Z= dτ1 . . . dτk dτ1′ . . . dτk′ (89) 0 τk−1 X j1′ ∗ p′1 j1 ,···jk p1 ,···pk ′ ′ j ′ ,···j ′ p1 ,···pk 1 k Vpj11 V jk′ ∗ p′k · · · Vpjkk V h i × Trd Tτ e−βHloc djk (τk )d†j ′ (τk′ ) · · · dj1 (τ1 )d†j ′ (τ1′ ) 1 k   × Trc Tτ e−βHbath c†pk (τk )cpk′ (τk′ ) · · · c†p1 (τ1 )cp′1 (τ1′ ) . We can now integrate out the bath operators cp (τ ), since they are non-interacting and the time-evolution (given by Ha ) no longer couples the impurity and the bath. Defining the bath partition function YY Zbath = Tre−βHbath = (90) (1 + e−βεp ), and the anti-periodic (Eq. (10)), ∆lm (τ ) = p hybridization function ∆  X Vpl∗ Vpm −e−εp (τ −β) , 0 < τ < β × , ε β e−εp τ , −β < τ < 0 e p +1 p (91) we obtain the determinant h X 1 Trc Tτ e−βHbath Zbath p ,···p 1 k X p′1 ,···p′k j′ ∗ j′ ∗ Vpj11 Vp′1 · · · Vpjkk Vp′k 1 i ×c†pk (τk )cpk′ (τk′ ) · · · c†p1 (τ1 )cp′1 (τ1′ ) = det ∆, k (92) for an arbitrary product of bath operators. Here, ∆ is a k × k matrix with elements ∆lm = ∆jl jm (τl − τm ). In practice, and in analogy to the algorithms in previous sections, it will be more convenient to handle the inverse of this matrix ∆, which we denote by M = ∆−1 (see Sec. X.A). The partition function expansion for the hybridization algorithm now reads (for time-ordered configurations) X ZZZ X X Z = Zbath (93) dτ1 · · · dτk′ k s τ3↑ e τ2↑ d † d e τ3↑ d d † d d† d d† d d† e τ3↓ s τ1↓ e τ1↓ s τ2↓ e τ2↓ s τ3↓ β FIG. 8 Segment representation of term in hybridization expansion of single orbital Anderson model. Upper line: spin up orbital, lower line, spin down orbital: heavy line, orbital occupied; light line, orbital empty. For each orbital, length of black line (occupied orbitals) determines the chemical potential contribution to the weight factor (95). Shaded areas: regions where both up and down orbitals are filled, so the impurity is doubly occupied. The length of the shaded area enters into an overall weighting factor for the potential energy (Hubbard U ). ′ τk−1 0 σ d d † 0 1 k=0 s τ2↑ e τ1↑ (88) h × Tr Tτ e−βHa djk (τk )c†pk (τk )cpk′ (τk′ )d†j ′ (τk′ ) k i † ′ ′ † · · · dj1 (τ1 )cp1 (τ1 )cp′1 (τ1 )dj ′ (τ1 ) . X s τ1↑ Hloc j1 ,···jk j1′ ,···jk′ If the coupling to the bath is diagonal in the “flavor” (spin, site, orbital, . . . ) indices j, then ∆ is a blockdiagonal matrix and Eq. (93) simplifies to Z β ∞ Z β YX j Z = Zbath dτ1j . . . (94) dτ ′ kj j kj =0 0 τ ′ jk j −1 i × Trd Tτ e−βHloc dj (τkjj )d†j (τk′jj ) . . . dj (τ1j )d†j (τ1′j ) det ∆j . h B. Density - density interactions We first consider (multi-orbital) models with densitydensity interactions. In this case, the local Hamiltonian Hloc commutes with the occupation number operator of each orbital. We may therefore represent the time evolution of the impurity by collections of “segments” which represent time intervals in which an electron of a given flavor resides on the impurity. An example of such a segment configuration for a single orbital model (two spin flavors) is shown in Fig. 8. Since the local Hamiltonian is diagonal in the occupation number basis the contribution of the trace factor can be computed for each segment configuration. For a model with n orbitals and a total length Lj of segments in orbital j and a total overlap Oij between segments of flavor i and j one obtains (s is a sign depending on the operator sequence) wloc (x) = Trd [. . .] = seµ Pn j Lj − e Pn i<j (Uij Oij ) , (95) except in the trivial case where there are no operators for certain flavors. In the latter case, several segment configurations, involving “full” and “empty” lines, contribute to the trace. C. Formulation for general interactions h i If Hloc is not diagonal in the occupation number ba× Trd Tτ e−βHloc djk (τk )d†j ′ (τk′ ) · · · dj1 (τ1 )d†j ′ (τ1′ ) det ∆. sis defined by the d† , a separation of flavors, as in α 1 k 18 Hloc 0 β FIG. 9 A typical term in the expansion (93): three “flavors” (red, blue, yellow) of fermionic creation and annihilation operators (denoted by filled and empty triangles, squares, and circles) are placed at times between 0 and β. In the general case, orbital occupation is not conserved by local Hamiltonian so two operators of the same type may follow each other. the segment formalism, is no longer possible (see illustration in Fig.   = Q 9) and the calculation of wloc (x) Trd Tτ e−βHloc α dα (τkαα )d†α (τk′αα ) . . . dσ (τ1α )d†α (τ1′α ) becomes more involved. One strategy, proposed in (Werner and Millis, 2006) is to represent the operators dα and d†α as matrices in the eigenbasis of Hloc because in this representation the time evolution operators e−Hloc τ become diagonal. The evaluation of the trace factor thus involves the multiplication of matrices whose size is equal to the size of the Hilbert space of Hloc . Since the dimension of the Hilbert space grows exponentially with the number of flavors, the calculation of the trace factor becomes the computational bottleneck of the simulation, and the matrix formalism is therefore restricted to a relatively small number of flavors (. 10). The technical part of evaluating these traces is described in detail in Sec. X.F. Haule (2007) observed that conserved quantum numbers may be exploited to facilitate the calculation of the trace. If the eigenstates of Hloc are ordered according to conserved quantum numbers, the evaluation of the trace is reduced to block matrix multiplications (see Sec. X.F) of the form h X (96) wloc (x) = Trm · · · contr.m · · ·(O)m′′ ,m′ (e−(τ ′ −τ )Hloc i )m′ (O)m′ ,m (e−τ Hloc )m , where O is either a creation or annihilation operator, m denotes the index of the matrix block, and the sum runs over those sectors which are compatible with the operator sequence. With this technique, 3-orbital models or 4-site clusters can be simulated efficiently (Chan et al., 2009; Gull et al., 2008b; Haule and Kotliar, 2007b; Park et al., 2008b; Werner et al., 2008). However, since the matrix blocks are dense and the largest blocks still grow exponentially with system size, the simulation of 5-orbital models becomes already quite expensive and the simulation of 7-orbital models with 5, 6 or 7 electrons is only feasible with current computer resources if the simulation is restricted to a few valence states and, within this subspace, the maximum size of the blocks is truncated (see X.F.2). Simulations based on such a truncated version of the matrix formalism were used in (Marianetti et al., 2008; Shim et al., 2007). The Krylov method described in the next section avoids truncations to a large extent. D. Krylov implementation An alternative strategy to evaluate the trace in Eq. (93) was proposed in (Läuchli and Werner, 2009) based on the observation that in the occupation num(†) ber basis both the di -operator matrices and Hloc are (†) typically very sparse, so the di -operators can easily be applied to any given state while efficient Krylov-space methods can be used to evaluate the imaginary time evolution. This implementation involves only matrix-vector multiplications with sparse operators d(†) and Hloc , and is thus doable even for systems for which the multiplication of dense matrix blocks becomes prohibitively expensive. Furthermore, no explicit truncation of states of the local Hamiltonian is required, so that all excited states remain accessible at intermediate times τ in the trace. The outer trace may be approximated by a sum over the lowest energy states. If this is done it is important to measure the various local observables at τ = β/2 in order to be least affected by the truncation of the trace at τ = 0 (and equivalently at τ = β). The complexity of the Krylov algorithm is O(Ndim × Ntr × Nhyb × Niter ), where Ndim is the size of the impurity Hilbert space, Ntr the number of states kept in the outer trace, Nhyb the number of hybridization events, and Niter the number of Krylov iterations used for the calculation of the time evolution from one operator to the next. In Ref. (Hochbruck and Lubich, 1997) it has been shown rigorously that these Krylov space algorithms converge rapidly as a function of Niter , typically reaching convergence for very small iteration numbers p ≪ Ndim , although the number of iterations depends on the time interval τ . In the worst case where all states in the trace are retained (Ntr = Ndim ) and the complexity scales as 2 Ndim , where as in the best case Ntr = O(1) and the complexity is linear in the dimension of the Hilbert space. In comparison, the complexity of the approach described in Sec. V.C is cubic in Ndim. Läuchli and Werner (2009) showed that the Krylov approach with outer trace truncated to the lowest energy states becomes favorable for models with more than 4 orbitals (or 4 sites). The systematic error resulting from the truncation of the outer trace becomes negligible at temperatures below a few percent of the bandwidth. The Krylov-based hybridization expansion thus provides a method for the systematic investigation of larger problems such as the dynamical mean field theory of transition metal and actinide compounds. 19 lmax β e ov m re ert s in 0 a) d) b) e) c) 0 β FIG. 10 An insertion update and its corresponding removal update within the hybridization algorithm. E. Updates In order to sample Eq. (93) we perform a Monte Carlo simulation as described in Sec. II.C. We explain the sampling procedure for the formulation with density-density interactions. The two basic updates required for ergodicity are the insertion and the removal of a segment. Starting from a configuration of segments xk = {(τ1s , τ1e ), (τ2s , τ2e ), · · · , (τks , τke )} we attempt to insert a new segment sk+1 starting at τ s to obtain a configuration yk+1 . This move is rejected if τ s lies on one of the existing segments, since we cannot create two identical fermions at the same site. Otherwise, we choose a ′ random time uniformly in the interval [τ s , τ s ) of length ′ lmax (Fig. 10), where τ s is the start of the next segment in xk . For the reverse move, the proposal probability is given by the probability of selecting that given segment for removal. Therefore the proposal probabilities are prop Wyx 2 dτ , βlmax 1 = , k+1 prop Wxy = (97) (98) and the acceptance ratio becomes Rxy = prop py Wyx βlmax wloc (y) det ∆(y) . prop = px Wxy k + 1 wloc (x) det ∆(x) f) 0 (99) An important second update, equivalent to the insertion of a segment, is the insertion of an “antisegment”: the insertion of a annihilator-creator pair istead of a creator-annihilator pair. The formulae for the acceptance ratio are the same as Eq. (99). Besides smaller autocorrelation times these updates cause the two zero-order contributions “full occupation” and “no segment” to be treated on equal footing. Further updates, like the shift of a segment or the shift of one or both end points do not change the order of the expansion, but help to reduce autocorrelation times. The shift moves are “self - balancing” (proposal probabilities β 0 β FIG. 11 Updates of the hybridization algorithm as described in the text: (a) original configuration; (b) removal of a segment; (c) shift of an end point of a segment; (d) insertion of an antisegment; (e) removal of an antisegment; (f) removal of another antisegment such that the remaining segment ”wraps” around β. for shift moves and their inverse are the same), so Rxy = wloc (y) det ∆(y) . wloc (x) det ∆(x) (100) Global updates (Sec. X.G), e.g. the exchange of all segments of two orbitals, may be required to ensure ergodicity, i.e. that the random walk does not get trapped in one part of phase space. Such updates require the configuration to be recomputed from scratch, and are in general of order O(k 3 ). They are essential in calculations of magnetic phase boundaries (Chan et al., 2009; Poteryaev et al., 2008; Kuneš et al., 2009). F. Measurements The CT-HYB algorithm generates configurations with the weight that they contribute to the partition function Z. To obtain expectation values of an observable we can either simulate the series of that observable (which, for the Green’s function, corresponds to the “worm” algorithm described in Sec. X.D), or estimate the observable according to Eq. (21). The single most important observable for quantum Monte Carlo impurity solvers is the finite temperature imaginary time Green’s function Glm (τ ) = −hTτ dl (τ )d†m (0)i. The series for this observable is h X Z Glm (τl , τm ) = −Zbath dτ1 ...dτk′ det ∆k Trd Tτ e−βHloc j ,···j k, j1′ ,···jk ′ 1 k i . dl (τl )d†m (τm )djk (τk )d†j ′ (τk′ ) . . . dj1 (τ1 )d†j ′ (τ1′ ) 1 k (101) This shows that Green’s function configurations at expansion order k are partition function configurations at expansion order k with additional dl and d†m operators or, alternatively, partition function operators at order k + 1 with no hybridization line connecting to dl (τl ) and d†m (τm ). In practice we obtain an estimator of Glm (τl , τm ) by identifying two operators dl (τl ), d†m (τm ) 20 Hloc d† d d† d d† d τ1s τ1e τ2s τ2e τ3s τ3e 0 β τ = τ3s − τ1e commute with the local Hamiltonian): X Uij Dij , Epot = (106) i>j FIG. 12 Hybridization algorithm: Green’s function configuration. A typical configuration for a Green’s function, created by taking the partition function configuration of order k = 3 and identifying the creation operator at τ3s and the annihilation operator at τ1e as the Green’s function operators to obtain a Green’s function configuration corresponding to a partition function configuration at one order lower. Red: creation and blue: annihilation operators of the partition function. Light purple: Green’s function operators. in a partition function configuration that are an imaginary time distance τ = τl − τm apart, and removing the hybridization line connecting them (see Fig. 12). The insertion of local operators into a partition function configuration, as it is done in the interaction expansion formalism, is not ergodic in the hybridization expansion. l ,τm of The size (k−1)×(k−1) hybridization matrix ∆τk−1 † all hybridization operators except for dl (τl ) and dm (τm ) corresponds to ∆ with the column/row sl and sm corresponding to the operators dl and d†m removed, and the weight of a Green’s function configuration Glm (τl , τm ) is pGlm Z l ,τm det ∆τk−1 = . det ∆ (107) The occupation nj of the jth flavor is estimated by the length Lj (Eq. 95) of all the segments: nj = hLj /βi. Double occupancies and interaction energies are obtained from the overlap Oij of segments as Dij = hOij i/β. The system has a total magnetization of Sz = (hLtot ↑ − i)/β. Overlaps and lengths of segments are comLtot ↓ puted at every Monte Carlo step, and thus these observables may be obtained with high accuracy at essentially no additional cost. Finally the average expansion order of the algorithm is an estimator for the kinetic energy (Haule, 2007), similar to Epot in the case of the CT-INT and CT-AUX algorithms: Ekin = 1 hkiMC . β (108) VI. INFINITE-U METHOD CT-J A. Overview (102) An expansion by minors or the inverse matrix formulas of Sec. X.A describe how such a determinant ratio is computed: pGlm = (∆)−1 sm sl = M sm sl . Z Dij = hni nj iMC . (103) We can bin this estimate into fine bins to obtain the Green’s function estimator + * k 1 X Mji δ̃(τ, τm − τl )δt(i)l δt(j)m , (104) Glm (τ ) = β ij MC  δ(τ − τ ′ ), τ ′ > 0 ′ δ̃(τ, τ ) = (105) −δ(τ − τ ′ − β), τ ′ < 0, with t(i) denoting the orbital index of the operator at row / column i. For a configuration at expansion order k we obtain a total of k 2 estimates for the Green’s function – or one for every creation-annihilation operator pair or every single element of the (k×k)-matrix M = ∆−1 . The measurement in Eq. (105) may suffer from bad statistics if very few hybridization lines are present (k is small) in an orbital. In this case, the Green function measurement should be based on the insertion of operators. In the segment representation, very efficient estimators exist for the density, the double occupancy and the potential energy (and similarly for all observables that In many cases the physics of interest is captured by low energy effective theories in which some (often high energy) degrees of freedom have been integrated out, leaving a model described by a restricted Hilbert space. A standard example is the “Schrieffer-Wolf” transformation which obtains the Kondo Hamiltonian (describing a single S = 1/2 spin exchange-coupled to a conduction band) as the low energy theory of the single-impurity Anderson model in the regime where the charge fluctuations are suppressed and the impurity is occupied by only one electron. The projection is conceptually advantageous, because it allows one to focus on the important degrees of freedom. There is also a computational advantage: while the CT-HYB method accomplishes the projection (because the simulation produces a large weight for the relevant states and a small weight for the states which are projected out), transitions between the states in the low energy manifold require excursions to states with very small weight, leading to large auto-correlation times. The direct study of a projected Hamiltonian avoids this problem. Otsuki and collaborators have presented a CTQMC method for dealing with projected Hamiltonians (Otsuki et al., 2007). Their papers focus on a particular class of “Coqblin-Schrieffer” (CS) or generalized Kondo models arising in the context of the physics of heavy fermion compounds. We follow their presentation here.be applicable to a much wider range of downfolded models. 21 Coqblin-Schrieffer models arise when an impurity spends most of its time in a state of definite charge, with occasional virtual fluctuations into different charge states. An example is the Anderson model, Eq. (12), in the large U , weak V limit where, if the level energy is correctly tuned, at almost all times the impurity is occupied by one electron which may be of spin up or down. Fluctuations into a state with density n = 0 or n = 2 followed by a return to a state n = 1 allow the impurity to exchange spin with the bath. More generally, the dominant charge state will have an N -fold degeneracy including spin and orbital degrees of freedom and virtual transitions will lead to a variety of exchange processes which may be encoded in a Hamiltonian of the CoqblinSchrieffer form (Coqblin and Schrieffer, 1969) X αk αk × Jα1 α′1 · · · Jαk α′k E YD ×s Tτ c†α (τ1′ )cα (τ1′′ ) · · · c†α (τk′ α )cα (τk′′α ) α c i × Trspin Tτ e−βHspin Xα1 α′1 (τ1 ) · · · Xαk α′k (τk ) . h (114) Eα Xαα , (111) (112) (116) α HJ = − α1 α1 (110) kb Hspin = k=0 εk c†kb ckb , (109) with impurity states labeled by a spin- or orbital quantum number α and a bath described by an energy quantum number k and a spin/orbital quantum number b: X The partition function Z divided by the conduction electron contribution Zbath = Trc e−βHbath is Z β Z β ∞ X X X Z ··· = (−1)k dτ1 · · · dτk Zbath 0 τk−1 ′ ′ Here, the conduction electron operators are grouped by component index α (a resultant sign in the permutation † is represented by s), kα is the P number of operators cα cα for each component α, and α kα = k. We also used the −1 notation h· · · ic = Zbath Trc [e−βHbath . . .]. Wick’s theorem for the conduction electrons implies Z β ∞ Z β X X X Z wk , (115) = dτ1 · · · dτk ··· Zb τk−1 k=0 0 αk α′k α1 α′1 Y (kα ) wk = (−1)k Jα1 α′1 · · · Jαk α′k · s det Dα HCS = Hbath + Hspin + HJ Hbath = B. Partition function expansion X ′ ′ kk ,bb Jαα Xαα′ ckb c†k′ b′ . ′ αα′ ,bb′ kk′ (k ) Here Xαα′ = |αihα′ | and without loss of generality we have chosen a basis in which the impurity (spin) Hamiltonian is diagonal. The exchange parameters J are typically of order V 2 /U and in most treatments the kdependence is neglected. Furthermore, in the applications presented to date the spin-orbit quantum numbers of the bath electrons are those of the impurity states and are conserved so that HJ → HJCS where HJCS = − α h i × Trspin Tτ e−βHspin Xα1 α′1 (τ1 ) · · · Xαk α′k (τk ) . X Jαα′ Xαα′ cα c†α′ , (113) αα′ √ P and cα = 1/ N k ckα . The consequences of removing this approximation are an important open question. The formalism of Otsuki et al. follows Eq. (19) with HJ playing the role of the expansion term Hb . While formally this is a perturbative expansion in an interaction parameter, it is in a practical sense closely related to the hybridization expansion: each vertex changes the state of the impurity, just as does the V -term in CT-HYB; the difference is that at each event, one electron and one hole is created. In what follows we summarize the main features of the CT-J algorithm, following the presentation by Otsuki et al. (2007). (k ) The kα × kα matrix Dα α is defined by (Dα α )ij = Gα0 (τi′′ − τj′ ) with Gα0 (τ ) = −hTτ cα (τ )c†α (0)ic and wk is the weight of a Monte Carlo configuration of order k. This configuration can be represented in terms of the numbers {τi } = (τ1 , · · · , τk ) and {αi } = (α1 , · · · , αk ), or, as illustrated in Fig. 13, by a decomposition of the imaginary time interval into k segments [τi , τi+1 ) (modulo periodic boundary condition) with flavor αi . These variables define the sequence of X-operators Xαk αk−1 (τk ) · · · Xαi αi−1 (τi ) · · · Xα1 αk (τ1 ), (117) and a corresponding sequence of c-operators: (−1)k+1 c†αk (τ1 )cαk (τk ) . . . c†αi (τi+1 )cαi (τi ) . . . . . . c†α1 (τ2 )cα1 (τ1 ). (118) An expression equivalent to Eq. 116 was presented in the landmark 1970 Anderson-Yuval study of the Kondo model (Yuval and Anderson, 1970), but at that time could not be used as a starting point for numerical calculations. C. Updates Updates which change the order k are required for ergodicity, and updates which shift one of the operators in- 22 eq. (121) is reduced to (+) pk+1 det Dα = −Jαα . pk det Dα (122) (+) FIG. 13 Illustration of an insertion of a segment. The diagrams represent the configurations of {τi } and {αi }. crease sampling efficiency. In this section, we discuss updates which change the perturbation order by ±1. Note that if some coupling constants are 0, the straightforward sampling may not be ergodic. For example, when the interaction lacks diagonal elements in the N = 2 model, the perturbation order must be changed by ±2. We refer the reader to (Otsuki et al., 2007) for a discussion of updates which insert or remove several operators. Let us consider the process of adding τ and α, which are randomly chosen in the range [0, β) and from the N components, respectively. If τ satisfies τn+1 > τ > τn , {τi } and {αi } change into (τ1 , · · · , τn , τ, τn+1 , · · · , τk ) and (α1 , · · · , αn , α, αn+1 , · · · , αk ), respectively. In other words, one of the X-operators is replaced by Xαn+1 αn (τn+1 ) → Xαn+1 α (τn+1 )Xααn (τ ), (119) which corresponds to the change illustrated in Fig. 13: a segment α is inserted between αn and αn+1 with shortening of the segment αn . In the corresponding removal process, one removes a randomly chosen segment. Following the discussion in Sec. II.C and taking into account the proposal probabilities dτ /N β and 1/(k + 1) for insertion and removal (N is the number of local states) the ratio R of Eq. (29) becomes R= pk+1 N β , pk k + 1 (120) with pk = wk dτ1 . . . dτk as in Eq. (35), where for k 6= 0, the ratio pk+1 /pk is given by Here Dα is a matrix in which c†α (τ )cα (τ +0) is added to (+) the original one. The equal-time Green function in Dα 0 should be Gα (+0) to keep the probability positive (for J > 0). For J < 0 see the appendix in (Hoshino et al., 2010). In the case k = 0 all states contribute to the trace, and therefore p1 /p0 is given by e−βEα p1 = −Jαα P −βE ′ gα (+0). α p0 α′ e The ratios of the weights in Eq. (121)–(123) change their signs depending on the signs of the coupling constants. It was found in (Otsuki et al., 2007) that the probability remains positive in the case of antiferromagnetic couplings, i.e., Jαα′ > 0. This is consistent with the fact that the CS model with antiferromagnetic couplings is derived from the Anderson model, where the minus sign problem does not appear. Staggered susceptibilities and other two-particle correlation functions are discussed in (Otsuki et al., 2009b). D. The Kondo model Perhaps the most important projected model is the spin S = 1/2 Kondo model which is typically written as X H= εk c†kσ ckσ + JS · ~σc , (124) kσ where ~σc = ψc† ~σ ψc is the Pauli matrix for conduction electrons (ψc† = (c†↑ , c†↓ )). While it is possible to simulate this model directly using CT-HYB (Werner and Millis, 2006), it may be more convenient for some applications to re-express it in Coqblin-Schrieffer form by introducing pseudo-Fermion operators f † , f to represent the different states of the local moment: S = 21 ψf† ~σ ψf . Rearranging gives X X X H= εk c†kσ ckσ + v c†σ cσ + J fσ† fσ′ c†σ′ cσ , σ kσ Jα α Jααn pk+1 = n+1 e pk Jαn+1 αn (+) −l(Eα −Eαn ) det Dα det Dα det D̃αn . det Dαn (121) Here l = τn+1 − τ is the length of the new segment. (+) Dα is the matrix with c†α (τn+1 )cα (τ ) added to Dα , and D̃αn is the matrix with one of the operators shifted in time according to c†αn (τn+1 ) → c†αn (τ ). The ratio of determinants can be evaluated in O(k) using fast-update formulas (see Sec. X.A). If α = αn in Fig. 13, the change is just an addition of a diagonal element Xαα (τ ), so that (123) σ,σ′ (125) which is of the Coqblin-Schrieffer form Eq. (113) with Jσσ′ = J and with an additional potential scattering given by v = −J/2. Carrying out the CT-J expansion requires knowledge of the c-electron Green function G̃(z) in the presence of the potential scattering v. G̃(z) may be expressed in terms of the bare (v = 0) Green function G0 (z) as G̃ = G0 . 1 − vG 0 (126) 23 In the simulation of the CS model, G 0 (z) is replaced by G̃(z) and the calculation yields the impurity t-matrix tJ (z), computed with respect to G̃(z). To obtain the t-matrix t(z) of the Kondo model, Eq. (124), the contribution of the potential scattering should be subtracted from G̃(z). The full Green’s function G(z) can be expressed as G = G̃ + G̃tJ G̃ = G 0 + G 0 tG 0 . (127) Solving Eq. (127) for t(z) gives tJ v + . 1 − vG 0 (1 − vG 0 )2 t= (128) VII. PHONONS AND RETARDED INTERACTIONS A. Models In this section we present the application of CT-QMC techniques to models of electrons coupled to harmonic oscillators or, equivalently, to models of electrons subject to time-dependent (retarded) interactions. Such models arise in the study of electron-phonon coupling and if dynamical screening is important (Werner and Millis, 2010). In Hamiltonian form the quantum impurity model HQI is supplemented by a boson Hamiltonian HB and an electron-boson coupling Hel-B so that HQI → HQI +HB + Hel-B with X X HB + Hel-B = gνa Oa (b†ν + bλ ) + ων b†ν bν . (129) νa ν b†ν Here is the creation operator for a boson mode labeled by ν, Oa denotes a bilinear fermion operator, and ων , gνa are the boson frequency and electron-boson coupling constant respectively. In the widely studied “HolsteinHubbard” model, for example, there is just one boson mode and the operator O is the on-site electron density. An alternative representation in terms of an action may be obtained by integrating out the bosons, leading to the contribution XZ β Sret = dτ dτ ′ Oa (τ )W ab (τ − τ ′ )Ob (τ ′ ), (130) ab 0 with ab W (τ ) = Z ∞ 0 (W ab )′′ (ω) = −π arguments shows that these interactions may be represented in Hamiltonian form by adding a coupling to bosons, as defined in Eq. (129). Solving HQI + HB + Hel-B requires keeping track of the bosonic sector of the Hilbert space, which in principle involves an infinite number of additional states. Previous approaches to the problem have involved either treating the bosons semiclassically (Blawid et al., 2003; Deppeler and Millis, 2002) or truncating the boson Hilbert space, retaining only a finite number of boson states (Capone et al., 2004; Koller et al., 2004a,b; Sangiovanni et al., 2006, 2005). The semiclassical approach cannot account for quantal phonon effects such as electronic mass renormalization or superconductivity, while treating the boson Hilbert space directly adds very considerably to the computational overhead and therefore limits what can be done. Two approaches have been used in the CT-QMC context. One (Werner and Millis, 2007b) is based on a canonical transformation applied to the CT-HYB method and is (at least in its present form) restricted to models in which the operators O to which the phonons couple commute with the local Hamiltonian. For models (such as the single-site dynamical mean field theory of the Holstein-Hubbard model or of the dynamically screened Hubbard U ) which fulfill these conditions an electronboson coupling can be added at essentially no additional computation cost. The other method (Assaad and Lang, 2007) is a generalization of CT-INT to time-dependent interactions and can treat more general models, although at a substantially higher computational cost. dω ′ cosh[(τ − β/2)ω ′ ] (W ab )′′ (ω ′ ) , π sinh(βω ′ /2) X ν (131)   gνa gνb δ(ω − ων ) − δ(ω + ων ) . (132) Conversely, models of electrons subject to timedependent (retarded) interactions are defined by an action involving a term such as Sret and reversing the above B. CT-HYB In models where the oscillator degree of freedom couples to a conserved quantity of the local Hamiltonian, the phonons can be decoupled from the electrons by a canonical transformation of the sort originally introduced by Lang and Firsov (1962). By using the transformed variables to evaluate the trace over the phonon states, the hybridization expansion can be performed with negligible extra computational overhead (Werner and Millis, 2007b). We present the idea in the context of the single-site dynamical mean field description of the Holstein-Hubbard model, for which the local Hamiltonian may be written as Hloc = −µ(n↑ + n↓ ) + U n↑ n↓ √  ω0 X 2 + P 2 . (133) + 2λ(n↑ + n↓ − 1)X + 2 Here the boson coordinate X and momentum P satisfying [P, X] = i are related to the familiar boson cre√ ation and annihilation operators by X = (b† + b)/ 2 √ √ and P = (b† − b)/i 2, and the 2 in the coupling term 24 and the notation of the coupling constant as λ are conventional. Following (Lang and Firsov, 1962), the boson and fermion√operators may be decoupled by shifting X by X0 = ( 2λ/ω0 )(n↑ + n↓ − 1). The shift is accomplished by the unitary transformation eiP X0 so that the Hamiltonian H̃loc = eiP X0 Hloc e−iP X0 becomes ω0 2 H̃loc = −µ̃(ñ↑ + ñ↓ ) + Ũ ñ↑ ñ↓ + (X + P 2 ). (134) 2 H̃loc is of the Hubbard form but with modified chemical potential µ̃ and interaction strength Ũ , where µ̃ = µ − λ2 /ω0 , Ũ = U − 2λ2 /ω0 . spin 0 spin 0.07 λ −b) † dσ , † d˜σ = eiP X0 dσ e−iP X0 = e− ω0 (b −b) dσ , (137) (138) and this factor propagates into the hybridization. After the transformation, the electron and boson sectors are decoupled and the expectation value h· · · ib becomes the product of a term involving electron operators which is analogous to that computed for the Hubbard model without phonons, and a phonon term which is the expectation value of a product of exponentials of boson operators. The total weight of a configuration w({Oi (τi )}) = wb ({Oi (τi )})w̃Hubbard ({Oi (τi )}). (139) Here, w̃Hubbard is the weight of a corresponding configuration in the pure Hubbard impurity model (with parameters modified according to Eqs. (135) and (136) while the phonon contribution is E D wb ({Oi (τi )}) = es2n A(τ2n ) es2n−1 A(τ2n−1 ) · · · es1 A(τ1 ) 0.05 p(k) † λ/t=0 λ/t=0.2 λ/t=0.4 λ/t=0.6 U/t=6 0.06 The impurity electron creation and annihilation operators are transformed according to λ β FIG. 14 Illustration of an order n = 4 diagram for the Holstein-Hubbard model. Empty (full) circles and squares represent V † (V ) hybridization events. Dashed lines indicate interactions K(τ ) connecting all pairs of hybridization events. Adapted from (Werner and Millis, 2010). (135) (136) d˜†σ = eiP X0 d†σ e−iP X0 = e ω0 (b ∼ µ ∼ U Κ 0.04 U/t=4 0.03 0.02 0.01 0 0 20 40 60 80 100 120 140 perturbation order k FIG. 15 Distribution of perturbation orders in converged single-site DMFT solutions of the half-filled Holstein-Hubbard model with a semi-circular density of states with bandwidth 4t, phonon frequency ω0 = 0.2t, inverse temperature βt = 400 and values of electron-electron (U ) and electron-phonon interaction strength (λ) indicated. Both in the insulating (U/t = 6) and metallic (U/t = 4) phases, the distribution shifts little as λ is increased except near the phase boundary to the bipolaronic phase (λ/t = 0.6, U/t = 4). b (140) with 0 ≤ τ1 < τ2 < . . . < τ2n < β, si = 1 (−1) if the nth operator is a creation (annihilation) operator and A(τ ) = ωλ0 (eω0 τ b† − e−ω0 τ b). The expectation value is to be taken in the thermal state of free bosons. The standard disentangling of operators leads to   λ2 /ω02 n eβω0 + 1 wb ({Oi (τi )}) = exp − βω0 −1 e  X  ω0 (β−(τi −τj )) ω0 (τi −τj ) , + +e si sj e  2n≥i>j≥1 (141) The phonon contribution can be interpreted as an interaction K(τ − τ ′ ) between all pairs of operators (see Fig. (14) and (Werner and Millis, 2010)) of the form K(τ ) = − λ2 cosh[ω0 (τ − β/2)] − cosh[ω0 β/2] , ω02 sinh[ω0 β/2] (142) which is the twice integrated retarded interaction (Eq. 131) produced by the phonon coupling. The inclusion of phonons (or more generally any operator which commutes with Hloc ) is thus possible without any truncation and with negligible extra computational cost. Figure 15 presents statistics on the perturbation order for a DMFT simulation of the Holstein-Hubbard model with semicircular density of states with bandwidth 4t and phonon frequency ω0 = 0.2t. The average perturbation order is seen to be little affected by the strength of the phonon coupling. Additional results on the HolsteinHubbard model may be found in (Werner and Millis, 2007b). A closely related method has also been applied to study the consequences of dynamical screening of the Hubbard interaction by other degrees of freedom in the solid. Screening leads to a retarded interaction of the form of Eq. (130) with Oa the on-site density and W ′′ determined by the dynamical charge susceptibility of the other degrees of freedom in the solid. The passage back 25 Lc=12, A(k,ω), βt = 7.5, λ=0.35 Lc=12, A(k,ω), βt = 40, λ=0.35 3π/8 3π/8 SŨ + SW with SŨ the usual Hubbard interaction and XZ β SW = dτ dτ ′ [ni (τ ) − 1]W (τ − τ ′ )[ni (τ ′ ) − 1]. 0 i (143) kF kF with W given by Eq. 131. As in other CT-INT calculations, it is advantageous to introduce auxiliary fields in the interaction terms to eliminate a trivial sign problem. Assad and Lang choose SŨ = π/8 π/8 0 0 -1 -0.5 0 ω /t 0.5 1 -1 -0.5 0 ω /t 0.5 1 FIG. 16 Single particle spectral function of the one dimensional Holstein model, computed as a function of frequency using “CDMFT” cluster dynamical mean field methods on an Lc = 12 cluster at filling n = 1/4 at high temperature (left panel) and low temperature (right panel). The spectra reveal a temperature dependent line broadening and the appearance at low T of a near-Fermi-level structure associated with the development of intersite correlations. From Ref. (Assaad, 2008). to Eq. (129) provides an oscillator representation and the formalism described above may be applied. Details are given in (Werner and Millis, 2010). Z β dτ 0 Ũ X Y (ni,σ (τ ) − ασ (s)) 2 i,s σ (144) with σ the physical spin, s = ±1 an auxiliary spin and ασ (s) = 1/2 + σsδ, with δ some constant (see also Sec. III.A). The phonon term is shifted as Z β X X SW = dτ dτ ′ W (τ − τ ′ ) 0 σ,σ′ s=±1 × [ni,σ (τ ) − α+ (s)] [ni,σ′ (τ ′ ) − α+ (s)] . (145) Assaad and Lang then perform an expansion of the CTINT type, but employing a general vertex V (τ ) = {i, τ, σ, τ ′ , σ ′ , s, ν} , (146) where ν enumerates the vertex types Hubbard (ν = 0) or phonon (ν = 1). The sum over the available phase space becomes X Z β X = dτ ′ , (147) V (τ ) i,σ,σ′ ,s,ν 0 and the weight of the vertex is w [V (τ )] = δν,0 Ũ δ(τ − τ ′ ) + δν,1 W (τ − τ ′ ). 2 (148) Using furthermore the notation C. CT-INT Assaad and Lang (2007) showed that the CT-INT approach, too, allows a transparent treatment of phonon degrees of freedom. Their algorithm enables the simulation of wider classes of models than the canonical transformation approach but at much greater computational expense. To date it has been formulated for the HolsteinHubbard model and applied (Assaad, 2008) to cluster dynamical mean field studies of the one-dimensional Holstein model, and we follow this formulation in our description below. The formalism however appears to apply also to non-Holstein couplings and further investigation along these lines would be of great interest. The treatment begins from an action formulation, with an interaction term which Assaad and Lang write as S = H[V (τ )] = δν,0 δσ,↑ δσ′ ,↓ δ(τ − τ ′ ) [n↑ (τ ) − α+ (s)] [n↓ (τ ) − α− (s)] + δν,1 [nσ (τ ) − α+ (s)] [ni,σ′ (τ ′ ) − α+ (s)] , the partition function can be written as Z τn−1 Z β ∞ X X Z n dτn w[V1 (τ1 )] . . . (−1) dτ1 = Z0 n=0 0 0 V1 (τ1 ) D E X w [Vn (τn )] Tτ H [V1 (τ1 )] . . . H [Vn (τn )] . × Vn (τn ) 0 (149) The Monte Carlo procedure follows the scheme described in Chapter III, with addition and removal of general vertices. 26 In a cluster dynamical mean field calculation of the one-dimensional Holstein model (Eq. (133) with U = 0) the method reveals interesting near-Fermi-level structures in the electron spectral function related to intermediate range correlations (Assaad, 2008); see Fig. 16 for representative results. The flexibility of the method, which seems applicable also to CT-HYB and CT-J, and the importance of electron-phonon couplings in materials science suggests that the implementation and investigation of more general types of electron phonon couplings would be a worthwhile effort. U=0 V>0 U>0 t 1 s1 V>0 t 2 s2 0 t max t 3 s3 t 4 s4 t 1 s1 U>0 t 2 s2 V>0 t 3 s3 0 U>0 t max t 5 s5 t 4 s4 t 6 s6 V=0 VIII. EXPANSION ON THE KELDYSH CONTOUR: REAL-TIME AND NON-EQUILIBRIUM PHYSICS A. Introduction In this section we describe diagrammatic Monte Carlo techniques capable of computing the real-time and nonequilibrium properties of quantum impurity models. These methods have been used to calculate the transport properties and relaxation dynamics of currentbiased quantum dots and as impurity solvers for dynamical mean field studies of the nonequilibrium properties of solids. Real-time CT-QMC methods were pioneered by Mühlbacher and Rabani who used a hybridization expansion method to study a problem of electrons coupled to phonons (Mühlbacher and Rabani, 2008). The non-equilibrium hybridization expansion was generalized to the case of electron-electron interactions in (Schmidt et al., 2008), (Werner et al., 2009b), (Schiró and Fabrizio, 2009), (Schiró, 2010), while the real-time version of the CT-AUX method was given in (Werner et al., 2009b), (Werner et al., 2010) and used in (Eckstein et al., 2009, 2010; Eurich et al., 2011; Tsuji et al., 2010). It is important to bear in mind that unlike in the equilibrium case, where the algorithms have been tried, tested and optimized, the nonequilibrium extensions of CT-QMC are still in an experimental stage. The methods which have been implemented so far are more-or-less straightforward adaptations of the equilibrium CT-QMC algorithms. Significant improvements may be possible. While both CT-AUX and CT-HYB based methods have been studied, we will restrict our explicit discussion in this section to the CT-AUX algorithm, which has allowed an accurate study of the steadystate current-voltage characteristics of half-filled quantum dots in the weak- and intermediate-correlation regime. For a discussion of the real-time CT-HYB algorithm we refer the reader to the original papers by Mühlbacher and Rabani (2008), Schiró and Fabrizio (2009), and also to Ref. (Werner et al., 2010), where both CT-AUX and CT-HYB real-time algorithms are t 7 s7 −i β FIG. 17 Illustration of the Keldysh contour for a CT-AUX study of the Anderson model with interaction quench (top panel) and voltage quench (bottom panel). In an interaction quench starting from U = 0, the imaginary time branch of the contour is shifted to t = −∞ and need not be explicitly considered in a weak-coupling Monte Carlo simulation. The red arrows represent auxiliary Ising spin variables. The top panel shows a Monte Carlo configuration corresponding to perturbation order n+ = 2, n− = 2, and the bottom panel a configuration corresponding to n+ = 3, n− = 2, nβ = 2. From Ref. (Werner et al., 2010). presented in detail. B. Keldysh formalism The basic theoretical task is to evaluate the expectation value of some operator O at some time t, given that the system was prepared at time t = 0 in a state described by the density matrix ρ0 . Using the Heisenberg representation the expectation value may be expressed mathematically as i h Rt ′ R t ′′ ′ ′′ (150) hO(t)i = T r ρ0 ei 0 dt H(t ) Oe−i 0 dt H(t ) (the generalization to operators with multiple time dependencies is straightforward and will not be written explicitly). A nonequilibrium situation may arise through a time dependence of H (as occurs for example in a system ‘pumped’ by a laser), through nonequilibrium correlations expressed by ρ0 (as occurs for a quantum dot with current flowing across it) or through an initial density matrix ρ0 which is different from the long-time (thermal equilibrium) limit, as occurs if a system is ‘quenched’ into a different state. One may (Kadanoff and Baym, 1962) view the expectation value in Eq. (150) as an evolution on SchwingerKeldysh contours, examples are given in Fig. 17. In each 27 panel the upper contour represents the evolution from initial time t = 0 to measurement time t via e−iHt , the operator O is positioned at the bend where the lower and upper contours meet, and the lower contour represents the evolution from t back to t = 0 via eiHt . The two panels of Fig. 17 also indicate two ways to prepare the initial state of the system. The upper panel indicates the standard approach, which we call the interaction quench. In this approach one imagines that at times t < 0 the interactions are turned off, so that ρ0 is the density matrix of the non-interacting, but potentially non-equilibrium system. At t = 0 the interactions are turned on, and one studies the subsequent evolution of the system. The lower panel indicates an alternative approach, the voltage quench. In this approach one prepares the system by performing an equilibrium simulation of the interacting model (accomplished formally by propagating along the imaginary branch of the contour shown in the figure) and then turns on the nonequilibrium effects at time t = 0. The general strategy for evaluating Eq. (150) is the same as in the equilibrium case, namely to write H as a sum of two terms: one, Ha , for which the time evolution can be treated exactly and another, Hb , which is treated by a formal perturbative expansion. The expansion in Hb generates diagrams which are sampled stochastically, using an importance sampling which accepts or rejects proposed diagrams on the basis of their contributions to hÕi, with for example Õ = 1. Time plays the role of β = 1/T . There are crucial differences. In equilibrium calculations, the expansion can be formulated on the imaginary time axis 0 ≤ τ < 1/T as an expansion of Rβ TrTτ e−βHa exp[− 0 dτ Hb (τ )]. Thus one can work with real (or Hermitian) quantities and only one exponential must be expanded. In the nonequilibrium situation one must expand two exponentials, doubling the perturbation order required to reach a given time. Also, the result of a measurement at a finite time depends on the initial preparation of the system. It is thus essential that the computation proceed for long enough to build up the correct entanglement between the impurity and the bath before steady-state quantities are measured. The main difficulty of nonequilibrium calculations, however, is that the expansion must be done for real times, so diagrams come with factors of i raised to powers determined by the perturbation order. The terms in the expansion are complex so a ’phase’ problem exists, but in all cases known to date the expansion can be arranged so that all terms are real. A sign problem however remains. Convergence of the perturbation theory is thus oscillatory rather than exponential and the result is a sign problem which severely limits the maximum perturbation order that can be attained and hence the maximum time which can be reached. C. Real-time CT-AUX Here we present the formalism needed for a nonequilibrium application of the CT-AUX method. For simplicity we focus on a nonequilibrium version of the singleimpurity Anderson model, Eq. (12), where the local Hamiltonian is coupled to two leads (“left” and “right”) which may be at different chemical potentials µα . Thus the bath and hybridization terms in the Hamiltonian become X X  α† α Hbath = (151) εα p,σ − µα cp,σ cp,σ , α=L,R p Hhyb = X X α=L,R p,σ  Vpα cα† p,σ dσ + h.c. , . A crucial parameter is the level broadening X Γα (ω) = π |Vpα |2 δ(ω − εα p) (152) (153) p associated with lead α. The total level broadening is Γ = ΓL + ΓR . (154) Γ is the imaginary part of the real axis hybridization function. It plays a crucial role in what follows so we identify it by a separate symbol. In nanoscience applications one is interested in the current flowing through the impurity. The flow of charge into, say, the left lead may be determined from the time derivative of the number of left lead electrons N̂ L = P L† L L with the pσ apσ apσ . Taking the commutator of N̂ Hamiltonian shows that the current flowing through the impurity into the left lead is determined by the t → t′ limit of the quantity X ′ (155) A(t, t′ ) = VpL TC cL† pσ (t)dσ (t ) . pσ TC is the contour ordering operator, which exchanges the product A(t)B(t′ ) of two operators if t is earlier on the contour than t′ (a minus sign is added if the exchange involves an odd number of Fermi operators). Finding an efficient means of measuring A is an important part of the algorithm. In the nonequilibrium Anderson model an interaction quench corresponds to taking U = 0 for times t < 0 with an instantaneous step to a non-zero U at t = 0 while the chemical potential difference is time independent and the initial density matrix that appropriate to noninteracting electrons in the given bias voltage. A voltage quench corresponds to taking µL = µR for time t < 0 with an instantaneous step to a nonzero µL − µR at t = 0. One assumes that the lead electrons equilibrate instantly to the new chemical potential so that the β equal time correlators of lead operators are hcα† p,σ cp′ ,σ′ i = x/T + 1)−1 δα,β δp,p′ δσ,σ′ fTα (εα p,σ − µα ), with fT (x) = (e 28 the Fermi distribution function for temperature T and µα the value of the chemical potential for lead α at the appropriate time. A compact derivation of all measurement formulae for both voltage and interaction quenches may be obtained from manipulations of the “partition function” (more precisely, an expression for the expectation value of the operator O = 1) on the contour shown in the lower panel of Fig. (17):  eq 0 Z = e−Kβ Tr e−β(Hbath +Hdot +Hhyb +HŨ −Kβ /β) neq 0 × eit(Hbath +Hdot +HU +Hhyb −Kt /t)  neq 0 × e−it(Hbath +Hdot +HU +Hhyb −Kt /t) . (156) neq Hbath The notation indicates that on the real-time portion of the contour the two leads may have different chemeq ical potentials, whereas Hbath means that on the imaginary time portion of the contour the two leads have the same chemical potential. At this stage Kβ and Kt are arbitrary constants. Convenient choices for Kβ,t will be discussed below. In Eq. (156) the interaction Ũ on the imaginary time branch need not be the same as the interaction U on the real-time branches. The generalization to time-dependent U (t) or µL,R (t) is straightforward (Eurich et al., 2011; Tsuji et al., 2010). In the voltage quench Ũ = U while in the interaction quench Ũ = 0 and the imaginary time portion of the contour drops out of the problem. The time evolution along the real-time and imaginarytime contours is expanded in powers of HU − Kt /t and HU − Kβ /β, respectively. Each interaction vertex is then decoupled using Ising spin variables (x = t or β) Kx X γx s(nd,↑ −nd,↓ ) HU − Kx /x = − e , (157) 2x s=−1,1 cosh(γx ) = 1 + (xU )/(2Kx ), (158) as in Eq. (69). The resulting collection of Ising spin variables on the contour represents the Monte Carlo configuration {(t1 , s1 ), (t2 , s2 ), . . . (tn , sn )}, with ti denoting the position of spin i on the L-shaped contour (see illustration in Fig. 17). There are n+ spins on the forward branch, n− spins on the backward branch and nβ spins on the imaginary-time branch of the contour (n = n+ + n− + nβ ). The weight of such a configuration is obtained by tracing over the dot and lead degrees of freedom and can be expressed in terms of two determinants of n × n matrices Nσ−1 : p({(t1 , s1 ), (t2 , s2 ), . . . (tn , sn )}) = (−in− )(in+ )(Kt dt/2t)n− +n+ (Kβ dτ /2β)nβ Y det Nσ−1 , σ Nσ−1 =e Sσ − (iG0,σ )(e Sσ − I). (159) (160) Here (G0,σ )ij = G0,σ (ti , tj ) is the ij element of the n × n matrix of non-interacting Green functions G0,σ (t, t′ ) = −ihTC dσ (t)d†σ (t′ )i0 (161) computed using the possibly time-dependent chemical potentials and evaluated at the time arguments defined by the Ising spins. The quantity eSσ = γ1 s1 σ γn sn σ diag(e ,...,e ) is a diagonal matrix depending on the spin variables (with γi = γt for spins located on the real-time branches and γi = γβ for spins on the imaginary time branch). A Monte Carlo sampling of all possible spin configurations is then implemented based on the absolute value of the weights (159). The contribution of a specific configuration c = {(t1 , s1 ), (t2 , s2 ), . . . (tn , sn )} to the Green function (Gcσ ) and current (Acσ ) is given by (Werner et al., 2009b) Gcσ (t, t′ ) = G0,σ (t, t′ ) n X G0,σ (t, ti )[(eSσ − I)Nσ ]i,j G0,σ (tj , t′ ), (162) +i i,j=1 Acσ (t, t′ ) = n X +i i,j=1 A0,σ (t, t′ ) G0,σ (t, ti )[(eSσ − I)Nσ ]i,j A0,σ (tj , t′ ), (163) with the first term on the right hand side giving the contribution to the non-interacting Green function or current and the second term a correction due to the interactions. In Eq. (163) X ′ A0,σ (t, t′ ) = VpL hTC cL† (164) pσ (t )dσ (t)i0 pσ denotes a dot-lead correlation function of the noninteracting model. The Green function and current expectation value is Gσ (t, t′ ) = hGcσ (t, t′ )φc i/hφc i, X I(t) = −2Im [hAcσ (t, t)φc i/hφc i], (165) (166) σ where h.i denotes the Monte Carlo average and φc the phase of the weight of the configuration c. As in Eq. (198), it is advantageous to accumulate the quantity Xσ (s1 , s2 ) = i n X i,j=1 δC (s1 , ti )[(eSσ − 1)Nσ ]i,j δC (s2 , tj ), (167) which is related to the self-energy Σ by X ⋆ G0 = Σ ⋆ G (with ⋆ denoting a contour convolution). Furthermore, it follows from Eq. (159) that the weight of a Monte Carlo configuration changes sign if the last spin (corresponding to the largest time argument) is shifted from the forward contour to the backward contour or vice versa. Since the 29 absolute value of the weight does not change, these two configurations will be generated with equal probability. As a result, all the terms in Eq. (167) which do not involve the last operator on the contour will cancel. It is therefore more efficient to accumulate Xσ (s1 , s2 ) = i(1 − δ({ti })) + iδ({ti }) n X n X x(s1 , i; s2 , j) i,j=1 [x(s1 , last; s2 , l) + x(s1 , l; s2 , last)], l6=last (168) with x(s1 , i; s2 , j) ≡ δC (s1 , ti )[(eΓσ − 1)Nσ ]i,j δC (s2 , tj ) and δ({ti }) = 1 if maxi Re(ti ) > 0 and 0 otherwise. In an interaction quench starting from U = 0, the imaginary-time evolution is not explicitly considered in the Monte Carlo simulation and temperature appears only as a parameter in the noninteracting Green functions (see Fig. 17). Moreover, the latter depend only on time differences, and thus can be easily expressed in terms of their Fourier transform. Assuming a large band cutoff and neglecting the real part of the lead self-energy we find (Jauho et al., 1994; Werner et al., 2009b) α ′ X Z dω ′ Γ (ω)(f (ω − µα ) − ΘC (t, t )) , e−iω(t−t ) G0 (t, t ) = 2i 2π (ω − εd − U/2)2 + Γ2 α=L,R Z dω −iω(t−t′ ) ΓL (ω)ΓR (ω)(f (ω − µL ) − f (ω − µR )) A0 (t, t′ ) = −2i e 2π (ω − εd − U/2)2 + Γ(ω)2 Z dω −iω(t−t′ ) ΓL (ω)(ω − εd − U/2)(f (ω − µL ) − ΘC (t, t′ )) . e +2 2π (ω − εd − U/2)2 + Γ2 ′ In the voltage quench, on the other hand, the interaction is non-vanishing on the imaginary time portion of the contour (Fig. 17), while the chemical potential difference jumps instantaneously from zero (on the imaginary branch) to V (on the real branches). Because of the time dependence of the chemical potentials, the noninteracting Green functions are not time translation invariant and we cannot express G0,σ and the dot-lead correlator A0,σ in the form of a Fourier transform. Instead, those functions must be computed numerically from their equations of motion, as explained in (Werner et al., 2010). (169) (170) the spin degree of freedom effectively disappears (eγsσ = −1) and the algorithm becomes the real-time version of the weak-coupling solver (Section III) for the particlehole symmetric interaction term HU − Kx /x = U (nd,↑ − 1 1 The odd perturbation orders are con2 )(nd,↓ − 2 ). tinuously suppressed as Kx approaches −xU/4. This suppression of odd perturbation orders was essential in the nonequilibrium dynamical mean field calculations of (Eckstein et al., 2009, 2010) and the current calculations of (Werner et al., 2010). IX. COMPARISON OF THE EFFICIENCY OF THE DIFFERENT METHODS D. Sign problem The sign (phase) problem in the real-time CT-QMC methods grows exponentially with the average perturbation order on the real-time branches, which in turn is proportional to the simulation time. Operators on the imaginary time branch do not add significantly to the sign problem. While accurate results can be obtained for average signs down to 10−3 , this threshold is reached if the expected number of operators on the real-time contour is approximately ten. To reach long times or strong interactions, it is therefore important to reduce the average perturbation order on the real-time branches as much as possible. In this context it is worth noting that in the particle-hole symmetric case, the parameters Kx of the CT-AUX algorithm can be chosen such that only even perturbation orders appear in the expansion. In fact, for Kx = −xU/4 (171) 1. Average expansion orders and matrix sizes For all diagrammatic quantum Monte Carlo algorithms discussed here, the computational effort scales as the cube of the expansion order or matrix size, as discussed in detail in (Gull et al., 2007). For a Hirsch and Fye (1986) solver the matrix size is determined by the time discretization ∆τ = β/N . In the case of the continuoustime solvers it is determined by the perturbation order k, which is peaked roughly at the mean value hki determined by the probability distribution p(k). In Fig. 18, we plot these matrix sizes as a function of inverse temperature β for fixed U/t = 4 and as a function of U/t for fixed βt = 30, for a semi-circular density of states with bandwidth 4t. It is obvious from the upper panel of Fig. 18 that the matrix size in all three algorithms scales linearly with 30 150 tion expansion is best suited to study strongly correlated impurity models with density-density interactions. There is of course a prefactor to the cubic scaling, which depends on the computational overhead of the different algorithms and on the details of the implementation. However, the results presented here indicate large enough difference between the methods that the effects of optimization are of secondary importance. CT-INT CT-HYB Hirsch Fye Matrix Size 100 50 2. Accuracy for constant CPU time 0 0 20 30 βt 40 50 60 70 4 5 6 7 CT-INT CT-HYB Matrix Size 100 10 50 0 0 1 2 3 U/t FIG. 18 Upper panel: Bethe lattice, single site DMFT, scaling of matrix size with temperature at U/t = 4 for the Hirsch-Fye, CT-INT and CT-HYB algorithms. For HirschFye, the resolution N = βU has been chosen as a compromise between reasonable accuracy and acceptable speed, while the average matrix size is plotted for the continuoustime solvers. Lower panel: scaling of matrix size with U/t for fixed βt = 30. The solutions for U ≤ 4.5 are metallic, while those for U ≥ 5.0 are insulating. The much smaller matrix size in the relevant region of strong interactions is the reason for the higher efficiency of the hybridization expansion method. From Ref. (Gull et al., 2007). β. The Hirsch-Fye data are for a number of time slices N = βU , which is apparently a common choice, although Fig. 19 shows that it may lead to considerable systematic errors. Thus, the grid size should in fact be chosen much larger (N & 5βU ). While the matrix size in the CT-INT approach is approximately proportional to U/t, as in Hirsch-Fye, the U dependence of the hybridization expansion algorithm is very different: a decrease in average matrix size with increasing U/t leads to much smaller matrices in the physically interesting region 4 . U/t . 6, where the Mott transition occurs in this model. The results in Fig. 18 and the cubic dependence of the computational effort on matrix size show why the continuous-time solvers are much more powerful than Hirsch-Fye and why the hybridiza- The CT-INT, CT-HYB, and Hirsch-Fye algorithms considered here work in very different ways. Not only are the configuration spaces and hence the update procedures entirely different, but so also are the measurement procedures of the Green’s functions and other observables. Ref. (Gull et al., 2007) proposed that the performance of solvers should be compared by measuring the accuracy to which physical quantities can be determined for fixed CPU time. This is the question which is relevant for implementations and avoids the tricky (if not impossible) task of separating the different factors which contribute to the uncertainty in the measured results. Because the variance of the observables measured in successive iterations of the self-consistency loop turned out to be considerably larger than the statistical error bars in each step, the mean values and error bars were determined by averaging over 20 DMFT iterations starting from a converged solution. The Hirsch-Fye solver suffers from additional systematic errors due to time discretization. These systematic errors are typically much larger than the statistical errors. In order to extract meaningful results from Hirsch-Fye simulations it is essential to do a careful (and time-consuming) ∆τ → 0 analysis (Blümer, 2002). The continuous-time methods are free from such systematic errors. The high precision of the hybridization expansion results for the kinetic energy indicate that this algorithm can accurately determine the shape of the Green’s function near τ = 0 and β. For the self-energy, Σ(iωn ) = G0 (iωn )−1 − G(iωn )−1 , (172) the Matsubara Green’s functions have to be inverted and subtracted. This procedure amplifies the errors of the self-energy especially in the tail region where G0 (iωn ) and G(iωn ) have similar values. Fig. 19, in contrast, shows low frequency results ImΣ(iω0 )/ω0 for U/t = 4 and several values of β. This quantity is related to the quasiparticle weight Z ≈ 1/(1 − ImΣ(iω0 )/ω0 ). Again, the Hirsch-Fye results show systematic errors due to the time discretization which must be extrapolated. The results from the continuous-time solvers agree within error-bars, 31 -2.6 determinants of matrices Dk of size k and Dk+1 of size k + 1, -2.8 Im(Σ(iω0))/ω0 r= CT-INT in ω CT-HYB Hirsch Fye, ∆τ t = 0.1 Hirsch Fye, ∆τ t = 0.15 Hirsch Fye, ∆τ t = 0.2 -3 -3.2 10 20 30 βt 40 50 FIG. 19 Self-energy ImΣ(iω0 )/ω0 at the lowest Matsubara frequency ω0 = πT as a function of β for U/t = 4.0. The Hirsch-Fye results exhibit large discretization errors, while the continuous-time methods CT-INT and CT-HYB agree within error bars. CT-HYB is particularly suitable for measuring quantities which depend on low-frequency components, such as the quasi-particle weight. From Ref. (Gull et al., 2007). but the size of the error bars is very different. The hybridization expansion approach yields very accurate results for low Matsubara frequencies in general. The advantage of measuring in Matsubara frequencies as opposed to imaginary time in the CT-INT and CTAUX algorithms becomes apparent for large ωn . Only the difference of G to the bare Green’s function G0 has to be measured in this algorithm (see the detailed discussion in Sec. X.C, in particular Eq. 196). These differences decrease with 1/ωn for large ωn and Eq. (196) yields an accurate high frequency estimate, so that the tail of the self energy can be computed without amplification of errors. Further discussion of the relative advantages of different methods can be found in (Gull et al., 2007). X. TECHNICAL ASPECTS The following sections, independent from each other, are referenced from the algorithm sections and explain aspects of updates, measurements, and numerical methods needed for the efficient implementation of these continuous-time algorithms. Efficient and accurate measurement, especially of the high frequency behavior, remains a bottleneck in the computations; further progress in this area would be desirable. The dominating computational task in most continuous-time quantum Monte Carlo impurity solver algorithms is the computation of ratios r of (173) with matrices that have one row and one column (sometimes two rows and two columns) changed, added, or removed. The only exceptions are given by the hybridization expansion in its general formulation (Sec. V.C), which is dominated by a trace computation, and the bold CT-QMC method (Gull et al., 2010), where the determinant structure is replaced by an analytic resummation of diagrams. To compute the determinant of large matrices directly, it is best to first perform a factorization like the LU or QR factorization, where the matrix A is written as the product of a matrix of which the determinant is known, and another matrix where the determinant is easy to compute, e.g. the diagonal of an upper / lower triangular matrix. The cost of such a straightforward factorization is O(k 3 ). Determinant ratios of two matrices that differ only by one or two rows and columns can be computed much more efficiently if the inverse of one of the matrices is known. This is the reason for computing the inverse Green’s function matrix M = D−1 in the CTINT algorithm, the inverse hybridization function matrix M = ∆−1 in the hybridization algorithm, and the matrix N in the CT-AUX algorithm. We illustrate the linear algebra at the example of the CT-AUX matrix N of Eq. (74) introduced in Sec. IV. The formulas for Eq. (49) (CT-INT) and Eq. (92) (CT-HYB) are computed analogously. We start by considering a configuration at expansion order k, characterized by an N-matrix of size k × k, and consider the insertion of a vertex, thereby enlarging the configuration to k + 1 vertices. For ease of writing we choose a basis such that the rows and columns changed are the last ones, though of course in the code any row/column can be changed. Inserting a vertex into the configuration of order k leaves most of the inverse of N unchanged (Eq. (74)): it adds one row (here called R) and one column Q to it, enlarging it to a (k + 1) × (k + 1) - matrix. However, changes to the new Nk+1 − matrix, denoted by quantities with a tilde, are dense:  k −1  (N ) Q , R S   P̃ Q̃ = . R̃ S̃ (Nk+1 )−1 = Nk+1 A. Inverse matrix formulas det Dk+1 , det Dk (174) (175) P̃ is of size k × k, the vectors Q, Q̃ and R, R̃ have size (k × 1) and (1 × k), and S, S̃ are scalar. A block calculation shows that the elements of the matrix Nk+1 may 32 be computed from Nk , R, S, and Q : S̃ = (S − [R][N(k) Q])−1 , Q̃ = −[N (k) (176a) Q]S̃, (176b) (k) (176c) R̃ = −S̃[RN ], P̃ = N(k) + [N(k) Q]S̃[RN(k) ]. Spin-flip proposals are O(1) (as in Hirsch-Fye), and the same linear algebra applies. Spin-flip updates are not ergodic in continuous-time algorithms; updates which change the expansion order and vertex times are needed. (176d) 1. Delayed spin-flip updates The determinant ratios needed to accept or reject an update in Eq. (57), Eq. (81), or Eq. (99) are given by k+1 −1 det(N ) 1 = = det(S − RNk Q), k −1 det(N ) det S̃ (177) as can e.g. be seen from an LU decomposition of the block-matrix (Nk+1 )−1 . The computational effort for computing the insertion acc probability Wxy of a spin (or vertex, or segment) is 2 O(k ), or a matrix-vector multiplication followed by an inner product, as in Eq. (176a). The removal probability is computed in O(1), as S̃ is an element of Nk+1 and therefore already known. If an update is accepted, an O(k 2 ) rank one update has to be performed for Eq. (176d). As approximately k updates are needed to decorrelate a configuration, the overall algorithm scales as O(hki3 ), with hki the average expansion order. Note that the acceptance probabilities for vertex insertions or removals are more expensive than spin-flips in the case of the Hirsch-Fye algorithm (O(k 2 ) vs. O(1)), while accepted updates require rank one updates (O(k 2 )) in both cases. B. Spin-flip updates In CT-AUX, if only the value of an auxiliary spin is changed and not the imaginary time or site index of a vertex, a Dyson equation similar to the Hirsch-Fye Dyson equation may be employed. For a spin-flip from interaction Vpq to Vp′q of spin pq at Monte Carlo step q we obtain: Spin-flip updates can be separated into two parts: the computation of the acceptance ratio R (Eq. (182)), and the update of the Green’s function after an accepted spin-flip move. “Delayed” updates, a concept developed by Alvarez et al. (2008) for the Hirsch-Fye algorithm and applied to continuous-time methods in Gull (2008), delay the (expensive and slow) update of the Green’s function to a later time, while computing a sequence R1 , · · · , Rqmax of Monte Carlo spin-flip acceptance ratios. In analogy to Alvarez et al. (2008), we define two vectors aqi and bqj (compare to Eq. 179) as aqi = λq ((NG0 )qipq − δipq ), bqj = Npqq j . At the first step (q = 1) N = N0 is known. We start by selecting a spin p1 . We then compute R1 according to Eq. (182), and accept or reject the update. In a next step (q = 2), we choose the spin p2 . In order to compute R2 , we need to know Np12 p2 , which we compute as Np12 p2 = Np02 p2 + a1p2 b1p2 . = λq = + ((NG0 )qipq q q − δipq )λ (N )qpq j , q γ γ = q, 1 + (1 − (NG0 )qpq pq )γ q R ′ γ q = eVpq −Vpq − 1 q R = 1 + (1 − (179) (180) 0 dqj = Njj + (182) Niz G0zj (eVj − 1) = Nij eVj − δij , (NG0 )lm = (Nlm e Vm − δlm )/(e (183) Vm − 1) (184) alj blj . (188) We define two vectors, colq and rowq , that iteratively recompute the elements of the matrix N for the row and column pq : coljq = 0 Njp q + q R is the spin-flip acceptance ratio. The expression (NG0 )qlm = Gqlm can easily be computed from the identity q X l=1 (181) (NG0 )qpq pq )γ q . (187) q More generally, the jth diagonal element dqj = Njj after q (accepted) spin-flips is given by (178) Nijq (186) For Rq we need to know (NG0 )qpq pq = Gqpq pq , which is computed by Eq. (184) from Npqq pq . (NG0 )q+1 = (NG0 )qij + ((NG0 )qipq − δipq )λq (NG0 )qpq j , ij Nijq+1 (185) rowjq = Np0q j + q X l=1 q X q alj blpq = Njp q (189) alpq blj = Npqq j . (190) l=1 These are sufficient to compute the new q-th column (row) of the matrices aqj (bqj ) (Eq. (185) and (186)) and 33 the new diagonal vector dj : aqi = λq ((NG0 )qipq − δipq ) = = = (191) Vpq − δipq )/(e q Vpq λq ((coli e − δipq )/(eVpq rowjq , q+1 dqjj + aqj bqj = Njj . = λq ((Nipq e bqj dq+1 j Vpq − 1) − δipq ) − 1) − δipq ), (192) (193) As seem in Eq. (182), dq+1 pq+1 is needed to accept or reject the next spin-flip at the next proposed position pq+1 . After some steps qmax we retrieve the full N-matrix by computing Nijqmax = Nij0 + qX max ail blj . (194) l=1 A complexity analysis shows the cost of the delayed spin-flip updates: Eq. (189) and (190) are O(qk), Eq. (193) is O(k), and Eq. (194) is an O(k 2 qmax ) matrixmatrix multiplication. The reason for performing delayed spin-flip operations instead of straightforward spinflips or insertion and removal updates is that, on current hardware architectures, the final matrix multiplication in Eq. (194) is about a factor 10 faster for large (i.e. out-of cache) matrices than successive rank one updates, as fast matrix operations that reuse data can be employed. The additional overhead of computing a, b, d, row, and col will dominate the algorithm for large q max . We therefore recompute N often enough that the overhead does not dominate, but that we can still take advantage of the matrix operations. In practice q max = 32 or 64 are reasonable values (Alvarez et al., 2008), also in the continuous-time algorithms. For more information see also (Mikelsons, 2009) and (Gull et al., 2011a). C. Efficient measurements in the CT-AUX and CT-INT formalism In the CT-AUX and CT-INT algorithms the Green’s function measurement formula Eq. (83) and Eq. (60) in imaginary time, for sites i and j and at times τi and τj , is 0 Gij,σ (τi − τj ) = Gij,σ (τi − τj ) E DX 0 0 − Gix (τ − τ )G (τ − τ )M i p p j pq xq j,σ p ,σ pq (195) MC , where xp (xq ) and τp (τq ) denote the site and time of the vertex at row (column) p (q) of M . Fourier transformed to Matsubara frequencies, the Green’s function is estimated as 0 Gij,σ (iωn ) = Gij,σ (iωn ) (196) D E X 1 . G 0 (iωn )Gx0q j,σ (iωn )eiωn τp Mpq e−iωn τq − β pq ixp ,σ MC Measurement using Eq. (195) in the imaginary time domain has a crucial drawback: To sample the smooth function Gij (τ ) the formulas need to be evaluated for definite τ on some grid (which may be chosen non-equidistant). Further processing, e.g. Fourier transforms, may introduce discretization errors caused by this grid. As the cost computing G straightforwardly is proportional to the number of imaginary time points at which it needs to be evaluated, a fine grid of time points becomes prohibitively expensive. In addition, G estimated by Eq. (195) has a further drawback: the observable average G(τi − τj ) is translationally invariant, while the estimator explicitly depends on two times τi , τj , so that translation symmetry needs to be restored by the random walk. In the Matsubara frequency domain, Eq. (196), there already is a discrete grid of frequencies ωn = (2n + 1)π/β, n = 0, 1, 2, . . .. The summands inside h·iMC decay as 1/ωn2 . A measurement method implemented directly in frequency space measures all frequencies up to a maximum cutoff ωmax . To obtain the number of frequency points needed we use information from a high frequency expansion of the self energy or the Green’s function, and automatically adjust the cutoff frequency such that systematic errors from the cutoff are much smaller than statistical (Monte Carlo) errors. This controllability makes this method the preferred one for high accuracy measurements of the Green’s function. In translationally invariant clusters, only diagonal entries of the Green’s function in k-space are non-zero. For a cluster with Nc sites this implies that only Nc independent k-space Green’s functions need to be measured (instead of Nc2 real space Green’s functions), at the small cost of performing a (real space) Fourier transform. Computing the exponential factors exp(±iωn τ ) needed for the frequency measurement is expensive. Even with fast vectorized functions available as part of numerical libraries, these operations are so time consuming that they may dominate computer time in large simulations. An obvious simplification consists of creating a fine imaginary time grid. At the start of the simulation, exp(iωn τ ) is computed for all ωn needed and all τ on that grid, and the exponentials in Eq. (196) are taken from it. This eliminates the expensive calculation of eiωn τ at runtime at the cost of some (but relatively little) additional memory. We did not observe any inaccuracies introduced by this discretization. 1. Self energy binning measurement An efficient measurement method, presented in Gull et al. (2008a), is based on measuring ΣG ≡ S. M plays the role of a T -matrix: M G 0 = ΣG. This measurement method works in imaginary time but does not have the drawbacks described in the previous section. The measurement formula (omitting the spin index) is 34 rewritten as 0 Gij (τ ) = Gij (τ ) − 0 = Gij (τ ) − × DX pq Z dτz DX pq X l E 0 0 (τ − τ )M G (τ ) Gix p pq q x j q p 0 = Gij (τ ) − Z 0 β dτz X l (197) Gil0 (τ − τz ) E δ(τz − τp )δxp l Mpq Gx0p j (τq ) MC MC Gil0 (τ − τz )hSlj (τz )iMC . collaborators (Prokof’ev et al., 1998) and, among other problems, applied to the attractive - U Hubbard model (Burovski et al., 2006b). The name “worm” refers to two dangling Green’s function lines in the diagrams that build the head and tail of the “worm”. Instead of generating configurations of Z and measuring G, a worm method stochastically samples both the series for Z and the series for Gij,σ (τi − τj ) simultaneously. To this end the configuration space C is enlarged to include both the set of diagrams for Z, CZ and the set of diagrams CG for G: C = CZ ∪ CG . (199) The Matsubara Green’s function can similarly be exA new partition function of the combined system is detracted directly from the expectation value of S: fined by extending Z by the sum over all Green’s function Z βX diagrams, with an arbitrary factor η that controls the rel0 Gij (iωn ) = Gij (iωn ) − Gil0 (iωn ) dτz eiωn τz hSlj (τz )iMC . ative importance the Green’s function sector CG and the 0 partition function sector CZ , l (198) X ZZ Ztot = Z + η dτ1 dτ2 Gij,σ (τ1 , τ2 ). (200) In the Monte Carlo process only the quantity hSiMC is ij,σ measured and binned into fine (typically 10000) bins. In practice, η is chosen such that both summands for Ztot The cost of this binning process is independent of the have non-vanishing weight. number of time discretization points of S, and only reThe random walk and updates are modified such that 0 quires the evaluation of M G at runtime. In practice we the entire space C is sampled: In addition to the partiemploy the translational invariance in the time - domain tion function updates, “worm insertion” and “removal” to obtain multiple estimates of the Green’s function at updates, i.e updates that transition between CZ and CG the same step, and perform a matrix-matrix multiplicaby inserting or removing Green’s function operators, as 0 0 tion of the matrix Mpq and a matrix Gqj = Gsq sj (τq − τj ) well as updates in the Green’s function space like the with randomly chosen τj to obtain estimates for S. The shift of Green’s function lines or vertex insertions (and method is accurate and significantly faster than the other removals) in the Green’s function space need to be conmethods presented here and is therefore the measurement sidered. Updates that change the vertex part of a Green’s method of choice for the CT-AUX and CT-INT algofunction configuration are important, as they allow imrithms, unless access to large clusters or high precision portance sampling for all elements of a Green’s function in the high frequency part of the self energy is needed (e. diagram. g. for analytic continuation), in which case we use the Measurements in real space and imaginary time are frequency measurement. straightforward: A histogram of worm positions with the appropriate sign needs to be recorded. Such an imaginary time measurement yields estimates for the Green’s D. Green’s function (“worm”) –sampling function with continuous times. These estimates are best In all algorithms discussed so far, diagrams or configmeasured on a fixed, but preferably non-uniform, grid by urations were generated with the weight that they conproposing “worm shift” updates onto measurement locatribute to the partition function (Sec. II.C). For meations. surements, Green’s function diagrams were then obtained Worm methods have been implemented both for CTby modifying such partition function configurations. A HYB and CT-INT algorithms (Gull, 2008). For equilibpriori it is not clear that the Green’s function configurarium DMFT simulations, without reweighing, the worm tions with large weight are the ones created by modifying algorithm did not result in much better statistics than configurations important for the partition function. If the partition function algorithm, as sampling problems the Green’s function estimates generated by importance appear to be minimal. Combined e.g. with “Wang Lansampling of partition function configurations are not the dau” techniques the worm method offers the possibility ones with large contributions to the Green’s function, to perform reweighing of the Green’s function to obtain the Green’s function estimator obtains a large variance better statistics. Worm updates are however crucial in and therefore the measurement of the Green’s function a “bold” sampling method (Gull et al., 2010) where, due becomes inefficient. This problem may be overcome by to a partial resummation of some diagrams, there is no employing a “worm” algorithm. The concept was origidirect relation between Green’s function and partition nally developed in the bosonic context by Prokof’ev and function diagrams. 35 E. Wang Landau sampling In the usual sampling process of the partition function CT-INT algorithm, as well as in the other algorithms described in Sec. IV and V, diagrams of the expansion (elements of the configuration space) are sampled with the weight that they contribute to the partition function. Observables are then measured in this ensemble. However, we are free to sample any arbitrary ensemble – as long as the proper reweighing according to Eq. (24) is performed. While the samples generated are likely to have a larger variance [Eq. (23)], there may be other advantages, in particular smaller autocorrelation times. Here we describe so-called flat-histogram sampling methods. These methods are particularly useful for problems, such as first order phase transitions, with barriers in the configuration space of the Markov walker. Wang and Landau (2001a,b) presented a general reweighing scheme that is designed to find and overcome barriers and phase transitions without prior knowledge of where in phase space they are. The method was extended to quantum problems by Troyer et al. (2003). In the quantum Wang-Landau method the key is to reweigh the perturbation series so that all orders up to some kmax are sampled with approximately equal probability (see Fig 20). kmax has to be chosen in such a way that all local minima of phase space have some overlap with order kmax , so they can be reached by flat histogram sampling. For the reweighed system, the acceptance ratio (57) is 1.0 1.0 1.0 0.8 0.8 0.8 0.6 0.6 0.6 0.4 0.4 0.4 0.2 0.2 0.2 0.0 0.0 0 10 20 30 40 50 60 0.0 0 10 20 30 40 50 60 0 10 20 30 40 50 60 FIG. 20 Sketch of the histogram of the expansion order for flat histogram sampling. x-axis: expansion order k. y-axis: histogram h(k). Left panel: no flat histogram sampling. Middle panel: flat histogram sampling up to half of the maximum order. Right panel: flat histogram sampling up to the maximum contributing order. From Ref. (Gull, 2008) replaced with R λk+1 λk with λ(k) = ( 1 p(k) , k < kmax , 1, k > kmax (201) where p(k) is the probability of having expansion terms at order k in a non-reweighed sampling. Reweighing factors need also be taken into account while calculating averages (Eq. (24)). The probability p(k) is unknown at the start of the simulation. Therefore the reweighing coefficients are adjusted as the simulation proceeds: the value of λ(k) is slightly decreased for the frequently-visited values of k and increased for rarely-visited ones, until the histogram is flat. As the ensemble λ(k) does not enter the expectation value of the observables, it is not important to have a very accurate estimate of it, as long as it is sufficient to ensure ergodicity. The reweighed algorithm generates diagrams both at the physically interesting orders and at orders that are very close to zero, i.e. the bare Green’s function or noninteracting partition function in CT-INT. Deliberately generating configurations that contribute little weight to the partition function may seem inefficient, as the idea behind importance sampling is to generate the diagrams with the importance they contribute to the partition function. However, when revisiting the noninteracting case at 0th order of the series, all vertices and therefore all correlations are removed, and when the series is rebuilt it will likely end up in a different part of phase space – for example in a different global symmetry sector, thereby avoiding trapping in local minima. In other words, the method aims to provide a reduction of the autocorrelation time for the Markov walker. Closer analysis shows that the algorithm can be improved by minimizing the round-trip time between low and high order states (Dayal et al., 2004; Trebst et al., 2004). In practice, flat-histogram sampling turned out to be very efficient at obtaining symmetrized, paramagnetic Green’s functions (Gull, 2008). The fact that most configurations sampled have low order and contribute next to nothing to the observables is compensated by the fact that they are quick to sample due to the O(k 2 ) scaling of the matrix operations. A further important application of the flat-histogram methods is the calculation of thermodynamic potentials. The grand potential of the Hubbard model on a finite lattice at temperature T , for example, can be found by the integration of the equation dΩ = −N dµ + DdU , because the quantities on the right hand side can be measured in a standard simulation. Such a procedure, however, requires several simulations for a finite U range. For the Hubbard model (with no DMFT self-consistency) a more elegant and efficient way (Li et al., 2009) is to employ flat-histogram methods to obtain the partition function Z directly, as the zero order term for Z is just 1. Given the reweighing factors λ(k) and frequency Pk with which different perturbation orders have been visited during the random walk,Pthe partition function is computed as Z = (P0 λ(0))−1 Pk λ(k). Knowing Z, all thermodynamic potentials can be calculated. As an example, we present in Figure 21 the graph for the entropy of a 4 × 4 Hubbard cluster. If a converged solution of an impurity model is available the same technique may be used to compute the impurity model partition function, but relating this to the thermodynamic properties of the full lattice model requires taking into account the variation of the bath density of states. This has not yet been explored. 36 Computing the exponential of a matrix (Moler and Loan, 2003) is an expensive operation. In the following we transform to the eigenbasis of the local Hamiltonian by diagonalizing it. In the eigenbasis, exp(−Hτ ) is diagonal. The (formerly sparse) local creation and annihilation operators become dense matrices. 1.4 2ln2 1.2 S 1 0.8 0.6 1. Block diagonalization U=4 U=1 0.4 0 0.5 1 1.5 2 2.5 T/t 3 3.5 4 4.5 5 FIG. 21 Entropy per site of the half-filled Hubbard model with t = 1, computed for a periodic 4 × 4 cluster at different temperatures. From (Li et al., 2009). F. Computation of the trace for general interactions in the hybridization expansion occupation number basis [Hloc , Sz ] = 0 = [Hloc , N ] further symmetries N1 N2 N3 N4 Hloc Hloc The local Hamiltonian Hloc has symmetries. While these symmetries are dependent on the exact form of the local Hamiltonian, usually the total particle number Ntot , the total spin z-component Sz and rotational or translational symmetries of the impurity Hamiltonian z are conserved: [Hloc , Ntot ] = 0 = [Hloc , Stot ]. This implies that the local Hamiltonian may be decomposed into a block-diagonal form, containing several blocks with size nblock ≪ nloc . This procedure is illustrated in Fig. (22). The advantage of changing to a block-diagonal form (Haule, 2007) is that operators di , d†j are also in blockmatrix form (see Fig. 22, 23). The operator d†i↑ , for example, raises both the total particle number and the total Sz -component by one and therefore consists of offdiagonal blocks connecting the (Sz , n) - symmetry sector with the (Sz + 1, n + 1) - sector. As the most expensive part of the code isP the computation of matrix products, which scales as O( block n3block ) or O(n3max block ) instead of O(n3loc. Ham ), the advantage of using symmetries is obvious (Gull et al., 2008b; Haule, 2007). Hloc d†↑ FIG. 22 Sketch of results of applying rotation / block diagonalization operations to the local Hamiltonian. The Hamiltonian in the occupation number basis (depicted in the left panel) is sparse but not blocked. A first permutation operation builds blocks according to the occupation number and spin of the local Hamiltonian, leading to a Hamiltonian (depicted in the middle panel) which is nearly block diagonal with dense blocks. A second (rotation) matrix further reduces block size by considering rotational and translational invariance of the impurity Hamiltonian, leading to the sparse block structure shown in the right panel. In the general formulation of the hybridization expansion, as derived in Eq. (93), the principal computational difficulty is the evaluation of the trace of a product of operators and exponentials of the local Hamiltonian. In a given basis this corresponds to taking the trace of a product of O(k) (large) matrices that have the linear size nloc of the local Hilbert space. Matrix-matrix multiplications of matrices with size nloc scale as O(n3loc ). It is therefore important to find a way both to reduce the size of the matrices that need to be multiplied as well as the number of matrix-matrix multiplications that have to be performed. n↑ =1 n↓ =1 n↑ =1 n↓ =0 {  n↑ =0 n↓ =1  n↑ =0 n↓ =0 { Tr[ e−Hloc τ1 d†↑ d↓ e−Hloc τ2 d↓ e−Hloc τ3 · · · ] FIG. 23 Sketch of one of the optimizations in the general representation: Four symmetry sectors are drawn, for which Sz and N are different. After the trace of d†↑ and d↓ is taken only one of the symmetry sectors still contributes. In the implementation, we first identify which symmetry sectors contribute, and then compute the matrix product and trace only for these sectors. Additional symmetries vastly simplify the computation. A typical example is the four-site Hubbard plaquette with next-nearest neighbor (t′ −) hopping. The local Hamiltonian has a size of 256 × 256 elements (44 local states). However, H commutes with n↑ , n↓ and has a four-fold rotational symmetry (or a couple of inversion and mirror symmetries). This allows us to split up the 256 × 256 matrix into 84 small blocks that have at most 16 × 16 elements. 37 An appropriate basis choice also allows insight into the physics. The CT-HYB formalism allows one to determine which impurity model states make the dominant contribution to the computation of an observable, and the matrix of eigenstate occupation probabilities is the projection of the density matrix onto the localized orbital basis. This information is much more easily interpreted if a physically motivated, symmetry-related basis choice is made. For examples see Figs. 30 and 42. Hloc 0 β bin 1 bin 2 p (τ D p (τ A B − ) τA B p (τ C − τC bin 3 D ) p(τE − τD bin 4 ) − τE p (τ F −τ B) C E ) F p(τG −τ F) G FIG. 24 Top √ the k operp panel: Binning algorithm: binning of ators into hki bins, each having approximately k elements, reduces the effort of computing p the trace after inserting or removing an operator to O( hki). Bottom panel: binary tree for the tree algorithm, O(loghki). (Ref. (Gull, 2008)) 2. Basis truncation As noted in Haule (2007), in situations where the local Hilbert space is prohibitively large, e.g. in the case of large multi-orbital problems or clusters, the computation of the trace is only feasible if the size of the local Hilbert space is reduced by an appropriate truncation of the basis. In systems with very highly excited states that are unlikely to contribute (e.g. the 5, 6 or 7 -electron states in Cerium), it is common practice to simply truncate the local Hilbert space and eliminate these states entirely. The same can be done for high energy / high momentum states in clusters. In addition to that, the highest few excited states of the local Hamiltonian in a particular symmetry sector may be truncated. Simple truncation based on some a priori criterion is an uncontrolled approximation justified only by a need to solve a particular problem with available resources. Truncation based on the eigenvalues of the local Hamiltonian only is especially dangerous, as the hybridization may broaden and shift levels. Truncation is likely to introduce systematic errors. Short excursions into infrequently visited states are often needed to produce transitions between frequently visited states. For example, in the large U Anderson impurity model it is the rare transitions into the states with n = 0, 2 that produces spin flips. If truncation is to be used, it is advantageous to do so in two steps: First one does a short simulation, keeping as many states as feasible, while keeping a histogram of visited states. Even if this simulation is not fully thermalized or long enough to allow accurate measurements, it will enable an identification of frequently and infrequently visited states, which may be used to construct a truncated Hilbert space for extensive simulation. A “dynamic” truncation method that speeds up the calculation of the trace without introducing errors involves checking if exp(−∆τ Hloc ) falls below machine precision or some other threshold, and if so not computing the remainder of the product of that particular part of the trace. Unlike in the “static” truncation case described above, short excitations into highly excited states are still possible, but the computational gain is significantly smaller. 3. Binning and tree algorithms for the hybridization expansion The most expensive part of the algorithm is the computation of the trace, which is linear in the numbers of hybridization operators present in the configuration. Computing the complete trace in the general case will be O(k), as each operator matrix must be accessed at least once. However, recomputing the trace after an operator insertion or removal update allows simplifications: A first step √ is trivial to implement and reduces the effort k): the operator trace is chopped into around to O( p hki intervals between zero and β. We then store the matrix product of all the operators within this interval, p such that each sub-interval contains approximately hki operators. If we insert two operators, we will change the matrix product of one or two √ intervals - which need to be The whole rerecomputed at the cost of k operations. √ compute operation is therefore of O( k), and a sweep of O(k 3/2 ). This algorithm is illustrated in the upper panel of Figure 24. A better, but more complicated algorithm uses the properties of self-balancing binary trees. AVL (Adelson-Velskii and Landis, 1962a,b; Knuth, 1997) trees are one possibility. Denoting dense matrices from the hybridization operators with capital letters and the exponential vectors p(τi+1 − τi ) = e∆τ H0 = pi,i+1 with lower case letters, we can write the trace in Eq. (93) as i h (202) Tr pi0A Aij pjAB Bjk pkBC Ckl plCD · · · Zpi piZβ , and arrange all the operators in (202) in a binary tree. It is easy to see that for every exponential p(τ → τi+1 ) = eH0 (τi −τi+1 ) between the first and last operator we can assign one of the branches of the tree. These “propagators” from time τi to time τi+1 , where a right branch contains the propagator from the node to the smallest time of the right subtree, and a left branch contains the propagation from the largest time of the left subtree to the node (Fig. 24). The main idea of the algorithm is that each node stores products of the matrix product of the left subtree times the propagator to the left, the operator, and the 38 propagator to the right times the matrix product of the right subtree. This requires an extra storage cost of O(k) in memory and additional functions for the re-balancing of binary trees, but reduces the computational effort of a sweep to O(k log k). G. Use of symmetries, global updates Global updates are updates that affect many or all of the vertices of a configuration. Two simple examples are a spin-flip of all auxiliary spins in a CT-AUX simulation, and the exchange of all the segments of two orbitals in CT-HYB. If the update corresponds to an exact symmetry (global spin-flips in a paramagnetic system, segment interchange for degenerate orbitals, . . . ), the weight of a configuration remains unchanged and a proposal of the global update will always be accepted. In cases with exact symmetries the same effect may be achieved by enforcing the symmetry at the end of the simulation. Global updates are useful for the systems with weakly broken symmetries, in particular near a phase transition, where they may help to radically reduce autocorrelation times. An example are weakly spin-polarized states. In all instances considered in the literature so far, global updates required the recalculation of determinants to estimate acceptance probabilities, at the cost of O(k 3 ) operations. Hence they should be performed at most once per hki update steps. The concept has proved to be useful to describe an insulating state with a small polarization in Refs. (Poteryaev et al., 2007, 2008), and similarly in (Chan et al., 2009; Kuneš et al., 2009). H. Vertex functions For some applications, in particular the determination of phase boundaries, response functions, or “dual Fermion” (Rubtsov et al., 2008), “DΓA” (Toschi et al., 2007) and other (Kusunose, 2006; Slezak et al., 2009) extensions beyond dynamical mean field theory, the expectation value of observables with four (or even six) creation and annihilation operators are needed. An example is Γ4abcd (τ1 , τ2 , τ3 , τ4 ) = hTτ d†a (τ1 )db (τ2 )d†c (τ3 )dd (τ4 )i. These correspond to reducible vertices. Time translation invariance implies that the four-point vertex is dependent on three time differences or three frequencies. Orbital or cluster symmetries of the impurity model may further reduce the number of independent indices abcd. Nevertheless, for most these problems the number of observables that need to be measured – especially for clusters or multi-orbital problems – is overwhelming: In a single orbital model, retaining 100 Matsubara frequencies in each of the three momentum indices requires obtaining and storing results of 106 mea- surements. In a four-site cluster calculation, the same number would lead to 64 million observables. In the CT-AUX and CT-INT algorithms, the four point correlation functions are computed using the fact that for a fixed auxiliary spin configuration the problem is Gaussian and Wick’s theorem can therefore be used together with Eq. (83). Thus the problem reduces to the accumulation of the determinant of a 2 × 2 matrix (Gull et al., 2008a; Rubtsov et al., 2005) * + {s ,τ ,x } {s ,τ ,x } (G012 + G01k Mkl i i i G0l2 ) (G014 + G01k Mkl i i i G0l4 ) {s ,τ ,x } {s ,τ ,x } (G032 + G03k Mkl i i i G0l2 ) (G034 + G03k Mkl i i i G0l4 ) (203) {si ,τi ,xi } with Mkl defined in Eq. (84). If only a few correlation functions are measured, Eq. (203) is best evaluated at run-time. If many or all correlation functions have to be measured at nτ time points and the size nM of M is comparatively small, it is advantageous to accumulate {s ,τ ,x } {s ,τ ,x } {s ,τ ,x } only hMij i i i i and hMij i i i Mkl i i i i and reconstruct the correlation function at the end of the computation. While binning the latter expression is O(n3τ ) in memory, it is only O(n3M ) computationally (using the time translation symmetry). For larger problems, in particular cluster problems, Gab (ω1 , ω2 ), the instantaneous single-particle Green’s functions, are computed directly in frequency (and in DCA: momentum) space for a given spin configuration. The four-point functions are then obtained by computing the Monte Carlo average of the instantaneous Green’s function products Γabcd = hGab Gcd − Gad Gcb i and using symmetries to reduce the number of observables. For many problems, only some of the four-point functions are needed (e.g. only the ones with energy transfer 0, or diagonal cluster momenta in DCA). Direct frequency measurement allows selective measurement of only these observables. In the CT-HYB, configurations with four local operators are generated by removing two hybridization lines from configurations of the partition function, similar to how Green’s function configurations are generated by removing one hybridization line. Some applications (see, e.g., (Slezak et al., 2009; Toschi et al., 2007)) require the computation of irreducible two-particle quantities. In order to obtain these vertices from the reducible ones, Bethe-Salpeter equations have to be inverted. This process can be numerically unstable, and how it is best done is currently still an open question. Our experience is that vertices measured directly in frequency space by the CT-AUX and CT-INT methods are most accurate, so that this is the method that should be used. 39 I. High frequency expansions of the self energy In many of the CT-QMC algorithms the self energy is obtained as the difference between the inverses of the full (G) and bare (G 0 ) Green functions. Because both of these become small at high frequencies while their errors stay constant, the errors of the difference of the inverses become large and Σ(ω) is difficult to measure accurately for large ω. It is therefore useful to have an analytical representation of the self energy at high frequencies. In a general N orbital model the self energy is an N × N matrix Σ and its high frequency expansion is   1 1 . (204) Σ1 + O Σ(iωn ) = Σ∞ + iωn ωn2 The coefficients Σ0,1 may be obtained by from the coefficients in the high frequency expansions of the full and bare Green functions (Potthoff et al., 1997) using 1 1 1 G0 + G1 + G2 , G(iωn ) = 2 iωn (iωn ) (iωn )3 (205) −1 the analogous equation for G 0 , and for Σ = G 0 − G−1 . The coefficients in the high frequency expansions of G and G 0 are in turn obtained from the discontinuities in the derivatives of G and G 0 across τ = 0 as Gn = ∂τ(n) G(τ = 0+ ) − ∂τ(n) G(τ = 0− ). (206) The time derivatives themselves may be computed from the definition D E Gab (τ − τ ′ ) = − Tτ da (τ )d†b (τ ′ ) (207) = (fermion antisymmetry implies that I a1 a2 b1 b2 −I a2 a1 b1 b2 ). Important for the self energy are the commutators with the interaction term, which are (bearing in mind the antisymmetry) i h aa1 b1 b2 † ˆ da = 2 P da1 db1 db2 , (213) Jˆa ≡ I, a1 b1 b2 I h i P a1 a2 b1 b † † ˆ d† = 2 Jˆb† ≡ I, da1 da2 db1 . (214) b a1 a2 b1 I Expanding and comparing terms we find that the constant term in the self energy is the familiar Hartree term Σab ∞ = 4 X I aa1 b1 b d†a1 db1 , (215) a1 b1 while Σab 1 = Dn Jˆa , Jˆb† oE . (216) The expectation values in Eq. (215) and (216) must in general be measured. For the single-orbital Anderson impurity model we find (with a chemical potential shift of U/2 usually employed) (Blümer, 2002; Comanac, 2007; Knecht, 2003)     U2 1 1 + . hn−σ i(1 − hn−σ i) + O Σ(ω) = U hn−σ i − 2 iωn iωn2 (217) Expressions for multi-orbital models with density density interactions are derived in (Gull, 2008), for plaquette CDMFT in (Haule, 2007; Haule and Kotliar, 2007b), and for multi-orbital models with the Slater-Kanamori form of interactions in (Wang, 2010). by performing a small-time expansion of da (τ ) = eHτ da e−Hτ (208) and its conjugate. The structure of the second derivative term is simplified by exploiting time translation invariance to place the derivative on the first or second operator as appropriate. The result is D E † (209) Gab 0 = {da , db } = δab , D E † Gab (210) 1 = {[H, da ] , db } , D h i E † Gab (211) 2 = {[H, da ] , H, db } . The coefficients for G 0 are obtained using the Hamiltonian without the interaction term. We illustrate the procedure for the generic Hamiltonian X X H= I a1 a2 b1 b2 d†a1 d†a2 db1 db2 (212) E ab d†a db + ab a1 a2 b1 b2  X X + Vkαb c†kα db + H.c. + εkα c†kα ckα kαb kα XI. APPLICATIONS I: DMFT A. Overview Dynamical mean field theory (DMFT) provided an important initial motivation for the development of CTQMC impurity solvers and is perhaps the domain to which the new solvers have made the most important contributions. We therefore consider dynamical mean field applications in some detail. We do not review the dynamical mean formalism in detail here, instead referring the reader to reviews of the original “singlesite” (Georges et al., 1996) and subsequent “cluster” (Maier et al., 2005a) formulations (see also (Potthoff, 2006)) and to reviews of the combination of the formalism with modern electronic structure theory which provides an important step towards an ab-initio description of strongly correlated compounds (Held, 2007; Kotliar et al., 2006). However, for clarity we provide a brief explanation of the essential ideas. A common strategy in theoretical physics is to obtain an approximation to the solution of a problem in terms of 40 a solution of a more tractable auxiliary problem, which is specified by a self-consistency condition. Weiss meanfield theory and density functional band theory are examples. Dynamical mean field theory provides an approximate solution of a lattice fermion problem in terms of an auxiliary quantum impurity model with interaction terms specified by the interactions in the original lattice model and single particle energies and hybridization functions determined by a self-consistency condition. One may think of it as based on an approximation of the full self energy Σab (k, ω), which depends on a discrete set of orbital labels a, b and continuous momentum and frequency variables k, ω, in terms of N functions of frequency Σj=1...N (ω) which are the self energies of an N orbital impurity model. Different ’flavors’ of dynamical mean field theory correspond to different prescriptions for reconstructing the lattice self energy from the Σj (ω) and to different forms of the self-consistency condition. All formulations require a solution of the quantum impurity model which is of high and reasonably uniform accuracy over a wide frequency range. It is not unfair to say that it is the development of CT-QMC techniques that has given DMFT the computational power needed to address the full range of problems arising in the physics of correlated electron physics. The first applications of dynamical mean field theory were “single-site DMFT” approximations to the physics of model systems such as the one orbital Hubbard model and the one orbital Anderson models. (Georges et al., 1996) For these two cases the auxiliary impurity model is the single-impurity Anderson model (Eq. (12)) which can be solved to sufficient accuracy for most purposes by pre-CT-QMC techniques, in particular the Hirsch-Fye approach. While the greatly improved efficiency of CTQMC methods has enabled a more refined study of some aspects of the physics and has shed light on some special cases, the single-site, single-orbital case has mainly served as a test-bed for investigating and evaluating CTQMC methods. A second class of DMFT applications is the “cluster” extensions (Maier et al., 2005a), which can treat the short ranged correlations characteristic of high temperature superconductors and other low dimensional systems. In the single-site DMFT method the self energy is replaced by its average over the Brillouin zone. Cluster DMFT methods allow for some coarse-grained momentum dependence and include some aspects of intersite correlations. They are thus of interest in the context of understanding the strong momentum space differentiation observed in high-Tc cuprates and other low dimensional systems. As of this writing, most of the “cluster-DMFT” literature has focused on models with a “Hubbard” interaction. For models with Hubbard interactions considerable progress has been made by the use of Hirsch-Fye (Jarrell et al., 2001; Macridin et al., 2006; Maier et al., 2005a,b; Vidhyadhiraja et al., 2009) and exact-diagonalization(Kancharla et al., 2008; Kyung et al., 2006b; Liebsch et al., 2008; Liebsch and Tong, 2009) methods (but see (Koch et al., 2008)). However, the more efficient CT-QMC methods have permitted the examination of much wider regions of phase space, which has led to new results and insights. A third class of DMFT applications is to the study of materials such as transitional metal oxides and actinides with partially filled d or f shells.(Held, 2007; Held et al., 2006; Kotliar et al., 2006; Kotliar and Vollhardt, 2004) In these materials multiplet interactions such as Eq. (13) are crucial to many aspects of the physics. CT-QMC methods have provided the first reliable solvers for this class of models and have yielded new insight into their physics. A fourth class of DMFT applications are extensions such as the “dual fermion” and “dynamical vertex” approximations (Kusunose, 2006; Rubtsov et al., 2008; Slezak et al., 2009; Toschi et al., 2007). These methods require the accurate calculation of the full four-point vertices of impurity models, and this computationally challenging task seems feasible only with CT-QMC methods. In the rest of this section we summarize the applications in the order presented above, and close with remarks about future challenges. B. Single-site DMFT approximation to the single orbital Hubbard model An important early success of single-site dynamical mean field theory was an improved understanding of the “Mott” or correlation-driven metal insulator transition. This is one of the fundamental questions in electronic condensed matter physics (Imada et al., 1998; Mott, 1949). The essential physics is captured by the one-band Hubbard model, specified by a hopping tij between sites i and j and an on-site interaction U : X X ni↑ ni↓ . (218) tij c†iσ cjσ + U H= ij i It has been known for many years (Imada et al., 1998) that at a carrier concentration n = 1 per site the model exhibits a paramagnetic metal to paramagnetic (’Mott’) insulator transition as the interaction strength U is increased above a critical value of the order of the bandwidth. The state obtained by doping the large U Mott insulating state has many unusual properties. A single-site dynamical mean field theory of the Hubbard model was formulated in Ref. (Georges and Kotliar, 1992). As shown by Müller-Hartmann (1989) and by Metzner and Vollhardt (1989), it becomes exact in a limit of spatial dimensionality d → ∞ and is believed to be reasonably reliable in d = 3. (Kotliar and Vollhardt, 2004) corrections are significant in d = 2 and d = 1. The corresponding quantum impurity model is Eq. (12). 41 0.525 single site 0.25 0.4 exact diagonalization continuous-time Hirsch-Fye 0.52 0.38 0.515 0.36 0.51 0.34 0.15 Bad Bad metal insulator -Ekin/t T/t 0.20 0.00 -0.2 0.5 0.3 0.28 U/t=4.95 (right scale) 0.49 Paramagnetic insulator Fermi liquid 0.32 U/t=4 (left scale) 0.495 0.10 0.05 0.505 0.26 0.485 0.24 0.48 -0.1 0.0 0.1 0 0.2 0.0005 0.0015 0.002 0.0025 0.22 0.003 2 U r/t (T/t) 0.25 U/t=6.5 U/t=6.0 1/2 0.2 (0.5-n) FIG. 25 (Color online) Metal-insulator phase diagram of paramagnetic two dimensional Hubbard model in the single site DMFT approximation, plotted against normalized interM IT with UM IT = 9.35t for this action strength Ur = U −U UM IT model. The transition is first order, with coexistence region indicated by shading (yellow online). The dashed line indicates the bad metal/bad insulator crossover determined from the condition that the imaginary part of the self-energy at few lowest Matsubara frequencies is flat at the crossover value of U . From Ref. (Park et al., 2008a). 0.001 U/t=5.8 U/t=5.7 U/t=5.6 0.15 0.1 0.05 0 0.5 Studies prior to the advent of CT-QMC established that in the single-site dynamical mean field approximation the phase diagram at half filling involves a first-order transition with a critical end-point in the T − U plane and a higher temperature crossover regime, as shown in Fig. 25. Physics beyond the single-site approximation will correct the phase diagram. In two spatial dimensions the change is qualitative, but in higher dimension the changes are less severe and the single-site phase diagram remains relevant. The scales are low, presenting a challenge to computational methods. The metal-insulator transition may P be characterized by the ‘kinetic energy’, essentially h ij ti−j c†iσ cjσ i, which gives a measure of the degree to which electron motion is blocked by the interaction U (Millis, 2004). At low T the transition from insulator to metal is marked by the appearance of a very narrow band of quasiparticle states inside the gap, which itself remains well formed for a range of U below the transition. These states form a Fermi liquid, but with very low Fermi temperature. Theoretical arguments (Fisher et al., 1995; Kotliar et al., 2002) established that the doping driven transition is also first order at low T , marked by the sudden appearance of states inside the Mott gap. However, the transition in this case is only weakly first order and for many years proved difficult to observe. These and other somewhat unusual features of the phase diagram occur because in the single-site approximation the paramagnetic insulating state has an extensive entropy of ln 2 per site (Georges et al., 1996). In the physical situation the entropy will be quenched below some scale, but in real three dimensional materials the scales may be low 0.6 0.7 0.8 µ/(U/2) 0.9 1 FIG. 26 Top panel: Kinetic energy obtained using the indicated impurity solvers plotted as a function of temperature for the Hubbard model with a semicircular density of states and bandwidth 4t and interactions indicated. For U = 4t the model is in a strongly renormalized metallic phase, for U = 4.95t a low-T metal to higher T insulator transition occurs, visible as a jump in kinetic energy at (T /t)2 ≈ 0.0007. From Ref. (Werner et al., 2006). Bottom panel: Doping per spin, 0.5 − n, as a function of chemical potential for βt = 400 and indicated values of U/t. At this temperature, the transition at half-filling (µ = µ1 = U/2) occurs at Uc (T ) ≈ 5.65. For U > Uc the n = 1 state is insulating and shifting the chemical potential induces a first order metal-insulator transition visible as a discontinuity in n(µ). From Ref. (Werner and Millis, 2007a). enough that the single-site phase diagram remains experimentally relevant.(Kotliar and Vollhardt, 2004) The top panel of Fig. 26 shows results from the first CT-HYB study of the interaction driven metal-insulator phase transition (Werner et al., 2006). It compares the kinetic energy calculated in the single-site dynamical mean field theory for the one band Hubbard model via a Hirsch-Fye simulation, an exact-diagonalization method, and the CT-HYB method. One see that the CT-HYB method agrees with the other methods (where there is overlap), allows access to very low temperatures, clearly reveals the T 2 behavior associated with a strongly renormalized Fermi liquid and captures the firstorder Mott transition. The bottom panel, taken from (Werner and Millis, 2007a) shows the dependence of car- 42 rier concentration on chemical potential for interaction strengths above and below the Mott transition providing the first clear verification that the doping-driven Mott transition is first order. Results such as these established that the CT-HYB method provides a successful description even of subtle, low temperature properties of impurity models. Another long-standing question in correlated electronic theory was Nagaoka’s prediction (Nagaoka, 1966) of ferromagnetism in the Hubbard model at carrier concentrations very near to half filling and very strong interactions. The status of this result was unclear for many years because Nagaoka’s original arguments applied rigorously only to one hole in a Mott insulator, not to a thermodynamic density of holes. Park, Haule, Marianetti and Kotliar used the CT-HYB method to establish the existence of a thermodynamic Nagaoka phase (Park et al., 2008b), at least in the d = ∞ limit. While quantum Monte Carlo methods are most effective for imaginary time (thermodynamic) simulations, it is of course very important to attempt to obtain spectra which can be compared to experimental response functions. The standard method is maximumentropy analytical continuation of the imaginary time data (Jarrell and Gubernatis, 1996). One question of particular importance has been the value of the insulating gap in the strong correlation limit at half filling. Here a weakness of the CT-QMC methods reveals itself: because the Green function is numerically very small in the middle of the imaginary time window, the simulation does not visit this region much and the statistics are relatively poor. But it is precisely this region which is important for the value of the insulating gap. Straightforward analytical continuation of the Green function leads to broadened gap edges. Ref. (Wang et al., 2009) discusses the issue in detail, arguing that one should instead continue the self energy and construct the green function from the continued self energy. C. Cluster dynamical mean field theory of the single orbital Hubbard model. The single-site dynamical mean field theory neglects spatial correlations and while it becomes exact in an appropriately defined infinite dimensional limit (Metzner and Vollhardt, 1989) it is known to provide an insufficient description of the metal insulator transition in finite dimensional models. Deviations from the single-site dynamical mean field picture are particularly large in the case of two spatial dimensions relevant for high temperature superconductivity. A striking and still ill-understood feature of hole-doped high temperature cuprate superconducting materials is the ‘pseudogap’, a suppression of the electronic spectral function occurring for momentum states along the Brillouin zone face FIG. 27 Comparison of momentum-dependence of the selfenergy of the two-dimensional Hubbard model with parameters U/t = 4, µ/t = 3.1 and T /t = 0.4 calculated at the Matsubara frequency ω0 = ξ = π/β calculated for along the cut (0, 0)−(π, 0)−(π, π)−(0, 0) in the first Brillouin zone using a numerically exact “diagrammatic Monte Carlo” procedure and using CT-AUX simulations of single site and 4, 8, 16, and 32-site DCA DMFT approximations. From Ref. (Kozik et al., 2010). but not for states along the zone diagonal. (In electron doped cuprates a phenomenologically somewhat different effect, confusingly also sometimes termed “pseudogap” is now understood as arising from proximity to a state with long-ranged two sublattice antiferromagnetic order (Armitage et al., 2010; Kyung et al., 2004; Motoyama et al., 2007; Zimmers et al., 2005)). The pseudogap is a dramatic example of the more general phenomenon of ‘momentum space differentiation’: an increase in the variation of physical quantities around the Fermi surface as the insulating phase is approached. Its origin and consequences remain hotly debated topics. Attention in recent years has focused on cluster dynamical mean field theories (Maier et al., 2005a), which capture at least some aspects of spatial correlations. These methods have produced a range of very exciting results with strong qualitative similarities to the cuprates (Tremblay et al., 2006) but are computationally very demanding.To date, cluster dynamical mean field approximations have mainly been used to study the single-band, two dimensional Hubbard and t − J (Zhang and Rice, 1988) models, although some work on Hubbardlike models related to heavy fermions has appeared (Sun and Kotliar, 2005). Significant results were obtained with approximate analytical and semi-analytical methods (Chakraborty et al., 2008; Kyung et al., 2006b; Parcollet et al., 2004), exact diagonalization (Civelli et al., 2005; Kancharla et al., 2008; Koch et al., 2008; Liebsch et al., 2008; Liebsch and Tong, 2009) and Hirsch-Fye QMC (Huscroft et al., 2001; Jarrell et al., 2001; Lichtenstein and Katsnelson, 2000; Macridin et al., 2006; Maier et al., 2005a, 2007a, 2005b, 43 plaquette 0.25 T/t=0.01 T/t=0.3 0.20 T/t 2006, 2007b; Vidhyadhiraja et al., 2009) approaches. While ED results have been reported only for clusters up to 4 sites, Hirsch-Fye approaches have been extended up to clusters of size 64 (at weak interaction strength) (Moukouri and Jarrell, 2001) and 16 (Vidhyadhiraja et al., 2009) (at moderate to strong interaction strength) although the magnitude of the computations required meant that studies were restricted to select dopings. The advent of CT-QMC methods greatly increased the ranges of parameters that could be studied with reasonable computational resources. Scans of parameter space became feasible and phase diagrams have been established. Hybridization expansion methods have been used to study 2-site (Ferrero et al., 2009b,a) and 4-site (Gull et al., 2008b; Park et al., 2008a) clusters. In the case of these small clusters, the analysis of cluster eigenstate occupation probabilities has provided new insights. For larger clusters the dimension of the local Hilbert space is so large that the hybridization expansion method has not been successfully applied. CT-AUX methods have been used to study 8 site clusters (Gull et al., 2009; Werner et al., 2009a) and, at U = 4t, a range of cluster sizes up to 32 (Kozik et al., 2010). We present here a few representative CT-QMC cluster DMFT results which illustrate the power of the methods and the nature of the new results which have been obtained. The convergence of cluster schemes with cluster size is shown in Fig. 27, which compares CT-AUX cluster DMFT results are to a direct Monte Carlo evaluation (“diag-MC”) of diagrams of the lattice problem (Kozik et al., 2010). While “diagMC” as of now only works for relatively weak interactions, the results do not contain a k-space discretization of the self energy, so that the results are exact within error bars. As seen in Fig. 27 for U = 4, convergence of the cluster DMFT results to the exact ones is achieved with 32 sites. We now turn to results relating to stronger coupling physics, beginning with results obtained for four site clusters, which have been studied using CDMFT (Kotliar et al., 2001) and DCA (Hettler et al., 1998) versions of cluster dynamical mean field theory. The foursite cluster calculations may be thought of as approximating the full momentum dependence of the self energy by its value at the four points S = (0, 0), Py = (0, π), Px = (π, 0) and D = (π, π). The physics brought by the added momentum dependence changes the character of the Mott transition in dimension d = 2. Fig. 28 shows the phase diagram of the two dimensional Hubbard model obtained in a detailed CT-HYB study of the 4-site CDMFT approximation (Park et al., 2008a). It should be compared to Fig. 25 which presents single-site DMFT results for the same model. The interaction-driven transition was found to be first order, as in the single-site case. However, not only is the critical interaction strength much less than Bad metal 0.15 0.10 0.05 0.00 -0.2 Fermi liquid -0.1 Bad insulator U c1 U MIT U c2 Paramagnetic insulator 0.0 0.1 0.2 U r/t FIG. 28 (Color online) Metal-insulator phase diagram of the paramagnetic phase of the two-dimensional Hubbard model in the plane of temperature T /t and interaction U/t measured relative to the critical end-point value UMIT = 6.05t in the 4 site CDMFT cluster approximation. Band parameters are identical to those used in Fig. 25 Inset: pie-chart histogram of occupancy probability of the two insulating states at low and high temperatures. From Ref. (Park et al., 2008a). in the single-site approximation, but the phase boundary bends in the opposite direction from that found in the single site calculation, indicating that in the multi-site approximation the insulating phase has lower entropy than the metallic phase. The narrow band of in-gap states whose appearance characterizes the Mott transition in high dimension(Fisher et al., 1995; Kotliar et al., 2002) is not found in cluster calculations for 2D systems. Insight into the metal-insulator transition is enhanced by the ability of CT-HYB to provide sector occupation statistics (Haule, 2007). These are indicated in Fig. 28 by pie-chart insets. The low temperature insulating phase was found to be characterized by a strongly dominant occupation of one state, corresponding to a singlet configuration of the four electrons on the plaquette. This correlation was argued by Gull et al. (Gull et al., 2008b) to indicate that in the cluster dynamical mean field methods the metal-insulator transition was driven by the appearance of strong short ranged order (most likely related to a columnar dimer phase). By contrast, the high temperature “bad insulator” state, which has entropy of the order of ln(2), populates many states of the plaquette with significant probability. Further evidence of the importance of short ranged order was obtained from the electron spectral functions (Gull et al., 2008b; Park et al., 2008a) computed by maximum entropy analytical continuation and shown in Fig. 29. The insulating state has a gap. The dotted line gives the spectral function calculated in a mean field approximation based on a two sublattice order; the strong similarity indicates that short ranged order is responsible for the insulating behavior. The left panel of Fig. 30 presents the changes in the density of states in the P = (0, π), (π, 0)-sector as elec- 44 Atot(ω) AP(ω) Mean Field Atot Mean Field AP DOS 0.3 0.2 0.1 0 -8 -4 -6 -2 0 2 4 8 6 Energy [t] FIG. 29 Solid line: on-site spectral function computed for different momentum sectors by maximum entropy analytical continuation of QMC data for U = 6t and doping x = 0. Dashed line: spectral function in the P = (0, π), (π, 0)momentum sector. Dotted and dash-dotted lines: P = (0, π), (π, 0) and local spectral functions obtained by performing the DCA momentum averages of the standard SDW mean field expressions for the Green function, with gap ∆ = 1.3t. From Ref. (Gull et al., 2008b). Probability 0.5 AP(ω) 4 electron singlet 4 electron triplet 5 electron spin 1/2 6 electron singlet 0.8 0.6 0.4 0.3 0.6 0.4 0.2 0.2 0.1 0 -8 -6 -4 -2 ω [t] 0 2 4 0 0 0.1 0.2 Doping FIG. 30 Left panel: Doping dependence of P = (0, π), (π, 0)sector density of states obtained by analytical continuation of quantum Monte Carlo data obtained from DCA approximation at U = 5.2t, temperature T = t/60 and dopings x = 0.04 (solid), x = 0.08 (dashed), and x = 0.15 (dashdotted line). Dotted line denotes the noninteracting density of states. Right panel: Evolution of the occupation probabilities with doping at U = 5.2t and temperature T = t/30. From Ref. (Gull et al., 2008b). trons are added. The curves are obtained by analytical continuation of quantum Monte Carlo data. The ‘Mott’ gap visible in Fig.. 29 has filled in even at the lowest doping shown, but for the lower dopings a small ‘pseudogap’ (suppression of density of states) appears near the Fermi level while for x = 0.15 the value of the spectral function at the Fermi level approaches that of the noninteracting model, indicating the restoration of Fermi liquid behavior, consistent with experiment and with many previous theoretical results. Examination of the sector statistics shown in the right panel of Fig. 30 indicated that the transition from pseudogapped to Fermi liquid behavior occurred at the doping at which the plaquette singlet state ceased to dominate the physics. An intriguing and still open question concerns the degree to which the level crossing in sector statistics is related to the ‘avoided criticality’ discussed by Haule and Kotliar (Haule and Kotliar, 2007a). Very recently CT-AUX methods have been used to examine the larger 8 site cluster shown in Fig. 31. The greater efficiency of the CT-AUX method permitted a comprehensive examination of the behavior as a function of interaction strength, carrier concentration, second neighbor hopping and temperature (Gull et al., 2009; Werner et al., 2009a). A striking new result is that both the interaction-dependent and doping-dependent metal insulator transitions are multi-staged, with different regions of the Fermi surface are successively gapped as carrier concentration or interaction strength are varied. (Similar behavior was also found in a 2 site cluster with a clever choice of momentum-space patching (Ferrero et al., 2009a)). The phase diagram for the interaction-driven transition is shown in the right-hand panel of Fig. 31. Identification of a gapped region in a spectrum can be based on analytical continuation. However, obtaining data of the requisite quality for analytical continuation is very expensive, and analytical continuation is in any event a notoriously ill-posed problem. Methods for identifying metal-insulator phase boundaries directly from imaginary time are therefore valuable. At present it appears that the most reliable method is to plot βGK (β/2) in momentum sector K, related toRthe density of states at AK (ω) dω the Fermi energy by βGK (β/2) = 2πT cosh[ω/(2T )] . This is shown as a function of interaction strength or chemical potential for several temperatures as shown in Fig. 32. The sector gapping transitions were identified from the temperature dependence of βGK (τ = β/2). One sees from Fig. 32 that a gap opens in sector C at lower µ than in sector B. Remarkably, this sector-selectivity occurs on the hole doped but not on the electron-doped side of the phase diagram. The successive gapping bears an intriguing similarity to the behavior of high-Tc cuprates in the pseudogap regime. The interpretation and implications of the CT-QMC results are at present the subject of active investigation. D. Dual-fermion calculations for the single-orbital Hubbard model Cluster dynamical mean field methods suffer from several drawbacks. A cluster of a given size corresponds to a coarse-graining either in real or reciprocal space, which may bias the physics. At model parameters relevant for high-Tc cuprates, the sign problem limits the range of cluster sizes that can be studied, even with CT-QMC methods, so that systematics of scaling with cluster size has been established only for weak interactions (Fig. 27, (Kozik et al., 2010; Maier et al., 2005a,b)). Alternative ways to handle nonlocal correlations have been proposed (Kusunose, 2006; Rubtsov et al., 2008; Slezak et al., 2009; Toschi et al., 2007). These methods 45 FIG. 31 Left panel: Brillouin zone partitioning associated with the 8-site cluster DCA approximation with definition of the four inequivalent momentum sectors A, B, C and D. The noninteracting Fermi surface for t′ = −0.15t and density n = 1 is indicated by the gray line. Right panel: sketch of the paramagnetic state DCA phase diagram of the Hubbard model, calculated for the cluster shown in the left panel at half filling, as a function of interaction strength U and next-nearest neighbor hopping t′ . A Fermi liquid metal phase (left, red on-line), a sector selective intermediate phase (middle, green on-line) in which the sectors labeled as C are gapped but those labeled as B remain gapless, and a fully gapped insulating phase (right, blue on-line) are shown. From Ref. (Gull et al., 2009). FIG. 33 Spectral function Aω=0,k at the Fermi level calculated for the Hubbard model at t = 0.25, t′ = −0.075, U = 4.0, β = 80 and doping x = 0.14 using the lowest order momentum-dependent diagram in the dual fermion method, with analytical continuation performed by polynomial extrapolation from Matsubara frequencies. An anisotropic destruction of the Fermi surface in the pseudogap regime is clearly visible. From (Rubtsov et al., 2009). 1.6 βt = 10 βt = 15 βt = 20 βt = 25 βGK(β/2) 1.2 site DMFT appear as diagrams containing the reducible vertex parts of single-site DMFT impurity problems at nodes, whereas lines are propagators for dual Green’s functions corresponding to non-local parts of the DMFT lattice Green’s function. 0.8 0.4 0 -3 -2 -1 µ/t 0 1 2 FIG. 32 βG( β2 ) calculated in DCA approximation to 8-site cluster for sectors B (full symbols) and C (empty symbols), at U/t = 7 and t′ /t = −0.15. The strong temperature dependence in the sector C curves arises from the Van Hove divergence in the density of states. The crossing points indicate the onset of gapping in the sectors. From Ref. (Gull et al., 2009). are systematic expansions around the single-site DMFT approximation, and have the advantage that both shortand long-range fluctuations are treated simultaneously, but require evaluation of vertex functions. We discuss here the dual fermion approach where CT-QMC methods have been extensively applied; the computational issues for the other methods are similar. The dual fermion approach (Rubtsov et al., 2008) is formulated as a standard diagrammatic technique in terms of auxiliary, so-called dual variables, introduced via a continuous HubbardStratonovich transformation. The corrections to single- Technically, the method requires an impurity solver that can provide not only single-electron Green’s functions of the (single-site) impurity problem, but also the full four point vertices (also of the single-site impurity problem) as a function of all frequencies. The CT-QMC algorithms allow such calculations in both the interaction and hybridization expansion formalisms and have been employed for dual-fermion analyses of the Hubbard model. In Ref. (Rubtsov et al., 2009) the pseudogap regime of the doped t − t′ Hubbard model was studied. A CT-INT solver was used to obtain both the Green’s function G and the 4-point vertex Γ(4) in the Matsubara-frequency domain. The spectral function Ak = −1/πImGω=0,k for the entire Brillouin zone is shown in Fig. 33 for 14% doping. The phenomenon of momentum space differentiation is clearly seen: the Fermi surface in the antinodal direction is relatively diffuse, whereas sharp quasiparticles appear near the nodal points. CT-INT was used in (Hafermann et al., 2009) to sum the particle-hole ladders in dual diagrams for the halffilled Hubbard model, revealing a pseudogap formed by antiferromagnetic correlations even in the absence of a explicit symmetry breaking. Further investigation of this and related approximations is an active area of research. Some of the results are presented in Fig. 34. DOS 46 0.175 0.15 0.125 0.1 0.075 0.05 0.025 −10 0.25 0.2 0.15 0.1 0.05 −5 0 ω 5 10 0 −6 −4 −2 0 2 4 6 ω FIG. 34 Momentum-integrated spectral function (DOS) of half-filled single-band Hubbard model calculated using the dual-fermion method. Left: Metallic (thin lines, red on-line) and insulating (heavy line, black on line) spectral function within the coexistence region of the Mott transition U/t = 6.25, T /t = 0.08. Right: DMFT (dashed), lowest-order dual fermion correction (thin line) and ladder dual fermion (thick line) DOS at U/t = 4, T /t = 0.19. The ladder dual fermion result exhibits a pseduogap of antiferromagnetic origin. From (Hafermann et al., 2009). XII. APPLICATIONS II: DMFT FOR MULTI-ORBITAL AND KONDO MODELS Most “correlated electron” materials involve transition metal, rare-earth or actinide states with multiply degenerate levels. The electrons in these levels are subject to complicated interactions such as the “Slater-Kanamori” couplings shown in Eq. (13) and exhibit a richer variety of physical effects than found in the single-orbital Hubbard model, including high-spin to low-spin transitions, orbital ordering and orbitally selective Mott transitions. Until recently investigation of these models was hampered by a lack of good numerical methods: there were no good auxiliary field transformations, so HirschFye methods could not be used unless the rotational symmetry of the interaction was broken so only the Jz (density) component of the spin exchange was retained and the “pair hopping” terms in the Slater-Kanamori Hamiltonian were neglected. There were too many states for exact diagonalization or NRG methods. The situation has now changed. Studies of realistic models of materials involving electrons in two or three-fold degenerate orbitals are straightforward, 5 orbital problems (i.e. the full d multiplet, needed e.g. for the pnictides) are manageable, and problems involving one electron or hole in the 7-fold degenerate f shell are becoming possible. However, a complete single-site DMFT computation for materials such as Pu in which the f shell is multiply occupied and rotationally invariant (exchange and pair-hopping) interactions are important cannot be done with present methods: basis truncations (Sec. X.F.2) or other approximations (for example the Krylov techniques discussed in section V.D) are required. The calculations are generally done with the hybridization solver because the interactions are strong and multiple interactions are important and so far have been restricted to the single-site dynamical mean field approximation because the proliferation of orbitals means that multisite models involve too many FIG. 35 (Color online) Main panel: phase diagram of the 3 band model with semicircular density of states at βt = 50 and J = U/6 in the plane of particle density n and interaction strength U . The vertical lines indicate the Mott insulating phases at integral values of n. The magnetic state is labeled by P (paramagnetic), F (ferromagnetic) and A (two sublattice antiferromagnetic) while the labels O(N) denote the 3 classes of orbital ordering discussed in (Chan et al., 2009). The heavy dashed line (orange on-line) gives the boundary of the non-Fermi-liquid frozen-moment phase discovered in (Werner et al., 2008). Inset: Hartree-Fock phase diagram for magnetic phases of the same model. Magnetic phase boundaries are indicated by solid lines and orbital ordering boundaries by dashed lines. OO and OS stand for the orbitally-ordered and orbitally-symmetric phases respectively. All transitions are second order except the FM-AFM transition and the orbital ordering transitions at U & 12t and small n. From Ref. (Chan et al., 2009). states to be practical at present. Figure 35 shows the phase diagram (Chan et al., 2009) calculated for a model of electrons moving among three degenerate orbitals with the full rotationally invariant interactions. The effect of the Hund’s coupling on the multiorbital Mott transition was determined and a rich multiplicity of phases has been found. The orbital degree of freedom is important to stabilize the metallic phase at relevant interaction strengths (the two orbital model with two electrons and J/U = 1/6 is insulating for U & 3.7t (Werner and Millis, 2007c)). Suppressing the L = 1 orbital angular momentum states by applying a crystal field rapidly leads to an insulator. A remarkable feature of the phase diagram is the line indicating an apparent quantum “spin freezing” transition with unusual properties (Werner et al., 2008). The phase exists only in the window 0 < J < U/3. For J = 0 the frozen moment phase does not exist while for J > U/3 the term U ′ − J = U − 3J in Eq. (13) changes sign and the physics of the model becomes different. The spin freezing transition was originally identified from an unusual behavior of the self energy and its nature was confirmed by an examination of the local spin and or- 47 <n1(0)n2(τ)>, <Sz(0)Sz(τ)> 0.25 the CT-HYB method. This ability has been used in a recent paper to gain important new insights into the “hidden order” phase of the heavy fermion material URu2 Si2 (Haule and Kotliar, 2009). n=1.21 n=1.75 n=2.23 n=2.62 n=2.97 0.2 0.15 0.1 0.05 A. Heavy Fermion compounds and the Kondo Lattice Model 0 -0.05 -0.1 0 5 10 15 20 25 τt FIG. 36 Imaginary time dependence of the spin-spin correlation function hSz (0)Sz (τ )i (positive correlation function, full symbols) and orbital correlation function hn1 (0)n2 (τ )i (negative correlation function, open symbols) calculated for a threeband model with U = 8t and J = U/6 using CT-HYB at carrier concentrations n indicated. The convergence of the spin correlations to a value different from 0 while the orbital correlation converge to zero indicates spin but not orbital freezing in the model. From Ref. (Werner et al., 2008). bital correlation functions. Figure 36 presents results for the imaginary-time impurity-model spin-spin and orbital-orbital correlators COO (τ ) = hO(τ )O(0)i with O representing either the P electron spin density Sz = 13 α 12 (d†α,↑ dα,↑ − d†α,↓ dα,↓ ) P or the orbital density n̂α = σ d†α,σ dα,σ . In a Fermi liquid at low temperature T , CSS (τ ) ∼ (T / sin(πτ T ))2 for imaginary times τ sufficiently far from either τ = 0 or τ = 1/T . The DMFT results are consistent with this form in the Fermi liquid phase, but in the non-Fermiliquid phase the spin-spin correlator CSS is seen to approach a constant at long times indicating the presence of frozen moments whereas the orbital correlator is seen to decay rapidly with time on both sides of the phase transition. The hybridization expansion solver yields information (Haule, 2007) on which of the different eigenstates of Hloc are represented in the partition function. At J > 0 at couplings (U & 4t) only a few states are relevant. The large-U density-driven transition is marked by a change in the dominant states from the one-electron states S = 1/2, L = 1 to a nine-fold degenerate manifold of two electron states with S = 1 and L = 1, with the two manifolds becoming degenerate at the transition. The interaction-driven transition is on the other hand marked by a change in the weight of the two subleading states S = 1/2, L = 1 and S = 3/2, L = 0, implying a change in the magnitudes of coupling strengths. The ability to combine measurements of response functions with an analysis of which states contribute appreciably to the partition function is a great advantage of “Heavy fermion” compounds pose one of the great conceptual challenges of correlated electron physics (Stewart, 1984). These materials are intermetallic compounds in which one element is a rare earth (such as Ce or Yb) or actinide (such as U or Pu) with a partially filled f -shell, while the other elements contribute s, p, d, electrons to broad, weakly correlated bands. The f electrons are weakly hybridized to the other bands and are subject to strong interactions, so that typically one f valence state is strongly dominant. At temperatures of the order of room temperature, the materials appear as two-component systems, with magnetic moments (arising from the f shells) embedded in and weakly coupled to a Fermi sea of s, p, d electrons. At low temperatures, however, the spins and conduction electrons combine into a new object, which may become a heavy mass Fermi liquid or a narrow gap Kondo insulator, or may become unstable to unconventional superconductivity, magnetic order, or may exhibit a variety of quantum critical behavior (Löhneysen et al., 2007; Stewart, 2001). Our understanding of the heavy fermion state has been hampered by a lack of unbiased numerical methods. While numerics is still far from being able to address the full richness of heavy fermion physics, the combination of dynamical mean field theory and methods including the CT-QMC approach is beginning to have an impact on the field. CT-HYB methods have been applied to the study of heavy fermion materials (Haule and Kotliar, 2009; Shim et al., 2007) but it appears at present that difficulties arising in the course of dealing with realistic models of heavy fermions are sufficiently large that CT-HYB methods have been mainly used to spot-check the results of other, approximate but much less computationally expensive, solvers. A realistic treatment of heavy fermion materials must deal with the full complexity of the f -shell and is characterized by a multiplicity of interactions, all of which are strong, a strong spin orbit coupling and (in many of the interesting materials) a low point group symmetry, leading to a complicated multiplet structure imposed on a local Hilbert space of dimension 47 . This HIlbert space is too large to treat directly by a straightforward application of the CT-HYB method. However, in many if not all cases only a small portion of the Hilbert space is relevant to the physics, so truncation schemes in which only a portion of the Hilbert space is retained may be appropriate. In some cases, such as elemental Ce or Ce-based heavy fermion compounds the relevant valence 48 kσ i while Matsumoto and collaborators (Matsumoto et al., 2009) have used the CT-J method along with input from ab-initio band theory to describe trends across families of heavy fermion compounds. This model with antiferromagnetic J is a minimal model for heavy fermion physics and also may be used to address other theoretical issues, for example by changing the sign of J. In the physically relevant antiferromagnetic J case the model is believed to have a small J magnetic phase and a larger J non-magnetic phase: a Fermi liquid if the density of the conduction band is different from 1 per site and a Kondo insulator if the conduction-band density is one per site. The qualitative form of the phase diagram has been understood since the work of Doniach (Doniach, 1977). The band theory phase diagram calculated by the CT-J method is shown in Fig 37. It has the qualitative form proposed by Doniach but the CT-J method enables one to understand in detail how the high temperature local moment phase crosses over to the Fermi liquid (Otsuki et al., 2009a), and provides insight into the relation of the Fermi liquid coherence to the magnetic phase diagram and allows one to include material-specific information. Figure 38 shows the single-particle spectral function A(κ, ω) computed by Otsuki et al. (2009a) using an analytical continuation based on Padé approximants. This continuation method requires data of extremely high precision, available only with the CT-QMC methods. The vertical white lines labeled κS indicate the positions of the Fermi surface defined by the conduction band electrons in the absence of any Kondo effect; the line labeled κL indicates where the Fermi surface would be if the local moment became an itinerant electron and were folded into the conduction band. The left panel shows A(κ, ω) for the high T = 0.25. The spectrum exhibits a behavior of almost non-interacting electrons at high energies. However, a suppression of density of states is seen near the conduction electron Fermi surface κS . The right panel shows the spectral function at T = 0.0025, which is much lower than the impurity Kondo 80 CeCu2Si2 (*) 70 CeRh2Si2 60 TN [K] states are f 0 , f 1 and perhaps f 2 and a straightforward truncation in which all higher occupancies of the f state are forbidden works well. In other situations, such as P u, more elaborate schemes involving truncation in energy, in valence and in size of sub-matrices is required. CT-J methods, which in effect reduce the Hilbert space of the local problem to the minimum possible size, are a promising alternative route. In a very interesting first step in this direction, Otsuki and collaborators (Otsuki et al., 2009a,c) have used the CT-J method to perform a detailed study of the singlesite dynamical mean field solution of the spin 1/2 Kondo lattice model, defined by the Hamiltonian X X ~i · ~σi , S (219) HKL = εk c†kσ ckσ + J 50 40 30 CeAu2Si2 CePd2Si2 CeAg2Si2 Kondo-screened 20 10 0 -0.4 AF -0.3 -0.2 CeRu2Si2 -0.1 t 0 0.1 0.2 FIG. 37 Phase diagram calculated using CT-J methods with band theory input for the material family CeX2 Si2 . The abscissa t is defined as the Kondo coupling measured relative to the critical Kondo coupling for the T = 0 magnetic transition. From (Matsumoto et al., 2009). FIG. 38 The single-particle excitation spectrum A(κ, ω) for J = 0.3 and nc = 0.9 at (a) T = 0.25 and (b) T = 0.0025. The slanted line represents the non-interacting spectrum ω = κ−µ which is realized for J = 0. From Ref. (Otsuki et al., 2009a). √ −1/g temperature defined by TK = ge ∼ 0.1 with g = 2Jρc (0). Here the spectral function takes a form closely resembling that expected if the conduction band is weakly hybridized with a very flat band near the Fermi level and the Fermi surface has shifted to the point κL , indicating that the local moments in fact contribute to the Fermi volume. This behavior was expected, based on the detailed understanding which has been obtained for the single-impurity Kondo problem, but it is remarkable to see the phenomenon clearly exhibited in a lattice calculation. The fact that the bands are well defined at all k, and that the Kondo hybridization gap which opens up at κS is well defined, are new and somewhat unexpected. In a related study (Hoshino et al., 2010), Hoshino and co-workers considered the Kondo lattice model at conduction band densities n = 1 where at larger J the ground state is a paramagnetic Kondo insulator. At smaller J the paramagnetic Kondo insulator is unstable to an antiferromagnetic insulator ground state. Figure 39 shows the spectrum for an intermediate J = 0.2 where a Kondo insulator phase is established at intermediate temperatures (left panel) and at lower T becomes unstable to antiferromagnetism (right panel). In the region of 49 FIG. 39 False-color plot of electron spectral function in frequency ω and scaled momentum κ plane for Kondo lattice model with antiferromagnetic coupling J = 0.2 in (a) the paramagnetic phase at T = 0.035 and (b) the antiferromagnetic phase at T = 0.010. From Ref. (Hoshino et al., 2010). FIG. 40 False-color plot of electron spectral function in frequency ω and scaled momentum κ plane for Kondo lattice model with ferromagnetic interaction J = −0.2 in (a) the paramagnetic phase at T = 0.035 and (b) the antiferromagnetic phase at T = 0.010. From Ref. (Hoshino et al., 2010). the Brillouin zone presented in the figures the form of the spectral function is remarkably similar in the two phases; the magnetism merely sharpens the spectral function and increases the gap size. Again the hybridization of the local moment into the conduction band is the only reasonable interpretation of the formation of the paramagnetic insulating state. It is interesting to contrast these results with those obtained for ferromagnetic Kondo coupling, shown in fig 40. Here we see that in the paramagnetic state there is a band crossing the Fermi level: the material is not an insulator because the Kondo effect does not occur (a similar effect was demonstrated in (Werner and Millis, 2006) using the CT-HYB method) and it is only when the antiferromagnetic instability occurs that a gap opens up. The application of CT-J methods to Kondo-like problems is still in its early stages, and it seems likely that further extensions to more realistic models, and to cluster dynamical mean field approaches, will yield further insights. B. Dynamical mean field theory for realistic models of correlated materials Dynamical mean field methods are more and more widely used in ab-initio based studies to model in a re- alistic way the properties of interesting materials. These studies involve many subtle issues relating to mapping the orbitals and energies derived e.g. from a density functional band theory calculation onto a theoretical model appropriate for solution with dynamical mean field methods. The subject is reviewed in (Kotliar et al., 2006) and we will not attempt to summarize the discussion here. For the purposes of the present review it is enough to note that the correlated electron aspects of real materials typically involve multiple orbitals and several interaction parameters, so a mapping onto a simple one-band Hubbard model is typically not appropriate, while the demanding nature of the band theory computations places a premium on having efficient impurity solvers for the dynamical mean field calculations. The development of CT-QMC methods has therefore had a significant impact on the field. The range of applications is large and growing rapidly; it will not be summarized here. Rather, we will focus on recent results pertaining to one particularly challenging, and particularly topical system, the ironbased superconductors, where CT-HYB methods have made an important contribution to understanding the physics. These calculations may be considered as reflective of the present “state of the art” of the “realistic DMFT” field. The unusually high superconducting critical temperatures together with unusual normal state properties are generally agreed to place the iron oxypnictides in the broad category of strongly correlated superconductors, which also includes the κ organics, cerium and plutonium based heavy fermions, and cuprate high temperature superconductors. The correlated electrons reside mainly on d-orbitals associated with the Fe site and it appears to be necessary to retain all 5 of the states in the dmultiplet and to treat carefully both the effects of the U interaction which constrains charge fluctuations and the J-type interactions which select different states at fixed total charge. Because the couplings are neither extremely large nor extremely small, approximate methods may not be reliable: the full interacting problem must be treated by a numerically exact method. The low point symmetry of each Fe site means that ligand field effects compete non-trivially with the interaction effects while the hybridization function is complicated, and must be determined using band theory input. From the dynamical mean field side the complexity of the problem is such that only single-site DMFT calculations have been attempted, sometimes with a further restriction to density-density interactions. In order to investigate the correlation effects in such complicated compounds it is important to have consistent one-electron and many-body parts of the LDA+DMFT Hamiltonian. For example, Aichhorn and collaborators studied the material LaO1−xFx FeAs using an optimized basis of the localized dpp Wannier functions which was constructed from the 22 Bloch bands, corre- 50 20 DOS / eV -1 15 LDA (Wien2K) LDA+DMFT U = 2.69 eV J = 0.79 eV -1 β = 40 eV 10 5 O-p 0 -8 -6 -4 As-p Fe-d -2 ω / eV 0 2 4 FIG. 41 (Color online) Full (all bands) spectral function for dpp Hamiltonian description of LaOFeAs. Black line: LDA. Red line: LDA+DMFT (computed retaining only densitydensity interactions). From Ref. (Aichhorn et al., 2009). FIG. 42 Histogram of occupation probabilities for each 3d atomic state in DMFT calculation for BaFe2 As2 at T = 150K. The states are sorted by total d occupancy and within each manifold of fixed occupancy by energy. From Ref. (Kutepov et al., 2010). sponding to the 10 Fe-3d, 6 As-p and 6 O-p states (note each unit cell contains two formula units and the point symmetry of the two Fe is the same) (Aichhorn et al., 2009). The Green’s function and hybridization function are constructed from the matrix elements of the KohnSham Hamiltonian in the Wannier basis, while matrix elements of the Coulomb interactions were calculated from the static limit of a constrained random phase approximation. The dynamical mean field theory was constructed by retaining the on-site intra-d interactions and projecting the k-integrated Green function onto the subspace of d Wannier functions. Other groups use slightly different procedures; for example Kutepov et al. used a self consistent GW procedure to compute the interaction and an orbital-based procedure rather than a Wannier function-based procedure to define the basis of local states (Kutepov et al., 2010). Aichhorn et al. then used CT-HYB simulations (but with only density-density interactions) at room temperature to obtain the full local spectral function for the dpp Hamiltonian corresponding to the experimental crystal structure of LaFeAsO and the realistic Coulomb matrix elements (Aichhorn et al., 2009). Results are shown in Fig. 41: The LDA+DMFT DOS near the Fermi level displays characteristic features of a metal in an intermediate range of correlations. Both occupied and empty states are shifted towards the Fermi level due to the Fermi-liquid renormalizations. No high-energy features that would correspond to lower or upper Hubbard bands can be seen in this LDA+DMFT electronic structure. In untangling the physics of the materials the ability of the CT-HYB method to provide the components of the local density matrix, in particular the probability that any one of the atomic states of the iron 3d orbital is occupied, is important. This is plotted for the material BaFe2 As2 in Fig. 42 (Kutepov et al., 2010). Even the most probable atomic states have a probability of only a few percent, hence a naive strong correlation atomic limit is qualitatively wrong for this compound. The wide spread of energies within a given submanifold is a consequence of the additional “J-like” interactions. FIG. 43 Momentum resolved spectral function A(k, ω) calculated for BaFe2 As2 . Gray inset: ARPES intensity from Ref. (Brouet et al., 2010). From Ref. (Kutepov et al., 2010). Figure 43 shows a false-color representation of the moP mentum resolved spectral function L A(k, ω)LL in the near-fermi-surface energy range (Kutepov et al., 2010). Near the Fermi level the quasiparticle bands are well defined, while at higher energies the structures become blurred, reflecting the increased phase space for scattering. The quasiparticle velocities are renormalized relative to the band theory result (not shown) by factors of 2 for x2 − y 2 and 3z 2 − r2 orbitals and 3 for the xy, xz, yz orbitals. The momentum space positions of the Fermi surface crossings are in good agreement with photoemis- 51 / XIII. APPLICATIONS III: NANOSCIENCE 1 W(τ=β/2) W(τ=β/2, T=0.01t) sion results, as are the renormalized velocities. Comparison of these sorts of calculations to the rapidly growing body of experimental data are enabling a comprehensive understanding of the physics of novel materials. A. Transport through quantum dots: linear response and quantum phase transitions One important application of quantum impurity models is as representations of “single molecule” conductors and other nano-devices (Hanson et al., 2007). Much of the attention in the nanoscience community has been focused on weakly interacting systems or on simple Hubbard-like dots. Standard perturbative or Hirsh-Fye QMC methods suffice for these situations, although CTQMC methods have been used, e.g. in a study of the accuracy of the GW approximation (Wang et al., 2008). As the field moves towards consideration of quantum dots with richer physics, other approaches including CT-QMC methods are likely to become important. An example is provided by the two-level two lead quantum dot system uncovered by Yacoby et al. (Yacoby et al., 1995). Golosov and Gefen (2006) suggested that this system could display a quantum phase transition between two different relative occupancies as level energies were varied. This issue was investigated using the CT-HYB method by Wang and Millis (2010). In the general case of the model presented by Gefen the imaginary part of the hybridization function (giving decay of the dot electrons into the leads) does not commute with the combination of the level Hamiltonian and the real part of the hybridization function (giving the renormalization of the dot energies). This causes a severe sign problem, which prevented any useful simulations in the general case. Wang and Millis (2010) argued that the universal behavior at a quantum critical point (if one existed) could be described by a sign problem-free model (essentially because at the critical point the combination of the dot Hamiltonian and real part of the hybridization function becomes the unit matrix). To investigate the criticality, Wang and Millis (2010) considered the imaginary-time dependence of correlation functions of variables defined on the quantum dot. Fig. 44 shows three different behaviors at times ∼ β/2: a T 2 dependence expected for a Fermi liquid for small U/t, a power law at the critical point, and a constant long time behavior for large U in the non-Fermi liquid phase. However, impurity problems may be characterized by exponentially small scales such as the Kondo effect. Distinguishing a very small scale from a true phase transition is numerically challenging. The ability of CT-HYB to access very low temperatures ∼ 10−3 t provides reasonable evidence of a critical point. However, for problems 0.8 0.6 0.4 12 U /t=0.6 12 U /t=0.4 12 U /t=0.3 12 U /t=0.2 12 U /t=0 0.2 0 0 0.2 0.4 4 10 (T/t) 0.6 0.8 1 2 FIG. 44 Imaginary time density-density correlation function W of two-level, two-lead model evaluated using CT-HYB at midpoint of imaginary time interval and normalized to value at T = 0.01t, as function of interaction strength with level energies tuned to be equal. Weak T and τ dependence is seen in non-Fermi-liquid phase (U = 0.4, 0.6) and strong T and τ dependence in Fermi liquid phase (U = 0, 0.2). For details see Ref. (Wang and Millis, 2010). such as this where the key question concerns the asymptotic low energy behavior, quasi-analytical functional renormalization group methods (Karrasch et al., 2006) and NRG approaches (Bulla et al., 2008; Karrasch et al., 2007) may be more powerful. B. Metal atom clusters on surfaces An active area of nanoscience research concerns the properties of one or more transition metal ions on a metal surface. Of particular interest is the density of states, which may be compared to scanning-probe microscopy data. Savkin and coworkers in (Savkin et al., 2005) applied the CT-INT scheme to a model of three interacting Kondo impurities on a metallic surface. The ability of the CT-QMC methods to treat realistic interactions allowed an accurate investigation of the interplay of cluster geometry, inter-adatom hopping, local Coulomb interactions and the Heisenberg exchange interactions between magnetic impurities. Savkin et al. (2005) showed that a rotationally invariant antiferromagnetic exchange interaction is almost twice as efficient in suppression of the single site Kondo effect as is the Ising like interaction which was all that could be treated by previous methods. The possibility to make quantitative comparisons to experiment highlights the need to incorporate as much material specificity as possible into the calculation. Gorelov (2007) performed a realistic study of Co atoms in the bulk or at the surface of a Cu host. They found 52 nk 1 2 while the Coulomb interaction Hint = c†iσ c†jσ′ ckσ′ clσ (220) iωn + µ − εnk P ijklσσ′ Uijkl involves matrix elements of the form 2 Uijkl = hdi (r1 )dj (r2 ) ε|r1e−r2 | dk (r2 )dl (r1 )i. The number of interaction terms which must be considered is large, and depends on the choice of basis in the d-sector. Use of symmetries to rearrange the interaction and eliminate redundant terms was also found to be important. Implementing all the symmetries and making an optimal basis choice led Gorleov et al to an expression for the partition function as an expansion in 129 independent interaction parameters: X (−1)n Z = Z0 n!2n n X {ijklσσ′ } Z 0 β dτ1 ... Z Ui1 j1 k1 l1 ...Uin jn kn ln det G β dτn (221) 0 2n×2n . The expansion exhibits a “trivial” sign problem which may be mitigated by appropriate choice of α parameters as discussed in Sec. III.A although the multiplicity of interactions requires a multiplicity of α parameters; for further details see Refs. (Gorelov, 2007; Gorelov et al., 2009). The expansion also suffers from an ’intrinsic’ sign problem (not curable by choice of α) whose severity was found to depend on the basis choice. “Three orbital” terms Uikkl with l 6= i were found to produce a severe sign problem but do not occur if a spherical harmonic basis is used. However, “non-diagonal” terms Uijkl with (i 6= j, k 6= l) cannot be eliminated by transformations, make important contributions to the physics, and give rise to a sign problem, of a severity which depends on other features such as the Green’s functions. To test both the expansion and the importance of the non-diagonal terms, Gorelov et al. determined the Green’s function for one orbital of the isolated atom CT-QMC, diagonal U CT-QMC, full U ED, diagonal U ED, full U 0.0 -0.40 -0.2 G( ) G( ) -0.35 -0.45 -0.4 -0.6 -0.8 -1.0 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 -0.50 0.0 0.5 1.0 1.5 2.0 FIG. 45 (color online) Comparison of CT-INT expansion of Eq. (221) with exact diagonalization for an isolated Co atom with U = 1 eV, J = 0.4 eV, at inverse temperature β = 2 eV−1 , for 5 electrons (main panel) and 8 electrons (inset). Solid lines and crosses: Full Hamiltonian. Dashed lines and plus symbols: model specified by diagonal terms only. From Ref. (Gorelov et al., 2009). 1.4 1.0 LDA 0.8 adatom 1.2 CT-QMC, diagonal U 1st layer CT-QMC, full U bulk 1.0 0.6 0.8 DOS 0 Gij (iωn ) = X hdi |Ψnk ihΨnk |dj i -0.30 DOS that a complete treatment of the problem, including all inequivalent terms of the Coulomb interaction, was essential for obtaining physically relevant results. Inclusion of all of the interaction terms however produces a severe sign problem. While the sign problem can be mitigated to some extent by an appropriate choice of basis, it severely limits the range of temperatures over which results can be obtained. These calculations represent the current state of the art: they push the CT-INT technique to its limits and demonstrate the need for further algorithmic developments. To set up the problem, density functional band theory techniques were applied to appropriately chosen supercell geometries. From these calculations wave functions di (r) for the Co d-states and itinerant electron wave functions Ψnk (r) were extracted. The bare local Green’s function is then obtained as 0.4 0.6 0.4 0.2 0.2 0.0 -5 0.0 -4 -3 -2 -1 0 1 E (eV) 2 3 4 5 -5 0 5 E (eV) FIG. 46 (color online) Left panel: Total DOS of 3d orbital of Co atom embedded in Cu matrix. Model parameters: U = 4 eV, J = 0.7 eV, β = 10 eV−1 for 5-orbital impurity with 7 electrons. Right panel: Total DOS of 3d orbital of Co atom embedded in the bulk of Cu, into 1-st layer and Co-adatom on the Cu(111) surface. Model parameters: U = 4 eV, J = 0.7 eV, β = 10 eV−1 for 5-orbital impurity with 7 electrons. From Ref. (Gorelov et al., 2009). (i.e. with no hybridization function) using both CT-INT based on Eq. (221) (with the 129 interaction parameters) and by exactly diagonalizing the problem. Fig. 45 shows that the CT-INT expansion reproduces the exact result and that the ‘non-diagonal’ terms in the interaction are important. Gorelov et al. then computed the local density of states of a Co atom in bulk Cu (far from a surface). For this case the sign problem, while present, is not severe for the temperatures studied (T ≈ 1200K). Results are shown in the left panel of Fig. 46. The non-diagonal interactions have an important effect on the line shape. (Note that T ≈ 1200K is well above the Kondo temperature, so no 53 0.25 XIV. APPLICATIONS IV: NON-EQUILIBRIUM IMPURITY MODELS AND NANOSCALE TRANSPORT A. Overview CT-QMC methods have been used to study nonequilibrium problems defined on the “Keldysh” two-time contour. These studies are still in their early stages and we present here a few representative preliminary results concerning the current-voltage characteristics of interacting quantum dots, as well as simulations inspired by the newly developing capabilities of performing pumpprobe experiments on correlated electron compounds and “quantum quench” experiments on cold atom systems. B. Results: Current-voltage characteristics 0.2 0.15 I/Γ peak is evident at the Fermi surface). Finally, Gorelov et al. consider a Co impurity at a surface (right panel of Fig. 46). Here the relatively large non-diagonal elements of the bath Green’s function lead to a serious sign problem. To make a simulation on the surface feasible, Gorelov et al. in effect restricted the sampling to a constant-sign subset of configuration space, by only allowing updates that did not change the fermionic sign. See Ref. (Gorelov et al., 2009) for further details. 0.1 λ=2Γ, ω0=10Γ λ=2Γ, ω0=5Γ λ=4Γ, ω0=5Γ λ=4Γ, ω0=3Γ 0.05 0 0 5 10 15 20 V/Γ FIG. 47 Points: total current I computed using nonequilibrium CT-HYB methods as a function of the bias voltage for a half filled, spinless, phonon-coupled quantum dot (Eq. (133) with U=0) with voltage bias applied symmetrically, εd = 0, µL,R = ± V2 at temperature T = Γ5 at electronphonon coupling strengths λ and oscillator frequencies ω0 indicated. Lines: results of an approximate analytical calculation (Flensberg, 2003). From Ref. (Mühlbacher and Rabani, 2008). lished in (Schiró and Fabrizio, 2009; Schmidt et al., 2008; Werner et al., 2009b). Because the expansion must in this case be performed for both spin flavors and the decohering effect of phonons is not included, reaching a steady state becomes challenging. Attempts to optimize the algorithm by considering initial states with dot-lead entanglement (Schiró, 2010) have not led to dramatic improvements. 1. Real-time CT-HYB The first nonequilibrium applications of the CT-QMC technique were to the current-voltage characteristics of a quantum dot under a bias voltage. In their pioneering paper, Mühlbacher and Rabani (2008) showed that the hybridization expansion method could be directly applied on the Keldysh contour and that long enough times could be reached to permit measurements of steady state behavior. They studied a non-interacting dot coupled to phonons (essentially the Holstein-Hubbard model, Eq. (133), with spin neglected and U = 0); representative results giving the dependence of the currerent-voltage characteristics on the oscillator frequency and coupling strength are presented in Figure 47. These calculations start from an initial state in which the dot is decoupled from the leads and the calculation must build in appropriate dot-lead entanglement. This requires a coherence times which depends on the physics. In the calculations of Ref. (Mühlbacher and Rabani, 2008) convergence was facilitated by decoherence arising both from the phonons and from the relatively high T which was studied. Results for the interacting Anderson model (without phonons) have been pub- 2. Real-time CT-AUX In the weak coupling methods, for example the CTAUX algorithm explained in Sec. VIII.A, one may use ’interaction quench’ methods in which the real-time simulation starts from a U = 0 state with dot-lead entanglement. Temperature enters only as a parameter in the lead correlators, making it possible to treat arbitrary temperatures, including T = 0. While the presence of interactions of course modifies this entanglement, it seems that up to interaction strengths of U ≈ 10Γ, relatively few perturbation orders are required to reach steady state. The situation is particularly favorable for particle-hole symmetric models with symmetrically applied bias, where odd orders of perturbation theory can be suppressed. As illustration we present interactionquench results for the time-dependence of the current and the current-voltage characteristics of half-filled quantum dots with symmetrically applied voltage bias (µL = −µR = V /2). At time t = 0, the system is non-interacting but subject to an applied bias V , so a current I0 (V ) appropriate 54 0.9 1 U/Γ=6 U/Γ=0 U/Γ=4 U/Γ=6 U/Γ=8 U/Γ=10 0.8 0.8 0.7 V/Γ=8 0.6 V/Γ=6 I/Γ I/Γ 0.6 V/Γ=4 0.4 V/Γ=2 0.2 V/Γ=1 V/Γ=0.5 0 nk 0.5 0.4 0.5 1 1.5 2 tΓ 2.5 3 3.5 1 0.8 0.8 0.6 0.3 0.6 0.4 0.2 -2 0.2 0.1 00 0 0 nk 1 0 1 2 3 4 V/Γ 5 6 7 8 FIG. 48 Left panel: time evolution of the current for different voltage biases (U/Γ = 6, T = 0). Right panel: current-voltage characteristics obtained from long-time limit of the calculated currents. The symbols show Monte Carlo data for U/Γ = 4, 6, 8 and 10, while the lines correspond to fourth order perturbation theory. In the initial state, the current is given by the steady state current through the non-interacting dot. At time t = 0, the interaction is turned on. From Ref. (Werner et al., 2010). 0 0.5 1 t 1.5 2 1 2.5 3 2 0.4 -2 0.2 -1 εk 00 -1 0 0.5 1 t 1.5 2 1 2.5 εk 3 2 FIG. 49 Momentum distribution n(εk , t) for quenches from U/W = 0 to U/W = 0.75 (left panel) and U/W = 1.25 (right panel), for Hubbard model with semicircular density of states and bandwidth W . t is measured in units of the quarterbandwidth W/4. From Ref. (Eckstein et al., 2010). 3. Nonequilibrium DMFT to the non-interacting model is flowing through the dot. At t = 0+ the interaction is turned on and the system relaxes into the steady-state configuration appropriate to the interacting model. The left panel of Fig. 48 plots the time evolution of the current for fixed U/Γ = 6 and several voltage biases. For voltages V /Γ & 2, even though the transient behavior is clearly voltage-dependent, the current settles into the new steady state after a time tΓ ≈ 2. However, as the voltage is decreased below V /Γ ≈ 2 the transient time increases. At V = Γ the long time limit is attained only for tΓ & 3 and as V is further decreased, the interaction-quench method cannot reach the steady state. As discussed in (Werner et al., 2010), in the small-V regime, voltage quench simulations are a possible alternative to the interaction quench. In the voltage-quench calculations, the time-evolution starts from the interacting equilibrium state, which is a good starting point for small V . However, because the imaginary branch of the L-shaped contour must be explicitly treated, this approach is restricted to non-zero temperatures, and is only advantageous for very small voltage. The right panel of Fig 48 presents T = 0 results for the voltage dependence of the steady state current as well as analytic results obtained from fourth order perturbation theory order. The interacting current initially rises with the same slope as the non-interacting current, and reaches the non-interacting value also in the largevoltage limit. At intermediate values of V the effect of interactions is to suppress the current (Coulomb blockade). The results show clearly that at intermediate voltages the method can access interaction regimes beyond the scope of analytical perturbation theory. In agreement with conclusions reached in (Werner et al., 2009b) on the basis of (less accurate) hybridization expansion results and also with recent nonequilibrium functional renormalization group calculations (Jakobs et al., 2010) we see that this model does not display a region of negative differential resistance. The real-time CT-QMC methods can also serve as impurity solvers in nonequilibrium-DMFT simulations of bulk systems. The impurity Hamiltonian of Section VIII.A is the impurity problem relevant for the solution of the one-band Hubbard model. In the DMFT context, however, Hbath (Eq. (151)) is a single bath, whose parameters are fixed by a self-consistency equation which in the non-equilibrium context is time dependent (Freericks et al., 2006; Schmidt and Monien, 2002). CT-QMC have been used by Eckstein et al. (2009, 2010) to study the relaxation dynamics of the halffilled Hubbard model after a sudden switching-on of the electron repulsion U . The initial state was the noninteracting equilibrium state at temperature T = 0, and the DMFT selfconsistency assumed a semi-circular density of states of bandwidth W = 4. The calculation produces among other observables the time-evolution of the momentum distribution function n(εk , t) which is plotted in Fig. 49 for quenches to U = 3W/4 and U = 5W/4. Qualitative differences in the relaxation dynamics appear as the value of the interaction strength is changed. It was demonstrated in (Eckstein et al., 2009, 2010) that for a quench to U = 0.8W the momentum distribution function and double occupancy relax very fast (within a time t = 6.4/W ) to the thermal equilibrium result compatible with energy conservation. The relevant time-scale is in this case easily accessible with real-time CT-AUX. Away from this “critical interaction strength”, i. e. for quenches to U 6= 0.8W , the system is initially trapped in a non-thermal quasi-stationary state, and equilibration occurs on much longer time scales. In this case the accessible times are not long enough to observe the expected thermalization (see right panel of Fig. 49). Only the initial relaxation into the quasi-stationary state and (for U < 0.8W ) the initial part of the slow crossover towards the thermal equilibrium state are computationally accessible with present techniques. 55 XV. PROSPECTS AND OPEN ISSUES Over the last few years, continuous-time quantum Monte Carlo methods for fermionic impurity problems have been developed to a high degree. Because the methods are based on a diagrammatic expansion, they can handle many physically relevant interactions, which were not easily treatable by other methods. Also, by construction they are free from the time-discretization errors associated with methods previously used for fermions, based on the Suzuki-Trotter decomposition (Suzuki, 1976). Third, a continuous-time formulation provides, in a sense, a many-body adaptive grid method for the time evolution. These advantages of the continuous-time formulation enable a decrease, typically by several orders of magnitude, in the computational effort required to solve a problem of a given complexity, making previously intractable problems tractable and creating new opportunities for physics by allowing rapid and routine investigation of problems which had previously required access to supercomputer facilities. The methods have become very important to the field of correlated electron physics (via the connection to single (Georges et al., 1996) and cluster (Maier et al., 2005a) dynamical mean field theory) and are having an increasing impact on nanoscience. However, the first generation of results has only begun to explore what is possible. We expect that over the next few years the methods will be increasingly widely used in dynamical mean field computations of correlated electron materials and condensed atomic gases, and in studies of the equilibrium and nonequilibrium phenomena arising in the impurity models relevant to nanoscience. The nonequilibrium applications in particular represent an entirely new field with many exciting possibilities. Methodological improvements, including combinations of hybridization and coupling constant expansions and the further study of projected Hilbert space methods such as CT-J are likely to be fruitful. We hope that an increasing number of scientists will take advantage of the opportunities, by applying the methods to yet wider classes of problems and by developing them further. We conclude our discussion by summarizing what we perceive to be the strengths and weaknesses of the different CT-QMC methods, and suggesting some issues that may warrant further attention. The fundamental issues for any algorithm are the scaling with temperature, interaction strength, and system size. In addition, for fermions, one must consider the sign problem. Table [I] summarizes what we know about these scalings. The hybridization expansion algorithm, CT-HYB, diagonalizes the local Hamiltonian and expands in the impurity-bath hybridization. The principal advantage of this approach is that instantaneous (Hamiltonian) interactions of essentially arbitrary strength and functional form can be handled (retarded interactions can be con- veniently treated only in special, but physically relevant cases such as the screened density-density interaction). The hybridization expansion appears to suffer from a severe sign problem if the hybridization function does not commute with the one-body part of the local Hamiltonian, and this limits its use in the most general contexts. It appears to be most useful for the single-site dynamical mean field theory of materials with partly filled d and f shells, where its ability to treat the full complexity of general multiplet interactions is unmatched and the point symmetry ensures that the local Hamiltonian and hybridization functions commute. The fundamental computational bottlenecks of the hybridization method are the need to manipulate matrices whose size is set by the dimension of the full fermionic Hilbert space of the impurity Hamiltonian and the need to compute determinants of hybridization matrices of a size linearly growing with β. The computational burden grows exponentially with the size of the fermionic problem and as the cube of the inverse temperature, and the system-size constraint is therefore more severe. For a model of N spin degenerate orbitals the Hilbert space size is 4N . At present, 5 orbital models are accessible with large scale computing resources. For larger systems a straightforward approach is not feasible yet without truncation. The accuracy of truncation schemes is not yet established. Of course, in special cases block diagonalization is possible so the full Hilbert space need not be treated. In the most favorable case, the local Green function, hybridization matrix and interaction may all be diagonalized in the same single-particle occupation number basis (this occurs in the N -orbital impurity model with density-density interactions, if each orbital hybridizes with a different bath) and the segment representation of the hybridization expansion may be used. In this case there is no sign problem and the cost is linear in the number of orbitals and cubic in the inverse temperature. Thus if a segment representation exists, it should be used. Unfortunately, in most problems of physical interest either hybridizations or interactions entangle the different single-particle basis states, and a general matrix formulation is required. In this case the exponential scaling associated with the Hilbert space size is the crucial constraint, and an important open problem concerns the degree to which the Hilbert space can be block diagonalized or truncated. Haule pioneered the use of symmetry-based block diagonalization and of truncation (Haule, 2007). An alternative approach based on sparse matrix-vector instead of dense matrix-matrix multiplication is the Krylov technique discussed in Sec. V.D. Further research along these and related lines appears to be worthwhile. The interaction expansions CT-INT and CT-AUX are based on an expansion about the free-fermion limit. The computational burden therefore increases with the interaction strength, as well as with inverse temperature and 56 Scaling / Algorithm CT-INT 3 CT-AUX 3 CT-HYB (segment) diagonal hybridization N (βU ) N (βU ) non-diagonal hyb. (N βU )3 , sign prob. (N βU )3 , sign prob. diagonal interaction (N βU )3 , sign prob. (N βU )3 , sign prob. general Uijkl (N 2 βU )3 , sign prob. n/a CT-HYB (matrix) Nβ aeN β 2 + bN β 3 , a ≫ b N 2 (N β) , sign prob. ae β + b(N β)3 , a ≫ b, sign prob. (N β)3 , sign prob. aeN β 2 + b(N β)3 , a ≫ b, sign prob. n/a aeN β 2 + b(N β)3 , a ≫ b, sign prob. 3 3 TABLE I Summary of scaling and sign metrics in the equilibrium case for most widely studied continuous-time quantum Monte Carlo methods. CT-INT, CT-AUX, CT-HYB refer to the interaction expansion (Sec. III), auxiliary field (Sec. IV) and hybridization (Sec. V) expansion algorithms respectively. “Segment” refers to the case of the hybridization interaction where the hybridization function, local Hamiltonian and interaction are all diagonal in the same basis (V.B), while “matrix” refers to the general implementation in Sec. V.C. We distinguish Green functions which can be diagonalized by one single canonical transformation from general Green functions where the hybridization function, local Hamiltonian and self energy do not all commute, and we distinguish interactions such as the Hubbard U which are diagonal in an appropriate single-particle occupation number basis from those such as spin exchange and “pair hopping” which cannot be diagonalized. sign prob. indicates the possibility of the presence of a fermionic sign problem. the system size, making it difficult to access the very strong coupling regime. However, the scaling with system size is power-law, rather than exponential, so that these methods are the only ones feasible when many orbitals or many sites are important to the physics. At present, a lack of good auxiliary field decompositions means that the CT-AUX method can only be used for models with density-density interactions. Its main application has been to cluster dynamical mean field studies of the Hubbard model. A natural subject for further investigations is the application of the method to wider classes of models, including more general (but still density-density) interactions. The CT-INT method is equivalent to the CT-AUX method for Hubbard like interactions (although the present CT-AUX implementations appear to be more efficient), and is applicable to models with general (non density-density) interactions. However, sign problems occur and grow in severity as the complexity of the interaction increases. Our experience in the nonequilibrium context is that in a particle-hole symmetric model, the CT-INT and CTAUX methods are to be preferred over the hybridization methods because odd perturbation orders can be suppressed (Werner et al., 2010), resulting in a less severe sign problem and longer accessible times. There are two limitations associated with the CT-INT and CT-AUX methods. One issue for more realistic models with more complicated interactions is the need to make a multiple expansion in all components of the interaction. This is not a serious issue as long as no sign problem is encountered. The more fundamental limitation is the sign problem, which can arise in cluster dynamical mean field calculations from the presence of physical (real-space) fermionic loops or more generally from non-commutativity of operators appearing in the impurity model, due for example to exchange interactions or to hybridization functions which cannot be diagonalized by a single (time-independent) basis change. Sign problems are in general dependent on the choice of basis and further exploration of different representations of the interaction and the Green function, especially in the case of non-diagonal interaction, may be worthwhile. ACKNOWLEDGMENTS We are greatly indebted to our many collaborators and colleagues, Fakher Assaad, Evgeny Burovski, Armin Comanac, Martin Eckstein, Michel Ferrero, Sebastian Fuchs, Antoine Georges, Hartmut Hafermann, Kristjan Haule, Mark Jarrell, Mikhail Katsnelson, Marcus Kollar, Andrei Komnik, Gabriel Kotliar, Evgeny Kozik, Jan Kunes, Hiroaki Kusunose, Andreas Läuchli, Gang Li, Chungwei Lin, Nan Lin, Lothar Mühlbacher, Thomas Maier, Luca de Medici, Karlis Mikelsons, Hartmut Monien, Takashi Oka, Junya Otsuki, Olivier Parcollet, Lode Pollet, Nikolay Prokof’ev, Thomas Pruschke, Stefan Rombouts, Vladimir Savkin, Thomas Schmidt, Michael Sentef, Boris Svistunov, Dieter Vollhardt, and Xin Wang, without whose input such rapid development would not have been possible. The work of E. G. and A. J. M. was supported by NSF under Grant No. DMR0705847, DMR-1006282 and the US Department of Energy under grant ER-46169, P. W. and M. T. by the Swiss National Science Foundation, A. L. and A. R. by RFFIDFG grant 436 RUS 113/938/0 – 08-02-91953, SFB 668, the Kurchatov institute, and the Cluster of Excellence “Nanospintronics”. Preparation of this manuscript was supported in part by the National Science Foundation under Grant No. PHY05-51164. Open-source implementations of some of the algorithms we describe in this review have been made available for download. Codes for the interaction expansion algorithm implemented by Rubtsov et al. are available at (Rubtsov et al., 2010). A hybridization expansion code for density density interactions, corresponding to the description in Sec. V.B, as well as an interaction expansion implementation are also available as part of the ALPS 57 project (Bauer et al., 2011) and have been published in Ref. (Gull et al., 2011b). REFERENCES Abrikosov, A. A. (1965), Physics 2, 5. Adelson-Velskii, G., and E. M. Landis (1962a), Proceedings of the USSR Academy of Sciences 146, 263. Adelson-Velskii, G., and E. M. Landis (1962b), Soviet Math. Doklady (English translation) 3, 1259. Affleck, I. (2008), unpublished. Aichhorn, M., L. Pourovskii, V. Vildosola, M. Ferrero, O. Parcollet, T. Miyake, A. Georges, and S. Biermann (2009), Phys. Rev. B 80 (8), 085101. Alvarez, G., M. S. Summers, D. E. Maxwell, M. Eisenbach, J. S. Meredith, J. M. Larkin, J. Levesque, T. A. Maier, P. R. C. Kent, E. F. D’Azevedo, and T. C. Schulthess (2008), in Proceedings of the 2008 ACM/IEEE conference on Supercomputing, SC ’08 (IEEE Press, Piscataway, NJ, USA) pp. 61:1–61:10. Anderson, P. W. (1961), Phys. Rev. 124 (1), 41. Anderson, P. W., G. Yuval, and D. R. Hamann (1970), Phys. Rev. B 1 (11), 4464. Andrei, N., K. Furuya, and J. H. Lowenstein (1983), Rev. Mod. Phys. 55 (2), 331. Armitage, N. P., P. Fournier, and R. L. Greene (2010), Rev. Mod. Phys. 82 (3), 2421. Assaad, F. F. (2008), Phys. Rev. B 78 (15), 155124. Assaad, F. F., and T. C. Lang (2007), Phys. Rev. B 76 (3), 035116. Barnes, S. E. (1976), Journal of Physics F: Metal Physics 6 (7), 1375. Bauer, B., et al. (2011), Journal of Statistical Mechanics: Theory and Experiment 2011 (05), P05001. Beard, B. B., and U.-J. Wiese (1996), Phys. Rev. Lett. 77 (25), 5130. Bickers, N. E. (1987), Rev. Mod. Phys. 59 (4), 845. Blankenbecler, R., D. J. Scalapino, and R. L. Sugar (1981), Phys. Rev. D 24 (8), 2278. Blawid, S., A. Deppeler, and A. J. Millis (2003), Phys. Rev. B 67 (16), 165105. Blümer, N. (2002), Mott-Hubbard Metal-Insulator Transition and Optical Conductivity in High Dimensions, Ph.D. thesis (Universität Augsburg). Blümer, N. (2008), “Multigrid hirsch-fye quantum monte carlo method for dynamical mean-field theory,” arXiv:0801.1222v1. Brako, R., and D. M. Newns (1981), Journal of Physics C: Solid State Physics 14 (21), 3065. Brouet, V., F. Rullier-Albenque, M. Marsi, B. Mansart, M. Aichhorn, S. Biermann, J. Faure, L. Perfetti, A. TalebIbrahimi, P. Le Fèvre, F. Bertran, A. Forget, and D. Colson (2010), Phys. Rev. Lett. 105 (8), 087001. Bulla, R., T. A. Costi, and T. Pruschke (2008), Rev. Mod. Phys. 80 (2), 395. Burovski, E., E. Kozik, N. Prokof’ev, B. Svistunov, and M. Troyer (2008), Phys. Rev. Lett. 101 (9), 090402. Burovski, E., N. Prokof’ev, B. Svistunov, and M. Troyer (2006a), Phys. Rev. Lett. 96 (16), 160402. Burovski, E., N. Prokof’ev, B. Svistunov, and M. Troyer (2006b), New J. Phys. 8 (8), 153. Caffarel, M., and W. Krauth (1994), Phys. Rev. Lett. 72 (10), 1545. Capone, M., L. de’ Medici, and A. Georges (2007), Phys. Rev. B 76 (24), 245116. Capone, M., G. Sangiovanni, C. Castellani, C. Di Castro, and M. Grilli (2004), Phys. Rev. Lett. 92 (10), 106401. Chakraborty, S., D. Galanakis, and P. Phillips (2008), Phys. Rev. B 78 (21), 212504. Chan, C.-K., P. Werner, and A. J. Millis (2009), Phys. Rev. B 80 (23), 235114. Civelli, M., M. Capone, S. S. Kancharla, O. Parcollet, and G. Kotliar (2005), Phys. Rev. Lett. 95 (10), 106402. Coleman, P. (1984), Phys. Rev. B 29 (6), 3035. Comanac, A. (2007), Dynamical Mean Field Theory of Correlated Electron Systems: New Algorithms and Applications to Local Observables, Ph.D. thesis (Columbia University). Comanac, A., L. de’ Medici, M. Capone, and A. J. Millis (2008), Nat Phys 4 (4), 287. Coqblin, B., and J. R. Schrieffer (1969), Phys. Rev. 185 (2), 847. Dao, T.-L., M. Ferrero, A. Georges, M. Capone, and O. Parcollet (2008), Phys. Rev. Lett. 101 (23), 236405. Dayal, P., S. Trebst, S. Wessel, D. Würtz, M. Troyer, S. Sabhapandit, and S. N. Coppersmith (2004), Phys. Rev. Lett. 92 (9), 097201. De Leo, L., C. Kollath, A. Georges, M. Ferrero, and O. Parcollet (2008), Phys. Rev. Lett. 101 (21), 210403. Deppeler, A., and A. J. Millis (2002), Phys. Rev. B 65 (22), 224301. Doniach, S. (1977), Physica B+C 91, 231 . Eckstein, M., M. Kollar, and P. Werner (2009), Phys. Rev. Lett. 103 (5), 056403. Eckstein, M., M. Kollar, and P. Werner (2010), Phys. Rev. B 81 (11), 115131. Eurich, N., M. Eckstein, and P. Werner (2011), Phys. Rev. B 83 (15), 155122. Ferrero, M., P. S. Cornaglia, L. De Leo, O. Parcollet, G. Kotliar, and A. Georges (2009b), Phys. Rev. B 80 (6), 064501. Ferrero, M., P. S. Cornaglia, L. De Leo, O. Parcollet, G. Kotliar, and A. Georges (2009a), Europhys. Lett. 85 (5), 57009. Feynman, R. P., and F. L. Vernon (1963), Annals of Physics 24, 118 . Fisher, D. S., G. Kotliar, and G. Moeller (1995), Phys. Rev. B 52 (24), 17112. Flensberg, K. (2003), Phys. Rev. B 68 (20), 205323. Florens, S., and A. Georges (2004), Phys. Rev. B 70 (3), 035114. Freericks, J. K., V. M. Turkowski, and V. Zlatić (2006), Phys. Rev. Lett. 97 (26), 266408. Friedel, J. (1951), Philosophical Magazine 43, 153. Friedel, J. (1956), Canadian Journal of Physics 34, 1190. Georges, A., and G. Kotliar (1992), Phys. Rev. B 45 (12), 6479. Georges, A., G. Kotliar, W. Krauth, and M. J. Rozenberg (1996), Rev. Mod. Phys. 68 (1), 13. Golosov, D. I., and Y. Gefen (2006), Phys. Rev. B 74 (20), 205316. Gorelik, E. V., and N. Blümer (2009), Phys. Rev. A 80 (5), 051602. Gorelov, E. (2007), Electronic structure of multiorbital correlated systems, Ph.D. thesis (University of Hamburg). 58 Jakobs, S. G., M. Pletyukhov, and H. Schoeller (2010), Gorelov, E., T. O. Wehling, A. N. Rubtsov, M. I. Phys. Rev. B 81 (19), 195109. Katsnelson, and A. I. Lichtenstein (2009), Jarrell, M., and J. E. Gubernatis (1996), Phys. Rev. B 80 (15), 155132. Physics Reports 269 (3), 133 . Gull, E. (2008), Continuous-time quantum Monte Carlo algoJarrell, M., T. Maier, M. H. Hettler, and A. N. Tahvilrithms for fermions, Ph.D. thesis (ETH Zurich). darzadeh (2001), Europhys. Lett. 56 (4), 563. Gull, E., O. Parcollet, P. Werner, and A. J. Millis (2009), Jauho, A.-P., N. S. Wingreen, and Y. Meir (1994), Phys. Rev. B 80 (24), 245102. Phys. Rev. B 50 (8), 5528. Gull, E., D. R. Reichman, and A. J. Millis (2010), Kadanoff, L., and G. Baym (1962), Quantum Statistical MePhys. Rev. B 82 (7), 075109. chanics (Benjamin, NY). Gull, E., P. Staar, S. Fuchs, P. Nukala, M. S. Summers, Kancharla, S. S., B. Kyung, D. Senechal, M. Civelli, T. Pruschke, T. C. Schulthess, and T. Maier (2011a), M. Capone, G. Kotliar, and A.-M. S. Tremblay (2008), Phys. Rev. B 83 (7), 075122. Phys. Rev. B 77 (18), 184516. Gull, E., P. Werner, S. Fuchs, B. Surer, Karrasch, C., T. Enss, and V. Meden (2006), T. Pruschke, and M. Troyer (2011b), Phys. Rev. B 73 (23), 235337. Computer Physics Communications 182 (4), 1078 . Karrasch, C., T. Hecht, A. Weichselbaum, Gull, E., P. Werner, A. Millis, and M. Troyer (2007), Y. Oreg, J. von Delft, and V. Meden (2007), Phys. Rev. B 76 (23), 235123. Phys. Rev. Lett. 98 (18), 186802. Gull, E., P. Werner, O. Parcollet, and M. Troyer (2008a), Keiter, H., and J. C. Kimball (1970), Europhys. Lett. 82 (5), 57003. Phys. Rev. Lett. 25 (10), 672. Gull, E., P. Werner, X. Wang, M. Troyer, and A. J. Millis Khatami, E., C. R. Lee, Z. J. Bai, R. T. Scalettar, and (2008b), Europhys. Lett. 84 (3), 37009 (6pp). M. Jarrell (2010), Phys. Rev. E 81 (5), 056703. Gunnarsson, O., and K. Schönhammer (1983), Knecht, C. (2003), Numerische Analyse des Hubbard-Modells Phys. Rev. Lett. 50 (8), 604. im Rahmen der Dynamischen Molekularfeld-Theorie (GerHafermann, H., G. Li, A. N. Rubtsov, M. I. Katman), Master’s thesis (Johannes Gutenberg - Universität snelson, A. I. Lichtenstein, and H. Monien (2009), Mainz). Phys. Rev. Lett. 102 (20), 206401. Knuth, D. (1997), The Art of Computer Programming, SortHallberg, K. A. (2006), Advances in Physics 55 (5), 477. ing and Searching, 3rd ed., Vol. 3 (Addison-Wesley). Handscomb, D. (1962), Proc. Cambridge Philos. Soc. 58, 594. Koch, E., G. Sangiovanni, and O. Gunnarsson (2008), Handscomb, D. (1964), Proc. Cambridge Philos. Soc. 60, 115. Phys. Rev. B 78 (11), 115102. Hanson, R., L. P. Kouwenhoven, J. R. Petta, Koller, W., D. Meyer, and A. C. Hewson (2004a), S. Tarucha, and L. M. K. Vandersypen (2007), Phys. Rev. B 70 (15), 155103. Rev. Mod. Phys. 79 (4), 1217. Koller, W., D. Meyer, Y. Ono, and A. C. Hewson (2004b), Hastings, W. K. (1970), Biometrika 57 (1), 97. Europhys. Lett. 66 (4), 559. Haule, K. (2007), Phys. Rev. B 75 (15), 155113. Kondo, J. (1964), Prog. Theor. Phys. 32 (1), 37. Haule, K., S. Kirchner, J. Kroha, and P. Wölfle (2001), Kotliar, G., S. Murthy, and M. J. Rozenberg (2002), Phys. Rev. B 64 (15), 155111. Phys. Rev. Lett. 89 (4), 046401. Haule, K., and G. Kotliar (2007a), Kotliar, G., S. Y. Savrasov, K. Haule, et al. (2006), Phys. Rev. B 76 (9), 092503. Rev. Mod. Phys. 78 (3), 865. Haule, K., and G. Kotliar (2007b), Kotliar, G., S. Y. Savrasov, G. Pálsson, and G. Biroli (2001), Phys. Rev. B 76 (10), 104509. Phys. Rev. Lett. 87 (18), 186401. Haule, K., and G. Kotliar (2009), Nat Phys 5 (11), 796. Kotliar, G., and D. Vollhardt (2004), Hedden, R., V. Meden, T. Pruschke, and K. Schönhammer Physics Today 57 (3), 53. (2004), J. Phys. Condens. Matter 16 (29), 5279. Kozik, E., K. V. Houcke, E. Gull, L. Pollet, Held, K. (2007), Advances in Physics 56, 829. N. Prokof’ev, B. Svistunov, and M. Troyer (2010), Held, K., I. A. Nekrasov, G. Keller, V. Eyert, N. BlueEurophys. Lett. 90 (1), 10004. mer, A. K. McMahan, R. T. Scalettar, T. PrKrauth, W. (2006), Statistical mechanics : algorithms and uschke, V. I. Anisimov, and D. Vollhardt (2006), computations (Oxford University Press). Phys. Status Solidi 243 (11), 2599. Kusunose, H. (2006), J. Phys. Soc. Jpn. 75 (5), 054713. Hettler, M. H., A. N. Tahvildar-Zadeh, M. Jarrell, et al. Kutepov, A., K. Haule, S. Y. Savrasov, and G. Kotliar (2010), (1998), Phys. Rev. B 58 (12), R7475. Phys. Rev. B 82 (4), 045105. Hirsch, J. E., and R. M. Fye (1986), Kyung, B., V. Hankevych, A.-M. Daré, and A.-M. S. TremPhys. Rev. Lett. 56 (23), 2521. blay (2004), Phys. Rev. Lett. 93 (14), 147004. Hochbruck, M., and C. Lubich (1997), Kyung, B., S. S. Kancharla, D. Sénéchal, A.-M. S. SIAM Journal on Numerical Analysis 34 (5), 1911. Tremblay, M. Civelli, and G. Kotliar (2006a), Hoshino, S., J. Otsuki, and Y. Kuramoto (2010), Phys. Rev. B 73 (16), 165114. Phys. Rev. B 81 (11), 113108. Hubbard, J. (1964), Proc. R. Soc. London, Ser. A 277 (1369), 237.Kyung, B., S. S. Kancharla, D. Sénéchal, A.-M. S. Tremblay, M. Civelli, and G. Kotliar (2006b), Huscroft, C., M. Jarrell, T. Maier, S. Moukouri, and A. N. Phys. Rev. B 73 (16), 165114. Tahvildarzadeh (2001), Phys. Rev. Lett. 86 (1), 139. Landau, D. P., and K. Binder (2000), A guide to Monte Imada, M., A. Fujimori, and Y. Tokura (1998), Carlo simulations in statistical physics (Cambridge UniRev. Mod. Phys. 70 (4), 1039. versity Press). Itzykson, C., and J.-B. Zuber (2006), Quantum field theory Lang, I. G., and Y. A. Firsov (1962), Zh. Eksp. Teor. Fiz. (New York: McGraw-Hill). 43, 1843. 59 Langreth, D. C., and P. Nordlander (1991), Phys. Rev. B 43 (4), 2541. Läuchli, A. M., and P. Werner (2009), Phys. Rev. B 80 (23), 235117. Li, G., W. Hanke, A. Rubtsov, S. Bäse, and M. Potthoff (2009), Phys. Rev. B 80 (19), 195118. Lichtenstein, A. I., and M. I. Katsnelson (2000), Phys. Rev. B 62 (14), R9283. Liebsch, A. (2005), Phys. Rev. Lett. 95 (11), 116402. Liebsch, A., H. Ishida, and J. Merino (2008), Phys. Rev. B 78 (16), 165123. Liebsch, A., and N.-H. Tong (2009), Phys. Rev. B 80 (16), 165126. Löhneysen, H. v., A. Rosch, M. Vojta, and P. Wölfle (2007), Rev. Mod. Phys. 79 (3), 1015. Macridin, A., M. Jarrell, T. Maier, P. R. C. Kent, and E. D’Azevedo (2006), Phys. Rev. Lett. 97 (3), 036401. Mahan, G. D. (2000), “Many-particle physics,” Chap. 4, 3rd ed. (Kluwer Academic/Plenum Publishers). Maier, T., M. Jarrell, T. Pruschke, and M. H. Hettler (2005a), Rev. Mod. Phys. 77 (3), 1027. Maier, T. A., M. Jarrell, and D. J. Scalapino (2007a), Phys. Rev. B 75 (13), 134519. Maier, T. A., M. Jarrell, T. C. Schulthess, P. R. C. Kent, and J. B. White (2005b), Phys. Rev. Lett. 95 (23), 237001. Maier, T. A., M. S. Jarrell, and D. J. Scalapino (2006), Phys. Rev. Lett. 96 (4), 047005. Maier, T. A., A. Macridin, M. Jarrell, and D. J. Scalapino (2007b), Phys. Rev. B 76 (14), 144516. Marianetti, C. A., K. Haule, G. Kotliar, and M. J. Fluss (2008), Phys. Rev. Lett. 101 (5), 056403. Marianetti, C. A., K. Haule, and O. Parcollet (2007), Phys. Rev. Lett. 99 (24), 246404. Matsumoto, M., M. J. Han, J. Otsuki, and S. Y. Savrasov (2009), Phys. Rev. Lett. 103 (9), 096403. Metropolis, N., A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller (1953), The Journal of Chemical Physics 21 (6), 1087. Metzner, W., and D. Vollhardt (1989), Phys. Rev. Lett. 62 (3), 324. Mikelsons, K. (2009), Extensions of Numerical Methods for Strongly Correlated Electron Systems, Ph.D. thesis (University of Cincinnati). Mikelsons, K., E. Khatami, D. Galanakis, A. Macridin, J. Moreno, and M. Jarrell (2009a), Phys. Rev. B 80 (14), 140505. Mikelsons, K., A. Macridin, and M. Jarrell (2009b), Phys. Rev. E 79 (5), 057701. Millis, A. (2004), in Strong interactions in low dimensions, Physics and Chemistry of Materials with Low-Dimensional Structures, Vol. 25, edited by F. Lvy, E. Mooser, D. Baeriswyl, and L. Degiorgi (Springer Netherlands) pp. 195–235. Mizokawa, T., and A. Fujimori (1995), Phys. Rev. B 51 (18), 12880. Moler, C., and C. V. Loan (2003), SIAM Review 45 (1), 3. Motoyama, E., G. Yu, I. Vishik, O. Vajk, P. Mang, and M. Greven (2007), Nature 445, 186. Mott, N. F. (1949), Proc. Phys. Soc. London, Sec. A 62, 416. Moukouri, S., and M. Jarrell (2001), Phys. Rev. Lett. 87 (16), 167010. Mühlbacher, L., and E. Rabani (2008), Phys. Rev. Lett. 100 (17), 176403. Müller-Hartmann, E. (1989), Z. Phys. B 74, 507 . Nagaoka, Y. (1966), Phys. Rev. 147 (1), 392. Negele, J. W., and H. Orland (1988), Quantum ManyParticle Systems (Addison-Wesley Publishing Company). Nishimoto, S., F. Gebhard, and E. Jeckelmann (2006), Physica B 378-380, 283 , proceedings of the International Conference on Strongly Correlated Electron Systems SCES 2005. Nukala, P. K. V. V., T. A. Maier, M. S. Summers, G. Alvarez, and T. C. Schulthess (2009), Phys. Rev. B 80 (19), 195111. Otsuki, J., H. Kusunose, and Y. Kuramoto (2009a), Phys. Rev. Lett. 102 (1), 017202. Otsuki, J., H. Kusunose, and Y. Kuramoto (2009b), J. Phys. Soc. Jpn. 78 (1), 014702. Otsuki, J., H. Kusunose, and Y. Kuramoto (2009c), J. Phys. Soc. Jpn. 78, 034719. Otsuki, J., H. Kusunose, P. Werner, and Y. Kuramoto (2007), J. Phys. Soc. Jpn. 76 (11), 114707. Parcollet, O., G. Biroli, and G. Kotliar (2004), Phys. Rev. Lett. 92 (22), 226402. Park, H., K. Haule, and G. Kotliar (2008a), Phys. Rev. Lett. 101 (18), 186403. Park, H., K. Haule, C. A. Marianetti, and G. Kotliar (2008b), Phys. Rev. B 77 (3), 035107. Poteryaev, A., J. M. Tomczak, S. Biermann, A. Georges, A. Lichtenstein, A. Rubtsov, T. Saha-Dasgupta, and O. K. Andersen (2007), Phys. Rev. B 76 (8), 085127. Poteryaev, A. I., M. Ferrero, A. Georges, and O. Parcollet (2008), Phys. Rev. B 78 (4), 045115. Potthoff, M. (2006), Advances in Solid State Physics , 135. Potthoff, M., T. Wegner, and W. Nolting (1997), Phys. Rev. B 55 (24), 16132. Prokof’ev, N. V., and B. V. Svistunov (1998), Phys. Rev. Lett. 81 (12), 2514. Prokof’ev, N. V., B. V. Svistunov, and I. S. Tupitsyn (1996), JETP Letters 64, 911. Prokof’ev, N. V., B. V. Svistunov, and I. S. Tupitsyn (1998), JETP Sov. Phys. 87, 310. Pruschke, T., and N. Grewe (1989), Z. Phys. B 74 (4), 439. Read, N., and D. M. Newns (1983), Journal of Physics C: Solid State Physics 16 (17), 3273. Rombouts, S., K. Heyde, and N. Jachowicz (1998), Physics Letters A 242 (4-5), 271 . Rombouts, S. M. A., K. Heyde, and N. Jachowicz (1999), Phys. Rev. Lett. 82 (21), 4155. Rubtsov, A., et al. (2010), http://code.google.com/p/ctqmc/. Rubtsov, A. N. (2003), “Quantum monte carlo determinantal algorithm without hubbard-stratonovich transformation: a general consideration,” arXiv:cond-mat/0302228. Rubtsov, A. N., M. I. Katsnelson, and A. I. Lichtenstein (2008), Phys. Rev. B 77 (3), 033101. Rubtsov, A. N., M. I. Katsnelson, A. I. Lichtenstein, and A. Georges (2009), Phys. Rev. B 79 (4), 045133. Rubtsov, A. N., and A. I. Lichtenstein (2004), JETP Letters 80, 61. Rubtsov, A. N., V. V. Savkin, and A. I. Lichtenstein (2005), Phys. Rev. B 72 (3), 035122. Sakai, S., R. Arita, and H. Aoki (2006a), Physica B 378-380, 288 , proceedings of the International Conference on Strongly Correlated Electron Systems - SCES 2005. Sakai, S., R. Arita, K. Held, and H. Aoki (2006b), Phys. Rev. B 74 (15), 155102. 60 Sandvik, A. W., and J. Kurkijärvi (1991), Phys. Rev. B 43 (7), 5950. Sangiovanni, G., M. Capone, and C. Castellani (2006), Phys. Rev. B 73 (16), 165123. Sangiovanni, G., M. Capone, C. Castellani, and M. Grilli (2005), Phys. Rev. Lett. 94 (2), 026401. Savkin, V. V., A. N. Rubtsov, M. I. Katsnelson, and A. I. Lichtenstein (2005), Phys. Rev. Lett. 94 (2), 026402. Schiró, M. (2010), Phys. Rev. B 81 (8), 085126. Schiró, M., and M. Fabrizio (2009), Phys. Rev. B 79 (15), 153302. Schmidt, P., and H. Monien (2002), “Nonequilibrium dynamical mean-field theory of a strongly correlated system,” arXiv:cond-mat/0202046. Schmidt, T. L., P. Werner, L. Mühlbacher, and A. Komnik (2008), Phys. Rev. B 78 (23), 235110. Schollwöck, U. (2005), Rev. Mod. Phys. 77 (1), 259. Shim, J. H., K. Haule, and G. Kotliar (2007), Science 318 (5856), 1615. Slezak, C., M. Jarrell, T. Maier, and J. Deisz (2009), J. Phys. Condens. Matter 21 (43), 435604. Solyom, J., and A. Zawadoswki (1974), J. Phys. F 4 (1), 80. Sordi, G., K. Haule, and A.-M. S. Tremblay (2010), Phys. Rev. Lett. 104 (22), 226402. Stewart, G. R. (1984), Rev. Mod. Phys. 56 (4), 755. Stewart, G. R. (2001), Rev. Mod. Phys. 73 (4), 797. Sun, P., and G. Kotliar (2005), Phys. Rev. Lett. 95 (1), 016402. Suzuki, M. (1976), Prog. Theor. Phys. 56 (5), 1454. Toschi, A., A. A. Katanin, and K. Held (2007), Phys. Rev. B 75 (4), 045118. Trebst, S., D. A. Huse, and M. Troyer (2004), Phys. Rev. E 70 (4), 046701. Tremblay, A. M. S., B. Kyung, and D. Senechal (2006), Low Temp. Phys. 32 (4-5), 424. Troyer, M., S. Wessel, and F. Alet (2003), Phys. Rev. Lett. 90 (12), 120201. Troyer, M., and U.-J. Wiese (2005), Phys. Rev. Lett. 94 (17), 170201. Tsuji, N., T. Oka, P. Werner, and H. Aoki (2010), “Dynamical band flipping in fermionic lattice systems: an ac-field driven change of the interaction,” arXiv:1008.2594. Vetterling, W. T., B. P. Flannery, W. H. Press, and S. A. Teukolski (1992), “Numerical Recipes in FORTRAN - The Art of Scientific Computing - Second Edition,” Chap. 2.7 Inversion by Partitioning (University Press, Cambridge). Vidhyadhiraja, N. S., A. Macridin, C. Şen, M. Jarrell, and M. Ma (2009), Phys. Rev. Lett. 102 (20), 206407. Kuneš, J., D. M. Korotin, M. A. Korotin, V. I. Anisimov, and P. Werner (2009), Phys. Rev. Lett. 102 (14), 146402. Wang, F., and D. P. Landau (2001a), Phys. Rev. E 64 (5), 056101. Wang, F., and D. P. Landau (2001b), Phys. Rev. Lett. 86 (10), 2050. Wang, X. (2010), Application of New Numerical Methods to the Physics of High-Tc Cuprate Superconductors and to Quantum Dots, Ph.D. thesis (Columbia University). Wang, X., E. Gull, L. de’ Medici, M. Capone, and A. J. Millis (2009), Phys. Rev. B 80 (4), 045101. Wang, X., and A. J. Millis (2010), Phys. Rev. B 81 (4), 045106. Wang, X., C. D. Spataru, M. S. Hybertsen, and A. J. Millis (2008), Phys. Rev. B 77 (4), 045119. Werner, P., A. Comanac, L. de’ Medici, et al. (2006), Phys. Rev. Lett. 97 (7), 076405. Werner, P., E. Gull, O. Parcollet, and A. J. Millis (2009a), Phys. Rev. B 80 (4), 045120. Werner, P., E. Gull, M. Troyer, and A. J. Millis (2008), Phys. Rev. Lett. 101 (16), 166405. Werner, P., and A. J. Millis (2006), Phys. Rev. B 74 (15), 155107. Werner, P., and A. J. Millis (2007a), Phys. Rev. B 75 (8), 085108. Werner, P., and A. J. Millis (2007b), Phys. Rev. Lett. 99 (14), 146404. Werner, P., and A. J. Millis (2007c), Phys. Rev. Lett. 99 (12), 126405. Werner, P., and A. J. Millis (2010), Phys. Rev. Lett. 104 (14), 146401. Werner, P., T. Oka, M. Eckstein, and A. J. Millis (2010), Phys. Rev. B 81 (3), 035108. Werner, P., T. Oka, and A. J. Millis (2009b), Phys. Rev. B 79 (3), 035320. White, S. R. (1992), Phys. Rev. Lett. 69 (19), 2863. Wick, G. C. (1950), Phys. Rev. 80 (2), 268. Wilson, K. G. (1975), Rev. Mod. Phys. 47 (4), 773. Wingreen, N. S., and Y. Meir (1994), Phys. Rev. B 49 (16), 11040. Yacoby, A., M. Heiblum, D. Mahalu, and H. Shtrikman (1995), Phys. Rev. Lett. 74 (20), 4047. Yoo, J., S. Chandrasekharan, R. K. Kaul, D. Ullmo, and H. U. Baranger (2005), J. Phys. A 38 (48), 10307. Yosida, K., and K. Yamada (1975), Prog. Theor. Phys. 53 (5), 1286. Yuval, G., and P. W. Anderson (1970), Phys. Rev. B 1 (4), 1522. Zhang, F. C., and T. M. Rice (1988), Phys. Rev. B 37 (7), 3759. Zimmers, A., J. M. Tomczak, R. P. S. M. Lobo, N. Bontemps, C. P. Hill, M. C. Barr, Y. Dagan, R. L. Greene, A. J. Millis, and C. C. Homes (2005), Europhys. Lett. 70 (2), 225.