A Perturbational Approach For Approximating HA Models
A Perturbational Approach For Approximating HA Models
A Perturbational Approach For Approximating HA Models
Heterogeneous-Agent Models*
Anmol Bhandari Thomas Bourany David Evans
University of Minnesota University of Chicago University of Oregon
Mikhail Golosov
University of Chicago
February 2023
Abstract
*
We thank Adrien Auclert and participants of SITE conference for thoughtful comments. Bhandari, Evans,
and Golosov thank the NSF for support (grant #36354.00.00.00).
1
1 Introduction
Heterogeneous-agent (HA) models, with their ability to map naturally to rich comprehensive
micro-level datasets, are quickly becoming a staple of macroeconomic analysis. Despite the
popularity of these models, using them to study effects of aggregate shocks and macroeconomic
risk has proven difficult. State variables in the recursive representations of such models are
complex high- (often infinite-) dimensional objects, such as distributions of agents’ endogenous
choices and exogenous characteristics. These complex state variables are endogenous and
change stochastically over time. Standard computational techniques struggle with handling
such problems.
A “small-noise” perturbational approach is one of the most popular techniques to solve
representative-agent (RA) macroeconomic models;1 this approach lies in the heart of the widely
used DYNARE approximation package. This method approximates equilibrium policy func-
tions around a non-stochastic steady state using various order of Taylor expansions with respect
to a scalar that multiplies all aggregate shocks. This approach is popular because it applies to
a broad class of economies, scales to any order of approximation, and can be executed com-
putationally fast. There are two problems with extending this approach to HA models. First,
it quickly breaks down as the state in the recursive representation becomes large since com-
putational complexity of this approach grows exponentially with the dimension of the state.
Moreover, this approach struggles when there are kinks in policy functions, the feature that
emerges naturally in HA models with occasionally binding borrowing constraints.
The goal of this paper is to overcome this challenge and develop a numerical approach that
can be used to study equilibrium behavior in a broad class of macroeconomic models, that can
handle high-dimensional state variables, and that can be easily scale-able to second- and higher-
order approximations. This scalability requirement is one of the central focuses of our approach,
as higher-order approximations are necessary to study many central issues in macroeconomics:
the effects of risks and asset pricing, welfare analysis of macroeconomic stabilization policy,
understanding non-linearities and interaction effects of macroeconomic shocks. Our approach
is intentionally designed to preserve central advantages of the DYNARE-like techniques but to
be applicable to economies with arbitrarily complex state variables and occasionally binding
borrowing constraints.
In our approach, we follow Reiter (2009), Ahn, Kaplan, Moll, Winberry, and Wolf (2018),
1
This approach was developed to solve quantitative macroeconomic models, which originally were based
almost exclusively on the representative agent assumption. The same techniques work well if agents are non-
identical but heterogeneity is small. For simplicity, we refer to all such environments as RA models.
1
Auclert, Bardóczy, Rognlie, and Straub (2021) and use the invariant distribution in the econ-
omy without aggregate shocks as the “point-of-approximation”. This invariant distribution can
be computed efficiently using existing numerical techniques and we treat the solution to this
problem as known for the purposes of our approximation. We then use small-noise expansions
of policy functions around this point-of-approximation is it is typically done in the perturba-
tional approach, e.g., Schmitt-Grohé and Uribe (2004). The standard approach then requires
computing derivatives of policy function with respect to the state. In HA economies, this is an
infinitely-dimensional object and direct computation of it is not feasible. One of the central
result of our paper is that it is also not necessary. We show that in a broad class of HA models
it is possible to derive analytically expressions that characterize, for any given order of approx-
imation, how the law of motion for the distribution is affected by aggregate shocks, and how
aggregate variables respond to those changes in the distribution. Moreover, these expressions
have tractable linear recursive structure. This tractability allows us to collapse the problem
of finding any order of approximations to a problem solving small-dimensional linear systems
of equations. The unknown variables in these systems are the approximation coefficients that
describe responses to shocks. The known parameters are given by explicit formulas that use
the point-of-approximation variables. These analytical expressions are theoretically exact and
no numerical methods are used to derive them. We then use numerical approximations in
the final step, when we compute the point-of-approximation invariant distribution, construct
required matrices from those solutions, and invert them to find approximation coefficients.
We show that our technique applies to a broad class of economies, including models with
borrowing constraints. Our technique is easily parallelizable and can compute equilibria very
fast even in models with large heterogeneity. It takes, for example, about 0.1-0.2 seconds to fully
solve for the first-order approximation (and 1-2 seconds for the second-order approximation) a
high-dimensional version of the Krusell-Smith economy on a standard desktop computer. We
also show how our techniques can be used to quickly and easily solve portfolio problems and
problems with stochastic volatility in HA environments.
To derive our analytical results, we rely on linear operator techniques. Mathematically,
the dependence of any policy function on distributions is captured by Frechet derivatives.
Even though these Frechet derivatives are typically infinite-dimensional objects in standard
HA economies, they are linear (or, more precisely, multi-linear, when we consider arbitrary
order of the derivative) operators. Moreover, it is possible to derive analytically some of their
properties. These properties, together with linearity of the operator, allows us to show that
any order of approximation of the law of motion of the aggregate distribution is described
2
by a linear recursive mathematical structure and to characterize it explicitly. This ultimately
allows us to derive exact analytical characterization of various orders of approximation of HA
models.
3
There are two key differences between our approach and theirs that enable us to obtain
higher-order approximations. Firstly, we use recursive state space representation of equilibrium
conditions, while ABRS use sequence-space formulation. For orders of approximation that are
higher than the first, knowing responses to MIT shocks is insufficient to recover all approxi-
mation coefficients. Secondly, and perhaps even more importantly, ABRS follow most of the
literature and start with approximating the invariant distribution and its transition probability
(using the popular “histogram method”, see Young (2010)) before using numerical techniques
in constructing first-order approximations. We instead derive exact analytical expressions first
before using any numerical approximations. This distinction is important. In the paper we
show that even in the limit, as the grid size used in the histogram method becomes arbitrarily
small, constructed approximations do not converge to their exact theoretical counterparts, and
that some of the second-order terms get lost. We show that this can significantly affect the
conclusions one obtains.
Our paper is also related to the approximation method used by Bhandari, Evans, Golosov,
and Sargent (2021). Like us, they used a state-space variational approach and Frechet deriva-
tives to obtain various orders of approximation of HA economy. Their approach does not extend
to economies with occasionally binding borrowing constraints. It also changes the point-of-
approximation at each node of the aggregate history, which requires many re-computations of
the point-of-approximation economies. Our method instead computes the point-of-approximation
only once and can handle kinks in policy functions that emerge due to borrowing constraints.
The analytical characterization of the law of motion of the aggregate distribution, which is the
central theoretical result that simplifies our approach, is new to our paper.
2 Environment
We consider an infinite period economic environment with a continuum of agents. The economy
consists of a vector of aggregate variables Xt that economic agents take as given and a vector
of individual variables xi,t for each agent i, aggregate shocks Θt and idiosyncratic shocks θi,t .
The shocks are exogenous and satisfy autoregressive representation
Θt = ρΘ Θt−1 + Et , (1)
where Et and εi,t are mean zero stochastic processes, independent across time and agents and
|ρΘ |, |ρθ | < 1. For technical reasons, we assume that Et is such that |Et | are all bounded by some
E. For now, we assume that these processes are time-invariant, with probability distribution
4
of εt given by continuous µ, but drop this assumption later in the paper. To streamline the
exposition, we treat Θt and θi,t to be scalars but our approach extends straightforwardly to
the case when these are arbitrary vectors. The economic decisions of individual agents can be
characterized by mappings F of the form
where initial θi,0 for each i and Θ0 as well as some subset zi,0 ∈ xi,0 and Z0 ∈ X0 are given. All
the variables must satisfy aggregate conditions that are described by mapping R of the form
Z
R xi,t di, Xt , Θt = 0 for all t. (4)
Our goal is to characterize the solution to the economy described by equations (3) and (4).
We refer to this representation as the sequence-space representation.
We assume that the solution admits a recursive representation. In particular, we suppose
that there is some subset zt ∈ xt of individual state variables, with Ωt being a joint distribution
over (zt , θt ) so that solution to (3) and (4) can be represented by policy functions X
e (Θ, Ω) and
x
e (z, θ, Θ, Ω), together with the law of motion for distribution, Ω
e (Θ, Ω). Such policy functions
must satisfy state-space representation,
F x e+ , X,
e, x e θ = 0 for all (z, θ, Θ, Ω) , (5)
Z
R x
edΩ, X, Θ = 0 for all (Θ, Ω) ,
e e (6)
and
Z Z
e (Θ, Ω) z ′ , θ′ = ι ze(z, θ, Θ, Ω) ≤ z ′ ι(ρθ θ + ϵ ≤ θ′ )µ (ϵ) dϵdΩ ⟨z, θ⟩ for all (Θ, Ω) .
Ω
(7)
e+ is the expectation of x
Here x e next period,
h i
e+ (z, θ, Θ, Ω) := E x
x e (Θ, Ω) , ρθ θ + ε, ρΘ Θ + E ,
e ze (z, θ, Θ, Ω) , Ω
5
2.1 A Krusell-Smith Example
Consider a simple version of Krusell and Smith (1998) economy. All agents supply inelastically
one unit of labor that is subject to idiosyncratic efficiency shocks θi,t which process is given by
(2). Agents receive wage Wt and save in capital stock ki,t that earns interest rate Rt . Agent’s
decisions can be represented by a maximization problem
∞
X
max E0 β t U (ci,t )
{ci,t ,ki,t }t≥0
t=0
subject to some initial capital stock ki,−1 , non-negativity constraint ki,t ≥ 0, as well as ni,t =
exp (θi,t ) and the budget constraint
Here, β, δ ∈ (0, 1) are discount and depreciation rates. We assume that U is strictly concave
and smooth function with derivative Uc that satisfies the Inada conditions. Capital and labor
are hired by a representative firm endowed with a Cobb-Douglas production technology Yt =
exp (Θt ) Ktα Nt1−α where Θt follows stochastic process (1). The market clearing conditions are
Z Z
Kt = ki,t−1 di, Nt = ni,t di. (9)
We first derive its sequence-space representation. Let λi,t and ζi,t be multipliers on agent’s
budget and non-negativity constraints. Agents optimality conditions are characterized by
equations (8),
λi,t = Rt Uc (ci,t ), Uc (ci,t ) + ζi,t = βEt λi,t+1 ,
as well as ni,t = exp (θi,t ) and the complementary slackness condition ki,t ζi,t = 0. These four
equations define a mapping F in (3), with xi,t consisting of (ki,t , ci,t , λi,t , ζi,t ), and Xt consisting
of (Rt , Wt , Kt , Nt ). The mapping R in (4) is given by equation (9) and
This problem can be written recursively in state-space representation (5), (6) and (7) with
zi,t = ki,t being an individual state variable and Ωt being a joint distribution over (ki,t , θi,t ).
As the Krusell-Smith example makes clear, mapping F consists of all necessary and sufficient
conditions that characterize individual agent (household, firm, etc) optimization, while R in-
cludes all aggregation and market clearing conditions. Since F , R, X, x are all arbitrary
6
variables, this formulation encompasses a lot of cases that may not be immediately obvious
from the simple example we gave. For example, our formulation encompasses cases when ag-
gregate shocks affect individual agents directly (just have a variable Xt′ as a part of vector
Xt that satisfies Xt′ = Θt as a part of mapping Rt ) or where some endogenous aggregate
variable Πt , such as inflation in the previous period, is a part of the endogenous state space
(just augment definition of vector zi,t to include Πt and appropriately extend definition of Ωt
to this space). Similarly, our formulation allows any moment of distribution of xi,t to affect
equilibrium, not just the first moment. To see this, suppose one considers an economy in which
equilibrium depends on variance of some variable x′i,t . Let Xt′˙ := x′i,t di be the mean of x′i,t ,
R
2 R ′′ 2
x′′i,t := x′i,t − Xt′ be the square deviation from mean of x′i,t ,and Xt′′ := xi,t di be the
variance of x′i,t . It is easy to see that this fits into our specification (3) and (4), with the first
and third equations being parts of the R mapping, and the second equation being a part of
the F mapping. Any other moment of x′i,t can be dealt with analogously.
3 Approximation approach
The aggregate dynamics of economy characterized by equations (3) and (4). For given initial
conditions (Θ−1 , Ω−1 ), they describe the stochastic process for aggregate variables Xt E t as
a function of each history of aggregate shocks E t = (E0 , ..., Et ). Our goal is to develop nu-
merical techniques to obtain first-, second-, and, in principle, higher-order approximations of
this dynamics that can be done quickly and efficiently computationally in situations when the
dimensionality of the aggregate state Ω is large, e.g. infinite. Such situations arise frequently
in heterogeneous agents economies and cannot be handled with standard approximation tech-
niques.
We build on variational methods developed by Schmitt-Grohé and Uribe (2004) and that are
widely used in popular approximation packages such as DYNARE. In particular, we consider
a sequence of economies, parameterized by a scalar σ ≥ 0, in which Θt follows the stochastic
process
Θt = ρΘ Θt−1 + σEt . (10)
That is, relative to our original economy, all aggregate shocks are scaled by σ. The state space
representation holds for all σ; and we use notation X e (Θ, Ω; σ), x
e (z, θ, Θ, Ω; σ), Ω
e (Θ, Ω; σ) to
denote policy functions in the economy parameterized by σ. The case σ = 0 corresponds to a
deterministic economy, i.e. the economy without aggregate shocks. Depending on the initial
(Θ−1 , Ω−1 ), this economy may still have a non-trivial transition dynamics. We assume that
7
it converges to a unique (0, Ω∗ ), where Ω∗ the invariant distribution corresponding that that
deterministic economy. We use Z to refer to the aggregate state (Θ, Ω), with Z ∗ = (0, Ω∗ ).
Just like with the standard variational techniques, we use various orders of Taylor expan-
sions of policy functions with respect to σ around the point (Z, σ) = (Z ∗ , 0). The brute force
application of standard techniques is impossible when Ω is a large-dimensional object since it
would require separately computing dependence of policy functions on each of these dimen-
sions. Moreover, as we explain later in the paper, approximating Ω with a small-dimensional
vector would often result in incorrect expressions for higher-order (that is higher than first-
order) terms. Our key contribution is to show these difficulties can be sidestepped entirely. By
using linear operator techniques, it is possible to characterize analytically the systems of linear
equations that determine all coefficients in the first-, second-, and higher-order approximations
of Xt E t , without having to solve explicitly for the relationship between policy functions X
e
and various moments of Ω. This analytical system of linear equations can then be solved
quickly on the computer to obtain equilibrium approximations of various orders.
To avoid carrying (Z ∗ , 0) in all formulas, we use bars to denote policy functions that are
evaluated at the point (Z, σ) = (Z ∗ , 0). Thus, X, x (z, θ), Ω are used to refer to Xe (Z ∗ ; 0),
e (z, θ, Z ∗ ; 0), and Ω
x e (Z ∗ ; 0). We refer to them as zeroth-order approximations. We use notation
Λ(z ′ , θ′ , z, θ) to denote the transition probability matrix (or, rather, function, since our space
of (z, θ) is not countable) between states (z, θ) and (z ′ , θ′ ) in the invariant distribution Ω∗
Similarly, xz (z, θ) , xzz (z, θ), xσ (z, θ), etc refer to various derivatives of policy functions with
respect to z and σ, again evaluated at (Z ∗ , 0).
We are interested in understanding the response of equilibrium variables to aggregate
shocks. Aggregate shocks affect both the exogenous state Θ and the endogenous distribu-
tion Ω. As we shall see, all the local information about the effect of this distribution can be
summarized by the Frechet derivatives of policy functions. We use notation X Z to denote the
(first order) Frechet derivatives of X with respect to Z, and use notation X Z · Ẑ to denote the
value of that derivative when evaluated in direction Ẑ.2 X Z is a linear operator on the space
of distributions over (z, θ). Since we are working with economies in which such distributions
are infinitely-dimensional, computing X Z will usually be impractical. However, the linearity of
X Z allows us to manipulate these objects analytically without having to know their numerical
values. Importantly, the analytical tractability is maintained even for higher-order expansions.
For example, let X ZZ be the second-order Frechet derivative, with X ZZ · Ẑ ′ , Ẑ ′′ denoting
2
In particular, XhZ · Ẑ
satisfies (since
Frechet and i directional or Gateaux derivative coincide when both exist)
X Z · Ẑ = limα→0 1
Xe Z ∗ + αẐ; 0 − X e (Z ∗ ; 0) . See Luenberger (1997) for overview of Frechet derivatives of
α
various orders.
8
its value in directions Ẑ ′ and Ẑ ′′ . X ZZ is a bi-linear operator and this conditional linearity
will be the key to extending our approach to the second-order approximations.
In parallel with definitions of X Z and X ZZ , we use xZ (z, θ), xZZ (z, θ), ΩZ , etc to denote
Frechet derivatives of various orders of policy functions x
e (z, θ) and Ω.
e The derivatives for the
law of motion of the aggregate state (Θ, Ω) is denoted by Z Z . Since the first element of Z is
T
ρΘ
exogenous and decays with fixed rate ρΘ , this derivative has a form Z Z = ΩZ .
0
Throughout our analysis we make the following assumptions.
Assumption 1. (a) For any Z0 = [Θ0 , Ω0 ], let Ω0 = Ω0 and Ωt (Z0 ) := Ω Ωt−1 (Z0 ) for all
t > 0. limt→∞ Ω̄t (Z0 ) = Ω∗ for all Z in a neighborhood of Z ∗ .
(b). X̃ and Ω̃ are smooth with respect to Z, σ in the neighborhood of (Z ∗ , 0);
(c). x̃ are a.e. smooth under Ω∗ with respect to z, θ, Z, σ in the neighborhood of (Z ∗ , 0);
Condition (a) is required to ensure that economy is stable and that impulse responses
converge back to stationary values, condition (b) is needed to ensure that Taylor expansions
are well-defined. Both conditions (a) and (b) have direct analogue in standard perturbational
approach to solving representative agent economies. Condition (c) is an extension of condition
(b) to individual policy functions in heterogeneous agent economies with occasionally binding
constraints. It allows for kinks in these policy functions but requires that these kinks be of
Ω∗ -measure of zero.
Importantly, condition (c) allows for situations in which there is a strictly positive mass
of agents at borrowing constraints in the invariant distribution Ω∗ . To see this, consider our
example of the Krusell and Smith economy. Saving policy functions e k (k, θ) in that economy
form kinks at the point where borrowing constraints start to bind. Thus, for each k ≥ 0 there
is at most one θk at which the borrowing constraint start bind and policy function e k kinks.
Since the distribution of θ is continuous, the set of (k, θk ) points is of Ω∗ -measure of zero. This
is true even if there is a mass point of agents at the borrowing constraint, i.e. if Ω∗ ⟨0, θ⟩ dθ
R
9
from the lower orders, with a (z, θ) denoting small-dimensional matrices that depend on (z, θ),
A denoting the small-dimensional matrices that do not depend on (z, θ), A denoting linear
high-dimensional linear operators. For the purposes of showing our main steps, we do not
need to know the specific analytical expressions for these matrices and operators and often
drop those explicit expressions from the main text. In the appendix, we provide their explicit
form that we also use to numerically solve the linear systems of equations that we derive.
Before we characterize the main result of this section, it is useful to consider the mathematical
structure of our problem using a simple example. Observe that our policy function X e depend
on endogenous state Z
e that itself is a policy function of the state in the previous period, Z.
To build the intuition for what this implies in terms of Taylor expansions, supposed for the
moment that both X e and Ze are uni-dimensional and consider perturbations of the stationary
state Z ∗ by Z ∗ + σEt in each t. Let us trace-out how this perturbation affects policy functions
to the first order, and what this implies about the approximation of stochastic sequence Xt E t
In the initial period, “terms proportional to X σ ” are merely X σ , but this terminology will be
useful for expansions in other periods. Moreover, one can show3 that X σ = 0 so that all terms
proportional to X σ can be ignored. Thus, in the initial period, X0 E 0 ≈ X ∗ + X Z E0 to the
∂ e e ∗
X Z (Z + σE0 ; σ) + σE1 ; σ = X Z Z Z E0 + X Z E1 + terms proportional to X σ . (12)
∂σ σ=0
Continuing in this fashion, we see that the first-order approximation of Xt E t for any history
of shocks E t = (E0 , ..., Et ) can be constructed recursively from policy functions as follows: the
response of Xt in period t to a period-s shock Es is equal to the value of that shock Es multiplied
by a constant given in the following table
3
It follows directly from well-known arguments in Schmitt-Grohé and Uribe (2004).
10
X0 E 0 X1 E 1 X2 E 2 X3 E 3
...
E0 X Z Ẑ0 X Z Ẑ1 X Z Ẑ2 X Z Ẑ3 ...
E1 0 X Z Ẑ0 X Z Ẑ1 X Z Ẑ2 ...
... ... ... ... ... ...
where
These objects have a natural interpretation. Ẑt is the first-order effect of the shock on the
endogenous state that occurred t periods ago, and X Z Ẑt is the first-order policy response to
this first-order change in the aggregate state. As this table makes it clear, one only needs to
n o
know the sequence X Z Ẑt to characterize the first-order response, without having to know
n o t
X Z and Ẑt separately. This insight will be very useful for heterogeneous agents economies,
t
where analogues of both X Z and Ẑt are large- (usually infinite-) dimensional objects that
are impractical to compute, while their product X Z Ẑt is a small-dimensional object, which
dimensionality is equal to that of Xt .4
We now show that exactly the same insights extends to our general economy, one merely
n o
needs to interpret X Z as a Frechet derivative and Ẑt as a sequence of directions in which
t
these derivatives are evaluated. Formally, we define this sequence recursively by Ẑ0 = [1, 0]T
and Ẑt := Z Z · Ẑt−1 . To get the intuition for this definition, recall that Z consists of exogenous
state Θ and endogenous distribution Ω. Thus, the initial direction Ẑ0 changes the exogenous
part of the state by one unit keeping the endogenous distribution at the point of approximation
n o
Ω∗ . Sequence Ẑt captures the evolution of the aggregate state following this initial shock.
t h iT
Since the first element of Z decays at constant rate ρΘ , direction Ẑt has structure ρtΘ , Ω̂t ,
where Ω̂t is the effect of the shock on the distribution in t periods. Let X Z,t := X Z · Ẑt be the
value of the Frechet derivative X Z evaluated in direction Ẑt . The next lemma shows that this
construction indeed characterizes the first-order responses to aggregate shocks.
11
explicitly is impractical.5 Economically, X Z,t captures the first-order response of aggregate
variables to shocks that occurred t periods ago; it includes the direct response to the shock,
which due to autocorrelation is given by ρtΘ , and the indirect response to the changes in the
aggregate distribution Ω̂t induced by that shock. Lemma 1 implies that to the first order the
impulse response is given by
This lemma shows that the first-order effect on individual policy function x̄Z,t (z, θ) is equal
to the sequences of first-order effects of aggregate policy functions X̄Z,t+s s weighted with
coefficients {xs (z, θ)}s that are known from the zeroth-order economy. In economic terms,
xs (z, θ) is a derivative of individual policy functions with respect to the fully anticipated,
s-period ahead change in aggregate variables in the deterministic economy. The economic
intuition for equation (14) is as follows. x̄Z,t (z, θ) captures the effect of perturbation of state
Z ∗ into direction Ẑt on individual policy functions x. Households do not care directly about
the aggregate state Z; rather they care about the effect of this perturbation on the aggregate
variables, both concurrently, X̄Z,t , and in the future, X̄Z,t+s s≥1 . Below we give a sketch of
the proof that also provides explicit expressions for {xs (z, θ)}s .
where Fx (z, θ), Fx′ (z, θ) and FX (z, θ) are the derivatives of (3) with respect to the first three
arguments in the zeroth order economy. Set Z e = Ẑt , observe that Ẑt+1 = Z̄Z · Ẑt , and use the
5
The brute force application of DYNARE attempts to compute all these objects and thus quickly fails in
economies with even modest amount of heterogeneity.
12
definitions of x̄Z,t (z, θ) and X̄Z,t to get
(Fx (z, θ) + Fx′ (z, θ)E [x̄z (·, ·)p|z, θ]) x̄Z,t (z, θ) + FX (z, θ)X̄Z,t + Fx′ (z, θ)E [x̄Z,t+1 (z, θ) |z, θ] = 0,
(15)
where p is the selection matrix implicitly defined by z = px . Define the coefficients xs
recursively via
x0 (z, θ) = − (Fx (z, θ) + Fx′ (z, θ)E [x̄z (·, ·)p|z, θ])−1 FX (z, θ)
xs+1 (z, θ) = − (Fx (z, θ) + Fx′ (z, θ)E [x̄z (·, ·)p|z, θ])−1 Fx′ (z, θ)E [xs (, )|z, θ] .
One can directly verify that x̄Z,t defined by (14) solves (15) since
Fx′ (z, θ)E xs (, )X̄Z,t+1+s |z, θ = − (Fx (z, θ) + Fx′ (z, θ)E [x̄z (·, ·)p|z, θ])−1 xs+1 (z, θ)X̄Z,t+s+1 .
We are now ready to construct approximations X Z,t t . Differentiate equation (6) into
direction Ẑt to get Z
Rx xdΩ + RX X̄Z,t + RΘ ρt = 0, (16)
Z,t
R
where xdΩ is given by
Z,t
Z Z Z
∗
xdΩ := x̄Z,t dΩ + x̄dΩ̂t . (17)
Z,t
Equation (16) shows the relationship between the unknown variable X̄Z,t , known objects Rx ,
RX and RΘ (which are simply derivatives of R mapping given by (6) with respect to its
R
arguments at the point of approximation), as well as yet-unknown xdΩ . This latter term
Z,t
captures how aggregate shock affects aggregation of individual variables to the first order. As
can be seen from equation (17), this term is a sum of two components: the effect of perturbation
on individual variables weighted with zeroth-order invariant distribution, x̄Z,t dΩ∗ , and the
R
R
aggregation of individual policy functions weighted with the perturbed distribution, x̄dΩ̂t .
This perturbed distribution Ω̂t is not known. Moreover, it is a complex, infinite-dimensional
object which makes finding it explicitly impractical.
One of the main insights of this section is to show that Ω̂t satisfies a simple linear recursive
R
structure that can be utilized to obtain explicit expressions for xdΩ without having
Z,t
to solve for Ω̂t directly; moreover, this linear recursive structure survives into higher order of
approximations. Let Dθ be a linear operator that for any distribution Ω returns Dθ · Ω ⟨z, θ⟩ =
d
dθ Ω ⟨z, θ⟩. The next lemma shows the central result that Dθ · Ω̂t follows a simple recursive law
of motion that much simplifies first-order approximations.
13
Lemma 3. For any t, Dθ · Ω̂t satisfies recursion
∞
X
Dθ · Ω̂t = A · Dθ · Ω̂t−1 − as X̄Z,t−1+s . (18)
s=0
where we used differentiation by parts to obtain the last term. Substitute (14) into this
expression and differentiate both sides with respect to θ to get
∞ Z
X ZZ
′ ′ ′ ′ ∗
Dθ ·Ω̂t z , θ =− Λ̄(z , θ , z, θ)zs (z, θ) dΩ X Z ·Ẑt−1+s − Λ̄(z ′ , θ′ , z, θ)z̄z (z, θ)Dθ · Ω̂t−1 ⟨z, θ⟩ dzdθ .
s=0 | {z } | {z }
as (z ′ ,θ′ ) A·Dθ ·Ω̂t−1 (z ′ ,θ′ )
The recursion (18) is helpful because it shows that Dθ · Ω̂t is a simple linear recursive
equation that depends on Dθ · Ω̂t−1 and X̄Z,t−1+s s weighted with known linear operators.
Since Ω̂0 = 0, we can use (18) to solve for Dθ · Ω̂t purely in terms of X̄Z,s s :
∞
t−1 X
X
Dθ · Ω̂t = − At−1−k · as−k X̄Z,s . (19)
k=0 s=0
R
To see how equation (19) is useful, observe that we can write the term x̄dΩ̂t in the expression
(17) using integration by parts as follows
Z Z
x̄dΩ̂t = − x̄z (z, θ)Dθ · Ω̂t ⟨z, θ⟩dzdθ. (20)
R
Combine it with (19) to express x̄dΩ̂t purely as a linear function of X̄Z,s s
. This allows us
to prove the following result.
14
Proof. Due to Lemma 2 we have
Z ∞ Z
X ∞ Z
X
∗ ∗ ∗
x̄Z,t dΩ = xs dΩ X̄Z,t+s = xs−t dΩ X̄Z,s (22)
s=0 s=0
with the understanding that xl = 0 for all l < 0. Substitute (19) into (20), substitute resulting
expression and (22) into (17) and re-arrange to obtain (21), where
Z t−1 Z
X
∗
Gt,s = xs−t dΩ + x̄z (z, θ) At−1−k · as−k (z, θ)dzdθ .
k=0 | {z }
B·At−1−k ·as−k
One can directly verify that Gt,s satisfies the desired recursion.
R
This corollary shows that the effect of the perturbation Ẑt on x edΩ can be expressed
purely as a function of yet-unknown variables X̄Z,s s weighted with known coefficients Gt,s .
These coefficients themselves follow a linear recursive structure that makes them fast and easy
to construct numerically using matrix multiplications. Once we substitute equation (21) into
(16), we find the linear system of equations that characterizes the first-order approximation to
our problem.
Proposition 1. X̄Z,t t
is the solution to
∞
X
Rx Gt,s X̄Z,s + RX X̄Z,t + RΘ ρtΘ = 0. (23)
s=0
Equation (23) is a linear system of equations that determines X̄Z,t t . While written as in
infinite system, it can be simplified. Assumption 1 implies that X̄Z,t → 0 as t → ∞ so it can
be transformed into a finite linear system of T × dim X equations and unknowns, and solved
using simple matrix inversions. Lemma 1 then implies that equation (23) fully characterizes
the first-order approximation.
One of the key feature of our approach is that it extends to higher-order approximations while
preserving tractable linear recursive structure of the first-order approximation. In this section
we show how it can be used to obtain the second-order approximations.
To build the intuition, we start with the same uni-dimensional illustrative example that we
used to motivate first-order approximations. The second order analogue of equation (11) is
∂2 e ∗
X (Z + σE0 ; σ) = X ZZ E02 + X σσ + terms proportional to X Zσ . (24)
∂σ 2 σ=0
15
Terms proportional to X Zσ can be ignored since they are zero to the second order, but the
terms proportional to X σσ (to which we refer to as the risk adjustment terms) can not. Term
X ZZ E02 captures the second-order response of policy functions to the E0 shock. The second-
order analogue of equation (12), that characterizes second-order responses to shocks in period
1, is given by
∂2 e e ∗
2
X Z (Z + σE 0 ; σ) + σE 1 ; σ = X ZZ Z Z + X Z Z ZZ E02 +X ZZ E12 +X ZZ Z Z E0 E1 +X σσ + X Z Z σσ .
∂σ 2 σ=0 | {z }
risk adjustment
(25)
Thus, in order to find the second-order approximation to the first period response, we need
to know second-order responses to E0 and E1 shocks, their interaction effect E0 E1 as well as
the appropriate risk adjustment. A very useful observation is that, as we continue rolling
this approximation forward, the coefficients that multiply shocks Et Es in the second-order
approximation can be described as follows
X0 E 0 X1 E 1 X2 E 2 X3 E 3
...
E0 E0 X ZZ Ẑ0 Ẑ0 + X Z Ẑ0,0 X ZZ Ẑ1 Ẑ1 + X Z Ẑ1,1 X ZZ Ẑ2 Ẑ2 + X Z Ẑ2,2 X ZZ Ẑ3 Ẑ3 + X Z Ẑ3,3 ...
E0 E1 0 X ZZ Ẑ1 Ẑ0 + X Z Ẑ1,0 X ZZ Ẑ2 Ẑ1 + X Z Ẑ2,1 X ZZ Ẑ3 Ẑ2 + X Z Ẑ3,2 ...
E0 E2 0 0 X ZZ Ẑ2 Ẑ0 + X Z Ẑ2,0 X ZZ Ẑ3 Ẑ1 + X Z Ẑ3,1 ...
... ... ... ... ... ...
risk X σσ + X Z Ẑσσ,0 X σσ + X Z Ẑσσ,1 X σσ + X Z Ẑσσ,2 X σσ + X Z Ẑσσ,3 ...
where
Ẑ0,0 := 0 Ẑ1,1 := Z ZZ Ẑ0 Ẑ0 + Z Z Ẑ0,0 Ẑ2,2 := Z ZZ Ẑ1 Ẑ1 + Z Z Ẑ1,1 Ẑ3,3 := Z ZZ Ẑ2 Ẑ2 + Z Z Ẑ2,2 ...
0 Ẑ1,0 := 0 Ẑ2,1 := Z ZZ Ẑ1 Ẑ0 + Z Z Ẑ1,0 Ẑ3,2 := Z ZZ Ẑ2 Ẑ1 + Z Z Ẑ2,1 ...
0 0 Ẑ2,0 := 0 Ẑ3,1 := Z ZZ Ẑ2 Ẑ0 + Z Z Ẑ2,0 ...
.. .. .. ... ...
Ẑσσ,0 = 0 Ẑσσ,1 = Z̄σσ + Z Z Ẑσσ,0 Ẑσσ,2 = Z̄σσ + Z Z Ẑσσ,1 Ẑσσ,3 = Z̄σσ + Z Z Ẑσσ,2 ...
These tables show that just as in the first-order case, the terms in the second-order ap-
proximation have linear recursive structure. To understand the intuition for this structure, it
is useful to start with the following observation. The second-order approximation of policy
functions always consists of two terms: a term that captures the second-order response of
policy function to the first-order changes in the state, and a term that captures the first-order
response of policy functions to the second-order changes in the state. This observation helps to
understand the intuition for the form of expressions that we derived in the tables below. The
n o
first-order changes in states Ẑt were already described in Section 3.1. The second order
t
16
n o
changes in states are given by Ẑt,s , where Ẑt,s is the second-order effect on the current state
t,s
of two shocks, that occurred t and s periods ago. Since states themselves are endogenous and
characterized by policy functions, they have the two-term form Ẑt,s = Z ZZ Ẑt Ẑs + Z Z Ẑt−1,s−1 ,
which is a sum of the second-order responses of Ze to the first-order effects of the state and the
first-order response of Ze to the second-order changes in the state. Coefficients that characterize
second-order responses of Xt E t and the risk adjustment terms follow identical structure.
While the table describing second-order approximation coefficients has more elements than
the first-order table in Section 3.1, it shows that the second-order coefficients have the same
features that allowed us to tractably characterize their first-order analogues. In particular,
second-order responses are described by sequences X ZZ,t,s t,s and X σσ,t t , where X ZZ,t,s =
X ZZ Ẑt Ẑs + X Z Ẑt−1,s−1 and X σσ,t = X σσ + X Z Ẑσσ,t . If one knows those sequences, it is not
necessary to know separately each term from which they are constructed, such as X ZZ or
Ẑt−1,s−1 . Moreover, the sequences are described by a tractable recursive linear mathematical
structure.
We now return back to our general economy. As in Section 3.1, all the constructions used
in the uni-dimensional example remains unchanged in any-dimensional settings as long as we
understanding that “derivative” is meant to be “Frechet derivative”. In particular, in our
n o
general settings let Ẑ0,s = Ẑt,0 = Ẑσσ,0 = 0 and construct directions Ẑt,s recursively as
t,s
follows
Ẑt,s = Z̄ZZ · Ẑt−1 , Ẑs−1 + Z̄Z · Ẑt−1,s−1 for all t, s > 0,
T
Ẑσσ,t = 0, Ωσσ + Z Z · Ẑσσ,t−1 for all t > 0.
Similarly, terms characterizing the second-order responses X ZZ,t,s t,s and X σσ,t t
are de-
fined as
X ZZ,t,s := X̄ZZ · Ẑt , Ẑs + X̄Z · Ẑt,s ,
X σσ,t := X σσ + X Z · Ẑσσ,t .
Next lemma shows that these terms fully characterize second-order approximations.
17
X ZZ,t−s,t−m Es Em is the second-order response that captures the interaction between shocks
that occurred t − s and t − m periods before period t. X σσ,t is the risk-adjustment affect of
period t policy functions. The impulse response is given by
∞
X
x̄σσ,t (z, θ) = xs (z, θ)X̄σσ,t+s + z(z, θ), (28)
s=0
where z(z, θ) is known conditional on knowing X̄ZZ,t,k t,s
.
This lemma shows relationship between {x̄ZZ,t,k }t,k and X̄ZZ,t,k t,k , and between {x̄σσ,t }t
and X̄σσ,t t is characterized by the same loading terms {xs }s that were already computed for
the first order terms, plus additional adjustments {yt,k }t,k and z. Terms {yt,k }t,k are known
from the lower-order solution. Term z is know from the lower-order and X̄ZZ,t,k t,k . As we
shall see, one can find X̄ZZ,t,k t,k without using any information about X̄σσ,t t . Thus, the
general approach to the second-order approximation is to first solve for X̄ZZ,t,k t,k , then use
(28) and follow the same steps to solve for X̄σσ,t t . The proof of Lemma 5 follow very similar
steps to those of Lemma 2 except we take multiple derivatives and sum them. For example,
to obtain (5) we take a second-order derivative of (5) in directions Ẑt , Ẑs and a first-order
derivative in direction Ẑt,s , and then add the two expressions.
18
We follow the same strategy of combining expansions to several different directions for
other steps as well. Derivatives of (4) in directions Ẑt , Ẑs and Ẑt,s imply that
Z
Rx xdΩ + RX X̄ZZ,t,s + RΘ,t,s = 0
ZZ,t,s
where
Z Z Z Z Z
∗
xdΩ := x̄ZZ,t,k dΩ + x̄Z · Ẑt dΩ̂k + x̄Z · Ẑk dΩ̂t + x̄dΩ̂t,k .
ZZ,t,k
where Z Z Z
∗
xdΩ := x̄σσ,t dΩ + x̄dΩ̂σσ,t .
σσ,t
By comparing these expressions to their first-order analogue (16) and (17), one can see that
n o
the only new elements that appear in the second-order expansions are directions Ω̂t,k and
n o t,k
Ω̂σσ,t that capture second-order effects on the law of motion of the aggregate distribution.
t n o
Crucially, just like their first-order analogue Ω̂t , they have linear recursive representations.
t
These linear recursive representations are important as they allow us to sidestep the need
to explicitly calculate second-order effects of changes in the aggregate state on individual
policy functions, just like we avoided this problem in the first-order approximations. The
next two results establish analogues of Corollary 1 and Proposition 1 for the second-order
approximations.
Z ∞
X
xdΩ = Gt,s X̄σσ,s + Gσσ,t . (30)
σσ,t s=0
19
Proposition 2. X̄ZZ,t,k t,k
and X̄σσ,t t
are the solutions to
∞
X
Rx Gt,s X̄ZZ,t−k+s,s + Rx Ht,k + RX X̄ZZ,t,k + RΘ,t,k = 0 (31)
s=0
and
∞
X
Rx Gt,s X̄σσ,s + Rx Gσσ,t + RX X̄σσ,t = 0 (32)
s=0
respectively.
Proposition 2 fully characterizes second order approximation. Both (31) and (32) are linear
systems of equations. Since X̄ZZ,t,k → 0 as either t or k approach ∞, the system (31) can
be solve by appropriate truncation, just like we solved equation (23). The solution method
for equation (32) needs to be adjusted since the risk adjustment term does not disappear as
t → ∞ but instead converges to some time-invariant X̄σσ,∞ . Thus, to solve (32), we use a
terminal condition X̄σσ,t = X̄σσ,T for all t ≥ T use solve a truncated linear system of equations
T
X
Rx Ĝt,s X̄σσ,s + Rx Gσσ,t + RX X̄σσ,t = 0
s=0
T P∞
for X̄σσ,s s=0
, where Ĝt,s = Gt,s for s < T and Ĝt,T = s=T Gt,s .
Continuing in the same fashion as in Section 3.2, it is clear how the method can be extended
to third- and higher-order approximations. The key property that we exploited in recursive
construction of the second order approximation is that the second-order Frechet derivative is
a multi-linear map that preserves a lot of analytical tractability. The same property holds for
higher-order Frechet derivatives.
In our presentation, we assumed that both aggregate exogenous state Θt or individual state
zt are uni-dimensional. This was done for simplicity as nothing in the analysis relies on the
uni-dimensional natural of the shock. Suppose, for example that Θt is a vector of shocks that
follows stochastic process 1, where ρΘ is now a matrix, and Et is a vector of disturbances that
are i.i.d. over time but have arbitrary correlation across elements. Our perturbation (10)
would be unchanged, with scalar σ scaling the whole vector of shocks. Elements like X Z,t
that appear in Lemma 1 would now be vector rather than scalars, and X Z,t−s Es would be
interpreted as a dot product, and equation (23), except now it will pin down a sequence of
vectors X Z,t t . Analogous observations apply to higher orders. Similarly, when zt is a vector
the only adjustments in the analysis is needed to reinterpret the derivatives xz as gradients.
20
3.4 Computational Implementation
Sections 3.1-3.3 present our algorithm using the language of peicewise smooth functions and
infinite dimensional linear operators. We implement the algorithm on a computer by approx-
imating functions using a finite set of basis functions. Under this assumption, we show how
all of the linear operators can be represented by sparse matrices allowing for quick efficient
computation using only linear algebra once the steady state is solved.
We’ll proceed by adopting the following notation. We assume that all functions are approx-
imated using the set of N basis functions {ϕi (z, θ)}i . An approximated function is then defined
by a set of coefficients for these basis functions. In order to maintain parsimony in our nota-
tion we will use the superscript ·i to denote these coefficients. For example, the steady-state
decision rules x̄(z, θ) are approximated as follows
X
x̄(z, θ) ≈ ϕi (z, θ)x̄i .
i
We compute these coefficients using the collocation method and and a set of N grid points
{(ẑi , θ̂i )}i . Once the value of a function f (x) is known at these N grid points we determine
coefficients by solving the following linear system using the basis matrix, Φji = ϕi (ẑj , θ̂j ),
1
f f (ẑ1 , θ̂1 )
f 2 f (ẑ2 , θ̂2 )
Φ . = .
.. ..
.
fN f (ẑN , θ̂N )
Generally, we will use the same basis functions and gridpoints as were used to approximate
the steady state policy rules though this is not a requirement. Below we document how to
approximate the objects in Lemma 2-3 and corollary 1 using linear operations.
The objects to compute in Lemma 2 are the functions xs (z, θ). We begin with x0 (z, θ) which
is defined by
x0 (z, θ) = − (Fx (z, θ) + Fx′ (z, θ)E [x̄z (·, ·)|z, θ] p)−1 FX (z, θ).
In order to evaluate x0 at an arbitrary gridpoint (ẑj , θ̂j ) we need to evaluate the derivatives
of F at that gridpoint. This can be done simply and efficiently via automatic differentiation
h i
once the steady state policy rules are known. Next, the term E x̄z (·, ·)|ẑj , θ̂j needs to be
computed. Written out fully this is
h i Z
E x̄z (·, ·)|ẑj , θ̂j = x̄z (z̄(ẑj , θ̂j ), ρθ θ̂j + ϵ)µ(ϵ)dϵ.
21
wkµ f (ϵµk ) is approximated using a quadrature method
R P
When the integral f (ϵ)µ(ϵ)dϵ ≈ k
with gridpoints ϵµk and weights wkµ , this expectation can be approximated the our basis functions
as
h i XX
E x̄z (·, ·)|ẑj , θ̂j ≈ wkµ ϕi,z z̄(ẑj , θ̂j ), ρθ θ̂j + ϵµk x̄i . (33)
i k
Once the basis functions and quadrature nodes are specified and steady state transition func-
tions are known equation (33) is simply an N × N matrix operator on the coefficients x̄i who’s
h i
elements can all be precomputed. We define this matrix as EΦz 6 and therefore E x̄z (·, ·)|ẑj , θ̂j
can be written as (EΦz x̄)j . We conclude that
−1
x0 (ẑj , θ̂j ) ≈ − Fx (ẑj , θ̂j ) + Fx′ (ẑj , θ̂j ) (EΦz x̄)j p FX (ẑj , θ̂j ).
Once x0 is known at all of the gridpoints, the coefficients xi0 can be computed via the collocation
method above.
With the coefficients, xi0 , known the remaining coefficients xis can be computed inductively
using the formula
xs+1 (z, θ) = − (Fx (z, θ) + Fx′ (z, θ)E [x̄z (·, ·)p|z, θ])−1 Fx′ (z, θ)E [xs (, )|z, θ] .
h i
To evaluate this function at an arbitrary gridpoint (ẑj , θ̂j ) it is necessary to compute E xs (, )|ẑj , θ̂j .
h i
Following the same steps as for E x̄z (·, ·)|ẑj , θ̂j we observe that
h i XX
E xs (, )|ẑj , θ̂j ≈ wkµ ϕi z̄(ẑj , θ̂j ), ρθ θ̂j + ϵµk xis ≡ (EΦxs )j
i k
where all the terms in this expression are composed of simple linear operators.
The objects to compute from Lemma (3) are the functions as (z, θ) and the linear operator A,
which are defined by Z
′ ′
Λ̄(z ′ , θ′ , z, θ)zs (z, θ) dΩ∗
as z , θ =
and ZZ
′ ′
A · Dθ · Ω̂ z , θ = Λ̄(z ′ , θ′ , z, θ)z̄z (z, θ)Dθ · Ω̂ ⟨z, θ⟩ dzdθ.
6
Formally EΦ′ji = wkµ ϕ′i z̄(ẑj , θ̂j ), ρθ θ̂j + ϵµ
P
k k .
22
The building blocks for both of these objects are the steady state density dΩ∗ and the steady
state transition density. In solving for the Bewely-Aiyagari steady state both of these will have
been approximated globally using a fine grip of I points {(z̄i , θ̄i )}i and the histogram method
of Young (2010). We let ωi∗ represent the fraction of agents in the steady state who are in state
(z̄i , θ̄i ) and let Λ̄ji be the probability, given the steady state policy rules, of transitioning from
state (z̄i , θ̄i ) to state (z̄j , θ̄j ). With these objects in hand we can evaluate as at any gridpoint
(z̄i , θ̄i ) via
X
Λ̄ji Φ̄ω zs i
as z̄j , θ̄j ≈ (34)
i
where Φ̄ωik = ϕk (z̄i , θ̄i )ωi∗ is a matrix which takes a vector of coefficients and evaluates the
corresponding function on the gridpoints of the distribution weighted by the steady state
density. The operator A is approximated by a I × I matrix constructed from Λ̄ji by weighting
the rows via z̄z (z̄i , θ̄i ),
Aji = Λ̄ji z̄z (z̄i , θ̄i ).
Finally, for corollary 1 we wish to evaluate the operation B · At−1 · as , where B represents
integration weighted by x̄z (z, θ). Equation 34 allows us to represent as as a vector by evaluating
it at all of the gridpoints used to approximate the distribution. At−1 · as is then constructed by
repeatedly applying the matrix Aji to that vector. To construct B · At−1 · as , we then take that
resulting vector and take the dot product with the vector Φ̄z x̄ where Φ̄z,ji = ϕi,z z̄j , θ̄j is the
basis matrix that takes a vector of coefficients and evaluates the derivative of the corresponding
function at all the gridpoints of the distribution.
Our approach builds on the variational techniques such as Schmitt-Grohé and Uribe (2004)
and borrows directly many insights from them. Those techniques were originally developed
to study representative agent models. When applied to heterogeneous agent economies, such
an approach would seek to solve directly for derivatives X Z , X ZZ and various directions
n o
Ẑt ,which quickly becomes impractical as the dimensionality of state Z grows. Our ap-
t
proach, most importantly Corollaries 1 and 2, shows that this problem can be sidestepped in
a broad class of economies with arbitrary heterogeneity.
Our first-order approximation is closely related to the method developed in an important
paper by Auclert, Bardóczy, Rognlie, and Straub (2021), or ABRS for short. While our
formulation and theirs appear to be different, we now explain that they are, in fact, equivalent
to the first order, both mathematically and in terms of computational time needed to implement
23
them.The key difference between the two approaches is that ours easily scales to higher-order
approximations.
ABRS work with sequence-space representation (3) and (4). As we do, they also consider
approximation around point Z ∗ and exploit the insight of Boppart, Krusell, and Mitman (2018)
that the first-order approximation of a stochastic economy can be constructed from impulse
responses to MIT shocks in deterministic economy.7 Similarly to our Lemma 2, ABRS exploit
the fact that individual choices xt are only functions of aggregate time series X = {Xt }t and
so can be expressed as xt (X). Differentiating (4) for deterministic economy with respect to
the time 0 MIT shock gives
∞
X ∂Xs ∂Xt
Rx Jt,s + RX + RΘ ρtΘ = 0, (35)
∂Θ0 ∂Θ0
s=0
∂ ( xi,t di)
R
∂Xt
where Jt,s := ∂Xs and ∂Θ0 is the response of Xt to a period 0 MIT shock. The matrix J
with elements {Jt,s }t,s is the Jacobian of the sequence-space representation. It is a complicated
object to compute but in the central result of their paper ABRS show that Jt,s has a linear
n o
∂Xt
recursive structure and can be expressed as a function of Jt−1,s−1 and ∂Θ . This allows
n o0 t
∂Xt
them to re-write (35) as a linear system of equations that determines ∂Θ 0
.
t
Our approach both builds on their insights but also departs from them in two important
ways, that allows us to extend our method to higher-orders of approximation. The similarly
between the two approaches can be easily seen by comparing their equation (35) to our (23).
∂Xt
Clearly, their response ∂Θ0 corresponds to our X Z,t , and Jt,s to Gt,s , with linear recursive
structure is being exploited in both cases to construct these objects. However, steps leading
to these expressions are quite different for at least two reasons.
The first difference between our approach and that of ABRS is that we use state- rather
than sequence-space representation. It is difficult to see how sequence-space representation
can be extended to higher-orders of approximation. In higher orders, knowing responses to
MIT shocks is insufficient to recover approximations of policy functions as those, for example,
would not include the risk adjustment terms.
The second difference is that ABRS start with discretizing the invariant distribution Ω∗ and
its law of motion (using the histogram method, due to Young (2010)) and then use a mixture
of numerical and analytical techniques to construct approximations of terms that appear in
(35). In contrast, we first derive analytically the exact theoretical representation (23) before
implementing it numerically using discretization. The order at which discretization is used
7
Ultimately, our Lemma 1 is also restatement of the same insight.
24
is important. Since we approximate the exact analytical expressions, our procedure ensures
that as the grid size h of our discretization shrinks to zero the numerical solution converges to
the analytical one. This is not the case if the order is reversed. In the appendix we show if
one starts with the disctretized law of motion for the aggregate distribution, the second-order
approximations do not converge to the theoretically correct ones even as h → 0. While this is
not a problem for ABRS, who focus on the first-order approximations, it serves as a roadblock
of extending that approach to higher orders.
To derive analytically properties of the law of motions, we used linear operator techniques
that were first developed in Bhandari, Evans, Golosov, and Sargent (2021). The problem con-
sidered in that paper was much simpler, as it did not allow for any kinks in policy functions or
non-trivial transition dynamics at σ = 0. The nature of approximation was also quite different
in that paper since σ scaled both aggregate and idiosyncratic shocks. But Bhandari, Evans,
Golosov, and Sargent (2021) also derived simple versions of our Lemmas 2 and 5 and showed
that Frechet derivatives can be fruitfully used to simplify approximations of heterogeneous
agent economies.
4 Extensions
We now discuss how our approach described in Section 3 can be extended to two classes of
problems: models with stochastic volatility and portfolio problems.
For many applications that require realistic modeling of financial markets or uncertainty about
government policies it is important to allow for variation in volatility of aggregate shocks. The
conventional wisdom (see, e.g., discussion in Fernández-Villaverde, Guerrón-Quintana, Rubio-
Ramı́rez, and Uribe (2011)) is that analysis of stochastic volatility requires at least third-order
approximations, with second-orders terms capturing the mean level of volatility and third-order
terms capturing the effect of deviations from that mean. These third-order expansions can
significantly increase computational time required to approximate or estimate dynamic models
even in RA settings. While it is certainly possible to study effects of stochastic volatility in
HA models by extending our approach to the third order, it is not necessary. The second-order
approximations that we described in Section 3.2 can be modified to directly study problems
with stochastic volatility. These modifications have minimal impact on the computational
speed and allow one to analyze such problems quickly and easily.
To illustrate our main ideas, we start with describing our approximations for a particularly
25
convenient family of heteroskedastic shocks. Suppose that the aggregate shock Θt can be
described by the following stochastic process
p
Θt = ρΘ Θt−1 + Υt−1 Et , (36)
now of both shocks to Θt and volatility shocks. We assume that initial conditions are such
that Υ−1 = 0. The rest of the model is the same as in Section 2.
The state in the recursive representation now consists of a triplet (Υ, Θ, Ω). One way to
approximate this economy is to proceed as in Section 3.3, scale both shocks Et and EΥ,t with
σ, and approximate equilibrium around the deterministic point (1, 0, Ω∗ ). In order to capture
time-varying volatility, this approach would indeed require using third-order approximations.
Instead, we present a much faster and simpler method.
We scale only shock Et with σ so that the perturbed process for Θt takes the form
p
Θt = ρΘ Θt−1 + σ Υt−1 Et .
The stochastic process for Υt in the perturbed economy in unchanged and is given by (37).
Since shocks EΥ,t are not scaled with σ, volatility Υt follows a non-trivial stochastic process
even in the long run when σ = 0. Thus, our approximations are around (Υ, 0, Ω∗ ), where Υ is
a non-trivial random variable.
Observe that when σ = 0, volatility Υ has no effect on any endogenous variable. Therefore,
the invariant distribution Ω∗ is independent of Υ and coincides with the one we considered in
Section 3. Similarly, the derivatives of policy functions X
e (Υ, Θ, Ω; 0) and x
e (z, θ, Υ, Θ, Ω; 0)
with respect to Z = (Θ, Ω) are also independent of Υ and, in fact, coincide with their values
in the baseline economy in Section 3. Similarly, derivatives X σ , xσ is equal to zero for any
Υ, but the second-order derivative, X σσ (Υ) and xσσ (Υ) are non-trivial functions of Υ. This
dependence of these derivatives on Υ allows us to compute effects of stochastic volatility using
second-order approximations.
These observations imply that the only modification that our approach requires is in de-
vising a method that generalizes X σσ,t t and finds its values in stochastic volatility settings.
26
We proceed as follows. Take any history EΥ t , with its implied history of volatilities (Υ , ..., Υ ),
0 t
t
and construct sequence X σσ,t EΥ t as follows:
t t
X σσ,t EΥ := X σσ (Υt ) + X Z · Ẑσσ,t EΥ ,
t
T t−1
Ẑσσ,t EΥ = 0, Ωσσ (Υt−1 ) + Z Z · Ẑσσ,t−1 EΥ for all t > 0.
This definition is the same as that of X σσ,t t except that it explicitly recognizes the fact that
t
X σσ and Ωσσ depend on Υ. The definition of x̄σσ,t EΥ is modified analogously. The next
result extends Lemmas 1 and 4 to the settings with stochastic volatility.
Comparison of equations (39) and (28) reveals that stochastic volatility introduces two changes:
(i) the deterministic term X σσ,t+s in equation (28) is replaced with the conditional expectation
t in equation (39), and (ii) equation (39) has an additional term, Υ z (z, θ), where
E X̄σσ,t+s |EΥ t Υ
zΥ (z, θ) is an known. Modulo these two modifications, all other steps we described in Section
3.2 remain unchanged. Proceeding as in that section, we obtain
∞
X t
X
t t
Rx Gt,s E X̄σσ,s |EΥ + Rx Gσσ,t + Rx GΥ,s Υt−s + RX X̄σσ,t EΥ = 0. (40)
s=0 s=0
This equation is the analogue of equation (32). The key difference between the two equation
is that (40) has an addition term ts=0 Rx GΥ,s Υt−s , which captures the history of stochastic
P
t instead of X
volatility realizations in the past, and conditional expectations E X̄σσ,s |EΥ σσ,s .
t
In order to apply Lemma 7 we need to use (40) to recover X σσ,t EΥ t,E t
. To this end,
Υ
t . Therefore, taking equation (40) for
we exploit the fact that equation (40) holds for all t, EΥ
t−1 s−1
some history EΥ , EΥ,t and subtracting from it equation (40) for history EΥ , it is possible
t−1 t−1
to obtain express X σσ,t EΥ , EΥ,t as a function of X σσ,t−1 EΥ . Proceeding in this fashion
and simplifying, we obtain the following result.
27
t
Proposition 3. X̄σσ,t EΥ t
is given by
t
X
t
X σσ,t (EΥ ) = X̄σσ,t + X̄Υ,t−s EΥ,s ,
s=0
where X̄σσ,t t
is the solution to equation (32) and X̄Υ,t t
is the solution to
∞
X k
X
Rx Gk,s X̄Υ,s + Rx GΥ,k−s ρsΥ + RX X̄Υ,k = 0. (41)
s=0 s=0
The key insight of Proposition 3 is that the only additional step required to capture the
effect of stochastic volatility is to solve (41) for a deterministic sequence X̄Υ,t t . This step can
be done in parallel with solving (32) for X̄σσ,t t and thus it has trivial marginal effect on the
computational speed required to perform second-order approximations in stochastic volatility
settings.
We conclude this section with a remark about the stochastic process (36) and (37). We
chose this process because it automatically implies that the last term on the right hand size of
equation (39), Υt zΥ (z, θ), is linear in Υt . This linearity then carries over to equation (40) that
ultimately leads to (41). For more general specifications of stochastic volatility processes, this
last term in (39) would be of the form zΥ (z, θ, Υt ), i.e. it would be a known but potentially
a non-linear function of Υt . The easiest way to proceed in that case is to linearize zΥ with
respect to Υt thereby reducing this case to the form given in Proposition 3. This would only
affect the O term in equation (38) leaving all other equations unchanged.
5 Numerical Results
In this section we implement our algorithm using a calibration of Krusell and Smith (1998)
model. We first compare the relative accuracy and speed of the first and second order ap-
proximations using a one time shock with perfect foresight. The second order approximation
generally produces errors which are an order of magnitude smaller than the first order ap-
proximation. We then explore non-linearities of the model through the differences in impulse
responses induced by previous shocks and the welfare costs of business cycle policies.
Our baseline model extends the Krusell-Smith framework of section 2 to include capital ad-
justment costs and fiscal policy that varies over the business cycle. Investment in capital is
subject to convex adjustment, which are assumed to take the form:
2
ϕ It
ϕ(It , Kt−1 ) = − δ Kt−1 .
2 Kt−1
28
We assume that capital is held by a continuum of perfectly competitive mutual funds with a
price per unit capital given by Pt . This intermediary mutual funds holds the individual saving
ai,t as deposit, representing the market value. The aggregate saving is invested in capital by
the fund: Z
Pt Kt = ai,t di
The shares held in the mutual funds have a financial return, which is the sum of (i) capital
gains net of depreciation and investment and (i) dividends given by the marginal product of
capital rk = ∂Y /∂K, net of capital adjustment costs:
Pt (1 − δ + It ) + rtk − ϕ(It , Kt−1 )
Rt =
Pt−1
The optimal investment choice of the mutual funds implies that the price per unit capital is
exactly equal to the marginal cost of investment, namely
It
Pt = 1 + ϕ .
Kt−1
Note that Pt also represents the marginal gain of capital inside in the firm, i.e. the Q-ratio as
in the Tobin’s Q theory, c.f. Hayashi (1982).
Finally, we extend the model to include fiscal policies that vary over the business cycle in
form of a time varying labor-tax
τt = τΘ Θt ,
29
Table 1: Calibration of the Krusell-Smith economy
We begin by documenting the first order response of the Krusell and Smith style economy with
capital adjustment costs. For comparison purposes we compute the first order responses using
3 different methodologies: our benchmark approach described in section 3.1, a modification
of our approach which approximates the law of motion of the distribution with a histogram
as described in section 3.5, and the sequence space Jacobian approach of Auclert, Bardóczy,
Rognlie, and Straub. Figure 1 plots the derivative of capital, returns to capital wages and
output over time with respect to a one-time unexpected change to TFP at time 0. Under
our notation these are the variables in X̄Z,t . All three methodologies produce nearly identical
responses. An increase in at time 0 combined with capital adjustment costs instantaneously
cause the price of capital, and hence returns to capital, to increase on impact. Households
respond to this transitory higher than expected level of output by saving for the future causing
capital to rise on impact and generating the familiar hump shaped response as the TFP shock
dissipates. The presence of capital adjustment costs results in a slower decline of the capital
stock, which only returns to its steady state level after 400 quarters.
Next, we compare the accuracy of all three approaches. Measuring the numerical errors
made in the simulation of heterogeneous agents models with aggregate shocks is notoriously
difficult since there are no “reference point” to compare with. As such, we test the accuracy of
our method by studying the response to a one-time, one standard deviation positive shock to
TFP. All approaches give an approximation X̂t to the path of all aggregates in response to this
shock. Given these approximations, these sequences of aggregate variables are the only thing
30
Figure 1: Derivatives of capital, returns, wages and output with respect to the TFP shock for the 3 method-
ologies.
needed to compute the dynamics of the rest of the model in a fully non-linear way: first using
the sequence of prices, the optimal policies x̃t (z) of households are computed using standard
Dynamic programming methods. Second, given these policies, we compute the law of motion
of the distribution ω̃t (z), and, third, aggregating up, we compute the law of motion X̃t . We
can then measure the accuracy by comparing X̂t with the non-linear output X̃t . We do this in
K̂t −K̃t
figure 2 by plotting the % error in the capital stock K̃t
for the three approaches represented
by the blue, red and green lines. As would be anticipated by figure 1 all three approaches have
roughly the same error to first order, with the maximal error being on the order of 0.04% of
the capital stock.
Finally, we turn to the computational speed of our approach which we break down in table
2 by each stage of the algorithm. The timings for the first order approximation are reported
in the first two columns of the table.8 All told, once the steady state has been computed, our
algorithm takes 0.5 seconds to solve for the X Z,t terms with roughly equal time spent in all
4 of the main steps. As Lemma 1 highlights, X Z,t are all that is needed to simulate the path
of aggregates and to compute ergodic moments from the first order approximation. The other
first order terms, x̄Z,t and Ω̄Z,t , are required for the second order approximation and take an
additional 0.7 seconds to compute.
We compare this to our own implementation of the sequence space Jacobian approach of
8
All numbers are reported using a 20 core M1 ultra mac studio.
31
K̂t −K̃t
Figure 2: Nonlinear error K̃t
for the three first order approximations and the second order approxima-
tions.
ABRS 1.51s
32
Auclert, Bardóczy, Rognlie, and Straub which takes approximately 1.5 seconds to compute the
equivalent on the X Z,t . Of that time, approximately 1.35 seconds are spent on the backward
and forwards iteration steps which are the equivalent of the terms computed in Lemma 2 and
3. As we detailed in section 3.4, once the steady state is solved for our implementation of
the algorithm requires only sparse linear operations which can be done quickly and efficiently
independently of how the steady state is solved for. The methodology of ABRS generally
relies on numerical differentiation of global transition code, and is therefore limited by the
efficiency of that global code. Moreover, very careful attention has to be paid to those numeric
derivatives in order to ensure that they are accurate, see appendix C.1 of Auclert, Bardóczy,
Rognlie, and Straub for details. These numerical issues would be amplified with a second order
approximation as numerical second derivatives are more prone to numerical error. By giving
explicit expressions for these second derivatives in terms of derivatives of F and R we sidestep
these issues.
Next we turn to the second order approximation of the Krusell and Smith model with capital
adjustment costs. We begin with the derivatives capturing the non-linear responses in a perfect
foresight world, X̄ZZ,t,s . Again, we compare these terms computed using our baseline method-
ology and those computed when the law of motion of the distribution is approximated via a
histogram. Figure 3 plots these terms for the derivative of the response of capital and output.
Unlike the first order approximation, we see that these result in very different derivatives with
those of the histogram becoming very negative relative to those of baseline case. These differ-
ences in the derivatives are reflected in the errors observed in figure 2. Our baseline approach
has errors which remain very small over time, while the histogram approach generates errors
which become the same order of magnitude as those of the first order approximation.
The addition time to compute the second order approximation is broken out in the last 2
columns of table 2. As highlighted in section 3.2 there are two additional types of terms in the
second order approximation: the curvature terms, X̄ZZ,t,k , and risk correction terms X̄σσ,t . As
they follow the same mathematical structure, we break out the computational time separately
for both types. The curvature terms take 1.11 seconds to compute9 while the risk adjustment
terms take 0.83 seconds. The vast majority of the computational time for the curvature terms
9
Here we report only the time required to compute that X̄ZZ,t,t terms. We do this for two reasons. Firstly,
for most ergodic moments only the X̄ZZ,t,t are required. Secondly, computing the addition X̄ZZ,t,t+i terms are
trivially parallelizable for each i so, with enough processors, computing all the X̄ZZ,t,k terms would not require
any additional time.
33
Figure 3: Curvature terms X̄ZZ,t,t for capital (top) and output (bottom) using our baseline methodology and
when the law of motion of the distribution is approximated with a histogram.
34
Figure 4: Non-linearities in Impulse Responses. Solid line represents the HA economy while the dashed line
represents an RA model.
is spent on Lemma 5 and Proposition 2 which is a result of a large number of quadratic forms
required to compute the yt,k and RΘ,t,k terms. All combined, computing the second order
approximation requires an additional 2.63 seconds relative to the first order approximation.
We can use our second order approximation to study both non-linearities over the business
cycle. We begin with non-linearities other the business cycle by studying how impulse responses
change as a result of previous shocks. Specifically, we compute
which is the change in the impulse response resulting from a shock j periods in the past. Ap-
plying Lemma 4 it is straightforward to see that this term is exactly given by X̄ZZ,k,k+j Et Et−j.
In figure 4 we plot these changes in the impulse responses for various values of j for both
35
capital and output. In the figure we normalize by a 1 s.d. impulse response so these should
be interpreted as the percentage change in the impulse response induced by a shock in the
previous period. As a point of reference, we compare to the non-linear impulses generated by
a RA agent economy calibrated to match the same long run moments.10 Both the RA and the
HA versions of the economy are very close to linear so nearly all the terms are close to zero,
but we do observe some nonlinearities in the impulse responses. A sequence of recent positive
TFP shocks will result in amplified business cycles: the increase (fall) in output will be larger
for a positive (negative) TFP shock. For example, if the economy had received a positive TFP
shock in the previous period the response of capital could be up to 30% large in some period.
Relative to the RA economy, the HA version appears to feature greater persistence arising
from the endogenous movement of the wealth distribution.
In addition to studying non-linearities, second order approximations can be used to evaluate
the welfare effects of policies. To see this, return to Lemma 4 and take the expectation
conditional on time 0 information to find
t
!
1 X
X ZZ,t−s,t−s σE2 + X σσ,t + O E3 ,
E0 [Xt ] = X̄ +
2
s=0
where σE standard deviation of the exogenous shock Et . Taking the limit as t → ∞ gives the
long run ergodic means of the aggregate variables X as
∞
!
1 X
X ZZ,s,s σE2 + O E3
E [X] = X̄ + + X σσ,∞
2
s=0
whereX σσ,∞ = limt→∞ X σσ,t . When X contains a measure of welfare this gives a quick and
efficient algorithm for computing ergodic welfare as a function of a policy parameter since
for given value of the policy parameter it is straightforward to compute the second order
derivatives X̄ZZ,t,t and X̄σσ,t .11 We illustrate this approach by computing ergodic welfare as
a function of the tax parameter τΘ and plotting ergodic welfare (computed as consumption
equivalent relative to steady state) in figure 5. The blue line which plots ergodic welfare
using our methodology, while the green line plots the same line for a representative agent
calibration. Since the taxes are rebated lump sum, welfare in the RA calibration is invariant of
the policy parameter. That the blue line is below the green line indicates that business cycles
are more costly in the HA calibration due to market incompleteness. We see that relative
to a Laissez faire policy, τΘ = 0, making the tax policy more countercyclical initially raises
10
The second order approximation to the RA model was computed using our methodology as
11
In principle ergodic welfare will also depend on the change in the steady state X̄, in our example the steady
state will not depend on the policy parameter.
36
Figure 5: Welfare as a function of τΘ
We can use the results of section 4.1 to study the response of our benchmark model to tempo-
rary changes in volatility. In line with the literature, we’ll focus on the impulse response to a
transitory shock to the variance of the TFP process, EΥ,t . The impulse response to this shock
37
Figure 6: Impulse response to a shock to volatility.
is then defined by
38
Figure 7: Average portfolio holdings by wealth.
uncertainty does not change the level of output in the economy which means that this increased
precautionary savings motive results in lower consumption and increases savings and capital.
That increased capital eventually results in a paradoxical boom in output. As the Krusell and
Smith economy is very close to linear the magnitudes are quite modest but of a similar scale to
Fernández-Villaverde and Guerrón-Quintana (2020). Relative to the RA calibration it appears
as if the HA model slightly amplifies the effects of an increase in uncertainty. If one were to
have only computed the IRF using the methodology based on a histogram (red line) one would
incorrectly conclude that an increase in uncertainty would paradoxically result in a prolonged
decrease in capital and output. As can be readily observed by comparing figures 3 and 6 this
prolonged decrease in output is generated by the incorrect curvature terms X ZZ,t,t computed
using this methodology.
In this section we study endogenous portfolio choice in this Krusell and Smith style model.
We extend the model to allow agents to trade risk free debt, b, which is in zero net supply, in
addition to claims to capital. We impose that households cannot short-sell capital. We compute
the first order response of aggregates and the endogenous portfolio choice of individuals. We
plot the latter in figure 7. The right-hand side of the figure plots the portfolio holdings for
the entire wealth distribution. We observe that nearly all households have the same portfolio:
putting nearly all of the wealth in capital. The exception is those households nearest to the
borrowing constraint who borrow in the risk free asset in order to invest more in capital. This
surprising result comes from the fact that the poorest households are nearly entirely reliant on
39
Figure 8: Comparison of first order derivatives between endogenous and exogenous portfolio choice.
labor income which is safer than a claim to capital. They are therefore more willing to bare
additional risk than their richer counterparts. As the portfolio decisions of the agents align
very closely to those in the benchmark economy with an exogenous portfolio, the first order
responses of aggregates are nearly identical, which is observed in figure 8.
Finally we turn to how government policy can influence the portfolio choice of households.
Figure 9 plots the implied portfolio distribution for two different levels of the government policy
parameter τΘ . The top line of the figure is the portfolio distribution when τΘ = −2, the optimal
level found in the previous section, while the bottom captures the portfolio distribution when
τΘ = 2. We observe that for the richest agents their choices are nearly identical for the two
policy regimes. The poorest agents behave quite differently. When provided a lot of insurance,
τΘ = −2, the poorest agents increase their leverage and invest even more in risky capital. This
choice can be flipped if the government makes taxes pro-cyclical, τΘ = 2. The pro-cyclical tax
rate makes labor income more risky pushing households to the safer asset at which point the
no-borrowing constraint binds for some of those agents.
To give a perspective of how theses approximations perform in the presence of greater curvature
we study an additional calibration that features a higher level of risk aversion, σ = 5, and larger
40
Figure 9: Portfolio distribution for τΘ = −2 and τΘ = 2.
shocks σΘ = 0.1. We then recalibrate to model to match the same moments.12 We begin by
comparing the relative accuracy of the various approximation techniques. As can be observed
in figure 10, the three first order approaches have nearly identical errors, but relative to the
baseline case they are, unsurprisingly, much larger: the maximal error is 2% of the capital
stock compared to 0.04% in the benchmark case. The second order approximation improves
matters markedly with a maximal error of 0.5% of the capital stock.
We can see how the extreme calibration amplifies the non-linearities of the model by study-
ing the non-linear impulse responses conditional on past shocks. Figure 11 plots the percentage
change in the impulse response
E [Xt+k |Et , Et−j ] − E [Xt+k |Et , Et−j = 0]
E [Xt+k |Et , Et−j = 0]
conditional on a shock in the previous period. A 1 s.d. positive TFP shock in the previous
period can lead the impulse response of the capital stock to be up to 200% larger and the
impulse response of output being up to 50% larger.
The higher risk-aversion and larger shocks also significantly increase the welfare costs of
business cycles. Figure 12 (computed as consumption equivalent relative to steady state)
plots ergodic welfare as a function of the tax parameter τΘ The blue line which plots ergodic
welfare using our methodology, while the red line plots the same line for a representative agent
12
With the exception of the capital adjustment cost parameter which we target to match the same ratio of
the standard deviation of returns to output.
41
Figure 10: Nonlinear error K̂tK̃−K̃t for the three first order approximations and the second order approxima-
t
tions for the extreme calibration.
calibration. The extreme calibration features much large welfare costs of uncertainty with
the representative agent being willing to sacrifice nearly 3.5% of their consumption to remove
all uncertainty. The welfare costs for the heterogeneous agent economy is much larger, and,
comparing the blue lines in figures 5 and 12, we see that government policy is less effective at
mitigating the costs of uncertainty under the extreme calibration.
Finally we turn to stochastic volatility under the extreme calibration. Figure 13 plots this
impulse response using two methodologies. The blue line uses derivatives computed using our
baseline approach, the green line represents the RA calibration, and the red line is computed
assuming the LOM of the distribution is approximated using a histogram. Comparing the
green lines in figures 6 and 13 we see that, qualitatively, the RA economy responds similarly
to an temporary increase in uncertainty under the extreme calibration. The main difference
is that the responses are an order of magnitude larger. Interestingly, the qualitative response
in the heterogeneous agent economy is flipped as in the extreme calibration the increase in
uncertainty results in a decline in the capital stock.
42
Figure 11: Non-linearities in Impulse Responses. Solid line represents the HA economy while the dashed line
represents an RA model.
43
Figure 12: Welfare as a function of τΘ for extreme calibration.
44
Figure 13: Impulse response to a shock to volatility.
45
References
Ahn, S., G. Kaplan, B. Moll, T. Winberry, and C. Wolf (2018): “When Inequal-
ity Matters for Macro and Macro Matters for Inequality,” NBER Macroeconomics Annual,
32(1), 1–75.
Auclert, A., B. Bardóczy, M. Rognlie, and L. Straub (2021): “Using the Sequence-
Space Jacobian to Solve and Estimate Heterogeneous-Agent Models,” Econometrica, 89(5),
2375–2408.
Krusell, P., and A. A. Smith, Jr (1998): “Income and Wealth Heterogeneity in the
Macroeconomy,” Journal of Political Economy, 106(5), 867–896.
Luenberger, D. G. (1997): Optimization by Vector Space Methods. New York, NY, USA:
John Wiley & Sons.
Schmitt-Grohé, S., and M. Uribe (2004): “Solving Dynamic General Equilibrium Mod-
els Using a Second-Order Approximation to the Policy Function,” Journal of Economic
Dynamics and Control, 28(4), 755–775.
46
Winberry, T. (2018): “A Method for Solving and Estimating Heterogeneous Agent Macro
Models,” Quantitative Economics, 9(3), 1123–1151.
Young, E. R. (2010): “Solving the Incomplete Markets Model with Aggregate Uncertainty
using the Krusell–Smith Algorithm and Non-Stochastic Simulations,” Journal of Economic
Dynamics and Control, 34(1), 36–41.
A Proofs
A.1 Proof of Lemma 1
The path of aggregates, Xt (E t ; σ), as a history of aggregate shocks, E t , can be constructed from
T
the recursive representation X̃(Z; σ) and Ω̃(Z; σ). Begin by defining Zt (E t , σ) = Θt E t ; σ , Ωt E t ; σ
Xt (E t ; σ) = X̃ Zt E t ; σ ; σ .
(44)
A first order expansion of equations (43) and (44) around the σ = 0 steady state yields the
following recursive relationship
Xt E t = X + X Z Zt E t − Z ∗ + X σ + O E 2
(45)
Zt E t = Z ∗ + Z Z Zt−1 E t−1 − Z ∗ + Ẑ0 Et + Z σ + O E 2 .
(46)
with Ẑ0 being the direction defined in the main text and Z σ = 0, Ωσ . Our first step is to
show that X σ and Z σ are both 0 which we codify in the following lemma
Lemma 8. The first derivatives with respect to σ, X σ , Ωσ , xσ , are all 0.
Proof. Differentiating the F , R, and LOM of Ω mappings with respect to σ yields the following
system of equations13
0 = Fx (z, θ)xσ (z, θ) + FX (z, θ)X σ + Fx′ (z, θ)E xz ()z σ (z, θ) + xσ () + xZ · Z σ
Z
0 = Rx xσ dΩ∗ + RX X σ
ZZ
Ωσ ⟨z ′ , θ′ ⟩ = − δ z(z, θ) − z ′ ι(ρθ θ + ϵ ≤ θ′ )z σ (z, θ)µ(ϵ)dϵdΩ∗ .
This system of equations is homogeneous of degree 1 in X σ , Ωσ , xσ and,therefore is solved
by setting all terms to zero.
13
Here we have exploited the knowledge that E [xΘ (z ′ , θ′ )E ′ ] = 0
47
With the knowledge that X σ and Z σ are both zero and Z−1 − Z ∗ = 0 it’s possible to roll
forward equation (46) to find
t
t ∗
X t−s
Z Z · Ẑ0 Es + O E 2
Zt E −Z =
s=0
t
X
Ẑt−s Es + O E 2 ,
= (47)
s=0
where Ẑt is defined in the main text. Plugging into equation (45) yields
t
X t
X
Xt E t = X + X Z · Ẑt−s Es + O E 2 = X + X Z,t−s Es + O E 2
s=0 s=0
as desired.
Xt E t = X + X Z Zt E t − Z ∗
1
X ZZ · Zt−1 E t−1 − Z ∗ , Zt−1 E t−1 − Z ∗ + X σσ + O E 3
+ (48)
2
Zt E t = Z ∗ + Z Z Zt−1 E t−1 − Z ∗ + Ẑ0 Et
1
Z ZZ · Zt−1 E t−1 − Z ∗ , Zt−1 E t−1 − Z ∗ + Z σσ + O E 3 ,
+ (49)
2
T
where Z ZZ is defined in the main text and Z σσ = 0, Ωσσ . As Zt−1 E t−1 − Z ∗ satisfies (47)
(2) (2)
E t recursively via Ẑ0 E 0 = 0 and
Therefore if we define the direction Ẑt
t−1 X
t−1
(2) (2)
X
t t−1
Ẑt E = Z Z Ẑt−1 E + Z ZZ · Ẑt−1−s , Ẑt−1−m Es Em + Z σσ (50)
s=0 m=0
48
and
t t X
t
!
t
X 1 X
(2) t
+O E 2 .
Xt E −X = X Z,t−s Es + X ZZ · Ẑt−s , Ẑt−m Es Em + X Z · Ẑt E
2
s=0 s=0 m=0
(51)
By construction it is straightforward to use show that (50) implies that
t X
t
(2)
X
E t = Ẑσσ,t +
Ẑt Ẑt−s,t−m Es Em
s=0 m=0
(2)
E t in (51)
where both Ẑt,s and Ẑσσ,t are as defined in the main text. Substituting for Ẑt
and applying the definitions of X σσ,t and X ZZ,t,s immediately yields
t X t
!
1 X
Xt E t = ... + X ZZ,t−s,t−m Es Em + X σσ,t + O E 2
2
s=0 m=0
Assumption 1 states that the functions x̃(z, θ, Z; σ) are continuous and piece-wise smooth in
some neighborhood Z ∗ and σ = 0. We’ll start by assuming a single point where the functions
are joined defined by z ∗ (θ, Z; σ). We’ll let x̃p and x̃m denote the functions on either side of
that point thus (
x̃1 (z, θ, Z; σ) z ≥ z ∗ (θ, Z; σ)
x̃(z, θ, Z; σ) = ,
x̃0 (z, θ, Z; σ) z ≤ z ∗ (θ, Z; σ)
or
x̃(z, θ, Z; σ) = x̃0 (z, θ, Z; σ)ι (z ≤ z ∗ (θ, Z; σ)) + x̃1 (z, θ, Z; σ)ι (z ≥ z ∗ (θ, Z; σ)) ,
where z ∗ (θ, Z; σ) is determined by the agent being on the budget constraint while uncon-
strained, i.e.
z̃ 1 (z ∗ (θ, Z; σ), θ, Z; σ) = z
xZ (z, θ) · Ẑi = x0Z (z, θ) · Ẑi ι (z ≤ z ∗ (θ)) + x1Z (z, θ) · Ẑi ι (z ≥ z ∗ (θ)) + x1 (z, θ) − x0 (z, θ) z ∗Z (θ) · Ẑi δ(z − z ∗ (θ))
where x◦Z (z, θ)· Ẑi is the piece-wise smooth component of xZ (z, θ)· Ẑi . Continuity of x̃(z, θ, Z; σ)
implies that x1 (z ∗ (θ), θ) − x0 (z ∗ (θ), θ) = 0 which give the third equality. Therefore, all the
49
Differentiating twice with respect to Z implies that
xZZ (z, θ) · Ẑi , Ẑj =x◦ZZ (z, θ) · Ẑi , Ẑj + x1Z (z, θ) · Ẑj − x0Z (z, θ) · Ẑj z ∗Z (θ) · Ẑi δ(z − z ∗ (θ))
+ x1Z (z, θ) · Ẑj − x0Z (z, θ) · Ẑj z ∗Z (θ) · Ẑj δ(z − z ∗ (θ))
− x1 (z, θ) − x0 (z, θ) z ∗Z (θ) · Ẑi z ∗Z (θ) · Ẑj δ ′ (z − z ∗ (θ))
=x◦ZZ (z, θ) · Ẑi , Ẑj + xδZZ,i,j (θ)δ(z − z ∗ (θ))
where x◦ZZ (z, θ) · Ẑi , Ẑj is the piece-wise smooth component of xZZ (z, θ) · Ẑi , Ẑj and
xδZZ,i,j (θ) is given by
∗ ∗ ∗ ∗
xδZZ,i,j (θ) = x∆
Z (θ) · Ẑ j z Z (θ) · Ẑ i + x ∆
Z (θ) · Ẑ i z Z (θ) · Ẑ j + x ∆
z (θ) z Z (θ) · Ẑ i z Z (θ) · Ẑ j
with x∆ 1 ∗ 0 ∗ ∆ 1 ∗ 0 ∗
Z (θ) · Ẑi ≡ xZ (z (θ), θ) · Ẑj − xZ (z (θ), θ) · Ẑj and xz (θ) ≡ xz (z (θ), θ) − x (z (θ), θ) .
Finally, we note that z ∗Z (θ) · Ẑi is determined by
Note that all of the terms of xδZZ (θ) are known to first order. Combining all of these facts
together we have that
xZZ,i,j (z, θ) = xZ (z, θ) · Ẑi,j + xZZ (z, θ) · Ẑi , Ẑj = x◦ZZ,i,j (z, θ) + xδZZ,i,j (θ)δ(z − z ∗ (θ))
where x◦ZZ,i,j (z, θ) is the piece-wise smooth component of xZZ,i,j (z, θ).
To determine x◦ZZ,i,j (z, θ) we differentiate the F mapping twice in direction Ẑi and Ẑj at
any point (z, θ) ̸= (z ∗ (θ), θ) and add to it the derivative of F in direction Ẑi,j to get
0 = Fx (z, θ)x◦ZZ,i,j (z, θ)+FX (z, θ)X ZZ,ij +Fx′ (z, θ)E x◦ZZ,i+1,j+1 (, ) + xz (, )px◦ZZ,i,j (z, θ) +Fi,j (z, θ)
(52)
where Fi,j (z, θ) contains all the interaction terms known to first order
Fi,j (z, θ) =Fx′ (z, θ)E x◦zz ()z Z,i (z, θ)z Z,j (z, θ) + x◦zZ,j+1 ()z Z,i (z, θ) + x◦zZ,i+1 ()z Z,j (z, θ)
+ Fxx (z, θ) · (xZ,i (z, θ), xZ,j (z, θ)) + FxX (z, θ) · xZ,i (z, θ), X Z,j + Fxx′ (z, θ) · xZ,i (z, θ), x+
Z,j (z, θ)
+ FXx (z, θ) · X Z,i , xZ,j (z, θ) + FXX (z, θ) · X Z,i , X Z,j + FXx′ (z, θ) · X Z,i , x+
Z,j (z, θ)
+ Fx′ x (z, θ) · xZ,i (z, θ), xZ,j (z, θ) + Fx′ X (z, θ) · xZ,i (z, θ), X Z,j + Fx′ x′ (z, θ) · x+
+ + +
Z,i (z, θ), xZ,j (z, θ)
∗
µ(θ (z(z, θ)) − ρθ θ) δ ∗ ∗
+ Fx′ (z, θ) xzz (θ (z(z, θ)), z(z, θ))z Z,i (z, θ)z Z,j (z, θ) + xδzZ,j+1 (θ (z(z, θ)), z(z, θ))z Z,i (z, θ
z ∗θ (θ)
!
∗ ∗
+ xδzZ,i+1 (θ (z(z, θ)), z(z, θ))z Z,j (z, θ) + xδZZ,i+1,j+1 (θ (z(z, θ)), z(z, θ)) .
50
where x+
Z,i (z, θ) = E [xz (, )xZ,i (z, θ) + xZ,i+1 (, )|z, θ] . It is straightforward to verify that (52) is
solved by
∞
X
x◦ZZ,i,j (z, θ) = ◦
xs (z, θ)X ZZ,i+s,j+s + yi,j (z, θ)
s=0
◦ (z, θ) solves the recursive equation
where yi,j
◦
(z, θ) = (Fx (z, θ) + Fx′ (z, θ)E [xz (, )|z, θ] p)−1 Fi,j (z, θ) + Fx′ (z, θ)E yi+1,j+1
◦
yi,j (, )|z, θ .
◦
yt,k (z, θ) = yt,k (z, θ) + xδZZ,t,k (θ)δ(z − z ∗ (θ))
0 = Fx (z, θ)xσσ,t (z, θ) + FX (z, θ)X σσ,t + Fx′ (z, θ)E xΘΘ (, )σE2 + xσσ,t+1 (, ) + xz (, )pxσσ,t (z, θ) .
We begin by letting z(z, θ) be the function that solves the following linear functional equation
0 = Fx (z, θ)z(z, θ) + Fx′ (z, θ)E xΘΘ (, )σE2 + z(z, θ)(, ) + xz (, )pz(z, θ) .
Subtracting these two equations and defining x̂σσ,t (z, θ) = xσσ,t (z, θ) − z(z, θ) we see that
0 = Fx (z, θ)x̂σσ,t (z, θ) + FX (z, θ)X σσ,t + Fx′ (z, θ)E [x̂σσ,t+1 (, ) + xz (, )px̂σσ,t (z, θ)] .
This is identical to system of equations solved by xZ,t which allows us to conclude that
∞
X
x̂σσ,t (z, θ) = xs (z, θ)X σσ,t+s
s=0
and hence
∞
X
xσσ,t (z, θ) = xs (z, θ)X σσ,t+s + z(z, θ).
s=0
51
A.4 Proof of Lemma 3’
Differentiating the law of motion of Ω twice in direction Ẑt and Ẑk and adding to it the
derivative in direction Ẑt,k gives
Z Z
Dθ · Ω̂t+1,k+1 ⟨z ′ , θ′ ⟩ = − Λ(z ′ , θ′ , z, θ)z ZZ,t,k (z, θ)dΩ∗ + Λz ′ (z ′ , θ′ , z, θ)z Z,t (z, θ)z Z,k (z, θ)dΩ∗
Z Z
+ Λ(z ′ , θ′ , z, θ)z zZ,k (z, θ)Dθ · Ω̂t ⟨z, θ⟩dθdz + Λ(z ′ , θ′ , z, θ)z zZ,t (z, θ)Dθ · Ω̂k ⟨z, θ⟩dθdz
Z Z
− Λz ′ (z ′ , θ′ , z, θ)z Z,k (z, θ)z z (z, θ)Dθ · Ω̂t ⟨z, θ⟩dθdz − Λz ′ (z ′ , θ′ , z, θ)z Z,k (z, θ)z z (z, θ)Dθ
Z
+ Λ(z ′ , θ′ , z, θ)z̄z (z, θ)Dθ · Ω̂t,k ⟨z, θ⟩dθdz
and
Z Z
′ ′
ct+1,k+1 ⟨z , θ ⟩ = Λ(z , θ , z, θ)z Z,t (z, θ)z Z,k (z, θ)dΩ − Λ(z ′ , θ′ , z, θ)z Z,k (z, θ)z z (z, θ)Dθ · Ω̂t ⟨z, θ⟩dθdz
′ ′ ∗
Z
− Λ(z ′ , θ′ , z, θ)z Z,k (z, θ)z z (z, θ)Dθ · Ω̂t ⟨z, θ⟩dθdz.
Next, differentiating the LOM of Ω twice with respect to σ and adding to it the derivative
of the LOM in direction Ẑσσ,t yields
Z Z
′ ′
Dθ · Ω̂σσ,t+1 ⟨z , θ ⟩ = − Λ(z , θ , z, θ)z σσ,t (z, θ)dΩ + Λ(z ′ , θ′ , z, θ)z̄z (z, θ)Dθ · Ω̂σσ,t ⟨z, θ⟩dθdz.
′ ′ ∗
Substituting for z σσ,t using Lemma 2’ immediately obtains the second equation of the Lemma
with Z
′ ′
aσσ ⟨z , θ ⟩ = Λ(z ′ , θ′ , z, θ)z(z, θ)dΩ∗ .
Z Z
= − Λz (z , θ , z, θ)z z (z, θ)Ω̂⟨z, θ⟩dzdθ − Λ(z ′ , θ′ , z, θ)z zz (z, θ)Ω̂⟨z, θ⟩dzdθ
′ ′
Z Z
= Λz ′ (z ′ , θ′ , z, θ)z z (z, θ)z z (z, θ)Ω̂⟨z, θ⟩dzdθ − Λ(z ′ , θ′ , z, θ)z zz (z, θ)Ω̂⟨z, θ⟩dzdθ
= Dz · L · Ω̂⟨z ′ , θ′ ⟩ − M · Ω̂⟨z ′ , θ′ ⟩
52
where L·Ω̂⟨z ′ , θ′ ⟩ ≡Λ(z ′ , θ′ , z, θ)z z (z, θ)z z (z, θ)Ω̂⟨z, θ⟩dzdθ and M·Ω̂⟨z ′ , θ′ ⟩ ≡ Λ(z ′ , θ′ , z, θ)z zz (z, θ)Ω̂⟨z, θ⟩dzd
R R
R
From the definition of xdΩ we have
ZZ,t,k
Z Z Z Z Z
∗
xdΩ = xZZ,t,k dΩ + xZ,t dΩ̂k + xZ,k dΩ̂t + xdΩ̂t,k .
ZZ,t,k
Z Z Z Z
∗
= xZZ,t,k dΩ − xzZ,t Dθ · Ω̂k dzdθ − xzZ,k Dθ · Ω̂t dzdθ − xz Dθ · Ω̂t,k dzdθ
(53)
We’ll proceed under the assumption that k ≥ t, the case where t > k is symmetric. Our first
claim is that
∞
t−1 X
X
Dθ · Ω̂t,k = At−1−j · as−j X ZZ,s,k−t+s − b̂t,k + Dz · ĉt,k
j=0 s=0
Proof is via induction starting with our knowledge that Dθ · Ω̂0,k−t = 0. Applying the LOM
for Dθ · Ω̂t,k we have
∞
t−1 X
X ∞
X
Dθ · Ω̂t+1,k+1 =A · At−1−j · as−j X ZZ,s,k−t+s − b̂t,k + Dz · ĉt,k − as−t X ZZ,s,k−t+s
j=0 s=0 s=0
− bt+1,k+1 + Dz ct+1,k+1
∞
t X
X
= At−j · as−j X ZZ,s,k−t+s − A · b̂t,k − bt+1,k+1 + A · Dz · ĉt,k + Dz · ct+1,k+1
j=0 s=0
∞
t X
X
= At−j · as−j X ZZ,s,k−t+s − A · b̂t,k − bt+1,k+1 − M · ĉt,k + Dz · L · ĉt,k + Dz · ct+1,k+1
j=0 s=0
We conclude that
b̂t+1,k+1 = A · b̂t,k + bt+1,k+1 + M · ĉt,k
and
ĉt+1,k+1 = L · ĉt,k + ct+1,k+1 .
with the initial conditions b̂0,k−t = ĉ0,k−t = 0. Substituting for Dθ · Ω̂t,k and xZZ,t,k in (53)
yields (29) with
Z Z Z Z Z
∗
Ht,k = yt,k dΩ − xzZ,t Dθ · Ω̂k dzdθ − xzZ,k Dθ · Ω̂t dzdθ + xz b̂t,k dzdθ + xzz ĉt,k dzdθ.
For the second half of the corollary we note that we can iterate the LOM for Dθ · Ω̂σσ,t
forward starting from the initial condition Dθ · Ω̂σσ,0 = 0 to get
X ∞
t−1 X t−1
X
Dθ · Ω̂σσ,t = − At−1−k · as−k X σσ,s − At−1−k · aσσ .
k=0 s=0 k=0
53
By construction and applying integration by parts yields
Z Z Z
∗
xdΩ = xσσ dΩ − xz Dθ · Ω̂σσ,t dzdθ
σσ,t
substituting for Dθ · Ω̂σσ,t and xσσ using Lemma 2’ we have equation (30) with
Z t−1
X
Gσσ,t = zdΩ∗ + B · At−1−k · aσσ
k=0
To show equation (31) we differentiate the R mapping twice with respect to Ẑt and Ẑk and
add to it the derivative of the R mapping in direction Ẑt,k to get
Z
Rx xdΩ + RX X ZZ,t,k + RΘ,t,k = 0
ZZ,t,k
where
Z Z ! Z ! Z !
RΘ,t,k =Rxx · xdΩ , xdΩ + RxX · xdΩ , X Z,k + RxΘ · xdΩ , ρkΘ
Z,t Z,k Z,t Z,t
Z !
+ RxX · X Z,t , X Z,k + RxΘ · X Z,t , ρkΘ
+ RXx · X Z,t , xdΩ
Z,k
Z !
ρtΘ , + RΘX · ρtΘ , X Z,k + RΘΘ · ρtΘ , ρkΘ .
+ RΘx · xdΩ
Z,k
R
Substituting for xdΩ using (29) yields the desired expression.
ZZ,t,k
To show equation (31) we differentiate the R mapping twice with respect to σ and add to
it the derivative of the R mapping in direction Ẑσσ,t to find
Z
Rx xdΩ + RX X σσ,t = 0.
σσ,t
R
The desired expression is then obtained by substituting for xdΩ using (30).
σσ,t
54