A Perturbational Approach For Approximating HA Models

Download as pdf or txt
Download as pdf or txt
You are on page 1of 55

A Perturbational Approach for Approximating

Heterogeneous-Agent Models*
Anmol Bhandari Thomas Bourany David Evans
University of Minnesota University of Chicago University of Oregon
Mikhail Golosov
University of Chicago

February 2023

Abstract

We develop a novel perturbational technique to approximate a broad class of stochastic


heterogeneous-agent (HA) models that is scalable to the second-, and higher-orders of approx-
imation and that can be applied to economies that have recursive representations with very
complex state spaces such as multi-dimensional endogenous distributions. The central insight
of our approach is that it is possible to analytically characterize any order of approximation
for the stochastic process that governs this state. These characterizations have linear recursive
mathematical structure which allows us to derive exact analytical expressions for approximat-
ing coefficients as solutions to small-dimensional linear system of equations. These systems
of equations are then solved numerically. Computationally, to the first order of approxima-
tion, our method is as fast and precise as existing state-of-the-art techniques that linearize
HA models using so-called “MIT shocks” but our approach is easily scalable to higher or-
ders of approximation. We also show how our techniques can be used to obtain quickly and
efficiently approximations to models with stochastic volatility and portfolio problems in HA
environments.

*
We thank Adrien Auclert and participants of SITE conference for thoughtful comments. Bhandari, Evans,
and Golosov thank the NSF for support (grant #36354.00.00.00).

1
1 Introduction

Heterogeneous-agent (HA) models, with their ability to map naturally to rich comprehensive
micro-level datasets, are quickly becoming a staple of macroeconomic analysis. Despite the
popularity of these models, using them to study effects of aggregate shocks and macroeconomic
risk has proven difficult. State variables in the recursive representations of such models are
complex high- (often infinite-) dimensional objects, such as distributions of agents’ endogenous
choices and exogenous characteristics. These complex state variables are endogenous and
change stochastically over time. Standard computational techniques struggle with handling
such problems.
A “small-noise” perturbational approach is one of the most popular techniques to solve
representative-agent (RA) macroeconomic models;1 this approach lies in the heart of the widely
used DYNARE approximation package. This method approximates equilibrium policy func-
tions around a non-stochastic steady state using various order of Taylor expansions with respect
to a scalar that multiplies all aggregate shocks. This approach is popular because it applies to
a broad class of economies, scales to any order of approximation, and can be executed com-
putationally fast. There are two problems with extending this approach to HA models. First,
it quickly breaks down as the state in the recursive representation becomes large since com-
putational complexity of this approach grows exponentially with the dimension of the state.
Moreover, this approach struggles when there are kinks in policy functions, the feature that
emerges naturally in HA models with occasionally binding borrowing constraints.
The goal of this paper is to overcome this challenge and develop a numerical approach that
can be used to study equilibrium behavior in a broad class of macroeconomic models, that can
handle high-dimensional state variables, and that can be easily scale-able to second- and higher-
order approximations. This scalability requirement is one of the central focuses of our approach,
as higher-order approximations are necessary to study many central issues in macroeconomics:
the effects of risks and asset pricing, welfare analysis of macroeconomic stabilization policy,
understanding non-linearities and interaction effects of macroeconomic shocks. Our approach
is intentionally designed to preserve central advantages of the DYNARE-like techniques but to
be applicable to economies with arbitrarily complex state variables and occasionally binding
borrowing constraints.
In our approach, we follow Reiter (2009), Ahn, Kaplan, Moll, Winberry, and Wolf (2018),
1
This approach was developed to solve quantitative macroeconomic models, which originally were based
almost exclusively on the representative agent assumption. The same techniques work well if agents are non-
identical but heterogeneity is small. For simplicity, we refer to all such environments as RA models.

1
Auclert, Bardóczy, Rognlie, and Straub (2021) and use the invariant distribution in the econ-
omy without aggregate shocks as the “point-of-approximation”. This invariant distribution can
be computed efficiently using existing numerical techniques and we treat the solution to this
problem as known for the purposes of our approximation. We then use small-noise expansions
of policy functions around this point-of-approximation is it is typically done in the perturba-
tional approach, e.g., Schmitt-Grohé and Uribe (2004). The standard approach then requires
computing derivatives of policy function with respect to the state. In HA economies, this is an
infinitely-dimensional object and direct computation of it is not feasible. One of the central
result of our paper is that it is also not necessary. We show that in a broad class of HA models
it is possible to derive analytically expressions that characterize, for any given order of approx-
imation, how the law of motion for the distribution is affected by aggregate shocks, and how
aggregate variables respond to those changes in the distribution. Moreover, these expressions
have tractable linear recursive structure. This tractability allows us to collapse the problem
of finding any order of approximations to a problem solving small-dimensional linear systems
of equations. The unknown variables in these systems are the approximation coefficients that
describe responses to shocks. The known parameters are given by explicit formulas that use
the point-of-approximation variables. These analytical expressions are theoretically exact and
no numerical methods are used to derive them. We then use numerical approximations in
the final step, when we compute the point-of-approximation invariant distribution, construct
required matrices from those solutions, and invert them to find approximation coefficients.
We show that our technique applies to a broad class of economies, including models with
borrowing constraints. Our technique is easily parallelizable and can compute equilibria very
fast even in models with large heterogeneity. It takes, for example, about 0.1-0.2 seconds to fully
solve for the first-order approximation (and 1-2 seconds for the second-order approximation) a
high-dimensional version of the Krusell-Smith economy on a standard desktop computer. We
also show how our techniques can be used to quickly and easily solve portfolio problems and
problems with stochastic volatility in HA environments.
To derive our analytical results, we rely on linear operator techniques. Mathematically,
the dependence of any policy function on distributions is captured by Frechet derivatives.
Even though these Frechet derivatives are typically infinite-dimensional objects in standard
HA economies, they are linear (or, more precisely, multi-linear, when we consider arbitrary
order of the derivative) operators. Moreover, it is possible to derive analytically some of their
properties. These properties, together with linearity of the operator, allows us to show that
any order of approximation of the law of motion of the aggregate distribution is described

2
by a linear recursive mathematical structure and to characterize it explicitly. This ultimately
allows us to derive exact analytical characterization of various orders of approximation of HA
models.

1.1 Related literature

Several recent papers developed alternative methods to compute equilibria of HA economies.


When the underlying state in the HA can be summarized by sufficiently simple functions, the
original approach of Krusell and Smith (1998) or a more recent one by Winberry (2018) can
be used. Boppart, Krusell, and Mitman (2018) observed that first-order approximations of
stochastic economy can be fully constructed from first-order approximations of determinis-
tic response to one-time unexpected “MIT-style” shocks, and study impulse-response in HA
economies using sequences of such shocks.
In an important recent contribution, Auclert, Bardóczy, Rognlie, and Straub (2021), or
ABRS for short, developed a fast and efficient sequence-space method to solve a broad class
of HA economies to the first-order approximation. There is a close connection between our
approach and theirs and we explain that connection is great details in Section 3.5; here we
quickly summarize main similarities and differences.
ABRS build on the result of Boppart, Krusell, and Mitman (2018) that knowing responses
to MIT shocks allows one to construct first-order approximations to stochastic economies. The
key insight of ABRS is that the response of the distribution to an MIT shock can be described
as a recursive linear system, which allows one to perform computations very fast. ABRS
use this insight to build computationally the key object in their analysis (that they call “the
Jacobian”) and convert the approximation problem into a small-dimensional linear system of
equations.
Our method is related to ABRS, and in fact directly builds on their insights, as we also
exploit the recursively linear structure of linear operators to simplify the analysis and collapse
our approximation problem to small-dimensional linear systems of equations. When we restrict
attention only to the first-order approximations, there is certain equivalent between the two
approaches. While their linear recursive system is different from ours as we use two different
representations of the equilibrium conditions, it is possible to prove that as the grid size in
their approximation procedure converges to zero their first order approximating coefficients
would coincide with ours. Moreover, it takes approximately the same time to compute first-
order responses under their method and ours as both techniques ultimately use linear recursive
properties of key objects to construct closely related linear systems of equations.

3
There are two key differences between our approach and theirs that enable us to obtain
higher-order approximations. Firstly, we use recursive state space representation of equilibrium
conditions, while ABRS use sequence-space formulation. For orders of approximation that are
higher than the first, knowing responses to MIT shocks is insufficient to recover all approxi-
mation coefficients. Secondly, and perhaps even more importantly, ABRS follow most of the
literature and start with approximating the invariant distribution and its transition probability
(using the popular “histogram method”, see Young (2010)) before using numerical techniques
in constructing first-order approximations. We instead derive exact analytical expressions first
before using any numerical approximations. This distinction is important. In the paper we
show that even in the limit, as the grid size used in the histogram method becomes arbitrarily
small, constructed approximations do not converge to their exact theoretical counterparts, and
that some of the second-order terms get lost. We show that this can significantly affect the
conclusions one obtains.
Our paper is also related to the approximation method used by Bhandari, Evans, Golosov,
and Sargent (2021). Like us, they used a state-space variational approach and Frechet deriva-
tives to obtain various orders of approximation of HA economy. Their approach does not extend
to economies with occasionally binding borrowing constraints. It also changes the point-of-
approximation at each node of the aggregate history, which requires many re-computations of
the point-of-approximation economies. Our method instead computes the point-of-approximation
only once and can handle kinks in policy functions that emerge due to borrowing constraints.
The analytical characterization of the law of motion of the aggregate distribution, which is the
central theoretical result that simplifies our approach, is new to our paper.

2 Environment

We consider an infinite period economic environment with a continuum of agents. The economy
consists of a vector of aggregate variables Xt that economic agents take as given and a vector
of individual variables xi,t for each agent i, aggregate shocks Θt and idiosyncratic shocks θi,t .
The shocks are exogenous and satisfy autoregressive representation

Θt = ρΘ Θt−1 + Et , (1)

θi,t = ρθ θi,t−1 + εi,t , (2)

where Et and εi,t are mean zero stochastic processes, independent across time and agents and
|ρΘ |, |ρθ | < 1. For technical reasons, we assume that Et is such that |Et | are all bounded by some
E. For now, we assume that these processes are time-invariant, with probability distribution

4
of εt given by continuous µ, but drop this assumption later in the paper. To streamline the
exposition, we treat Θt and θi,t to be scalars but our approach extends straightforwardly to
the case when these are arbitrary vectors. The economic decisions of individual agents can be
characterized by mappings F of the form

F (xi,t , Ei,t [xi,t+1 ] , Xt , θi,t ) = 0 for all i, t, (3)

where initial θi,0 for each i and Θ0 as well as some subset zi,0 ∈ xi,0 and Z0 ∈ X0 are given. All
the variables must satisfy aggregate conditions that are described by mapping R of the form
Z 
R xi,t di, Xt , Θt = 0 for all t. (4)

Our goal is to characterize the solution to the economy described by equations (3) and (4).
We refer to this representation as the sequence-space representation.
We assume that the solution admits a recursive representation. In particular, we suppose
that there is some subset zt ∈ xt of individual state variables, with Ωt being a joint distribution
over (zt , θt ) so that solution to (3) and (4) can be represented by policy functions X
e (Θ, Ω) and
x
e (z, θ, Θ, Ω), together with the law of motion for distribution, Ω
e (Θ, Ω). Such policy functions
must satisfy state-space representation,
 
F x e+ , X,
e, x e θ = 0 for all (z, θ, Θ, Ω) , (5)
Z 
R x
edΩ, X, Θ = 0 for all (Θ, Ω) ,
e e (6)

and
Z Z
e (Θ, Ω) z ′ , θ′ = ι ze(z, θ, Θ, Ω) ≤ z ′ ι(ρθ θ + ϵ ≤ θ′ )µ (ϵ) dϵdΩ ⟨z, θ⟩ for all (Θ, Ω) .


(7)

e+ is the expectation of x
Here x e next period,
h  i
e+ (z, θ, Θ, Ω) := E x
x e (Θ, Ω) , ρθ θ + ε, ρΘ Θ + E ,
e ze (z, θ, Θ, Ω) , Ω

were the expectation is taken over (ε, E).


Representation (5) - (7) is very general and includes many standard infinite period macroe-
conomic. To make discussion concrete, we first show that Krusell and Smith (1998) economy
maps into this representation and then explain how a broad class of economic environments
map into this representation.

5
2.1 A Krusell-Smith Example

Consider a simple version of Krusell and Smith (1998) economy. All agents supply inelastically
one unit of labor that is subject to idiosyncratic efficiency shocks θi,t which process is given by
(2). Agents receive wage Wt and save in capital stock ki,t that earns interest rate Rt . Agent’s
decisions can be represented by a maximization problem

X
max E0 β t U (ci,t )
{ci,t ,ki,t }t≥0
t=0

subject to some initial capital stock ki,−1 , non-negativity constraint ki,t ≥ 0, as well as ni,t =
exp (θi,t ) and the budget constraint

ci,t + ki,t = (1 + Rt − δ) ki,t−1 + Wt ni,t . (8)

Here, β, δ ∈ (0, 1) are discount and depreciation rates. We assume that U is strictly concave
and smooth function with derivative Uc that satisfies the Inada conditions. Capital and labor
are hired by a representative firm endowed with a Cobb-Douglas production technology Yt =
exp (Θt ) Ktα Nt1−α where Θt follows stochastic process (1). The market clearing conditions are
Z Z
Kt = ki,t−1 di, Nt = ni,t di. (9)

We first derive its sequence-space representation. Let λi,t and ζi,t be multipliers on agent’s
budget and non-negativity constraints. Agents optimality conditions are characterized by
equations (8),
λi,t = Rt Uc (ci,t ), Uc (ci,t ) + ζi,t = βEt λi,t+1 ,

as well as ni,t = exp (θi,t ) and the complementary slackness condition ki,t ζi,t = 0. These four
equations define a mapping F in (3), with xi,t consisting of (ki,t , ci,t , λi,t , ζi,t ), and Xt consisting
of (Rt , Wt , Kt , Nt ). The mapping R in (4) is given by equation (9) and

Rt = α exp (Θt ) Ktα−1 Nt1−α , Wt = (1 − α) exp (Θt ) Ktα Nt−α .

This problem can be written recursively in state-space representation (5), (6) and (7) with
zi,t = ki,t being an individual state variable and Ωt being a joint distribution over (ki,t , θi,t ).

2.2 Mapping of general economic environments into our formulation

As the Krusell-Smith example makes clear, mapping F consists of all necessary and sufficient
conditions that characterize individual agent (household, firm, etc) optimization, while R in-
cludes all aggregation and market clearing conditions. Since F , R, X, x are all arbitrary

6
variables, this formulation encompasses a lot of cases that may not be immediately obvious
from the simple example we gave. For example, our formulation encompasses cases when ag-
gregate shocks affect individual agents directly (just have a variable Xt′ as a part of vector
Xt that satisfies Xt′ = Θt as a part of mapping Rt ) or where some endogenous aggregate
variable Πt , such as inflation in the previous period, is a part of the endogenous state space
(just augment definition of vector zi,t to include Πt and appropriately extend definition of Ωt
to this space). Similarly, our formulation allows any moment of distribution of xi,t to affect
equilibrium, not just the first moment. To see this, suppose one considers an economy in which
equilibrium depends on variance of some variable x′i,t . Let Xt′˙ := x′i,t di be the mean of x′i,t ,
R
 2 R  ′′ 2
x′′i,t := x′i,t − Xt′ be the square deviation from mean of x′i,t ,and Xt′′ := xi,t di be the
variance of x′i,t . It is easy to see that this fits into our specification (3) and (4), with the first
and third equations being parts of the R mapping, and the second equation being a part of
the F mapping. Any other moment of x′i,t can be dealt with analogously.

3 Approximation approach

The aggregate dynamics of economy characterized by equations (3) and (4). For given initial
conditions (Θ−1 , Ω−1 ), they describe the stochastic process for aggregate variables Xt E t as


a function of each history of aggregate shocks E t = (E0 , ..., Et ). Our goal is to develop nu-
merical techniques to obtain first-, second-, and, in principle, higher-order approximations of
this dynamics that can be done quickly and efficiently computationally in situations when the
dimensionality of the aggregate state Ω is large, e.g. infinite. Such situations arise frequently
in heterogeneous agents economies and cannot be handled with standard approximation tech-
niques.
We build on variational methods developed by Schmitt-Grohé and Uribe (2004) and that are
widely used in popular approximation packages such as DYNARE. In particular, we consider
a sequence of economies, parameterized by a scalar σ ≥ 0, in which Θt follows the stochastic
process
Θt = ρΘ Θt−1 + σEt . (10)

That is, relative to our original economy, all aggregate shocks are scaled by σ. The state space
representation holds for all σ; and we use notation X e (Θ, Ω; σ), x
e (z, θ, Θ, Ω; σ), Ω
e (Θ, Ω; σ) to
denote policy functions in the economy parameterized by σ. The case σ = 0 corresponds to a
deterministic economy, i.e. the economy without aggregate shocks. Depending on the initial
(Θ−1 , Ω−1 ), this economy may still have a non-trivial transition dynamics. We assume that

7
it converges to a unique (0, Ω∗ ), where Ω∗ the invariant distribution corresponding that that
deterministic economy. We use Z to refer to the aggregate state (Θ, Ω), with Z ∗ = (0, Ω∗ ).
Just like with the standard variational techniques, we use various orders of Taylor expan-
sions of policy functions with respect to σ around the point (Z, σ) = (Z ∗ , 0). The brute force
application of standard techniques is impossible when Ω is a large-dimensional object since it
would require separately computing dependence of policy functions on each of these dimen-
sions. Moreover, as we explain later in the paper, approximating Ω with a small-dimensional
vector would often result in incorrect expressions for higher-order (that is higher than first-
order) terms. Our key contribution is to show these difficulties can be sidestepped entirely. By
using linear operator techniques, it is possible to characterize analytically the systems of linear
equations that determine all coefficients in the first-, second-, and higher-order approximations
of Xt E t , without having to solve explicitly for the relationship between policy functions X

e
and various moments of Ω. This analytical system of linear equations can then be solved
quickly on the computer to obtain equilibrium approximations of various orders.
To avoid carrying (Z ∗ , 0) in all formulas, we use bars to denote policy functions that are
evaluated at the point (Z, σ) = (Z ∗ , 0). Thus, X, x (z, θ), Ω are used to refer to Xe (Z ∗ ; 0),
e (z, θ, Z ∗ ; 0), and Ω
x e (Z ∗ ; 0). We refer to them as zeroth-order approximations. We use notation
Λ(z ′ , θ′ , z, θ) to denote the transition probability matrix (or, rather, function, since our space
of (z, θ) is not countable) between states (z, θ) and (z ′ , θ′ ) in the invariant distribution Ω∗
Similarly, xz (z, θ) , xzz (z, θ), xσ (z, θ), etc refer to various derivatives of policy functions with
respect to z and σ, again evaluated at (Z ∗ , 0).
We are interested in understanding the response of equilibrium variables to aggregate
shocks. Aggregate shocks affect both the exogenous state Θ and the endogenous distribu-
tion Ω. As we shall see, all the local information about the effect of this distribution can be
summarized by the Frechet derivatives of policy functions. We use notation X Z to denote the
(first order) Frechet derivatives of X with respect to Z, and use notation X Z · Ẑ to denote the
value of that derivative when evaluated in direction Ẑ.2 X Z is a linear operator on the space
of distributions over (z, θ). Since we are working with economies in which such distributions
are infinitely-dimensional, computing X Z will usually be impractical. However, the linearity of
X Z allows us to manipulate these objects analytically without having to know their numerical
values. Importantly, the analytical tractability is maintained even for higher-order expansions.
 
For example, let X ZZ be the second-order Frechet derivative, with X ZZ · Ẑ ′ , Ẑ ′′ denoting
2
In particular, XhZ · Ẑ
 satisfies (since
 Frechet and i directional or Gateaux derivative coincide when both exist)
X Z · Ẑ = limα→0 1
Xe Z ∗ + αẐ; 0 − X e (Z ∗ ; 0) . See Luenberger (1997) for overview of Frechet derivatives of
α
various orders.

8
its value in directions Ẑ ′ and Ẑ ′′ . X ZZ is a bi-linear operator and this conditional linearity
will be the key to extending our approach to the second-order approximations.
In parallel with definitions of X Z and X ZZ , we use xZ (z, θ), xZZ (z, θ), ΩZ , etc to denote
Frechet derivatives of various orders of policy functions x
e (z, θ) and Ω.
e The derivatives for the
law of motion of the aggregate state (Θ, Ω) is denoted by Z Z . Since the first element of Z is
 T
ρΘ
exogenous and decays with fixed rate ρΘ , this derivative has a form Z Z = ΩZ .
0
Throughout our analysis we make the following assumptions.

Assumption 1. (a) For any Z0 = [Θ0 , Ω0 ], let Ω0 = Ω0 and Ωt (Z0 ) := Ω Ωt−1 (Z0 ) for all
t > 0. limt→∞ Ω̄t (Z0 ) = Ω∗ for all Z in a neighborhood of Z ∗ .
(b). X̃ and Ω̃ are smooth with respect to Z, σ in the neighborhood of (Z ∗ , 0);
(c). x̃ are a.e. smooth under Ω∗ with respect to z, θ, Z, σ in the neighborhood of (Z ∗ , 0);

Condition (a) is required to ensure that economy is stable and that impulse responses
converge back to stationary values, condition (b) is needed to ensure that Taylor expansions
are well-defined. Both conditions (a) and (b) have direct analogue in standard perturbational
approach to solving representative agent economies. Condition (c) is an extension of condition
(b) to individual policy functions in heterogeneous agent economies with occasionally binding
constraints. It allows for kinks in these policy functions but requires that these kinks be of
Ω∗ -measure of zero.
Importantly, condition (c) allows for situations in which there is a strictly positive mass
of agents at borrowing constraints in the invariant distribution Ω∗ . To see this, consider our
example of the Krusell and Smith economy. Saving policy functions e k (k, θ) in that economy
form kinks at the point where borrowing constraints start to bind. Thus, for each k ≥ 0 there
is at most one θk at which the borrowing constraint start bind and policy function e k kinks.
Since the distribution of θ is continuous, the set of (k, θk ) points is of Ω∗ -measure of zero. This
is true even if there is a mass point of agents at the borrowing constraint, i.e. if Ω∗ ⟨0, θ⟩ dθ
R

is of strictly positive Ω∗ -measure.


We now turn to describing our approximation approach. As we mentioned above, one of
the key steps is in deriving exact analytical representations of first-, second-, and, in principle,
higher-order approximations of policy functions. These analytical representations allow us to
reduce the problem of finding these approximations to solving linear system of equations. These
systems will have a recursive structure that expresses yet-unknown n-order approximation
terms as linear functions of terms known from previous n − 1 orders of approximations. To
make it easy to keep track, we use fonts a (z, θ), A and A to denote objects that we known

9
from the lower orders, with a (z, θ) denoting small-dimensional matrices that depend on (z, θ),
A denoting the small-dimensional matrices that do not depend on (z, θ), A denoting linear
high-dimensional linear operators. For the purposes of showing our main steps, we do not
need to know the specific analytical expressions for these matrices and operators and often
drop those explicit expressions from the main text. In the appendix, we provide their explicit
form that we also use to numerically solve the linear systems of equations that we derive.

3.1 First-order approximations

Before we characterize the main result of this section, it is useful to consider the mathematical
structure of our problem using a simple example. Observe that our policy function X e depend
on endogenous state Z
e that itself is a policy function of the state in the previous period, Z.
To build the intuition for what this implies in terms of Taylor expansions, supposed for the
moment that both X e and Ze are uni-dimensional and consider perturbations of the stationary
state Z ∗ by Z ∗ + σEt in each t. Let us trace-out how this perturbation affects policy functions
to the first order, and what this implies about the approximation of stochastic sequence Xt E t


for any history of shocks.


e (Z ∗ + σE0 ; σ) and its
The value of the policy function in the initial period is X0 E 0 = X


first order expansion is


∂ e ∗
X (Z + σE0 ; σ) = X Z E0 + terms proportional to X σ . (11)
∂σ σ=0

In the initial period, “terms proportional to X σ ” are merely X σ , but this terminology will be
useful for expansions in other periods. Moreover, one can show3 that X σ = 0 so that all terms
proportional to X σ can be ignored. Thus, in the initial period, X0 E 0 ≈ X ∗ + X Z E0 to the


first order of approximation.


The value of aggregate variable in period 1, X1 E 1 , can be described by policy functions

 
as X
e Z e (Z ∗ + σE; σ) ; σ . The first-order expansion of this expression is

∂ e e ∗ 
X Z (Z + σE0 ; σ) + σE1 ; σ = X Z Z Z E0 + X Z E1 + terms proportional to X σ . (12)
∂σ σ=0

Continuing in this fashion, we see that the first-order approximation of Xt E t for any history


of shocks E t = (E0 , ..., Et ) can be constructed recursively from policy functions as follows: the
response of Xt in period t to a period-s shock Es is equal to the value of that shock Es multiplied
by a constant given in the following table
3
It follows directly from well-known arguments in Schmitt-Grohé and Uribe (2004).

10
X0 E 0 X1 E 1 X2 E 2 X3 E 3
   
...
E0 X Z Ẑ0 X Z Ẑ1 X Z Ẑ2 X Z Ẑ3 ...
E1 0 X Z Ẑ0 X Z Ẑ1 X Z Ẑ2 ...
... ... ... ... ... ...

where

Ẑ0 := 1 Ẑ1 := Z Z Ẑ0 Ẑ2 := Z Z Ẑ1 Ẑ3 := Z Z Ẑ2 ...

These objects have a natural interpretation. Ẑt is the first-order effect of the shock on the
endogenous state that occurred t periods ago, and X Z Ẑt is the first-order policy response to
this first-order change in the aggregate state. As this table makes it clear, one only needs to
n o
know the sequence X Z Ẑt to characterize the first-order response, without having to know
n o t
X Z and Ẑt separately. This insight will be very useful for heterogeneous agents economies,
t
where analogues of both X Z and Ẑt are large- (usually infinite-) dimensional objects that
are impractical to compute, while their product X Z Ẑt is a small-dimensional object, which
dimensionality is equal to that of Xt .4
We now show that exactly the same insights extends to our general economy, one merely
n o
needs to interpret X Z as a Frechet derivative and Ẑt as a sequence of directions in which
t
these derivatives are evaluated. Formally, we define this sequence recursively by Ẑ0 = [1, 0]T
and Ẑt := Z Z · Ẑt−1 . To get the intuition for this definition, recall that Z consists of exogenous
state Θ and endogenous distribution Ω. Thus, the initial direction Ẑ0 changes the exogenous
part of the state by one unit keeping the endogenous distribution at the point of approximation
n o
Ω∗ . Sequence Ẑt captures the evolution of the aggregate state following this initial shock.
t h iT
Since the first element of Z decays at constant rate ρΘ , direction Ẑt has structure ρtΘ , Ω̂t ,
where Ω̂t is the effect of the shock on the distribution in t periods. Let X Z,t := X Z · Ẑt be the
value of the Frechet derivative X Z evaluated in direction Ẑt . The next lemma shows that this
construction indeed characterizes the first-order responses to aggregate shocks.

Lemma 1. To the first order approximation, X σ = 0 and Xt satisfies


t
X
Xt E t =X + X Z,t−s Es + O E 2 .
 
s=0

As we anticipated above, the problem of finding first-order approximation can be restated as


the problem finding values of the Frechet derivative X Z evaluated in the sequence of directions
n o
Ẑt . Both X Z and Ẑt are large- (typically infinite-) dimensional objects and computing them
t
4
More generally, when Et is a vector, its dimensionality is dim Xt × dim Et .

11
explicitly is impractical.5 Economically, X Z,t captures the first-order response of aggregate
variables to shocks that occurred t periods ago; it includes the direct response to the shock,
which due to autocorrelation is given by ρtΘ , and the indirect response to the changes in the
aggregate distribution Ω̂t induced by that shock. Lemma 1 implies that to the first order the
impulse response is given by

E [Xt |E0 ] − E [Xt |E0 = 0] = X Z,t E0 + O E 2 ,



(13)

and so X Z,t can also be interpreted as the impulse response function.
t
Aggregate policy functions X
e are closely related to the individual policy functions x
e (z, θ),
and therefore X Z,t should also depend on the responses of those individual policy functions to
the aggregate shock, {xZ,t (z, θ)}t,z,θ . It turns out that it is not necessary to separately solve

for {xZ,t (z, θ)}t,z,θ as a part of the first-order approximation since X Z,t t contains all the
information about these variables.

Lemma 2. For any t,



X
x̄Z,t (z, θ) = xs (z, θ) X̄Z,t+s . (14)
s=0

This lemma shows that the first-order effect on individual policy function x̄Z,t (z, θ) is equal

to the sequences of first-order effects of aggregate policy functions X̄Z,t+s s weighted with
coefficients {xs (z, θ)}s that are known from the zeroth-order economy. In economic terms,
xs (z, θ) is a derivative of individual policy functions with respect to the fully anticipated,
s-period ahead change in aggregate variables in the deterministic economy. The economic
intuition for equation (14) is as follows. x̄Z,t (z, θ) captures the effect of perturbation of state
Z ∗ into direction Ẑt on individual policy functions x. Households do not care directly about
the aggregate state Z; rather they care about the effect of this perturbation on the aggregate

variables, both concurrently, X̄Z,t , and in the future, X̄Z,t+s s≥1 . Below we give a sketch of
the proof that also provides explicit expressions for {xs (z, θ)}s .

Proof. (Sketch). Differentiate (5) into arbitrary direction Z:


e
h   i
Fx (z, θ)x̄Z (z, θ) · Z
e + FX (z, θ)X̄Z · Z
e + Fx′ (z, θ)E x̄z ()z̄Z (z, θ) · Z
e + x̄Z () · Z̄Z · Z
e |z, θ = 0,

where Fx (z, θ), Fx′ (z, θ) and FX (z, θ) are the derivatives of (3) with respect to the first three
arguments in the zeroth order economy. Set Z e = Ẑt , observe that Ẑt+1 = Z̄Z · Ẑt , and use the
5
The brute force application of DYNARE attempts to compute all these objects and thus quickly fails in
economies with even modest amount of heterogeneity.

12
definitions of x̄Z,t (z, θ) and X̄Z,t to get

(Fx (z, θ) + Fx′ (z, θ)E [x̄z (·, ·)p|z, θ]) x̄Z,t (z, θ) + FX (z, θ)X̄Z,t + Fx′ (z, θ)E [x̄Z,t+1 (z, θ) |z, θ] = 0,
(15)
where p is the selection matrix implicitly defined by z = px . Define the coefficients xs
recursively via

x0 (z, θ) = − (Fx (z, θ) + Fx′ (z, θ)E [x̄z (·, ·)p|z, θ])−1 FX (z, θ)
xs+1 (z, θ) = − (Fx (z, θ) + Fx′ (z, θ)E [x̄z (·, ·)p|z, θ])−1 Fx′ (z, θ)E [xs (, )|z, θ] .

One can directly verify that x̄Z,t defined by (14) solves (15) since

Fx′ (z, θ)E xs (, )X̄Z,t+1+s |z, θ = − (Fx (z, θ) + Fx′ (z, θ)E [x̄z (·, ·)p|z, θ])−1 xs+1 (z, θ)X̄Z,t+s+1 .
 


We are now ready to construct approximations X Z,t t . Differentiate equation (6) into
direction Ẑt to get Z 
Rx xdΩ + RX X̄Z,t + RΘ ρt = 0, (16)
Z,t
R 
where xdΩ is given by
Z,t
Z  Z Z

xdΩ := x̄Z,t dΩ + x̄dΩ̂t . (17)
Z,t

Equation (16) shows the relationship between the unknown variable X̄Z,t , known objects Rx ,
RX and RΘ (which are simply derivatives of R mapping given by (6) with respect to its
R 
arguments at the point of approximation), as well as yet-unknown xdΩ . This latter term
Z,t
captures how aggregate shock affects aggregation of individual variables to the first order. As
can be seen from equation (17), this term is a sum of two components: the effect of perturbation
on individual variables weighted with zeroth-order invariant distribution, x̄Z,t dΩ∗ , and the
R
R
aggregation of individual policy functions weighted with the perturbed distribution, x̄dΩ̂t .
This perturbed distribution Ω̂t is not known. Moreover, it is a complex, infinite-dimensional
object which makes finding it explicitly impractical.
One of the main insights of this section is to show that Ω̂t satisfies a simple linear recursive
R 
structure that can be utilized to obtain explicit expressions for xdΩ without having
Z,t
to solve for Ω̂t directly; moreover, this linear recursive structure survives into higher order of
approximations. Let Dθ be a linear operator that for any distribution Ω returns Dθ · Ω ⟨z, θ⟩ =
d
dθ Ω ⟨z, θ⟩. The next lemma shows the central result that Dθ · Ω̂t follows a simple recursive law
of motion that much simplifies first-order approximations.

13
Lemma 3. For any t, Dθ · Ω̂t satisfies recursion

X
Dθ · Ω̂t = A · Dθ · Ω̂t−1 − as X̄Z,t−1+s . (18)
s=0

Proof. By definition of Ω̂t we have

Ω̂t z ′ , θ′ =ΩZ · Ẑt−1 z ′ , θ′


ZZ
=− δ(z̄(z, θ) − z ′ )ι(ρθ θ + ϵ ≤ θ′ )µ(ϵ)z Z (z, θ) · Ẑt−1 dϵdΩ∗
ZZ
δ z̄(z, θ) − z ′ z̄z (z, θ)ι(ρθ θ + ϵ ≤ θ′ )µ (ϵ) dϵDθ · Ω̂t−1 ⟨z, θ⟩ dθdz,

+

where we used differentiation by parts to obtain the last term. Substitute (14) into this
expression and differentiate both sides with respect to θ to get
∞ Z
X ZZ 
′ ′ ′ ′ ∗
Dθ ·Ω̂t z , θ =− Λ̄(z , θ , z, θ)zs (z, θ) dΩ X Z ·Ẑt−1+s − Λ̄(z ′ , θ′ , z, θ)z̄z (z, θ)Dθ · Ω̂t−1 ⟨z, θ⟩ dzdθ .
s=0 | {z } | {z }
as (z ′ ,θ′ ) A·Dθ ·Ω̂t−1 (z ′ ,θ′ )

The recursion (18) is helpful because it shows that Dθ · Ω̂t is a simple linear recursive

equation that depends on Dθ · Ω̂t−1 and X̄Z,t−1+s s weighted with known linear operators.

Since Ω̂0 = 0, we can use (18) to solve for Dθ · Ω̂t purely in terms of X̄Z,s s :


t−1 X
X
Dθ · Ω̂t = − At−1−k · as−k X̄Z,s . (19)
k=0 s=0
R
To see how equation (19) is useful, observe that we can write the term x̄dΩ̂t in the expression
(17) using integration by parts as follows
Z Z
x̄dΩ̂t = − x̄z (z, θ)Dθ · Ω̂t ⟨z, θ⟩dzdθ. (20)

R 
Combine it with (19) to express x̄dΩ̂t purely as a linear function of X̄Z,s s
. This allows us
to prove the following result.

Corollary 1. For all t,


Z  ∞
X
xdΩ = Gt,s X̄Z,s , (21)
Z,t s=0

where Gt,s follows a recursion Gt,s = Gt−1,s−1 + B · At−1 · as .

14
Proof. Due to Lemma 2 we have
Z ∞ Z
X  ∞ Z
X 
∗ ∗ ∗
x̄Z,t dΩ = xs dΩ X̄Z,t+s = xs−t dΩ X̄Z,s (22)
s=0 s=0

with the understanding that xl = 0 for all l < 0. Substitute (19) into (20), substitute resulting
expression and (22) into (17) and re-arrange to obtain (21), where
Z t−1 Z
X  

Gt,s = xs−t dΩ + x̄z (z, θ) At−1−k · as−k (z, θ)dzdθ .
k=0 | {z }
B·At−1−k ·as−k

One can directly verify that Gt,s satisfies the desired recursion.
R
This corollary shows that the effect of the perturbation Ẑt on x edΩ can be expressed

purely as a function of yet-unknown variables X̄Z,s s weighted with known coefficients Gt,s .
These coefficients themselves follow a linear recursive structure that makes them fast and easy
to construct numerically using matrix multiplications. Once we substitute equation (21) into
(16), we find the linear system of equations that characterizes the first-order approximation to
our problem.

Proposition 1. X̄Z,t t
is the solution to

X
Rx Gt,s X̄Z,s + RX X̄Z,t + RΘ ρtΘ = 0. (23)
s=0

Equation (23) is a linear system of equations that determines X̄Z,t t . While written as in
infinite system, it can be simplified. Assumption 1 implies that X̄Z,t → 0 as t → ∞ so it can
be transformed into a finite linear system of T × dim X equations and unknowns, and solved
using simple matrix inversions. Lemma 1 then implies that equation (23) fully characterizes
the first-order approximation.

3.2 Second-order approximations

One of the key feature of our approach is that it extends to higher-order approximations while
preserving tractable linear recursive structure of the first-order approximation. In this section
we show how it can be used to obtain the second-order approximations.
To build the intuition, we start with the same uni-dimensional illustrative example that we
used to motivate first-order approximations. The second order analogue of equation (11) is

∂2 e ∗
X (Z + σE0 ; σ) = X ZZ E02 + X σσ + terms proportional to X Zσ . (24)
∂σ 2 σ=0

15
Terms proportional to X Zσ can be ignored since they are zero to the second order, but the
terms proportional to X σσ (to which we refer to as the risk adjustment terms) can not. Term
X ZZ E02 captures the second-order response of policy functions to the E0 shock. The second-
order analogue of equation (12), that characterizes second-order responses to shocks in period
1, is given by

∂2 e  e ∗  
2

X Z (Z + σE 0 ; σ) + σE 1 ; σ = X ZZ Z Z + X Z Z ZZ E02 +X ZZ E12 +X ZZ Z Z E0 E1 +X σσ + X Z Z σσ .
∂σ 2 σ=0 | {z }
risk adjustment
(25)
Thus, in order to find the second-order approximation to the first period response, we need
to know second-order responses to E0 and E1 shocks, their interaction effect E0 E1 as well as
the appropriate risk adjustment. A very useful observation is that, as we continue rolling
this approximation forward, the coefficients that multiply shocks Et Es in the second-order
approximation can be described as follows

X0 E 0 X1 E 1 X2 E 2 X3 E 3
   
...
E0 E0 X ZZ Ẑ0 Ẑ0 + X Z Ẑ0,0 X ZZ Ẑ1 Ẑ1 + X Z Ẑ1,1 X ZZ Ẑ2 Ẑ2 + X Z Ẑ2,2 X ZZ Ẑ3 Ẑ3 + X Z Ẑ3,3 ...
E0 E1 0 X ZZ Ẑ1 Ẑ0 + X Z Ẑ1,0 X ZZ Ẑ2 Ẑ1 + X Z Ẑ2,1 X ZZ Ẑ3 Ẑ2 + X Z Ẑ3,2 ...
E0 E2 0 0 X ZZ Ẑ2 Ẑ0 + X Z Ẑ2,0 X ZZ Ẑ3 Ẑ1 + X Z Ẑ3,1 ...
... ... ... ... ... ...
risk X σσ + X Z Ẑσσ,0 X σσ + X Z Ẑσσ,1 X σσ + X Z Ẑσσ,2 X σσ + X Z Ẑσσ,3 ...
where

Ẑ0,0 := 0 Ẑ1,1 := Z ZZ Ẑ0 Ẑ0 + Z Z Ẑ0,0 Ẑ2,2 := Z ZZ Ẑ1 Ẑ1 + Z Z Ẑ1,1 Ẑ3,3 := Z ZZ Ẑ2 Ẑ2 + Z Z Ẑ2,2 ...
0 Ẑ1,0 := 0 Ẑ2,1 := Z ZZ Ẑ1 Ẑ0 + Z Z Ẑ1,0 Ẑ3,2 := Z ZZ Ẑ2 Ẑ1 + Z Z Ẑ2,1 ...
0 0 Ẑ2,0 := 0 Ẑ3,1 := Z ZZ Ẑ2 Ẑ0 + Z Z Ẑ2,0 ...
.. .. .. ... ...
Ẑσσ,0 = 0 Ẑσσ,1 = Z̄σσ + Z Z Ẑσσ,0 Ẑσσ,2 = Z̄σσ + Z Z Ẑσσ,1 Ẑσσ,3 = Z̄σσ + Z Z Ẑσσ,2 ...
These tables show that just as in the first-order case, the terms in the second-order ap-
proximation have linear recursive structure. To understand the intuition for this structure, it
is useful to start with the following observation. The second-order approximation of policy
functions always consists of two terms: a term that captures the second-order response of
policy function to the first-order changes in the state, and a term that captures the first-order
response of policy functions to the second-order changes in the state. This observation helps to
understand the intuition for the form of expressions that we derived in the tables below. The
n o
first-order changes in states Ẑt were already described in Section 3.1. The second order
t

16
n o
changes in states are given by Ẑt,s , where Ẑt,s is the second-order effect on the current state
t,s
of two shocks, that occurred t and s periods ago. Since states themselves are endogenous and
characterized by policy functions, they have the two-term form Ẑt,s = Z ZZ Ẑt Ẑs + Z Z Ẑt−1,s−1 ,
which is a sum of the second-order responses of Ze to the first-order effects of the state and the
first-order response of Ze to the second-order changes in the state. Coefficients that characterize
second-order responses of Xt E t and the risk adjustment terms follow identical structure.


While the table describing second-order approximation coefficients has more elements than
the first-order table in Section 3.1, it shows that the second-order coefficients have the same
features that allowed us to tractably characterize their first-order analogues. In particular,
 
second-order responses are described by sequences X ZZ,t,s t,s and X σσ,t t , where X ZZ,t,s =
X ZZ Ẑt Ẑs + X Z Ẑt−1,s−1 and X σσ,t = X σσ + X Z Ẑσσ,t . If one knows those sequences, it is not
necessary to know separately each term from which they are constructed, such as X ZZ or
Ẑt−1,s−1 . Moreover, the sequences are described by a tractable recursive linear mathematical
structure.
We now return back to our general economy. As in Section 3.1, all the constructions used
in the uni-dimensional example remains unchanged in any-dimensional settings as long as we
understanding that “derivative” is meant to be “Frechet derivative”. In particular, in our
n o
general settings let Ẑ0,s = Ẑt,0 = Ẑσσ,0 = 0 and construct directions Ẑt,s recursively as
t,s
follows
 
Ẑt,s = Z̄ZZ · Ẑt−1 , Ẑs−1 + Z̄Z · Ẑt−1,s−1 for all t, s > 0,
 T
Ẑσσ,t = 0, Ωσσ + Z Z · Ẑσσ,t−1 for all t > 0.
 
Similarly, terms characterizing the second-order responses X ZZ,t,s t,s and X σσ,t t
are de-
fined as
 
X ZZ,t,s := X̄ZZ · Ẑt , Ẑs + X̄Z · Ẑt,s ,

X σσ,t := X σσ + X Z · Ẑσσ,t .

Next lemma shows that these terms fully characterize second-order approximations.

Lemma 4. To the second order approximation, Xt satisfies


t X
t
!
t 1 X
+ O E3 ,
 
Xt E = ... + X ZZ,t−s,t−m Es Em + X σσ,t
2
s=0 m=0

where ... are the first-order terms.

17
X ZZ,t−s,t−m Es Em is the second-order response that captures the interaction between shocks
that occurred t − s and t − m periods before period t. X σσ,t is the risk-adjustment affect of
period t policy functions. The impulse response is given by

E [Xt |E0 ] − E [Xt |E0 = 0] = ... + X ZZ,t,t E02 + O E 3 ,



(26)

so sequence X ZZ,t,t t
captures the non-linear part of the impulse response. Comparison of
equations (13) and (26) highlight important differences between first- and second- (as well as

higher-) order approximations. To the first-order, if one knows impulse responses X Z,t t one
can construct first-order approximations for any history of stochastic shocks E t . This is no

longer the case in higher-orders. The second-order impluse response X ZZ,t,t t includes only
the non-linear responses to the period 0 shock. To know the full second-order approxima-
tion, one also needs to compute all the interaction terms between shocks in different periods,
 
X ZZ,t,s t,s as well as the risk adjustment terms X σσ,t t .
The first step in the first-order approximation was to drive the expression for effect of

perturbation on individual variables, {xZ,t }t in terms of the aggregates, X Z,t t , which was
done in Lemma 2. We can now state the second-order analogue of that lemma.

Lemma 5. For any t, k



X
x̄ZZ,t,k = xs (z, θ) X̄ZZ,t+s,k+s + yt,k (z, θ), (27)
s=0


X
x̄σσ,t (z, θ) = xs (z, θ)X̄σσ,t+s + z(z, θ), (28)
s=0

where z(z, θ) is known conditional on knowing X̄ZZ,t,k t,s
.

This lemma shows relationship between {x̄ZZ,t,k }t,k and X̄ZZ,t,k t,k , and between {x̄σσ,t }t

and X̄σσ,t t is characterized by the same loading terms {xs }s that were already computed for
the first order terms, plus additional adjustments {yt,k }t,k and z. Terms {yt,k }t,k are known

from the lower-order solution. Term z is know from the lower-order and X̄ZZ,t,k t,k . As we
 
shall see, one can find X̄ZZ,t,k t,k without using any information about X̄σσ,t t . Thus, the

general approach to the second-order approximation is to first solve for X̄ZZ,t,k t,k , then use

(28) and follow the same steps to solve for X̄σσ,t t . The proof of Lemma 5 follow very similar
steps to those of Lemma 2 except we take multiple derivatives and sum them. For example,
 
to obtain (5) we take a second-order derivative of (5) in directions Ẑt , Ẑs and a first-order
derivative in direction Ẑt,s , and then add the two expressions.

18
We follow the same strategy of combining expansions to several different directions for
 
other steps as well. Derivatives of (4) in directions Ẑt , Ẑs and Ẑt,s imply that
Z 
Rx xdΩ + RX X̄ZZ,t,s + RΘ,t,s = 0
ZZ,t,s

where
Z  Z Z Z Z

xdΩ := x̄ZZ,t,k dΩ + x̄Z · Ẑt dΩ̂k + x̄Z · Ẑk dΩ̂t + x̄dΩ̂t,k .
ZZ,t,k

Similarly, derivatives of (4) in direction Ẑσσ,t implies


Z 
Rx xdΩ + RX X̄σσ,t = 0,
σσ,t

where Z  Z Z

xdΩ := x̄σσ,t dΩ + x̄dΩ̂σσ,t .
σσ,t
By comparing these expressions to their first-order analogue (16) and (17), one can see that
n o
the only new elements that appear in the second-order expansions are directions Ω̂t,k and
n o t,k
Ω̂σσ,t that capture second-order effects on the law of motion of the aggregate distribution.
t n o
Crucially, just like their first-order analogue Ω̂t , they have linear recursive representations.
t

Lemma 6. For any t, k, Dθ · Ω̂t,k and Dθ · Ω̂σσ,t satisfy recursions



X
Dθ · Ω̂t,k = A · Dθ Ω̂t−1,k−1 − as X̄ZZ,t−1+s,k−1+s − bt,k + Dz · ct,k ,
s=0

X
Dθ · Ω̂σσ,t = A · Dθ · Ω̂σσ,t−1 − as X̄σσ,t−1+s − aσσ .
s=0

These linear recursive representations are important as they allow us to sidestep the need
to explicitly calculate second-order effects of changes in the aggregate state on individual
policy functions, just like we avoided this problem in the first-order approximations. The
next two results establish analogues of Corollary 1 and Proposition 1 for the second-order
approximations.

Corollary 2. For all t,


Z  ∞
X
xdΩ = Gt,s X̄ZZ,t−k+s,s + Ht,k , (29)
ZZ,t,k s=0

Z  ∞
X
xdΩ = Gt,s X̄σσ,s + Gσσ,t . (30)
σσ,t s=0

19
 
Proposition 2. X̄ZZ,t,k t,k
and X̄σσ,t t
are the solutions to

X
Rx Gt,s X̄ZZ,t−k+s,s + Rx Ht,k + RX X̄ZZ,t,k + RΘ,t,k = 0 (31)
s=0

and

X
Rx Gt,s X̄σσ,s + Rx Gσσ,t + RX X̄σσ,t = 0 (32)
s=0
respectively.

Proposition 2 fully characterizes second order approximation. Both (31) and (32) are linear
systems of equations. Since X̄ZZ,t,k → 0 as either t or k approach ∞, the system (31) can
be solve by appropriate truncation, just like we solved equation (23). The solution method
for equation (32) needs to be adjusted since the risk adjustment term does not disappear as
t → ∞ but instead converges to some time-invariant X̄σσ,∞ . Thus, to solve (32), we use a
terminal condition X̄σσ,t = X̄σσ,T for all t ≥ T use solve a truncated linear system of equations
T
X
Rx Ĝt,s X̄σσ,s + Rx Gσσ,t + RX X̄σσ,t = 0
s=0
 T P∞
for X̄σσ,s s=0
, where Ĝt,s = Gt,s for s < T and Ĝt,T = s=T Gt,s .

3.3 Higher order approximations, multi-dimensional shocks and individual


states

Continuing in the same fashion as in Section 3.2, it is clear how the method can be extended
to third- and higher-order approximations. The key property that we exploited in recursive
construction of the second order approximation is that the second-order Frechet derivative is
a multi-linear map that preserves a lot of analytical tractability. The same property holds for
higher-order Frechet derivatives.
In our presentation, we assumed that both aggregate exogenous state Θt or individual state
zt are uni-dimensional. This was done for simplicity as nothing in the analysis relies on the
uni-dimensional natural of the shock. Suppose, for example that Θt is a vector of shocks that
follows stochastic process 1, where ρΘ is now a matrix, and Et is a vector of disturbances that
are i.i.d. over time but have arbitrary correlation across elements. Our perturbation (10)
would be unchanged, with scalar σ scaling the whole vector of shocks. Elements like X Z,t
that appear in Lemma 1 would now be vector rather than scalars, and X Z,t−s Es would be
interpreted as a dot product, and equation (23), except now it will pin down a sequence of

vectors X Z,t t . Analogous observations apply to higher orders. Similarly, when zt is a vector
the only adjustments in the analysis is needed to reinterpret the derivatives xz as gradients.

20
3.4 Computational Implementation

Sections 3.1-3.3 present our algorithm using the language of peicewise smooth functions and
infinite dimensional linear operators. We implement the algorithm on a computer by approx-
imating functions using a finite set of basis functions. Under this assumption, we show how
all of the linear operators can be represented by sparse matrices allowing for quick efficient
computation using only linear algebra once the steady state is solved.
We’ll proceed by adopting the following notation. We assume that all functions are approx-
imated using the set of N basis functions {ϕi (z, θ)}i . An approximated function is then defined
by a set of coefficients for these basis functions. In order to maintain parsimony in our nota-
tion we will use the superscript ·i to denote these coefficients. For example, the steady-state
decision rules x̄(z, θ) are approximated as follows
X
x̄(z, θ) ≈ ϕi (z, θ)x̄i .
i
We compute these coefficients using the collocation method and and a set of N grid points
{(ẑi , θ̂i )}i . Once the value of a function f (x) is known at these N grid points we determine
coefficients by solving the following linear system using the basis matrix, Φji = ϕi (ẑj , θ̂j ),
 1   
f f (ẑ1 , θ̂1 )
 f 2   f (ẑ2 , θ̂2 ) 
Φ .  =  .
   
 ..   ..
. 
fN f (ẑN , θ̂N )
Generally, we will use the same basis functions and gridpoints as were used to approximate
the steady state policy rules though this is not a requirement. Below we document how to
approximate the objects in Lemma 2-3 and corollary 1 using linear operations.

3.4.1 Lemma 2 Terms

The objects to compute in Lemma 2 are the functions xs (z, θ). We begin with x0 (z, θ) which
is defined by

x0 (z, θ) = − (Fx (z, θ) + Fx′ (z, θ)E [x̄z (·, ·)|z, θ] p)−1 FX (z, θ).

In order to evaluate x0 at an arbitrary gridpoint (ẑj , θ̂j ) we need to evaluate the derivatives
of F at that gridpoint. This can be done simply and efficiently via automatic differentiation
h i
once the steady state policy rules are known. Next, the term E x̄z (·, ·)|ẑj , θ̂j needs to be
computed. Written out fully this is
h i Z
E x̄z (·, ·)|ẑj , θ̂j = x̄z (z̄(ẑj , θ̂j ), ρθ θ̂j + ϵ)µ(ϵ)dϵ.

21
wkµ f (ϵµk ) is approximated using a quadrature method
R P
When the integral f (ϵ)µ(ϵ)dϵ ≈ k
with gridpoints ϵµk and weights wkµ , this expectation can be approximated the our basis functions
as
h i XX  
E x̄z (·, ·)|ẑj , θ̂j ≈ wkµ ϕi,z z̄(ẑj , θ̂j ), ρθ θ̂j + ϵµk x̄i . (33)
i k
Once the basis functions and quadrature nodes are specified and steady state transition func-
tions are known equation (33) is simply an N × N matrix operator on the coefficients x̄i who’s
h i
elements can all be precomputed. We define this matrix as EΦz 6 and therefore E x̄z (·, ·)|ẑj , θ̂j
can be written as (EΦz x̄)j . We conclude that
 −1
x0 (ẑj , θ̂j ) ≈ − Fx (ẑj , θ̂j ) + Fx′ (ẑj , θ̂j ) (EΦz x̄)j p FX (ẑj , θ̂j ).

Once x0 is known at all of the gridpoints, the coefficients xi0 can be computed via the collocation
method above.
With the coefficients, xi0 , known the remaining coefficients xis can be computed inductively
using the formula

xs+1 (z, θ) = − (Fx (z, θ) + Fx′ (z, θ)E [x̄z (·, ·)p|z, θ])−1 Fx′ (z, θ)E [xs (, )|z, θ] .
h i
To evaluate this function at an arbitrary gridpoint (ẑj , θ̂j ) it is necessary to compute E xs (, )|ẑj , θ̂j .
h i
Following the same steps as for E x̄z (·, ·)|ẑj , θ̂j we observe that
h i XX  
E xs (, )|ẑj , θ̂j ≈ wkµ ϕi z̄(ẑj , θ̂j ), ρθ θ̂j + ϵµk xis ≡ (EΦxs )j
i k

for the N × N sparse matrix, EΦ. We therefore have


  −1
xs+1 (ẑj , θ̂j ) ≈ − Fx (ẑj , θ̂j ) + Fx′ (ẑj , θ̂j ) EΦ′ x̄ j p Fx′ (ẑj , θ̂j ) (EΦxs )j

where all the terms in this expression are composed of simple linear operators.

3.4.2 Lemma (3) and corollary 1 Terms

The objects to compute from Lemma (3) are the functions as (z, θ) and the linear operator A,
which are defined by Z
′ ′
Λ̄(z ′ , θ′ , z, θ)zs (z, θ) dΩ∗

as z , θ =

and ZZ
′ ′
A · Dθ · Ω̂ z , θ = Λ̄(z ′ , θ′ , z, θ)z̄z (z, θ)Dθ · Ω̂ ⟨z, θ⟩ dzdθ.
 
6
Formally EΦ′ji = wkµ ϕ′i z̄(ẑj , θ̂j ), ρθ θ̂j + ϵµ
P
k k .

22
The building blocks for both of these objects are the steady state density dΩ∗ and the steady
state transition density. In solving for the Bewely-Aiyagari steady state both of these will have
been approximated globally using a fine grip of I points {(z̄i , θ̄i )}i and the histogram method
of Young (2010). We let ωi∗ represent the fraction of agents in the steady state who are in state
(z̄i , θ̄i ) and let Λ̄ji be the probability, given the steady state policy rules, of transitioning from
state (z̄i , θ̄i ) to state (z̄j , θ̄j ). With these objects in hand we can evaluate as at any gridpoint
(z̄i , θ̄i ) via
 X
Λ̄ji Φ̄ω zs i

as z̄j , θ̄j ≈ (34)
i

where Φ̄ωik = ϕk (z̄i , θ̄i )ωi∗ is a matrix which takes a vector of coefficients and evaluates the
corresponding function on the gridpoints of the distribution weighted by the steady state
density. The operator A is approximated by a I × I matrix constructed from Λ̄ji by weighting
the rows via z̄z (z̄i , θ̄i ),
Aji = Λ̄ji z̄z (z̄i , θ̄i ).

Finally, for corollary 1 we wish to evaluate the operation B · At−1 · as , where B represents
integration weighted by x̄z (z, θ). Equation 34 allows us to represent as as a vector by evaluating
it at all of the gridpoints used to approximate the distribution. At−1 · as is then constructed by
repeatedly applying the matrix Aji to that vector. To construct B · At−1 · as , we then take that

resulting vector and take the dot product with the vector Φ̄z x̄ where Φ̄z,ji = ϕi,z z̄j , θ̄j is the
basis matrix that takes a vector of coefficients and evaluates the derivative of the corresponding
function at all the gridpoints of the distribution.

3.5 Comparison to literature

Our approach builds on the variational techniques such as Schmitt-Grohé and Uribe (2004)
and borrows directly many insights from them. Those techniques were originally developed
to study representative agent models. When applied to heterogeneous agent economies, such
an approach would seek to solve directly for derivatives X Z , X ZZ and various directions
n o
Ẑt ,which quickly becomes impractical as the dimensionality of state Z grows. Our ap-
t
proach, most importantly Corollaries 1 and 2, shows that this problem can be sidestepped in
a broad class of economies with arbitrary heterogeneity.
Our first-order approximation is closely related to the method developed in an important
paper by Auclert, Bardóczy, Rognlie, and Straub (2021), or ABRS for short. While our
formulation and theirs appear to be different, we now explain that they are, in fact, equivalent
to the first order, both mathematically and in terms of computational time needed to implement

23
them.The key difference between the two approaches is that ours easily scales to higher-order
approximations.
ABRS work with sequence-space representation (3) and (4). As we do, they also consider
approximation around point Z ∗ and exploit the insight of Boppart, Krusell, and Mitman (2018)
that the first-order approximation of a stochastic economy can be constructed from impulse
responses to MIT shocks in deterministic economy.7 Similarly to our Lemma 2, ABRS exploit
the fact that individual choices xt are only functions of aggregate time series X = {Xt }t and
so can be expressed as xt (X). Differentiating (4) for deterministic economy with respect to
the time 0 MIT shock gives

X ∂Xs ∂Xt
Rx Jt,s + RX + RΘ ρtΘ = 0, (35)
∂Θ0 ∂Θ0
s=0

∂ ( xi,t di)
R
∂Xt
where Jt,s := ∂Xs and ∂Θ0 is the response of Xt to a period 0 MIT shock. The matrix J
with elements {Jt,s }t,s is the Jacobian of the sequence-space representation. It is a complicated
object to compute but in the central result of their paper ABRS show that Jt,s has a linear
n o
∂Xt
recursive structure and can be expressed as a function of Jt−1,s−1 and ∂Θ . This allows
n o0 t
∂Xt
them to re-write (35) as a linear system of equations that determines ∂Θ 0
.
t
Our approach both builds on their insights but also departs from them in two important
ways, that allows us to extend our method to higher-orders of approximation. The similarly
between the two approaches can be easily seen by comparing their equation (35) to our (23).
∂Xt
Clearly, their response ∂Θ0 corresponds to our X Z,t , and Jt,s to Gt,s , with linear recursive
structure is being exploited in both cases to construct these objects. However, steps leading
to these expressions are quite different for at least two reasons.
The first difference between our approach and that of ABRS is that we use state- rather
than sequence-space representation. It is difficult to see how sequence-space representation
can be extended to higher-orders of approximation. In higher orders, knowing responses to
MIT shocks is insufficient to recover approximations of policy functions as those, for example,
would not include the risk adjustment terms.
The second difference is that ABRS start with discretizing the invariant distribution Ω∗ and
its law of motion (using the histogram method, due to Young (2010)) and then use a mixture
of numerical and analytical techniques to construct approximations of terms that appear in
(35). In contrast, we first derive analytically the exact theoretical representation (23) before
implementing it numerically using discretization. The order at which discretization is used
7
Ultimately, our Lemma 1 is also restatement of the same insight.

24
is important. Since we approximate the exact analytical expressions, our procedure ensures
that as the grid size h of our discretization shrinks to zero the numerical solution converges to
the analytical one. This is not the case if the order is reversed. In the appendix we show if
one starts with the disctretized law of motion for the aggregate distribution, the second-order
approximations do not converge to the theoretically correct ones even as h → 0. While this is
not a problem for ABRS, who focus on the first-order approximations, it serves as a roadblock
of extending that approach to higher orders.
To derive analytically properties of the law of motions, we used linear operator techniques
that were first developed in Bhandari, Evans, Golosov, and Sargent (2021). The problem con-
sidered in that paper was much simpler, as it did not allow for any kinks in policy functions or
non-trivial transition dynamics at σ = 0. The nature of approximation was also quite different
in that paper since σ scaled both aggregate and idiosyncratic shocks. But Bhandari, Evans,
Golosov, and Sargent (2021) also derived simple versions of our Lemmas 2 and 5 and showed
that Frechet derivatives can be fruitfully used to simplify approximations of heterogeneous
agent economies.

4 Extensions

We now discuss how our approach described in Section 3 can be extended to two classes of
problems: models with stochastic volatility and portfolio problems.

4.1 Stochastic volatility

For many applications that require realistic modeling of financial markets or uncertainty about
government policies it is important to allow for variation in volatility of aggregate shocks. The
conventional wisdom (see, e.g., discussion in Fernández-Villaverde, Guerrón-Quintana, Rubio-
Ramı́rez, and Uribe (2011)) is that analysis of stochastic volatility requires at least third-order
approximations, with second-orders terms capturing the mean level of volatility and third-order
terms capturing the effect of deviations from that mean. These third-order expansions can
significantly increase computational time required to approximate or estimate dynamic models
even in RA settings. While it is certainly possible to study effects of stochastic volatility in
HA models by extending our approach to the third order, it is not necessary. The second-order
approximations that we described in Section 3.2 can be modified to directly study problems
with stochastic volatility. These modifications have minimal impact on the computational
speed and allow one to analyze such problems quickly and easily.
To illustrate our main ideas, we start with describing our approximations for a particularly

25
convenient family of heteroskedastic shocks. Suppose that the aggregate shock Θt can be
described by the following stochastic process
p
Θt = ρΘ Θt−1 + Υt−1 Et , (36)

Υt = (1 − ρΥ ) + ρΥ Υt−1 + EΥ,t , (37)

where ρΘ and Et are as in described in Section 2, ρΥ ∈ (0, 1) is the autocorrelation parameter


for volatility shocks, and EΥ,t is a mean zero stochastic processes that is independent across
time and that has support such that Υt is a non-negative random variable. In this specification,
period t − 1 conditional volatility of Et is proportional to Υt , and so we refer to Υt as stochastic
volatility and EΥ,t as volatility shocks. The history of aggregate shocks E t = E t , EΥ t consists


now of both shocks to Θt and volatility shocks. We assume that initial conditions are such
that Υ−1 = 0. The rest of the model is the same as in Section 2.
The state in the recursive representation now consists of a triplet (Υ, Θ, Ω). One way to
approximate this economy is to proceed as in Section 3.3, scale both shocks Et and EΥ,t with
σ, and approximate equilibrium around the deterministic point (1, 0, Ω∗ ). In order to capture
time-varying volatility, this approach would indeed require using third-order approximations.
Instead, we present a much faster and simpler method.
We scale only shock Et with σ so that the perturbed process for Θt takes the form
p
Θt = ρΘ Θt−1 + σ Υt−1 Et .

The stochastic process for Υt in the perturbed economy in unchanged and is given by (37).
Since shocks EΥ,t are not scaled with σ, volatility Υt follows a non-trivial stochastic process
even in the long run when σ = 0. Thus, our approximations are around (Υ, 0, Ω∗ ), where Υ is
a non-trivial random variable.
Observe that when σ = 0, volatility Υ has no effect on any endogenous variable. Therefore,
the invariant distribution Ω∗ is independent of Υ and coincides with the one we considered in
Section 3. Similarly, the derivatives of policy functions X
e (Υ, Θ, Ω; 0) and x
e (z, θ, Υ, Θ, Ω; 0)
with respect to Z = (Θ, Ω) are also independent of Υ and, in fact, coincide with their values
in the baseline economy in Section 3. Similarly, derivatives X σ , xσ is equal to zero for any
Υ, but the second-order derivative, X σσ (Υ) and xσσ (Υ) are non-trivial functions of Υ. This
dependence of these derivatives on Υ allows us to compute effects of stochastic volatility using
second-order approximations.
These observations imply that the only modification that our approach requires is in de-

vising a method that generalizes X σσ,t t and finds its values in stochastic volatility settings.

26
We proceed as follows. Take any history EΥ t , with its implied history of volatilities (Υ , ..., Υ ),
0 t
t
 
and construct sequence X σσ,t EΥ t as follows:
t t
 
X σσ,t EΥ := X σσ (Υt ) + X Z · Ẑσσ,t EΥ ,

where Ẑσσ,−1 = 0 and

t
  T t−1

Ẑσσ,t EΥ = 0, Ωσσ (Υt−1 ) + Z Z · Ẑσσ,t−1 EΥ for all t > 0.

This definition is the same as that of X σσ,t t except that it explicitly recognizes the fact that
t

X σσ and Ωσσ depend on Υ. The definition of x̄σσ,t EΥ is modified analogously. The next
result extends Lemmas 1 and 4 to the settings with stochastic volatility.

Lemma 7. To the second order approximation, Xt satisfies


t t
t X
!
t
X √ 1 X √ t
+O E 3 ,
  
Xt E = X+ X Z,t−s νs−1 Es + X ZZ,t−s,t−m νs−1 νm−1 Es Em + X σσ,t EΥ
2
s=0 s=0 m=0
(38)
 
where sequences X Z,t t , X ZZ,t,k are the same as in Lemmas 1 and 4.
t,k
t
 
Given this lemma, we need to find a method to compute X σσ,t EΥ t,E t
. We follow the
Υ
same steps as in Section 3.2. The analogue of equation (28) becomes

X
t t
  
x̄σσ,t EΥ (z, θ) = xs (z, θ)E X̄σσ,t+s |EΥ + z(z, θ) + Υt zΥ (z, θ). (39)
s=0

Comparison of equations (39) and (28) reveals that stochastic volatility introduces two changes:
(i) the deterministic term X σσ,t+s in equation (28) is replaced with the conditional expectation
t in equation (39), and (ii) equation (39) has an additional term, Υ z (z, θ), where
 
E X̄σσ,t+s |EΥ t Υ

zΥ (z, θ) is an known. Modulo these two modifications, all other steps we described in Section
3.2 remain unchanged. Proceeding as in that section, we obtain

X t
X
t t
  
Rx Gt,s E X̄σσ,s |EΥ + Rx Gσσ,t + Rx GΥ,s Υt−s + RX X̄σσ,t EΥ = 0. (40)
s=0 s=0

This equation is the analogue of equation (32). The key difference between the two equation
is that (40) has an addition term ts=0 Rx GΥ,s Υt−s , which captures the history of stochastic
P

t instead of X
 
volatility realizations in the past, and conditional expectations E X̄σσ,s |EΥ σσ,s .
t
 
In order to apply Lemma 7 we need to use (40) to recover X σσ,t EΥ t,E t
. To this end,
Υ
t . Therefore, taking equation (40) for
we exploit the fact that equation (40) holds for all t, EΥ
t−1 s−1

some history EΥ , EΥ,t and subtracting from it equation (40) for history EΥ , it is possible
t−1 t−1
 
to obtain express X σσ,t EΥ , EΥ,t as a function of X σσ,t−1 EΥ . Proceeding in this fashion
and simplifying, we obtain the following result.

27
t
 
Proposition 3. X̄σσ,t EΥ t
is given by
t
X
t
X σσ,t (EΥ ) = X̄σσ,t + X̄Υ,t−s EΥ,s ,
s=0
 
where X̄σσ,t t
is the solution to equation (32) and X̄Υ,t t
is the solution to

X k
X
Rx Gk,s X̄Υ,s + Rx GΥ,k−s ρsΥ + RX X̄Υ,k = 0. (41)
s=0 s=0
The key insight of Proposition 3 is that the only additional step required to capture the

effect of stochastic volatility is to solve (41) for a deterministic sequence X̄Υ,t t . This step can

be done in parallel with solving (32) for X̄σσ,t t and thus it has trivial marginal effect on the
computational speed required to perform second-order approximations in stochastic volatility
settings.
We conclude this section with a remark about the stochastic process (36) and (37). We
chose this process because it automatically implies that the last term on the right hand size of
equation (39), Υt zΥ (z, θ), is linear in Υt . This linearity then carries over to equation (40) that
ultimately leads to (41). For more general specifications of stochastic volatility processes, this
last term in (39) would be of the form zΥ (z, θ, Υt ), i.e. it would be a known but potentially
a non-linear function of Υt . The easiest way to proceed in that case is to linearize zΥ with
respect to Υt thereby reducing this case to the form given in Proposition 3. This would only
affect the O term in equation (38) leaving all other equations unchanged.

5 Numerical Results

In this section we implement our algorithm using a calibration of Krusell and Smith (1998)
model. We first compare the relative accuracy and speed of the first and second order ap-
proximations using a one time shock with perfect foresight. The second order approximation
generally produces errors which are an order of magnitude smaller than the first order ap-
proximation. We then explore non-linearities of the model through the differences in impulse
responses induced by previous shocks and the welfare costs of business cycle policies.

5.1 Baseline Model and Calibration

Our baseline model extends the Krusell-Smith framework of section 2 to include capital ad-
justment costs and fiscal policy that varies over the business cycle. Investment in capital is
subject to convex adjustment, which are assumed to take the form:
 2
ϕ It
ϕ(It , Kt−1 ) = − δ Kt−1 .
2 Kt−1

28
We assume that capital is held by a continuum of perfectly competitive mutual funds with a
price per unit capital given by Pt . This intermediary mutual funds holds the individual saving
ai,t as deposit, representing the market value. The aggregate saving is invested in capital by
the fund: Z
Pt Kt = ai,t di

The shares held in the mutual funds have a financial return, which is the sum of (i) capital
gains net of depreciation and investment and (i) dividends given by the marginal product of
capital rk = ∂Y /∂K, net of capital adjustment costs:
Pt (1 − δ + It ) + rtk − ϕ(It , Kt−1 )
Rt =
Pt−1
The optimal investment choice of the mutual funds implies that the price per unit capital is
exactly equal to the marginal cost of investment, namely
It
Pt = 1 + ϕ .
Kt−1
Note that Pt also represents the marginal gain of capital inside in the firm, i.e. the Q-ratio as
in the Tobin’s Q theory, c.f. Hayashi (1982).
Finally, we extend the model to include fiscal policies that vary over the business cycle in
form of a time varying labor-tax
τt = τΘ Θt ,

which is returned lump-sum to the households in the form of transfers Tt = τt Wt . Households


with labor productivity θi,t will receive (1 − τt )Wt exp(θi,t ) in after-tax labor income in the
current period.
To calibrate our model, we set the period length to one quarter. The parameter α is set
to 0.3 to target the capital share of income. We set the risk aversion parameter σ = 2 to
be a value commonly used in the literature. For the parameters governing the aggregate and
idiosyncratic labor productivity in (1) and (2), we choose values used by Auclert, Bardóczy,
Rognlie, and Straub (2021). We discretize the idiosyncratic labor productivity process using
the Rouwenhorst method with a grid of Nϵ = 7 grid points. Regarding the discretization of the
policy functions, we use a set of Nz = 60 quadratic spline basis function to approximate the
policy x̄(z) with unequally spaced knot points. The distribution over states ω̄ is approximated
with unequally spaced grid with Iz = 1000 points along the asset dimension. The performance
of our algorithm is not sensitive to variations in these two values Nz and Iz . As a baseline we
set τΘ = 0 but explore variations in business cycle policy in section 5.3.1. The final parameter
ϕ is calibrated to match the 3% standard deviation of un-leveraged quarterly returns to equity.
The calibration parameters are summarized in table 1

29
Table 1: Calibration of the Krusell-Smith economy

Parameter Description Value


α Capital share 0.3
β Discount factor 0.983
σ Risk aversion 2
δ Depreciation rate of capital 1.77%
ϕ Adjustment cost of capital 125.0
ρϵ √ Idiosyncratic mean reversion 0.966
σϵ / 1 − ρ2ϵ Cross-sectional std of log earnings 0.503
ρΘ Aggregate mean reversion 0.80
σΘ Std of Aggregate TFP shocks 0.014
Nϵ Points in Markov chain for ϵ 7
Nz Grid points for the policy rule x̄i (z) 60
Iz Grid points for the distribution ω̄i 1000
T Time horizon (in quarters) for IRF 400

5.2 First Order Approximation

We begin by documenting the first order response of the Krusell and Smith style economy with
capital adjustment costs. For comparison purposes we compute the first order responses using
3 different methodologies: our benchmark approach described in section 3.1, a modification
of our approach which approximates the law of motion of the distribution with a histogram
as described in section 3.5, and the sequence space Jacobian approach of Auclert, Bardóczy,
Rognlie, and Straub. Figure 1 plots the derivative of capital, returns to capital wages and
output over time with respect to a one-time unexpected change to TFP at time 0. Under
our notation these are the variables in X̄Z,t . All three methodologies produce nearly identical
responses. An increase in at time 0 combined with capital adjustment costs instantaneously
cause the price of capital, and hence returns to capital, to increase on impact. Households
respond to this transitory higher than expected level of output by saving for the future causing
capital to rise on impact and generating the familiar hump shaped response as the TFP shock
dissipates. The presence of capital adjustment costs results in a slower decline of the capital
stock, which only returns to its steady state level after 400 quarters.
Next, we compare the accuracy of all three approaches. Measuring the numerical errors
made in the simulation of heterogeneous agents models with aggregate shocks is notoriously
difficult since there are no “reference point” to compare with. As such, we test the accuracy of
our method by studying the response to a one-time, one standard deviation positive shock to
TFP. All approaches give an approximation X̂t to the path of all aggregates in response to this
shock. Given these approximations, these sequences of aggregate variables are the only thing

30
Figure 1: Derivatives of capital, returns, wages and output with respect to the TFP shock for the 3 method-
ologies.

needed to compute the dynamics of the rest of the model in a fully non-linear way: first using
the sequence of prices, the optimal policies x̃t (z) of households are computed using standard
Dynamic programming methods. Second, given these policies, we compute the law of motion
of the distribution ω̃t (z), and, third, aggregating up, we compute the law of motion X̃t . We
can then measure the accuracy by comparing X̂t with the non-linear output X̃t . We do this in
K̂t −K̃t
figure 2 by plotting the % error in the capital stock K̃t
for the three approaches represented
by the blue, red and green lines. As would be anticipated by figure 1 all three approaches have
roughly the same error to first order, with the maximal error being on the order of 0.04% of
the capital stock.
Finally, we turn to the computational speed of our approach which we break down in table
2 by each stage of the algorithm. The timings for the first order approximation are reported
in the first two columns of the table.8 All told, once the steady state has been computed, our
algorithm takes 0.5 seconds to solve for the X Z,t terms with roughly equal time spent in all
4 of the main steps. As Lemma 1 highlights, X Z,t are all that is needed to simulate the path
of aggregates and to compute ergodic moments from the first order approximation. The other
first order terms, x̄Z,t and Ω̄Z,t , are required for the second order approximation and take an
additional 0.7 seconds to compute.
We compare this to our own implementation of the sequence space Jacobian approach of
8
All numbers are reported using a 20 core M1 ultra mac studio.

31
K̂t −K̃t
Figure 2: Nonlinear error K̃t
for the three first order approximations and the second order approxima-
tions.

Table 2: Computational Speed: First and Second Order

First Order Second Order


Step Time Step Time (ZZ) Time(σσ)
Additional First Order Terms 0.70s
Lemma 2 Terms 0.07s Lemma 5 Terms 0.64s 0.05s
Lemma 3 Terms 0.13s Lemma 6 Terms 0.21s 0.45s
Corollary 1 Terms 0.17s Corollary 2 Terms 0.07s 0.05s
Proposition 1 Terms 0.13s Proposition 2 Terms 0.19s 0.28s
Total 0.5s 1.81s 0.83s

ABRS 1.51s

32
Auclert, Bardóczy, Rognlie, and Straub which takes approximately 1.5 seconds to compute the
equivalent on the X Z,t . Of that time, approximately 1.35 seconds are spent on the backward
and forwards iteration steps which are the equivalent of the terms computed in Lemma 2 and
3. As we detailed in section 3.4, once the steady state is solved for our implementation of
the algorithm requires only sparse linear operations which can be done quickly and efficiently
independently of how the steady state is solved for. The methodology of ABRS generally
relies on numerical differentiation of global transition code, and is therefore limited by the
efficiency of that global code. Moreover, very careful attention has to be paid to those numeric
derivatives in order to ensure that they are accurate, see appendix C.1 of Auclert, Bardóczy,
Rognlie, and Straub for details. These numerical issues would be amplified with a second order
approximation as numerical second derivatives are more prone to numerical error. By giving
explicit expressions for these second derivatives in terms of derivatives of F and R we sidestep
these issues.

5.3 Second Order Approximation

Next we turn to the second order approximation of the Krusell and Smith model with capital
adjustment costs. We begin with the derivatives capturing the non-linear responses in a perfect
foresight world, X̄ZZ,t,s . Again, we compare these terms computed using our baseline method-
ology and those computed when the law of motion of the distribution is approximated via a
histogram. Figure 3 plots these terms for the derivative of the response of capital and output.
Unlike the first order approximation, we see that these result in very different derivatives with
those of the histogram becoming very negative relative to those of baseline case. These differ-
ences in the derivatives are reflected in the errors observed in figure 2. Our baseline approach
has errors which remain very small over time, while the histogram approach generates errors
which become the same order of magnitude as those of the first order approximation.
The addition time to compute the second order approximation is broken out in the last 2
columns of table 2. As highlighted in section 3.2 there are two additional types of terms in the
second order approximation: the curvature terms, X̄ZZ,t,k , and risk correction terms X̄σσ,t . As
they follow the same mathematical structure, we break out the computational time separately
for both types. The curvature terms take 1.11 seconds to compute9 while the risk adjustment
terms take 0.83 seconds. The vast majority of the computational time for the curvature terms
9
Here we report only the time required to compute that X̄ZZ,t,t terms. We do this for two reasons. Firstly,
for most ergodic moments only the X̄ZZ,t,t are required. Secondly, computing the addition X̄ZZ,t,t+i terms are
trivially parallelizable for each i so, with enough processors, computing all the X̄ZZ,t,k terms would not require
any additional time.

33
Figure 3: Curvature terms X̄ZZ,t,t for capital (top) and output (bottom) using our baseline methodology and
when the law of motion of the distribution is approximated with a histogram.

34
Figure 4: Non-linearities in Impulse Responses. Solid line represents the HA economy while the dashed line
represents an RA model.

is spent on Lemma 5 and Proposition 2 which is a result of a large number of quadratic forms
required to compute the yt,k and RΘ,t,k terms. All combined, computing the second order
approximation requires an additional 2.63 seconds relative to the first order approximation.

5.3.1 Nonlinearities and Welfare Analysis

We can use our second order approximation to study both non-linearities over the business
cycle. We begin with non-linearities other the business cycle by studying how impulse responses
change as a result of previous shocks. Specifically, we compute

E [Xt+k |Et , Et−j ] − E [Xt+k |Et , Et−j = 0]

which is the change in the impulse response resulting from a shock j periods in the past. Ap-
plying Lemma 4 it is straightforward to see that this term is exactly given by X̄ZZ,k,k+j Et Et−j.
In figure 4 we plot these changes in the impulse responses for various values of j for both

35
capital and output. In the figure we normalize by a 1 s.d. impulse response so these should
be interpreted as the percentage change in the impulse response induced by a shock in the
previous period. As a point of reference, we compare to the non-linear impulses generated by
a RA agent economy calibrated to match the same long run moments.10 Both the RA and the
HA versions of the economy are very close to linear so nearly all the terms are close to zero,
but we do observe some nonlinearities in the impulse responses. A sequence of recent positive
TFP shocks will result in amplified business cycles: the increase (fall) in output will be larger
for a positive (negative) TFP shock. For example, if the economy had received a positive TFP
shock in the previous period the response of capital could be up to 30% large in some period.
Relative to the RA economy, the HA version appears to feature greater persistence arising
from the endogenous movement of the wealth distribution.
In addition to studying non-linearities, second order approximations can be used to evaluate
the welfare effects of policies. To see this, return to Lemma 4 and take the expectation
conditional on time 0 information to find
t
!
1 X
X ZZ,t−s,t−s σE2 + X σσ,t + O E3 ,

E0 [Xt ] = X̄ +
2
s=0

where σE standard deviation of the exogenous shock Et . Taking the limit as t → ∞ gives the
long run ergodic means of the aggregate variables X as

!
1 X
X ZZ,s,s σE2 + O E3

E [X] = X̄ + + X σσ,∞
2
s=0

whereX σσ,∞ = limt→∞ X σσ,t . When X contains a measure of welfare this gives a quick and
efficient algorithm for computing ergodic welfare as a function of a policy parameter since
for given value of the policy parameter it is straightforward to compute the second order
derivatives X̄ZZ,t,t and X̄σσ,t .11 We illustrate this approach by computing ergodic welfare as
a function of the tax parameter τΘ and plotting ergodic welfare (computed as consumption
equivalent relative to steady state) in figure 5. The blue line which plots ergodic welfare
using our methodology, while the green line plots the same line for a representative agent
calibration. Since the taxes are rebated lump sum, welfare in the RA calibration is invariant of
the policy parameter. That the blue line is below the green line indicates that business cycles
are more costly in the HA calibration due to market incompleteness. We see that relative
to a Laissez faire policy, τΘ = 0, making the tax policy more countercyclical initially raises
10
The second order approximation to the RA model was computed using our methodology as
11
In principle ergodic welfare will also depend on the change in the steady state X̄, in our example the steady
state will not depend on the policy parameter.

36
Figure 5: Welfare as a function of τΘ

welfare with a distinct maximum achieved at τΘ = −2 which amounts to raising taxes by


2% for every percentage point decrease in TFP. In fact, roughly 20% of the welfare losses
from business cycles in this model can be ameliorated by this tax policy. Comparing to
the RA line one can conclude that roughly 60% of the welfare losses arising from market
incompleteness can be ameliorated with the countercyclical tax. This analysis also highlights
the importance of our methodology for tracking the distribution. The red line in figure 5 plots
ergodic welfare computed using a second order expansion of an economy where the law of
motion of the distribution is approximated using a histogram. As we noted above, this results
in significantly different second derivatives which inform the welfare criterion. Not only is there
no clear maximum the histogram method finds that business cycles actually increase ergodic
welfare relative to the steady state.

5.4 Stochastic Volatility

We can use the results of section 4.1 to study the response of our benchmark model to tempo-
rary changes in volatility. In line with the literature, we’ll focus on the impulse response to a
transitory shock to the variance of the TFP process, EΥ,t . The impulse response to this shock

37
Figure 6: Impulse response to a shock to volatility.

is then defined by

IRFkΥ (EΥ,t ) = Et Xt+k EΥ,t − E [Xt+k |EΥ,t = 0] .


 

Applying lemma 7 and proposition 3we find that


 
k−1 k
X ZZ,j,j ρk−1−j X̄Υ,j ρk−j
X X
IRFkΥ (EΥ,t ) =  Υ
2
σΘ + Υ
 EΥ,t . (42)
j=0 j=0

Following Fernández-Villaverde and Guerrón-Quintana (2020) we study the impulse response


to a shock that temporarily doubles the amount of risk in the economy and decays at a
rate of ρΥ = 0.75. Figure 6 plots this impulse response using two methodologies. The blue
line uses derivatives computed using our baseline approach, the green line represents the RA
calibration, and the red line is computed assuming the LOM of the distribution is approximated
using a histogram. Starting with the blue line we see that an increase in the risk in the
economy increases the precautionary savings motive. On impact, the increase in expected

38
Figure 7: Average portfolio holdings by wealth.

uncertainty does not change the level of output in the economy which means that this increased
precautionary savings motive results in lower consumption and increases savings and capital.
That increased capital eventually results in a paradoxical boom in output. As the Krusell and
Smith economy is very close to linear the magnitudes are quite modest but of a similar scale to
Fernández-Villaverde and Guerrón-Quintana (2020). Relative to the RA calibration it appears
as if the HA model slightly amplifies the effects of an increase in uncertainty. If one were to
have only computed the IRF using the methodology based on a histogram (red line) one would
incorrectly conclude that an increase in uncertainty would paradoxically result in a prolonged
decrease in capital and output. As can be readily observed by comparing figures 3 and 6 this
prolonged decrease in output is generated by the incorrect curvature terms X ZZ,t,t computed
using this methodology.

5.5 Portfolio Choice

In this section we study endogenous portfolio choice in this Krusell and Smith style model.
We extend the model to allow agents to trade risk free debt, b, which is in zero net supply, in
addition to claims to capital. We impose that households cannot short-sell capital. We compute
the first order response of aggregates and the endogenous portfolio choice of individuals. We
plot the latter in figure 7. The right-hand side of the figure plots the portfolio holdings for
the entire wealth distribution. We observe that nearly all households have the same portfolio:
putting nearly all of the wealth in capital. The exception is those households nearest to the
borrowing constraint who borrow in the risk free asset in order to invest more in capital. This
surprising result comes from the fact that the poorest households are nearly entirely reliant on

39
Figure 8: Comparison of first order derivatives between endogenous and exogenous portfolio choice.

labor income which is safer than a claim to capital. They are therefore more willing to bare
additional risk than their richer counterparts. As the portfolio decisions of the agents align
very closely to those in the benchmark economy with an exogenous portfolio, the first order
responses of aggregates are nearly identical, which is observed in figure 8.
Finally we turn to how government policy can influence the portfolio choice of households.
Figure 9 plots the implied portfolio distribution for two different levels of the government policy
parameter τΘ . The top line of the figure is the portfolio distribution when τΘ = −2, the optimal
level found in the previous section, while the bottom captures the portfolio distribution when
τΘ = 2. We observe that for the richest agents their choices are nearly identical for the two
policy regimes. The poorest agents behave quite differently. When provided a lot of insurance,
τΘ = −2, the poorest agents increase their leverage and invest even more in risky capital. This
choice can be flipped if the government makes taxes pro-cyclical, τΘ = 2. The pro-cyclical tax
rate makes labor income more risky pushing households to the safer asset at which point the
no-borrowing constraint binds for some of those agents.

5.6 Extreme Calibration

To give a perspective of how theses approximations perform in the presence of greater curvature
we study an additional calibration that features a higher level of risk aversion, σ = 5, and larger

40
Figure 9: Portfolio distribution for τΘ = −2 and τΘ = 2.

shocks σΘ = 0.1. We then recalibrate to model to match the same moments.12 We begin by
comparing the relative accuracy of the various approximation techniques. As can be observed
in figure 10, the three first order approaches have nearly identical errors, but relative to the
baseline case they are, unsurprisingly, much larger: the maximal error is 2% of the capital
stock compared to 0.04% in the benchmark case. The second order approximation improves
matters markedly with a maximal error of 0.5% of the capital stock.
We can see how the extreme calibration amplifies the non-linearities of the model by study-
ing the non-linear impulse responses conditional on past shocks. Figure 11 plots the percentage
change in the impulse response
E [Xt+k |Et , Et−j ] − E [Xt+k |Et , Et−j = 0]
E [Xt+k |Et , Et−j = 0]

conditional on a shock in the previous period. A 1 s.d. positive TFP shock in the previous
period can lead the impulse response of the capital stock to be up to 200% larger and the
impulse response of output being up to 50% larger.
The higher risk-aversion and larger shocks also significantly increase the welfare costs of
business cycles. Figure 12 (computed as consumption equivalent relative to steady state)
plots ergodic welfare as a function of the tax parameter τΘ The blue line which plots ergodic
welfare using our methodology, while the red line plots the same line for a representative agent
12
With the exception of the capital adjustment cost parameter which we target to match the same ratio of
the standard deviation of returns to output.

41
Figure 10: Nonlinear error K̂tK̃−K̃t for the three first order approximations and the second order approxima-
t
tions for the extreme calibration.

calibration. The extreme calibration features much large welfare costs of uncertainty with
the representative agent being willing to sacrifice nearly 3.5% of their consumption to remove
all uncertainty. The welfare costs for the heterogeneous agent economy is much larger, and,
comparing the blue lines in figures 5 and 12, we see that government policy is less effective at
mitigating the costs of uncertainty under the extreme calibration.
Finally we turn to stochastic volatility under the extreme calibration. Figure 13 plots this
impulse response using two methodologies. The blue line uses derivatives computed using our
baseline approach, the green line represents the RA calibration, and the red line is computed
assuming the LOM of the distribution is approximated using a histogram. Comparing the
green lines in figures 6 and 13 we see that, qualitatively, the RA economy responds similarly
to an temporary increase in uncertainty under the extreme calibration. The main difference
is that the responses are an order of magnitude larger. Interestingly, the qualitative response
in the heterogeneous agent economy is flipped as in the extreme calibration the increase in
uncertainty results in a decline in the capital stock.

42
Figure 11: Non-linearities in Impulse Responses. Solid line represents the HA economy while the dashed line
represents an RA model.

43
Figure 12: Welfare as a function of τΘ for extreme calibration.

44
Figure 13: Impulse response to a shock to volatility.

45
References

Ahn, S., G. Kaplan, B. Moll, T. Winberry, and C. Wolf (2018): “When Inequal-
ity Matters for Macro and Macro Matters for Inequality,” NBER Macroeconomics Annual,
32(1), 1–75.

Auclert, A., B. Bardóczy, M. Rognlie, and L. Straub (2021): “Using the Sequence-
Space Jacobian to Solve and Estimate Heterogeneous-Agent Models,” Econometrica, 89(5),
2375–2408.

Bhandari, A., D. Evans, M. Golosov, and T. J. Sargent (2021): “Inequality, Business


Cycles, and Monetary-Fiscal Policy,” Econometrica, 89(6), 2559–2599.

Boppart, T., P. Krusell, and K. Mitman (2018): “Exploiting MIT Shocks in


Heterogeneous-Agent Economies: The Impulse Response as a Numerical Derivative,” Jour-
nal of Economic Dynamics and Control, 89, 68–92.

Fernández-Villaverde, J., P. Guerrón-Quintana, J. F. Rubio-Ramı́rez, and


M. Uribe (2011): “Risk Matters: The Real Effects of Volatility Shocks,” American Eco-
nomic Review, 101(6), 2530–61.

Fernández-Villaverde, J., and P. A. Guerrón-Quintana (2020): “Uncertainty Shocks


and Business Cycle Research,” Review of Economic Dynamics, 37, S118–S146.

Hayashi, F. (1982): “Tobin’s marginal Q and average Q: A Neoclassical Interpretation,”


Econometrica: Journal of the Econometric Society, pp. 213–224.

Krusell, P., and A. A. Smith, Jr (1998): “Income and Wealth Heterogeneity in the
Macroeconomy,” Journal of Political Economy, 106(5), 867–896.

Luenberger, D. G. (1997): Optimization by Vector Space Methods. New York, NY, USA:
John Wiley & Sons.

Reiter, M. (2009): “Solving Heterogeneous-Agent Models by Projection and Perturbation,”


Journal of Economic Dynamics and Control, 33(3), 649–665.

Schmitt-Grohé, S., and M. Uribe (2004): “Solving Dynamic General Equilibrium Mod-
els Using a Second-Order Approximation to the Policy Function,” Journal of Economic
Dynamics and Control, 28(4), 755–775.

46
Winberry, T. (2018): “A Method for Solving and Estimating Heterogeneous Agent Macro
Models,” Quantitative Economics, 9(3), 1123–1151.

Young, E. R. (2010): “Solving the Incomplete Markets Model with Aggregate Uncertainty
using the Krusell–Smith Algorithm and Non-Stochastic Simulations,” Journal of Economic
Dynamics and Control, 34(1), 36–41.

A Proofs
A.1 Proof of Lemma 1

The path of aggregates, Xt (E t ; σ), as a history of aggregate shocks, E t , can be constructed from
T
the recursive representation X̃(Z; σ) and Ω̃(Z; σ). Begin by defining Zt (E t , σ) = Θt E t ; σ , Ωt E t ; σ
 

recursively as follows: let


 h  i
Zt E t ; σ = ρΘ Θt−1 E t−1 ; σ + σEt , Ω̃ Zt−1 E t−1 ; σ ; σ ,

(43)

where Z−1 = Z ∗ . The path of aggregates can then be constructed as

Xt (E t ; σ) = X̃ Zt E t ; σ ; σ .
 
(44)

A first order expansion of equations (43) and (44) around the σ = 0 steady state yields the
following recursive relationship

Xt E t = X + X Z Zt E t − Z ∗ + X σ + O E 2
   
(45)
Zt E t = Z ∗ + Z Z Zt−1 E t−1 − Z ∗ + Ẑ0 Et + Z σ + O E 2 .
   
(46)
 
with Ẑ0 being the direction defined in the main text and Z σ = 0, Ωσ . Our first step is to
show that X σ and Z σ are both 0 which we codify in the following lemma

Lemma 8. The first derivatives with respect to σ, X σ , Ωσ , xσ , are all 0.

Proof. Differentiating the F , R, and LOM of Ω mappings with respect to σ yields the following
system of equations13
 
0 = Fx (z, θ)xσ (z, θ) + FX (z, θ)X σ + Fx′ (z, θ)E xz ()z σ (z, θ) + xσ () + xZ · Z σ
Z
0 = Rx xσ dΩ∗ + RX X σ
ZZ
Ωσ ⟨z ′ , θ′ ⟩ = − δ z(z, θ) − z ′ ι(ρθ θ + ϵ ≤ θ′ )z σ (z, θ)µ(ϵ)dϵdΩ∗ .



This system of equations is homogeneous of degree 1 in X σ , Ωσ , xσ and,therefore is solved
by setting all terms to zero.
13
Here we have exploited the knowledge that E [xΘ (z ′ , θ′ )E ′ ] = 0

47
With the knowledge that X σ and Z σ are both zero and Z−1 − Z ∗ = 0 it’s possible to roll
forward equation (46) to find
t
t ∗
X t−s
Z Z · Ẑ0 Es + O E 2
 
Zt E −Z =
s=0
t
X
Ẑt−s Es + O E 2 ,

= (47)
s=0

where Ẑt is defined in the main text. Plugging into equation (45) yields
t
X t
X
Xt E t = X + X Z · Ẑt−s Es + O E 2 = X + X Z,t−s Es + O E 2
  
s=0 s=0

as desired.

A.2 Proof of Lemma 1’

We proceed by taking a second order expansion of (43) and (44) to find14

Xt E t = X + X Z Zt E t − Z ∗
  

1
X ZZ · Zt−1 E t−1 − Z ∗ , Zt−1 E t−1 − Z ∗ + X σσ + O E 3
    
+ (48)
2
Zt E t = Z ∗ + Z Z Zt−1 E t−1 − Z ∗ + Ẑ0 Et
  

1
Z ZZ · Zt−1 E t−1 − Z ∗ , Zt−1 E t−1 − Z ∗ + Z σσ + O E 3 ,
    
+ (49)
2
T
where Z ZZ is defined in the main text and Z σσ = 0, Ωσσ . As Zt−1 E t−1 − Z ∗ satisfies (47)
 

and both X ZZ and Z ZZ are quadratic forms we can conclude that


t X
X t  
t ∗ t ∗
X ZZ · Ẑt−s , Ẑt−m Es Em + O E 3
   
X ZZ · Zt E − Z , Zt E −Z =
s=0 m=0
t X
X t  
Z ZZ · Zt E t − Z ∗ , Zt E t − Z ∗ = Z ZZ · Ẑt−s , Ẑt−m Es Em + O E 3 .
   
s=0 m=0

(2) (2)
E t recursively via Ẑ0 E 0 = 0 and
 
Therefore if we define the direction Ẑt
t−1 X
t−1  
(2) (2)
X
t t−1
 
Ẑt E = Z Z Ẑt−1 E + Z ZZ · Ẑt−1−s , Ẑt−1−m Es Em + Z σσ (50)
s=0 m=0

then we can conclude that


t
t ∗
X 1 (2)
Ẑt−s Es + Ẑt E t + O E 3
  
Zt E −Z =
2
s=0
14
There are also X σZ and Z σZ terms but they are 0 following the same logic as X σ and Z σ being 0 in the
proof of Lemma 1

48
and
t t X
t
!
t
X 1 X  
(2) t
+O E 2 .
  
Xt E −X = X Z,t−s Es + X ZZ · Ẑt−s , Ẑt−m Es Em + X Z · Ẑt E
2
s=0 s=0 m=0
(51)
By construction it is straightforward to use show that (50) implies that
t X
t
(2)
X
E t = Ẑσσ,t +

Ẑt Ẑt−s,t−m Es Em
s=0 m=0

(2)
E t in (51)

where both Ẑt,s and Ẑσσ,t are as defined in the main text. Substituting for Ẑt
and applying the definitions of X σσ,t and X ZZ,t,s immediately yields
t X t
!
1 X
Xt E t = ... + X ZZ,t−s,t−m Es Em + X σσ,t + O E 2
 
2
s=0 m=0

where . . . are the first-order terms, as desired.

A.3 Proof of Lemma 2’ (scratch)

Assumption 1 states that the functions x̃(z, θ, Z; σ) are continuous and piece-wise smooth in
some neighborhood Z ∗ and σ = 0. We’ll start by assuming a single point where the functions
are joined defined by z ∗ (θ, Z; σ). We’ll let x̃p and x̃m denote the functions on either side of
that point thus (
x̃1 (z, θ, Z; σ) z ≥ z ∗ (θ, Z; σ)
x̃(z, θ, Z; σ) = ,
x̃0 (z, θ, Z; σ) z ≤ z ∗ (θ, Z; σ)
or
x̃(z, θ, Z; σ) = x̃0 (z, θ, Z; σ)ι (z ≤ z ∗ (θ, Z; σ)) + x̃1 (z, θ, Z; σ)ι (z ≥ z ∗ (θ, Z; σ)) ,

where z ∗ (θ, Z; σ) is determined by the agent being on the budget constraint while uncon-
strained, i.e.
z̃ 1 (z ∗ (θ, Z; σ), θ, Z; σ) = z

If we differentiate x̃(z, θ, Z; σ) with respect to Z in direction Ẑi we get

xZ (z, θ) · Ẑi = x0Z (z, θ) · Ẑi ι (z ≤ z ∗ (θ)) + x1Z (z, θ) · Ẑi ι (z ≥ z ∗ (θ)) + x1 (z, θ) − x0 (z, θ) z ∗Z (θ) · Ẑi δ(z − z ∗ (θ))


= x◦Z (z, θ) · Ẑi + x1 (z, θ) − x0 (z, θ) z ∗Z (θ) · Ẑi δ(z − z ∗ (θ))




= x◦Z (z, θ) · Ẑi

where x◦Z (z, θ)· Ẑi is the piece-wise smooth component of xZ (z, θ)· Ẑi . Continuity of x̃(z, θ, Z; σ)
implies that x1 (z ∗ (θ), θ) − x0 (z ∗ (θ), θ) = 0 which give the third equality. Therefore, all the


proofs in the main text can proceed as they do.

49
Differentiating twice with respect to Z implies that
     
xZZ (z, θ) · Ẑi , Ẑj =x◦ZZ (z, θ) · Ẑi , Ẑj + x1Z (z, θ) · Ẑj − x0Z (z, θ) · Ẑj z ∗Z (θ) · Ẑi δ(z − z ∗ (θ))
 
+ x1Z (z, θ) · Ẑj − x0Z (z, θ) · Ẑj z ∗Z (θ) · Ẑj δ(z − z ∗ (θ))
  
− x1 (z, θ) − x0 (z, θ) z ∗Z (θ) · Ẑi z ∗Z (θ) · Ẑj δ ′ (z − z ∗ (θ))
 
=x◦ZZ (z, θ) · Ẑi , Ẑj + xδZZ,i,j (θ)δ(z − z ∗ (θ))
   
where x◦ZZ (z, θ) · Ẑi , Ẑj is the piece-wise smooth component of xZZ (z, θ) · Ẑi , Ẑj and
xδZZ,i,j (θ) is given by
  
∗ ∗ ∗ ∗
xδZZ,i,j (θ) = x∆
Z (θ) · Ẑ j z Z (θ) · Ẑ i + x ∆
Z (θ) · Ẑ i z Z (θ) · Ẑ j + x ∆
z (θ) z Z (θ) · Ẑ i z Z (θ) · Ẑ j

with x∆ 1 ∗ 0 ∗ ∆ 1 ∗ 0 ∗

Z (θ) · Ẑi ≡ xZ (z (θ), θ) · Ẑj − xZ (z (θ), θ) · Ẑj and xz (θ) ≡ xz (z (θ), θ) − x (z (θ), θ) .
Finally, we note that z ∗Z (θ) · Ẑi is determined by

z ∗Z (θ) · Ẑi = −z 1z (z ∗ (θ), θ)−1 z 1Z (z ∗ (θ), θ) · Ẑi .

Note that all of the terms of xδZZ (θ) are known to first order. Combining all of these facts
together we have that
 
xZZ,i,j (z, θ) = xZ (z, θ) · Ẑi,j + xZZ (z, θ) · Ẑi , Ẑj = x◦ZZ,i,j (z, θ) + xδZZ,i,j (θ)δ(z − z ∗ (θ))

where x◦ZZ,i,j (z, θ) is the piece-wise smooth component of xZZ,i,j (z, θ).
To determine x◦ZZ,i,j (z, θ) we differentiate the F mapping twice in direction Ẑi and Ẑj at
any point (z, θ) ̸= (z ∗ (θ), θ) and add to it the derivative of F in direction Ẑi,j to get

0 = Fx (z, θ)x◦ZZ,i,j (z, θ)+FX (z, θ)X ZZ,ij +Fx′ (z, θ)E x◦ZZ,i+1,j+1 (, ) + xz (, )px◦ZZ,i,j (z, θ) +Fi,j (z, θ)
 

(52)
where Fi,j (z, θ) contains all the interaction terms known to first order

Fi,j (z, θ) =Fx′ (z, θ)E x◦zz ()z Z,i (z, θ)z Z,j (z, θ) + x◦zZ,j+1 ()z Z,i (z, θ) + x◦zZ,i+1 ()z Z,j (z, θ)
 
 
+ Fxx (z, θ) · (xZ,i (z, θ), xZ,j (z, θ)) + FxX (z, θ) · xZ,i (z, θ), X Z,j + Fxx′ (z, θ) · xZ,i (z, θ), x+

Z,j (z, θ)
 
+ FXx (z, θ) · X Z,i , xZ,j (z, θ) + FXX (z, θ) · X Z,i , X Z,j + FXx′ (z, θ) · X Z,i , x+
 
Z,j (z, θ)
     
+ Fx′ x (z, θ) · xZ,i (z, θ), xZ,j (z, θ) + Fx′ X (z, θ) · xZ,i (z, θ), X Z,j + Fx′ x′ (z, θ) · x+
+ + +
Z,i (z, θ), xZ,j (z, θ)

µ(θ (z(z, θ)) − ρθ θ) δ ∗ ∗
+ Fx′ (z, θ) xzz (θ (z(z, θ)), z(z, θ))z Z,i (z, θ)z Z,j (z, θ) + xδzZ,j+1 (θ (z(z, θ)), z(z, θ))z Z,i (z, θ
z ∗θ (θ)
!
∗ ∗
+ xδzZ,i+1 (θ (z(z, θ)), z(z, θ))z Z,j (z, θ) + xδZZ,i+1,j+1 (θ (z(z, θ)), z(z, θ)) .

50
where x+
Z,i (z, θ) = E [xz (, )xZ,i (z, θ) + xZ,i+1 (, )|z, θ] . It is straightforward to verify that (52) is

solved by

X
x◦ZZ,i,j (z, θ) = ◦
xs (z, θ)X ZZ,i+s,j+s + yi,j (z, θ)
s=0
◦ (z, θ) solves the recursive equation
where yi,j


(z, θ) = (Fx (z, θ) + Fx′ (z, θ)E [xz (, )|z, θ] p)−1 Fi,j (z, θ) + Fx′ (z, θ)E yi+1,j+1
 ◦ 
yi,j (, )|z, θ .

The first equation of Lemma 2’ is therefore satisfied by


yt,k (z, θ) = yt,k (z, θ) + xδZZ,t,k (θ)δ(z − z ∗ (θ))

with all terms known to first order or lower.


Finally, we move on to the second half of Lemma 2’. Assuming knowledge of X ZZ,t,t we
can construct xΘΘ (z, θ) = xZZ,0,0 (z, θ). To find xσσ (z, θ) differentiate the F mapping twice
with respect to σ and add to it the derivative of F in direction Ẑσσ,t

0 = Fx (z, θ)xσσ,t (z, θ) + FX (z, θ)X σσ,t + Fx′ (z, θ)E xΘΘ (, )σE2 + xσσ,t+1 (, ) + xz (, )pxσσ,t (z, θ) .
 

We begin by letting z(z, θ) be the function that solves the following linear functional equation

0 = Fx (z, θ)z(z, θ) + Fx′ (z, θ)E xΘΘ (, )σE2 + z(z, θ)(, ) + xz (, )pz(z, θ) .
 

Subtracting these two equations and defining x̂σσ,t (z, θ) = xσσ,t (z, θ) − z(z, θ) we see that

0 = Fx (z, θ)x̂σσ,t (z, θ) + FX (z, θ)X σσ,t + Fx′ (z, θ)E [x̂σσ,t+1 (, ) + xz (, )px̂σσ,t (z, θ)] .

This is identical to system of equations solved by xZ,t which allows us to conclude that

X
x̂σσ,t (z, θ) = xs (z, θ)X σσ,t+s
s=0

and hence

X
xσσ,t (z, θ) = xs (z, θ)X σσ,t+s + z(z, θ).
s=0

51
A.4 Proof of Lemma 3’

Differentiating the law of motion of Ω twice in direction Ẑt and Ẑk and adding to it the
derivative in direction Ẑt,k gives
Z Z
Dθ · Ω̂t+1,k+1 ⟨z ′ , θ′ ⟩ = − Λ(z ′ , θ′ , z, θ)z ZZ,t,k (z, θ)dΩ∗ + Λz ′ (z ′ , θ′ , z, θ)z Z,t (z, θ)z Z,k (z, θ)dΩ∗
Z Z
+ Λ(z ′ , θ′ , z, θ)z zZ,k (z, θ)Dθ · Ω̂t ⟨z, θ⟩dθdz + Λ(z ′ , θ′ , z, θ)z zZ,t (z, θ)Dθ · Ω̂k ⟨z, θ⟩dθdz
Z Z
− Λz ′ (z ′ , θ′ , z, θ)z Z,k (z, θ)z z (z, θ)Dθ · Ω̂t ⟨z, θ⟩dθdz − Λz ′ (z ′ , θ′ , z, θ)z Z,k (z, θ)z z (z, θ)Dθ
Z
+ Λ(z ′ , θ′ , z, θ)z̄z (z, θ)Dθ · Ω̂t,k ⟨z, θ⟩dθdz

where Λz ′ (z ′ , θ′ , z, θ) = − δ ′ (z(z, θ) − z ′ )δ(ρθ θ + ϵ − θ′ )µ(ϵ)dϵ. Substituting for z ZZ,t,k (z, θ)


R

using Lemma 2’ we have the first equation in Lemma 3’ with


Z Z
bt+1,k+1 ⟨z ′ , θ′ ⟩ = Λ(z ′ , θ′ , z, θ)yt,k (z, θ)dΩ∗ − Λ(z ′ , θ′ , z, θ)z zZ,k (z, θ)Dθ · Ω̂t ⟨z, θ⟩dθdz
Z
− Λ(z ′ , θ′ , z, θ)z zZ,t (z, θ)Dθ · Ω̂k ⟨z, θ⟩dθdz

and
Z Z
′ ′
ct+1,k+1 ⟨z , θ ⟩ = Λ(z , θ , z, θ)z Z,t (z, θ)z Z,k (z, θ)dΩ − Λ(z ′ , θ′ , z, θ)z Z,k (z, θ)z z (z, θ)Dθ · Ω̂t ⟨z, θ⟩dθdz
′ ′ ∗

Z
− Λ(z ′ , θ′ , z, θ)z Z,k (z, θ)z z (z, θ)Dθ · Ω̂t ⟨z, θ⟩dθdz.

Next, differentiating the LOM of Ω twice with respect to σ and adding to it the derivative
of the LOM in direction Ẑσσ,t yields
Z Z
′ ′
Dθ · Ω̂σσ,t+1 ⟨z , θ ⟩ = − Λ(z , θ , z, θ)z σσ,t (z, θ)dΩ + Λ(z ′ , θ′ , z, θ)z̄z (z, θ)Dθ · Ω̂σσ,t ⟨z, θ⟩dθdz.
′ ′ ∗

Substituting for z σσ,t using Lemma 2’ immediately obtains the second equation of the Lemma
with Z
′ ′
aσσ ⟨z , θ ⟩ = Λ(z ′ , θ′ , z, θ)z(z, θ)dΩ∗ .

A.5 Proof of Corollary 1’

We begin with a expression for the operator A · Dz . We note that


Z
A · Dz · Ω̂⟨z , θ ⟩ = Λ(z ′ , θ′ , z, θ)z z (z, θ)Dz · Ω̂⟨z, θ⟩dzdθ
′ ′

Z Z
= − Λz (z , θ , z, θ)z z (z, θ)Ω̂⟨z, θ⟩dzdθ − Λ(z ′ , θ′ , z, θ)z zz (z, θ)Ω̂⟨z, θ⟩dzdθ
′ ′

Z Z
= Λz ′ (z ′ , θ′ , z, θ)z z (z, θ)z z (z, θ)Ω̂⟨z, θ⟩dzdθ − Λ(z ′ , θ′ , z, θ)z zz (z, θ)Ω̂⟨z, θ⟩dzdθ

= Dz · L · Ω̂⟨z ′ , θ′ ⟩ − M · Ω̂⟨z ′ , θ′ ⟩

52
where L·Ω̂⟨z ′ , θ′ ⟩ ≡Λ(z ′ , θ′ , z, θ)z z (z, θ)z z (z, θ)Ω̂⟨z, θ⟩dzdθ and M·Ω̂⟨z ′ , θ′ ⟩ ≡ Λ(z ′ , θ′ , z, θ)z zz (z, θ)Ω̂⟨z, θ⟩dzd
R R
R 
From the definition of xdΩ we have
ZZ,t,k
Z  Z Z Z Z

xdΩ = xZZ,t,k dΩ + xZ,t dΩ̂k + xZ,k dΩ̂t + xdΩ̂t,k .
ZZ,t,k
Z Z Z Z

= xZZ,t,k dΩ − xzZ,t Dθ · Ω̂k dzdθ − xzZ,k Dθ · Ω̂t dzdθ − xz Dθ · Ω̂t,k dzdθ
(53)
We’ll proceed under the assumption that k ≥ t, the case where t > k is symmetric. Our first
claim is that

t−1 X
X
Dθ · Ω̂t,k = At−1−j · as−j X ZZ,s,k−t+s − b̂t,k + Dz · ĉt,k
j=0 s=0

Proof is via induction starting with our knowledge that Dθ · Ω̂0,k−t = 0. Applying the LOM
for Dθ · Ω̂t,k we have
 

t−1 X
X ∞
X
Dθ · Ω̂t+1,k+1 =A ·  At−1−j · as−j X ZZ,s,k−t+s − b̂t,k + Dz · ĉt,k  − as−t X ZZ,s,k−t+s
j=0 s=0 s=0

− bt+1,k+1 + Dz ct+1,k+1

t X
X
= At−j · as−j X ZZ,s,k−t+s − A · b̂t,k − bt+1,k+1 + A · Dz · ĉt,k + Dz · ct+1,k+1
j=0 s=0

t X
X
= At−j · as−j X ZZ,s,k−t+s − A · b̂t,k − bt+1,k+1 − M · ĉt,k + Dz · L · ĉt,k + Dz · ct+1,k+1
j=0 s=0

We conclude that
b̂t+1,k+1 = A · b̂t,k + bt+1,k+1 + M · ĉt,k

and
ĉt+1,k+1 = L · ĉt,k + ct+1,k+1 .

with the initial conditions b̂0,k−t = ĉ0,k−t = 0. Substituting for Dθ · Ω̂t,k and xZZ,t,k in (53)
yields (29) with
Z Z Z Z Z

Ht,k = yt,k dΩ − xzZ,t Dθ · Ω̂k dzdθ − xzZ,k Dθ · Ω̂t dzdθ + xz b̂t,k dzdθ + xzz ĉt,k dzdθ.

For the second half of the corollary we note that we can iterate the LOM for Dθ · Ω̂σσ,t
forward starting from the initial condition Dθ · Ω̂σσ,0 = 0 to get

X ∞
t−1 X t−1
X
Dθ · Ω̂σσ,t = − At−1−k · as−k X σσ,s − At−1−k · aσσ .
k=0 s=0 k=0

53
By construction and applying integration by parts yields
Z  Z Z

xdΩ = xσσ dΩ − xz Dθ · Ω̂σσ,t dzdθ
σσ,t

substituting for Dθ · Ω̂σσ,t and xσσ using Lemma 2’ we have equation (30) with
Z t−1
X
Gσσ,t = zdΩ∗ + B · At−1−k · aσσ
k=0

which conveniently satisfies the recursion Gσσ,t = Gσσ,t−1 + B · At−1 · aσσ .

A.6 Proof of Proposition 2’

To show equation (31) we differentiate the R mapping twice with respect to Ẑt and Ẑk and
add to it the derivative of the R mapping in direction Ẑt,k to get
Z 
Rx xdΩ + RX X ZZ,t,k + RΘ,t,k = 0
ZZ,t,k

where
Z  Z  ! Z  ! Z  !
RΘ,t,k =Rxx · xdΩ , xdΩ + RxX · xdΩ , X Z,k + RxΘ · xdΩ , ρkΘ
Z,t Z,k Z,t Z,t
Z  !
 
+ RxX · X Z,t , X Z,k + RxΘ · X Z,t , ρkΘ

+ RXx · X Z,t , xdΩ
Z,k
Z  !
 
ρtΘ , + RΘX · ρtΘ , X Z,k + RΘΘ · ρtΘ , ρkΘ .

+ RΘx · xdΩ
Z,k
R 
Substituting for xdΩ using (29) yields the desired expression.
ZZ,t,k
To show equation (31) we differentiate the R mapping twice with respect to σ and add to
it the derivative of the R mapping in direction Ẑσσ,t to find
Z 
Rx xdΩ + RX X σσ,t = 0.
σσ,t
R 
The desired expression is then obtained by substituting for xdΩ using (30).
σσ,t

54

You might also like