Outsourcing Memory Through Niche Construction
Edward D. Lee (a), Jessica C. Flack (b), and David C. Krakauer (b)
(a) Complexity Science Hub Vienna, Josefstädter Strasse 39, Vienna, Austria; (b) Santa Fe Institute, 1399 Hyde Park Rd, Santa Fe, NM 87501
arXiv:2209.00476v2 [q-bio.PE] 8 Jan 2023
This manuscript was compiled on January 10, 2023
Adaptation to changing environments is a universal feature of life
and can involve the organism modifying itself in response to the environment as well as actively modifying the environment to control
selection pressures. The latter case couples the organism to the environment. Then, how quickly should the organism change in response to
the environment? We formulate this question in terms of how memory duration scales with environmental rate of change when there
are trade-offs in remembering vs. forgetting. We derive a universal
scaling law for optimal memory duration, taking into account memory precision as well as two components of environmental volatility,
bias and stability. We find sublinear scaling with any amount of environmental volatility. We use a memory complexity measure to explore the strategic conditions (game dynamics) favoring actively reducing environmental volatility—outsourcing memory through niche
construction—over investing in neural tissue. We predict stabilizing
niche construction will evolve when neural tissue is costly, the environment is variable, and it is beneficial to be able to encode a rich
repertoire of environmental states.
adaptation | learning | stigmergy | niche construction | scaling
What is the optimal timescale of adaptation: how long
should memory of the environment persist when the
environment is changing? And when should the organism
invest in changing the rate of environmental change? Research in a wide range of fields suggests that bidirectional
organism-environment feedback through niche construction
and symbiosis is common and plays a significant role in shaping evolutionary dynamics. Slowly evolving genes co-evolve
with quickly evolving culture (1), as illustrated by the evolution of dairy-farming facilitating selection of alleles for adult
lactase persistence (2). Quickly evolving organisms modify
their otherwise slowly changing niches and alter selection pressures (3–5), illustrated by yeast modifying fruit environments
to attract Drosophilid flies that enhance yeast propagation
(6). Institutions feed back to influence individual decisions by
changing cost landscapes and enhancing cultural transmission
(7, 8) (e.g. legislation in support of same-sex marriage that increases the willingness to voice support in the face of risk (9)).
To gain information about noisy, hidden variables and reduce
social uncertainty, error-prone individual pigtailed macaques
collectively compute a social power structure that reduces
uncertainty about the cost of social interaction, making accessible new forms of conflict management (reviewed in references
10, 11). Bacteria quorum sense, controlling group behavior
in dynamically complex, changing environments (reviewed in
reference 12). Individuals, institutions, and firms all adapt
to audit targets (Goodhart’s Law), creating new feedbacks as
they attempt to game the system (13–16). To undermine competitors, agents can destabilize a system, as in the
recent Reddit-Gamestop event in which powerful hedge funds
are thought to have introduced volatility to markets by manipulating Reddit users to short squeeze yet other hedge funds
(17). Motivated by these examples, we develop a synthetic
framework that combines information theory, game dynamics, and scaling theory, in order to determine how adaptation
scales in a range of plausible strategic settings including niche
construction.
We start by reformulating adaptation as rate of discounting
of the past, building a conceptual and mathematical bridge to
work on memory (18–21). We take into account four factors:
bias as preference in the environment for a particular state,
stability as the rate at which environment fluctuates (22, 23),
precision as the capacity agents have to resolve environmental
signal, and feedback as the rate of agent modification of the
environment. In Table 1, we provide examples of studies addressing the interaction of bias, stability, and precision. We
also drop the separation of timescales assumption commonly
made in modeling papers and explicitly consider feedback. We
allow modification of the environment to be either passive
or active, such that active modification can be destabilizing
(increasing entropy) as well as stabilizing (reducing entropy).
The Reddit-Gamestop event is one example of this “destabilizing” niche construction. Another is guerrilla warfare in which
a weaker party randomly determines which battles to neglect
by allocating zero resources (24). In contrast, active agents
can stabilize the environment by buffering against variation
(5) or slowing its rate of change to reduce uncertainty about
the future (10, 25). A relatively simple example is stigmergy
in which trails or routes are consolidated through repeated
use (26). More complicated examples include the collective
computation of slowly changing power structures in macaque
groups (27) and foraging subgroup size distributions by spider
monkeys (28) in which social structures are computed through
communication and decision networks. Finally, we take into
account how the precision (29) of an agent’s or organism’s
estimates of environmental state influences its ability to fit
the environment at a given degree of volatility.
In “Result 1,” we explore the conditions under which long
memory is beneficial. In “Result 2,” we derive the scaling
relationship for optimal memory duration and environmental
change. In “Result 3,” we derive by way of a back-of-the-envelope
calculation the costs of memory using the literature
on metabolic scaling. In “Result 4,” we introduce game dynamics
with a complexity cost of memory to explore the
evolution of active modification and outsourcing of memory
to the environment.
Significance Statement
All organisms must adapt to changing environments, but adaptation can modify the environment itself. We solve a version of
this problem in terms of how long organisms remember. Shorter
memory should be better for variable environments and longer
for slowly changing ones, but environmental variability depends
on feedback. Surprisingly, we find the same mathematical law
in both cases, revealing how much shorter memory should
be relative to the environmental timescale. We consider how
this depends on memory complexity and metabolic costs in
populations, allowing us to predict a general set of conditions
for when organisms will outsource memory to the environment:
when maintaining a brain is costly, the environment fluctuates
quickly, and organisms inhabit a complex environment.
All authors helped develop the initial idea. E.D.L. did the analysis, modeling, and wrote the code.
The authors drafted the manuscript jointly.
The authors declare no competing interests.
2 To whom correspondence should be addressed. E-mail: [email protected]
January 10, 2023 | vol. XXX | no. XX | 1–16
I Bias-Stability: Taxis of larval invertebrates (30); Stochastic voting models (34); Learning changing distributions (38); Loss/Change aversion (44)
II Stability-Precision: Seed dormancy/germ banking (31); Particle swarms (35); Cognitive aging (39)
III Bias-Precision: Bandit problems (32); Microbial chemotaxis (36); Speed-accuracy trade-offs (40–42); Optimal foraging (45); Page Rank consensus (46)
IV Integrated: Volatile bandits (33); Learning changing data sources (37); Consensus with link failure (43); Retinal sensitivity rescaling (19)
Table 1. Classification of studies in terms of bias, stability, and precision. We group studies according to the pairs of factors they combine: (I)
bias-stability, (II) stability-precision, and (III) bias-precision. Studies that implicitly combine all of these factors are noted under “integrated”
(IV). Studies in category I focus on rules that apply in variable environments (bias) where environmental distributions are prone to rapid
change (stability). Studies in category II focus on rules that apply when environments are likely to change (stability) and where making the
correct decision depends on sensitivity to signals (precision or also “accuracy” in some literature). Studies in category III focus on rules
that apply in variable environments (bias) and where making the correct decision depends on power of sensors to detect signal (precision).
Studies in category IV include elements of I–III and apply to variable environments prone to rapid change, where sensory precision varies.
Feedback with the environment, where agent inference modifies the input statistics and timescales are formally coupled, remains little
considered despite being a central premise of research on niche construction and stigmergy.
Model structure & assumptions
We summarize the structure of our model in Figure 1, which
combines the essential components of adaptive agents. As a
result, it connects passive agents that learn the statistics of a
fluctuating environment with those that modify the environment itself. We summarize notation in Appendix Table S2.
The environment E at time t is described by a probability
distribution pE (s, t) over configurations s, a vector of descriptive properties. The environment has a bias for preferred states
that changes on a relatively slow timescale. Here, we represent
the state of the environment with a single bit s ∈ {−1, 1},
analogous to the location of a resource as a choice between
left and right (41, 47–49). In one configuration, the distribution of resources pE is biased to the left at a given time
t, or pE (s = −1, t) > pE (s = 1, t), such that an agent with
matching preference would do better on average than an agent
with misaligned preference. In the mirrored configuration, the
environment shows a bias of equal magnitude to the right
pE (s = −1, t) < pE (s = 1, t). Such probabilistic bias can be
represented as an evolving “field” hE (t),
p_E(s, t) = \frac{1}{2} + \frac{s}{2} \tanh h_E(t),    [1]
such that a reversal in bias corresponds to a flip of sign hE(t) →
−hE(t) that naturally embodies a symmetry between left and
right. At every time point, the environment has a clearly defined
bias in one direction or another, determined by setting the
external field to either hE (t) = −h0 or hE (t) = h0 . With probability 1/τE per unit time, the bias in the environment reverses
such that over time τE the environment remains correlated
with its past. When τE is large, we have long correlation times
and a slow environment or a “slow variable.” This formulation
yields a stochastic environment whose uncertainty depends on
both fluctuation rate, such that low rate implies high stability,
and the strength of bias for a particular state, such that a
strong bias yields a clear environmental signal.
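The environment described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' accompanying code: the function names `p_env` and `simulate_field`, the fixed random seed, and the run length are assumptions made here, while h0 = 0.2 and τE = 5 are borrowed from the figure captions.

```python
import math
import random

def p_env(s, h_env):
    """Eq 1: probability of configuration s in {-1, +1} under field h_env."""
    return 0.5 + 0.5 * s * math.tanh(h_env)

def simulate_field(h0, tau_env, n_steps, rng):
    """Field trajectory whose sign flips with probability 1/tau_env per step."""
    h = h0
    trajectory = []
    for _ in range(n_steps):
        trajectory.append(h)
        if rng.random() < 1.0 / tau_env:
            h = -h
    return trajectory

rng = random.Random(0)  # fixed seed, chosen only for reproducibility
fields = simulate_field(h0=0.2, tau_env=5, n_steps=20_000, rng=rng)
flips = sum(a != b for a, b in zip(fields, fields[1:]))

print("p_E(s=+1) when h_E = +h0:", round(p_env(+1, 0.2), 4))
print("empirical flip rate per step:", flips / (len(fields) - 1))
```

The empirical flip rate should sit close to 1/τE, and raising h0 pushes p_E(s = +1) further from 1/2, which is the sense in which strong bias gives a clearer signal.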
Passive agents sample from the environment and choose
a binary action. In principle, the precision of the choice is
dependent on the number of sensory cells contributing to
the estimate of environmental state, the sensitivity of those
cells, and the number of samples each cell collects while the
contribution of each factor to the estimate can differ. In our
model, all the alternatives are captured by τc . When τc is high
(either because the sensory cells sampled from the environment
for a long time, many sensory cells contributed estimates,
or each sensory cell is very sensitive) agents obtain exact
measurements of the environment. A small τc corresponds to
noisy estimates. The resulting estimate of environmental state
p̂ thus incurs an error ετc,
p̂(s, t) = pE(s, t) + ετc(t).    [2]
From this noisy signal, sensory cells obtain an estimate of
bias ĥ(t), which is related to environmental bias hE (t) plus
measurement noise ητc (t),
ĥ(t) = hE (t) + ητc (t).
[3]
In the limit of large precision τc and given that the noise in the
estimated probabilities ετc(t) from Eq 2 is binomially distributed,
the corresponding error in field ητc (t) converges to a Gaussian
distribution (see Materials and Methods). Then, at each time
step the agent’s measurement of the environment includes
finite-sample noise which is inversely related to precision.
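A rough sketch of the sensing step, under stated assumptions: τc is treated as the number of independent binary samples per time step, the field estimate is obtained by inverting Eq 1, and the estimated probability is clipped away from 0 and 1 so the inverse stays finite. The clipping rule and the names `measure_field` and `rms_error` are choices made for this illustration, not details taken from the paper.

```python
import math
import random

def measure_field(h_env, tau_c, rng):
    """Estimate the field from tau_c binary samples of the environment."""
    p_plus = 0.5 + 0.5 * math.tanh(h_env)            # Eq 1 with s = +1
    hits = sum(rng.random() < p_plus for _ in range(tau_c))
    # Assumption: clip p_hat so that atanh below never diverges.
    p_hat = min(max(hits / tau_c, 0.5 / tau_c), 1 - 0.5 / tau_c)
    return math.atanh(2 * p_hat - 1)                 # invert Eq 1 for h_hat

def rms_error(h_env, tau_c, n_trials, rng):
    """Root-mean-square error of the field estimate over repeated trials."""
    sq = [(measure_field(h_env, tau_c, rng) - h_env) ** 2 for _ in range(n_trials)]
    return math.sqrt(sum(sq) / n_trials)

rng = random.Random(1)  # fixed seed, chosen only for reproducibility
coarse = rms_error(0.2, tau_c=16, n_trials=1000, rng=rng)
fine = rms_error(0.2, tau_c=1024, n_trials=1000, rng=rng)
print(f"rms error with tau_c=16: {coarse:.3f}; with tau_c=1024: {fine:.3f}")
```

The error shrinks as τc grows, which is the inverse relation between finite-sample noise and precision described in the text.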
An aggregation algorithm determines how much to prioritize the current measurement over historical ones. This gives
the duration of memory by recording the agent’s estimate of
the state of the environment at the current moment in time
h(t) and feeding it to sensory cells at time t + 1 with some
linear weighting 0 ≤ β ≤ 1 (50),
h(t + 1) = (1 − β)ĥ(t + 1) + βh(t).
[4]
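The update in Eq 4 is an exponentially weighted moving average, and a short numerical check makes the link to the memory lifetime τm ≡ −1/log β (Eq 5) explicit. The sketch below assumes the natural logarithm, and the helper names are introduced here for illustration only.

```python
import math

def update_memory(h_prev, h_hat_new, beta):
    """Eq 4: blend the newest measurement with the stored estimate."""
    return (1 - beta) * h_hat_new + beta * h_prev

def memory_lifetime(beta):
    """Decay lifetime tau_m = -1 / log(beta), valid for 0 < beta < 1."""
    return -1.0 / math.log(beta)

beta = 0.9
h = 1.0            # stored estimate before the input goes silent
history = []
for _ in range(30):
    h = update_memory(h, 0.0, beta)   # feed zeros so only memory remains
    history.append(h)

tau_m = memory_lifetime(beta)
k = 10
# After k silent steps the stored value equals beta**k = exp(-k / tau_m).
print(history[k - 1], math.exp(-k / tau_m))
```

Setting β = 0 returns the newest measurement unchanged and β = 1 returns the stored value unchanged, matching the two limits described in the text.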
This estimate is stored in an “aggregator” At, and we define
h(0) = 0. The weight β determines how quickly the previous
state of the system is forgotten such that when β = 0 the
agent is constantly learning the new input and has no memory
and when β = 1 the agent ceases to learn, preserving its initial
state. In between, agent memory decays exponentially with
lifetime

\tau_m \equiv -1/\log\beta.    [5]

We think of the weight β that the aggregation algorithm
places on the current estimate relative to the stored value as
the timescale of adaptation τm, or agent memory duration.
The output of this computation is the agent’s behavior,
p(s, t). We measure the effectiveness of adaptation, or fit to the
environment, with the divergence between a probability vector
describing an agent and that of the environment. Measures
of divergence, like Kullback-Leibler (KL) divergence, and,
more generally, mutual information, have been shown to be
natural measures of goodness of fit in evolutionary and learning
dynamics from reinforcement learning through to Bayesian
inference (51, 52).
Here we extend the model to include feedback by allowing
agents to alter environmental stability, which is operationalized
as the probability of switching. We add to the switching rate
1/τE, the active construction rate,

\frac{1}{\tau_f(t)} \equiv \frac{v^2/\tau_E}{[h(t) - h_E(t)]^2 + v^2},    [6]

such that the probability q that the environment changes at
the next point in time is

q[h_E(t+1) \neq h_E(t)] = 1/\tau_E + \alpha/\tau_f(t).    [7]

Eq 6 is written so that it remains normalized for arbitrary v
and that the rate 1/τf peaks at 1/τE as the squared distance between
agent bias and environmental bias [h(t) − hE(t)]² goes to zero.
The probability q of the environment switching to the opposite
configuration includes weight α ∈ (0, 1] to tune the strength of
destabilizers, or α ∈ [−1, 0) for stabilizers. This means that for
positive α, the rate of switching increases as the agent matches
the environment more closely and the opposite for negative
α, whereas the parameter v controls how closely the agent
must match the environment to have an effect (i.e. the width
of the peak as plotted in Figure 1C). The two types of active
agents capture two ways adaptive behavior can feed forward to
influence the timescale of environmental change.∗ We note that
when 1/τf = 0, we obtain passive agents that do not modify
their environment, thus connecting passive and active agents
to one another along a continuum.
Putting these elements of adaptation together, as shown
in Figure 1A, we obtain a toy learning agent that infers the
statistics of a time-varying and stochastic environment.
Fig. 1. (A) Overview of framework. Environment E switches configuration on
timescale τE. The agent measures the current environment through sensory cells
with precision τc, here worth 4 bits. To obtain an estimate of environment statistics at
time t, the agent At combines present sensory estimates with memory of previous
estimates recorded in an aggregator At−1 (Eq 4) such that memory decays over time
τm (Eq 5). Coupling with the environment speeds up or slows down environmental
change on timescale τf (Eq 6). (B) Example trajectories of agents adapting to environmental state hE(t) with short, medium, and long memory. (C) Rate of environment
switching per time step as a function of agent bias h relative to environmental bias
hE = 0.2. For passive agents, switching rate does not depend on agent bias. For
destabilizers α = 0.95, for stabilizers α = −0.95. For both, v = 0.1 from Eq 6
and environmental timescale τE = 5.
Result 1: Long memory and adaptation favored when
sensory cells are imprecise & environments are slow
The timescale of adaptation represents a balance between the
trade-offs of preserving an internal state for too long or losing
it too fast. We explore this trade-off by calculating an agent’s
fit to a changing environment. The fit can be quantified with
the KL divergence between environment pE(s, t) with bias
hE(t) and agent p(s, t),

D_{\mathrm{KL}}[p_E \| p](t) = \sum_{s \in \{-1,1\}} p_E(s, t) \log_2 \frac{p_E(s, t)}{p(s, t)}.    [8]

When the KL divergence is DKL = 0, the agents use optimal
bet-hedging, known as “proportional betting,” which is important for population growth dynamics (53, 54). Eq 8 is also
minimized for Bayesian learners under optimal encoding (55).
Assuming agents are playing a set of games in which they
must guess the state of the environment at each time step,
Eq 8 is the information penalty paid by imperfect compared
to perfect agents.
After averaging over many environmental bias switches, we
obtain the agent’s typical divergence,
\bar{D} \equiv \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} D_{\mathrm{KL}}[p_E \| p](t).    [9]
The bar notation signals an average over time. Thus, fit
improves as D̄ decreases.
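To make Eqs 6 through 9 concrete, the following sketch simulates one agent and returns its time-averaged divergence. It is a simplified stand-in for the accompanying code, under assumptions that should be read as such: measurement noise is drawn directly as Gaussian field noise with width cosh(hE)/√τc (a delta-method approximation introduced here), the agent starts unbiased at h = 0, and all function names and default parameter values are illustrative.

```python
import math
import random

def prob(s, h):
    """Eq 1 for any field h: probability of configuration s in {-1, +1}."""
    return 0.5 + 0.5 * s * math.tanh(h)

def kl_bits(h_env, h_agent):
    """Eq 8: KL divergence in bits from environment to agent behavior."""
    return sum(
        prob(s, h_env) * math.log2(prob(s, h_env) / prob(s, h_agent))
        for s in (-1, 1)
    )

def mean_divergence(beta, tau_env, alpha=0.0, v=0.1, h0=0.2,
                    tau_c=100, n_steps=50_000, seed=0):
    """Estimate Eq 9 from one long run; alpha = 0 gives a passive agent."""
    rng = random.Random(seed)
    h_env, h, total = h0, 0.0, 0.0
    for _ in range(n_steps):
        # Assumption: Gaussian field noise with a delta-method width.
        sigma = math.cosh(h_env) / math.sqrt(tau_c)
        h_hat = h_env + rng.gauss(0.0, sigma)                       # Eq 3
        h = (1 - beta) * h_hat + beta * h                           # Eq 4
        total += kl_bits(h_env, h)                                  # Eq 8
        rate_f = (v ** 2 / tau_env) / ((h - h_env) ** 2 + v ** 2)   # Eq 6
        q = min(max(1 / tau_env + alpha * rate_f, 0.0), 1.0)        # Eq 7
        if rng.random() < q:
            h_env = -h_env
    return total / n_steps

for beta in (0.0, 0.8, 0.999):
    d = mean_divergence(beta, tau_env=100)
    print(f"passive, beta={beta}: {d:.4f} bits")
for name, alpha in (("stabilizer", -0.95), ("destabilizer", 0.95)):
    d = mean_divergence(0.8, tau_env=100, alpha=alpha)
    print(f"{name}, beta=0.8: {d:.4f} bits")
```

With these illustrative settings, an intermediate β should give a lower average divergence than either β = 0 or β close to 1, and the stabilizer should fit better than the destabilizer at the same β because the environment it faces switches less often.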
∗ Note that in this binary example, the new environmental configuration when switching is unique,
enforcing a deterministic switch, but in general there may be a large number of K ≫ 1 options
such that the agent cannot easily guess at the results of environmental fluctuations.
[Figure 2 plot labels: agent memory τm, env. timescale τE, divergence D̄, optimal memory τm*, optimal divergence D̄*; curves for τE = 1, 7, 46, 316, 2154, 14678, 100000; agent types passive (P), destabilizer (S′), stabilizer (S).]
Fig. 2. Divergence D̄ as a function of agent memory τm and environmental timescale τE for (A) passive and (B) active agents including destabilizers S′ and stabilizers S. For
longer τE, agents with longer memory always do better, a pattern emphasized for stabilizers and diminished for destabilizers. (C) Scaling of optimal memory duration τm* with
environmental timescale τE, corresponding to minima from panels A and B. (D) Divergence at optimal memory duration D̄* ≡ D̄(τm*). Environmental bias h0 = 0.2.
In Figures 2A and B, we show divergence D̄(τm , τE ) as
a function of the agent’s memory τm given environmental
timescale τE . In the limiting cases in which an agent has
either no memory and is constantly adapting or has infinite
memory and adaptation is absent, the timescale on which environmental bias changes ultimately has no effect—we observe
convergence across all degrees of bias and stability. When an
agent has no memory, or τm = 0, an agent’s ability to match
the environment is solely determined by its sensory cells. Low
precision τc leads to large errors on measured environmental
bias hE (t) and large divergence D̄(τm = 0). On the other
hand, high precision τc increases performance and depresses
the intercept (Eq 23). At the right hand side of Figure 2A, for
large τm ≫ 1, behavior does not budge from its initial state.
Assuming that we start with an unbiased agent such that the
transition probability is centered as q(h) = δ(h), the Dirac
delta function, the agent’s field is forever fixed at h = 0. Then,
divergence D̄(τm = ∞) reduces to a fixed value that only
depends on environmental bias (Eq 24). In between the two
limits of zero and infinite agent memory, the model produces a
minimum divergence D̄(τm = τm*). This indicates the optimal
duration of memory τm* for a given degree of environmental
bias and stability.
The benefits of memory are more substantial for agents
with imprecise sensory cells. This benefit is the difference
D̄(τm = 0) − D̄(τm = τm*) as shown in Figure 3A. As one
might expect, integrating over longer periods of time provides
more of a benefit when the present estimate p̂ is noisy, τc⁻¹ is
large, and sensory cells are not particularly precise, a deficiency
in precision that memory counters by allowing organisms to
accumulate information over time. This intuition, however,
only applies in the limit of large environmental bias h0 where
the contours of optimal memory flatten and become orthogonal
to precision τc⁻¹. When the bias in the environment is weak,
the curved contours show that the benefits of memory come
to depend strongly on nontrivial interaction of precision and
environmental bias. The complementary plot is the benefit
from forgetting, D̄(τm = ∞) − D̄(τm = τm*) in Figure 3B,
which is largely determined by bias h0 . When bias is strong,
the costs of estimating the environment inaccurately are large,
and it becomes important to forget if sensory cells are imprecise. Thus, our model encapsulates the trade-off between
remembering and forgetting both in terms of their absolute
benefits as well as the emergence of simple dependence of the
respective benefits in the limits of high environmental bias
and high sensory precision. An agent has optimally tuned its
timescale of adaptation τm = τm* when it has balanced the
implicit costs of tuning to fluctuations against the benefits of
fitting bias correctly.
Result 2: Adaptation and environmental change scale
sublinearly
For sufficiently slow environments, or sufficiently large τE , we
find that optimal memory duration τm* scales with the environmental timescale τE sublinearly as in Figure 2C. To derive the
scaling between optimal memory and environmental timescale,
we consider the limit when agent memory persistence is small
relative to the environmental persistence τm ≪ τE . Under this
condition, optimal memory represents a trade-off between a
poor fit lasting time τm and a good fit for time τE − τm . During the poor fit, the agent pays a typical cost at every single
time step such that the cost grows linearly with its duration,
Cτm , for constant C. When the environment is stable, agent
precision is enhanced by a factor of τm because it effectively
averages over many random samples, or a gain of G log τm for
constant G. When we weight each term by the fraction of
time spent in either transient or stable phases, τm /τE and
(τE − τm )/τE respectively, we obtain the trade-off
C \frac{\tau_m^2}{\tau_E} - G \frac{\tau_E - \tau_m}{\tau_E} \log \tau_m.    [10]
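The optimization of Eq 10 can be written out in a few lines. What follows is a sketch that treats C and G as positive constants and assumes 1 ≪ τm ≪ τE, so that the logarithmic and order-one terms are subdominant; the explicit prefactor follows from these assumptions and is not a result stated in the text.

```latex
\begin{align*}
L(\tau_m) &= C\,\frac{\tau_m^2}{\tau_E} - G\,\frac{\tau_E-\tau_m}{\tau_E}\,\log\tau_m, \\
\frac{dL}{d\tau_m} &= \frac{2C\,\tau_m}{\tau_E} - \frac{G}{\tau_m}
  + \frac{G}{\tau_E}\,\bigl(1+\log\tau_m\bigr) = 0, \\
\frac{2C\,\tau_m^*}{\tau_E} &\approx \frac{G}{\tau_m^*}
\quad\Longrightarrow\quad
\tau_m^* \approx \sqrt{\frac{G\,\tau_E}{2C}} \;\sim\; \tau_E^{1/2}.
\end{align*}
```

The dropped term is smaller than G/τm by a factor of order τm (1 + log τm)/τE, which vanishes for large τE once τm grows only as the square root of τE, so the approximation is self-consistent.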
At optimal memory τm*, Eq 10 will have zero derivative. Keeping only the dominant terms and balancing the resulting equation, we find

\tau_m^* \sim \tau_E^{1/2}.    [11]

This scaling argument aligns with numerical calculation as
shown in Figure 2C.
Similarly, we calculate how optimal divergence D̄* scales
with environmental timescale. Assuming that the agent has
a good estimate of the environment such that the error in
average configuration ετc(t) is small, agent behavior is pE(s, t) +
ετc(t) and ετc(t) is normally distributed. Then, we expand
the divergence about pE(s, t) in Taylor series of error ετc(t)
(Materials & Methods). Over a timescale of τm*, the precision
of this estimate is further narrowed by a factor of τm* such that

\bar{D}^* \sim 1/\tau_m^* \sim \tau_E^{-1/2}.    [12]

Although we do not account for the transient phase, we expect
the relation in Eq 12 to dominate in the limit of large τE,
and our numerical calculations indeed approach the predicted
scaling in Figure 2C. In contrast, when the environment does not
fluctuate, or bias h0 = 0, agents pay no cost for failing to adapt
to new environments and infinite memory is optimal. Overall,
the sublinear scaling between memory duration and rate of
environmental change indicates an economy of scale. Agents
require proportionally less expenditure on adaptation in slow
environments than would be true under a linear relationship.
Hence a slow environment is in this sense highly favorable to an
adaptive agent when considering the costs of poor adaptation.
[Figure 3 plot labels: env. bias h0, sensory cell precision 1/τc, log10 benefit; panels A (remembering) and B (forgetting).]
Fig. 3. Benefit from (A) remembering and from (B) forgetting defined as the reduction
in divergence at optimal memory duration relative to no memory, D̄(τm = 0) − D̄(τm*),
and optimal memory duration to infinite memory, D̄(τm = ∞) − D̄(τm*),
respectively. We show passive agents given environmental timescale τE = 10. All
contours must converge (A) when h0 = 0 and (B) when τc = 0. Agent-based
simulation parameters specified in accompanying code.
Result 3: Metabolic cost of memory can become prohibitive in slow environments
Here we ask how memory might become limited by the
metabolic costs of neural tissue.
We start with the well-documented observation that physical constraints on circulatory networks responsible for energy
distribution influence organismal traits including lifespan and
size across the animal kingdom from microorganisms to blue
whales (56, 57). Metabolic costs matter for brain mass Mbr,
which scales with body mass Mbo sublinearly, Mbr = A Mbo^a,
where a = 3/4 across taxa (within individual taxa it spans
the range 0.24 to 0.81 (58)). To account for memory cost,
we make the simple assumption that the quantity of brain
mass required for memory is proportional to the number and
duration of environmental states (the “environmental burden”)
the organism encounters,

M_{br} \propto N \tau_E.    [13]

After all, we say, “an elephant never forgets” and not the same
of a mouse.
Now, we use predictions of allometric scaling theory to
relate metabolic rate B to mass, B ∝ M^{1/4} (59), and lifespan
to body mass, T ∝ Mbo^b for metabolic exponent b = 1/3 (60).
From Eq 13, we obtain a relationship between metabolic rate
and memory burden, B ∝ N^φ τE^φ, where φ ≡ a/4b.† Note
that this scaling is sublinear for biological organisms, φ < 1.
Although the adaptive cost decays with τE in Figure 4A,
metabolism grows as τE^φ as shown in Figure 4B. The competing
scalings suggest that for small organisms the cost of adaptation
will make a disproportionate contribution to the lifetime energy
budget of an organism. This is consistent with observations
on developmental neural growth in butterflies (61).‡
[Figure 4 plot labels: adaptive cost, met. cost, environmental timescale τE.]
Fig. 4. Scaling of adaptive and metabolic costs with environmental timescale τE.
(A) Adaptive cost D̄ is largest at small τE, but (B) metabolic costs are largest for
longer-lived organisms with scaling dependent on exponents y and φ such as for
“elephants” that experience slower environments (Eq 14).
† When we use a = 3/4, we obtain the range φ = [5/8, 15/16], the endpoints depending
on whether b = 0.3 or b = 0.2, respectively, while accounting for taxa-specific variation in a
leads to much wider range of φ ∈ [0.2, 1.01]. Thus, we hypothesize that longer environmental
timescales lead to increased brain mass and metabolic expenditure with sublinear scaling.
‡ As noted in the cited study and its citations, experience leads to larger brain size, indicating that
learning from such experience is sufficiently valuable to warrant concomitant constitutive and induced costs.
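The ranges quoted for the exponent φ can be checked with exact arithmetic. The sketch below reads φ ≡ a/4b as a/(4b); pairing the smallest a with b = 0.3 and the largest a with b = 0.2 to obtain the wider range is an inference made here because it reproduces the quoted endpoints, not a procedure spelled out in the text.

```python
from fractions import Fraction

def phi(a, b):
    """Exponent phi = a / (4 b) relating metabolic rate to memory burden."""
    return Fraction(a) / (4 * Fraction(b))

a_typical = Fraction(3, 4)
low = phi(a_typical, Fraction(3, 10))    # b = 0.3
high = phi(a_typical, Fraction(1, 5))    # b = 0.2
print("phi range for a = 3/4:", low, "to", high)

# Taxon-specific range of a quoted in the text: 0.24 to 0.81.
wide_low = phi(Fraction(24, 100), Fraction(3, 10))
wide_high = phi(Fraction(81, 100), Fraction(1, 5))
print("phi range across taxa:", float(wide_low), "to", float(wide_high))
```

Under this reading the upper end of the taxon-specific range slightly exceeds 1, so sublinear scaling holds for most but not all of the quoted parameter combinations.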
[Figure 5 plot labels: complexity weight, stab. weight, stab. advantage.]
Fig. 5. Comparison of total divergence for stabilizers DS and destabilizers DS ′ , or
DS ′ − DS , in a fixed environment and common sensory precision and costs. The
difference is between agents poised at optimal memory duration given µ, χ, and β .
Small stabilization weight χ favors stabilizers, whereas high monopolization cost µ
favors destabilizers. Simulation parameters are specified in accompanying code.
To generalize the previous argument, we assume larger
organisms experience longer environmental timescales. Then,
τE ∝ T^y, where y ∈ [0, 1] to ensure that τE and N increase
together since τE^{1/y−1} ∝ N. We now find the relationship
between metabolic rate and environmental timescale

B \propto \tau_E^{\phi/y} \propto N^{\phi/(1-y)},    [14]
which reduces to the previous case when y = 1 (and N is
a constant). Such dependence implies that the metabolic
cost of memory will explode with environmental timescale
(and organism lifetime) as y approaches zero and grow slowly
and sublinearly when y = 1. Both possibilities are shown in
Figure 4B. More generally, lifespan is expected to influence
the relative contributions of adaptive versus metabolic costs
(62, 63).
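The chain of proportionalities behind Eq 14 can be spelled out as follows. This is a sketch that starts from the relation B ∝ N^φ τE^φ obtained from Eq 13 and from the stated assumption τE^{1/y−1} ∝ N; it introduces no new symbols.

```latex
\begin{align*}
B &\propto (N\,\tau_E)^{\phi}, \qquad N \propto \tau_E^{1/y-1}
\;\Longrightarrow\; N\,\tau_E \propto \tau_E^{1/y}
\;\Longrightarrow\; B \propto \tau_E^{\phi/y}, \\
\tau_E &\propto N^{y/(1-y)}
\;\Longrightarrow\; B \propto N^{\phi/(1-y)}.
\end{align*}
```

Setting y = 1 makes N constant and recovers B ∝ τE^φ, while y approaching zero sends the exponent φ/y to infinity, which is the sense in which the metabolic cost explodes with environmental timescale.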
Result 4: Niche construction, memory complexity, &
the outsourcing principle
In Result 3, we explored the metabolic cost of memory versus
adaptation, emphasizing the metabolic constraints on long
memories. In this section we focus on the information costs
of adaptation when allowing for active modification of the
environment. We explore how outsourcing memory to the
environment by slowing it down is beneficial when costs of
poor adaptation are dominant (10).
A slow environmental timescale increases the advantages of
persistent memory, but it also reduces the amount of new information an organism requires by reducing uncertainty about
the state of the environment. In this sense, slow environmental variables reflect a form of niche construction. Whether
ant pheromone trails, food caching, collectively computing
power structures, writing, or map-making, niche construction that promotes the stability or predictability of the local
environment (5, 64) reduces the number of environmental
configurations that an organism needs to encode. Stabilizing
niche construction, however, also creates a public good that
by reducing environmental uncertainty, provides a benefit to
all agents, and can be exploited by free riders. This can lead
to a tragedy of the commons (65).
We explore the conditions under which active modification
of the environment can evolve given the free-rider problem,
and how this overcomes the costs of adaptation. We introduce stabilizing mutants into a population of passive agents.
Assuming other organisms are poorly adapted to regularities
in the environment, we expect stabilizing mutants to gain a
competitive advantage but only over the short term. In the
longer term, an established stabilizer population is susceptible
to invasion by free-riders exploiting outsourced memory; said
another way, stabilizers slow environmental timescales and
reduce divergence for all individuals sharing the environment,
but they uniquely pay for stabilization. Thus, as in the classical example of niche construction, the usual “tragedy of the
commons” argument makes it an evolutionary dead end (65).
It follows that stabilization is only a competitive strategy
if individuals can monopolize extraction of resources from the
stabilized environment. In the natural world, this could occur
through physical encryption (e.g. undetectable pheromones
(66)), the erasure of signal (e.g. food caching (67)), or the
restriction of social information (e.g. concealment (68)). To
model competition between monopolistic stabilizers and other
strategies, we account for the costs of memory, stabilization,
and precision. We introduce a new memory cost of encoding
complex environments as
H(τm ) = log2 (1 + 1/τm ).
[15]
Eq 15 can be thought of as a cost of exploring more configurations over a short period of time versus agents that are
temporally confined. This is different from costs associated
with the environmental burden in Result 3, which emphasizes
the costs of persistence, not variability.
We define the cost stabilizers pay for niche construction
as the extent of change to the environmental switching rate,
or the KL divergence between the natural environmental rate
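1/τE and the time-averaged, modified rate ⟨1/τ̃E⟩,

G(1/\tau_E, \langle 1/\tilde{\tau}_E \rangle) = \frac{1}{\tau_E} \log_2 \frac{1/\tau_E}{\langle 1/\tilde{\tau}_E \rangle} + \left(1 - \frac{1}{\tau_E}\right) \log_2 \frac{1 - 1/\tau_E}{1 - \langle 1/\tilde{\tau}_E \rangle}.    [16]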
The quantity G depends implicitly on stabilization strength α
because smaller α slows the environment further. For passive
agents and destabilizers, G = 0 by definition because nonstabilizers fit to τE and only stabilizers benefit from the slower
timescale with monopolization.
We finally consider the cost of precision, which we assume
to be given by the information obtained by the agent from
sampling the environment,
C(τc ) = log2 τc .
[17]
Sensory complexity means that higher precision implies higher
expenditure to obtain such precision, given by the KL divergence between environment configuration and agent behavior, C ∼ −log2(σ²), leaving out constants. This depends on the variance of agent measurement noise, σ² = pE(s, t)[1 − pE(s, t)]/τc. Infinitely precise sensory cells lead to
diverging cost, whereas imprecise cells are cheap.
Putting these costs together with divergence D̄, we obtain
the total divergence
D = D̄ + µH + χG + βC.
[18]
Weights µ ≥ 0, χ ≥ 0, β ≥ 0 represent the relative contribution
of these costs. As a result, we can distinguish dominant
strategies by comparing total divergence such as between the
pair of destabilizer and stabilizer strategies shown in Figure 5.
Large µ, or high complexity cost, means that a pure population
of stabilizers would be stable to invasion from destabilizers.
Whereas for large χ, or heavy stabilization cost, the opposite
is true. The generalized measure of adaptive cost in Eq 18,
given the weights, carves out regions of agent morphospace
along axes of computational cost. This is a morphospace that
captures the relative advantage of internal versus external
memory that can be thought of as a space of evolutionary
outsourcing.
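The cost terms in Eqs 15–18 can be sketched numerically as follows. This is a minimal illustration, not the authors' code: the function names and the example values are hypothetical, and the time-averaged divergence D̄ is passed in as a number rather than computed from the model.

```python
import math


def memory_cost(tau_m):
    """Memory complexity cost H(tau_m) = log2(1 + 1/tau_m) in bits (Eq 15)."""
    return math.log2(1.0 + 1.0 / tau_m)


def stabilization_cost(rate_natural, rate_modified):
    """KL divergence in bits between the natural switching rate 1/tau_E and
    the time-averaged modified rate (Eq 16). Both rates must lie in (0, 1)."""
    p, q = rate_natural, rate_modified
    return p * math.log2(p / q) + (1.0 - p) * math.log2((1.0 - p) / (1.0 - q))


def precision_cost(tau_c):
    """Precision cost C(tau_c) = log2(tau_c) in bits (Eq 17)."""
    return math.log2(tau_c)


def total_divergence(d_bar, tau_m, tau_c, rate_natural, rate_modified,
                     mu, chi, beta):
    """Total divergence D = D_bar + mu*H + chi*G + beta*C (Eq 18).
    d_bar is an assumed input here, not derived from the agent model."""
    return (d_bar
            + mu * memory_cost(tau_m)
            + chi * stabilization_cost(rate_natural, rate_modified)
            + beta * precision_cost(tau_c))


# Hypothetical example: a passive agent (modified rate equals natural rate).
example = total_divergence(d_bar=0.5, tau_m=1.0, tau_c=8.0,
                           rate_natural=0.1, rate_modified=0.1,
                           mu=1.0, chi=1.0, beta=0.5)
```

Comparing `total_divergence` for two strategies under the same weights is one way to read the pairwise comparison described above; the stabilization term vanishes whenever the modified rate equals the natural rate.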
As has often been remarked in relation to evolution, survival is not the same as arrival. We now determine when
stabilizer strategies can emerge in this landscape. We start
with a pure population of passive agents with stabilization
strength α = 0 and poised about optimal memory duration τm = τm∗ determined by minimizing both divergence D̄ and
complexity µH. Whether or not stabilizers emerge under
mutation and selection can be determined through adaptive
dynamics (69–71), that is by inspecting the gradient of the
total divergence along the parameters (∂τm D, ∂α D, ∂τc D), or
memory complexity, stabilizer strength, and precision. As
we show in SI Appendix C and Eq S16, the gradient terms
can be calculated under a set of perturbative approximations.
Using local convexity about optimal memory τm∗, we show that
the term ∂α D drives passive agents to smaller α and slower
timescales; it originates from combining the scaling law from
Eq 12 and complexity of memory. The term ∂τc D shows that
precision tends to decrease when the cost gradient ∂τc (βC)
dominates over ∂τc D̄. In this case, the general conditions
∂α D < 0 and ∂τc D < 0 funnel a passive population towards
stabilization and reduced precision.
Discussion
Life is adaptive, but optimal adaptation would seem to depend
on a multitude of properties of both organism and environment,
which have been studied in a wide literature (Table 1). To
the contrary, we predict that it does not. This becomes clear
once we organize crucial aspects of adaptation into a unified
framework in terms of timescales including niche construction
that speeds up or slows down the environment (Figure 1). We
find that memory duration, under a wide range of assumptions and conditions, scales sublinearly with environmental
rates of change (Figure 2). This essentially derives from the
competition between using current but noisy information and
the reliance on outdated but precise information, leading to
a universal, optimal timescale for adaptation. Importantly,
sublinear scaling implies that persistent features of the environment can be more efficiently encoded the longer-lasting
they become; there is an economy of scale.
Yet, memory remains costly as it requires investing in neural
tissue. To estimate this cost and how it might affect adaptation,
we use metabolic scaling theory to estimate how much neural
tissue an organism must allocate to memory for a given rate
of environmental change. We find that the metabolic costs
of memory can increase super-linearly with the persistence
time of environmental statistics. Thus, while memory need
not grow in proportion to environmental stability, costs of
memory could increase disproportionately (Figure 4). Because
adaptive costs peak at short timescales, this suggests that
adaptive costs are most important for organisms with short
lifespans such as insects.
When the costs of adaptation are greater than the metabolic
costs of memory, active modification of the environment such as
stabilizing niche construction can be favored. In this case the
organism intervenes on the environmental timescale to decrease
volatility. Although outsourcing of memory to the environment
reduces the organism’s need to adapt, it introduces two new
problems. First, active modification is itself not free. Second,
slow environmental variables created by active modification
are public goods that can be exploited by free riders.
To address the costs of active modification and free riding,
we introduce game dynamics considering the information costs
of adaptation including the complexity of memory. Unlike
memory duration, memory complexity quantifies the effective
number of states that agents occupy. Starting with passive
agents, we find that the spontaneous emergence of adaptive
dynamics stabilizes the environment, lengthening the optimal
memory duration τm∗ and thereby making weak stabilizers less
competitive. This moves a population as a whole towards
slower timescales. In other words, stabilizing niche construction, because of the economy of scale with respect to memory,
requires proportionally less neural tissue for memory relative
to the size of the whole brain as given by metabolic scaling
theory. This is effectively outsourcing memory from neural tissue to the environment. As a possible consequence, organisms
could reduce absolute brain size or invest in a larger behavioral
repertoire, increasing competitiveness by monopolizing a larger
number of environmental states. Do learning agents in volatile
environments “given a choice” to invest in additional memory
or to directly change the environment favor the latter?
This hypothesis is consistent with related work on institutions and social structure as a form of collectively encoded
memory (72–75) or as devised constraints (e.g. reference 76)
that slow down the need to acquire functional information. In
pigtailed macaque society (reviewed in reference 10), individuals collectively compute a social-power distribution from status
signaling interactions. The distribution of power as a coarse-grained representation of underlying fight dynamics changes
relatively slowly and consequently provides a predictable social
background against which individuals can adapt. By reducing
uncertainty and costs, the power distribution facilitates the
emergence of novel forms of impartial conflict management.
Conflict management, in turn, further reduces volatility, allowing individuals to build more diverse and cohesive local
social niches and engage in a greater variety of socially positive
interactions (77). In other words, outsourcing memory, in this
case, of fight outcomes, to a stable social structure in the
power distribution allows for a significant increase in social
complexity. More generally, we anticipate that one of the
features of slowing environmental timescales, including social
environments fostered by institutions, might be the emergence of
new functions (78).
Without feedforward and feedback loops between environment and agent such as in the case of the passive agent, our
framework is akin to the classical problem of learning. This
has been a major problem of interest in foraging (79), neural
circuits that adapt to changing input distributions (19, 80, 81)
and modes of prediction in order to best adapt to multiple
clustered sets of statistics (20, 80). We introduce here a minimal modeling framework for connecting learners to active
agents that modify the environment through the act of adaptation. Our framework provides a first-order approximation
to this extended space, which could itself be extended in several directions to include how agents physically modify their
environments, connecting the physics of behavior with the
physics of information (82).
Materials and Methods
Numerical solution to model. Given Eqs 1–4 defining the binary
agent, we calculate agent behavior in two ways. The first method
is with agent-based simulation (ABS). We generate a long time
series either letting the environment fluctuate independently and
training the agent at each moment in time or coupling environmental
fluctuations at each time step with the state of the agent. By
sampling over many such iterations, we compute the distribution
over agent bias given environmental bias, q(h|hE ), which converges
to a stationary form.
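The agent-based simulation for a passive agent can be sketched as below. This is a rough illustration under stated assumptions, not the authors' implementation: it assumes the binary state probability p(s = 1) = [1 + tanh h]/2, an aggregator of the form h(t) = (1 − β)ĥ(t) + βh(t − 1) with β = exp(−1/τm), and hypothetical function names and default parameters. Eqs 1–4 of the main text define the actual model.

```python
import math
import random


def kl_bits(h_env, h_agent):
    """KL divergence D_KL[p_E || p] in bits for binary states,
    assuming p(s=1) = (1 + tanh h) / 2 for both environment and agent."""
    p_env = 0.5 * (1.0 + math.tanh(h_env))
    p_agent = 0.5 * (1.0 + math.tanh(h_agent))
    total = 0.0
    for pe, pa in ((p_env, p_agent), (1.0 - p_env, 1.0 - p_agent)):
        if pe > 0.0:
            total += pe * math.log2(pe / max(pa, 1e-300))
    return total


def simulate_passive_agent(h0=0.2, tau_e=20, tau_c=10, tau_m=5.0,
                           steps=20000, burn_in=1000, seed=0):
    """Time-averaged divergence of a passive agent (assumed update rule)."""
    rng = random.Random(seed)
    beta = math.exp(-1.0 / tau_m) if tau_m > 0 else 0.0
    h_env, h_agent = h0, 0.0
    total, count = 0.0, 0
    for t in range(steps):
        # Environment bias reverses with probability 1/tau_E per step.
        if rng.random() < 1.0 / tau_e:
            h_env = -h_env
        # Sensory cells draw tau_c binary samples from the environment.
        p_up = 0.5 * (1.0 + math.tanh(h_env))
        mean_s = sum(1 if rng.random() < p_up else -1
                     for _ in range(tau_c)) / tau_c
        # Bound the sample mean so the measured field stays finite.
        mean_s = max(-1.0 + 1e-15, min(1.0 - 1e-15, mean_s))
        h_hat = math.atanh(mean_s)
        # Assumed exponential aggregator blending new and remembered fields.
        h_agent = (1.0 - beta) * h_hat + beta * h_agent
        if t >= burn_in:
            total += kl_bits(h_env, h_agent)
            count += 1
    return total / count


d_bar = simulate_passive_agent(steps=5000)
```

Sweeping `tau_m` at fixed `tau_e` and `tau_c` with the same seed would trace out a divergence curve of the kind discussed in the text, subject to the assumptions above.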
This principle of stationarity motivates our second solution of
the model using an eigenfunction method. If the distribution is
stationary, then we expect that under time evolution the
conditional agent distribution maps onto itself
q(h|hE ) = T [q(h|hE )].
[19]
If the time-evolution operator T evolves the distribution over a
single time step, the external field can either stay the same with
probability 1 − 1/τE or reverse with probability 1/τE .
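The idea of solving q = T[q] by repeated application of T can be illustrated on a small discrete example. The sketch below is a generic fixed-point (power) iteration on a hypothetical column-stochastic matrix; it is not the paper's operator, which acts on a continuous density and includes the noise convolution described next.

```python
def stationary_distribution(transition, tol=1e-12, max_iter=10000):
    """Iterate q <- T q until successive iterates differ by less than tol.

    transition[i][j] is the probability of moving from state j to state i,
    so each column sums to 1.
    """
    n = len(transition)
    q = [1.0 / n] * n
    for _ in range(max_iter):
        q_next = [sum(transition[i][j] * q[j] for j in range(n))
                  for i in range(n)]
        norm = sum(q_next)
        q_next = [x / norm for x in q_next]
        if max(abs(a - b) for a, b in zip(q, q_next)) < tol:
            return q_next
        q = q_next
    return q


# Hypothetical two-state chain: state 0 persists with probability 0.9,
# state 1 persists with probability 0.8.
toy_transition = [[0.9, 0.2],
                  [0.1, 0.8]]
toy_stationary = stationary_distribution(toy_transition)
```

The iterate converges to the eigenvector with eigenvalue 1, at a rate set by the second-largest eigenvalue (0.7 in this toy case).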
For either of these two possible alternatives over a single time
step, we must convolve the distribution with the distribution of
noise for the field ητc . The distribution of noise derives from agent
perceptual errors ϵτc on the estimated probabilistic bias of the environment (Eq 2). Hence, the corresponding error distribution for the
bias ητc originates from the binomial distribution through a transformation of variables. We can simplify this because in the limit of
large sensory cell sample size τc the binomial distribution converges
to a Gaussian and a concise representation of the distribution of
ητc becomes accurate. Using Eq 1, we find that the distribution of
perceptual errors in the bias yields
\[ \rho(\eta_{\tau_c}, t) = (8\pi\sigma^2)^{-1/2} \exp\left\{ -\frac{[\tanh h_E(t) - \tanh(h_E(t) + \eta_{\tau_c})]^2}{8\sigma^2} \right\} \mathrm{sech}^2(h_E(t) + \eta_{\tau_c}). \] [20]
Here, the agent’s perceptual estimate of the environment includes
finite-sample noise determined by the sensory cell precision 1/τc.
At finite τc, there is the possibility that the agent measures a sample
from the environment of all identical states. In our formulation, the
fields then diverge as do the fields averaged over many separate measurements. We do not permit such a “zero-temperature” agent that
freezes in a single configuration in our simulation, just as thermodynamic noise imposes a fundamental limit on invariability in nature.
Our agents inhabit an in silico world, where the corresponding limit
is fixed by the numerical precision of the computer substrate, so we
limit the average of the bits sampled from the environment to be
within the interval [−1 + 10^−15, 1 − 10^−15]. This is one among several
variations of the idea that inference is constrained by regularization,
Bayesian priors, Laplace counting (in the frequentist setting), etc.
Regardless of the particular approach with which finite bounds
might be established, they are only important in the small τc limit.
See SI Appendix A.
Given the Gaussian approximation to precision error, we propagate the conditional distribution over a single time step, defining a
self-consistent equation that can be solved by iterated application.
To make this calculation more efficient, we only solve for abscissa
of the Chebyshev basis in the domain β ∈ [0, 1], fixing both
endpoints of the interval including the exact value for β = 1 from
Eq 24 (83) (more details in SI Appendices A and B). In Figure
S7, we show that our two methods align for a wide range of agent
memory τm. Importantly, the eigenfunction approach is much faster
than ABS for large τc because the latter can require a large number
of time steps to converge. On the other hand, ABS is relatively fast
for small τc. Thus, these two approaches present complementary
methods for checking our calculation of agent adaptation.
The code used to generate these results will be made available on
GitHub at https://github.com/eltrompetero/adaptation.
Divergence curves. To measure how well agent behavior is aligned
with the environment, we compare environment pE(s, t) and agent
p(s, t) with the KL divergence at each time step to obtain the
agent’s typical loss in Eq 9. Equivalently, we can average over the
stationary distribution of fields conditional on environment
\[ \bar{D} = \frac{1}{N_E} \sum_{E} \int_{-\infty}^{\infty} dh\, q(h|h_E)\, D_{\rm KL}[p_E(h_E)\,\|\,p(h)], \] [21]
where we sum over all possible environments E and weight them
inversely with the number of total environments NE . For the binary
case, NE = 2. We furthermore simplify this for the binary case as
\[ \bar{D} = \int_{-\infty}^{\infty} dh\, q(h|h_E = h_0)\, D_{\rm KL}[p_E(h_E = h_0)\,\|\,p(h)]. \] [22]
In Eq 22, we have combined the two equal terms that arise from both
positive hE = h0 and negative hE = −h0 biases of the environment.
In Figure 2A and B, we show divergence as a function of agent
memory over a variety of environments of varying correlation time
D̄(τm , τE ). When the agent has no memory, its behavior is given
solely by the properties of the sensory cells as is determined by the
integration time τc . Then, we only need account for the probability
that the environment is in either of the two symmetric configurations
and how well the memoryless agent does in both situations. Since
the configurations are symmetric, the divergence at zero memory is
\[ \bar{D}(\tau_m = 0) = \int_{-\infty}^{\infty} d\eta_{\tau_c}\, \rho(\eta_{\tau_c}|h_E = h_0) \times \sum_{s\in\{-1,1\}} p_E(s|h_E = h_0) \log_2 \frac{p_E(s|h_E = h_0)}{p(s)}, \] [23]
where the biased distribution of environmental state pE and the
error distribution ρ from Eq 20 are calculated with environmental
bias set to hE = h0 . Note that this is simply Eq 22 explicitly
written out for this case.
At the limit of infinite agent memory, as in the right hand side
of Figure 2A, passive agents have perfect memory and behavior
does not budge from its initial state. Assuming that we start with
an unbiased agent such that q(h) = δ(h), the Dirac delta function,
the agent’s field is forever fixed at h = 0. Then, divergence reduces
to
\[ \bar{D}(\tau_m = \infty) = 1 - S[p_E], \] [24]
where the conditional entropy S[pE] = −pE(s|h = h0) log2 pE(s|h = h0) − [1 − pE(s|h = h0)] log2[1 − pE(s|h = h0)].
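As a quick numerical check of Eq 24, the sketch below evaluates the infinite-memory divergence for a given bias h0. It assumes the binary parameterization pE(s = 1) = [1 + tanh h0]/2, which is consistent with the tanh terms in Eq 20 but is stated here as an assumption; the function names are hypothetical.

```python
import math


def binary_entropy_bits(p):
    """Entropy in bits of a binary variable with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def divergence_infinite_memory(h0):
    """D(tau_m = infinity) = 1 - S[p_E] for an agent frozen at h = 0 (Eq 24),
    assuming p_E(s=1) = (1 + tanh h0) / 2."""
    p_env = 0.5 * (1.0 + math.tanh(h0))
    return 1.0 - binary_entropy_bits(p_env)


def kl_to_uniform_bits(h0):
    """Direct KL divergence from p_E to the unbiased agent p(s) = 1/2."""
    p_env = 0.5 * (1.0 + math.tanh(h0))
    return sum(p * math.log2(p / 0.5) for p in (p_env, 1.0 - p_env) if p > 0.0)


d_inf = divergence_infinite_memory(0.2)
```

The two functions agree because the KL divergence from any binary distribution to the uniform one equals one bit minus the entropy, which is the content of Eq 24.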
Scaling argument for optimal memory. As is summarized by Eq 10,
the value of optimal memory can be thought of as a trade-off
between the costs of mismatch with environment during the transient
adaptation phase and gain from remembering the past during stable
episodes. In order to apply this argument to the scaling of divergence,
we consider the limit where the environment decay time τE is very
long and agent memory τm is long though not as long as the
environment’s. In other words, we are interested in the double
limit τm → ∞ and τm /τE → 0. Then, it is appropriate to expand
divergence in terms of the error in estimating the bias
\[ \bar{D} = \left\langle \sum_{s\in\{-1,1\}} p_E(s,t) \log p_E(s,t) - p_E(s,t) \log[p_E(s,t) + \epsilon_{\tau_c}(t)] \right\rangle, \] [25]
where the average is taken over time. Considering only the second
term and simplifying notation by replacing ϵτc(t) with ϵ,
\[ \left\langle p_E(s,t) \left\{ \log p_E(s,t) + \log[1 + \epsilon/p_E(s,t)] \right\} \right\rangle \approx \left\langle p_E(s,t) \left\{ \log p_E(s,t) + \frac{\epsilon}{p_E(s,t)} - \frac{1}{2} \frac{\epsilon^2}{p_E(s,t)^2} \right\} \right\rangle, \] [26]
where the average error ⟨ϵ⟩ = 0 and we assume that the next nontrivial correlation of fourth order O(⟨ϵ⁴⟩) is negligible. Plugging
this back into Eq 25,
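\[ \bar{D} \approx \sum_{s\in\{-1,1\}} \left[ \frac{\tau_E - \tau_m}{\tau_E} \frac{\langle \epsilon^2 \rangle}{p_E(s)^2} + \frac{\tau_m}{\tau_E} \left\langle \frac{\epsilon^2}{p_E(s,t)^2} \right\rangle \right]. \] [27]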
The first term in Eq 27 relies on the fact that when environmental
timescales are much longer than agent memory, the errors become
independent of the state of the environment. Thus, we can average
over the errors separately, and the environment configuration average can be treated independently of time pE (s, t) → pE (s). The
second term, however, encases the transient dynamics that follow
immediately after a switch in environmental bias while the agent
remembers the previous bias. It is in the limit τm /τE → 0 that we
can completely ignore this term and the scaling for optimal memory
τm∗ ∼ τE^(1/2) from Eq 11 is the relevant limit that we consider here.
Since the errors in the agent’s matching of environmental bias follow a Gaussian distribution, the precision
increases with the number of samples taken of the environment:
it should increase with both sensory cell measurement time τc as
well as the typical number of time steps in the past considered,
τm = −1/ log β. Thus, we expect the scaling of divergence at
optimal memory to be
\[ \bar{D}^* \sim \frac{1}{\tau_m^* \tau_c}, \] [28]
which with Eq 11 leads to the scaling of optimal memory with
environment decay time Eq 12. Though the scaling with precision
timescale τc in Eq 28 is at τm = τm∗, it is clear that a similar scaling
with τc holds at τm = 0, where only precision determines divergence.
However, such a scaling does not generally hold for any fixed τm ,
the trivial case being at τm = ∞, where divergence must go to a
constant determined by environmental bias.
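The premise behind Eq 28, that the squared estimation error shrinks in proportion to the number of samples τc × τm, can be checked with a small Monte Carlo experiment. The sketch below is an illustration under a simplifying assumption: memory is treated as a plain window of τm independent time steps rather than the exponential aggregator used in the model, and the function name and parameter values are hypothetical.

```python
import random


def estimate_variance(p_up, tau_c, tau_m, trials=4000, seed=0):
    """Sample variance of the estimated probability of the 'up' state when
    averaging tau_c * tau_m independent binary samples (assumed windowed
    memory, not the paper's exponential aggregator)."""
    rng = random.Random(seed)
    n = tau_c * tau_m
    means = []
    for _ in range(trials):
        ups = sum(1 for _ in range(n) if rng.random() < p_up)
        means.append(ups / n)
    avg = sum(means) / trials
    return sum((m - avg) ** 2 for m in means) / (trials - 1)


# Quadrupling memory at fixed sensory precision should cut the variance
# of the estimate by roughly a factor of four.
var_short = estimate_variance(0.6, tau_c=10, tau_m=4)
var_long = estimate_variance(0.6, tau_c=10, tau_m=16, seed=1)
variance_ratio = var_short / var_long
```

Under these assumptions the variance is close to p(1 − p)/(τc τm), so doubling either timescale halves the error term that enters the second-order expansion of the divergence.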
ACKNOWLEDGMENTS.
E.D.L. was supported by the Omega
Miller Program at the Santa Fe Institute. D.C.K. and J.F. are
grateful for support from the James S. McDonnell Foundation 21st
Century Science Initiative-Understanding Dynamic and Multi-scale
Systems.
A. Agent-based simulation
To complement the eigenfunction solution described in Appendix B,
we present the agent-based simulation.
After having specified the environmental bias hE(t), we generate
a sample of τc binary digits from the distribution pE(s, t). From
this sample, we calculate the mean of the environment ⟨s⟩, which
is bounded in the interval [−1 + 10^−15, 1 − 10^−15]. These bounds
are necessary to prevent the measured field ĥ(t) from diverging and
reflect the fact that in silico agents have a finite bound in the values
they can represent, mirroring finite cognitive resources for biological
or social agents as discussed in Materials and Methods. We combine
this estimated field ĥ(t) with the one from the aggregator, having set
the initial condition H(0) = 0. Given the estimate of the field
h(t), we compute the Kullback-Leibler (KL) divergence between
the agent distribution p(s) and the environment pE(s).
When we calculate the divergence landscape across a range of
different agent memories, we randomly generate the environment
using the same seed for the random number generator. Though
this introduces bias in the pseudorandom variation between divergence for agents of different types, it makes clearer the form of the
divergence landscape by eliminating different offsets between the
points. Our comparison of this approach with the eigenfunction
solution in Appendix B provides evidence that such bias is small
with sufficiently long simulations. For the examples shown in the
main text, we find that total time T = 10^7 or T = 10^8 is sufficient
for convergence to the stationary distribution after ignoring the first
t = 10^4 time steps.

[Figure S6: panels (A) complexity cost H + C and (B) stabilization cost G, plotted against agent memory τm and sensory cell precision 1/τc.]
Fig. S6. Landscape of the costs we consider as (A) a combined agent complexity
and (B) a stabilization cost. (A) Isocontours defined as sum of memory complexity
and sensory precision costs. The values have been offset to ensure that the costs
are positive over the shown landscape calculated from memory (Eq 15) and sensory
cost (Eq 17). (B) As memory τm → ∞, stabilization cost converges to a finite value
that can be calculated exactly from noting that agent behavior has probability density
fixed at its starting point, q(h) = δ(h). A kink in the contours at 1/τc = 10^−3
arises from numerical precision errors where we matched up ABS and eigenfunction
methods.

B. Eigenfunction solution
We present more details on top of those in Materials and Methods
on the iterative, eigenfunction solution to the divergence of an agent
relying on the fact that the distribution of agent bias q(h) becomes
stationary at long times.
Let us first consider the case of the passive agent. After sufficiently long time, the distribution of agent behavior q(h) and
the distribution conditioned on the two states of the environment
q(h|hE = h0) and q(h|hE = −h0) converge to stationary forms.
Assuming that the distributions have converged, we evolve the distribution a single time step. If the external field hE(t) = h0, then
it either stays fixed with probability 1 − 1/τE or it switches to the
mirrored configuration with probability 1/τE.
Considering now the evolution of the conditional probability
q(h|hE = h0), we note that the state of the agent will either be
convolved by the distribution of sampling error at the next time
step or lose probability density from a switching field. Since we
are considering a symmetric configuration, however, the mirrored
conditional density will reflect the same probability density back
such as in Eq S1. Thus, Eq S1 is satisfied by the conditional density
of agent bias that is solved by the eigenfunction for q(h|hE) with
eigenvalue 1. By the Perron-Frobenius theorem when considering
normalized eigenvectors, this is the unique and largest eigenvalue
that returns the stationary solution.
To extend this formulation to active agents, we must also account
for the dependence of the rate of switching on the distance between
agent and environmental bias. This additional complication only
requires a modification of Eq S1 to include such dependence in the
rate coefficients. Thus, all types of agents can be captured by this
eigenfunction solution and solved by iteration until convergence.
Eq S1 is only independent of time when agent memory τm = 0.
When there is finite memory, or β > 0, the distribution q(h, t)
“remembers” the previous state of the environment such that we
must iterate Eq S1 again. Over many iterations, we will converge to
the solution, but the convergence slows with agent memory, which
introduces ever slower decaying eigenfunctions. An additional difficulty arises because the narrowing in the peak of the agent’s estimate
of the environment, like the peaks shown in Figure S7, requires
increased numerical precision. As a result, increasing memory and
computational costs make it infeasible to calculate the eigenfunction
with high precision for β close to 1.
Instead of calculating the full functional form directly for β just below
the limit β → 1, we use the output of the iterative
eigenfunction procedure as input for an interpolation procedure
using Chebyshev polynomials. We iterate Eq S1 for β equal to
the Gauss-Lobatto abscissa of the Chebyshev polynomial of degree
d, mapping the interval β ∈ [0, 1] to the domain x ∈ [−1, 1] for
the set of Chebyshev polynomials (83). The Gauss-Lobatto points
include the endpoints β = 0 and β = 1, the first of which is trivial
numerically and the latter for which we have an exact solution
given in Eq 24. Then, we exclude calculated values for large β that
show large iteration error ϵ > 10^−4. This threshold, however, leaves
the coefficients of the Chebyshev polynomial undetermined. We
instead interpolate these remaining N − k points by fitting a
Chebyshev polynomial of degree N − k − 1 with least-squares on
the logarithm of the divergence. A similar procedure can be run
for the stabilization cost from Eq 16 to obtain Figure S6B. We find
that typically N = 30 or N = 40 starting abscissa with a maximum
of 10^3 iterations are sufficient to obtain close agreement with the
agent-based simulation (ABS) from Appendix A (Figure S8). This
interpolation procedure does not work well with ABS because small
stochastic errors can lead to high-frequency modes in interpolation
(and thus large oscillations), errors that can be essentially driven to
zero exponentially fast for the eigenfunction method.
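The choice of memory values at which the eigenfunction is iterated can be sketched as follows. This is a minimal illustration of Chebyshev-Gauss-Lobatto abscissa mapped from x ∈ [−1, 1] to β ∈ [0, 1] and of the exclusion step for poorly converged points; the function names and the example error values are hypothetical, and the subsequent least-squares fit is not shown.

```python
import math


def lobatto_betas(n_points):
    """Chebyshev-Gauss-Lobatto abscissa x_k = cos(pi * k / (N - 1)),
    mapped to beta = (x + 1) / 2 and returned in increasing order."""
    xs = [math.cos(math.pi * k / (n_points - 1)) for k in range(n_points)]
    return sorted((x + 1.0) / 2.0 for x in xs)


def keep_converged(betas, errors, threshold=1e-4):
    """Drop abscissa whose iteration error exceeds the threshold."""
    return [b for b, e in zip(betas, errors) if e <= threshold]


betas = lobatto_betas(5)
# Hypothetical iteration errors: the point nearest beta = 1 that still
# requires iteration has not converged and is excluded from the fit.
example_errors = [0.0, 1e-8, 1e-6, 1e-3, 0.0]
fit_points = keep_converged(betas, example_errors)
```

The abscissa cluster near β = 0 and β = 1, which places more points where the divergence bends sharply at large memory. The endpoints are always included, matching the trivial and exact solutions mentioned above.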
C. Evolution of reduced complexity
We consider a population of passive agents, or an agent with stabilization parameter α = 0, precision timescale τc, and optimal
memory τm∗, the variables that determine agent fitness. Assuming
that the canonical equation for evolution applies (i.e. mutations
only change phenotype and fitness slightly, the population dynamics
move much faster than the evolutionary landscape such that we
can assume a single phenotype dominates), the rate at which the
population evolves across the phenotypic landscape is proportional
to the fitness gradient. In addition to this assumption, we will
assume that the population is always poised at optimal memory, an
assumption that will be made clear below.
We recall that the total divergence consists of the time-averaged
divergence D̄, statistical complexity cost H, stabilization cost G,
and precision cost C
D = D̄ + µH(τm ) + χG(τE , τ̃E ) + βC(τc )
[S2]
with nonnegative weights µ, χ, and β. In Figure S9, we show
the divergence of a stabilizer without such costs in blue, each of
these costs separately in black, and their sum in orange to generate
the total divergence in Eq 18. For the evolutionary dynamics, we
must calculate the gradient (∂τm D, ∂α D, ∂τc D) determining the
evolution in the properties of the agent. We calculate these term
by term and then put them together at the end.

\[ q(h, t|h_E = h_0) = \left(1 - \frac{1}{\tau_E}\right) \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \rho(\eta_{\tau_c}|h_E = h_0)\, q(h, t-1|h_E = h_0)\, \delta(h - h_0 - \eta_{\tau_c})\, d\eta_{\tau_c}\, dh + \frac{1}{\tau_E} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \rho(\eta_{\tau_c}|h_E = -h_0)\, q(h, t-1|h_E = -h_0)\, \delta(h + h_0 - \eta_{\tau_c})\, d\eta_{\tau_c}\, dh. \] [S1]

[Figure S7: conditional density q(h|hE = 0.2) versus agent bias h for ABS and eigenfunction solutions, with legend values 0/9, 1/9, 2/9, 3/9.]
Fig. S7. Comparison of agent-based simulation (ABS) and eigenfunction solution for the conditional probability distribution of agent bias q(h|h0) for (left) a passive agent
and (right) stabilizer. Agent-based simulation returns a normalized histogram that aligns closely with the eigenfunction solution. Environment timescale τE = 20 and bias
h0 = 0.2. Spacing of discrete domain in eigenfunction solution determined in proportion with typical width of the peak around h = h0, which scales as in Eq 20 and inversely
with the square root of agent memory τm.

We assume that agent memory τm is at the minimum of the combination of time-averaged divergence D̄ and statistical complexity
cost µH (stabilization is zero for passive agents). Since divergence
has a unique minimum and complexity monotonically approaches
H(τm = ∞) = 0, the addition of complexity only shifts optimal
memory to a larger value. Without the complexity cost, we have
that small deviations about optimal memory can be represented by
a quadratic function for some positive constant a,
\[ \bar{D} = D^* + a(\tau_m - \tau_m^*)^2, \] [S3]
where we write
\[ D^* = D_0 (\tau_m^*)^{1/2} \] [S4]
for some positive constant D0. Once we have accounted for a
perturbative addition from memory complexity, however, we have a
shifted optimal memory
\[ \tau_m^{**} = \tau_m^* + \frac{\mu}{2(\log 2)\, a\, \tau_m^* (\tau_m^* + 1)} + O(\mu^2) \] [S5]
obtained from ∂τm [D̄ + µH] = 0 and using the approximation that
µ is small. Eq S5 shows us that memory complexity, the term
proportional to µ, drives optimal memory τm∗∗ up.
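The first-order shift in Eq S5 can be checked numerically by minimizing the quadratic divergence of Eq S3 plus µH from Eq 15. The sketch below solves the stationarity condition by bisection and compares it with the perturbative prediction. The function names and parameter values are hypothetical, chosen only so that µ is small.

```python
import math


def shifted_optimum_numeric(tau_star, a, mu, iterations=200):
    """Root of d/dtau [a (tau - tau*)^2 + mu log2(1 + 1/tau)] by bisection.

    The derivative is 2a(tau - tau*) - mu / (ln 2 * tau * (tau + 1))."""
    def gradient(tau):
        return 2.0 * a * (tau - tau_star) - mu / (math.log(2) * tau * (tau + 1.0))

    lo, hi = tau_star, tau_star + 1.0
    while gradient(hi) < 0.0:  # widen the bracket until the sign changes
        hi += 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gradient(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def shifted_optimum_first_order(tau_star, a, mu):
    """First-order prediction tau* + mu / (2 ln 2 a tau* (tau* + 1)) (Eq S5)."""
    return tau_star + mu / (2.0 * math.log(2) * a * tau_star * (tau_star + 1.0))


numeric = shifted_optimum_numeric(tau_star=10.0, a=1.0, mu=0.1)
predicted = shifted_optimum_first_order(tau_star=10.0, a=1.0, mu=0.1)
```

For small µ the two values agree up to the O(µ²) correction, and both lie above τm∗, in line with the statement that memory complexity pushes optimal memory to larger values.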
Taking the approximation in Eq S5, the shifted optimal divergence, denoted by a prime, becomes
\[ \bar{D}'(\tau_m^{**}) = D^* + a \frac{\mu^2}{4(\log 2)^2 (\tau_m^*)^2 (\tau_m^* + 1)^2} + O(\mu^3). \] [S6]
Again, perturbations about the local optimum lead to
\[ \bar{D}'(\tau_m) \approx D^* + a \frac{\mu^2}{4(\log 2)^2 (\tau_m^*)^2 (\tau_m^* + 1)^2} + b(\tau_m - \tau_m^{**})^2 \] [S7]
for some positive constant b, which implicitly depends on the complexity cost. Eq S7 expresses local convexity about shifted optimal
memory τm∗∗ according to the corresponding shifted divergence D̄′.
This indicates how the population is poised along the ridge of
optimal memory given a perturbative cost of memory complexity.
Then, time-averaged divergence will grow because optimal memory changes. Assuming that the population is at optimal memory,
we obtain for the partial derivative with respect to α
\[ \partial_\alpha \bar{D}' = \left[ \frac{D_0}{2} (\tau_m^*)^{-3/2} + a \frac{\mu^2 (2\tau_m^* + 1)}{2(\log 2)^2 (\tau_m^*)^3 (\tau_m^* + 1)^3} \right] \frac{\partial \tau_m^*}{\partial \alpha}, \] [S8]
where we have used the fact that optimal memory must increase with
stronger stabilizer, or that ∂α τm∗ < 0, to explicitly pull out a negative
sign. Given that we are in the scaling regime, this confirms that in
Eq S8 divergence at optimal memory decreases as α approaches −1
from above as expected.
Niche-constructing stabilization changes the environmental
timescale through feedback. We start by considering, over a long
period of time, the average over many environmental switches
\[ \langle 1/\tilde\tau_E \rangle = 1/\tau_E + \alpha \left\langle \frac{v^2}{v^2 + (h - h_E)^2} \right\rangle = 1/\tau_E + \alpha f(\tau_m). \] [S9]
Since we do not know the exact form of the second term on the
right hand side, we represent it as some function f that represents
an average over time. For notational simplicity, we only make explicit f ’s dependence on τm , but it depends on agent properties and
environmental timescale. Now, a change in α also indirectly affects
τm∗ because the environmental timescale will change, reducing or
increasing the agent’s ability to track the new environment. For
example, with the passive agent, an increase in α introduces environmental stabilization, driving the effective environmental timescale
slower and moving the optimal memory timescale up. Accounting
for these derivatives means that
\[ d_\alpha \langle 1/\tilde\tau_E \rangle = f(\tau_m) + \alpha\, \partial_{\tau_m} f(\tau_m)\, \partial_\alpha \tau_m. \] [S10]
Now, we will again make use of the assumption that τm is close to τm∗
such that we can make the linear approximation f(τm) ≈ f(τm∗) +
(τm − τm∗) f′(τm∗). Putting this in, we find
\[ d_\alpha \langle 1/\tilde\tau_E \rangle (\tau_m) = f(\tau_m^*) + (\tau_m - \tau_m^*) f'(\tau_m^*) + \alpha\, \partial_{\tau_m} [f(\tau_m^*) + (\tau_m - \tau_m^*) f'(\tau_m^*)]\, \partial_\alpha \tau_m \] [S11]
For a passive agent, this simplifies because α = 0. Furthermore, we
know that f′(τm∗) = 0 because we have assumed that the agent is
at optimal memory so any deviation from optimal memory must
generally increase the typical distance between environmental and
agent bias (h − hE)². Then,
\[ d_\alpha \langle 1/\tilde\tau_E \rangle (\tau_m) = f(\tau_m^*). \] [S12]
Eq S12 is already clear from Eq S9 given the assumptions we have
made, but these steps take us through the general problem (the cases
when the agent is not situated exactly at optimal memory and when α ≠ 0 are more
complicated). In other words, decreasing α for the weak stabilizer
will reduce the probability that the environment switches by the
term in Eq S12 because f > 0 and f′(τm∗) = 0; more generally, the change in probability
depends not just on the rate effect f but also on its derivative.

[Figure S8: (top) divergence D and (bottom) stabilization cost G versus agent memory τm, comparing ABS with eigenfunction solutions for N = 10, 20, and 30.]
Fig. S8. Example of convergence of least-squares fit of Chebyshev polynomial
with increasing number of abscissa N with the eigenfunction solution. (top) For
comparison, divergence D as calculated from the agent-based simulation (ABS). The
eigenfunction solution is close even with a relatively small number of points fit to a
9th-degree Chebyshev polynomial. Both methods are especially effective when the
environmental timescale is small as it is here, where τE = 10. The bias h0 = 0.2.
(bottom) Stabilization cost is similarly interpolated, but it is slower to converge with
visible oscillations disappearing by N = 30. For N = 20 and N = 30, not all
the points fell within the convergence criterion and only 19 and 28 points were fit,
respectively. For both plots, the Chebyshev polynomial approximation is slowest to
converge near the sharp bends at large τm. ABS is run for 10^7 time steps.

[Figure S9: divergence D, costs H, G, and C, and total divergence (in bits) versus agent memory τm.]
Fig. S9. Example of cost functions for stabilizers with varying memory but fixed
sensory precision. (blue) Without costs, divergence profile shows only a single global
minimum. (orange) With costs, we obtain degenerate minima at memory values
around τm = 0 and τm = 20. Eigenfunction solution parameters specified in
Materials and Methods code.

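Under such a change, the new environmental timescale will deviate from $\tau_E$ and so the stabilization cost can be expanded as
$$\begin{aligned}
\frac{1}{\tau_E} \log\left( \frac{1/\tau_E}{\langle 1/\tilde{\tau}_E \rangle} \right) + \left( 1 - \frac{1}{\tau_E} \right) \log\left( \frac{1 - 1/\tau_E}{1 - \langle 1/\tilde{\tau}_E \rangle} \right)
&\approx \frac{1}{2\tau_E} \left[ \langle 1/\tilde{\tau}_E \rangle - 1/\tau_E \right]^2 + \frac{1}{2} \left( 1 - \frac{1}{\tau_E} \right) \left[ \langle 1/\tilde{\tau}_E \rangle - 1/\tau_E \right]^2 \\
&= \frac{1}{2} \left[ \langle 1/\tilde{\tau}_E \rangle - 1/\tau_E \right]^2,
\end{aligned}$$
[S13]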
a cost that increases quadratically with the change in the averaged switch probability $\langle 1/\tilde{\tau}_E \rangle$ away from $1/\tau_E$. For a passive agent, this deviation is 0 unless we allow $\alpha$ to vary. To first order in $\alpha$, Eq S12 gives a deviation of $\alpha f(\tau_m^*)$, which leads to the relation
$$G(\tau_E, \tilde{\tau}_E) = \frac{\alpha^2}{2} f(\tau_m^*)^2.$$
[S14]
Eq S14 tells us that if we vary $\alpha$, we must pay a stabilization cost that, at least locally, grows quadratically with the strength of stabilization and therefore has zero gradient at $\alpha = 0$.
The simplest contribution is with respect to the change in the precision timescale $\tau_c$. Divergence, as derived in Materials & Methods, is proportional to $1/\tau_c$. On the other hand, precision cost is $C = \log \tau_c$. Since the optimal memory timescale does not depend on $\tau_c$, the change of the total divergence is
$$\partial_{\tau_c} \left[ D_1/\tau_c + \beta \log \tau_c \right] = -D_1/\tau_c^2 + \beta/\tau_c,$$
[S15]
where we take $D^* = D_1/\tau_c$ to encapsulate the terms in the divergence apart from the scaling with precision timescale. If the total divergence has a minimum at positive $\tau_c$, the value of $\tau_c$ at which the minimum is reached is $\tau_c^* = D_1/\beta$.
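To make the optimum in Eq S15 concrete, the following sketch compares $\tau_c^* = D_1/\beta$ with a brute-force minimization of $D_1/\tau_c + \beta \log \tau_c$ over a grid. It assumes the natural logarithm in the precision cost; with a base-2 logarithm the optimum would shift to $D_1 \ln 2/\beta$. The values of `D1` and `BETA` are arbitrary illustrations, not parameters from the paper.

```python
import math

D1 = 4.0     # divergence prefactor in D1 / tau_c (hypothetical value)
BETA = 0.5   # precision cost coefficient (hypothetical value)


def total_divergence(tau_c):
    """Precision-dependent part of total divergence: D1/tau_c + beta*log(tau_c)."""
    return D1 / tau_c + BETA * math.log(tau_c)


def gradient(tau_c):
    """Eq S15: derivative of the expression above with respect to tau_c."""
    return -D1 / tau_c**2 + BETA / tau_c


# Brute-force search over a grid of tau_c values in (0, 50].
grid = [0.01 * k for k in range(1, 5001)]
tau_c_numeric = min(grid, key=total_divergence)
tau_c_analytic = D1 / BETA

print(tau_c_numeric, tau_c_analytic, gradient(tau_c_analytic))
```

The grid minimum should coincide with $D_1/\beta$ up to the grid spacing, and the gradient in Eq S15 vanishes there.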
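Putting all of these together, we have the terms in the gradient
$$\begin{aligned}
\partial_{\tau_m} D &= 2a(\tau_m - \tau_m^*) \\
\partial_\alpha D &= \left[ \frac{D_0}{2} (\tau_m^*)^{-3/2} + a \frac{\mu^2 (2\tau_m^* + 1)}{2 \log(2)\, (\tau_m^*)^3 (\tau_m^* + 1)^3} \right] \frac{\partial \tau_m^*}{\partial \alpha} \\
\partial_{\tau_c} D &= \beta/\tau_c - D_1/\tau_c^2
\end{aligned}$$
[S16]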
When the cost gradient ∂α D < 0, a population of passive agents
is driven towards niche construction and when ∂τc D < 0 towards
precision reduction. Thus, the conditions that lead to reduction in
agent complexity by increasing memory, enhancing stabilization,
and lowering precision are captured by these gradients.
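A minimal sketch of how the gradients in Eq S16 could move a population, assuming plain gradient descent with a fixed step size `rate`. The descent rule is a stand-in chosen for illustration, not the evolutionary dynamics used in the paper. Only the $\tau_m$ and $\tau_c$ components are followed, because the $\alpha$ component requires $\partial \tau_m^*/\partial \alpha$, which is not available in closed form here. All numerical values are hypothetical.

```python
A = 0.3            # curvature of divergence around optimal memory (hypothetical)
TAU_M_STAR = 12.0  # optimal memory duration (hypothetical)
D1 = 4.0           # divergence prefactor in D1 / tau_c (hypothetical)
BETA = 0.5         # precision cost coefficient (hypothetical)


def grad_tau_m(tau_m):
    """Eq S16: gradient of total divergence with respect to memory duration."""
    return 2.0 * A * (tau_m - TAU_M_STAR)


def grad_tau_c(tau_c):
    """Eq S16: gradient of total divergence with respect to precision timescale."""
    return BETA / tau_c - D1 / tau_c**2


def descend(tau_m, tau_c, rate=0.5, steps=20000):
    """Follow the negative gradient; a negative gradient pushes the trait up."""
    for _ in range(steps):
        tau_m -= rate * grad_tau_m(tau_m)
        tau_c -= rate * grad_tau_c(tau_c)
        tau_c = max(tau_c, 1e-6)  # keep the timescale positive
    return tau_m, tau_c


final_tau_m, final_tau_c = descend(tau_m=2.0, tau_c=1.0)
print(final_tau_m, final_tau_c)
```

Starting from a short memory and a small $\tau_c$, both gradients are negative, so both traits increase until they settle at $\tau_m^*$ and at $D_1/\beta$, where the gradients vanish.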
A similar derivation can be made for the evolution of a starting population of destabilizers, or agents with α > 0, instead of a pure population of passive agents. However, this requires us to deal with all the terms in Eq S11 and to account for a term from the gradient of stabilization cost in Eq S16 instead of assuming α = 0. The change in the environmental timescale is more complicated to calculate because we must then consider the way that destabilization determines the modified environmental timescale in Eq S9. Nevertheless, it is clear that the qualitative results will be the same because of the adaptive gain from slower environmental timescales, i.e. from decreasing α. The exact rate at which α changes will depend on the curvature of the stabilization cost.
D. Metabolic costs of neural tissue for memory
In the total divergence in Eq 18 and as discussed in Appendix C, we consider information costs separately from energetic, metabolic costs of neural tissue. An important consideration for comparing the costs directly with one another is that the right units for comparison are not clear, an issue that we avoid by only considering the scaling exponents presented in Result 3. Furthermore, while the scaling argument makes clear that the metabolic costs will dominate at sufficiently long lifetimes, the differences in how information and energetic costs affect reproductive fitness make a direct comparison in a combined “total divergence” equation problematic.

Nonetheless, if we do entertain the inclusion of metabolic costs into the total divergence, we will find that rising metabolic costs with environmental timescale will lead to an upper cutoff, i.e. truncating memory at some point beyond which the benefits of increasing stabilization are counteracted by the monotonically increasing costs of supporting neural tissue for memory.

To show this more formally, we redo the calculations in the previous section with an additional metabolic cost of memory,
$$D = \bar{D} + \mu H(\tau_m) + \chi G(\tau_E, \tilde{\tau}_E) + \beta C(\tau_c) + \gamma F(\tau_m),$$
[S17]
with the new term $F(\tau_m) = \tau_m^{2\phi}$ defining the metabolic cost from Result 3. Then, we have for the shifted optimal memory
$$\tau_m^{**} = \tau_m^* + \frac{\mu}{2 (\log 2)\, a\, \tau_m^* (\tau_m^* + 1)} - \frac{\gamma \phi (\tau_m^*)^{2\phi}}{a (\tau_m^* + 1)} + O(\mu^2) + O(\gamma^2) + O(\mu\gamma),$$
[S18]
obtained from $\partial_{\tau_m} [\bar{D} + \mu H + \gamma F] = 0$ and using the approximation that both $\mu$ and $\gamma$ are small. The perturbative assumption is not necessary to take, but then there is no closed analytical solution for the shifted optimal memory $\tau_m^{**}$ that we can write down. Eq S18 shows us that memory complexity, the term proportional to $\mu$, tends to drive optimal memory $\tau_m^{**}$ up but the metabolic cost, the term proportional to $\gamma$, tends to drive it down, the balance of which determines the exact change in optimal memory.

Taking the approximation in Eq S18, the shifted optimal divergence, denoted by a prime, becomes
$$\bar{D}'(\tau_m^{**}) = D^* + a \frac{\mu^2}{4 (\log 2)^2 (\tau_m^*)^2 (\tau_m^* + 1)^2} + \frac{\gamma^2 \phi^2 (\tau_m^*)^{4\phi}}{a^2 (\tau_m^* + 1)^2} - \frac{a \phi \mu \gamma (\tau_m^*)^{2\phi}}{2 (\log 2)\, a\, \tau_m^* (\tau_m^* + 1)^2} + O(\mu^3) + O(\gamma^3) + O(\mu\gamma^2) + O(\mu^2\gamma).$$
[S19]
Again, perturbations about the local optimum lead to
$$\bar{D}'(\tau_m) \approx D^* + a \frac{\mu^2}{4 (\log 2)^2 (\tau_m^*)^2 (\tau_m^* + 1)^2} + \frac{\gamma^2 \phi^2 (\tau_m^*)^{4\phi}}{a^2 (\tau_m^* + 1)^2} - \frac{a \phi \mu \gamma (\tau_m^*)^{2\phi}}{2 (\log 2)\, a\, \tau_m^* (\tau_m^* + 1)^2} + b (\tau_m - \tau_m^{**})^2$$
[S20]
for some positive constant $b$, which implicitly depends on the complexity cost. Assuming that the population is at optimal memory, we obtain for the partial derivative with respect to $\alpha$
$$\partial_\alpha \bar{D}' = -\left[ \frac{D_0}{2} (\tau_m^*)^{-3/2} + a \frac{\mu^2 (2\tau_m^* + 1)}{2 (\log 2) (\tau_m^*)^3 (\tau_m^* + 1)^3} + \frac{2 (\log 2) \gamma^2 \phi^2 a^{-2} (\tau_m^*)^{4\phi} + \mu \gamma \phi (\tau_m^*)^{2\phi - 1}}{(\log 2) (\tau_m^* + 1)^3} - \frac{4 (\log 2) \gamma^2 \phi^3 (\tau_m^*)^{4\phi - 1} + 2^{-1} \mu \gamma \phi (2\phi - 1) (\tau_m^*)^{2\phi - 2}}{(\log 2) (\tau_m^* + 1)^2} \right] \frac{\partial \tau_m^*}{\partial \alpha}.$$
[S21]
Unlike the previous outcome in Eq S8, it is not necessarily the case that a stronger stabilizer will decrease divergence because sufficiently large metabolic costs will counteract the adaptive benefits of a slower environment.
Table S2. Variables used in main text organized by section in which they are first introduced or used.

Model structure & assumptions
Parameter | Description
$A_t$ | discrete agent state at time $t$, e.g. $\{-1, 1\}$
$E_t$ | discrete environmental state at time $t$, e.g. $\{-1, 1\}$
$h_0$ | parameter for strength of environmental bias
$h$ | agent bias
$\hat{h}$ | agent’s estimate of environmental bias
$h_E$ | environmental bias
$p$ | agent’s probability distribution over possible states of $A_t$ after time integration
$\hat{p}$ | agent’s estimate of environment probability distribution at time $t$ based on present samples
$p_E$ | environmental probability distribution over possible states of $E_t$
$q$ | probability of change in environmental bias at a single time step
$s$ | state of environment taking values of $-1$ or $1$
$t$ | time
$v$ | construction rate curvature
$\alpha$ | construction rate weight, $\alpha < 0$ for stabilizers and $\alpha > 0$ for destabilizers
$\beta$ | learning weight in Eq 4; coefficient of precision cost in Eq 18
$\epsilon_{\tau_c}$ | perceptual error
$\eta_{\tau_c}$ | estimated bias error
$\tau_c$ | sampling duration, inverse precision
$\tau_E$ | environment duration
$\tau_f$ | niche construction duration
$\tau_m$ | agent memory duration

Result 1
Parameter | Description
$\bar{D}$ | time-averaged Kullback-Leibler (KL) divergence
$\bar{D}^*$ | time-averaged KL divergence at optimal memory duration
$D_{\rm KL}$ | KL divergence
$\tau_m^*$ | optimal memory duration

Result 3
Parameter | Description
$B$ | metabolic rate
$M_{\rm br}$ | brain mass
$N$ | number of episodes of environmental change
$T$ | lifespan of organism
$y$ | exponent relating environment duration and organism lifetime
$\phi$ | exponent relating metabolic rate and memory duration, $\phi = a/4b$ for energetic exponents $a$ and $b$

Result 4
Parameter | Description
$C$ | sensory precision cost
$D$ | total divergence
$G$ | stabilization cost
$H$ | complexity of memory cost
$\beta$ | coefficient for sensory cost
$\mu$ | coefficient for memory complexity cost
$\tilde{\tau}_E$ | modified environment duration
$\chi$ | coefficient for stabilization cost