RESEARCH ARTICLE
10.1029/2019MS002034
Special Section:
The NASA GISS contribution
to CMIP6
Key Points:
• Tropospheric warming and
ocean heat uptake by 2014 are
smaller in GISS-E2.1 and closer to
observed trends than in its CMIP5
predecessor
• GISS-E2.1 climate sensitivity is
higher than in CMIP5 GISS-E2,
but forcing by greenhouse gases is
smaller
• Atmospheric trends vary among
model configurations with the
storage of heat beneath the
thermocline
Supporting Information:
• Supporting Information S1
Correspondence to:
R. L. Miller,
[email protected]
Citation:
Miller, R. L., Schmidt, G. A.,
Nazarenko, L. S., Bauer, S. E.,
Kelley, M., Ruedy, R., et al. (2021).
CMIP6 historical simulations
(1850–2014) with GISS-E2.1. Journal
of Advances in Modeling Earth
Systems, 13, e2019MS002034. https://
doi.org/10.1029/2019MS002034
Received 30 DEC 2019
Accepted 13 OCT 2020
Accepted article online 21 NOV 2020
CMIP6 Historical Simulations (1850–2014) With GISS-E2.1
Ron L. Miller1,2 , Gavin A. Schmidt1 , Larissa S. Nazarenko1,3 , Susanne E. Bauer1 ,
Maxwell Kelley1,4 , Reto Ruedy1,4 , Gary L. Russell1 , Andrew S. Ackerman1 ,
Igor Aleinov1,3 , Michael Bauer1,3 , Rainer Bleck5,6 , Vittorio Canuto1 , Grégory Cesana1,3 ,
Ye Cheng1,3 , Thomas L. Clune7 , Ben I. Cook1 , Carlos A. Cruz7,8 , Anthony D. Del Genio1 ,
Gregory S. Elsaesser1,2 , Greg Faluvegi1,3 , Nancy Y. Kiang1 , Daehyun Kim9 ,
Andrew A. Lacis1 , Anthony Leboissetier1,4 , Allegra N. LeGrande1 , Ken K. Lo1,4 ,
John Marshall10 , Elaine E. Matthews1 , Sonali McDermid11 , Keren Mezuman1,3 ,
Lee T. Murray12 , Valdar Oinas1,4 , Clara Orbe1 , Carlos Pérez García-Pando13,14 ,
Jan P. Perlwitz1,15 , Michael J. Puma1,3 , David Rind1 , Anastasia Romanou1 ,
Drew T. Shindell16 , Shan Sun6 , Nick Tausnev1,4 , Kostas Tsigaridis1,3 , George Tselioudis1 ,
Ensheng Weng1,3 , Jingbo Wu1,2 , and Mao-Sung Yao1,4
1 NASA
Goddard Institute for Space Studies, New York, NY, USA, 2 Department of Applied Physics and Applied
Mathematics, Columbia University, New York, NY, USA, 3 Center for Climate Systems Research, Earth Institute,
Columbia University, New York, NY, USA, 4 SciSpace LLC, New York, NY, USA, 5 CIRES, University of Colorado
Boulder, Boulder, CO, USA, 6 NOAA/ESRL/Global Systems Laboratory, Boulder, CO, USA, 7 Goddard Space Flight
Center, Greenbelt, MD, USA, 8 SSAI, Greenbelt, MD, USA, 9 Department of Atmospheric Sciences, University of
Washington, Seattle, WA, USA, 10 Department of Earth, Atmospheric and Planetary Sciences, Massachusetts Institute of
Technology, Cambridge MA, USA, 11 Department of Environmental Studies, New York University, New York, NY, USA,
12 Department of Earth and Environmental Sciences, University of Rochester, Rochester, NY, USA, 13 Barcelona
Supercomputing Center, Barcelona, Spain, 14 ICREA, Catalan Institution for Research and Advanced Studies,
Barcelona, Spain, 15 Climate, Aerosol, and Pollution Research, LLC, Bronx, NY, USA, 16 Nicholas School of the
Environment, Duke University, Durham, NC, USA
Abstract
Simulations of the CMIP6 historical period 1850–2014, characterized by the emergence of
anthropogenic climate drivers like greenhouse gases, are presented for different configurations of the
NASA Goddard Institute for Space Studies (GISS) Earth System ModelE2.1. The GISS-E2.1 ensembles are
more sensitive to greenhouse gas forcing than their CMIP5 predecessors (GISS-E2) but warm less during
recent decades due to a forcing reduction that is attributed to greater longwave opacity in the GISS-E2.1
pre-industrial simulations. This results in an atmosphere less sensitive to increases in opacity from rising
greenhouse gas concentrations, demonstrating the importance of the base climatology to forcing and
forced climate trends. Most model versions match observed temperature trends since 1979 from the ocean
to the stratosphere. The choice of ocean model is important to the transient climate response, as found
previously in CMIP5 GISS-E2: the model that more efficiently exports heat to the deep ocean shows a
smaller rise in tropospheric temperature. Model sea level rise over the historical period is traced to
excessive drawdown of aquifers to meet irrigation demand with a smaller contribution from thermal
expansion. This shows how fully coupled models can provide indirect observational constraints upon
forcing, in this case, constraining irrigation rates with observed sea level changes. The overall agreement of
GISS-E2.1 with observed trends is familiar from evaluation of its predecessors, as is the conclusion that
these trends are almost entirely anthropogenic in origin.
Plain Language Summary
©2020. The Authors.
This is an open access article under
the terms of the Creative Commons
Attribution License, which permits
use, distribution and reproduction in
any medium, provided the original
work is properly cited.
MILLER ET AL.
Measurements show clear evidence of warming over the twentieth
century and up to the present day. Our anticipation of future change comes from computer models of
climate. These are based upon well-established physical principles like Newton's laws of motion and
radiative transfer theory; the models are closely related to those used for weather forecasting. We can never
predict the weather on a particular day, 50 years in the future, but we can calculate whether that future
decade will be warmer than our present climate. Part of our confidence in such a forecast comes from
testing a climate model's ability to reproduce warming and other changes measured over the past century.
We use observations of atmospheric composition and the sunlight received by our planet to calculate how
the model responds to their changes. The climate model of the NASA Goddard Institute for Space Studies,
1 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
GISS-E2.1, closely follows changes measured in the ocean and atmosphere as the concentrations of
greenhouse gases and other pollutants rise. This agreement suggests that future warming by greenhouse
gases will be reliably predicted by GISS-E2.1. This suggests that the warming we already experience is due
to our consumption of fossil fuels that has led to the increase of carbon dioxide and other greenhouse gases
in the atmosphere over the past two centuries.
1. Introduction
In an era of climate change and the frequent occurrence of record-setting measurements, there is intense
interest in anticipating future changes. The Coupled Model Intercomparison Project (CMIP) was conceived
to facilitate coordinated experiments of past and future climate among modeling groups (Meehl et al., 1997).
Contrasting behavior between climate simulations would be attributed to model differences rather than
differences in experimental design. The coordinated experiments allow models from different research centers to be meaningfully compared to each other and observations, while providing benchmarks for the
comparison of successive generations of models.
A key element of CMIP Phase 6 (CMIP6 Eyring et al., 2016) is the simulation of climate change observed
during the so-called historical period from 1850 to 2014. The comparison of observed changes to simulated trends driven by changes in atmospheric composition and other climate drivers is an incomplete but
nonetheless useful test of model climate sensitivity that is important for assessing the likelihood of future
changes projected by the models. Historical simulations have been performed with multiple generations
of the NASA Goddard Institute for Space Studies (GISS) Earth system model (Hansen, Sato, Ruedy, Lacis,
et al., 1997; Hansen et al., 2002, 2007, 1988; Miller et al., 2014). Over previous CMIP generations, GISS has
assessed robustness of the modeled response by simulating the historical period with different combinations
of model components. Since CMIP3, we have carried out experiments with two ocean versions coupled to
the same atmospheric model, and for CMIP5, GISS analyzed the impact of different treatments of atmospheric composition and the aerosol indirect effect. For CMIP6, we continue to use multiple versions of the
coupled model. The use of a common atmosphere allows the effect of ocean treatments to be highlighted,
while different treatments of atmospheric composition test robustness of certain interactions and forcings.
Characterization of forcing is the basis for understanding the contribution of individual drivers to future
changes as well as attributing past and currently observed changes to the forcings (e.g., Hansen et al., 2005;
Marvel et al., 2015). For each CMIP generation, the forcings are better characterized, the number of forcings recognized as important to observed trends is increased, and the representation of physical processes
impacted by the forcings is expanded. Interactions between expanded physical representations within the
model can reveal limitations and the need for refinement that were obscure prior to their coupling.
The GISS contribution to CMIP6 includes varying ocean components, composition interactivity, vertical
and horizontal resolution, carbon cycle, and forcings. We report here on a subset of this contribution
based upon GISS-E2.1, an updated and more skillful version of the GISS-E2 model used in CMIP5 (Miller
et al., 2014; Nazarenko et al., 2015; Schmidt et al., 2014). The seasonal climatology of GISS-E2.1 during the
satellite period (1979–2014) is evaluated elsewhere by Kelley et al. (2020). The present article is a complementary analysis of climate trends during the CMIP6 historical period (1850–2014) simulated by GISS-E2.1
for comparison to observed changes.
In section 2, we summarize the model versions and describe the format of the historical experiments. In
section 3, we document forcings that perturb the model's pre-industrial climate, noting differences with
respect to the GISS CMIP5 simulations that impact the climate response. In section 4, we evaluate model
climate trends as coherently expressed across disparate climate variables and the trend dependence upon
the forcing along with the atmosphere and ocean components. In section 5, we present a summary of results
and implications.
2. Models and Experiments
A full description of GISS-E2.1, including development since CMIP5, is given by Kelley et al. (2020). Here we
summarize the main features of the atmosphere and ocean components, while describing how the historical
ensembles arise from the pre-industrial experiments.
MILLER ET AL.
2 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Figure 1. Development of ModelE coupled to the GISS Ocean Model v.1 (née the Russell Ocean Model) from CMIP3 to
the present. There is parallel development of ModelE coupled to HYCOM (Sun & Bleck, 2006), where “R” and “G” are
replaced by “H.” The suffix “CC” refers to an interactive carbon cycle (Ito et al., 2020; Latto & Romanou, 2018; Lerner
et al., 2020; Romanou et al., 2013, 2014). The domain of GISS-E2.2 includes the mesosphere and has a better
stratospheric circulation (Orbe et al., 2020; Rind et al., 2020), while GISS-E3 incorporates a new cubed-sphere
grid and improved model physics.
2.1. Atmospheric Model
The horizontal and vertical resolution of the atmospheric component of GISS-E2.1 is unchanged since
CMIP5: 2◦ latitude by 2.5◦ longitude with 40 vertical layers extending from the surface to 0.1 hPa in the lower
mesosphere. Nonetheless, GISS-E2.1 shows substantial improvement in its seasonal climatology, along with
variability like the Madden-Julian Oscillation (Del Genio et al., 2015; Kim et al., 2012) and teleconnection
patterns associated with El Niño and the Southern Oscillation (ENSO) and the Pacific Decadal Oscillation
(Kelley et al., 2020).
Three versions of the GISS-E2.1 atmospheric component have been submitted to CMIP6. Atmospheric
composition is prescribed in the first version, denoted here as NINT (for “non-interactive”) and in the
CMIP6 archive as physics-version=1 (“p1”). In two other versions, aerosols and ozone are calculated prognostically using either the One-Moment Aerosol (OMA) model or else the MATRIX model
(“Multiconfiguration Aerosol TRacker of mIXing state” Bauer et al., 2008, 2020). The OMA version is designated as physics-version=3 (“p3”) in the CMIP6 archive to signify continuity with its CMIP5 predecessor, the TCADI model from which OMA was derived. (The relation of different CMIP generations of ModelE
is summarized in Figure 1) The MATRIX version of GISS-E2.1 is designated by physics-version=5
(“p5”) in the CMIP6 archive.
Aerosols and ozone in the NINT model are prescribed from “AMIP-style” OMA simulations of the historical period with forcings described in section 3 (Bauer et al., 2020). Sea surface temperature (SST) and sea
ice fraction are available from Rayner (2003) after 1870. Prior to that, these variables are specified using an
average from the decade 1876–1885, a period chosen for its spatial coverage of measurements prior to significant warming. The use of the OMA model to prescribe NINT composition gives greater consistency between
the two model versions. For example, ozone variations are in phase with the solar forcing in all the GISS
CMIP6 ensembles. This consistency allows differences in behavior between NINT and OMA to be more
directly attributed to feedbacks between composition and climate variations. In addition, the OMA version
of GISS-E2.1 provides atmospheric composition for NINT based upon a state-of-the-art model, in contrast to
the CMIP5 NINT version, where some constituents like ozone were calculated with the previous-generation
(CMIP3) version of ModelE. OMA aerosol species provided to NINT include sulfate, nitrate, ammonium,
organic carbon and three types of black carbon (BC). These species are primarily anthropogenic in origin,
MILLER ET AL.
3 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
although there are non-anthropogenic sources from biomass burning and lightning. OMA also provides
NINT with sea salt and dust aerosols that vary temporally through climate variables like wind speed. Anthropogenic dust sources are not represented in OMA, even though high dust concentration is observed in
regions of cultivation (e.g., Ginoux et al., 2012; Mahowald et al., 2004; Tegen et al., 1996). Nonetheless, some
anthropogenic trends in dust concentration can be driven indirectly by irrigation that increases soil moisture
(Bauer et al., 2020), as well as by the formation of sulfate and nitrate coatings on the dust particle surface
that increases the rate of wet removal (Bauer & Koch, 2005; Bauer et al., 2004).
Atmospheric composition from AMIP-style OMA and MATRIX experiments is described by Bauer et al.
(2020). Aerosol concentrations are determined by simulating an extensive series of chemical reactions in
the atmosphere, which includes prognostic budgets for other radiatively active constituents like methane,
nitrous oxide and chlorofluorocarbons (CFCs). The surface concentration of each of these gases is prescribed during the historical period using measurements (Meinshausen et al., 2017), with their vertical and
spatial distributions determined by a suite of physical and chemical processes. Photochemical oxidation of
methane is a source of stratospheric water vapor that alters the column longwave opacity. Ozone in the
troposphere and stratosphere is calculated prognostically based upon the prescribed concentration of constituents like halogens or emission of tropospheric pollutants (Hoesly et al., 2018; Meinshausen et al., 2017),
photochemistry and heterogeneous reactions on aerosols and polar stratospheric cloud droplets. Photolysis reactions associated with ozone are modulated by variations in the solar photon flux as described in
section 3.4 (Matthes et al., 2017), although at present this flux does not vary with the changing concentration
of radiatively active constituents like volcanic aerosols or ozone itself.
Aerosol extinction is linearly related to aerosol mass. Solar extinction by BC is increased by a factor of 1.5
relative to its mass to represent enhanced absorption by sulfate and nitrate coatings that increase internal
reflection (Bauer et al., 2007), while dust LW extinction is increased by a factor of 1.3 to represent scattering
(Dufresne et al., 2002; Schmidt et al., 2006).
Along with their direct radiative effect, aerosols perturb climate through microphysical adjustments of cloud
properties and the associated cloud radiative fluxes: the aerosol indirect effect (AIE; e.g., Boucher et al.,
2013). In the NINT model, cloud fraction is increased according to the number concentration of sulfate,
nitrate and carbonaceous aerosols above their pre-industrial levels: a heuristic representation of changes to
cloud radiative forcing by aerosol nucleation of cloud droplets (Hansen et al., 2005). The relation between
cloud fraction and aerosol number is logarithmic to represent the declining effect of adding anthropogenic
aerosols to highly polluted air where nucleation sites are already abundant. This heuristic representation
of the AIE within the NINT model is tuned to give roughly −1 W m−2 of forcing over the historical period,
consistent with more physically sophisticated models constrained by observations (Boucher et al., 2013).
The NINT AIE directly relates anthropogenic aerosol mass to a change in cloud cover. Radiative properties of
observed clouds depend upon cloud droplet number (CDN), which depends upon the concentration of both
anthropogenic and natural aerosols, but additionally upon droplet size and their distribution with height
within the cloud. The OMA and MATRIX models incorporate this more comprehensive view by relating
aerosol mass only to the CDN at cloud base, and simulating cloud microphysical processes that convert
vapor to liquid water to determine droplet size and the cloud optical depth. CDN is derived from aerosol
number and updraft speed using empirical relations for convective clouds (Menon & Rotstayn, 2006, their
Figure 1) and for stratiform clouds (Lohmann et al., 2007, their Equation 1). These relations are derived
from case studies of observed cloud properties simulated by more physically complex microphysical models
(e.g., Segal et al., 2004). In MATRIX, aerosols serve as droplet nucleation sites depending upon their mass
and number along with the cloud updraft velocity following Abdul-Razzak et al. (1998) and Abdul-Razzak
and Ghan (2000).
The OMA and MATRIX versions of GISS-E2.1 represent only the first indirect effect relating aerosol number
to cloud optical thickness (Twomey, 1977). Development of the second indirect effect relating the influence
of aerosols upon the growth of cloud droplets to precipitable size (Albrecht, 1989) is underway in GISS-E3.
In contrast to the NINT model, the OMA and MATRIX calculations of cloud radiative effect are not adjusted
to give a specific indirect forcing.
For the NINT pre-industrial control simulations (denoted as “piControl” in the CMIP6 archive), aerosol
and ozone concentrations vary with calendar month based upon a decadal average of the OMA AMIP
MILLER ET AL.
4 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Table 1
Model Configurations Used for piControl Experiments With Their CMIP6 Identifier and Digital Object Identifier (doi)
Along With the First and Last Years and Duration of the piControl Segment Used to Define Anomalies During the
Historical Period
Model
CMIP archive tag
doi
First year
Last year
Duration
GISS-E2.1-G NINT f1
piControl_r1i1p1f1
10.22033/ESGF/CMIP6.7380
4150
5000
851
GISS-E2.1-G NINT f2
piControl_r1i1p1f2
10.22033/ESGF/CMIP6.7380
7550
8000
451
GISS-E2.1-G OMA
piControl_r1i1p3f1
10.22033/ESGF/CMIP6.7380
2000
2450
451
GISS-E2.1-H NINT f1
piControl_r1i1p1f1
10.22033/ESGF/CMIP6.7381
3180
3980
801
GISS-E2.1-H NINT f2
piControl_r1i1p1f2
10.22033/ESGF/CMIP6.7381
4000
4450
451
GISS-E2.1-H OMA
piControl_r1i1p3f1
10.22033/ESGF/CMIP6.7381
2000
2450
451
Note. Aerosols and ozone in the p1 (NINT) “f1” and “f2” simulations are taken from a four-member ensemble average
of a prior OMA AMIP model version and a five-member ensemble average of the corrected OMA AMIP model, respectively. The doi of the latter ensemble is 10.22033/ESGF/CMIP6.6984, and its output in the CMIP6 archive is identified
by amip_r[1-5]i1p3f1. The OMA experiments identified by “f1” in the CMIP6 archive use the corrected model.
The first year of each piControl experiment is unrelated to the length of its coupled spin-up.
simulation between 1850 and 1859. For NINT experiments during the CMIP6 historical period, the prescribed concentration is updated using OMA AMIP values that vary by month and year.
2.2. Ocean Models
Each of the three physics versions of the GISS-E2.1 atmospheric component is coupled to one of two ocean
general circulation models (OGCMs).
The coupled model denoted “GISS-E2-1-G” in the CMIP6 archive (and “E2.1-G” here) uses the GISS
Ocean version 1 (GO1), a descendent of the Russell OGCM used in CMIP5 (whose coupled version was
denoted “GISS-E2-R” or else “E2-R”). Both GO1 and the Russell OGCM have horizontal resolution of
1◦ latitude by 1.25◦ longitude, so that four ocean grid boxes nest within a single AGCM grid box. GO1 has
40 vertical layers, 8 more than its CMIP5 predecessor. The additional layers improve vertical resolution in
the upper ocean, including the thermocline. GO1 has 24 layers above 1,400 m depth compared to 15 in the
CMIP5 version. Above 200 m depth, GO1 has 11 layers compared to just 6 in its predecessor. This approximately doubles vertical resolution in shallow, marginal seas where GO1 shows reduced salinity biases.
The strongest mixing is now directed along isopycnals (Redi, 1982), eliminating unintended cross-isopycnal
diffusion present in the GISS CMIP5 simulations.
The HYCOM model calculates the ocean circulation on 32 isopycnal layers, an increase of 6 over its CMIP5
precursor. These are replaced by height coordinates in regions of weak stratification. Meridional grid spacing is 1◦ at the equator and decreases poleward with the cosine of latitude. Zonal resolution is a uniform 1◦
outside of the Northern Hemisphere (NH) high latitudes, where a separate bi-polar grid eliminates the singularity of spherical coordinates. Because of the varying horizontal resolution associated with the NH polar
grid and varying meridional spacing elsewhere, output from the coupled HYCOM model (“GISS-E2-1-H”
or “E2.1-H”) is archived on a uniform 1◦ by 1◦ grid at 32 depths extending from the surface to 5500 m.
The combination of the three physics versions of the GISS-E2.1 AGCM and the two ocean models results in
six distinct coupled models that have been submitted to CMIP6. At the time of writing, less model output
was available for the MATRIX coupled models, so we present results only for the NINT and OMA ensembles.
2.3. Spin-Up of Pre-Industrial Control Runs and Branching
Each historical simulation departs from a control simulation that is near equilibrium with respect to
pre-industrial atmospheric composition.
The pre-industrial control runs resulting from the different combinations of atmospheric model physics
versions and ocean models are listed in Table 1.
During integration of the NINT historical experiments, we identified a coding error in the OMA model that
caused an underestimate of stratospheric ozone destruction by heterogeneous reactions on the surface of
volcanic aerosols. (The surface area of volcanic aerosols that host these reactions was too small.) This error
had a small effect upon the pre-industrial climate, but restricted ozone destruction by halogens following
MILLER ET AL.
5 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Figure 2. Global and annual surface air temperature (black) during the piControl simulation. In each panel, the red
line is calculated using loess smoothing over a 200-year window. The standard deviation with respect to the loess fit is
shown in the upper right corner of each panel; the value in parentheses is the standard deviation of the corresponding
GISS CMIP5 model. The yellow bar marks two standard deviations above and below the smoothed temperature. The
blue triangles show the year of branching for each historical ensemble member.
a volcanic eruption (Muthers et al., 2015; Solomon, 1999), leading to an ozone increase following Pinatubo
in contradiction to observations (Hofmann & Oltmans, 1993). The AMIP OMA simulations were rerun to
generate new ozone and aerosol inputs for a second set of NINT pre-industrial control and historical runs.
The NINT model output with the original and corrected composition is distinguished by “f1” and “f2,”
respectively, in the CMIP6 ripfarchive code. The atmospheric component of GISS-E2.1 used in these two
NINT simulations is identical; all that varies is the prescribed ozone and aerosol composition. In this article,
we emphasize the results of the NINT f2 and OMA f1 historical ensembles, both of which are based upon
the corrected model. In the troposphere, NINT f1 and f2 variables that we have examined are statistically
indistinguishable, so we show the NINT f1 results for stratospheric variables, mostly deferring tropospheric
diagnostics of this configuration to the Supporting Information.
The E2.1-G NINT f1 pre-industrial ocean was started from rest using a present-day climatology of temperature and salinity (Levitus & Boyer, 1994; Levitus, Burgett, et al., 1994), and run for two millennia with
pre-industrial atmospheric composition before a small correction was made to the sea ice calculation. The
model was integrated for another 200 years before a small fix to lake ice was applied. After an additional
50 years, net radiation at top of atmosphere (TOA) was roughly 0.1 W m−2 (with an interannual standard
deviation of 0.35 W m−2 ). This was considered close enough to equilibrium to start the piControl simulation
and branching of the first historical ensemble member. After nearly three millennia of additional integration, the NINT f1 piControl was used to initialize the hundred-year spin-up of the corresponding (E2.1-G)
MILLER ET AL.
6 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Figure 3. Same as Figure 2 but for ocean heat content (ZJ; 1 ZJ equals 1021 J). Heat content is calculated relative to a
reference temperature of 273.15 K. The span of both axes is the same in each panel (850 years and 800 ZJ).
OMA model, after which the OMA piControl and historical ensembles were started. The E2.1-G NINT f2
model branched from the f1 piControl 500 years after the OMA model; after 50 years of spin-up, the f2
piControl and historical ensemble were initialized. The E2.1-H NINT and OMA coupled configurations
followed a similar path toward equilibrium and initialization of the piControl and historical experiments.
The global and annual-mean surface air temperature for the NINT and OMA piControl experiments is
shown in black in Figure 2, along with low-frequency variations in red that are identified using a loess fit
(described below). For the NINT experiments, temperature variations are on the order of a tenth of a degree
K, with higher variability in the E2.1-G coupled model than in E2.1-H. Unforced variability of surface air
temperature in each piControl experiment is larger than in its CMIP5 predecessor. (The standard deviation
about the loess fit is indicated in the upper right corner of each panel of Figure 2, with the CMIP5 GISS-E2
value in parentheses.) This contrast is associated with stronger ENSO variations in E2.1-G (Kelley et al.,
2020). In general, surface trends are small for each model in Figure 2. Sub-surface trends are more apparent. Figure 3 shows drift in ocean heat content, consistent with the small radiation imbalance at TOA. For
the NINT f2 model, the ocean heat content decreases by 550 ZJ over 450 years. This change is comparable in
magnitude to an estimate of uptake since the pre-industrial (Zanna et al., 2019), suggesting that the model
deep ocean is still adjusting toward equilibrium (Figure 3b; 1 ZJ equals 1021 J).
The first member of each historical ensemble branches at the start of its pre-industrial control run (Table 1).
The remaining members are initialized every 20 years, a duration long enough so that the mid-latitude gyre
circulations and tropical ENSO are decorrelated and independent between ensemble members (Blumenthal,
1991). (The dates of branching are shown by the blue triangles in Figure 2) To offset the larger unforced
variability in E2.1-G compared to its CMIP5 E2-R predecessor, which can obscure forced variations, we chose
MILLER ET AL.
7 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Table 2
Historical Ensembles
Model
CMIP archive tag
doi
Branch year
GISS-E2.1-G NINT f1
historical_r1i1p1f1
10.22033/ESGF/CMIP6.7127
4150
Ensemble size
10
GISS-E2.1-G NINT f2
historical_r1i1p1f2
10.22033/ESGF/CMIP6.7127
7550
10
GISS-E2.1-G OMA
historical_r1i1p3f1
10.22033/ESGF/CMIP6.7127
2000
10
GISS-E2.1-H NINT f1
historical_r1i1p1f1
10.22033/ESGF/CMIP6.7128
3180
10
GISS-E2.1-H NINT f2
historical_r1i1p1f2
10.22033/ESGF/CMIP6.7128
4000
5
GISS-E2.1-H OMA
historical_r1i1p3f1
10.22033/ESGF/CMIP6.7128
2000
5
Note. Each ensemble member begins in 1850 and runs through 2014. The branch year is the piControl date corresponding
to 1850 of the first member of the historical ensemble. Subsequent ensemble members branch every 20 years.
a ten-member ensemble size, compared to five for E2-R and (CMIP6) E2.1-H, whose piControl variability
is smaller.
Table 2 lists the historical ensembles including the branching date in the piControl simulation that
corresponds to the start of the first member of each historical ensemble.
2.4. Definition of Anomalies During the Historical Period
The historical ensemble members are perturbed from the pre-industrial control experiment by the forcings described in section 3. However, each member initially retains some drift and low-frequency variability
from the control run, in part because the latter may not have reached equilibrium. To isolate changes
due to the forcing, we construct anomalies by subtracting from each historical member the corresponding
low-frequency drift of the PI model. This drift is identified using a loess fit to the PI control experiment (Cleveland & Devlin, 1988), based upon the loess function in the R statistical (or stats) library
(e.g., Teetor, 2011).
The loess fit derives from a local linear regression that adapts to a larger range of time scales in the piControl
simulations than a simple linear or quadratic fit, similar to a low-pass filter. This is useful if the remaining
approach to equilibrium is largest in the early part of the control run, so that minimal detrending is necessary thereafter. This is the case for the E2.1-H NINT model whose trend is largest in the first hundred years
(Figure 2d). A disadvantage of loess or low-pass fits is that they also capture low-frequency unforced oscillations in a control experiment. Where forcing in the historical experiment drives oscillations out of phase
with the control variability, there will be spurious variations in the anomaly that are due to this temporal
mismatch and not the forcing.
We choose our smoothing duration near 200 years to capture drift at longer time scales while excluding
higher frequency oscillations. Despite this, Figure 2a shows a temporary warming around Year 4900 whose
origin between residual disequilibrium and internal variability cannot be distinguished. Pre-industrial control variations at the surface are on the order of a tenth of a degree K or less, small compared to the magnitude
of the forced anomalies observed by the end of the historical period, so this distinction is not of practical
importance. In general, we find that drift correction is important for ocean heat content and sea level, but
does not affect atmospheric trends noticeably.
3. Forcing
We characterize perturbations to the pre-industrial climate using the effective radiative forcing (ERF),
defined as the difference in TOA net radiation between two AMIP-style simulations with pre-industrial SST
and sea ice, where one simulation contains the forcing agent: for example, an increase in greenhouse gas
(GHG) concentration (Hansen et al., 2005). The difference is constructed from 30-year averages after a 1-year
adjustment period, following the RFMIP-ERF protocol (Forster et al., 2016; Pincus et al., 2016; Smith et al.,
2020). The ERF gives the radiative imbalance prior to the slow return by the ocean to equilibrium but after
rapid adjustments in the atmosphere to the original forcing: for example, through stratospheric temperature
or clouds, including their modification by aerosols.
Unforced variability in both simulations introduces uncertainty in the ERF. Within the 30-year averaging
period, interannual variations of TOA net radiation are similar in magnitude for all forcings; we combine
MILLER ET AL.
8 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Figure 4. Effective radiative forcing for the E2.1-G NINT f2 model for the years 2000 (dark green) and 2014 (brown).
For comparison, the year 2000 ERF of the CMIP5 predecessor model (E2-R NINT) is shown in light green. For 2xCO2
and 4xCO2 , the E2-R and E2.1-G ERF values are in light and dark purple, respectively. SST and sea ice are prescribed
for each calendar month by averaging the first 30 years of the pre-industrial simulation. For 2xCO2 and 4xCO2 , “E2.1-G
HadISST” indicates that pre-industrial SST and sea ice are prescribed in E2.1-G using the Hadley observational
analyses of Rayner (2003). The central 95% of the distribution of uncertainty for each individual ERF spans
±0.08 W m−2 , marked by the error bar on the upper left of each panel.
variance across the individual forcings to compute a common standard deviation of 0.15 W m−2 , so that the
standard error of the ERF, constructed from the difference of two 30-year average fluxes, is 0.04 W m−2 .
Thus, drivers whose global ERFs have magnitudes less than 0.08 W m−2 cannot be distinguished statistically
from 0 with more than 95% confidence without longer runs. (This same criterion applies when comparing
ERF values computed from two forcing agents introduced in the √
same model. For an identical forcing agent
applied to NINT and OMA, this range should be multiplied by 2, so that statistically distinct values are
separated by at least 0.11 W m−2 .) This uncertainty is problematic for highly localized forcing agents like
land use and irrigation, whose global ERFs are small. Regional ERF values are also obscured by unforced
variations of TOA net radiation.
Below, we describe the forcings that perturb GISS-E2.1 during the historical period. Global ERF for the
E2.1-G NINT f2 model version is shown for the years 2000 and 2014 in Figure 4. The year 2014 is chosen
to characterize the present-day forcing; the year 2000 allows comparison to forcing computed with previous model versions. Comparison of forcing between the E2.1-G NINT f2 and OMA ensembles is given
by Figure 5.
Forcing in the E2.1-G and E2.1-H configurations is expected to be similar, due to their common AGCM and
pre-industrial climates that are only slightly different.
3.1. Long-Lived Greenhouse Gases
Long-lived greenhouse gases (GHGs) have nearly uniform concentrations as a result of atmospheric mixing. In the CMIP6 historical ensembles, concentrations of GHGs are annual averages of values taken from
Meinshausen et al. (2017). We use “Option 2” from that study and calculate radiative forcing from prescribed
concentrations of CO2 , CH4 , N2 O, CFC-11 (eq,) and CFC-12, where CFC-11 (eq.) includes CFC-11 plus a
radiative equivalent of 38 other long-lived GHGs. Concentrations are uniform with height throughout the
troposphere.
In the NINT ensembles, the NH concentrations of CH4 , N2 O and CFCs are increased compared to their
Southern Hemisphere (SH) values to account for hemispheric contrasts of anthropogenic sources. In the
OMA historical experiments, the vertical dependence of CH4 is calculated prognostically, but its surface
MILLER ET AL.
9 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Figure 5. Effective radiative forcing in 2014 for the E2.1-G NINT f2 (brown) and OMA (blue) models. Only the
combined aerosol direct and indirect forcing can be diagnosed for the OMA version; OMA forcing by BC on
snow is diagnosed as the difference between two aerosol experiments. The central 95% of the uncertainty
for each individual ERF spans ±0.08 W m−2 , marked by the error bar on the upper right. ERF
values corresponding to the same forcing agent applied to the OMA and NINT models are statistically distinct if they
are separated by 0.11 W m−2 .
value is tethered to the prescribed NINT concentration and meridional contrast. The prognostic concentrations of N2 O and CFC calculated by the OMA model are replaced in the radiative calculation with the
prescribed CMIP6 values used by the NINT model.
In the stratosphere, oxidation of methane is a source of water vapor (LeTexier et al., 1988) that is calculated
prognostically in the OMA and MATRIX ensembles. For the NINT model, the source is implemented empirically by increasing stratospheric water according to the methane concentration at the surface, but lagged by
2 years to account for vertical transport (Miller et al., 2014). This corresponds to a forcing around 0.1 W m−2
(Hansen et al., 2005).
Radiative forcing by GHGs is markedly smaller in E2.1-G than in the CMIP5 E2-R. Figure 4 shows that in
Year 2000, ERF for the E2.1-G NINT configuration is 2.38 W m−2 in contrast to the E2-R value of 3.25 W m−2 .
This reduction by one-quarter is greater than the fractional contrast in forcing associated with the doubling
or quadrupling of CO2 (Figure 4), suggesting that other GHGs also contribute to the reduction.
The spatial distribution of the GHG ERF is shown in Figure 6 for both the CMIP6 and CMIP5 versions of the
GISS NINT configuration. The forcing is largest mainly at low latitudes where outgoing longwave radiation
is highest. The contrast is fairly uniform with latitude, although slightly higher in the SH mid-latitudes
(Figure 6c).
Column water vapor (or “precipitable water”) and longwave (LW) cloud radiative forcing are larger in the
E2.1-G NINT pre-industrial climate, compared to their CMIP5 E2-R predecessors (Figure 7). The radiative
effect of the precipitable water contrast is related to the percentage difference (Goody & Yung, 1989; Huang
& Shahabadi, 2014), which is largest in the SH (Figure 7c). Contrasts in LW cloud radiative forcing vary
more with latitude but show a smaller hemispheric contrast (Figure 7f). We interpret the smaller GHG
forcing in the GISS CMIP6 ensembles as the result of a column that is more opaque to LW radiation due to
higher concentrations of water vapor and increased LW cloud forcing, and therefore less sensitive to further
increases in LW opacity by GHGs. The reduction of the E2.1-G ERF compared to the CMIP5 E2-R value
brings the GHG forcing closer to that of the CMIP3 ModelE (Hansen et al., 2007).
MILLER ET AL.
10 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Figure 6. GHG ERF (W m−2 ) in the (a) E2.1-G NINT f2 and (b) CMIP5 predecessor (E2-R NINT) models in the year
2000 relative to 1850. (c) difference.
3.2. Ozone
Skeie et al. (2020) show that GISS-E2.1 simulates the observed change in column ozone trend at springtime high latitudes between the late 20th century and first decade of the 21st century. For Year 2014,
the NINT ozone ERF is 0.27 W m−2 and statistically indistinguishable from the OMA value (Figures 4
and 5). During this year, the concentration of volcanic aerosols is small as is the forcing difference with
respect to the uncorrected NINT f1 configuration that underestimates ozone destruction by heterogeneous
chemistry (Figure S3).
This correction is nonetheless important to regional ozone concentration and forcing. The NINT f2 ERF for
2014 is generally positive at all latitudes (Figure 8e) except over Antarctica, where the stratospheric ozone
hole caused by heterogeneous reactions with ozone depleting substances is apparent. In contrast, negative
forcing over Antarctica is nearly absent in the f1 ensemble (Figure S7e).
MILLER ET AL.
11 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Figure 7. (left, a–c) Precipitable water vapor (PW) in pre-industrial AMIP-style simulations with the E2.1-G NINT f2 and CMIP5 E2-R NINT models. SST and
sea ice are monthly varying 30-year averages from the respective coupled piControl simulations. The PW difference in the lowest panel is plotted as a
percentage change, relative to the CMIP5 value. (right, d–f) Same but for LW cloud radiative forcing. The RF difference in (f) is in Wm−2 .
3.3. Aerosols
For Year 2000, the NINT f2 ERF corresponding to aerosol direct radiative forcing is −0.35 W m−2 , more negative than the CMIP5 NINT ERF of −0.22 W m−2 (Figure 4). The rapid adjustment includes the semi-direct
effect of aerosols (Hansen, Sato, & Ruedy, 1997). The ERF is largest over land and coastal regions, especially
in polluted regions like East Asia and the Indo-Gangetic Plain along with regions of cultivation and fertilizer use like the North American Great Plains (Figure 9a). For OMA, emission rather than concentration
is prescribed so that the latter is intertwined with cloud processes. Thus, the total aerosol forcing cannot
be decomposed into separate direct and indirect contributions. However, the NINT aerosol concentration is
derived from a prescribed-SST version of OMA, so the direct effect of the two models should be similar.
The ERF associated with the aerosol indirect effect is larger in magnitude than the direct forcing and dominates the total aerosol ERF, especially where aerosol emissions coincide with clouds, such as the Guinea
coast of NH Africa and the Amazon basin (Figures 9b and 9c). For the NINT configuration, the magnitude of
the AIE is imposed, as noted in section 2.1. However, for the OMA version, where the first indirect effect is
calculated using a physically based parameterization without tuning, the magnitude of total aerosol forcing
is only slightly reduced (Figure 9e).
MILLER ET AL.
12 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Figure 8. Effective radiative forcing in 2014 for the E2.1-G NINT f2 model (W m−2 ) by (a) all forcing agents present simultaneously, (b) reduction of snow
albedo through BC deposition, (c) land use, (d) irrigation, (e) ozone, and (f) total solar irradiance. Gray dots indicate where the forcing is distinct from
0 based upon a two-sided Student t test with a 95% central interval.
Aerosols perturb climate where their deposition upon snow and ice increases shortwave absorption
(Flanner et al., 2007; Skiles et al., 2018; Warren & Wiscombe, 1985). In GISS-E2.1, surface albedo is reduced
by BC deposition (Bauer et al., 2013; Hansen & Nazarenko, 2003). Forcing is locally positive, as seen in the
Himalayas (Figure 8b) where prolific sources of BC are near the bright snowy surface. In the OMA configuration, the global ERF is slightly positive: 0.10 W m−2 (Figure 5). The NINT ERF is globally negative but not
statistically distinct from 0 (Figure 4).
3.4. Solar
Incoming solar radiation at TOA is derived from the annual average of the spectrally resolved flux at 1 AU
compiled by Matthes et al. (2017). The annual flux is derived from two empirical models that relate measurements of solar irradiance by satellite (Rottman, 2005) to proxies like sunspot number and facular darkening
that are available farther back in time. The satellite measurements are themselves a source of uncertainty
due to gaps in temporal coverage and unknown offsets between non-overlapping instruments (e.g., Fröhlich,
2006). The same CMIP6 data set provides the photon flux that is used to calculate photolysis rates.
MILLER ET AL.
13 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Figure 9. Effective radiative forcing in 2014. (a) the direct, (b) indirect, and (c) total radiative forcing of NINT f2 aerosols. (d) Total OMA aerosol forcing, and
(e) difference of total aerosol forcing between OMA and NINT. Gray dots indicate where the forcing (or forcing difference) is distinct from 0 based upon a
two-sided Student t test with a 95% central interval.
The NINT f2 global ERF for solar in 2000 is 0.06 W m−2 , down from the CMIP5 NINT ERF of 0.28 W m−2
(Figure 4). For the year 2014, solar ERF for both CMIP6 NINT and OMA models is small and indistinguishable from 0 (Figure 5). Both of these years are near the maximum irradiance measured during
their respectively solar cycles. The latest cycle (24) is unusually weak compared to recent predecessors
(Jiang et al., 2015).
3.5. Volcanic Aerosols
We use the recommended CMIP6 data set for stratospheric aerosols created by volcanic eruptions
(ftp://iacftp.ethz.ch/pub_read/luo/CMIP6/Readme_Data_Description.pdf), a change from prior GISS
CMIP experiments, where volcanic aerosols were derived from an update of Sato et al. (1993).
The CMIP6 volcanic prescription is based upon a combination of direct measurements and model reconstructions. During the satellite era (after 1979), radiance measurements are used to retrieve the aerosol
extinction and particle size as an evolving function of height and latitude (Thomason et al., 2018). This
period includes major eruptions by El Chichon and Pinatubo.
MILLER ET AL.
14 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Table 3
Major Volcanic Eruptions During the Historical Period (1850–2014)
Name
Location
Date
Makian
Indonesia 0.3◦ N
May 1861
Krakatoa
Indonesia 6◦ S
September 1883
Santa Maria
Guatamala 15◦ N
October 1902
Agung
Bali 8◦ S
June 1963
El Chichon
Mexico 17◦ N
April 1982
Pinatubo
Philippines 15◦ N
July 1991
Prior to the satellite era, stratospheric aerosol properties are derived using a chemical transport model
(Arfeuille et al., 2014). The injection of sulfur dioxide following an eruption is estimated using measurements from multiple ice cores. Oxidation and conversion to sulfate aerosol is calculated by a model that
simulates nucleation, condensation, coagulation, and deposition with transport prescribed from climatology
as a function of latitude and height. Photometric extinction is used as a constraint when available. The model
simulates extinction by sulfate aerosols for specific historical eruptions identified by the ice cores. During
quiescent periods, aerosol properties are taken from satellite retrievals averaged between 1996 and 2005.
Aerosol properties following eruptions are different after 1979 when retrievals replace model estimates.
The CMIP6 prescription moves one eruption identified by Sato et al. (1993) from the late 1850s to 1861
(Table 3). GISS prescribes total extinction at 0.55 μm using the CMIP6 prescription, but specifies its own
version of the remaining Mie parameters as described by Hansen et al. (2005). During the historical period,
total extinction at 0.55 μm is generally reduced compared to Sato et al. (1993). Extinction is now largest
following the eruption of Pinatubo rather than Krakatau. Globally, the effective radius of volcanic aerosols
is reduced.
During the piControl simulations, volcanic aerosol properties are held constant at a value equal to their
average during the historical period. This is intended to minimize the climate trend resulting from successive
eruptions during the historical period by requiring that the time-average volcanic forcing be 0. Negative
forcing following eruptions after 1850 is punctuated by positive forcing during the intervening quiescent
periods (Figure 10). For example, in 2000, during the quiescent period following Pinatubo, CMIP6 volcanic
forcing is 0.09 W m−2 (Figure 4). The assumption of zero forcing during the historical period results in a
forcing discontinuity at 1850, when the constant pre-industrial concentration of volcanic aerosols is replaced
by a smaller value until the Makian eruption in 1861 (Table 3).
3.6. Orbital
Orbital forcing results from variations in the eccentricity of Earth's orbit, the obliquity of its rotation axis
compared to the orbital plane and the date of perihelion. As for CMIP5, these parameters are prescribed
following Berger (1978). During the historical period, orbital variations in TOA insolation are small and are
calculated only to allow continuity with Last Millennium simulations where changes are larger.
3.7. Land Use
The expansion of cultivation and grazing with human population over the historical period reduces the
fraction of natural vegetation types within each land grid box. We inadvertently implemented land use in
GISS-E2.1 using the same areal fractions of cropland and pasture as prescribed for the GISS CMIP5 model
(Miller et al., 2014). Departures from the CMIP6 protocol are described by Ito et al. (2020). After 2005, the
end of the CMIP5 historical period, land use is held constant.
Land use affects surface albedo. Forcing is negative where crops replace darker forests; forcing is small where
crops expand into grassland and positive where bright bare soils are planted. Land use alters transpiration,
which increases where the leaf area index is increased. Reforestation is not included in our prescription. The
“pasture” category includes both managed pastures and unmanaged rangeland. The latter is found within
natural vegetation categories and its surface properties are generally unmodified by grazing, so its inclusion
within the pasture category leads to an excessive prescription of cultivated area. This overestimate is largest
in grid boxes with abundant natural grasslands and arid shrublands where rangelands are identified. The
inclusion of unmanaged rangeland in the cultivated category nearly doubles the area of simulated land use.
MILLER ET AL.
15 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Figure 10. Effective radiative forcing for the E2.1-G NINT f2 model during the historical period extended to 2100 using
Shared Socioeconomic Pathway (SSP) 2-4.5 (Riahi et al., 2017). The standard deviation of interannual variations of the
ERF is 0.22 W m−2 . The error bar near the right edge marks plus and minus two standard deviations. For GHGs and
aerosols, whose concentrations rise gradually compared to volcanic forcing, we apply a 15-term low-pass filter that
reduces variations shorter than 30 years. The vertical gray line marks 2014, the end of the historical period.
Despite the overestimation of land use area, neither NINT or OMA global forcing in 2014 are distinct from
0 (Figure 5). Nonetheless, negative values of ERF are significant over the Great Plains of North America
(Figure 8c), consistent with the GISS CMIP5 ERF, due to replacement of forest by lighter crops. Figure 8c
shows large positive forcing over the Arabian Peninsula, where cultivation expands to cover bright bare soil.
3.8. Irrigation
Irrigation was implemented within ModelE after CMIP5 (Cook et al., 2011, 2014; Puma & Cook, 2010;
Shukla et al., 2014). Water demand for irrigation in GISS-E2.1 was calculated between 1900 and 2005 as
described by Wada et al. (2014) using irrigation areal extent from Siebert et al. (2015). Nineteenth century
irrigation is estimated by extrapolating backwards from the trend between 1900 and 1929. Irrigation is held
constant after 2005 using the flux from 2004.
The irrigated flux is taken from lakes and runoff within the same grid box and deposited into the soil column
(Puma & Cook, 2010). When lakes and runoff are insufficient to meet demand, water is assumed to be
supplied by a aquifer of unlimited volume. The aquifer draw represents an addition of water to ModelE.
Irrigated water may run off or evaporate, but some is diverted into deep soil layers that have a long return
time to the ocean and atmosphere. As a consequence, irrigation initially has the potential to sequester water
in the soil column and reduce sea level by transferring moisture from the ocean to the land. Eventually
enough water accumulates in the deep layers that a large fraction of the irrigated flux remains near the
surface where it is subject to evapotranspiration and runoff. Crops and vegetation share the same soil column
in GISS-E2.1. Thus, saturation of the deep soil layers through the irrigation flux takes longer than if water
were applied solely to the crop fraction of the soil column.
Global radiative forcing by irrigation is statistically indistinguishable from 0 in the NINT model but slightly
positive in the OMA model (Figure 5). Forcing is positive over the Indo-Gangetic Plain (Figure 8d), a site of
extensive irrigation during the twentieth century (Puma & Cook, 2010). Moistening of the atmosphere by
irrigation increases LW opacity, which contributes to positive forcing. The surface can nonetheless cool, as
moisture supplied by irrigation shifts the compensation of solar heating away from the turbulent sensible
heat flux toward efficient cooling by evaporation.
3.9. All Forcing
For Year 2014, the NINT ERF for the addition of all forcing agents simultaneously is 2.08 W m−2 and is
dominated by the contribution from rising greenhouse gas concentrations (Figures 4 and 10). For the OMA
MILLER ET AL.
16 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Figure 11. Ensemble, global, and annual-average surface air temperature (K) during the historical simulations
(1850–2014). (a) NINT f2 and (c) OMA models; (b and d) same but emphasizing the period after 1970. Intra-ensemble
variability at plus and minus two standard deviations is marked by pink and light blue shading for the E2.1-G and
E2.1-H models, respectively, with light purple marking their overlap. This variability is smoothed using an 11-year
window that damps periods less than a decade. Temperatures from the corresponding CMIP5 models (E2-R and E2-H)
are plotted as dotted lines. The observed GISTEMP record is in black. Temperatures are anomalies, defined relative to
their late-nineteenth century average (1880–1900). Vertical gray lines show major volcanic eruptions listed in Table 3.
configuration, the all-forcings ERF is 2.19 W m−2 (Figure 5), marginally larger than the NINT value, with
aerosols making the largest contribution to the contrast.
For Year 2000, the CMIP6 NINT ERF is 1.79 W m−2 , nearly 1 W m−2 smaller than the CMIP5 NINT value of
2.72 W m−2 . Most of this change is due to a reduction of GHG forcing in the CMIP6 model. The total aerosol
forcing is nearly unchanged between the two generations of NINT experiments, but solar and volcanic
forcing combine for an additional reduction of 0.46 W m−2 .
The spatial distribution of the ERF is generally positive as a result of increasing LW opacity due to rising GHG
concentrations (Figure 8a). However, the imprint of more regionalized forcings is apparent: for example,
land use over the North American Great Plains and western Arabian Peninsula, aerosols over southeast
Asia, the Guinea coast and Amazon basin, BC deposited upon snow in the Himalayas, irrigation over the
Indo-Gangetic Plain, and the near offset of GHG forcing over SH high latitudes by stratospheric ozone loss.
4. Historical Response
4.1. Temperature
Surface air temperature for the historical ensembles is plotted in Figure 11 along with the GISTEMP
Land-Ocean Temperature Index (LOTI; Lenssen et al., 2019). (Table 4 summarizes the observations used in
this study.) Anomalies are defined relative to the start of the GISTEMP record in the late nineteenth century
(1880–1900), when the forcing is still small.
MILLER ET AL.
17 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Table 4
Observations Used in Model Evaluation
Variable
tas
Observation
Source
GISS surface temperature analysis
data.giss.nasa.gov/gistemp
(GISTEMP)
v.4, accessed 2020-07-29
Citation
Lenssen et al. (2019)
msu tls, tmt MSU/AMSU brightness temperature Remote Sensing Systems v.4.0
siconca
Mears and Wentz (2016)
Channel TLS, TMT
www.remss.com/measurements/upper-air-temperature
Sea ice area
National Snow and Ice Data Center
Fetterer et al. (2017)
Hemispheric Sea Ice Area Index, v.3 nsidc.org/data/g02135
ohc
msftmz
Sea ice extent
doi. org/10.5281/zenodo.3717240
Brennan et al. (2020)
Ocean heat content: column
laurezanna.github.io/post/ohc_updated_data
Zanna et al. (2019)
Ocean heat content: upper 700 m
www.cmar.csiro.au/sealevel/thermal_expansion_ocean_heat_timeseries.html Church et al. (2011)
Atlantic Meridional Overturning
RAPID-MOCHA array (26◦ N)
Smeed et al. (2019)
doi.org/10/c72s
zos
Sea level
research.csiro.au/slrwavescoast/sea-level
zostoga
Thermosteric sea level
research.csiro.au/slrwavescoast/sea-level
Church et al. (2011)
Church et al. (2011)
pr
Precipitation
Global Precipitation Climatology Project (GPCP) v.2.3
Adler et al. (2018)
www.esrl.noaa.gov/psd/data/gridded/data.gpcp.html
The GISS ensembles closely track the GISTEMP LOTI throughout the historical period, warming by 1 K
and remaining within two standard deviations of intra-ensemble variability. For the NINT ensembles, two
standard deviations is roughly 0.3 K (For GISTEMP, the 95% central uncertainty at 1880 is 0.15 K, decreasing
to 0.05 K in the present day Lenssen et al., 2019).
Over the historical period, the GISS-E2.1 models warm by 0.1–0.2 K less than their GISS CMIP5 counterparts. The reduced warming is in spite of a higher transient climate response (TCR) for the GISS-E2.1 models
compared to GISS-E2 (Table 5, excerpted from Table 6 of Kelley et al., 2020), suggesting that the CMIP6
reduction is a consequence of smaller total forcing (Figure 4). The E2.1-H version has a higher TCR than
E2.1-G, consistent with its higher warming in both the NINT and OMA versions. (Table 5).
Because E2.1-G and E2.1-H share a common atmospheric model, the warming and TCR contrasts suggest
that HYCOM exports heat from the surface to the deep ocean more slowly (as shown in section 4.3).
Regional warming is shown in Figure 12 as the temperature difference between the start of the GISTEMP
LOTI (1880–1899) and the end of the historical period (2005–2014). Both the models and observations show
their largest warming at high latitudes, especially in the NH, a robust feature of coupled models (Dai et al.,
2019; Holland & Bitz, 2003). In E2.1-G, there is cooling in the region of North Atlantic deepwater formation near the outlet of the Labrador Sea that matches the location of the observed warming minimum
(Figures 12a and 12b). This minimum is not reproduced by the E2.1-H ensembles (Figures 12d and 12h).
Despite the broad hemispheric agreement of model temperature with the observations, regional disagreements are apparent. The extratropical continents in both hemispheres warm less than observed in both
the NINT and OMA simulations. The contrast is especially large over the Indo-Gangetic Plain and East
Asia, regions cooled by irrigation and strong negative forcing by anthropogenic aerosols (Figures 8 and 9).
Table 5
Transient Climate Response From Kelley et al. (2020)
Model
Physics version
CMIP6 (K)
E2.1-G
NINT
1.8
CMIP5 (K)
1.4
E2.1-G
OMA
1.6
1.6
E2.1-H
NINT
1.9
1.7
E2.1-H
OMA
2.0
1.8
Note. The CMIP5 antecedent to the GISS-E2.1-G model is GISS-E2-R.
MILLER ET AL.
18 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Figure 12. Change in ensemble and annual-average surface air temperature (K) between 1880–1899 and 2005–2014. (a) E2.1-G NINT f2, (b) GISTEMP LOTI,
(c) E2.1-G NINT f2 minus GISTEMP, (d) E2.1-H NINT f2 minus GISTEMP, (e) E2.1-G OMA minus NINT f2, (f) E2.1-H OMA minus NINT f2, (g) E2.1-G OMA
minus GISTEMP, (h) E2.1-H OMA minus GISTEMP. Observed temperature is plotted only at locations where measurements are available during 90% of each
averaging period. Gray dots indicate where the (a) ensemble-mean or (b) observed change or (e, f) difference between the ensemble-mean changes is
distinct from 0 based upon a two-sided Student t test with a 95% central interval. (c, d, g, h) Gray dots indicate where the observed value is outside
the central 95% of a normal distribution constructed from the individual ensemble members.
MILLER ET AL.
19 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Figure 13. Global and annual temperature retrieved from the Microwave Sounding Unit (MSU) and Advanced
Microwave Sounding Unit (AMSU) compared to model values. (top) Temperature of the lower stratosphere (TLS),
(bottom) temperature of the middle troposphere (TMT); (left) E2.1-G, (right) E2.1-H. Intra-ensemble variability
(corresponding to plus and minus two standard deviations) for the NINT f2 model is shaded in purple. The
retrievals (black) are from RSS v.4.0 and are offset to have the same time average as the NINT f2 ensemble
during the period 1979–2014. Vertical gray lines show major volcanic eruptions.
In contrast, there is excessive warming by all model versions in the Greenland and Norwegian Seas. In the
Tropics, model warming is “ENSO-like” with greater magnitude in the equatorial Eastern Pacific compared
to observations (Cane et al., 1997).
Differences between the OMA and NINT ensembles are largest at NH high latitudes, where meridional
amplification of warming by OMA is smaller. This moderation is most prominent in the vicinity of the
Norwegian Sea in both OMA models (Figures 12e and 12f).
Upper air temperatures retrieved from the Microwave Sounding Unit (MSU) and Advanced Microwave
Sounding Unit (AMSU) are plotted in Figure 13. Retrievals of lower stratosphere (TLS) and middle troposphere (TMT) temperature, representing weighted averages with respect to height, are provided by Remote
Sensing Systems (Mears & Wentz, 2016, RSS Version 4.0). TLS is weighted most heavily near 20 km; TMT
has its largest contribution from radiation originating near 10 km, but has significant weight extending into
the lower stratosphere. The weighting with respect to height varies with location and season. For comparison to the retrievals, we average model temperature over a range of altitude that is globally and temporarily
invariant (Hansen et al., 1998), an approximation that has been shown to have a small effect upon annual
trends compared to other uncertainties, including differences among retrieval algorithms (Santer et al., 1999;
Thorne et al., 2011).
The TLS retrievals show abrupt warming of the lower stratosphere following a volcanic eruption. Recovery
is initially rapid followed by a slower decadal cooling and return to pre-eruption values (Figure 13a and 13b).
In the last half-century, this cycle is superimposed upon a multi-decadal cooling trend that has been
MILLER ET AL.
20 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Figure 14. Global column ozone (DU) prescribed in the NINT f1 and f2 ensembles. Vertical gray lines show major
volcanic eruptions.
attributed to increasing concentrations of ozone depleting substances and greenhouse gases (Gillett et al.,
2011; Manabe & Weatherald, 1967). Each of the GISS-E2.1 ensembles reproduces these transient warmings and recent cooling. Throughout the historical period, TLS from the NINT f2 ensemble closely tracks
the OMA value, reflecting the prescription of atmospheric composition in the former model from an AMIP
version of the latter.
At the start of the historical period, the lower stratosphere in the OMA and NINT f2 models cools abruptly,
prior to any volcanic eruption (Figures 13a and 13b). The anomaly is outside two standard deviations of
intra-ensemble variability, and is significantly different from the initial pre-industrial temperature. We interpret this cooling as the result of a discontinuous removal of volcanic aerosols. As noted in section 3.5, aerosol
concentration during the pre-industrial is constant and equal to its average during the historical period. At
the start of the historical period, concentration drops abruptly and remains low until the Makian eruption
in 1861. The removal of volcanic aerosols causes cooling in the OMA and NINT f2 ensembles by the same
mechanism that the injection of aerosols into the stratosphere following an eruption causes rapid warming
through longwave absorption (McCormick et al., 1995).
In contrast, this cooling is absent in the NINT f1 ensemble that also generally lacks the slow decadal recovery following each eruption that is present in the other model versions. This absence is associated with a
consistent warm offset of the NINT f1 TLS during quiescent periods. Until the introduction of CFCs into
the stratosphere in the mid-twentieth century, TLS is tightly coupled to column-integrated ozone in both
NINT models (Figure 14). At the start of the historical period, column ozone drops in the NINT f2 ensemble,
associated with a reduced concentration of volcanic aerosols that host heterogeneous chemical reactions
on the aerosol surface. These reactions compete for nitrogen and reduce the concentration of NOx and the
associated catalytic destruction of ozone (Muthers et al., 2015; Tie & Brasseur, 1995). The abrupt reduction
of volcanic aerosols at the start of the NINT f2 and OMA ensembles causes catalytic ozone destruction to
increase, associated with cooling of the lower stratosphere. In contrast, heterogeneous reactions are muted
in the NINT f1 model by the misspecification of surface area for volcanic aerosols. Ozone destruction in
NINT f1 is nearly unchanged by the removal of volcanic aerosols in 1850, resulting in a steady concentration
and temperature of the lower stratosphere.
By the late twentieth century, CFC concentrations have increased and the effect of heterogeneous reactions
upon ozone is now dominated by destruction through the presence of chlorine. Figure 14shows that column ozone decreases following the Pinatubo eruption in the NINT f2 and OMA ensembles, as observed
(Hofmann & Oltmans, 1993), and in contrast to the increase in the NINT f1 model. The contrasting relation
of TLS and ozone in the NINT f2 model between nineteenth and late-twentieth century eruptions shows that
MILLER ET AL.
21 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Figure 15. Trends between 1979 and 2014 for GISTEMP LOTI and RSS v.4.0 MSU TMT and TLS. Ensemble means for
NINT f1, NINT f2, and OMA are marked by triangles, diamonds, and circles, respectively. E2.1-G is marked in red;
E2.1-H is in blue. Trends for individual ensemble members are marked by small triangles. The red and blue horizontal
lines indicate plus and minus two standard deviations of the trend distribution within each ensemble. Observational
uncertainty reflects interannual variations but not instrument or retrieval uncertainty.
the representation of heterogeneous chemistry in OMA is flexible and able to reproduce the distinct limits dominated by chlorine and nitrogen associated with high and low background concentrations of CFCs
(Tie & Brasseur, 1995).
Retrievals of TMT show a warming trend starting in the late twentieth century punctuated by abrupt warming associated with El Niño events and cooling following major volcanic eruptions (Figure 13c and 13d).
(TMT trends also include an offsetting contribution from cooling in the lower stratosphere.) The warming
trend and post-volcanic cooling are reproduced by each ensemble.
Temperature trends between 1979 and 2014 at the surface and MSU retrieval layers are shown in Figure 15.
This period was chosen for overlap between the MSU retrievals and historical ensembles. Observational
uncertainty reflects interannual variations, but not instrument or retrieval uncertainty. For surface temperature, this effect is on the order of 0.01 K per decade and relatively unimportant. Retrieval uncertainty for
MSU is higher: 0.05 K per decade (Thorne et al., 2011).
The figure shows surface and tropospheric warming but stratospheric cooling, a contrast that is consistent with anthropogenic forcing, especially rising greenhouse gas concentrations and ozone depletion
(e.g., Gillett et al., 2011; Manabe & Weatherald, 1967). The NINT f1 and f2 ensembles have tropospheric
trends that are indistinguishable, showing the limited effect of the erroneous post-eruption ozone chemistry
in the original ensemble (Table 6). The difference is larger and significant in the stratosphere.
MILLER ET AL.
22 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Table 6
Modeled and Observed Trends in Surface, Middle Troposphere, and Lower Stratospheric
Global Temperature Between 1979 and 2014 (in K per Decade)
Surface
Middle troposphere
Lower stratosphere
E2.1-G NINT f1
0.21 ± 0.05
0.19 ± 0.03
−0.21 ± 0.02
E2.1-G NINT f2
0.21 ± 0.05
0.17 ± 0.05
−0.31 ± 0.02
−0.35 ± 0.03
E2.1-G OMA
0.17 ± 0.03
0.13 ± 0.03
E2.1-H NINT f1
0.23 ± 0.06
0.20 ± 0.05
−0.21 ± 0.02
E2.1-H NINT f2
0.24 ± 0.06
0.19 ± 0.03
−0.29 ± 0.02
−0.26 ± 0.02
E2.1-H OMA
0.26 ± 0.11
0.22 ± 0.08
Obs
0.15 ± 0.03
0.11 ± 0.05
−0.25 ± 0.09
Obs (1979–2018)
0.17 ± 0.03
0.13 ± 0.04
−0.22 ± 0.10
Note. Model trends are calculated from the ensemble mean, while their uncertainty
range bounds the central 95th percentile of the distribution of ensemble member trends.
Observed trends are based upon GISTEMP (surface) and RSS v.4.0 MSU channels TMT
(middle troposphere) and TLS (lower stratosphere). Uncertainty of observed trends is
related to interannual variability but does not include measurement or retrieval error.
Different methods of MSU retrieval contribute additional TMT trend uncertainty near
0.05 K per decade (cf. Thorne et al., 2011).
For the lower stratosphere, the observed cooling is consistent with the distribution of model trends for each
ensemble. In the mid-troposphere and surface, most ensembles have mean warming trends that are excessive compared to observations, but each E2.1-G ensemble has members whose trend is close to the observed
value.
We recomputed trends with the observational record extended to 2018, which introduces an unusually large
El Niño event. The observed surface and tropospheric trends increase by 0.02 K per decade, which reduces
the model discrepancy (Table 6). This extension has little effect upon agreement in the lower stratosphere.
At the surface and mid-troposphere, E2.1-H ensembles with the HYCOM ocean model show larger warming than the E2.1-G ensembles. This contrast is largest in the OMA ensembles, consistent with the greater
contrast in TCR (Table 5.)
4.2. Sea Ice
Sea ice area is shown for each ensemble as an annual hemispheric percentage in Figure 16, along with the
percentage retrieved by the National Snow and Ice Data Center (NSIDC Fetterer et al., 2017). Sea ice area is
greater in both hemispheres in the GISS-E2.1 ensembles compared to their GISS CMIP5 predecessors. This
represents an improvement in the SH, where the E2.1-G ensemble means are in good agreement with the
NSIDC retrievals at the start of the satellite period. Observed SH sea ice trends are obscured by interannual
variability (Figure 16). E2.1-G trends are weak, but SH ice loss in the E2.1-H ensembles is comparable to NH
trends and unrealistically large. Freshwater from melting ice sheets is one source of sea ice variability not
represented in the models (Golledge et al., 2019; Rye et al., 2014). Anomalous meltwater increases surface
stratification, reducing vertical mixing and the upward transport of ocean heat that leads to sea ice loss.
In the NH, ensemble-mean area is up to 30% higher than the NSIDC retrieval, with a larger bias than their
GISS CMIP5 counterparts. Kelley et al. (2020) attribute this increase to excessive radiative loss at high latitudes related to brighter clouds. Figure 16 includes an estimate of sea ice extent that extends prior to the
satellite period (Brennan et al., 2020). The estimate addresses the sparsity of sea ice observations prior to the
mid-twentieth century by assimilating comparatively ubiquitous temperature measurements into an ESM
to infer sea ice extent, defined as the total area of grid cells whose areal fraction of ice exceeds 15%. By excluding open water within a grid cell, ice area is generally smaller. For the NSIDC retrieval grid, ice extent is
larger than area by about 20%. To facilitate comparison with model area, we give ice extent the same average
between 1979 and 2013 as the NSIDC area.
NH ice extent shows a decline in the early twentieth century interrupted by a brief recovery following the
eruption of Mount Agung before a steeper decline during the satellite period. Throughout the historical
MILLER ET AL.
23 of 35
Journal of Advances in Modeling Earth Systems
10.1029/2019MS002034
Figure 16. Ensemble and annual sea ice area as a hemispheric percentage during the historical simulations
(1850–2014). For comparison to observations, total ice area is shown instead of a detrended anomaly. Unforced
piControl trends are small compared to recent historical trends (Figures S1 and S2).