AI For Science Report
Department of Energy Contact
Barbara Helland Program Manager, Department of Energy
Special Assistance
Chapter Leads: Argonne National Laboratory
Valerie Taylor, Director, Mathematics and Computer Science Division
General Atomics
David Humphreys
Publication: Argonne National Laboratory: Linda Conlin, Kristen Dean,
Lorenza Salinas, John W. Schneider, Sonya Soroko
Contents
Executive Summary ...................................................................................................... 1
06. Fusion .................................................................................................................... 65
1. State of the Art ...............................................................................................................65
2. Major (Grand) Challenges ..............................................................................................66
3. Advances in the Next Decade ........................................................................................69
4. Accelerating Development .............................................................................................70
5. Expected Outcomes.......................................................................................................70
6. References ....................................................................................................................71
12. Data Life Cycle and Infrastructure ..................................................................... 117
1. State of the Art .............................................................................................................118
2. Major (Grand) Challenges ............................................................................................119
3. Advances in Next Decade ............................................................................................122
4. Accelerating Development ...........................................................................................123
5. Expected Outcomes.....................................................................................................123
6. References ..................................................................................................................123
AC. Combined Town Hall Registrants ..................................................................... 171
Executive Summary
From July to October 2019, the Argonne, Oak Ridge, and Berkeley National Laboratories hosted a
series of four town hall meetings attended by more than 1,000 U.S. scientists and engineers. The
goal of the town hall series was to examine scientific opportunities in the areas of artificial
intelligence (AI), Big Data, and high-performance computing (HPC) in the next decade, and to
capture the big ideas, grand challenges, and next steps to realizing these opportunities.
In this report and in the Department of Energy (DOE) laboratory community, we use the term “AI
for Science” to broadly represent the next generation of methods and scientific opportunities in
computing, including the development and application of AI methods (e.g., machine learning, deep
learning, statistical methods, data analytics, automated control, and related areas) to build models
from data and to use these models alone or in conjunction with simulation and scalable computing
to advance scientific research.
The AI for Science town hall discussions focused on capturing transformational uses of AI that employ HPC and/or data analysis, leverage data sets from HPC simulations or from instruments and user facilities, and address scientific challenges unique to DOE user facilities and the agency’s wide-ranging fundamental and applied science enterprise.
The town halls engaged diverse science and user facility communities, with both discipline- and
infrastructure-specific representation. The discussions, captured in the 16 chapters of this report,
contain common arcs revealing classes of opportunities to develop and exploit AI techniques and
methods to improve not only the efficacy and efficiency of science but also the operation and
optimization of scientific infrastructure.
The community’s experience with machine learning (ML), HPC simulation, data analysis methods,
and the consideration of long-term science objectives revealed a growing collection of unique and
novel opportunities for breakthrough science, unforeseeable discoveries, and more powerful
methods that will accelerate science and its application to benefit the nation and, ultimately,
the world.
New AI techniques will be indispensable to supporting the continued growth and expansion of
DOE science infrastructure from ESnet to new light sources to exascale systems, where system
scale and complexity demand AI-assisted design, operation, and optimization. Toward this end,
novel AI approaches to experiment design, in-situ analysis of intermediate results, experiment
steering, and instrument control systems will be required.
DOE’s co-design culture, involving teams of scientific users, instrument providers, mathematicians, and computer scientists, can be leveraged to develop new capabilities and tools such that they can
be readily applied across the agency’s (and indeed the nation’s) diversity of instruments, facilities,
and infrastructure. This report captures some early opportunities in this direction, but much more
needs to be explored.
From chemistry to materials sciences to biology, the use of ML and deep learning (DL) techniques
opens the potential to move beyond today’s heuristics-based experimental design and discovery to
AI-enhanced strategies of the future.
Early use of generative models in materials exploration suggests that millions of possible materials
could be identified with desired properties and functions and evaluated with respect to
synthesizability. The synthesis and testing stages necessary for such scales will in turn rely on ML
and adaptive, autonomous robotic control of high-throughput synthesis and testing lines, creating
“self-driving” laboratories.
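In caricature, such a screening pipeline couples a generative proposer with surrogate scoring models. The sketch below is purely illustrative and not a method from this report: the random "generative model," the property weights, and the synthesizability proxy are invented stand-ins for trained models.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_candidates(n, n_elements=5):
    # Toy stand-in for a trained generative model: random composition
    # vectors (fractions of 5 elements summing to 1).
    return rng.dirichlet(np.ones(n_elements), size=n)

def predicted_property(comp):
    # Hypothetical surrogate score for a target property (e.g., band gap).
    weights = np.array([1.2, -0.4, 2.0, 0.1, -1.5])
    return comp @ weights

def synthesizability(comp):
    # Crude proxy: compositions dominated by one element score poorly.
    return 1.0 - comp.max(axis=1)

candidates = sample_candidates(10_000)
keep = (predicted_property(candidates) > 1.0) & (synthesizability(candidates) > 0.4)
shortlist = candidates[keep]   # candidates worth sending to synthesis lines
```

In a real pipeline, the shortlist would be dispatched to robotic synthesis and testing, and the results fed back to retrain both the generator and the scoring models.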
The same complexity challenge, and the concomitant need to move from human-in-the-loop to AI-driven design, discovery, and evaluation, also manifests across the design of scientific workflows, optimization of large-scale simulation codes, and operation of next-generation instruments.
Exascale systems and new scientific instruments, such as upgraded light sources and
accelerators, are increasing the velocity of data beyond the capabilities of existing instrument data
transmission and storage technologies. Consequently, real-time hardware is needed to detect
events and anomalies in order to reduce the raw instrument data rates to manageable levels. New
ML, including DL, capabilities will be critically important in order to fully exploit these instruments,
replacing pre-programmed hardware event triggers with algorithms that can learn and adapt, as
well as discover unforeseen or rare phenomena that would otherwise be lost in compression.
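In miniature, replacing a fixed hardware threshold with a trigger that learns the instrument's normal baseline might look as follows. All signal values and thresholds here are invented for illustration; production triggers run as trained models on FPGAs or ASICs at the detector.

```python
import numpy as np

class AdaptiveTrigger:
    """Flags readings that deviate strongly from a running estimate of the
    normal signal distribution, adapting that estimate as conditions drift."""

    def __init__(self, z_threshold=4.0, alpha=0.01, warmup=100):
        self.z, self.alpha, self.warmup = z_threshold, alpha, warmup
        self.mean, self.var, self.n = None, 1.0, 0

    def _update(self, reading):
        # Exponentially weighted estimates of baseline mean and variance.
        d = reading - self.mean
        self.mean += self.alpha * d
        self.var = (1 - self.alpha) * self.var + self.alpha * d * d

    def __call__(self, reading):
        self.n += 1
        if self.mean is None:          # first sample seeds the baseline
            self.mean = reading
            return False
        if self.n <= self.warmup:      # calibration phase: learn normal data
            self._update(reading)
            return False
        z = abs(reading - self.mean) / (self.var ** 0.5 + 1e-9)
        if z > self.z:
            return True                # candidate event: keep this reading
        self._update(reading)          # normal reading: keep adapting
        return False

rng = np.random.default_rng(1)
stream = rng.normal(10.0, 0.5, 5000)   # baseline detector noise
stream[2500] = 25.0                    # one rare, interesting event
trigger = AdaptiveTrigger()
kept = [i for i, r in enumerate(stream) if trigger(r)]
```

Unlike a pre-programmed threshold, the baseline here tracks slow drifts in the instrument while still flagging the rare outlier for retention.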
In recent years, the success of DL models has resulted in enormous computational workloads for
training AI models, representing a new genre of HPC resource demand. Here, the use of AI
techniques to optimize learning algorithms and implementation will be necessary with respect to
both the energy cost of large-scale computation and to the exploitation of new computing
hardware architectures. AI in HPC has already taken the form of neural networks trained as
surrogates to computational functions (or even entire simulations), demonstrating the potential for
AI to provide non-linear improvements of multiple orders of magnitude in time-to-solution for HPC
applications (and, correspondingly, reductions in their cost).
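The surrogate idea can be illustrated with a deliberately small example: train an inexpensive neural network on input/output pairs from a costly function, then evaluate the network in its place. The NumPy toy below uses an invented one-dimensional "simulation," not a real HPC code.

```python
import numpy as np

def expensive_model(x):
    # Stand-in for a costly first-principles calculation; a real case
    # might be a PDE solve taking minutes per evaluation.
    return np.sin(3.0 * x) + 0.5 * x

def train_surrogate(x_train, y_train, hidden=32, steps=3000, lr=0.05, seed=0):
    # Fit a one-hidden-layer tanh network by full-batch gradient descent.
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, (1, hidden)); b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, (hidden, 1)); b2 = np.zeros(1)
    x = x_train.reshape(-1, 1); y = y_train.reshape(-1, 1); n = len(x)
    for _ in range(steps):
        h = np.tanh(x @ w1 + b1)                 # forward pass
        err = (h @ w2 + b2) - y                  # prediction error
        gw2 = h.T @ err / n; gb2 = err.mean(0)
        dh = (err @ w2.T) * (1.0 - h**2)         # backprop through tanh
        gw1 = x.T @ dh / n; gb1 = dh.mean(0)
        w2 -= lr * gw2; b2 -= lr * gb2
        w1 -= lr * gw1; b1 -= lr * gb1
    return lambda q: (np.tanh(q.reshape(-1, 1) @ w1 + b1) @ w2 + b2).ravel()

x_train = np.linspace(-2.0, 2.0, 200)            # expensive runs, done once
surrogate = train_surrogate(x_train, expensive_model(x_train))
x_test = np.linspace(-2.0, 2.0, 50)
rmse = np.sqrt(np.mean((surrogate(x_test) - expensive_model(x_test)) ** 2))
```

In practice the surrogate is validated against held-out simulation runs, and uncertainty estimates determine when to fall back to the first-principles code.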
DOE computing facilities such as Summit, Perlmutter, Aurora, and Frontier will simultaneously support the development of existing large-scale simulations, new hybrid HPC models with AI surrogates, and the exploration of new types of generative models emerging from multimodal data
streams and sources. Future systems envisioned over the next decade may need to support even
richer workloads of traditional HPC and next-generation AI-driven scientific models.
AI will not magically address these and the other opportunities and challenges discussed in this
report. Much work will be required within all science disciplines, across science infrastructure, and
in the theory, methods, software, and hardware that underpin AI methods. The use of AI to design
and tune hardware systems—whether exascale workflows, national networks, or smart energy
infrastructure—will require the development and evaluation of a new generation of AI frameworks
and tools that can serve as building blocks that can be adapted and reused across disciplines and
across heterogeneous infrastructure. Bringing AI to any specific domain—whether it is nuclear
physics or biology and life sciences—will demand significant effort to incorporate domain
knowledge into AI systems, quantify uncertainty, error, and precision, and appropriately integrate
these new mechanisms into state-of-the-art computational and laboratory systems.
The overflowing attendance at the AI for Science town halls, the level of enthusiasm and the
engagement of attendees, the number of spontaneous AI projects throughout every scientific
discipline, and the commitment to growth in this area at the nation’s premier laboratories all
combine to indicate that the DOE scientific community is ready to explore and further the
transformational potential of AI through 2030 and beyond.
Introduction: AI for Science
The AI for Science town halls brought together more than a thousand researchers from DOE National Laboratories, industry, and academia to identify opportunities for AI to impact the national science enterprise supported by DOE. The teams also outlined the research and infrastructure needed to advance AI methods and techniques for science applications.

Sixteen topical expert teams summarized the state of the art, outlined challenges, developed an AI roadmap for the coming decade, and explored opportunities for accelerating progress on that roadmap.

Important themes emerged for AI applications in science. For example, participants anticipate the use of AI methods to accelerate the design, discovery, and evaluation of new materials, and to advance the development of new hardware and software systems; to identify new science and theories within increasingly high-bandwidth instrument data streams; to improve experiments by inserting inference capabilities in control and analysis loops; to enable the design, evaluation, autonomous operation, and optimization of complex systems from light sources to HPC data centers; and to advance the development of self-driving laboratories and scientific workflows.

Important themes also emerged with respect to outlining the research needed to advance AI. For example, participants highlighted the need to incorporate domain knowledge into AI methods to improve the quality and interpretability of the models; the need to develop software environments to enable AI capabilities to seamlessly integrate with large-scale HPC models; and the need to automate the large-scale creation of “FAIR” (findable, accessible, interoperable, and reusable) data, given the central role of data in an AI-centric future science landscape.

Below, we briefly outline the principal findings of the main sections of the report.

Materials, Environmental, and Life Sciences
Chapters 1–3

Finding new materials, chemical compounds, and biological agents able to address contemporary challenges—for example, batteries with 10 times more storage capacity, materials that capture more solar energy at greater efficiency, and new drugs targeting emerging pathogens—is a grand challenge due to the nearly infinite chemical, biological, and atomic design spaces to which scientists have access. Such discovery requires pervasive AI-enabled automation, from experiment design to execution and analysis.

Projecting environmental risk and developing resiliency in a changing environment are central challenges to earth and environmental sciences, encompassing atmosphere, land, and subsurface systems along with their interdependencies. From large-scale observatories such as the Atmospheric Radiation Measurement (ARM) facility, AI methods will be essential to obtaining the data needed to refine complex earth and environmental systems models, and to developing new models with unprecedented fidelity and resolution. AI “at the edge”—where people and things meet—will enable autonomous observatories to detect anomalies and outliers, adapting instrument settings and algorithms to provide detailed measurement of events and conditions that would otherwise go unnoticed.

Biology and life sciences are at the vanguard of AI applications, for instance using population genomics data to learn the bases of complex traits and discovering or building workflows that automate the inverse design of microbial and plant cells. “Self-driving” laboratories will
leverage new generative models and reinforcement learning to explore potential compounds for cancer drugs, evaluate their synthesizability, or model their response in target tumors.

Discovery and Data

Scientists have used computational approaches to virtually explore materials and chemical compounds, leveraging new data sources containing the simulated properties of millions of simple materials and chemical compounds. Deep learning approaches are being developed to explore more deeply inside vast molecular and biological design spaces. Molecular scientists are using AI to learn force fields to enable near-exact molecular dynamics (MD) simulations with fully quantized electrons and nuclei. Such analyses, intractable only a few years ago, must now be captured and advanced in the form of AI software toolkits and services.

Across the sciences, rapidly growing data sources can, in principle, be used to train ML models provided that the data can be “found, accessed, and are interoperable and reusable,” or “FAIR.” The use of DL and unsupervised learning for automatic labeling and reduction of data also needs to be captured as adaptable software services that can be applied to data sources ranging from environmental datasets at broad spatial and time scales, to instrument data from materials testing, to genomics data.

For life sciences, energy infrastructure sciences, and even national security, access is needed to protected, sensitive data. We must establish new infrastructure to enable shared use of data that cannot be moved or revealed due to privacy concerns. Similar challenges arise with respect to proprietary manufacturing, mobility, and private energy data.

Learning and Integrating Domain Knowledge

Today’s computational learning frameworks are not yet able to realize the full potential of AI-enabled materials, chemical, environmental, and biological sciences. We need new AI methods that can both predict complex phenomena and provide insights into underlying processes. Such methods will be foundational to our capacity to design custom biological systems capable of addressing major global health and environmental challenges—that is, ultimately to “build life to spec.” Here, as with materials design, AI-enabled, self-driving laboratories (through new automation and decision support services) can fuel game-changing advances in the understanding and deployment of biological, chemical, and environmental systems.

Self-Driving and Steering Laboratories

The most exciting discovery possibilities for emerging instruments such as for bio- or materials imaging lie in going beyond today’s human-in-the-loop experimentation, and allowing embedded AI to evaluate results and steer experiments.

AI-assisted management and control of research labs, instruments, facilities, experiments, and workflows can help achieve a variety of goals, for instance by adapting workflows in response to new hypotheses generated during workflow execution, scheduling resources for more efficient use of facility hardware, and dramatically reducing the total cost of operating facilities.

Experimental science is moving rapidly toward more frequent online analysis and adaptation. In “self-driving” laboratories, AI can be used not only for analysis and hypothesis generation, but also to act on intermediate results, adapting
to new data by adjusting experimental parameters or laboratory processes toward specific goals, such as protecting resources, maximizing the data gathered related to a specific phenomenon, or following up on surprising or anomalous results.

AI-guided self-driving laboratories are envisioned that can automate the design, synthesis, and evaluation of materials and increase the pace of discovery by orders of magnitude.

AI in HPC

Multi-scale models are needed to understand the underlying systems affecting phenomena associated with the growing global demand for fuel, food, water, and predictable weather. AI technologies can reveal the emergent controls of these enormously complex environmental, plant, and microbial biosystems, enabling us to engineer our environment, for instance to expand the range of arable lands while improving water availability and quality. In order to enable such discovery capabilities, we must not only improve the performance and quality of HPC models (e.g., using ML surrogates) but we must make it possible to build generative models from diverse observations (e.g., time series measurements) and computational simulations. This will need to be aligned with AI-based inverse problem solvers, such as for image-to-phase or waveform-to-source problems to explore novel geoengineered solutions.

Such simulation models represent another domain where AI is already showing transformative results. Time-to-solution of modeling systems and associated reduction in computational needs (and associated energy use) can be improved by combining data-informed AI approximations with physical principles for earth systems, ecosystems, soil microbiology, watershed, and other models. The use of such AI “surrogate” functions will require robust, explainable AI methods for training and validating hybrid models, and the integration of uncertainty quantification into AI workflows.

A secure environment for objective benchmarking of AI algorithms against community consensus metrics is needed to detect, monitor, and possibly correct dataset biases or inconsistent AI performance. Foundational technologies are needed to promote a rigorous statistical framework to monitor for potential biases or inaccuracies in collected data, and to monitor AI performance to confirm robust performance or identify performance gaps. These topics are detailed in Foundations, Software, Data Infrastructure, and Hardware (page 11).

High-Energy, Nuclear, and Plasma Physics
Chapters 4–6

In cosmology, high-energy physics, fusion, and nuclear physics, the next decade will bring new, enormous, and rich data sets from new light sources, accelerators, tokamak facilities, and advanced survey telescopes, unparalleled in depth and resolution at the observed scales. These observations will be combined with exascale-enabled simulations modeling structure formation in unprecedented detail to enable major scientific advances. ML, including DL, techniques will be crucial in the analysis of multi-spectral observational data sets. “AI-in-HPC” approaches to simulation that use fast AI-based surrogates will allow the reconstruction of the history of the universe from the Big Bang until today at unprecedented fidelity, from the largest scales down to our own galaxy.

The multiscale, highly correlated, and high-dimensionality nature of the physics of the nuclear force also leads to a rich set of phenomena in nuclear physics. AI techniques offer the possibility of increased understanding and new discoveries via DL analyses of light source experimental data, especially given recent and planned upgrades and resulting
increased data volumes and rates. Fusion scientists look to AI/ML techniques for breakthroughs ranging from maximizing predictive understanding of fusion plasmas and the burning plasma state to enabling real-time control in long-pulse tokamak experiments, and ultimately AI-in-the-loop plasma prediction and control solutions necessary for sustained, safe, and efficient fusion power plant operation.

Discovery and Data

In coming years, the global high-energy physics community will deploy AI-controlled, city-size scientific instruments (particle accelerators and particle detectors) that produce zettabytes of data. Similarly, high-bandwidth streams will come from new survey telescopes, upgraded light sources, and tokamak experiments. AI-powered hardware will be required to filter detector data in microseconds. AI inference systems trained by data and simulations of detector response will be needed to enable high-precision studies, while unsupervised AI-based searches for anomalies and rare events, indeed even for “New Physics,” will open new windows for discovery.

… and experimental design is the availability (to the general community) and the lack of uniformity of data. A significant need in the coming decade will be to develop ML methods to automatically annotate and structure data from computational models and experimental facilities such as the international ITER Tokamak, upgraded light sources such as the Advanced Photon Source (APS), and the Advanced Light Source (ALS).

Designing and Steering Experiments

The introduction of ML and AI into the scientific process for hypothesis generation and the design of experiments promises to significantly accelerate the scientific process by automating and accelerating the development of models and the testing of hypotheses. For this to become reality, domain knowledge must be integrated into ML models, moving beyond current models that are either purely data-driven or that incorporate only simple algorithms, laws, and constraints. ML techniques that combine theoretical and data-driven models in hybrid systems that better represent the underlying dynamics specific to phenomena will be especially key.
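A minimal caricature of such a hybrid theory-plus-data model is a fit that penalizes both data misfit and violation of a known physical law. In the linear least-squares toy below (all numbers invented; real physics-informed learning uses neural networks and PDE residuals), a cubic is fit to sparse, noisy observations of a decaying signal while a collocation term enforces the governing law dy/dx = -y:

```python
import numpy as np

rng = np.random.default_rng(3)

# Sparse, noisy observations of a signal whose true form is exp(-x).
x_obs = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
y_obs = np.exp(-x_obs) + rng.normal(0.0, 0.05, x_obs.size)

def design(x):
    # Features of a cubic model p(x) = c0 + c1*x + c2*x^2 + c3*x^3.
    return np.stack([np.ones_like(x), x, x**2, x**3], axis=1)

def physics_rows(x):
    # Rows encoding the residual p(x) + p'(x), which the law dy/dx = -y
    # says should vanish, written in terms of the cubic's coefficients.
    return np.stack([np.ones_like(x), x + 1.0, x**2 + 2*x, x**3 + 3*x**2], axis=1)

lam = 1.0                                  # weight on the physics term
x_col = np.linspace(0.0, 1.0, 21)          # collocation points for the law
A = np.vstack([design(x_obs), np.sqrt(lam) * physics_rows(x_col)])
b = np.concatenate([y_obs, np.zeros(x_col.size)])
coef, *_ = np.linalg.lstsq(A, b, rcond=None)

x_test = np.linspace(0.0, 1.0, 50)
rmse = np.sqrt(np.mean((design(x_test) @ coef - np.exp(-x_test)) ** 2))
```

The physics term acts as a regularizer: the fit is pulled toward functions consistent with the known dynamics even where observations are sparse or noisy.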
… intelligent instrument operation and experiment-steering. ML inference with microsecond latency will be required to support particle physics trigger applications in large detectors and associated event processing operations.

The use of AI for real-time experiment-steering will increasingly become indispensable, whether for light source instruments or tokamak experiments, and will become equally critical for orchestrating the coupling of cosmological models with the steering mechanisms of a new generation of multi-spectral telescopes.

Engineering, Instruments, and Infrastructure
Chapters 7–9 and 14–16

Terms such as “smart manufacturing” and “digital twins” reference transformational approaches for expanding optimization to include an entire manufacturing lifespan, from raw materials to shape/topology to manufacturing process to end use. Concurrently, AI has been used in generative design, a two-step iterative process based on design goals that first generates possible outputs that meet specified constraints and then allows a designer to tune variables to meet constraints. Generative adversarial networks are often used to drive the underlying optimal design.

The nation’s energy infrastructure is moving increasingly from traditional loads (non-digital, invisible) to many more and smaller loads that expose data (are visible) and have communication and intelligence features amenable to a cooperative load-management approach. Combined with increasingly intelligent energy distribution and generation infrastructure, the complexity, nonlinearity, and emergent behaviors of these systems will require AI-enabled, distributed and cooperative configuration, optimization, threat detection and avoidance, and control.

Designing and Steering Infrastructure

Just as AI will enable breakthroughs in automation (such as designing experiments, self-driving laboratories, or steering instruments), it will make it possible for the same techniques to be applied to designing and operating complex infrastructure. From electrical generation to transmission to distribution systems, increasingly powerful sensors—with edge computation enabling AI in-situ for anomaly detection, predictive analytics, and controls/optimization—will improve resilience as well as restoration by enabling predictive capabilities of after-event states and sharper awareness during the restoration process. AI-driven, real-time intelligence in this context can perform information fusion from disparate sources, coupling real-time infrastructure data with infrastructure models (e.g., a “digital twin”). Similarly, AI/ML-enabled predictive models trained by infrastructure data will be indispensable for exploring the design spaces for smart energy—as well as transportation—infrastructure, HPC computing systems and data centers, and communications networks.

In similar fashion, particle accelerators, light sources, and complex instruments such as ITER comprise many interconnected subsystems of magnets; mechanical, vacuum, and cooling equipment; power supplies; and other components. These instruments have thousands of control points and require high levels of stability, making their operation a complex optimization problem. The operation of these instruments has benefited from AI/ML-based solutions but remains extremely difficult due to the lack of a priori models for reliable and safe control. In the absence of such models, learning models based on raw data and other AI/ML-based solutions have been explored, with promising results.

Even smaller scales, such as manufacturers of limited volume batches of materials and those
that produce many variants of similar designs for customized products, are limited by mere automation with heuristics-based operational rules on robotic assembly lines. As with self-driving laboratories, this widespread class of manufacturing must move to robotics with AI-at-the-edge to perform tasks autonomously (in similar fashion, as noted earlier, with respect to self-driving laboratories or remote observatories).

These data-driven methods for control-level modeling, management, and interpretation of real-time data for control, optimal trajectory determination, and real-time prediction to support continuous and asynchronous actions and prevent faults will also accelerate the development of approaches to the operation of new types of infrastructure such as fusion power plants.

DOE also operates instruments with components distributed over distances of hundreds of kilometers (e.g., ESnet or the ARM facility). Moving to autonomy and adaptive measurement makes the current practice of centralized control intractable. Whether in laboratory experiment lines, on city-sized accelerator facilities, or for continental-scale infrastructure, AI will be needed to support infrastructure as autonomous, self-tuning, and self-healing complex systems with emergent properties and non-linear behavior, relying on AI-at-the-edge due to complexity as well as latency and data communications bandwidth.

Commercial AI hardware and system-on-chip (SoC) systems also have a key role to play, given DOE’s billions of dollars of investment in experimental facilities. Ultra-low latency and low power inference for scientific experimental control in these facilities can enable more complex, intelligent experiments, and more efficient operation. Again, co-design and overall system architecture are critical, as even the most time-sensitive commercial applications fueling the AI hardware industry, such as autonomous driving, require millisecond response, while DOE instruments such as electron microscopes and light sources can require responses in the 100 nanosecond range—over 100,000 times faster.

All of DOE’s current scientific facilities—ESnet, exascale machines, the continentally distributed ARM facility, individual light sources, data sources from field-deployed sensors, and instrument and HPC data repositories—have been designed for traditional scientific workflows. Every link in this chain, from data portals and networks to edge systems, HPC resources, and input/output (I/O) systems, must evolve to support the new demands of AI applications and workflows.

Infrastructure Security

As critical infrastructures increasingly rely on information systems, AI applications will offer the best approach to detecting and diagnosing cyber and physical attacks and threats in real time. Removing the human-in-the-loop is increasingly necessary for defensive responses on the same millisecond timescales as digital attacks. Here AI can offer novel techniques, including surrogate models, closure models, and learning-driven compute acceleration of high-fidelity models and solvers.

AI in HPC

As noted above, use of AI surrogates within HPC models has the potential to improve time-to-solution by orders of magnitude, albeit replacing first-principles functions with approximations. AI-based surrogate models can play at least three roles in manufacturing systems, including a priori optimization, in situ real-time process control, and heterogeneous manufacturing through the transfer of AI models between different devices and/or feedstocks.

With infrastructure and manufacturing, surrogate models could form the basis for digital twins that guide design and operation. Determining the best AI techniques to generate and validate surrogates that are robust and
with minimal bias will be important, along with research to explore, for at least several exemplar manufacturing and infrastructure processes, the optimal type and quantity of data to improve design optimization.

… optimization, and control services while also providing data that can use new AI-based services for creating and refining generative models that can guide the optimization and safe operation of the instruments themselves.
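The closed-loop use of a surrogate or digital twin to steer a process toward a design goal can be sketched as follows. The response surface, parameter ranges, and noise level are invented for illustration; a real loop would drive instrument hardware or a validated twin.

```python
import numpy as np

rng = np.random.default_rng(42)

def run_process(temperature):
    # Stand-in for a physical process or its digital twin: a noisy yield
    # curve with a hidden optimum (the response surface is invented).
    true_yield = np.exp(-((temperature - 340.0) / 25.0) ** 2)
    return true_yield + rng.normal(0.0, 0.02)

def steer(n_rounds=30, lo=250.0, hi=450.0):
    # Closed loop: coarse sweep, then concentrate trials near the best
    # result seen so far - a bare-bones stand-in for Bayesian optimization.
    temps = list(np.linspace(lo, hi, 6))
    results = [run_process(t) for t in temps]
    for _ in range(n_rounds - len(temps)):
        best = temps[int(np.argmax(results))]
        t = float(np.clip(best + rng.normal(0.0, 15.0), lo, hi))
        temps.append(t)
        results.append(run_process(t))
    return temps[int(np.argmax(results))]

best_temp = steer()   # concentrates trials near the hidden optimum
```

Each round spends one process run; the loop trades a fixed experimental budget for increasingly informative trials, which is the economics digital twins are meant to improve.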
New AI Hardware and Systems Components

There is an explosion of new AI hardware in industry; however, the target applications driving these devices largely comprise consumer or enterprise areas such as autonomous driving, social networks, e-commerce, and gaming. As evidenced in DOE's Exascale Computing Project, there are significant opportunities to co-design heterogeneous compute nodes that leverage these new architectures and commodity SoC ecosystems.

A set of integrated new AI workflow frameworks and exemplar applications will be needed to evaluate emerging AI architectures from edge SoCs to HPC data centers. This would effectively create both an evaluation tool set and a simultaneous series of specific science-based challenges to drive and shape new AI technologies, including those that fuse explicit knowledge and learned function.

Programming Models and Workflows

The design of next-generation hardware and software systems—from new chips to entire HPC systems—and the mapping of application codes to target systems is currently a static process that involves human-in-the-loop design with repeated experiments, modeling, and design space exploration. As these systems increase in complexity and heterogeneity, current strategies will be impractical.

Early work demonstrating systems and workflows that integrate AI capabilities with traditional HPC simulation has largely involved bespoke capabilities for each experiment. The frameworks, software, and data structures are distinct, and APIs do not exist that would enable even simple coupling of simulation and modeling codes with AI libraries and frameworks. In situ data analysis requiring ML capabilities suffers from the same limitations.

To fully realize capabilities ranging from self-driving laboratories to AI-designed, implemented, and operated scientific workflows, new programming and run-time models must also be developed. For example, scientists might ideally describe workflows as high-level goals and itemize building-block tasks (i.e., experiments, simulations) and rough models of the costs of those tasks. An AI system could then generate a specific workflow, incorporating expert knowledge, to accomplish those tasks, adapting as results are uncovered or new data become available and refining the models of costs (e.g., in energy use or time).

Such workflows will need to operate across orders-of-magnitude variations in communications latency and bandwidth and in computational power and storage, especially in cases of specialized edge devices designed for low-power deployments in the field. These programming frameworks will need to provide resource discovery, matching, negotiation, and complex optimizations of these new forms of heterogeneous distributed computing infrastructure, including the integration of inference on low-power edge systems with iterative learning systems within a few milliseconds of the edge (e.g., in 5G telecommunications stations) and deep learning in data centers.

Current HPC memory and storage systems are architected for traditional HPC simulation-only workloads with relatively small inputs and large outputs, where the access patterns are predictable, contiguous, block-based operations. Current AI training workloads, in contrast, must read large datasets (i.e., petabytes) repeatedly and perhaps non-contiguously for training. AI models will need to be stored and dispatched to inference engines, which may appear as small, frequent, random operations. Indeed, the model for computing within DOE will need to evolve to one where specialized AI hardware cooperates with traditional HPC systems to train models that are dispatched to low-power devices at the edge.
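For illustration only, the goal-directed planning loop described above, in which building-block tasks carry rough cost models and a planner adapts as costs are refined from observed runs, might be caricatured in a few lines of Python. All task names, cost figures, and value scores here are invented for the sketch:

```python
# Hypothetical building-block tasks with rough a-priori cost estimates
# (seconds) and an assumed "expected value" score; entirely illustrative.

def pick_next(tasks):
    """Greedy planner step: choose the task with the highest expected
    value per unit of estimated cost."""
    return max(tasks, key=lambda t: tasks[t]["value"] / tasks[t]["est_cost"])

def refine_cost(tasks, name, observed_cost, alpha=0.5):
    """Refine the cost model from an observed runtime (exponential
    moving average), as results are uncovered."""
    est = tasks[name]["est_cost"]
    tasks[name]["est_cost"] = (1 - alpha) * est + alpha * observed_cost

tasks = {
    "run_simulation": {"est_cost": 120.0, "value": 8.0},
    "queue_experiment": {"est_cost": 600.0, "value": 30.0},
    "analyze_archive": {"est_cost": 30.0, "value": 1.2},
}

first = pick_next(tasks)                         # best value/cost tradeoff
refine_cost(tasks, first, observed_cost=240.0)   # the run was slower than modeled
second = pick_next(tasks)                        # the planner adapts its choice
```

A production planner would also handle task dependencies, negotiate for resources, and model energy as well as time, but the adapt-as-you-observe loop is the essential ingredient.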
AI Foundations

AI presents a unique opportunity for creating data-driven surrogate models that are potentially orders of magnitude faster to run than first-principles simulation codes and that can be particularly effective in simulating physical processes that span many spatial and temporal scales. Rigorously understanding tradeoffs such as generalization limits, proofs of interpolation/extrapolation, robustness, assessment of the confidence associated with predictions, and effects of the input data will impact not only model selection in AI systems, but also the creation and investigation of new classes and types of models.

At the most basic level, frameworks and tools are needed to establish that a given problem is effectively solvable by AI/ML methods and is not subject to limits such as extreme complexity, unbounded problems, or explainability. Principles of theoretical computer science provide a rigorous framework to establish critical properties of AI/ML codes, namely computability, learnability, explainability, and provability.

To become an accepted part of the toolboxes used by scientists and engineers, the validity and robustness of AI techniques need to be trusted. What are the limits of AI techniques, and what assumptions and circumstances can lead to establishing assurance of AI predictions and decisions? Which AI techniques can best address different sampling scenarios and enable efficient AI on various computing and sensing environments? Resulting AI systems must similarly address assurance: whether and when an AI model can be trusted. Why does the AI model work for a problem? What are the internal representations of data that the AI model has learned during training? How can the behavior of the AI model be explained? How confident are the AI models in their predictions, given the different sources of uncertainties and inductive biases involved? For such an AI model to be accepted as a well-characterized tool for science, the research community will need to address these questions and develop advanced capabilities to explain the behavior of the AI model.

Especially for systems operating experiments, instruments, or critical infrastructure, validation of whether the AI model is making the right decision for the right reason is vital. Has the AI model learned spurious correlations, or can the model determine the control variables? Can AI be used to identify causal variables or distinguish between cause and effect? Typically this cannot be done with a single training dataset. Instead, the AI model needs to be trained to construct a hypothesis, typically a counterfactual one, and to design an experiment—including the collection of data (and the suitability of that data)—to test that hypothesis.

Opportunities exist for fundamental advances in optimization algorithms, differentiation techniques, and models—all foundational to training in AI. Additionally, an important aspect of the development and application of AI is the quantification of uncertainties. Where AI and ML are used in physics-based applications, established approaches to uncertainty quantification (UQ) are applicable. In other cases, particularly in classification problems, ML models tend to be highly nonlinear systems that are extremely sensitive to input data, and small (e.g., undetectable to the human eye) changes can lead to misclassification.

Addressing these computer science challenges will require a comprehensive AI/ML science program to develop and refine foundational limits and solvable problems and to sharpen the solutions for solvable classes so as to ensure effective computation, performance guarantees, and explanations. This is an urgent issue, as work on the foundations of AI and ML has been far outpaced by the empirical exploration and use of such techniques—often in the form of bespoke systems with disparate architectures. Consequently, the principles underlying the use and understanding of these and other
techniques tend to be scattered across disciplines, from theoretical computer science to signal processing to statistics.

Discovery and Data

Accelerating science, engineering, and manufacturing through AI methods requires large and diverse sources of data. At the same time, AI may hold the key to overcoming the limitations associated with that data. That is, applying data sources—from instruments, simulations, sensor networks, satellites, the scientific literature, and research results—is inherently challenging with respect to data being "FAIR" (findable, accessible, interoperable, reusable). AI systems can be employed to automate the creation of FAIR data and integrate it into knowledge repositories, in turn providing the architectural basis for the new data infrastructure necessary to accelerate AI training and model development.

This high-volume data acquisition not only extends the end-to-end experimentation time but also limits experiments with time-sensitive phenomena. Smart data reduction techniques (e.g., filtering relevant data or point-of-interest data acquisition) will be necessities rather than features with upcoming instruments such as those mentioned earlier.

Data produced by instruments, manufacturing systems, or engineered products (e.g., vehicles) often cannot be shared due to regulations (e.g., medical records or energy usage data) or the competitive nature of the data (e.g., factory or mobility data). AI-based federated learning techniques can accelerate model development, for instance by harnessing proprietary manufacturing data from multiple sources. These techniques enable the development and training of models with data from many sources without requiring data sharing among them.

AI-based data services that leverage the success to date of new DL and unsupervised learning techniques are vital to designing and operating increasingly large-scale, complex infrastructure. These services will, in turn, require AI-based functions that can integrate and augment multimodal data sources, including metadata such as scientific instrument responses (e.g., flux and focus), in combination with a record of instrument configurations (e.g., motor positions, neutron chopper phases, monochromator bending parameters) and measurable instrument and environmental parameters (e.g., ring current, cooling water flow, and temperature). The integrated data will underpin AI services for developing the generative models and decision-making functions that will be required to build advanced predictive models of accelerators, end stations, and sample delivery systems. Such services and models will also aid in the automated alignment and calibration of instruments, stabilizing user operations, predicting and preventing catastrophic failures, and/or reducing the total downtime of the instrument.

While the infrastructure and methods needed to enable AI methods to access, learn from, and add to the broad body of knowledge are nascent, there are promising examples, such as the use of reinforcement learning, unsupervised learning, and classification techniques to automate the labeling and creation of metadata.

Conclusions

Realizing the scientific capabilities discussed throughout this report will require extensive co-design work among domain scientists, facility designers, AI experts, mathematicians, computer scientists, and software research teams. Across the 16 chapters are scientific requirements that suggest a suite of new AI-capability building blocks and services, from design to control, augmented simulations to generative models, decision making to inverse problems, and the ability to learn not only from multimodal data (e.g., text, graphics, images,
waveforms, structured, time series) but from the domain knowledge embodied in the scientific literature.

To achieve the grand challenge of developing self-improving and self-adaptive hardware-software systems and applications, the services, applications, and software infrastructure must be both grounded in mathematical and AI foundations research and also implemented, evaluated, and adjusted over the coming decade. While this report is not a detailed implementation plan, we can see possibilities for how to accelerate the opportunities identified by the community. One potential path is to partner with industry along at least two roadmaps.

The first is an "Instrument-to-Edge" activity that charts the course toward common tools and services for instrument, experiment, and infrastructure design, evaluation, optimization and steering, and safer operation across the DOE enterprise.

The second entails continuing efforts to advance a leadership computing, data, and analysis infrastructure that fully exploits and optimally supports new, AI-enabled software, data lifecycle, workflow, and modeling services and toolkits.

DOE's programmatic approaches, such as the co-design or SciDAC programs, are ideal for developing the new AI services—packaged and supported as reusable toolkits and building blocks—that are required for self-driving laboratories and for steering scientific instruments. AI-based components including design, decision-making and evaluation, control and optimization, or the creation of generative models from instrument data and simulations are necessary to move from "AI has potential for…" to "AI is enabling…".

International leadership in AI over the coming decade will hinge on an integrated set of programs across four interdependent areas—new applications, software infrastructure, foundations, and hardware tools and technologies—feeding into and informed concurrently by DOE's scientific instrument facilities and by DOE's leadership-class computing infrastructure.
01. Chemistry, Materials, and Nanoscience
The ability to design and refine materials and chemical compounds has always been key to the rapid advancement of society's technology and infrastructure. Today's complex technologies impose a broad spectrum of requirements when developing and optimizing materials and chemicals with desired performance [1–3], such as mechanical, electronic, optical, and magnetic properties (e.g., smartphones use up to 75 different elements, compared to only ~30 in their twentieth-century predecessors). This new level of technological complexity, combined with the need to search undiscovered areas of the chemical and materials landscape without clear theories or synthesis directions [4], requires new paradigms that utilize artificial intelligence (AI).

AI will become an integral part of a scientist's arsenal, alongside pen and paper and experimental and computational tools. It will accelerate the next scientific discoveries and the design and development of revolutionary technologies benefiting society. AI will identify both promising materials and chemicals and the reaction pathways to make them [5]. Scientists will use AI to generate scientific data in a rational way, formulating new physical models and theoretical insights that drive new paths for the rational design of materials and chemicals, and exploring atomic design spaces currently unimaginable.

1. State of the Art

Our ability to discover new materials and chemical reactions is driven by intuition, design rules, models, and theories derived from scientific data generated by experiments and simulation. The number of materials and chemical compounds that can be derived is astronomical, so finding the desired ones can be like looking for a needle in a haystack. Currently, various machine learning (ML) approaches are used to help scientists explore complex information and data sets with the goal of gaining new insights that lead to scientific discoveries. Future discoveries of advanced materials could be greatly accelerated through ML. Note, for example, the timeline from the discovery of LiMn2O4 to nickel-manganese-cobalt (NMC) materials for batteries. Using known data, we could use ML to accelerate the discovery of new material classes for batteries from 14 years to less than 5 years (Figure 1.1).

Nowadays, experimental characterization tools routinely provide picometer/picosecond-resolved images at an ever-increasing rate and, when coupled with a modern camera, are capable of providing several hundred frames per second. This pushes the data size into the several hundreds of terabytes (TB) per experiment for a single microscope [6]. Real-time analysis of this data, aided by AI, is
Figure 1.1 Timeline from discovery of LiMn2O4 to NMC materials for batteries.
we search the chosen space in the most efficient way or decide to move on to other areas? Can we develop new design rules? Aiding this would be the ability to understand the length- and time-scale evolution of functional chemical and materials systems.

The primary challenges are concisely described by BESAC's 2015 report, Challenges at the Frontiers of Matter and Energy: Transformative Opportunities for Discovery Science:

• Mastering Hierarchical Architectures and Beyond-Equilibrium Matter
• Beyond Ideal Materials and Systems: Understanding the Critical Roles of Heterogeneity, Interfaces, and Disorder
• Revolutionary Advances in Models, Mathematics, Algorithms, Data, and Computing
• Harnessing Coherence in Light and Matter
• Exploiting Transformative Advances in Imaging Capabilities across Multiple Scales

Specifically, gaps/challenges that need to be addressed by AI/ML are listed below.

Design metastable phases and materials that persist out of equilibrium. These materials enable access to a diversity of properties beyond the limits drawn by equilibrium thermodynamics. For example, optically driven processes could provide more control over chemical processes and lead to new materials, such as metastable phases or new low-dimensional materials with dynamics controlled by in-plane heterogeneity rather than layer stacking order. Another example is self-assembly, where transient (non-equilibrium) intermediate states frequently appear, and control of assembly pathways can enable improved structural control. Modern characterization systems such as electron and scanning probe microscopies may allow "bottom-up" fabrication of new metastable structures, which allows arrays of, for example, topological defects to be created with nanometer precision for desired properties. The challenge is to do this in an efficient and reproducible fashion; this requires in-line analytics and feedback on very high-velocity and high-volume data streams.
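As a toy illustration of such in-line analytics, the following Python sketch applies a rolling-statistics filter to a synthetic data stream, keeping only readings that deviate strongly from the running baseline. The window size, threshold, and signal are invented for the sketch; real instrument pipelines operate on full image frames, not scalars:

```python
from collections import deque
import statistics

def reduction_filter(stream, window=20, threshold=4.0):
    """Keep only readings that deviate strongly from a running baseline;
    everything else is discarded before it ever leaves the instrument."""
    history = deque(maxlen=window)
    kept = []
    for i, value in enumerate(stream):
        if len(history) == window:
            mean = statistics.fmean(history)
            sd = statistics.pstdev(history) or 1e-9
            if abs(value - mean) / sd > threshold:
                kept.append((i, value))
        history.append(value)
    return kept

# Synthetic detector channel: a small periodic wobble around a flat
# baseline, with one transient event of interest at frame 500.
signal = [1.0 + 0.01 * ((i * 37) % 7 - 3) for i in range(1000)]
signal[500] = 5.0
events = reduction_filter(signal)
```

The filter reduces a thousand frames to the single event worth keeping, which is the essence of point-of-interest data acquisition at the edge.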
Efficient materials, chemical, and device characterization is a critical element of the scientific discovery workflow. As such, characterization capabilities are constantly used for the determination of chemical composition, structure, physical properties, and overall functionality. In general, this involves (1) an analytical step to confirm that the target chemicals and/or materials are produced; (2) characterization of the physical properties, morphologies, defects, and interfaces of the functional materials and chemicals by multiple probes/techniques; and (3) characterization of the functional properties, in situ/operando, in devices. This means it will require new analysis across all of these platforms, including registration of data from different instruments (e.g., pan sharpening) and scaling for

In the next decade, all the upgrades to DOE's light sources will be completed, alongside the proton power upgrade at the neutron source. Thus, there will be significant advances and new information in the following areas.

New data sets/instruments online. There will be a continued increase in the capabilities of detectors/cameras alongside accelerators that will lead to a tremendous increase in potentially high-quality information from microscopes and light sources. Those instrument advances will provide extreme volumes and velocities of data that contain deep information regarding materials/chemistry processes, alongside a modality that enables manipulation and control of the materials.
Figure 1.4 Schematic illustration of the elements of experiments and computations that are
required to enable autonomous-smart experiments for materials/chemical design/synthesis.
02. Earth and Environmental Sciences
Earth and Environmental Sciences addresses some of the most pressing challenges in the nation, from natural resource utilization to maintaining our infrastructure and environment. In particular, recent events have highlighted the fact that our society is vulnerable to increasingly frequent natural hazards, including wildfire, drought, and extreme precipitation events (Figure 2.1). An urgent need exists for improving our predictive capabilities for earth and environmental systems, including the physical, chemical, and biological processes that govern the complex interactions among the land, atmosphere, subsurface, and ocean components, from molecular to global scales and from daily to decadal time scales.

Figure 2.1 Billion-dollar weather and climate disasters for the year 2018 [32].

In recent decades, Earth observation capabilities have been revolutionized, based on a suite of novel sensor, analytics, and telecommunication technologies. In particular, DOE has pioneered integrated observational capabilities at the laboratory scale (e.g., EMSL, SNS, ALS) and at field scales (e.g., NGEEs, ARM), as well as developed systems biology databases (e.g., KBase) and data archives (e.g., ESS-DIVE and ESGF). We now have access to several hundred petabytes of observational data of the Earth system in the U.S. alone, much of it in real time. In parallel, predictive modeling capabilities have advanced significantly to simulate complex Earth systems, facilitated by HPC capabilities. Together, these vast observation and simulation data offer unique opportunities to apply AI approaches for improved understanding and scientific discovery in Earth and environmental sciences. AI methods offer the promise to accelerate development of advanced tools and the next generation of technology for assimilating observations and data-driven forecasting.

1. State of the Art

Applications of AI methods for Earth, environment, and climate research are in their infancy, but interest is growing rapidly as our ability to collect and create data outpaces our ability to assimilate, interpret, and understand it [24,3]. Primary applications include (1) knowledge discovery and estimation; (2) data assimilation and data-driven models; (3) model emulators; and (4) hybrid process-/ML-based models that integrate process-scale data. Artificial neural networks (ANNs) and deep neural networks (DNNs) have been widely used for producing weather forecasts (e.g., [8]), spatiotemporal gap filling (e.g., [13]), and various remote sensing and geophysical image processing and analysis tasks [3,21]. Random forest (RF) methods are widely used to understand and interpret complex environmental data [1], as well as to estimate environmental parameters such as soil properties at the global scale [12]. In addition, unsupervised learning and clustering methods have been used to discover key spatiotemporal patterns in large remote sensing and simulation datasets (e.g., [14]).

More recently, increasing interest in ML applications has fueled development of emulators for environmental process models, particularly in the subsurface and atmospheric sciences (e.g., [25,17,29]). New parameterizations based on ANNs have been developed for representing stochastic
Figure 2.2 There are many ways in which environmental conditions and changes in the environment affect energy systems [33].
Figure 2.6 Hybrid approach that combines AI with physical understanding to address some of the black-box issues and make the models physically consistent: (a) shows a multilayer neural network, with n the number of neural layers and m the number of physical layers; (b) and (c) are concrete examples of hybrid modelling: (b) prediction of sea-surface temperatures from past temperature fields; (c) a biological regulation process (opening of the stomatal 'valves' controlling water vapor flux from the leaves) modelled with a recurrent neural network [24]. Hybrid models are useful for replacing poorly understood or unresolved (sub-grid-scale) phenomena. Challenges include (a) obeying physical constraints; (b) quantifying uncertainties in the parameters of the network models; and (c) developing methods for adding explanation to the network models and parameters. Training hybrid models using offline or online methods also needs exploration.
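Challenge (a) above, making a learned model obey a physical constraint, can be illustrated with a deliberately simple hypothetical: fitting a linear flux model by gradient descent with a penalty term that encodes the constraint f(0) = 0. The data, the sensor bias, and the penalty weight are all invented for the sketch:

```python
import random

random.seed(1)

# Toy flux data: the truth is y = 2x, but the "sensor" adds a constant
# bias, so a purely data-driven fit learns a nonzero intercept even
# though physics says the flux must vanish at zero driving force.
xs = [i / 10 for i in range(1, 11)]
ys = [2.0 * x + 0.3 + random.gauss(0, 0.02) for x in xs]

def fit(lam, steps=20000, lr=0.01):
    """Least-squares fit of y = a*x + b by gradient descent, with a
    penalty lam * b**2 encoding the physical constraint f(0) = 0."""
    a = b = 0.0
    n = len(xs)
    for _ in range(steps):
        ga = gb = 0.0
        for x, y in zip(xs, ys):
            r = a * x + b - y
            ga += 2 * r * x / n
            gb += 2 * r / n
        gb += 2 * lam * b          # gradient of the physics penalty
        a -= lr * ga
        b -= lr * gb
    return a, b

a_free, b_free = fit(lam=0.0)      # black-box fit absorbs the bias into b
a_phys, b_phys = fit(lam=10.0)     # constrained fit keeps b near zero
```

The same penalty idea carries over to neural networks, where conservation laws or boundary conditions are added to the training loss rather than enforced exactly.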
potential problems (see Chapter 10, AI Foundations and Open Problems). These techniques will allow for optimization of the operations of these complex systems to prevent or mitigate the impact of certain faults and to accelerate the return to normal operations if a fault occurs, increasing the instrument science output.

ii) ML inference with microsecond latency in particle physics trigger applications. At the HL-LHC, each detector will produce petabytes of detector data per second. The experiments will rely on a "trigger" system built from custom hardware, plus FPGA, CPU, and GPU processors, to reduce these data rates to a more manageable 10 GB/s. The first level of this trigger system will reduce the detector data rate by three orders of magnitude in 10 microseconds or less. The challenge is to do that without throwing away any collision event resulting from rare or new physics processes. AI advances will allow us to detect and preserve these precious events that would otherwise be lost forever, while still meeting the stringent data rejection and latency requirements. Advances in AI model architectures and in the use of inference hardware (e.g., FPGAs) will be needed (see Chapter 13, Hardware Architectures).

iii) AI-enabled, ultra-fast event processing chain. Over the next decade, accelerator facilities—such as the High-Luminosity Large Hadron Collider (HL-LHC) and the Deep Underground Neutrino Experiment (DUNE)—will transform high energy physics. These facilities will be precise and powerful tools that will enable both the discovery of new particles and in-depth studies of known particles and fundamental interactions. They will produce hundreds of petabytes of raw data every year, and exabytes of simulated and secondary data streams. These data volumes will preclude straightforward extensions of current approaches to detector data analysis. Collider physics can be described as a massive inverse problem, requiring techniques from data merging, data visualization, and large-scale inference, first to "deconvolute" the detector signals from the thousands of particles traversing the detector, and then to reconstruct the primary collision event from the particle measurements.

Key to the success of detector deconvolution and, in general, to the analysis of any particle detector dataset is the availability of accurate, high-statistics simulations of the detector response to particle traversal. Currently, high-accuracy detector simulation is performed using the Geant4 toolkit. As an example, the availability of datasets with trillions of simulated collision events could significantly increase the sensitivity of precision measurements in the Higgs and W boson sectors at the HL-LHC and help provide the first evidence for physics beyond the Standard Model. Simulating a collision event at HL-LHC can take up to
05. Nuclear Physics
The nature of matter is the fundamental question in nuclear physics: what are the basic components of matter, and how do they interact to form the elements that make up our universe? This question is not limited to familiar forms of matter but also includes exotic forms, such as those that existed in the first moments after the Big Bang and those that exist today inside neutron stars. In addition to the fundamental questions of how and why matter takes on specific forms, it is also important to understand how that knowledge can benefit society in the areas of medicine, nuclear energy, and national security. Nuclear experiments include a range of devices, from small- and intermediate-scale devices to very large detector programs at accelerator laboratories like the Relativistic Heavy Ion Collider (RHIC) at Brookhaven National Laboratory (BNL), the Continuous Electron Beam Accelerator Facility at Thomas Jefferson National Accelerator Facility (Jefferson Lab), and the Argonne Tandem Linac Accelerator System at Argonne National Laboratory (Argonne). Nuclear physicists also lead experiments at other user facilities such as the Large Hadron Collider (LHC) at the European Organization for Nuclear Research (CERN) (Figure 5.1), the Japan Proton Accelerator Research Complex (J-PARC), and the Spallation Neutron Source (SNS) at Oak Ridge National Laboratory (ORNL).

The Nuclear Physics Long Range Plan identifies the priorities for the field. These are:

• Utilize investments in accelerators, detectors, and computational infrastructure.
• Develop a U.S.-led, ton-scale neutrinoless double beta decay experiment.
• An electron-ion collider is the highest priority for new facility construction.
• Invest in small and mid-scale projects and initiatives enabling forefront research, including theory.

Applications of nuclear physics for societal benefit are also important.

The multiscale, highly correlated, and high-dimensional nature of the physics of the nuclear force leads to a rich set of phenomena in nuclear physics. AI techniques offer the possibility of increasing our understanding of this physics and making new discoveries through a number of applications, detailed here.
Nuclear binding energy, for example, is an essential property for understanding the production of nuclear species in astrophysical events such as supernovae and neutron star mergers. Some relevant binding energies cannot be measured directly and rely on nuclear models. Supercomputer calculations based on fundamental theory provide our best predictions for these binding energies and other important nuclear properties, but to reach the needed precision, these calculations become very computationally expensive. A team led by researchers from Iowa State University and Lawrence Berkeley National Laboratory (LBNL) developed a DL approach using a neural network trained with state-of-the-art supercomputer calculations [5]. The trained network estimates binding energies and other properties with precision beyond expectations from the available calculations. The researchers validated their approach by demonstrating consistency with available analytic and phenomenological extrapolation tools.

Analysis of the very complex data sets from heavy-ion collisions at RHIC and the LHC already benefits from AI. Deep neural networks can connect specific moments of the complex particle correlations inside jets of hadrons with properties of the quark-gluon plasma produced in the collision—in ways not previously predictable [8].

The GlueX experiment at Jefferson Lab utilizes a high-intensity photon beam and a large-acceptance particle detector to search for exotic hadrons. Individual collisions are reconstructed from fine-grained detector systems. A key use case of ML at GlueX thus far is in filtering those events containing rare reactions. GlueX demonstrated that boosted decision trees achieved the required performance [7]. Another recent development in GlueX is a system of data quality monitoring that uses ML to evaluate images of data-quality histograms in real time to identify problematic regions of the detector during the experiment's operation.
Experimental groups in all areas of nuclear physics are using AI techniques to characterize features in their data more quickly, more efficiently, and with increasing sensitivity.

Experimental nuclear astrophysicists use the MUlti-Sampling Ionization Chamber (MUSIC) detector at Argonne to study the fusion of nuclei in stars and to understand explosive stellar phenomena such as Type I X-ray bursts and superbursts. Standard data analysis techniques require months to select relevant events. Data that were previously analyzed

It is now known that neutrinos have mass. However, it is not known whether the neutrino is a Dirac or a Majorana particle (i.e., one for which the neutrino and the antineutrino are the same particle). To answer this question, nuclear physicists search for the lepton-number-violating process of neutrinoless double beta decay, wherein two neutrons in an atomic nucleus are transformed into two protons without the usual emission of two antineutrinos. In such searches, it is paramount to differentiate a very small signal from background events that occur at rates orders of
Figure 5.2 (left) 3D rendering of double beta decay-like data in a high-pressure TPC-type detector. (right, a and b) Simulated neutrinoless double beta decay interactions (a) have two Bragg peaks, while energetically indistinguishable background events have just one (b). The a and b images are 2D projections [9].
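The two-peak signature in Figure 5.2 invites a deliberately simplified illustration: counting Bragg peaks in an idealized one-dimensional energy-deposition profile along the reconstructed track. Real analyses work on full 3-D reconstructions, increasingly with ML classifiers, and all numbers here are invented:

```python
def count_bragg_peaks(profile, height=2.0):
    """Count local maxima above `height` in a 1-D energy-deposition
    profile along the reconstructed track."""
    peaks = 0
    for i in range(1, len(profile) - 1):
        if (profile[i] > height
                and profile[i] >= profile[i - 1]
                and profile[i] > profile[i + 1]):
            peaks += 1
    return peaks

# Idealized dE/dx profiles (arbitrary units): a double beta decay
# candidate deposits sharply at both track ends; a single-electron
# background event does so at only one end.
signal = [0.5, 3.0, 1.0, 0.8, 0.8, 1.0, 3.2, 0.5]
background = [0.5, 0.6, 0.8, 0.9, 1.0, 1.2, 3.1, 0.5]

def is_candidate(profile):
    return count_bragg_peaks(profile) == 2
```

The hard part in practice is that detector noise, multiple scattering, and projection effects blur exactly this topology, which is why learned classifiers outperform simple peak counting.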
Nuclear astrophysics simulations—including core-collapse supernovae, X-ray bursts, and neutron star mergers—continue an inexorable march towards higher computational intensity, as increased physical fidelity is realized using higher spatial resolutions, longer physical times, and more complete microphysical descriptions. Anomaly detection for these very expensive (i.e., of order tens of millions of LCF node-hours) calculations becomes essential to ensure that scarce computational resources are not consumed in error. In addition, many of the requisite microphysics in these simulations (e.g., neutrino-matter interaction rates, thermonuclear reaction rates, and high-density equations of state) are recovered via the use of high-dimensional interpolation tables. ML techniques such as Gaussian process models and deep neural networks can replace traditional interpolation techniques while providing superior robustness.

When completed in 2022, the Facility for Rare Isotope Beams (FRIB) will be the world's most powerful rare isotope research laboratory. By producing intense beams of nearly 80 percent of the predicted isotopes for elements up to uranium, FRIB will enable researchers to make major advances in the structure, stability, and limits of nuclear matter, as well as in their interactions and decays (Figure 5.4).

Figure 5.4 The Facility for Rare Isotope Beams (FRIB) will provide unparalleled beam intensities of the most exotic nuclei.

Transform the operation of accelerators and detector systems. In data analysis, experimental design and optimization, and even facility operation, AI/ML may provide approaches that are complementary to and offer improvement over traditional techniques. AI/ML studies can offer transformative progress in optimal operations of accelerators. In addition to the ongoing work at BNL and Jefferson Lab, FRIB operations will surely benefit. Production of high-purity, high-intensity beams of unstable nuclei and delivery with high efficiency to the FRIB experimental end stations present a daunting challenge. As data-taking runs for each measurement can be short, tuning time is important.

Time-consuming, multi-step beam generation efforts potentially limit the overall scientific productivity of the facility, as will the need to (on occasion) use sub-optimal beams with lower intensity. By utilizing supervised ML methods or reinforcement learning, it is anticipated that beam generation times can be significantly reduced compared to manual efforts, while simultaneously improving the quality of beams delivered to the end stations.
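The table-replacement idea above can be sketched with a Gaussian process emulator fit to a coarse one-dimensional table. The tabulated values below are synthetic stand-ins, not real microphysics data, and the grid and kernel settings are illustrative only:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Synthetic stand-in for a tabulated microphysics quantity, e.g.,
# log10(rate) sampled on a coarse grid of log temperature.
log_T = np.linspace(0.0, 1.0, 12)[:, None]     # table abscissae
log_rate = np.sin(2 * np.pi * log_T).ravel()   # table values (synthetic)

# Fit a GP emulator to the table: it interpolates smoothly and also
# returns a predictive standard deviation at off-grid query points,
# unlike a plain lookup-and-interpolate scheme.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-8)
gp.fit(log_T, log_rate)

mean, std = gp.predict(np.array([[0.375]]), return_std=True)
```

The predictive standard deviation is what a plain interpolation table cannot provide; in a simulation it flags regions where the table is too coarse to trust.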
06. FUSION 65
Figure 6.2 The left two plots compare the performance of machine-specific disruption predictors on three different tokamaks
(EAST, DIII-D, C-Mod). The rightmost plot shows the output of a real-time predictor installed in the DIII-D plasma control
system, demonstrating an effective warning time of several hundred milliseconds before disruption [15].
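Conceptually, a machine-specific disruption predictor of this kind is a classifier over diagnostic time-slices. A minimal sketch on synthetic data follows; the two features, their class-dependent shift, and the alarm threshold are invented for illustration and are not drawn from any tokamak:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic time-slice features (e.g., locked-mode amplitude, density
# fraction); slices inside the pre-disruption window get larger values.
n = 2000
y = rng.integers(0, 2, n)                       # 1 = within warning window
X = rng.normal(size=(n, 2)) + 1.5 * y[:, None]  # class-dependent shift

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# In a plasma control system, the alarm fires when the predicted
# disruptivity crosses a threshold tuned for the desired warning time.
disruptivity = clf.predict_proba(X)[:, 1]
alarm = disruptivity > 0.5
accuracy = float(np.mean(alarm == y))
```

A real-time deployment would evaluate this on streaming diagnostics and trade the threshold off against false-alarm rate and warning time, as in the rightmost panel of Figure 6.2.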
in fusion energy science applications, very little attention has been given to uncertainty quantification. Due to the inherent statistical nature of ML algorithms, comparing model predictions to data is nontrivial since uncertainty must be considered [19]. The predictive capabilities of a ML model are assessed using the model response as well as the uncertainty, and each aspect is critical to the combined effectiveness of real-time and offline applications.

In addition to the rapid growth in tokamak disruption predictors, in recent years applications of ML and statistical inference to fusion research have expanded to include model reduction for code acceleration [14], plasma control [6], and physics discovery [3,10].

2. Major (Grand) Challenges

The principal challenge in fusion energy research for the coming decades is to determine the key solutions that would establish the viability of a fusion power plant. The work on components of this overarching challenge is expected to grow, developing in perhaps unanticipated directions with the arrival of new burning plasma experiments such as ITER [1]. A recent joint Fusion Energy Sciences (FES)/Advanced Scientific Computing Research (ASCR)-sponsored workshop [2] identified a set of seven priority research opportunities for the application of ML to accelerate this process. These priorities were used to formulate the four Grand Challenges in this area.

Maximize predictive understanding of fusion plasmas and the burning plasma state. A central challenge for the advancement of fusion science toward the realization of fusion energy is the achievement of sufficiently predictive understanding of confined plasmas and, in particular, the burning plasma state. While both computational, theoretical, and experimental studies have produced substantial understanding of fundamental plasma phenomena, significant progress is needed to enable high-confidence design of operational power plants. For example, further understanding of energetic particle behavior in tokamak burning plasmas is needed to enable calculation of power plant performance and first wall impacts. Divertor function in self-heated tokamak plasmas must be projected to enable design of waste heat and exhaust handling solutions in a power plant. Much of this predictive understanding may still be undiscovered in data collected from fusion experiments and produced by simulations over the last ~50 years. Maximizing predictive understanding from data, both available and produced in the future, will be significantly aided by design and application of specialized ML methods.

This challenge can be addressed further through the development of specialized infrastructure, for which requirements are
tightly coupled to the unique nature of fusion experimental and computational resources. For example, neither experimental nor simulation data produced today are typically archived or made accessible in ways appropriate for large-scale application of ML methods. The Fusion Data Machine Learning Platform [2] is envisioned as a novel system for managing, formatting, and curating fusion experimental and simulation data, with the goal of dramatically improving usability of data for ML algorithms. Such a platform is needed to enable unified management of both experimental and simulation workflows for ML, by supporting sufficiently rapid access to data from multiple experimental and computational sources (Figure 6.3). Fusion-specialized tools will be needed to enable efficient access to multi-machine and simulated data, either centralized or distributed, and to enable automated generation of fusion metadata for supervised learning.

Figure 6.3 Vision for a future Fusion Data Machine Learning Platform that connects tokamak experiments with an advanced storage and data streaming infrastructure that is immediately queryable and enables efficient processing by ML/AI algorithms.

Key goals in this area for the next 10 to 15 years include the deployment of an effective Fusion Data Machine Learning Platform, characterized by extensive integration into the U.S. and international fusion workflow, and development of the relevant enabling algorithmic and computer science solutions specific to maximizing fusion plasma predictive understanding from plasma confinement experiments and simulations.

Enable real-time understanding in long-pulse tokamak experiments. The advent of long pulse, burning plasma, large-scale international fusion experimental devices will drive unique needs to extract the maximum amount of information from increasingly large and rapid real-time streams of data (Figure 6.4). These long pulse experimental devices will provide the first examples of the unique real-time data streaming and analysis requirements that will be posed by an operational fusion power plant.

Addressing this challenge will require interpreting and reducing fusion data at the source, as well as along the processing pipeline. The requirements for generation of real-time understanding and the nature of long pulse tokamak data streams are significantly unique to fusion experiments and burning plasma devices soon to be online. As such, they demand unique solutions and unique specific deployments of analysis systems. The effort will include integrating large numbers of fusion-specific data sources (multi-code, multi-machine, multi-diagnostic) to produce statistically supported interpretations, quantify uncertainties, and yield more understanding than the sum of individual sources. In particular, enabling federated, multi-institution collaborations on very large scales will pose unique problems. AI and ML methods are expected to be instrumental in addressing this challenge by providing methods for managing the increased data scales and unique fusion data types, as well as fusion-specific tools for enhancing interpretability.

Key goals in this area for the next 10 to 15 years include development of AI methods that will enable: a) in situ, in-memory analysis and reduction of extreme-scale simulation data as part of a federated, multi-institutional workflow, and b) ingestion into the new Fusion Data Machine Learning Platform and analysis of extreme-scale fusion experimental data for real- or near-real-time collaborative experimental research.

Figure 6.4 The shot cycle in tokamak experiments includes many diagnostic data handling and analysis steps that could be enhanced or enabled by ML methods. These processes include interpretation of profile data, interpretation of fluctuation spectra, determination of particle and energy balances, and mapping of MHD stability throughout the discharge.

Develop models that bridge gaps in fusion plasma confinement and stability prediction. Fusion energy science is significantly challenged by existing gaps and uncertainties in the understanding of fusion-specific plasma physics, coupled with the increasing importance of simulations and analyses in closing these gaps. For example, while great strides have been made in modeling plasma phenomena that contribute to energy and particle transport in a tokamak, sufficient predictability has not been achieved, and the yet-unseen burning plasma regime is expected to yield further new phenomena that must be represented in models. Sufficient predictability of crucial performance-limiting and potentially disruptive instabilities such as tearing modes in tokamaks must also be achieved to enable operational scenarios and control for a reliable power plant.

ML offers techniques that can combine theoretical and data-driven models in hybrid systems that better represent the underlying dynamics specific to such fusion plasma phenomena. This approach has already been used successfully in fusion research [3,10], and is expected to play an increasingly important role in managing uncertainties and knowledge gaps in the coming era of long pulse burning plasma experiments.

Key goals in this area for the next 10 to 15 years include the development of interpretable ML methods and model extraction and reduction techniques that will help guide future experimental campaigns and help close gaps in the understanding of physics. Hybrid or other ML-informed models will be developed to enable sufficient predictability with quantified uncertainties for fusion plasma confinement, instabilities, plasma-wall interaction, and other critical physics areas.

Establish the plasma prediction and control solutions for sustained fusion power plant operation. A viable tokamak-based fusion power plant must have high-reliability, high-performance plasma control to ensure very low rates of operational interruption and system failure. Both control physics and control algorithm mathematics requirements for fusion plasma control are uniquely challenging due to their extreme nonlinearity, degree of multiphysics overlaps, resource limitations,
reliability requirements, and range of bandwidths involved. A key requirement is therefore to use data-driven methods to contribute to control-level modeling, management and interpretation of real-time data for control, optimal trajectory determination, and real-time prediction to support continuous and asynchronous actions and prevent faults (Figure 6.5).

Key goals in this area for the next 10 to 15 years include the identification of areas of fusion plasma control research that will most significantly benefit from ML/AI-augmented control algorithms, including data-driven methods that enable the prediction of key plasma phenomena and plant system states, allowing critical real-time and offline health monitoring and fault prediction. Mathematical approaches must be developed for quantifying the uncertainty of the data-driven fusion plasma models identified and the reliability of corresponding plasma control algorithms. Methods must be developed and qualified for extracting the required level of real-time control knowledge from limited diagnostics in a fusion power plant environment, while accomplishing the required level of control authority from limited actuators.
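Quantifying the uncertainty of a data-driven model, as called for above, can be prototyped with a bootstrap ensemble whose member spread supplies a crude predictive error bar, enabling model-data comparison in z-score units rather than raw residuals. All numbers below are synthetic:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)

# Toy "experiment": y = x**2 with measurement noise.
X = rng.uniform(-1, 1, size=(400, 1))
y = X.ravel() ** 2 + rng.normal(0, 0.05, size=400)

# Bootstrap ensemble: each member is trained on a resampled dataset,
# and the spread of member predictions gives a crude model uncertainty.
members = []
for seed in range(20):
    idx = rng.integers(0, len(X), len(X))   # bootstrap resample
    members.append(
        DecisionTreeRegressor(max_depth=4, random_state=seed).fit(X[idx], y[idx]))

x_new = np.array([[0.5]])
preds = np.array([m.predict(x_new)[0] for m in members])
mean, std = preds.mean(), preds.std()

# Model-data comparison in units of combined uncertainty:
# |z| of order one indicates consistency.
y_obs, sigma_obs = 0.25, 0.05
z = (mean - y_obs) / np.sqrt(std ** 2 + sigma_obs ** 2)
```

The same pattern scales to deep ensembles of plasma models; what matters is reporting the predictive spread alongside the prediction so that downstream control logic can weigh it.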
burning plasma experiment, will provide unique opportunities to study self-heated plasmas on a size and power scale relevant to a fusion power plant. JT-60SA [9], the largest long pulse superconducting tokamak in the world (until ITER operates), will explore advanced tokamak regimes not accessible by ITER. Data from these devices will provide extensive, novel groundwork for application of AI/ML techniques that maximize the information and understanding extracted. The amount and quality of these data will help better validate key components of plasma physics codes and reveal gaps in the understanding of the physics behind the models, thus suggesting improvements to the implementation of codes as well as the theory.

The deployment of a Fusion Data Machine Learning Platform could in itself prove a transformational advance, dramatically increasing the ability of the fusion science, mathematics, and computer science communities to combine their areas of expertise in accelerating the solution of fusion energy problems.

easy use by others. The Fusion Data Machine Learning Platform is envisioned as a step toward solving these problems (see Chapter 12, Data Life Cycle and Infrastructure).

Despite these gaps, we believe a research direction with the potentially highest payoff may be the integration of our knowledge of physics into ML models. Most existing AI/ML models are either purely data-driven or incorporate very simple physical laws and constraints. Without building the structure of physical laws into ML methods, it is difficult to interpret the predictions from data-driven models.

5. Expected Outcomes

Application of AI/ML methods to fusion energy research will accelerate progress toward realization of a commercial fusion power plant. It is very possible that the new capabilities offered will actually enable practical solution of problems not otherwise tractable even on a timescale of decades without use of data-driven methods.
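One simple way to build physical structure into an ML model is to let theory carry the known dependence and train the ML component only on the residual. A minimal sketch on synthetic data; the "scaling law" and its correction factor are invented for illustration:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(2)

# "Truth": a physics scaling law times an unmodeled correction factor.
def truth(x):
    return x ** 1.5 * (1.0 + 0.3 * np.sin(3 * x))

def physics_model(x):
    return x ** 1.5            # known theory, deliberately incomplete

X = rng.uniform(0.1, 2.0, size=(500, 1))
y = truth(X.ravel()) + rng.normal(0, 0.01, 500)

# Hybrid model: keep the physics term, learn only the residual.
resid = y - physics_model(X.ravel())
ml = GradientBoostingRegressor(random_state=0).fit(X, resid)

def hybrid(x):
    return physics_model(x.ravel()) + ml.predict(x)

x_test = np.linspace(0.2, 1.9, 50)[:, None]
err_physics = np.max(np.abs(physics_model(x_test.ravel()) - truth(x_test.ravel())))
err_hybrid = np.max(np.abs(hybrid(x_test) - truth(x_test.ravel())))
```

Because the physics term carries the dominant behavior, the learned part stays small and easier to interpret, and predictions degrade gracefully toward the theory outside the training range.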
6. References

1. Gribov, Y., et al., "ITER Physics Basis," Nuclear Fusion, 47 (2007).
2. Report of the Workshop on Advancing Fusion with Machine Learning, April 30 – May 2, 2019. https://science.osti.gov/-/media/fes/pdf/workshop-reports/FES_ASCR_Machine_Learning_Report.pdf
3. Baltz, E. A., et al., "Achievement of Sustained Net Plasma Heating in a Fusion Experiment with the Optometrist Algorithm," Nature Scientific Reports, 7 (2017). doi:10.1038/s41598-017-06645-7
4. Bock, A., et al., "Advanced Tokamak Investigations in Full-Tungsten ASDEX Upgrade," Physics of Plasmas, 25 (2018).
5. Bonoli, P. T., et al., "Lower Hybrid Current Drive Experiments on Alcator C-Mod: Comparison with Theory and Simulation," Physics of Plasmas, 15 (2008).
6. Boyer, M. D., Kaye, S., Erickson, K., "Real-Time Capable Modeling of Neutral Beam Injection on NSTX-U Using Neural Networks," Nuclear Fusion, 59 (2019).
7. Cannas, B., Cau, F., Fanni, A., Sonato, P., Zedda, M.K., and JET-EFDA Contributors, "Automatic Disruption Classification at JET: Comparison of Different Pattern Recognition Techniques," Nuclear Fusion, 46 (2006).
8. Maingi, R., et al., "Summary of the FESAC Transformative Enabling Capabilities Panel Report," Fusion Science and Technology, 75 (2019).
9. Giruzzi, G., et al., "Physics and Operation Oriented Activities in Preparation of the JT-60SA Tokamak Exploitation," Nuclear Fusion, 57 (2017).
10. Gopalaswamy, V., et al., "Tripled Yield in Direct-Drive Laser Fusion through Statistical Modelling," Nature, 565 (2019).
11. Hill, D.N., et al., "DIII-D Research Towards Resolving Key Issues for ITER and Steady State Tokamaks," Nuclear Fusion, 53 (2013).
12. Kates-Harbeck, J., Svyatkovskiy, A., Tang, W., "Predicting Disruptive Instabilities in Controlled Fusion Plasmas Through Deep Learning," Nature, 568 (2019).
13. Li, J., et al., "A Long-Pulse High Confinement Plasma Regime in the Experimental Advanced Superconducting Tokamak," Nature Physics, 9 (2013).
14. Meneghini, O., et al., "Self-Consistent Core-Pedestal Transport Simulations With Neural Network Accelerated Models," Nuclear Fusion, 57 (2017).
15. Montes, K. J., et al., "Machine Learning for Disruption Warning on Alcator C-Mod, DIII-D, and EAST," Nuclear Fusion, 59 (2019).
16. Rea, C., et al., "Disruption Prediction Investigations Using Machine Learning Tools on DIII-D and Alcator C-Mod," Plasma Physics and Controlled Fusion, 60 (2018).
17. Rebut, P-H., "The Joint European Torus (JET)," European Physical Journal, 43 (2018).
18. Baker, N., et al., Workshop Report on Basic Research Needs for Scientific Machine Learning: Core Technologies for Artificial Intelligence. doi:10.2172/1478744 (2019).
19. Smith, R. C., "Uncertainty Quantification: Theory, Implementation, and Applications," SIAM, Philadelphia (2014).
20. Windsor, C. G., Pautasso, G., Tichmann, C., Buttery, R. J., Hender, T. C., JET EFDA Contributors and the ASDEX-UG team, "A Cross-Tokamak Neural Network Disruption Predictor for the JET and ASDEX Upgrade Tokamaks," Nuclear Fusion, 45 (2005).
21. Wroblewski, D., Jahns, G. L., Leuer, J. A., "Tokamak Disruption Alarm Based on a Neural Network Model of the High-Beta Limit," Nuclear Fusion, 37 (1997).
07. Engineering and Manufacturing
Over the last decade, advances in technologies, such as sensors, networks, and control systems, along with the rise of data analytics and artificial intelligence (AI) approaches, such as machine learning (ML), have led to increasing discussion of holistic approaches to manufacturing and engineering (see Chapter 15, AI at the Edge). Terms such as "smart manufacturing," "the Internet of things," and "digital twins" are used to refer to these types of transformational approaches, with the concept of optimization expanding to include an entire lifespan, from raw materials to shape/topology to manufacturing process to end use.

The future of manufacturing hinges on the ability to bring new ideas and custom products to market faster than ever before while reducing cost, energy use, and waste products. A major effort is under way to use distributed manufacturing and products designed for a circular economy to shrink the supply chain to the benefit of local communities. Obstacles include: disruptions in the supply chain due to natural disasters; changing economic costs (tariffs, transportation costs, etc.) or new regulations; inability to optimally utilize differing raw materials; appropriate data collection; weakness in altering processes in real time; and cybersecurity threats, among others. The goal is to overcome these obstacles in an optimal way to the benefit of the manufacturer, consumer, and environment.

Since additive manufacturing (AM) is in relatively early stages of development, it can simultaneously gain the greatest benefit from advanced simulation, data analytics, and AI approaches and offers the greatest flexibility and research resources for implementing those ideas. So, although the spectrum of engineering and manufacturing processes that

Manufacturers of smaller batches, and ones who produce many different variants of similar designs for consumers who want a customized product, need robots on the assembly line to perform tasks autonomously rather than automatically. Typical automation is not profitable at this level, and this is often referred to as the "Batch Size 1" or "Order of One" problem. What is meant by "autonomous" is that the robots are not reprogrammed step-by-step to complete the new assembly; rather, they independently learn how to optimally assemble one variant or another. Basically, the robots are provided the fundamentals to learn how to assemble on their own.

Siemens Corporate Technology has managed to solve this problem for some simple assemblies [1]. They have done this by semantically converting the parts and process information into ontologies and knowledge graphs, thereby converting implicit information into explicit. Previously, the robots had to be taught through code, but now the robots analyze the CAD drawings and find the corresponding solution to assembly (Figure 7.1). An added benefit is that the robots are also able to correct some faults without having this option explicitly instructed beforehand. If a part slips and falls or is needed on the other side of the assembly, one robotic arm can stop and pick it up or pass it off to its partner, and the assembly can continue on unimpeded.

Optimally solve the Batch Size 1 problem in additive manufacturing. The ability to quickly design a new product, optimally, without going through an expensive simulation (let alone trial and error), is the path to solving the "Batch Size 1" problem in AM. This can be carried out through the creation of a high-quality surrogate model.
Figure 7.3 Left: In 2010, an Airbus A380 sustained an uncontained engine rotor failure (UERF) of the No. 2 engine as it departed from Singapore while climbing through 7,000 ft. Debris from the UERF hit the aircraft, which led to significant structural damage. Bottom: The failure was caused by metal fatigue in an oil feed stub pipe due to slightly misaligned machining that left the pipe a little thinner on one side. [Australian Transport Safety Bureau Investigation #: AO-2010-089]
Surrogate models (which include reduced order models; see Chapter 10, AI Foundations and Open Problems) can play at least three roles in AM: (1) a priori optimization; (2) in situ, real-time process control; and (3) transferability of AI models between different devices and/or feedstocks—heterogeneous manufacturing.

The first role encompasses both design and process optimization. Design optimization is the outermost loop and includes both shape and topology and, implicitly, local control of microstructure and properties. Although there has been significant research over the last few years in shape and topology optimization, it typically relies on extremely simplistic physical models. The extent to which model fidelity impacts "optimal" design is unknown. Similarly, process optimization (selection of parameters such as beam diameter, beam power, preheat, and scan strategy) also relies on approximate models. Improved surrogate models based on physics-informed AI models would enable more extensive exploration of both design and parameter space, ultimately accelerating qualification of AM parts.

The second role would have even more impact, but is also significantly more difficult. It requires access to the AM control system, an extensive array of sensors, and the ability to process data from the sensors during a build, analyze it in real time, and determine whether and how to alter any process control parameters. Since this would have to happen in a matter of seconds (between layers of a build), the data processing and analysis requirements are significant, and accurate, fast-running surrogate models are essential (see Chapter 15, AI at the Edge).
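The first role can be sketched as: train a fast surrogate on a modest campaign of expensive simulations, then search the process window cheaply. The melt-pool "simulation" below is an invented analytic stand-in, and the parameter ranges and target depth are illustrative only:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)

# Stand-in for an expensive AM process simulation: melt-pool depth as a
# function of (beam power [W], scan speed [mm/s]). Entirely synthetic.
def simulate_depth(power, speed):
    return 0.05 * power / np.sqrt(speed)

# A small "campaign" of expensive runs...
P = rng.uniform(100, 400, 200)
V = rng.uniform(200, 1200, 200)
D = simulate_depth(P, V)

# ...trains a fast surrogate that can be queried millions of times.
surrogate = RandomForestRegressor(n_estimators=200, random_state=0)
surrogate.fit(np.column_stack([P, V]), D)

# Cheap exhaustive search over the process window for a 0.5 mm target.
grid_P, grid_V = np.meshgrid(np.linspace(100, 400, 60),
                             np.linspace(200, 1200, 60))
cand = np.column_stack([grid_P.ravel(), grid_V.ravel()])
pred = surrogate.predict(cand)
best = cand[np.argmin(np.abs(pred - 0.5))]
```

The same surrogate, if it evaluates in milliseconds, is also the ingredient needed for the second role: between-layer process control.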
09. AI for Computer Science
Artificial intelligence methods were originally developed to solve one of the grand challenges in computer science, namely the design of computer systems that could behave like humans. The most recent breakthroughs in AI use machine learning to address specific problems in computer vision, natural language processing, and robotics, and to outperform human players in games of strategy like chess and Go. AI has the potential to address a variety of computer science challenges where complex manual processes could be replaced by automation, including chip design, software development, online monitoring, and decision making in operating and runtime systems, database management, and infrastructure management.

The DOE Office of Science Advanced Scientific Computing Research (ASCR) program drives innovations and improvements in scientific understanding through its world-class research program and facilities—both computing and networking. The innovations in science user facilities (see Chapter 14, AI for Imaging) are expanding the boundaries of computing to include the edge (see Chapter 15, AI at the Edge), consisting of science instruments and sensor networks (see Chapter 16, Facilities Integration and AI Ecosystem). Traditional computer science will not be sufficient to address the complexity and scale of future systems and workloads arising in the DOE science mission described in Chapters 1 through 8. AI will provide solutions to the design, development, deployment, operation, and optimization of all hardware (see Chapter 13, Hardware Architectures) and software components (see Chapter 11, Software Environments and Software Research), ranging from individual elements to coordinated orchestration of the workflows over computing, networking, and experimental facilities.

In this chapter, we identify the grand challenges in computer science that can be addressed by AI. Specifically, we identify grand challenges in the areas of hardware and software system design, programming, theoretical computer science, and workflow and infrastructure automation. We do not address computer science solutions to support AI, which is covered in other chapters (see Chapters 11–13).

1. State of the Art

AI has the potential to transform many fields of computer science, from low-level hardware design to high-level programming, and from the most fundamental algorithmic challenges to day-to-day operation of user facilities.

Hardware and software design. The design of next-generation hardware and software systems and the mapping of application codes to target systems is currently a static process that involves human-in-the-loop design processes and consists of repeated experiments, modeling, and design space exploration. The design of new chips and HPC systems takes many years, and hardware vendors and application developers spend months mapping, porting, and tuning applications to run on new systems. As hardware and software get more complex and heterogeneous, current strategies will be impractical. DOE has been a leader in the co-design of HPC systems for science, but many hardware features are still driven by technology constraints and can be a challenge for programmers (see Chapter 11, Software Environments and Software Research). The DOE community has also spearheaded the use of automatic performance tuning (autotuning) using both brute-force search and mathematical optimization [5–8]. In recent years, AI has been explored for the design of chips [1], storage management [2], hardware [3], and optimizing compilers [6,7], and to improve the performance of single-node computation [5,8], communication, I/O [9,10], math libraries [15], and scheduling [11]. However, the payoffs
Figure 9.3 The performance of data transfer infrastructure depends on all its subsystems, namely, networks, transfer hosts, file and
I/O systems, storage systems, and data transfer. Custom ML methods have been developed to estimate throughput profiles [13].
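Autotuning of the kind cited above searches a configuration space against measured runtimes. A minimal sketch of both brute-force and sampled search; the cost model here is an invented analytic stand-in for timing real kernel runs:

```python
import itertools
import random

# Stand-in cost model for measured kernel runtime as a function of a
# tile size and unroll factor (a real autotuner would time actual runs).
def measured_time(tile, unroll):
    return abs(tile - 64) / 64 + abs(unroll - 4) / 4 + 1.0

space = {"tile": [16, 32, 64, 128, 256], "unroll": [1, 2, 4, 8]}

# Brute-force search: evaluate every configuration.
best_bf = min(itertools.product(space["tile"], space["unroll"]),
              key=lambda c: measured_time(*c))

# Random search: evaluate a sampled subset; useful when the full
# cross-product is too large to measure exhaustively.
random.seed(0)
samples = [(random.choice(space["tile"]), random.choice(space["unroll"]))
           for _ in range(8)]
best_rs = min(samples, key=lambda c: measured_time(*c))
```

Mathematical optimization and ML-guided search replace the inner loop here with a model that predicts which configurations are worth measuring next.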
One of the distinguishing characteristics of science is the existence of laws based on time-tested observations about natural phenomena. How should these governing principles and other scientific domain knowledge be incorporated in an AI era? To become an accepted part of the toolbox of scientists and engineers, the validity and robustness of AI techniques need to be trusted. What are the limits of AI techniques, and what assumptions and circumstances can lead to establishing assurance of AI predictions and decisions? Another hallmark of science and engineering is that limited training data may be available in the most complex, dynamic, and high-consequence applications. Which AI techniques can best address different sampling scenarios and enable efficient AI on various computing and sensing environments?

Addressing these and other open problems will advance the building blocks of the entire AI ecosystem.

1. State of the Art

Advances in algorithms and hardware have given scientists the tools to model and simulate nature at an unprecedented range of scales: from computing the history and fate of the cosmos and the explosion of supernovae to the evolution of the climate system and the

Although the past decade has seen significant algorithmic and theoretical progress, work on the foundations of AI and ML has been far outpaced by the empirical exploration and use of these techniques [16]. With the increased use of AI and ML, clear trends are emerging. For example, residual network-based convolutional neural networks [10,11] are the standard for image processing; automatic differentiation and accelerated first-order optimization algorithms are pervasive in training deep networks [12–15]; and generative models (e.g., generative adversarial networks, variational autoencoders) are providing synthetic data far beyond traditional image applications [6–9,17,25]. Principles underlying the use and understanding of these and other techniques tend to be scattered across disciplines, from theoretical computer science to signal processing to statistics.

Neural networks have started to be specially designed to incorporate some types of domain knowledge—such as rotational equivariance [1,5,21] (Figure 10.1) and statistical [18], partial differential equation (PDE) [19], and stochastic PDE [20] constraints—but these efforts are in their infancy. Results are also being established in the computability of AI-related problems [26] and in exploiting graph-based representations [22–24]. Natural language processing and unsupervised learning
techniques are beginning to be explored to gain additional insight from the scientific literature [27,28] and to pass eighth grade science exams [29].

2. Major (Grand) Challenges

Three exemplar grand challenges are identified to illustrate the promise of addressing the foundations of AI.

Incorporate domain knowledge in ML and AI. ML and AI are generally domain-agnostic. Whether studying datasets from a beamline scattering experiment, a physics collision, or a climate simulation, the training procedure typically treats every labeled dataset as a point in a high-dimensional space and proceeds to apply standard convolutional and nonlinear operations. Off-the-shelf practice treats each of these datasets in the same way and ignores domain knowledge that extends far beyond the raw data itself—such as physical laws, available forward simulations, and established invariances and symmetries—that is readily available for many systems, much in the same way that early knowledge on the neural vision system led to marked improvements in image processing. Better incorporation and entirely new methods targeting these principles will improve data efficiency; quality, interpretability, and validity of the model; and generalization, transfer learning, and constraint satisfaction for new problem regimes. Incorporating modeling and simulation capabilities to generate training data leverages decades of HPC improvements to accelerate learning; incorporating mathematical equations and scientific literature leverages centuries of advances in theory. Furthermore, to complete the scientific process, incorporating domain knowledge in AI models can be used as the basis for advances in experimental design, active learning, facilities operations, formal verification, and automated theorem proving to accelerate scientific discovery.

Improving our ability to systematically incorporate diverse forms of domain knowledge can impact every aspect of AI, from selection of decision variables and architecture design to training data requirements, uncertainty quantification, and design optimization. Indeed, incorporating domain knowledge is a distinguishing feature of AI within the DOE mission, without which AI-based scientific progress is otherwise limited to that afforded by traditional AI drivers.

Establish assurance for AI. Assurance addresses the question of whether an AI model has been constructed, trained, and deployed so that it is appropriate for its intended use,
Addressing robustness, uncertainty quanti- With these advances, we expect that AI and
fication, and interpretability of AI systems. ML will become accepted and well-
Increased understanding of the sensitivities characterized tools in the modern scientific
and limitations of AI models and improving computing toolbox, and the abstract models
scientists’ ability to interpret AI outcomes would generated are understood for use in a variety
significantly accelerate the adoption of AI as a of tasks. Minimizing the risks associated with
scientific capability. AI uses is especially important in high-
consequence applications. Increased trust will
Learning for inverse problems and design also further the adoption of AI and embedded
of experiments. Inverting traditional cause-to- intelligence in everything from edge devices to
effect models to learn what causes could have networks to HPC facilities. Significant improve-
produced an effect, and then to efficiently ment in the efficiency of ML will enable more
generate experimental campaigns to test accurate surrogate models of complex physical
these hypotheses, would broaden the systems (e.g., reacting flows or failure mecha-
scientific method. nisms in materials), optimization algorithms for
inverse problems in materials characterization
Reinforcement and active learning to and design, and more accurate computation
develop AI for control and data acquisition uncertainties necessary in all science and
systems. Advances to directly address engineering disciplines.
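One way to read "incorporating domain knowledge" concretely is as an extra term in the training loss that penalizes violations of a known physical constraint. The stdlib-only sketch below is illustrative only: the linear model, the non-negativity constraint, and every number are assumptions standing in for real physics-informed methods.

```python
def data_loss(k, samples):
    # mean squared error of the toy model y = k * x against observations
    return sum((k * x - y) ** 2 for x, y in samples) / len(samples)

def physics_penalty(k):
    # stand-in domain constraint: the physical parameter cannot be negative
    return max(0.0, -k) ** 2

def total_loss(k, samples, lam=10.0):
    # data-fit term plus weighted physics term
    return data_loss(k, samples) + lam * physics_penalty(k)

def fit(samples, lr=0.01, steps=500):
    k, eps = 0.0, 1e-6
    for _ in range(steps):
        # central-difference gradient of the combined loss
        g = (total_loss(k + eps, samples) - total_loss(k - eps, samples)) / (2 * eps)
        k -= lr * g
    return k

samples = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
k_hat = fit(samples)
```

Richer knowledge, such as conservation laws, symmetries, or forward simulations, enters the same way: as additional penalty terms in the training objective.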
11. Software Environments and Software Research
The DOE Office of Science has an opportunity and need to research and develop software to address the office's research mission. Such an effort would complement large investments by industry to develop AI software environments. The DOE has deep expertise in simulation, modeling, and large-scale data analysis, and it also operates the largest and broadest set of user facilities for experimental and observational science, including light sources, telescopes, and genomics facilities that have growing computing and data-analysis requirements (see also Chapter 16, Facilities Integration and AI Ecosystem). There is an urgent need to develop software and computing environments that enable AI capabilities to be seamlessly integrated with large-scale HPC models and the growing data-analysis requirements of experimental facilities.

1. State of the Art

There is currently a proliferation of software and frameworks for data analysis and machine learning. Top deep learning and ML frameworks today include scikit-learn, TensorFlow, PyTorch, and Keras, but new software and frameworks are being released regularly. These new frameworks are primarily developed and led by industry, with some notable contributions from academia for software such as Spark and Jupyter. The software is open source, though not open governance, and is often controlled and sponsored by industry leaders, such as Google and Facebook.

There are a few notable gaps between the state of the art and DOE scientific requirements when it comes to software for AI. First, DOE researchers produce massive amounts of data from simulations and models that can benefit from the integration of AI capabilities. These are often challenging datasets with multidimensional data and can also include nonimage-based data. Second, DOE runs unique user facilities that produce petabytes of data, have no counterpart in industry, and require new AI software and capabilities. Finally, many of the DOE scientific datasets need the scale of HPC systems for analysis, and those systems can have unique architectural features that require software attention and investment, such as large-scale I/O subsystems and heterogeneous compute elements. With DOE's challenging datasets and deep expertise in data analytics, simulation, and modeling, DOE researchers are well positioned to contribute unique enhancements to the AI software stack.

2. Major (Grand) Challenges

When considering the impact of AI on software environments and software research, three significant opportunities are apparent. First, the integration of AI into the “inner loop” can lead to more effective simulations (see also Chapter 10, AI Foundations and Open Problems). For example, leveraging AI within a simulation could lead to more efficient modeling by virtue of the development of digital twins during runtime. Second, integration of AI into the analysis approach could lead to faster generation of analytical results, automate the identification of anomalous behavior, and ultimately lead to automatic hypothesis generation. Finally, the integration of AI into the management and control of research labs, facilities, experiments, and workflows (i.e., the “outer loop”) can help achieve a variety of goals. Examples include adapting workflows in response to new hypotheses generated during the workflow, scheduling resources for more efficient use of facility hardware, and dramatically reducing the total cost of operating facilities. These three grand challenges are not orthogonal and would provide the greatest impact when examined together (see also Chapter 16, Facilities Integration and AI Ecosystem).
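The "outer loop" adaptation described above, reordering queued work when a new hypothesis changes what is most informative, can be sketched with a priority queue. The experiment names and scores below are invented for illustration, not drawn from any facility workflow system.

```python
import heapq

class ExperimentQueue:
    def __init__(self):
        self.heap = []  # (negated score, name) so the highest score runs first

    def submit(self, name, score):
        heapq.heappush(self.heap, (-score, name))

    def reprioritize(self, updates):
        # a new hypothesis assigns fresh scores to some pending experiments
        entries = dict((name, -neg) for neg, name in self.heap)
        entries.update(updates)
        self.heap = [(-s, n) for n, s in entries.items()]
        heapq.heapify(self.heap)

    def next_experiment(self):
        return heapq.heappop(self.heap)[1]

q = ExperimentQueue()
q.submit("scan-A", 0.2)
q.submit("scan-B", 0.9)
q.submit("scan-C", 0.5)
q.reprioritize({"scan-A": 0.95})  # hypothesis makes scan-A most informative
first = q.next_experiment()
```

A real scheduler would also weigh instrument availability and cost, but the reordering step is the same.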
Develop software for seamless integration of simulations and AI. DOE is the premier agency for large-scale simulation and modeling of physical phenomena because it has deep institutional knowledge and expertise in numerical methods, solvers, and parallel implementations. There is an opportunity to improve the performance, efficiency, and fidelity of traditional simulations by integrating AI capabilities. Such a system would allow the integration of data from different sources, in different formats, and over different time domains into existing mathematical models and adapt in real time to changing model conditions. In addition, AI model-generated data can be validated against in-memory simulation data; by comparing results from in situ analyses on simulation-generated and model-generated data, one can also determine thresholds at which the model-generated data are sufficiently accurate and, therefore, determine when the trained model can replace the simulation kernel. Similarly, AI approaches could be employed to aid in mapping simulation workflows onto upcoming complex and heterogeneous platforms, revising the use of resources over the course of workflow execution through increasingly refined and accurate performance models. These approaches have the potential to significantly impact traditional simulation and modeling by improving the performance of simulations [1].

This would lead to a new hybrid computation model, combining traditional simulation with AI results in a model that runs more efficiently or produces higher fidelity results. For example, a traditional mathematics-based climate model (i.e., a multiscale, multiphysics simulation) could be enhanced by replacing a
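The validation loop sketched in this chapter, comparing surrogate output against in situ simulation results and accepting the surrogate once its error clears a threshold, might look like the following toy version. The sine "kernel", the piecewise-linear surrogate, and the threshold are all illustrative assumptions.

```python
import math, random

random.seed(0)  # deterministic validation samples for this sketch

def simulate(x):
    # stand-in for an expensive simulation kernel
    return math.sin(x)

def make_surrogate(samples):
    # piecewise-linear surrogate fitted to simulation snapshots
    pts = sorted(samples)
    def model(x):
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= x <= x1:
                t = (x - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)
        return pts[-1][1]
    return model

def validated_error(model, checks):
    # in situ comparison: surrogate output versus fresh simulation results
    return max(abs(model(x) - simulate(x)) for x in checks)

# densify training snapshots until the surrogate clears the accuracy
# threshold at which it may replace the kernel on [0, 1]
threshold = 1e-3
n = 4
while True:
    xs = [i / (n - 1) for i in range(n)]
    model = make_surrogate([(x, simulate(x)) for x in xs])
    if validated_error(model, [random.uniform(0, 1) for _ in range(200)]) < threshold:
        break
    n *= 2
```

In a production coupling the comparison would run in situ against live simulation state rather than against resampled points.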
12. Data Life Cycle and Infrastructure
Much recent progress in AI has been fueled by the availability of massive data. For example, dramatic progress in deep neural networks for image understanding owes much to the ImageNet database of more than 14 million annotated and labeled images. Science, too, is about data, and the AI-driven transformation of science will require major changes in data generation, organization, processing, and sharing. This section reviews these changes and the research and development necessary to support this vision.

Consider the following scenario: It is 2030. DOE scientists are working to develop a low-cost, high-performance solid-state battery for use in vehicles. Intuiting that disordered materials hold promise, they task an AI system with identifying candidate formulations. Informed by 400 years of physics knowledge, 100 years of scientific literature, and 40 years of experimental data from DOE labs, universities, and industrial collaborators, the AI system is able to evaluate options faster than any human expert. It suggests new families of disordered materials that may have acceptable stabilities, power densities, and manufacturing costs. However, it also shows high uncertainties in its predictions.

To collect more data, the scientists task the AI system with defining and running a series of experimental and simulation studies in new autonomous laboratories and on postexascale conventional and quantum computing systems. New data integrated into the AI model motivate further experiments. Within weeks, the human expert/AI team has refined understanding to the point where large-scale manufacturing can be considered. Provenance information collected throughout allows for reuse and meta-analysis of discovery processes.

Central to this scenario is the existence of a large, well-curated, and integrated collection of data of many types—from point measurements to massive video—and from many sources, including the scientific literature (e.g., Chapter 1, Chemistry, Materials, and Nanoscience), experiments, simulations, and vehicle fleets, and encompassing both public and proprietary elements. Each item within this data collection is documented with details as to where, when, and how it was generated. Furthermore, the data collection accommodates dynamic additions as new knowledge is created.

Such data collections do not exist today, outside of a few narrow domains. Laboratories are not, in general, set up to preserve data. Many data are recorded in archaic formats and media without annotation. Descriptive metadata are inadequate and inconsistent. Data are rarely findable, accessible, interoperable, or reusable (FAIR) [1], whether by scientists or by AI systems. Data collections are often biased by a tendency not to publish negative results (i.e., the “file drawer problem”). Autonomous laboratories that can generate data at scale and under AI direction exist only in prototype forms.

AI-driven discovery across the broad range of domains important to DOE science will require transformations in both the methods and infrastructure used to acquire, organize, and process data and the policies that govern data access. These advancements must proceed via a process of co-design, with progress in methods informing infrastructure and policy changes and vice versa. The ultimate goal is a system of methods and infrastructure that enables the coordinated creation, application, and update of large quantities of data and knowledge as well as associated models, workflows, computations, and experiments (Figure 12.1).

This chapter makes the case for three priority research directions, or grand challenges, to produce the methodological advances required to create AI-ready data infrastructure.
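The where/when/how documentation that such a collection would attach to every item can be pictured as a small provenance sidecar. The field names and values below are illustrative, not a published schema.

```python
import json, hashlib, datetime

def make_record(payload: bytes, instrument: str, site: str, workflow: str):
    # minimal provenance record: a stable content identifier plus
    # where, when, and how the item was generated
    return {
        "sha256": hashlib.sha256(payload).hexdigest(),
        "where": site,
        "when": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "how": {"instrument": instrument, "workflow": workflow},
    }

rec = make_record(b"raw detector frame", "APS beamline 2-ID", "ANL", "ptycho-v1")
sidecar = json.dumps(rec, indent=2)  # stored alongside the data item
```

A content hash makes records verifiable after data movement; the remaining fields are what later reuse and meta-analysis of discovery processes would query.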
Figure 12.1 AI-driven science requires simultaneous advancements in the methods, infrastructure, and policies used to acquire large-scale scientific data, integrate data and symbolic knowledge, and structure data infrastructure.

Automate the large-scale creation of FAIR scientific data. Given data's central role in AI-driven science, new technologies, methods, and best practices are needed to scale the generation, capture, annotation, and organization of data from experiments, observations, and simulations to produce large collections of FAIR data for AI-enabled discovery.

Integrate data and theory to create converged knowledge repositories. Realizing the full potential of scientific AI requires a convergence of data and symbolic representations. To this end, new methods are required to synthesize AI models from data and to integrate symbolic representations of scientific knowledge, to create knowledge collections that are similarly FAIR.

Architect new infrastructure to support ubiquitous scientific AI. As AI methods are deployed ever more widely, new infrastructural concepts and methods are required to ensure that both data and the computation required to ingest, enhance, integrate, and interpret data can be accessed efficiently and reliably—whenever, wherever, and at whatever scale required.

1. State of the Art

Despite much progress in scientific data acquisition and management, the datasets, processing methods, and infrastructure needed for AI-driven science are still in their infancy. For example, in the field of materials science (Figure 12.2), data collections number in the hundreds and are distributed worldwide. The Materials Data Facility [2] indexes more than 100 data sources and operates automated data ingestion and metadata extraction pipelines to facilitate automated analyses. Nevertheless, most materials data remain unfindable, inaccessible, and noninteroperable and are rarely reused.

As a second example, the velocity at which microbiome data are generated has far outpaced current capabilities for collecting, processing, and distributing these data in an effective, uniform, and reproducible manner, even at the largest data centers. The National Microbiome Data Collaborative (NMDC) was established by the Office of Science in 2019 to build the infrastructure needed to apply consistent ontologies, annotations, and processing to create a FAIR microbiome data resource. The NMDC aims to remove roadblocks in the development of AI methods for microbiome analysis by making large quantities of labeled, curated, interoperable data available to the public. Broad success in these areas depends on overcoming challenges outlined in this chapter.

The Systems Biology Knowledge Base (KBase) [4], Earth System Grid Federation (ESGF) [5], and Atmospheric Radiation Measurement (ARM) facility [6] are further examples of DOE-supported data infrastructures that assemble large volumes of important scientific data that offer opportunities for application of AI methods.
Overall, the infrastructure and methods needed to enable AI methods to access, learn from, and add to a broader body of knowledge are in their infancy.

Annotation with useful metadata is an important prerequisite for widespread use of scientific data. Some communities have well-established procedures for encoding metadata in datasets, such as the climate and forecast (CF) metadata conventions used in the earth and atmospheric sciences. Yet even when such conventions exist, they often fail to capture detailed annotations to support searches for specific characteristics or features within large datasets. Some recent work investigates the use of ML to generalize metadata from a subset of labeled data by classifying electron microscopy images automatically as being generated by either transmission electron or scanning transmission electron microscopy [7]. Much more work is required to streamline and simplify the process of creating metadata for scientific datasets.

2. Major (Grand) Challenges

Successful realization of AI-driven science at the scales envisioned in this report requires the creation of large collections of FAIR, AI-ready science data, and the development of methods and technologies for manipulating those data. By addressing the following grand challenges, this vision has a stronger probability of being realized.

Automate the large-scale creation of FAIR scientific data. Much scientific data today is still created laboriously through individual experiments and then organized via time-consuming and error-prone manual data acquisition, movement, and annotation steps. Many data are discarded to alleviate transfer and storage costs, and descriptive metadata are often inadequate to enable subsequent reuse. Scientists need new approaches if they are to accumulate the volume, variety, and quality of science data required for AI-driven methods. In particular, steps must be taken to automate major elements of data creation. Automation is discussed here from the perspective of data and workflows (see Chapter 11, Software Environments and Software Research). See also a recent ASCR report [8].

While harnessing existing data flows within scientific laboratories is an important first step toward creating the rich data collections needed for AI-driven science, progress will
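The metadata-generalization idea cited above, propagating labels from a small annotated subset to the rest of a collection, reduces in its simplest form to a nearest-centroid classifier. The two "modalities" and their feature statistics below are synthetic stand-ins, not real TEM/STEM data.

```python
def centroid(vectors):
    # component-wise mean of a list of feature vectors
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def classify(item, centroids):
    # assign the label whose centroid is closest in feature space
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist2(item, centroids[label]))

# small labeled subset: feature vectors (e.g., mean intensity, edge density)
labeled = {
    "TEM":  [[0.20, 0.70], [0.25, 0.65], [0.22, 0.72]],
    "STEM": [[0.80, 0.30], [0.75, 0.35], [0.78, 0.28]],
}
centroids = {label: centroid(vs) for label, vs in labeled.items()}

# an unlabeled item receives modality metadata automatically
auto_label = classify([0.77, 0.33], centroids)
```

The cited work uses deep networks on image content; the pipeline shape (label a subset, fit, auto-annotate the rest) is the same.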
At one extreme, systems with thousands of specialized architectures (e.g., NVIDIA Volta and AMD MI60 GPUs, FPGAs from Intel and Xilinx, Google TPUs [4], SambaNova, Groq, Cerebras) are required to train AI models from immense datasets. For example, Google's TPU pod has 2048 TPUs and 32 terabytes of memory and is used for AI model training; its specialized tensor processors provide 100,000 tera-ops for AI training and inference. In addition, they are coupled directly to Google's cloud, a massive data infrastructure (>100 petabytes). The progress of the Google TPU in its use for the AlphaGo series of matches demonstrates that codesign—the refinement of hardware, software, and datasets for solving a specific goal—provides major benefits in performance, power, and quality [7].

At the other end of the spectrum, edge devices must often be capable of low-latency inference at very low power. Industry has invested heavily in a variety of edge computing devices for AI, including tensor calculation accelerators (e.g., ARM Pelion, NVIDIA T4, Google's Edge TPU, and Intel's Movidius) and neuromorphic devices (e.g., IBM's TrueNorth and Intel's Loihi). Experts expect dramatic improvements in the compute capability and energy efficiency of these devices over the next decade as they are further refined. For example, NVIDIA recently released its Jetson AGX Xavier platform, which operates at less than 30 W and is meant for deploying advanced AI and computer vision algorithms at the edge using many specialized devices such as hardware accelerators (i.e., DLAs) for fixed-function convolutional neural network (CNN) inference. Another example is Tesla's FSD Chip, which can deliver 72 tera-ops (72 × 10^12 operations per second) at 72 watts and support capabilities that can respond in 10 milliseconds (driving-speed response) with high reliability.

In contrast, DOE's applications can require responses 100,000x faster—100 nanoseconds for real-time experiment optimization in electron microscopy or APS experiments where the samples degrade rapidly under high-energy illumination (see Chapter 14, AI for Imaging).

In terms of software, many consumer applications of AI currently use frameworks like TensorFlow, PyTorch, MXNet, Torch, or Caffe2 that hide much of the complexity of the underlying hardware. As mentioned earlier, these frameworks have been developed for video, image, and speech recognition as well as language translation and natural language processing, but they remain in their infancy for processing scientific data. Furthermore, software integration of this AI ecosystem (e.g., PyTorch) with the HPC ecosystem (e.g., MPI and OpenACC) will be nontrivial; significant challenges remain in coupling and potentially unifying these software ecosystems for productivity and efficiency (see Chapter 11, Software Environments and Software Research).

2. Major (Grand) Challenges

Given this spectrum of architectures and their fast pace of change, DOE will need to be actively engaged with the communities of

1 Permission to use each of the pictures was granted by each of the respective companies.
2 The Cerebras Wafer Scale Engine is 46,222 mm²; by comparison, the largest GPU is 815 mm².
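The AI/HPC coupling flagged above ultimately rests on collective operations such as allreduce: each rank computes gradients on its data shard, and all ranks then share the mean gradient. A hedged sketch with ranks simulated in-process (a real coupling would call MPI allreduce, e.g., via mpi4py or an MPI-backed framework; the linear model and all numbers are illustrative):

```python
def allreduce_mean(per_rank_grads):
    # element-wise mean across ranks, as MPI_Allreduce(SUM) / nranks would give
    nranks = len(per_rank_grads)
    return [sum(g[i] for g in per_rank_grads) / nranks
            for i in range(len(per_rank_grads[0]))]

def local_gradient(w, shard):
    # gradient of MSE for y = w[0]*x + w[1] on this rank's data shard
    g0 = g1 = 0.0
    for x, y in shard:
        err = w[0] * x + w[1] - y
        g0 += 2 * err * x / len(shard)
        g1 += 2 * err / len(shard)
    return [g0, g1]

# four simulated ranks, each holding a shard of data from y = 3x + 1
shards = [[(i + 4 * r, 3 * (i + 4 * r) + 1) for i in range(4)] for r in range(4)]
w = [0.0, 0.0]
for _ in range(5000):
    grads = [local_gradient(w, s) for s in shards]
    g = allreduce_mean(grads)                      # the collective step
    w = [w[0] - 0.005 * g[0], w[1] - 0.005 * g[1]]
```

Because the shards are equal-sized, averaging per-rank gradients reproduces the gradient over the full dataset, which is exactly what data-parallel training frameworks compute each step.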
15. AI at the Edge
Many of the use cases outlined in previous chapters—Chapter 4, High Energy Physics; Chapter 14, AI for Imaging; and Chapter 16, Facilities Integration and AI Ecosystem—describe scientific discoveries using large instruments such as the Large Hadron Collider, the Very Large Array, and the IceCube South Pole Neutrino Detector. Likewise, DOE operates many distributed facilities, such as the ARM Climate Research Facility, that operate sensors and instruments across the planet (see also Chapter 2, Earth and Environmental Sciences). For both centralized and distributed facilities, instruments such as these produce vast quantities of data that often cannot be efficiently moved to or stored in a central repository, or they include latency-sensitive control systems that must act promptly on the incoming data. Moving a portion of the data analysis pipeline “to the edge,” where the data is generated, allows the required computation to identify the highest-value data to be saved and to autonomously respond to and control the experiment. The potential benefits of edge computing are widely recognized, and a considerable amount of work to realize and expand upon these benefits in business and science is under way [2].

Advances in AI and ML, both in hardware and software, are among the enablers of edge computing. For example, edge computing enables a self-driving vehicle to make decisions within the vehicle, using AI techniques to interpret data from the vehicle's many cameras and sensors. This is necessary both because of the volume of data (i.e., too large to transmit to central servers) and the real-time requirement for vehicle controls (i.e., answers from remote servers may arrive far too late). Edge computing is possible, even with relatively low-powered computing hardware in the vehicle, because a large body of training data has been processed on high-performance servers (i.e., in the center) into ML models that can be deployed to run in the vehicle (i.e., at the edge).

In the DOE community, a large and growing number of science and engineering projects require edge computing to imbue sensors with real-time adaptive or autonomous capabilities. In addition to the examples mentioned in Chapters 4, 14, and 16, consider the following. There are thousands of environmental monitoring sensors that typically produce longitudinal data with latencies of minutes to weeks between measurement and data availability due to their remote locations and low (or intermittent) capacity network connectivity (see also Chapter 3, Biology and Life Sciences). Edge computing capabilities would enable such instruments to analyze data locally in real time and feed a lower volume of processed information to central computing services for further processing. A radar deployed by the DOE ARM facility in Oklahoma could use ML at the edge to identify important weather phenomena and dynamically steer the instrument for more precise follow-up observations. Such an approach would increase the accuracy and timeliness of tornado warnings, ultimately saving lives. As mentioned in Chapter 8, Smart Energy Infrastructure, monitoring electrical power distribution infrastructure could prevent power failures or predict conditions conducive to wildfires; monitoring subsurface vibrations from oil wells could improve oil production; autonomous soil sampling and analysis devices could improve crop yield; and more timely data analysis options would enable large-scale accelerators and light sources to optimize their operations and predict (and prevent) failures.

DOE is in a unique position to address these challenges because it supports many of the research facilities requiring edge computing, either in the near term to better operate existing instruments or in the longer term to
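Local analysis that uplinks only high-value readings can be sketched as a streaming anomaly filter at the sensor. The running z-score and every threshold below are illustrative assumptions, not a deployed ARM algorithm.

```python
from collections import deque

class EdgeFilter:
    def __init__(self, window=50, z_threshold=4.0):
        self.buf = deque(maxlen=window)  # recent readings kept on-device
        self.z = z_threshold

    def process(self, reading):
        """Return the reading if it looks anomalous (worth uplinking), else None."""
        if len(self.buf) >= 10:
            mean = sum(self.buf) / len(self.buf)
            var = sum((v - mean) ** 2 for v in self.buf) / len(self.buf)
            std = var ** 0.5 or 1e-9
            if abs(reading - mean) / std > self.z:
                self.buf.append(reading)
                return reading  # uplink: candidate event for central analysis
        self.buf.append(reading)
        return None  # routine reading; summarize or discard locally

stream = [10.0] * 100 + [42.0] + [10.0] * 50
edge = EdgeFilter()
uplinked = [r for r in stream if edge.process(r) is not None]
```

Out of 151 readings, only the single anomalous one is forwarded, which is the bandwidth reduction the chapter describes; a fielded system would replace the z-score with a trained model.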
facilities to enable quasi-real-time feedback on experiments and observations. The data gateway and the scientific data management system will be critical components expected to substantially reduce the accumulation of “dark” (i.e., unpublished) data and accelerate the accumulation of well-annotated and standardized data for AI in the upcoming decade.

Looking further ahead, the ASCR facilities will continue to design complex, technically advanced networking and computing facilities for future science generations where the needs of the AI ecosystem will be an integral part of any initial design. Given the pace of change in AI technology and techniques, these future facilities will also need to be designed with flexibility in mind to take advantage of the advances that will inevitably come from application work over the next decade.

necessary for rapid progress in this area. Software and services can facilitate good data practices that will feed AI agents, but the actual accumulation of high-quality datasets is contingent on researchers using the aforementioned data software stack to populate data repositories. Policies must be developed to minimize the generation of dark data and maximize the generation of well-annotated data. AI efforts will be necessary to draw insights from the collected data, but facilities need to first train their researchers on ML techniques, including DL. Furthermore, facilities will need to foster AI development through dedicated research programs. Given the data explosion in practically all scientific domains, facilities will need to train researchers on using high-performance computers for developing, scaling, and deploying AI agents that can leverage the ballooning body of data.
AI at Scale 1: Microscopy
Sergei Kalinin
AI at Scale 3: Health
Georgia Tourassi
Fundamental Physics………………………………………………….……Cumberland
Co-Leads: Marcel Demarteau, Bronson Messer, Torre Wenaus
3:00 p.m. Breakout Reports Out (10 minutes each) ......................ORNL Conference Center
8:45 a.m. Summary of Day 1 and Day 2 Cross-cut Charge………ORNL Conference Center
Jeffrey Nichols
1:00 p.m. Final Report Out from Breakout Session (10 minutes each)
2:30 p.m. Town Hall Close-out with Next Steps…………….…..….ORNL Conference Center
Jeffrey Nichols
AI at Scale: Astrophysics
Josh Bloom
AI at Scale in Biology
Ben Brown
Physical Sciences
Coordinator: Paolo Calafiura
Energy Sciences
Coordinator: Jonathan Carter
Computer Science
Coordinator: Katherine Yelick
3:00 p.m. Lightning Breakouts Report Out (5 minutes each) ......... Building 50 Auditorium
8:30 a.m. Summary of Day 1 and Day 2 Cross-cut Charge ........... Building 50 Auditorium
Katherine Yelick
11:30 a.m. Collect Lunch and Proceed in to Report Out Session .. Building 50 Auditorium
11:45 a.m. Breakouts Report Out (5 minutes each) ......................... Building 50 Auditorium
1:45 p.m. Town Hall Close-out with Next Steps ............................. Building 50 Auditorium
Katherine Yelick
10:30 a.m. How Significant will AI be for the Energy Sector?.……….Grand Ballroom North
Quantifying progress and outlining signposts
Claire Curry, Bloomberg New Energy Finance
11:15 a.m. AI Research Update: What’s Going On Around ………….Grand Ballroom North
The World and Our Research Plans for Studying AI For Science
Earl Joseph, Hyperion Research
AC. Combined Town Hall Registrants
First Name Last Name Institution
Brook Abegaz Loyola University of Chicago
Gina Adam George Washington University
Corey Adams Argonne National Laboratory
Marc Adams NVIDIA Corporation
Ryan Adamson Oak Ridge National Laboratory
Adetokunbo Adedoyin Los Alamos National Laboratory
Vivek Agarwal Idaho National Laboratory
Greeshma Agasthya Oak Ridge National Laboratory
Jeffery Aguiar Idaho National Laboratory
Lars Ahlfors Microsoft Corporation
James Ahrens Los Alamos National Laboratory
Sachin Ahuja CNH Industrial
James Aimone Sandia National Laboratories
Shashi Aithal Argonne National Laboratory
Adeel Akram Uppsala University
Maksudul Alam Oak Ridge National Laboratory
Frank Alexander Brookhaven National Laboratory
Boian Alexandrov Los Alamos National Laboratory
Yuri Alexeev Argonne National Laboratory
Stephanie Allport Bloomberg
Srikanth Allu Oak Ridge National Laboratory
Jeff Alstott Intelligence Advanced Research Projects Activity
Ilkay Altintas University of California, San Diego
Kenneth Alvin Sandia National Laboratories
James Amundson Fermi National Accelerator Laboratory
Valentine Anantharaj Oak Ridge National Laboratory
James Ang Pacific Northwest National Laboratory
Mihai Anitescu Argonne National Laboratory
Dionysios Antonopoulos Argonne National Laboratory
Katerina Antypas Lawrence Berkeley National Laboratory
Chid Apte IBM Research
Rick Archibald Oak Ridge National Laboratory
Whitney Armstrong Argonne National Laboratory
Richard Arthur General Electric Research
Srinivasan Arunajatesan Sandia National Laboratories
Paul Atzberger University of California, Santa Barbara
Brian Austin Lawrence Berkeley National Laboratory
Ariful Azad Indiana University
Gyorgy Babnigg Argonne National Laboratory
Tyler Backman Lawrence Berkeley National Laboratory
Drew Baden Department of Energy, High Energy Physics