Unsupervised Machine Learning Discovery of Structural Units and Transformation Pathways From Imaging Data

APL Machine Learning ARTICLE scitation.org/journal/aml

Unsupervised machine learning discovery of structural units and transformation pathways from imaging data
Cite as: APL Mach. Learn. 1, 026117 (2023); doi: 10.1063/5.0147316
Submitted: 21 February 2023 • Accepted: 8 May 2023 •
Published Online: 14 June 2023

Sergei V. Kalinin,1,a) Ondrej Dyck,2 Ayana Ghosh,3 Yongtao Liu,2 Bobby G. Sumpter,2 and Maxim Ziatdinov2,3,b)

AFFILIATIONS
1 Department of Materials Science and Engineering, The University of Tennessee, Knoxville, Tennessee 37996, USA
2 Center for Nanophase Materials Sciences, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA
3 Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA

a) Author to whom correspondence should be addressed: [email protected]
b) [email protected]

ABSTRACT

We show that unsupervised machine learning can be used to learn chemical transformation pathways from observational Scanning Transmission Electron Microscopy (STEM) data. To enable this analysis, we assumed the existence of atoms, a discreteness of atomic classes, and the presence of an explicit relationship between the observed STEM contrast and the presence of atomic units. With only these postulates, we developed a machine learning method leveraging a rotationally invariant variational autoencoder (VAE) that can identify the existing molecular fragments observed within a material. The approach encodes the information contained in STEM image sequences using a small number of latent variables, allowing the exploration of chemical transformation pathways by tracing the evolution of atoms in the latent space of the system. The results suggest that atomically resolved STEM data can be used to derive the fundamental physical and chemical mechanisms involved, by providing encodings of the observed structures that act as bottom-up equivalents of structural order parameters. The approach also demonstrates the potential of variational (i.e., Bayesian) methods in the physical sciences and will stimulate the development of more sophisticated ways to encode physical constraints in encoder–decoder architectures and generative physical laws and causal relationships in the latent space of VAEs.
© 2023 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). https://doi.org/10.1063/5.0147316

INTRODUCTION

Over the last decade, deep machine learning has become a key enabling technology in multiple areas of computer science, imaging, and robotics.1–5 Following the successful demonstration of deep convolutional networks for image recognition tasks,6 deep learning architectures have helped revolutionize many other areas, ranging from natural language processing to reinforcement learning for control,7–9 and have begun to rapidly propagate in areas such as causal and physical discovery,10–12 automated experiments in chemistry, materials science, and biology,13–15 and direct atomic manipulation and assembly.16,17
This rapid growth in machine learning (ML) applications has naturally attracted the attention of domain science communities for the potential application of deep learning to scientific discovery.18–20 While immediately attractive, the examination of this concept suggests that this is far from trivial. Indeed, it is by now well understood that classical ML methods are correlative in nature and serve as powerful interpolators operating within distributions spanned by training data.21 Through the choice of the training data or network architecture, equivariance or symmetries can be introduced, imposing physical constraints on the derived outputs. However, the inability of deep neural networks (DNNs) to extrapolate outside of the training domain or to work with out-of-distribution data represents a major limitation.22,23 This is in contrast to the classical scientific research paradigm, which often relies on past knowledge, deductive logic based on known physical laws, causal physical mechanisms, and hypothesis-driven reasoning. As such, scientific


discovery in the physical sciences typically relies on relatively small amounts of data and allows extrapolation well outside the specific domains, which are the "hard" tasks for traditional ML. Thus, the merger of ML and classical scientific methods may provide a new forefront for research and development.
One such approach is based on the introduction of known physical constraints in the form of symmetries, conservation laws, etc., to limit ML predictions to the physically possible. Simple examples of this approach are the use of sparsity, non-negativity, or sum-to-one constraints in linear and nonlinear unmixing,24 resembling the classical Lagrange multiplier approach. Several groups have introduced the physical constraints of Hamiltonian mechanics into neural networks, significantly improving predictions of long-term dynamics in chaotic systems.25 Similarly, the network architecture can be made to encode known symmetry constraints, enabling the discovery of order parameters.26 Alternatively, constraints can be introduced on the form and complexity of structural and generative behaviors, as exemplified by the combination of symbolic regression and genetic algorithms.27,28 However, until now most of these developments have been limited to areas such as mechanics and astronomy, where the generative laws are relatively simple and well-defined and hence ML predictions can be readily compared to physical models.
At the same time, discovery of the more complex laws and reduced rules that define fields such as chemistry and molecular biology has not yet been demonstrated. In these fields, the elementary descriptors of the systems are generally not well defined, and the constraints imposed on the functional relationships between the descriptors are generally "soft." Combined with the preponderance of large datasets, this has led to the broad development of correlative ML models. Interestingly, even causal machine learning methods first emerged in the context of economics or biology, rather than the physical sciences. Correspondingly, the open question remains to what extent we can use machine learning for discovery in specific domain areas, given the experimental observations and a minimal set of prior knowledge in the form of postulated descriptors or generative mechanisms.
Here, we demonstrate how unsupervised machine learning can be used to discover elementary building blocks forming the structure of solids, illustrated by the structural units and chemical transformation mechanisms in defected graphene from atomically resolved scanning transmission electron microscopy (STEM) data. The postulate that we repeatedly use during the workflow is the existence and observability of atoms in STEM. We show this is sufficient to discover molecular structures and map chemical transformation mechanisms in graphene.
As a model system, we have explored the rich set of electron beam (e-beam) induced transformations in Si-containing graphene. This system has been shown to possess a broad range of e-beam induced transformations, including the formation of point and extended defects, migration of Si atoms, emergence of small-angle boundaries and fragmentation of the host graphene lattice, and eventually formation of defect clusters and degradation.29–43 Sample preparation details are provided in the "Methods" section. Figure 1(a) shows a region of the specimen where suspended clean graphene was observed next to an amorphous region containing Si and Cr atoms. The electron beam was scanned over the field of view recording a medium angle annular dark field (MAADF) image sequence during irradiation and inducing slow chemical changes in the suspended graphene. Here, we aim to explore whether the elementary bonding patterns (chemical fragments) in this system and their changes with time (chemical transformations) can be derived from the STEM observations in an unsupervised manner with a minimal number of a priori assumptions and postulates. In other words, can we create a computational architecture that can discover chemistry and chemical transformation pathways from observations in an unsupervised manner? While in this case the results can be compared with prior chemical knowledge in a straightforward fashion, similar approaches applied to unknown systems will offer a pathway to deep scientific discovery.
Our approach is based on the concept of a variational autoencoder (VAE)44 that finds the latent representation of the complex system. Generally, a VAE is a directed latent-variable probabilistic graphical model that learns a stochastic mapping between observations x with a complicated empirical distribution and latent variables z whose distribution can be relatively simple.45 A VAE consists of a generative model ("decoder") that reconstructs xi from a latent "code" zi and an inference model ("encoder") whose role is to approximate the posterior of the generative model via amortized variational inference.46 Implementation-wise, both the encoder and decoder are approximated by deep neural networks whose parameters are learned jointly by maximizing the evidence lower bound via stochastic gradient descent with randomly drawn mini-batches of data. In this manner, VAEs allow one to build relationships between high-dimensional datasets and a small number of latent variables, somewhat reminiscent of manifold learning and self-organized feature maps.
An important aspect of VAEs (shared with many manifold learning methods) is that the variability of the behaviors in the latent space allows one to reveal relevant features of the system behavior, i.e., the primary non-linear degrees of freedom. One example of this is hand gesture analysis; others are writing styles or emotional expressions. This capability of VAEs to disentangle data representations has attracted broad attention from the computer vision community and is being explored here. A second important aspect of the VAEs employed here is the principle of parsimony, i.e., the training process generates the best short descriptors representing the data, somewhat reminiscent of Occam's principle.
However, direct application of the VAE to experimental data requires attention to two special aspects. One is a suitable choice of the raw descriptors, namely, suitably chosen subsets of the images (as compared to classical applications, such as analysis of the MNIST or CIFAR datasets, where images are used in full). Establishing these descriptors requires prior knowledge of the system, since randomly picking sub-images (zero prior knowledge) tends to lead to suboptimal results. Here, as the prior knowledge, we use the most basic assumptions, i.e., the existence of atoms, the discreteness of atom classes, and the discoverability of atoms from STEM data. For the latter, we implicitly rely on the fact that the maximum of the STEM contrast corresponds to the location of an atomic nucleus and the intensity is proportional to the atomic number, i.e., it enables identification of the atoms and their types.
To implement this for STEM, we used deep convolutional neural networks (DCNNs) trained first on simulated data and then iteratively retrained to adapt to experimental images. The details about training DCNNs on simulated/synthetic atom-resolved data


FIG. 1. Discovery of molecular structural fragments and chemical transformation mechanisms via unsupervised machine learning. Here, a deep convolutional neural network (DCNN) is used to perform the semantic segmentation of (a) the experimental dataset to yield (b) the coordinates and identities of atomic species. Note that for "ideal" experimental data, similar results can be achieved using simple maxima analysis/blob finding, and the DCNN is used to visualize atoms only. Further shown are (c) the latent space of the skip rotationally invariant variational autoencoder (srVAE), with skip connectivity between the latent space and decoder, and (d) the image encoded with one of the srVAE latent variables. (e) The analysis of the latent space of the srVAE illustrates the regions containing the easily identifiable molecular fragments, e.g., 6- and 5-member cycles, 5–7 defects, and edge configurations. Note that these are discovered in a fully unsupervised manner. Comparing the evolution of latent variables corresponding to a single atom further allows discovery of chemical transformation mechanisms.



were reported earlier by multiple groups, including the authors of this study.47,48 Here, when we applied a DCNN trained on simulated data to the experimental data, characterized by rapidly growing amorphous regions and holes in the lattice [up to ∼60% (combined) of the entire image toward the end of the movie], it worked well only at the beginning of the movie and its predictions rapidly deteriorated as the system disorder increased. For example, it produced a large number of false positives inside the graphene holes. This behavior is not surprising and is expected when applying a deep learning model to out-of-distribution data.23 To address this issue, we used the DCNN predictions on the first ∼5 experimental frames, where detected atomic positions can be refined via standard Gaussian fitting, to create a new training set, then retrained the model and applied it to the entire movie. The retrained model demonstrated a significant improvement in the detection rate of atomic species and, most importantly, was not prone to the artifacts that plagued the performance of the initial model toward the end of the movie.
The application of the DCNN to the raw experimental data allows transformation of the STEM image stack into a semantically segmented dataset giving the probability that a specific pixel belongs to an atom. Combined with simple thresholding and blob finding, this allows straightforward and highly robust translation of raw STEM data into atomic coordinates and identities. Note that while DCNNs are a supervised learning technique, they are trained on labeled datasets that postulate only the existence of atomic species and define how they manifest in the STEM data. There are no assumptions on how atomic units are connected or how these connectivities evolve with time (i.e., chemical reactivity).
The DCNN analysis of the STEM images allows decoding of the atomic coordinates and their evolution with time. Note that similar information can be derived from, e.g., theoretical modeling or "ideal" imaging with a very high signal/noise ratio, bypassing the DCNN step. Furthermore, it is important to note that experimentally the carbon atoms are difficult to distinguish, i.e., their trajectories cannot be easily reconstructed. However, the well-separated Si atoms provide convenient markers without loss of generality of the approach. With the atomic positions (for C and Si) identified, we form the experimental descriptors as sub-images of a given size, N, centered on the atomic units. These descriptors combine the knowledge of atom existence (coordinates) and experimental data (raw or semantically segmented contrast). However, they do not contain any prior information on chemical bonding or larger-level structural blocks.
To discover salient features of the system behavior from the bottom up, we need a latent representation of the data, which can be obtained using a VAE. We note that the possible chemical building blocks can have different orientations within the image, necessitating a VAE architecture invariant with respect to rotations. Here, we adapted the rotationally invariant VAE (rVAE) originally proposed by Bepler et al.49 and adapted for the analysis of dynamic scanning probe and transmission electron microscopy data by Kalinin et al.50,51 We note that a shortcoming of traditional VAE encoder–decoder architectures is the so-called posterior collapse,52 i.e., when the posterior estimates of a latent variable zi do not provide a good representation of the data. To alleviate this problem and emphasize the reconstruction rather than the encoding of the data, here we connected the latent space with each layer of


the rVAE's decoder's neural network via skip connections,53,54 thereby enforcing a dependence between the observations and their latent variables. We note that this approach is different from the well-known method of adding residual ("skip") connections between different layers of a neural network,55 since in our case the connection paths originate in the latent space.
The latent space of the skip-rVAE trained on the experimental data is shown in Fig. 2(a). Here, a rectangular grid of points l1 ∈ [l1min, l1max], l2 ∈ [l2min, l2max] is formed, and the images obtained by decoding each chosen (l1, l2) pair are plotted on this grid. This depiction allows observation of the evolution of sub-images across the latent space and establishes the relationship between the STEM image (i.e., local atomic structure) and the latent variables. Note that not all combinations of latent variables correspond to physically possible STEM images, and hence the distributions of the experimental data in the latent space can have a complicated structure. Similarly, decoded images need to be checked for physicality.
Here, we explore whether the latent space reconstructions contain information on the molecular building blocks in the graphene lattice. By construction of the descriptors, the center of a reconstructed sub-image will contain a single well-defined atom (Si or C). Hence, the remainder of the image contrast can be directly interpreted in terms of whether atomic structures (as opposed to some abstract representations) are observed, and what these structures are. Surprisingly, casual examination of Fig. 2(a) illustrates that even at a low sampling density, the latent space representations generally correspond to well-defined molecular graphene fragments, comporting with the chemical intuition of an organic chemist. The lower half of the depicted latent space is comprised of structures formed by three 6-membered rings, the elementary building block of graphene for this window size. Structures in the upper part of the diagram illustrate the presence of edges or rings with different numbers of members. A part of the latent space contains the well-known 5–7 defects [magnified in Fig. 2(b)]. The "unphysical" images with smeared or weak contrast manifest in only a few locations and are an unavoidable feature of the projection of a discrete system onto a low-dimensional manifold.
To gain further insight into the system behavior in the latent space and establish relationships between the latent variables and classical organic chemistry descriptors, we classify the observed structures based on the connectivity of the carbon network. Here, we developed a simple approach where all units above an intensity threshold t = 0.5 in the images projected from the [l1, l2] coordinates of the latent space are considered to be physical. We then identified a graph structure G(n, E) that corresponds to these atomic units, where n is the number of nodes (units) in the image and E corresponds to the edges connecting different nodes. Only nodes separated by 0.5 lC–C < d < 1.2 lC–C are defined as connected. The lower bound is set such that it can potentially account for out-of-plane distortions that result in the apparent shrinkage of bonds as seen in the 2D projection in STEM. We note that such an analysis works only for 2D systems. Finally, a depth-first search method is used to traverse the graph structure and identify the number of different n-membered rings adjacent to the central atom. Note that this approach can be extended further to explore additional details of the atomic structure beyond adjacent rings (e.g., broken bonds, dangling atoms, etc.), but this analysis will require larger window sizes and will come at the cost of simplicity.
The distribution of the number of 5-, 6-, and 7-member rings in the latent space calculated with a high sampling is shown in

FIG. 2. Latent space analysis for atom-centered descriptors. (a) Latent space of the skip-rVAE for low sampling (12 × 12). (b) Graph analysis of the sub-image corresponding to a selected point in the latent space. Distributions of the (c) pentagonal, (d) hexagonal, and (e) heptagonal rings adjacent to the central atom in the latent space for high sampling (200 × 200). Here, −1 corresponds to unphysical configurations, and 1, 2, and 3 give the number of rings of a particular type surrounding the central atom. The dotted boxes in (c)–(e) correspond to the latent space area in (a).


Figs. 2(c)–2(e). The top side of the images (positive l2) generally contains a small number of rings and corresponds to edges or isolated atoms. The different values of the latent variables encode structural deformations within graphene. The bulk of the l2 < 0 region corresponds to the normal graphene structure. Finally, the region for −1 < l1 < 1 and 0 < l2 < 1.5 contains islands with varying numbers of 5- and 7-member rings. Interestingly, the overlap of these islands corresponds to the 5–7 defects.
Explicit examination of the images in the different parts of the latent space, as well as the morphologies of the domains in Figs. 2(c)–2(e), suggests that the skip-rVAE has discovered latent representations of the observed STEM contrast in terms of continuous latent variables, and that these representations separate the possible structures of the chemical bonding networks. The well-defined chemical structures occupy specific regions of the latent space with relative areas proportional to the fraction of these structures in the initial dataset. At higher sampling densities, these regions are often separated by thin lines of "unphysical" domains corresponding to the transitions between physically realizable configurations via "impossible" configurations, such as smeared atoms. Finally, within each uniform region, the structural variability along the latent space represents the physical distortions or structure outside the primary rings.
These behaviors are further illustrated in Fig. 3. Figure 3(a) shows the latent space of the system at high sampling. While the details of individual images cannot be discerned, the overall smooth evolution of decoded patterns is clearly visible, as are large-scale variations in the encoded behaviors. Figure 3(b) represents the overlay of Figs. 2(c)–2(e), visualizing the domains with dissimilar chemical structures. Finally, several structures corresponding to selected regions of the latent space are shown in Fig. 3(c), including (I) prototype graphene, (II) a 5-7-7 defect, (III) 7-member rings, (IV) an edge region, and (V) a 5-member ring. We note that the complexity of the chemical space increases with the size of the sub-image descriptors since, e.g., a 5-member ring can be adjacent to 6-, 7-, and 8-membered rings in different realizations of topological defects.
To simplify the representation of the chemical space, we also created an alternative set of descriptors for skip-rVAE training, where we used sub-images centered on hollow sites instead of atoms.
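The detection-plus-descriptor step discussed earlier (local maxima of the segmented atom-probability map above a threshold, then N × N sub-images cut out around each detected atom) can be sketched in plain NumPy. This is a minimal illustration under our own naming; `find_atoms`, `extract_descriptors`, and all parameter values are assumptions for the sketch, not the authors' implementation, which uses a DCNN and Gaussian refinement.

```python
import numpy as np

def find_atoms(prob_map, threshold=0.5, box=3):
    """Locate atoms as thresholded local maxima of a per-pixel
    atom-probability map (a stand-in for DCNN output + blob finding).
    Returns an array of (row, col) coordinates."""
    h, w = prob_map.shape
    r = box // 2
    coords = []
    for i in range(r, h - r):
        for j in range(r, w - r):
            patch = prob_map[i - r:i + r + 1, j - r:j + r + 1]
            if prob_map[i, j] >= threshold and prob_map[i, j] == patch.max():
                coords.append((i, j))
    return np.array(coords)

def extract_descriptors(image, coords, n=16):
    """Cut out n x n sub-images centered on each detected atom;
    atoms too close to the frame edge are skipped."""
    r = n // 2
    subs = []
    for (i, j) in coords:
        if r <= i < image.shape[0] - r and r <= j < image.shape[1] - r:
            subs.append(image[i - r:i + r, j - r:j + r])
    return np.stack(subs) if subs else np.empty((0, n, n))
```

These atom-centered windows, stacked over all frames, would then serve as the training set for the skip-rVAE; the hollow-site variant only changes the centering coordinates.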


FIG. 3. Chemical space analysis for atom-centered descriptors. (a) Latent space of skip-rVAE at high sampling. (b) Chemical space map and (c) several examples of the
observed structures, including (I) prototype graphene, (II) 5-7-7 defect, (III) 7-member ring, (IV) fragment of bearded edge, and (V) 5-member ring. The dotted box in (b)
corresponds to the area shown in (a).
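The graph analysis behind Figs. 2 and 3 (bonds assigned when two units fall within the 0.5 to 1.2 lC–C distance window, followed by a depth-first search for rings through a chosen atom) can be sketched as follows. Only the distance thresholds come from the text; the function names and the idealized coordinate input are our own.

```python
import numpy as np
from itertools import combinations

L_CC = 1.42  # graphene C–C bond length (arbitrary units for the sketch)

def bond_graph(coords, lo=0.5 * L_CC, hi=1.2 * L_CC):
    """Adjacency sets: two units are bonded if their distance falls
    inside the (0.5, 1.2) * l_CC window used in the text."""
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        d = np.linalg.norm(np.asarray(coords[i]) - np.asarray(coords[j]))
        if lo < d < hi:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def rings_through(adj, start, max_len=8):
    """Depth-first search for simple cycles containing `start`;
    returns the sorted sizes of the distinct rings found."""
    rings = set()

    def dfs(node, path):
        for nxt in adj[node]:
            if nxt == start and len(path) >= 3:
                rings.add(frozenset(path))  # dedupe the two traversal directions
            elif nxt not in path and len(path) < max_len:
                dfs(nxt, path + [nxt])

    dfs(start, [start])
    return sorted(len(r) for r in rings)
```

For an ideal hexagon of atoms, `rings_through` reports a single 6-membered ring through each vertex, matching the "prototype graphene" class in Fig. 2(d).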


This is achieved by applying the graph analysis described above to the DCNN output from the entire dataset and computing the center of mass of each identified cycle (with the maximum cycle length set to 8). By design, this description is suited for analyzing the material microstructure on the level of single rings. It is again important to mention that the information available to the network is the positions of the centers and the patches of the image centered at these positions, rather than atomic coordinates or any higher-level descriptors (bonds, orientations, etc.). The skip-rVAE results for the new set of descriptors are shown in Fig. 4. Here, both the latent manifold [Fig. 4(a)] and the chemical space map [Fig. 4(b)] have a much simpler structure, exhibiting well-defined regions associated with 5-, 6-, 7-, and 8-member rings [Fig. 4(c)].
This analysis naturally leads to the question of whether a different dimensionality of the latent space can be chosen. From a general perspective, a finite discrete space (i.e., all arrangements of carbon atoms containing no more than a certain number) can be projected even onto a 1D continuous distribution. Hence, for the discrete data (possible structural fragments), the encoding can be performed by 1, 2, or higher dimensional continuous variables. However, for 1D, the encoding will require a much larger number of significant digits, whereas 3- and higher dimensional latent spaces do not allow for straightforward visualization similar to Fig. 3(a). Hence, we chose 2D latent spaces for convenience, and note that a similar approach is used in many other branches of ML, e.g., in the analysis of self-organized feature maps (SOFMs) or graph embeddings.
Finally, this approach allows exploring the chemical dynamics, i.e., the transformations of the chemical bonding network during e-beam irradiation. Generally, this necessitates tracing the atoms between the frames and reconstructing their trajectories. However, in this case, significant changes in the carbon network between the frames make this difficult, especially for atoms whose bonding changes. In contrast, the Si atoms present in the graphene lattice offer readily identifiable markers that can often be traced between frames. An example of such an analysis is shown in Fig. 5. Figure 5(a) depicts the evolution of the latent variables for a selected Si atom through the ∼70 frames. Initially, the l2 values are close to 0, corresponding to Si on a three-coordinated lattice site, i.e., a substitutional defect. In the initial stage of the process, the atom moves within the "ideal" graphene regions, with the changes in the latent variables representing changes in the strain state and distant neighborhood. The 3-fold Si can transform into 4-fold Si [I–II in Fig. 5(c)], clearly visible in the latent space of the system [Fig. 5(b)]. Subsequently, more complex coordinations can emerge, as visualized in states (III)–(V) in Fig. 5(c).
We have shown that unsupervised machine learning can be used to learn chemical and physical transformation pathways from observational data. In STEM, we simply assumed the existence of atoms, the discreteness of atomic classes, and an explicit correspondence between the observed STEM contrast and the presence of atomic units. These reasonable assumptions, at the stage of the DCNN decoding of the images, enabled transitioning from the STEM data to atomic coordinates, and at the stage of the latent space analysis allowed the separation of physical and unphysical configurations. With only these postulates, the skip-rVAE can identify the existing molecular fragments observed within the material, encode them via two latent variables (for convenience), and enable exploration of transformation mechanisms through tracing the evolution of atoms in the latent space of the system.

FIG. 4. Chemical space analysis for hollow-site-centered descriptors. (a) Latent space of skip-rVAE for 12 × 12 sampling. (b) Chemical space map and (c) examples of the
ring structures from each of the four well-defined regions on the map in (b). The dotted box in (b) corresponds to the area shown in (a).
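For the hollow-site descriptors used in Fig. 4, the window centers are the centers of mass of the identified cycles rather than the atoms themselves. A sketch of that step (function name and ring-index representation are ours):

```python
import numpy as np

def hollow_sites(coords, rings):
    """Centers of mass of identified rings, given atom coordinates and
    rings as lists of atom indices; these centers replace atom positions
    as the descriptor windows for hollow-site analysis."""
    coords = np.asarray(coords, float)
    return np.array([coords[list(r)].mean(axis=0) for r in rings])
```

The resulting centers would then be fed to the same sub-image extraction as the atom-centered descriptors.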


FIG. 5. Chemical evolution of a selected Si atom during e-beam irradiation. (a) Evolution of the latent variables as a function of time. The position vector is calculated as √(x² + y²) from the image origin. (b) Dynamics on the latent plane, represented as a map of hexagonal rings [see Fig. 3(c)] for the first 30 frames (the scatter plot for all the time steps is available from the accompanying notebook). (c) Chemical neighborhoods during the beam-induced transformations, including the transition from (I) 3-fold coordinated to (II) 4-fold coordinated Si, to (III) a transitional structure, to (IV) 3-fold coordinated graphene with an adjacent 7-ring, to (V) more complex patterns involving the formation of quasi-linear chains. Note that the central atom is always Si. The depicted structures correspond to frames 3, 10, 11, 15, and 28 in (a).
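The position curve in Fig. 5(a) is the per-frame distance of the tracked atom from the image origin, √(x² + y²); as a one-function sketch (naming ours):

```python
import numpy as np

def radial_position(track):
    """Per-frame distance sqrt(x^2 + y^2) of a tracked atom from the
    image origin, given an (n_frames, 2) coordinate array."""
    track = np.asarray(track, float)
    return np.hypot(track[:, 0], track[:, 1])
```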

Overall, this approach suggests that imaging data obtained during dynamic evolution can be used to derive the chemical and physical transformation pathways involved, by providing encodings of the observed structures that act as bottom-up equivalents of structural order parameters. This in turn provides a strong stimulus toward the development of STEM, TEM, and scanning probe microscopy (SPM) techniques, including low-dose and ultrafast imaging and full information detection. We further posit that a similar approach can be applicable to other atomic-scale and mesoscopic imaging techniques, providing a consistent route for the identification of order parameters and other descriptors of complex mechanisms.
Finally, we argue that this approach forms a consistent framework for the application of machine learning methods in the physical sciences. The necessary steps here are a clear formulation of what postulates and constraints (prior knowledge) are imposed during feature selection and network architecture design, what data are provided, and what new knowledge is revealed by the ML algorithm given the data. Here, we demonstrated the ML approach for the discovery of chemical transformations from STEM observations given the existence and observability of atoms. Hence, we also believe that this approach will stimulate broader application of variational (i.e., Bayesian) methods in the physical sciences, as well as the development of new ways to encode physical constraints in encoder–decoder architectures, and generative physical laws and causal relationships in the latent space of VAEs.

METHODS

Graphene sample preparation

Atmospheric pressure chemical vapor deposition (AP-CVD) was used to grow graphene on Cu foil.56 A coating of poly(methyl methacrylate) (PMMA) was spin coated over the surface to protect the graphene and form a mechanical stabilizer during handling. Ammonium persulfate dissolved in deionized (DI) water was used to etch away the Cu foil. The remaining PMMA/graphene stack was rinsed in DI water, positioned on a TEM grid, and baked on a hot plate at 150 °C for 15 min to promote adhesion between the graphene and the TEM grid. After cooling, acetone was used to remove the PMMA, and isopropyl alcohol was used to remove the acetone residue. The sample was dried in air and baked in an Ar–O2 atmosphere (10% O2) at 500 °C for 1.5 h to remove residual contamination.57 Before examination in the STEM, the sample was baked in vacuum at 160 °C for 8 h.

STEM imaging

For imaging, a Nion UltraSTEM 200 was used, operated at 100 kV accelerating voltage with a nominal beam current of 20 pA

and a nominal convergence angle of 30 mrad. Images were acquired using the high-angle annular dark field detector.

STEM data analysis

The DCNN for atomic image segmentation was based on the U-Net architecture,58 where we replaced the conventional convolutional layers in the network's bottleneck with a spatial pyramid of dilated convolutions for better results on noisy data.59 The DCNN weights were trained using the Adam optimizer60 with a cross-entropy loss function and a learning rate of 0.001. In the skip-rVAE, both the encoder and decoder had four fully connected layers with 256 neurons in each layer. The skip connections were drawn from the latent layer into every decoder layer. The latent layer had three neurons designated to "absorb" arbitrary rotations and xy-translations of the image content, and the remaining neurons (2 in this case) in the latent layer were used for disentangling different atomic structures. The encoder and decoder neural networks were trained jointly using the Adam optimizer with a learning rate of 0.0001 and a mean-squared error loss. Both the DCNN and the VAE were implemented via the custom-built AtomAI package61 utilizing the PyTorch deep learning library.62

ACKNOWLEDGMENTS

This effort (ML and STEM) is based upon work supported by the U.S. Department of Energy (DOE), Office of Science, Basic Energy Sciences (BES), Materials Sciences and Engineering Division (S.V.K. and O.D.), the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences Data, Artificial Intelligence and Machine Learning at DOE Scientific User Facilities program under Award No. 34532 (A.G.), and the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences Energy Frontier Research Centers program under Award No. DE-SC0021118 (Y.L.) and was performed and partially supported (M.Z. and B.G.S.) at the Oak Ridge National Laboratory's Center for Nanophase Materials Sciences (CNMS), a U.S. Department of Energy, Office of Science User Facility operated by Oak Ridge National Laboratory. We acknowledge multiple productive interactions with Dr. Stephen Jesse. Notice: This manuscript has been authored by UT-Battelle, LLC, under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

AUTHOR DECLARATIONS

Conflict of Interest

The authors have no conflicts to disclose.

Author Contributions

Sergei V. Kalinin: Conceptualization (equal); Investigation (equal); Writing – original draft (equal); Writing – review & editing (equal). Ondrej Dyck: Investigation (equal); Writing – review & editing (equal). Ayana Ghosh: Investigation (equal); Writing – review & editing (equal). Yongtao Liu: Investigation (equal); Writing – review & editing (equal). Bobby G. Sumpter: Investigation (equal); Writing – review & editing (equal). Maxim Ziatdinov: Investigation (equal); Methodology (equal); Writing – review & editing (equal).

DATA AVAILABILITY

The data that support the findings of this study are openly available in the GitHub repository at https://github.com/ziatdinovmax/ChemDisc.

REFERENCES

1. Z. Ghahramani, "Probabilistic machine learning and artificial intelligence," Nature 521(7553), 452–459 (2015).
2. J. Schmidhuber, "Deep learning in neural networks: An overview," Neural Networks 61, 85–117 (2015).
3. Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature 521(7553), 436–444 (2015).
4. M. I. Jordan and T. M. Mitchell, "Machine learning: Trends, perspectives, and prospects," Science 349(6245), 255–260 (2015).
5. F. Jiang, Y. Jiang, H. Zhi, Y. Dong, H. Li, S. Ma, Y. Wang, Q. Dong, H. Shen, and Y. Wang, "Artificial intelligence in healthcare: Past, present and future," Stroke Vasc. Neurol. 2(4), 230–243 (2017).
6. O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei, "ImageNet large scale visual recognition challenge," Int. J. Comput. Vision 115(3), 211–252 (2015).
7. V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, and D. Hassabis, "Human-level control through deep reinforcement learning," Nature 518(7540), 529–533 (2015).
8. D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, "Mastering the game of Go with deep neural networks and tree search," Nature 529(7587), 484–489 (2016).
9. M. E. Taylor and P. Stone, "Transfer learning for reinforcement learning domains: A survey," J. Mach. Learn. Res. 10, 1633–1685 (2009), https://www.jmlr.org/papers/v10/taylor09a.html.
10. J. M. Mooij, J. Peters, D. Janzing, J. Zscheischler, and B. Scholkopf, "Distinguishing cause from effect using observational data: Methods and benchmarks," J. Mach. Learn. Res. 17, 1–102 (2016), https://jmlr.org/papers/volume17/14-518/14-518.pdf.
11. J. Pearl, "The seven tools of causal inference, with reflections on machine learning," Commun. ACM 62(3), 54–60 (2019).
12. H. Chen, O. Engkvist, Y. Wang, M. Olivecrona, and T. Blaschke, "The rise of deep learning in drug discovery," Drug Discovery Today 23(6), 1241–1250 (2018).
13. R. W. Epps, M. S. Bowen, A. A. Volk, K. Abdel-Latif, S. Y. Han, K. G. Reyes, A. Amassian, and M. Abolhasani, "Artificial chemist: An autonomous quantum dot synthesis bot," Adv. Mater. 32, 2001626 (2020).
14. S. Langner, F. Häse, J. D. Perea, T. Stubhan, J. Hauch, L. M. Roch, T. Heumueller, A. Aspuru-Guzik, and C. J. Brabec, "Beyond ternary OPV: High-throughput experimentation and self-driving laboratories optimize multicomponent systems," Adv. Mater. 32(14), e1907801 (2020).

15. B. P. MacLeod, F. G. L. Parlane, T. D. Morrissey, F. Häse, L. M. Roch, K. E. Dettelbach, R. Moreira, L. P. E. Yunker, M. B. Rooney, J. R. Deeth, V. Lai, G. J. Ng, H. Situ, R. H. Zhang, M. S. Elliott, T. H. Haley, D. J. Dvorak, A. Aspuru-Guzik, J. E. Hein, and C. P. Berlinguette, "Self-driving laboratory for accelerated discovery of thin-film materials," Sci. Adv. 6(20), eaaz8867 (2020).
16. S. Jesse, B. M. Hudak, E. Zarkadoula, J. Song, A. Maksov, M. Fuentes-Cabrera, P. Ganesh, I. Kravchenko, P. C. Snijders, A. R. Lupini, A. Y. Borisevich, and S. V. Kalinin, "Direct atomic fabrication and dopant positioning in Si using electron beams with active real-time image-based feedback," Nanotechnology 29(25), 255303 (2018).
17. S. V. Kalinin, A. Borisevich, and S. Jesse, "Fire up the atom forge," Nature 539(7630), 485–487 (2016).
18. G. Carleo, I. Cirac, K. Cranmer, L. Daudet, M. Schuld, N. Tishby, L. Vogt-Maranto, and L. Zdeborova, "Machine learning and the physical sciences," Rev. Mod. Phys. 91(4), 045002 (2019).
19. J. Carrasquilla and R. G. Melko, "Machine learning phases of matter," Nat. Phys. 13(5), 431–434 (2017).
20. Y. Zhang and E. A. Kim, "Quantum loop topography for machine learning," Phys. Rev. Lett. 118(21), 216401 (2017).
21. V. N. Vapnik, "An overview of statistical learning theory," IEEE Trans. Neural Networks 10(5), 988–999 (1999).
22. J. Ren, P. J. Liu, E. Fertig, J. Snoek, R. Poplin, M. Depristo, J. Dillon, and B. Lakshminarayanan, "Likelihood ratios for out-of-distribution detection," in Advances in Neural Information Processing Systems (Curran Associates, 2019), pp. 14707–14718, https://dl.acm.org/doi/abs/10.5555/3454287.3455604.
23. Y. Ovadia, E. Fertig, J. Ren, Z. Nado, D. Sculley, S. Nowozin, J. Dillon, B. Lakshminarayanan, and J. Snoek, "Can you trust your model's uncertainty? Evaluating predictive uncertainty under dataset shift," in Advances in Neural Information Processing Systems (Curran Associates, 2019), pp. 13991–14002, https://papers.nips.cc/paper_files/paper/2019/hash/8558cb408c1d76621371888657d2eb1d-Abstract.html.
24. R. Kannan, A. V. Ievlev, N. Laanait, M. A. Ziatdinov, R. K. Vasudevan, S. Jesse, and S. V. Kalinin, "Deep data analysis via physically constrained linear unmixing: Universal framework, domain examples, and a community-wide platform," Adv. Struct. Chem. Imaging 4, 6 (2018).
25. A. Choudhary, J. F. Lindner, E. G. Holliday, S. T. Miller, S. Sinha, and W. L. Ditto, "Physics-enhanced neural networks learn order and chaos," Phys. Rev. E 101(6), 062207 (2020).
26. S. Ye, J. Liang, R. Liu, and X. Zhu, "Symmetrical graph neural network for quantum chemistry with dual real and momenta space," J. Phys. Chem. A 124(34), 6945–6953 (2020).
27. M. Schmidt and H. Lipson, "Distilling free-form natural laws from experimental data," Science 324(5923), 81–85 (2009).
28. T. Wu and M. Tegmark, "Toward an artificial intelligence physicist for unsupervised learning," Phys. Rev. E 100(3), 033311 (2019).
29. O. Dyck, M. Ziatdinov, D. B. Lingerfelt, R. R. Unocic, B. M. Hudak, A. R. Lupini, S. Jesse, and S. V. Kalinin, "Atom-by-atom fabrication with electron beams," Nat. Rev. Mater. 4(7), 497–507 (2019).
30. O. Dyck, S. Kim, E. Jimenez-Izal, A. N. Alexandrova, S. V. Kalinin, and S. Jesse, "Building structures atom by atom via electron beam manipulation," Small 14(38), 1801771 (2018).
31. O. Dyck, S. Kim, S. V. Kalinin, and S. Jesse, "E-beam manipulation of Si atoms on graphene edges with an aberration-corrected scanning transmission electron microscope," Nano Res. 11(12), 6217–6226 (2018).
32. O. Dyck, S. Kim, S. V. Kalinin, and S. Jesse, "Placing single atoms in graphene with a scanning transmission electron microscope," Appl. Phys. Lett. 111(11), 113104 (2017).
33. M. Tripathi, A. Mittelberger, N. A. Pike, C. Mangler, J. C. Meyer, M. J. Verstraete, J. Kotakoski, and T. Susi, "Electron-beam manipulation of silicon dopants in graphene," Nano Lett. 18(8), 5319–5323 (2018).
34. T. Susi, J. C. Meyer, and J. Kotakoski, "Manipulating low-dimensional materials down to the level of single atoms with electron irradiation," Ultramicroscopy 180, 163–172 (2017).
35. T. Susi, D. Kepaptsoglou, Y.-C. Lin, Q. M. Ramasse, J. C. Meyer, K. Suenaga, and J. Kotakoski, "Towards atomically precise manipulation of 2D nanostructures in the electron microscope," 2D Mater. 4(4), 042004 (2017).
36. T. Susi, J. Kotakoski, D. Kepaptsoglou, C. Mangler, T. C. Lovejoy, O. L. Krivanek, R. Zan, U. Bangert, P. Ayala, J. C. Meyer, and Q. Ramasse, "Silicon–carbon bond inversions driven by 60-keV electrons in graphene," Phys. Rev. Lett. 113(11), 115501 (2014).
37. A. W. Robertson, G.-D. Lee, K. He, Y. Fan, C. S. Allen, S. Lee, H. Kim, E. Yoon, H. Zheng, A. I. Kirkland, and J. H. Warner, "Partial dislocations in graphene and their atomic level migration dynamics," Nano Lett. 15(9), 5950–5955 (2015).
38. A. W. Robertson, G.-D. Lee, K. He, C. Gong, Q. Chen, E. Yoon, A. I. Kirkland, and J. H. Warner, "Atomic structure of graphene subnanometer pores," ACS Nano 9(12), 11599–11607 (2015).
39. A. W. Robertson, G.-D. Lee, K. He, E. Yoon, A. I. Kirkland, and J. H. Warner, "Stability and dynamics of the tetravacancy in graphene," Nano Lett. 14(3), 1634–1642 (2014).
40. A. W. Robertson, G.-D. Lee, K. He, E. Yoon, A. I. Kirkland, and J. H. Warner, "The role of the bridging atom in stabilizing odd numbered graphene vacancies," Nano Lett. 14(7), 3972–3980 (2014).
41. A. W. Robertson, K. He, A. I. Kirkland, and J. H. Warner, "Inflating graphene with atomic scale blisters," Nano Lett. 14(2), 908–914 (2014).
42. A. W. Robertson, C. S. Allen, Y. A. Wu, K. He, J. Olivier, J. Neethling, A. I. Kirkland, and J. H. Warner, "Spatial control of defect creation in graphene at the nanoscale," Nat. Commun. 3, 1144 (2012).
43. J. H. Warner, E. R. Margine, M. Mukai, A. W. Robertson, F. Giustino, and A. I. Kirkland, "Dislocation-driven deformations in graphene," Science 337(6091), 209–212 (2012).
44. D. P. Kingma and M. Welling, "Auto-encoding variational Bayes," arXiv:1312.6114 (2013).
45. D. P. Kingma and M. Welling, "An introduction to variational autoencoders," Found. Trends Mach. Learn. 12(4), 307–392 (2019).
46. D. M. Blei, A. Kucukelbir, and J. D. McAuliffe, "Variational inference: A review for statisticians," J. Am. Stat. Assoc. 112(518), 859–877 (2017).
47. M. Ziatdinov, O. Dyck, X. Li, B. G. Sumpter, S. Jesse, R. K. Vasudevan, and S. V. Kalinin, "Building and exploring libraries of atomic defects in graphene: Scanning transmission electron and scanning tunneling microscopy study," Sci. Adv. 5(9), eaaw8989 (2019).
48. J. Madsen, P. Liu, J. Kling, J. B. Wagner, T. W. Hansen, O. Winther, and J. Schiøtz, "A deep learning approach to identify local structures in atomic-resolution transmission electron microscopy images," Adv. Theory Simul. 1(8), 1800037 (2018).
49. T. Bepler, E. Zhong, K. Kelley, E. Brignole, and B. Berger, "Explicitly disentangling image content from translation and rotation with spatial-VAE," in Advances in Neural Information Processing Systems (Curran Associates, 2019), pp. 15409–15419, https://papers.nips.cc/paper_files/paper/2019/hash/5a38a1eb24d99699159da10e71c45577-Abstract.html.
50. S. V. Kalinin, J. J. Steffes, Y. Liu, B. D. Huey, and M. Ziatdinov, "Disentangling ferroelectric domain wall geometries and pathways in dynamic piezoresponse force microscopy via unsupervised machine learning," Nanotechnology 33(5), 055707 (2021).
51. S. V. Kalinin, O. Dyck, S. Jesse, and M. Ziatdinov, "Machine learning of chemical transformations in the Si-graphene system from atomically resolved images via variational autoencoder," arXiv:2006.10267 (2020).
52. D. P. Kingma, T. Salimans, R. Jozefowicz, X. Chen, I. Sutskever, and M. Welling, "Improved variational inference with inverse autoregressive flow," in Proceedings of the 30th International Conference on Neural Information Processing Systems (Curran Associates, Inc., Barcelona, Spain, 2016), pp. 4743–4751.
53. A. B. Dieng, Y. Kim, A. M. Rush, and D. M. Blei, "Avoiding latent variable collapse with generative skip models," in Proceedings of Machine Learning Research, edited by C. Kamalika and S. Masashi (PMLR, 2019), Vol. 89, pp. 2397–2405, https://proceedings.mlr.press/v89/dieng19a.html.
54. E. Orhan and X. Pitkow, "Skip connections eliminate singularities," in International Conference on Learning Representations, 2018.
55. R. K. Srivastava, K. Greff, and J. Schmidhuber, "Training very deep networks," in Proceedings of the 28th International Conference on Neural Information Processing Systems (MIT Press, Montreal, Canada, 2015), Vol. 2, pp. 2377–2385.

56. I. Vlassiouk, P. Fulvio, H. Meyer, N. Lavrik, S. Dai, P. Datskos, and S. Smirnov, "Large scale atmospheric pressure chemical vapor deposition of graphene," Carbon 54, 58–67 (2013).
57. O. Dyck, S. Kim, S. V. Kalinin, and S. Jesse, "Mitigating e-beam-induced hydrocarbon deposition on graphene for atomic-scale scanning transmission electron microscopy studies," J. Vac. Sci. Technol. B 36(1), 011801 (2017).
58. O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer, 2015), pp. 234–241.
59. M. Ziatdinov, C. Nelson, R. K. Vasudevan, D. Y. Chen, and S. V. Kalinin, "Building ferroelectric from the bottom up: The machine learning analysis of the atomic-scale ferroelectric distortions," Appl. Phys. Lett. 115(5), 052902 (2019).
60. D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv:1412.6980 (2015).
61. M. Ziatdinov, A. Ghosh, C. Y. Wong, and S. V. Kalinin, "AtomAI framework for deep learning analysis of image and spectroscopy data in electron and scanning probe microscopy," Nat. Mach. Intell. 4, 1101–1112 (2022).
62. A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, and L. Antiga, "PyTorch: An imperative style, high-performance deep learning library," in Advances in Neural Information Processing Systems (Curran Associates, 2019), pp. 8026–8037, https://papers.nips.cc/paper_files/paper/2019/hash/bdbca288fee7f92f2bfa9f7012727740-Abstract.html.

© Author(s) 2023