Unsupervised Machine Learning Discovery of Structural Units and Transformation Pathways From Imaging Data
Sergei V. Kalinin,1,a) Ondrej Dyck,2 Ayana Ghosh,3 Yongtao Liu,2 Bobby G. Sumpter,2 and Maxim Ziatdinov2,3,b)

AFFILIATIONS
1 Department of Materials Science and Engineering, The University of Tennessee, Knoxville, Tennessee 37996, USA
2 Center for Nanophase Materials Sciences, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA
3 Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA

a) Author to whom correspondence should be addressed: [email protected]
b) [email protected]
Discovery in the physical sciences typically relies on relatively small amounts of data and allows extrapolation well outside the specific domains, which are the “hard” tasks for traditional ML. Thus, the merger of ML and classical scientific methods may provide a new forefront for research and development.

One such approach is based on the introduction of known physical constraints in the form of symmetries, conservation laws, etc., to limit ML predictions to the physically possible. Simple examples of this approach are the use of sparsity, non-negativity, or sum-to-one constraints in linear and nonlinear unmixing,24 resembling the classical Lagrange multiplier approach. Several groups have introduced the physical constraints of Hamiltonian mechanics into deep neural networks, significantly improving predictions of long-term dynamics in chaotic systems.25 Similarly, the network architecture can be made to encode known symmetry constraints, enabling the discovery of order parameters.26 Alternatively, constraints can be introduced on the form and complexity of structural and generative behaviors, as exemplified by the combination of symbolic regressions and genetic algorithms.27,28 However, until now most of these developments have been limited to areas, such as mechanics and astronomy, where the generative laws are relatively simple and well defined and hence ML predictions can be readily compared to physical models.

At the same time, discovery of the more complex laws and reduced rules that define the fields of chemistry, molecular biology, etc., has not yet been demonstrated. In these fields, the elementary descriptors of the systems are generally not well defined, and the constraints are often unknown.

Our experimental system is suspended graphene observed in STEM under electron-beam irradiation, which induces slow chemical changes in the lattice. Here, we aim to explore whether the elementary bonding patterns (chemical fragments) in this system and their changes with time (chemical transformations) can be derived from the STEM observations in an unsupervised manner with a minimum number of a priori assumptions and postulates. In other words, can we create a computational architecture that can discover chemistry and chemical transformation pathways from observations in an unsupervised manner? While in this case the results can be compared with prior chemical knowledge in a straightforward fashion, similar approaches applied to unknown systems will offer a pathway to deep scientific discovery.

Our approach is based on the concept of a variational autoencoder (VAE)44 that finds the latent representation of a complex system. Generally, a VAE is a directed latent-variable probabilistic graphical model that learns a stochastic mapping between observations x with a complicated empirical distribution and latent variables z whose distribution can be relatively simple.45 A VAE consists of a generative model (“decoder”) that reconstructs xi from a latent “code” zi and an inference model (“encoder”) whose role is to approximate the posterior of the generative model via amortized variational inference.46 Implementation-wise, both the encoder and decoder models are approximated by deep neural networks whose parameters are learned jointly by maximizing the evidence lower bound via stochastic gradient descent with randomly drawn mini-batches of data. In this manner, VAEs allow one to build relationships between high-dimensional datasets and a small number of latent variables.
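To make the VAE construction above concrete, the following minimal sketch (PyTorch; the layer sizes and the 64 × 64 input are illustrative assumptions, and this is not the skip-rVAE used in this work) shows an encoder–decoder pair trained by maximizing the evidence lower bound, i.e., by minimizing a reconstruction term plus the KL divergence of the approximate posterior from the prior:

```python
# Minimal VAE sketch (illustrative sizes; not the skip-rVAE used in this work).
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, input_dim=64 * 64, hidden_dim=256, latent_dim=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.Tanh())
        self.fc_mu = nn.Linear(hidden_dim, latent_dim)      # mean of q(z|x)
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)  # log-variance of q(z|x)
        self.decoder = nn.Sequential(nn.Linear(latent_dim, hidden_dim), nn.Tanh(),
                                     nn.Linear(hidden_dim, input_dim))

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
        return self.decoder(z), mu, logvar

def neg_elbo(x_rec, x, mu, logvar):
    # Reconstruction error plus KL(q(z|x) || N(0, I)); minimizing this
    # maximizes the evidence lower bound.
    rec = F.mse_loss(x_rec, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl

# Training on randomly drawn mini-batches:
# model = VAE()
# opt = torch.optim.Adam(model.parameters(), lr=1e-4)
# for x in loader:                      # x: (batch, 64, 64) sub-images
#     x_rec, mu, logvar = model(x.flatten(1))
#     loss = neg_elbo(x_rec, x.flatten(1), mu, logvar)
#     opt.zero_grad(); loss.backward(); opt.step()
```

The rotationally invariant variants used in this work additionally reserve latent variables for rotations and translations of the image content, as described in the Methods.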
FIG. 1. Discovery of molecular structural fragments and chemical transformation mechanisms via unsupervised machine learning. Here, a deep convolutional neural network (DCNN) is used to perform the semantic segmentation of (a) the experimental dataset to yield (b) the coordinates and identities of the atomic species. Note that for “ideal” experimental data, similar results can be achieved using simple maxima analysis/blob finding, and the DCNN is used to visualize atoms only. Further shown are (c) the latent space of the skip rotationally invariant variational autoencoder (srVAE) with skip connectivity between the latent space and the decoder and (d) an image encoded with one of the srVAE latent variables. (e) The analysis of the latent space of the srVAE illustrates the regions containing easily identifiable molecular fragments, e.g., 6- and 5-member cycles, 5–7 defects, and edge configurations. Note that these are discovered in a fully unsupervised manner. Comparing the evolution of the latent variables corresponding to a single atom further allows discovery of chemical transformation mechanisms.
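As a concrete illustration of the maxima analysis/blob finding mentioned in the caption, a minimal sketch is given below (SciPy; the smoothing width, neighborhood size, and threshold are illustrative assumptions, not values from this work):

```python
# Local-maxima ("blob finding") sketch for extracting atomic coordinates.
# Smoothing width, neighborhood size, and threshold are illustrative.
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def find_atoms(image, sigma=2.0, min_dist=5, threshold=0.3):
    """Return (row, col) coordinates of bright local maxima in a 2D image."""
    s = gaussian_filter(image.astype(float), sigma)
    s = (s - s.min()) / (s.max() - s.min() + 1e-12)  # normalize to [0, 1]
    # A pixel is a peak if it equals the maximum over its neighborhood
    # and exceeds the intensity threshold.
    peaks = (s == maximum_filter(s, size=min_dist)) & (s > threshold)
    return np.argwhere(peaks)

# coords = find_atoms(stem_frame)  # stem_frame: 2D numpy array
```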
In the skip-rVAE, the latent variables are fed directly into the rVAE's decoder's neural network via the skip connections,53,54 thereby enforcing a dependence between the observations and their latent variables. We note that this approach is different from the well-known method of adding residual (“skip”) connections between different layers of a neural network55 since in our case the connection paths originate in the latent space.

The latent space of the skip-rVAE trained on the experimental data is shown in Fig. 2(a). Here, a rectangular grid of points l1 ∈ [l1min, l1max], l2 ∈ [l2min, l2max] is formed, and the images obtained by decoding each chosen (l1, l2) pair are plotted on this grid. This depiction allows observation of the evolution of sub-images across the latent space and establishes the relationship between the STEM image (i.e., the local atomic structure) and the latent variables. Note that not all combinations of latent variables correspond to physically possible STEM images, and hence the distributions of the experimental data in the latent space can have a complicated structure. Similarly, decoded images need to be ascertained for physicality.

Here, we explore whether the latent space reconstructions contain information on the molecular building blocks in the graphene lattice. By construction of the descriptors, the center of a reconstructed sub-image will contain a single well-defined atom (Si or C). Hence, the remainder of the image contrast can be directly interpreted in terms of whether atomic structures (as opposed to some abstract representations) are observed, and what these structures are. Surprisingly, casual examination of Fig. 2(a) illustrates that even at a low sampling density, the latent space representations generally correspond to well-defined molecular graphene fragments, comporting with the structures possible for this window size. Structures in the upper part of the diagram illustrate the presence of edges or rings with different numbers of members. A part of the latent space contains the well-known 5–7 defects [magnified in Fig. 2(b)]. The “unphysical” images with smeared or weak contrast manifest in only a few locations and are an unavoidable feature of the projection of a discrete system onto a low-dimensional manifold.

To gain further insight into the system behavior in the latent space and establish relationships between the latent variables and classical organic chemistry descriptors, we classify the observed structures based on the connectivity of the carbon network. Here, we developed a simple approach where all units above an intensity threshold t = 0.5 in the images projected from the [l1, l2] coordinates of the latent space are considered to be physical. We then identified a graph structure G(n, E) that corresponds to these atomic units, where n is the number of nodes (units) in the image and E corresponds to the edges connecting different nodes. Only nodes separated by 0.5lC–C < d < 1.2lC–C are defined as connected. The lower bound is set such that it can potentially account for out-of-plane distortions that result in the apparent shrinkage of bonds as seen from the 2D projection in STEM. We note that such an analysis works only for 2D systems. Finally, a depth-first search method is used to traverse the graph structure and identify the number of different n-membered rings adjacent to the central atom. Note that this approach can be extended further to explore additional details of the atomic structure beyond the adjacent rings (e.g., broken bonds, dangling atoms, etc.), but this analysis will require larger window sizes and will come at the cost of increased computational complexity.
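A minimal sketch of this graph construction and ring search is given below (NumPy; the function names are ours, the position array and the C–C bond length lC–C are assumed given, and chordless-cycle filtering is omitted for brevity):

```python
# Sketch: bond graph G(n, E) from atomic positions and DFS ring counting.
# Function names are illustrative; l_cc is the C-C bond length in the same
# units as `coords`.
import numpy as np

def bond_graph(coords, l_cc):
    """Adjacency sets; nodes are connected if 0.5*l_cc < d < 1.2*l_cc."""
    coords = np.asarray(coords, float)
    adj = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = np.linalg.norm(coords[i] - coords[j])
            if 0.5 * l_cc < d < 1.2 * l_cc:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def rings_through(adj, start, max_len=8):
    """Depth-first search for simple cycles through `start`, up to max_len."""
    rings = set()

    def dfs(node, path):
        for nxt in adj[node]:
            if nxt == start and len(path) >= 3:
                rings.add(frozenset(path))  # dedupes the two traversal directions
            elif nxt not in path and len(path) < max_len:
                dfs(nxt, path + [nxt])

    dfs(start, [start])
    return [sorted(r) for r in rings]

# adj = bond_graph(atom_coords, l_cc=1.42)   # atom_coords: (N, 2) array, Angstroms
# n_hex = sum(len(r) == 6 for r in rings_through(adj, central_atom))
```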
FIG. 2. Latent space analysis for atom-centered descriptors. (a) Latent space of the skip-rVAE for low sampling (12 × 12). (b) The graph analysis of the sub-image corresponding to a selected point in the latent space. Distributions of the (c) pentagonal, (d) hexagonal, and (e) heptagonal rings adjacent to the central atom in the latent space for high sampling (200 × 200). Here, −1 corresponds to unphysical configurations, and 1, 2, and 3 give the number of rings of a particular type surrounding the central atom. The dotted boxes in (c)–(e) correspond to the latent space area in (a).
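A latent-manifold plot such as Fig. 2(a) can be assembled roughly as follows (Matplotlib; `decode` is a placeholder for the trained skip-rVAE decoder, and the grid bounds and sizes are illustrative):

```python
# Sketch: render the decoded image manifold over a rectangular latent grid.
# `decode` is a placeholder mapping (l1, l2) to a (px, px) image patch.
import numpy as np
import matplotlib.pyplot as plt

def plot_latent_manifold(decode, l1_lim=(-3, 3), l2_lim=(-3, 3), n=12, px=32):
    canvas = np.zeros((n * px, n * px))
    for i, l2 in enumerate(np.linspace(*l2_lim, n)[::-1]):  # top row = max l2
        for j, l1 in enumerate(np.linspace(*l1_lim, n)):
            canvas[i * px:(i + 1) * px, j * px:(j + 1) * px] = decode(np.array([l1, l2]))
    plt.imshow(canvas, cmap="gray")
    plt.xlabel("$l_1$")
    plt.ylabel("$l_2$")
    plt.show()
```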
The results of this classification are summarized in Figs. 2(c)–2(e). The top side of the images (positive l2) generally contains a small number of rings and corresponds to edges or isolated atoms. The different values of the latent variables encode structural deformations within graphene. The bulk of the l2 < 0 region corresponds to a normal graphene structure. Finally, the region for −1 < l1 < 1 and 0 < l2 < 1.5 contains islands with varying numbers of 5- and 7-member rings. Interestingly, the overlap of these islands corresponds to the 5–7 defects.

Explicit examination of the images in the different parts of the latent space as well as the morphologies of the domains in Figs. 2(c)–2(e) suggests that the skip-rVAE has discovered the latent representations of the observed STEM contrast in terms of continuous latent variables, and these representations separate the possible structures of chemical bonding networks. The well-defined chemical structures occupy specific regions of the latent space with relative areas proportional to the fraction of these structures in the initial dataset. At higher sampling densities, these regions are often separated by thin lines of “unphysical” domains corresponding to the transitions between physically realizable configurations via “impossible” configurations, such as smeared atoms. Finally, within each uniform region, the structure variability across the latent space represents the physical distortions or structure outside the primary rings.

These behaviors are further illustrated in Fig. 3. Figure 3(a) shows the latent space of the system at high sampling. While the details of individual images cannot be discerned, the overall smooth evolution of decoded patterns is clearly visible, as are large-scale variations in encoded behaviors. Figure 3(b) represents the overlay of Figs. 2(c)–2(e), visualizing the domains with dissimilar chemical structures. Finally, several structures corresponding to selected regions of the latent space are shown in Fig. 3(c), including (I) prototype graphene, (II) a 5-7-7 defect, (III) 7-member rings, (IV) an edge region, and (V) a 5-member ring. We note that the complexity of the chemical space increases with the size of the sub-image descriptors since, e.g., a 5-member ring can be adjacent to 6-, 7-, and 8-membered rings in different realizations of topological defects.

To simplify the representation of the chemical space, we also created an alternative set of descriptors for skip-rVAE training where we used sub-images centered on hollow sites instead of atoms.
FIG. 3. Chemical space analysis for atom-centered descriptors. (a) Latent space of skip-rVAE at high sampling. (b) Chemical space map and (c) several examples of the
observed structures, including (I) prototype graphene, (II) 5-7-7 defect, (III) 7-member ring, (IV) fragment of bearded edge, and (V) 5-member ring. The dotted box in (b)
corresponds to the area shown in (a).
This is achieved by applying the graph analysis described above to the DCNN output from the entire dataset and computing the center of mass for each identified cycle (with the maximum cycle length set to 8). By design, this description is suited for analyzing the material microstructure at the level of single rings. It is again important to mention that the information available to the network is the positions of the centers and the patches of the image centered at these positions, rather than atomic coordinates or any higher-level descriptors (bonds, orientations, etc.). The skip-rVAE results for the new set of descriptors are shown in Fig. 4. Here, both the latent manifold [Fig. 4(a)] and the chemical space map [Fig. 4(b)] have a much simpler structure, exhibiting well-defined regions associated with 5-, 6-, 7-, and 8-member rings [Fig. 4(c)]. A sketch of the hollow-site descriptor construction is given after the Fig. 4 caption below.

This analysis naturally leads to the question of whether a different dimensionality of the latent space can be chosen. From a general perspective, a finite discrete space (i.e., the set of all arrangements of no more than a certain number of carbon atoms) can be projected even onto a 1D continuous distribution. Hence, for the discrete data (possible structural fragments), the encoding can be performed by 1, 2, or higher dimensional continuous variables. However, for 1D, the encoding will require a much larger number of significant digits, whereas 3- and higher dimensional latent spaces do not allow for straightforward visualization similar to Fig. 3(a). Hence, we chose 2D latent spaces for convenience, and note that a similar approach is used in many other branches of ML, e.g., in the analysis of self-organized feature maps (SOFMs) or graph embeddings.

Tracking the temporal evolution of these structures requires the identification of the atoms between the frames and reconstruction of the trajectories. However, in this case, significant changes in the carbon network between the frames make this difficult, especially for atoms where the bonding changes. Si atoms that are present in the graphene lattice, however, offer readily identifiable markers that can often be traced between frames. An example of such an analysis is shown in Fig. 5. Figure 5(a) depicts the evolution of the latent variables for a selected Si atom through the ∼70 frames. Initially, the l2 values are close to 0, corresponding to the Si on a three-coordinated lattice site, i.e., a substitutional defect. In the initial stage of the process, the atom moves within the “ideal” graphene regions, with the changes in latent variables representing the changes in strain state and distant neighborhood. The 3-fold Si can transform into 4-fold Si [I–II in Fig. 5(c)], clearly visible in the latent space of the system [Fig. 5(b)]. Subsequently, more complex coordinations can emerge, as visualized in states (III–V) in Fig. 5(c).

We have shown that unsupervised machine learning can be used to learn chemical and physical transformation pathways from observational data. In STEM, we simply assumed the existence of atoms, a discreteness of atomic classes, and that there is an explicit correspondence between the observed STEM contrast and the presence of atomic units. These reasonable assumptions, at the stage of the DCNN decoding of the images, enabled transitioning from the STEM data to atomic coordinates, and at the stage of latent space analysis, allowed separation of physical and unphysical configurations. With only these postulates, the skip-rVAE can identify the existing molecular fragments observed within the image data.
FIG. 4. Chemical space analysis for hollow-site-centered descriptors. (a) Latent space of skip-rVAE for 12 × 12 sampling. (b) Chemical space map and (c) examples of the
ring structures from each of the four well-defined regions on the map in (b). The dotted box in (b) corresponds to the area shown in (a).
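The hollow-site descriptor construction described in the text above (ring centroids, with the maximum cycle length set to 8) can be sketched as follows, reusing the hypothetical bond_graph/rings_through helpers from the earlier sketch:

```python
# Sketch: hollow-site descriptor centers as ring centroids (max cycle length 8),
# reusing the illustrative bond_graph/rings_through helpers defined earlier.
import numpy as np

def hollow_site_centers(coords, adj, max_len=8):
    coords = np.asarray(coords, float)
    centers, seen = [], set()
    for i in range(len(coords)):
        for ring in rings_through(adj, i, max_len):
            key = tuple(ring)            # rings are returned sorted -> dedupes
            if key not in seen:
                seen.add(key)
                centers.append(coords[ring].mean(axis=0))  # center of mass
    return np.array(centers)

# Sub-images for skip-rVAE training are then cropped around each center.
```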
FIG. 5. Chemical evolution of a selected Si atom during e-beam irradiation. (a) Evolution of the latent variables as a function of time. The position vector is calculated as √(x² + y²) from the image origin. (b) Dynamics on the latent plane represented as a map of hexagonal rings [see Fig. 3(c)] for the first 30 frames.
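The frame-to-frame tracing underlying Fig. 5 can be sketched as nearest-neighbor matching of the Si coordinates, with the latent coordinates of the centered sub-image recorded at each step (encode_patch and the jump threshold are illustrative placeholders, not the paper's code):

```python
# Sketch: trace one Si atom across frames by nearest-neighbor matching and
# record its latent coordinates. `si_coords[t]` is an (N_t, 2) array of Si
# positions in frame t; `encode_patch(t, xy)` is a placeholder returning the
# (l1, l2) latent vector of the sub-image centered at xy in frame t.
import numpy as np

def trace_si(si_coords, encode_patch, start_xy, max_jump=3.0):
    xy = np.asarray(start_xy, float)
    trajectory, latents = [], []
    for t, coords in enumerate(si_coords):
        d = np.linalg.norm(coords - xy, axis=1)
        if d.min() > max_jump:  # atom lost (e.g., sputtered or mis-detected)
            break
        xy = coords[d.argmin()]
        trajectory.append(xy)
        latents.append(encode_patch(t, xy))
    return np.array(trajectory), np.array(latents)
```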
Overall, this approach suggests that the imaging data obtained during dynamic evolution can be used to derive the chemical and physical transformation pathways involved, by providing encodings of the observed structures that act as bottom-up equivalents of structural order parameters. This in turn provides a strong stimulus toward the development of STEM, TEM, and scanning probe microscopy (SPM) techniques, including low-dose and ultrafast imaging and full information detection. We further posit that a similar approach can be applicable to other atomic-scale and mesoscopic imaging techniques, providing a consistent approach for the identification of order parameters and other descriptors in complex mechanisms.

Finally, we argue that this approach forms a consistent framework for the application of machine learning methods in the physical sciences. The necessary steps here are the clear formulation of what postulates and constraints (prior knowledge) are imposed during the feature selection and network architecture design, what data are provided, and what new knowledge is revealed by the ML algorithm given the data. Here, we demonstrated the ML approach for the discovery of chemical transformations from STEM observations given the existence and observability of atoms. Hence, we also believe that this approach will stimulate broader application of variational (i.e., Bayesian) methods in the physical sciences, as well as the development of new ways to encode physical constraints in the encoder–decoder architectures, and generative physical laws and causal relationships in the latent space of VAEs.

METHODS

Graphene sample preparation

Atmospheric pressure chemical vapor deposition (AP-CVD) was used to grow graphene on Cu foil.56 A coating of poly(methyl methacrylate) (PMMA) was spin coated over the surface to protect the graphene and form a mechanical stabilizer during handling. Ammonium persulfate dissolved in deionized (DI) water was used to etch away the Cu foil. The remaining PMMA/graphene stack was rinsed in DI water, positioned on a TEM grid, and baked on a hot plate at 150 °C for 15 min to promote adhesion between the graphene and the TEM grid. After cooling, acetone was used to remove the PMMA and isopropyl alcohol was used to remove the acetone residue. The sample was dried in air and baked in an Ar–O2 atmosphere (10% O2) at 500 °C for 1.5 h to remove residual contamination.57 Before examination in the STEM, the sample was baked in vacuum at 160 °C for 8 h.

STEM imaging

For imaging, a Nion UltraSTEM 200 was used, operated at a 100 kV accelerating voltage with a nominal beam current of 20 pA
and a nominal convergence angle of 30 mrad. Images were acquired using the high-angle annular dark field detector.

STEM data analysis

The DCNN for atomic image segmentation was based on the U-Net architecture,58 where we replaced the conventional convolutional layers in the network's bottleneck with a spatial pyramid of dilated convolutions for better results on noisy data.59 The DCNN weights were trained using the Adam optimizer60 with a cross-entropy loss function and a learning rate of 0.001. In the skip-rVAE, both the encoder and decoder had four fully connected layers with 256 neurons in each layer. The skip connections were drawn from the latent layer into every decoder layer. The latent layer had three neurons designated to “absorb” arbitrary rotations and xy-translations of the image content, and the remaining neurons (two in this case) in the latent layer were used for disentangling different atomic structures. The encoder and decoder neural networks were trained jointly using the Adam optimizer with a learning rate of 0.0001 and a mean-squared error loss. Both the DCNN and the VAE were implemented via the custom-built AtomAI package61 utilizing the PyTorch deep learning library.62
ACKNOWLEDGMENTS

Author Contributions

Sergei V. Kalinin: Conceptualization (equal); Investigation (equal); Writing – original draft (equal); Writing – review & editing (equal). Ondrej Dyck: Investigation (equal); Writing – review & editing (equal). Ayana Ghosh: Investigation (equal); Writing – review & editing (equal). Yongtao Liu: Investigation (equal); Writing – review & editing (equal). Bobby G. Sumpter: Investigation (equal); Writing – review & editing (equal). Maxim Ziatdinov: Investigation (equal); Methodology (equal); Writing – review & editing (equal).

DATA AVAILABILITY

The data that support the findings of this study are openly available in the GitHub repository at https://github.com/ziatdinovmax/ChemDisc.

REFERENCES
1. Z. Ghahramani, “Probabilistic machine learning and artificial intelligence,” Nature 521(7553), 452–459 (2015).
2. J. Schmidhuber, “Deep learning in neural networks: An overview,” Neural Networks 61, 85–117 (2015).
3. Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature 521(7553), 436–444 (2015).
4. M. I. Jordan and T. M. Mitchell, “Machine learning: Trends, perspectives, and prospects,” Science 349(6245), 255–260 (2015).
5. F. Jiang, Y. Jiang, H. Zhi, Y. Dong, H. Li, S. Ma, Y. Wang, Q. Dong, H. Shen, and Y. Wang, “Artificial intelligence in healthcare: Past, present and future,” Stroke Vasc. Neurol. 2(4), 230–243 (2017).
15. B. P. MacLeod, F. G. L. Parlane, T. D. Morrissey, F. Häse, L. M. Roch, K. E. Dettelbach, R. Moreira, L. P. E. Yunker, M. B. Rooney, J. R. Deeth, V. Lai, G. J. Ng, H. Situ, R. H. Zhang, M. S. Elliott, T. H. Haley, D. J. Dvorak, A. Aspuru-Guzik, J. E. Hein, and C. P. Berlinguette, “Self-driving laboratory for accelerated discovery of thin-film materials,” Sci. Adv. 6(20), eaaz8867 (2020).
16. S. Jesse, B. M. Hudak, E. Zarkadoula, J. Song, A. Maksov, M. Fuentes-Cabrera, P. Ganesh, I. Kravchenko, P. C. Snijders, A. R. Lupini, A. Y. Borisevich, and S. V. Kalinin, “Direct atomic fabrication and dopant positioning in Si using electron beams with active real-time image-based feedback,” Nanotechnology 29(25), 255303 (2018).
17. S. V. Kalinin, A. Borisevich, and S. Jesse, “Fire up the atom forge,” Nature 539(7630), 485–487 (2016).
18. G. Carleo, I. Cirac, K. Cranmer, L. Daudet, M. Schuld, N. Tishby, L. Vogt-Maranto, and L. Zdeborova, “Machine learning and the physical sciences,” Rev. Mod. Phys. 91(4), 045002 (2019).
19. J. Carrasquilla and R. G. Melko, “Machine learning phases of matter,” Nat. Phys. 13(5), 431–434 (2017).
20. Y. Zhang and E. A. Kim, “Quantum loop topography for machine learning,” Phys. Rev. Lett. 118(21), 216401 (2017).
21. V. N. Vapnik, “An overview of statistical learning theory,” IEEE Trans. Neural Networks 10(5), 988–999 (1999).
22. J. Ren, P. J. Liu, E. Fertig, J. Snoek, R. Poplin, M. Depristo, J. Dillon, and B. Lakshminarayanan, “Likelihood ratios for out-of-distribution detection,” in Advances in Neural Information Processing Systems (Curran Associates, 2019), pp. 14707–14718, https://dl.acm.org/doi/abs/10.5555/3454287.3455604.
23. Y. Ovadia, E. Fertig, J. Ren, Z. Nado, D. Sculley, S. Nowozin, J. Dillon, B. Lakshminarayanan, and J. Snoek, “Can you trust your model’s uncertainty? Evaluating predictive uncertainty under dataset shift,” in Advances in Neural Information Processing Systems (Curran Associates, 2019), pp. 13991–14002, https://papers.nips.cc/paper_files/paper/2019/hash/8558cb408c1d76621371888657d2eb1d-Abstract.html.
36. T. Susi, J. Kotakoski, D. Kepaptsoglou, C. Mangler, T. C. Lovejoy, O. L. Krivanek, R. Zan, U. Bangert, P. Ayala, J. C. Meyer, and Q. Ramasse, “Silicon–carbon bond inversions driven by 60-keV electrons in graphene,” Phys. Rev. Lett. 113(11), 115501 (2014).
37. A. W. Robertson, G.-D. Lee, K. He, Y. Fan, C. S. Allen, S. Lee, H. Kim, E. Yoon, H. Zheng, A. I. Kirkland, and J. H. Warner, “Partial dislocations in graphene and their atomic level migration dynamics,” Nano Lett. 15(9), 5950–5955 (2015).
38. A. W. Robertson, G.-D. Lee, K. He, C. Gong, Q. Chen, E. Yoon, A. I. Kirkland, and J. H. Warner, “Atomic structure of graphene subnanometer pores,” ACS Nano 9(12), 11599–11607 (2015).
39. A. W. Robertson, G.-D. Lee, K. He, E. Yoon, A. I. Kirkland, and J. H. Warner, “Stability and dynamics of the tetravacancy in graphene,” Nano Lett. 14(3), 1634–1642 (2014).
40. A. W. Robertson, G.-D. Lee, K. He, E. Yoon, A. I. Kirkland, and J. H. Warner, “The role of the bridging atom in stabilizing odd numbered graphene vacancies,” Nano Lett. 14(7), 3972–3980 (2014).
41. A. W. Robertson, K. He, A. I. Kirkland, and J. H. Warner, “Inflating graphene with atomic scale blisters,” Nano Lett. 14(2), 908–914 (2014).
42. A. W. Robertson, C. S. Allen, Y. A. Wu, K. He, J. Olivier, J. Neethling, A. I. Kirkland, and J. H. Warner, “Spatial control of defect creation in graphene at the nanoscale,” Nat. Commun. 3, 1144 (2012).
43. J. H. Warner, E. R. Margine, M. Mukai, A. W. Robertson, F. Giustino, and A. I. Kirkland, “Dislocation-driven deformations in graphene,” Science 337(6091), 209–212 (2012).
44. D. P. Kingma and M. Welling, “Auto-encoding variational Bayes,” arXiv:1312.6114 (2013).
45. D. P. Kingma and M. Welling, “An introduction to variational autoencoders,” Found. Trends Mach. Learn. 12(4), 307–392 (2019).
46. D. M. Blei, A. Kucukelbir, and J. D. McAuliffe, “Variational inference: A review for statisticians,” J. Am. Stat. Assoc. 112(518), 859–877 (2017).
56. I. Vlassiouk, P. Fulvio, H. Meyer, N. Lavrik, S. Dai, P. Datskos, and S. Smirnov, “Large scale atmospheric pressure chemical vapor deposition of graphene,” Carbon 54, 58–67 (2013).
57. O. Dyck, S. Kim, S. V. Kalinin, and S. Jesse, “Mitigating e-beam-induced hydrocarbon deposition on graphene for atomic-scale scanning transmission electron microscopy studies,” J. Vac. Sci. Technol. B 36(1), 011801 (2017).
58. O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer, 2015), pp. 234–241.
59. M. Ziatdinov, C. Nelson, R. K. Vasudevan, D. Y. Chen, and S. V. Kalinin, “Building ferroelectric from the bottom up: The machine learning analysis of the atomic-scale ferroelectric distortions,” Appl. Phys. Lett. 115(5), 052902 (2019).
60. D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv:1412.6980 (2015).
61. M. Ziatdinov, A. Ghosh, C. Y. Wong, and S. V. Kalinin, “AtomAI framework for deep learning analysis of image and spectroscopy data in electron and scanning probe microscopy,” Nat. Mach. Intell. 4, 1101–1112 (2022).
62. A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, and L. Antiga, “PyTorch: An imperative style, high-performance deep learning library,” in Advances in Neural Information Processing Systems (Curran Associates, 2019), pp. 8026–8037, https://papers.nips.cc/paper_files/paper/2019/hash/bdbca288fee7f92f2bfa9f7012727740-Abstract.html.