
Research Article Vol. 28, No. 13 / 22 June 2020 / Optics Express 18899

Machine learning and evolutionary algorithm studies of graphene metamaterials for optimized plasmon-induced transparency
Tian Zhang,1 Qi Liu,1 Yihang Dan,1 Shuai Yu,1 Xu Han,2 Jian Dai,1 and Kun Xu1,*
1 State Key Laboratory of Information Photonics and Optical Communications, Beijing University of Posts
and Telecommunications, Beijing 100876, China
2 Huawei Technologies Co., Ltd, Shenzhen 518129, Guangdong, China
* [email protected]

Abstract: Machine learning and optimization algorithms have been widely applied in the
design and optimization for photonics devices. We briefly review recent progress of this field
of research and show data-driven applications, including spectrum prediction, inverse design
and performance optimization, for novel graphene metamaterials (GMs). The structure of the
GMs is well-designed to achieve the wideband plasmon induced transparency (PIT) effect,
which can be theoretically demonstrated by using the transfer matrix method. Some traditional
machine learning algorithms, including k nearest neighbour, decision tree, random forest and
artificial neural networks, are utilized to equivalently substitute the numerical simulation in the
forward spectrum prediction and complete the inverse design for the GMs. The calculated results
demonstrate that all algorithms are effective and the random forest has advantages in terms of
accuracy and training speed. Moreover, evolutionary algorithms, including single-objective
(genetic algorithm) and multi-objective optimization (NSGA-II), are used to achieve the steep
transmission characteristics of PIT effect by synthetically taking many different performance
metrics into consideration. The maximum difference between the transmission peaks and dips
in the optimized transmission spectrum reaches 0.97. In comparison to previous works, we
provide guidance for the intelligent design of photonics devices based on machine learning and
evolutionary algorithms, and a reference for the selection of machine learning algorithms for
simple inverse design problems.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction
Traditionally, the design and optimization of photonics devices have relied on repeated trials or
physics-inspired methods [1–2]. However, with the increase of performance metrics and
integration level, the design and optimization processes for photonics devices have become
computationally expensive and complex [3]. For example, owing to its excellent electronic and
optical properties [4–9], graphene, a typical 2D material [10], has been applied in many photonic
devices, such as optical modulators [11], photoelectric detectors [12], sensors [13], absorbers [14],
switchers [15], polarization controllers [16], diodes [17] and so on. For a graphene nanostructure,
we usually consider the influence of critical physical parameters of graphene (e.g. the chemical
potential and the number of layers) on the electromagnetic responses. Nevertheless, the lack of the
empirical relationships between physical parameters and corresponding electromagnetic responses
often leads to the time-consuming brute force search, which calculates the electromagnetic
responses for all physical parameters by using the numerical simulations, such as finite-difference
time-domain (FDTD) and finite element method (FEM) [18–19]. In fact, we can construct a
theoretical model to describe the physical mechanism behind the physical phenomenon [20]. The
electromagnetic responses for different physical parameters can be quickly calculated based on the

#389231 https://doi.org/10.1364/OE.389231
Journal © 2020 Received 28 Jan 2020; revised 17 Mar 2020; accepted 12 Apr 2020; published 10 Jun 2020
theoretical model. However, the constructions of such theoretical models for complex graphene
nanostructures are generally difficult because the physical mechanisms are hard to understand.
In order to solve the above problems, some data-driven approaches based on machine learning
(ML) have been proposed to equivalently substitute numerical simulation or even the theoretical
models. Especially in recent years, with the development of high performance computing,
artificial neural networks (ANNs), and deep learning in particular, have attracted a great deal of
research attention for an impressively large number of applications, such as image processing
[21], natural language processing [22], acoustical signal processing [23], time series processing
[24], self-driving [25], games [26], robots [27] and so on. Many researchers attempt to use ANNs
to construct a model describing the relationship between the physical parameters of photonics
devices and the electromagnetic responses [28–50]. Once the data-driven model is constructed,
the electromagnetic responses can be calculated in a very short time based on the model inference
when the physical parameters are input into the model [28]. Typically, the inference time of the
ANNs-based model is significantly smaller than the calculation time of numerical simulation.
Thus, the equivalent approximation of the numerical simulation based on a data-driven model
can accelerate the device-level variability analysis and performance evaluation for photonics
devices. Moreover, it has been proven that these data-driven methods are conducive to the
inverse design of photonics devices. The purpose of the inverse design is to search for the suitable
physical parameters, which can generate the targeted electromagnetic response [1]. If the
potential relationship between the electromagnetic responses and physical parameters can be
constructed by using ML techniques, the inverse design problems are also solved by using the
data-driven methods. Contrary to the model used in the simulation approximation (it predicts
the electromagnetic responses from the physical parameters), the model used in inverse design
predicts the physical parameters according to electromagnetic response. We briefly review recent
progress of this field of research.
For example, J. Peurifoy et al. found that the ANNs could be used to simulate the light
scattering and inversely determine the physical parameters of multilayer nanospheres [28]. The
electromagnetic responses for all physical parameters of nanospheres were predicted by the
ANNs, which were trained by using a small sampling of simulation results. And they pointed out
that the ANNs had ultra-fast prediction speed in comparison to numerical simulation. It should
be noted that the principles behind ML techniques that were used in the simulation approximation
and the inverse design of photonics devices were the data regression between physical parameters
and electromagnetic responses. Researchers began to explore the applications of data regression
in the simulation approximation and inverse design for photonics devices from two perspectives:
photonics devices and algorithms. For the aspect of photonics devices, the shallow ANNs were
also used in the inverse design and optimization for plasmonic waveguide systems [20], metal
gratings [29], VO2 -based nano-structures [30], strip waveguides [31], chirped Bragg gratings [31],
sub-wavelength grating couplers [32], plasmonic nanoparticles [33] and so on. In the above works,
the successful applications of shallow ANNs demonstrated that the simple network architectures
were enough to fit a small quantity of physical parameters. For the aspect of algorithms, the ANNs
were meticulously designed to fit various photonics devices. Many ANNs with different network
architectures, such as deep neural networks [34–41], adaptive neural networks [42], bidirectional
neural networks [43–44], tandem networks [45] had been proposed to design and optimize for
the complex photonics devices and optical properties, such as power splitters [34], metasurfaces
[35–36], plasmonic colours [37], photonic crystal nanocavities [38–39], optical chiralities [40],
plasmonic sensors [41], metamaterials [42–43], silicon colors [44], multilayer films [45] and
so on. There is no doubt that for the complex photonics devices, such as a resonator based on
encoding metamaterials with random distributions [34], deep neural networks were an effective
modelling method to construct the complex relationships between electromagnetic responses
and physical parameters. In order to reduce the size of training set and improve the accuracy
for the ANNs, Y. Qu et al. proposed to use transfer learning technique to migrate knowledge
between different physical scenarios [46]. More interestingly, as an unsupervised learning
method, generative adversarial networks (GAN) had been proven effective in the generation of
high performance photonics devices with a broad design space [47–50]. J. Jiang et al. found
that a topologically complex device could be produced by using the GAN with a wide parameter
space [50]. In addition to supervised learning and unsupervised learning, reinforcement learning
(RL) was also used to build an autonomous system to solve the decision-making problems in
the optimization of photonics devices [51–55]. I. Sajedian et al. used deep RL to search for
optimal material types and structural parameters of high-quality metasurfaces [54]. In order
to speed up the search process and optimize for the neural network architectures, the ANNs
were combined with evolutionary algorithms to design the photonics couplers [56–57]. Besides,
other ML algorithms, including dimension reduction and Bayesian optimization, were also used
to design grating couplers and wavelength-selective thermal radiators [58–59]. It should be
noted that although the ANNs provided an effective approximation approach to replace the
numerical simulation, they required a great deal of time to collect training sets. In comparison
to the traditional ML algorithms, such as support vector machines (SVM) and random forest
(RF), shallow ANNs had disadvantages in training time. It had been proven that traditional ML
algorithms were more effective in some uncomplicated applications with a small quantity of
physical parameters [60–61]. However, there was a lack of comprehensive analytical report for
the applications of traditional ML algorithms in the simulation approximation and inverse design
for photonics devices.
In addition to the data-driven methods mentioned above, the inverse design of photonics
devices could be solved by using optimization algorithms, which were divided into two classes:
gradient based methods and gradient free methods [20]. As a representative method of gradient
based methods, adjoint variable method (AVM) could not only design for the linear optical
devices but also optimize for the nonlinear devices in the frequency domain [62–63]. In 2018,
T.W. Hughes et al. proposed a novel training method based on AVM to compute the gradients
of optical neural networks (ONNs) [64]. And the objective-first optimization method and
steepest descent method were used to optimize for wavelength demultiplexers and computational
metastructures [65–66]. However, gradient based methods required physical background to
derive the gradient of the objective function, which increased the difficulty in practical
applications. On the other hand, as the representative algorithms of gradient free methods, search
algorithms (direct-binary search [67]) and evolutionary algorithms were also used in the inverse
design and optimization for photonics devices [67–71]. Recently, we used two evolutionary
algorithms, genetic algorithms (GA) and particle swarm optimization (PSO) to determine the
hyper-parameters and weights of the ONNs [70]. A hierarchical evolutionary algorithm was
established to solve the large-pixelated and complex inverse meta-optics design [69]. Although
evolutionary algorithms had advantages in simplicity and effectiveness, they easily fell into
local optimum and demanded significant computing time [72]. Quantum genetic algorithm
(QGA), which took advantage of the power of quantum computation, had been demonstrated to
speed up the genetic procedures. Both gradient based methods and gradient free methods
usually optimized for a single performance metric of photonics devices. Multi-objective
optimization, which optimized for multiple performance metrics synthetically, had gradually
come into the researchers’ consideration [73–74].
In this article, we provide guidance for the intelligent design of photonics devices based
on ML and evolutionary algorithms. Recently, various nanostructures based on graphene and
surface plasmon polaritons (SPPs) [75–77], including graphene metamaterials (GMs) [78–81],
graphene nanoribbons (GNRs) [82–84] and graphene waveguides [85–87], have been proposed
to construct plasmonic filters [85], perfect absorbers [82], sensors [83], logic gates [87] and
so on. In these structures, the GMs that consist of periodically spaced GNRs have attracted
widespread attention because of their relatively simple fabrication techniques [80–82]. In order
to clearly show the design effects, we propose novel GMs consisting of parallel GNRs and
try to achieve spectrum prediction, inverse design and performance optimization for them. In
addition, the physical parameters of the GMs are well-designed to achieve the plasmon induced
transparency (PIT) effect in transmission spectrum. The PIT effect is selected as the
optimization object because it has various applications, such as optical
switching, slow light, modulators, filters, sensors, absorbers and so on [18,19,88]. Several traditional
ML algorithms are used to achieve simulation approximation and inverse design for the GMs.
On the other hand, although single-objective optimization has been used to optimize
polarization beam splitters [67], rotators [68], power splitters [69], wavelength multiplexers [89] and
mode multiplexers [90], researchers have paid little attention to the design of graphene nanostructures.
Therefore, single-objective and multi-objective optimizations are used to optimize the GMs by
taking different performance metrics into consideration.

2. Device design and simulation result


As shown in Fig. 1, our proposed novel GMs consist of two layers of GNRs with alternating chemical
potentials. Here, the double layer GNRs are periodically arranged along the x axis and infinite along the y axis.
The thin conductive layers covering the bottom and top of the dielectric layer act as electrodes
that alternately apply voltages V1 (V3) and V2 (V4) to the GNRs 1 (2), so that the graphene
ribbons of the two GNRs have alternating chemical potentials (µc1 and µc2 for GNRs 1, µc3 and µc4
for GNRs 2). The period of the GNRs 1 (GNRs 2) is set as Λ1 = 400 nm (Λ2 = 200 nm), and the
width of the graphene ribbon in the GNRs 1 (GNRs 2) is w1 = 350 nm (w2 = 175 nm), leading
to a filling ratio of r1 = 0.875 (r2 = 0.875). The refractive index of the dielectric layer is set to
nInter = 1.45 for simplicity without loss of generality [18]. We use the Kubo formula to model
the conductivity of the GNRs in the FDTD simulation [77]:

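$$
\sigma_g = i\frac{e^2 k_B T}{\pi\hbar^2(\omega + i\tau^{-1})}\left[\frac{\mu_c}{k_B T} + 2\ln\left(\exp\left(-\frac{\mu_c}{k_B T}\right) + 1\right)\right] + i\frac{e^2}{4\pi\hbar}\ln\left[\frac{2|\mu_c| - \hbar(\omega + i\tau^{-1})}{2|\mu_c| + \hbar(\omega + i\tau^{-1})}\right] \tag{1}
$$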

where kB, T (=300 K), ħ, τ (=0.5 ps), µc, e, and ω represent the Boltzmann constant, temperature,
reduced Planck constant, relaxation time, chemical potential, electron charge and angular
frequency, respectively. For a few layers (<6) of graphene, the conductivity can be
expressed as σfg = Nσg, where N is the number of layers [16]. In the mid-infrared, where the
intraband electron-photon scattering process dominates and µc >> kB T, the conductivity
simplifies to
$$
\sigma_{fg} = i\frac{N e^2 \mu_c}{\pi\hbar^2(\omega + i\tau^{-1})} \tag{2}
$$
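The relation between the full Kubo expression and its simplified form can be checked numerically. The following is a minimal sketch, assuming SI units, T = 300 K and τ = 0.5 ps as stated in the text; the function names and the test wavelength of 10 µm are illustrative choices, not part of the original work.

```python
import cmath
import math

E = 1.602176634e-19      # elementary charge (C)
HBAR = 1.054571817e-34   # reduced Planck constant (J*s)
KB = 1.380649e-23        # Boltzmann constant (J/K)
C = 299792458.0          # speed of light in vacuum (m/s)


def kubo_conductivity(omega, mu_c_ev, temp=300.0, tau=0.5e-12):
    """Monolayer sheet conductivity from Eq. (1), in siemens."""
    mu_c = mu_c_ev * E                 # chemical potential in joules
    kt = KB * temp
    w = omega + 1j / tau               # omega + i * tau^-1
    # First term of Eq. (1)
    intra = (1j * E**2 * kt / (math.pi * HBAR**2 * w)
             * (mu_c / kt + 2.0 * math.log(math.exp(-mu_c / kt) + 1.0)))
    # Second (logarithmic) term of Eq. (1)
    ratio = (2.0 * abs(mu_c) - HBAR * w) / (2.0 * abs(mu_c) + HBAR * w)
    inter = 1j * E**2 / (4.0 * math.pi * HBAR) * cmath.log(ratio)
    return intra + inter


def drude_conductivity(omega, mu_c_ev, n_layers=1, tau=0.5e-12):
    """Simplified few-layer sheet conductivity from Eq. (2), in siemens."""
    mu_c = mu_c_ev * E
    return 1j * n_layers * E**2 * mu_c / (math.pi * HBAR**2 * (omega + 1j / tau))


# Illustrative check at a wavelength of 10 um with mu_c = 0.5 eV (assumed values).
omega = 2.0 * math.pi * C / 10e-6
full = kubo_conductivity(omega, 0.5)
simple = drude_conductivity(omega, 0.5)
print("relative difference:", abs(full - simple) / abs(simple))
```

For these assumed values the two expressions differ by only a few percent, which is the sense in which the simplified form is used for the mid-infrared range.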
In order to analyze the excitation condition of the SPPs in the GMs, the dispersion equation is
retrieved based on the Maxwell equation and continuous boundary condition [77]

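$$
\frac{\varepsilon_1}{\sqrt{\beta_{SPP}^2 - \varepsilon_1\omega^2/c^2}} + \frac{\varepsilon_2}{\sqrt{\beta_{SPP}^2 - \varepsilon_2\omega^2/c^2}} = -\frac{i\sigma_{fg}}{\omega\varepsilon_0} \tag{3}
$$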

where βSPP is the propagation constant of SPPs, c represents the light speed in vacuum, ε0 is the
permittivity of free space, ε1 and ε2 are the effective permittivities of the medium on each
side of GNRs (ε1 and ε2 are equal to εInter = nInter² = 2.1 because the GNRs are surrounded by the
Fig. 1. The schematic view of the proposed GMs, which consist of double layer GNRs
embedded into insulated dielectric layer with a separation dg=300 nm.

same medium in our proposed GMs). Here, since the solution satisfies βSPP >> ω/c, the effective
refractive index of SPPs deduced from Eq. (3) is given by [76]

$$
n_{SPP} = \frac{\beta_{SPP}}{k_0} = \frac{2\varepsilon_0\varepsilon_{SiO_2}\pi\hbar^2 c}{N e^2 \mu_c}(\omega + i\tau^{-1}) \tag{4}
$$
where k0 = ω/c is the wave vector in vacuum. As shown in Fig. 2(a)-(b), the dispersion curves
(solid lines) of few graphene layers surrounded by dielectric layer match well with the dispersion
data (circle marks) calculated by the mode solution. Notably, the monolayer and multilayer
graphene in numerical simulation are treated as a surface with electric conductivity σfg since the
graphene ribbons are ultrathin. It can be found that both the real part and imaginary part of the
effective refractive indices for the SPPs on graphene decrease with the increasing of wavelength.
As the wavelength of incident light increases, the field confinement (propagation loss) of the
SPPs on graphene becomes weaker (smaller). In addition, the field confinement of the SPPs on
graphene becomes weaker with the increasing of the number of graphene layers. In order to
increase the interaction of the upper GNRs and the lower GNRs, we set the number of the layers
of GNRs as N=4 in the following article.
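The dependence of the effective index on wavelength and on the number of layers can be read directly from Eq. (4). The sketch below evaluates that expression, assuming SI units, a surrounding permittivity of 1.45² and τ = 0.5 ps; the function name `n_spp` and the sample wavelengths are illustrative.

```python
import math

E = 1.602176634e-19      # elementary charge (C)
HBAR = 1.054571817e-34   # reduced Planck constant (J*s)
EPS0 = 8.8541878128e-12  # vacuum permittivity (F/m)
C = 299792458.0          # speed of light in vacuum (m/s)


def n_spp(wavelength, mu_c_ev, n_layers, eps_r=1.45**2, tau=0.5e-12):
    """Effective refractive index of graphene SPPs from Eq. (4)."""
    omega = 2.0 * math.pi * C / wavelength
    mu_c = mu_c_ev * E   # chemical potential in joules
    prefactor = 2.0 * EPS0 * eps_r * math.pi * HBAR**2 * C / (n_layers * E**2 * mu_c)
    return prefactor * (omega + 1j / tau)


# Real part of n_SPP for a few wavelengths and layer counts (mu_c = 0.5 eV).
for n_layers in (1, 2, 4):
    row = [round(n_spp(lam * 1e-6, 0.5, n_layers).real, 2) for lam in (6, 8, 10)]
    print("N =", n_layers, "Re(n_SPP) at 6, 8, 10 um:", row)
```

In this simplified expression the real part scales as 1/(Nλ), so it falls both with increasing wavelength and with increasing layer number, in line with the weaker field confinement described above.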
First of all, as shown in Fig. 2(c), we analyze the transmission spectrum of a single layer
grating composed of GNRs. For the grating with small filling ratio (w/Λ), the SPPs on a graphene
ribbon can hardly interact with that on the adjacent graphene ribbon. Thus, the propagation of
SPPs on the grating can be equivalently substituted by that in a single graphene ribbon. It has
been demonstrated that the SPPs on a single graphene ribbon are nearly totally reflected at the
boundary together with a phase jump of ϕ=0.27π [91]. Thus, the SPPs excited on a graphene
ribbon are caused by the Fabry-Perot (FP) like resonances, which satisfies

Re(nSPP )k0 w + ϕ = mπ, m = 1, 2, 3, . . . (5)

Substituting Eq. (4) into Eq. (5), the resonance frequency of the SPPs on GNRs is obtained
as follows:
$$
\omega_r = \sqrt{\frac{(m - \phi) N e^2 \mu_c}{2\varepsilon_0\varepsilon_{Inter}\hbar^2 w}}, \quad m = 1, 2, 3, \ldots \tag{6}
$$
As shown in Fig. 2(d), the resonance curves (three dashed lines) for three modes of the single layer
graphene grating agree with the numerical simulation results (the absorption contour patterns).
Obviously, the comparison results verify the effectiveness of theoretical model and numerical
simulation. Here, we only calculate the resonance curves for the odd modes in Fig. 2(d) since
Fig. 2. The real part (a) and imaginary part (b) of effective refractive indices for SPPs. The
solid lines are the dispersions of SPPs calculated by theoretical model, and the marks are
those calculated by the mode solution. (c) Schematic of the single layer grating consisting of
GNRs. (d) When a mid-infrared wave is normally incident on the single layer grating, blue, red
and orange lines are the resonance curves for three modes m=1, 3 and 5, respectively. For
comparison, the absorption contour patterns of the single layer grating are calculated by the
FDTD method. The value of w/Λ is set as 1/4. In (a), (b) and (d), the chemical potentials of
graphene µc are set as 0.5 eV.

the even modes cannot be excited with normal incident wave [91]. At the frequencies of these FP
resonance modes, light will excite SPPs modes on GNRs, causing absorption enhancements and
dips in the transmission spectrum.
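The resonance condition of Eq. (6) is straightforward to evaluate. The sketch below is only illustrative: it assumes SI units, takes the phase term as 0.27 (the phase jump expressed in units of π), and uses a hypothetical ribbon width of 100 nm and N = 4, neither of which is stated for Fig. 2(d).

```python
import math

E = 1.602176634e-19      # elementary charge (C)
HBAR = 1.054571817e-34   # reduced Planck constant (J*s)
EPS0 = 8.8541878128e-12  # vacuum permittivity (F/m)
C = 299792458.0          # speed of light in vacuum (m/s)


def resonance_omega(m, mu_c_ev, width, n_layers=4, eps_r=1.45**2, phase=0.27):
    """Angular resonance frequency of mode m from Eq. (6).

    `phase` is the phase jump in units of pi (assumption for this sketch).
    """
    mu_c = mu_c_ev * E
    return math.sqrt((m - phase) * n_layers * E**2 * mu_c
                     / (2.0 * EPS0 * eps_r * HBAR**2 * width))


def resonance_wavelength(m, mu_c_ev, width, **kwargs):
    """Free-space wavelength (m) corresponding to resonance_omega."""
    return 2.0 * math.pi * C / resonance_omega(m, mu_c_ev, width, **kwargs)


# Odd modes only, since even modes are not excited at normal incidence.
for m in (1, 3, 5):
    lam = resonance_wavelength(m, 0.5, 100e-9)   # hypothetical 100 nm ribbon
    print("m =", m, "resonance wavelength (um):", lam * 1e6)
```

Because the resonance frequency scales as the square root of (m − ϕ)µc/w, higher-order modes and narrower ribbons shift the resonances to shorter wavelengths.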
The optical responses of single upper GNRs and single lower GNRs are simulated by using
the FDTD method, respectively. In the FDTD simulations, boundary conditions of x direction
are set as periodic boundary conditions and other boundaries are set as perfectly matched layers.
The Fermi levels of GNRs are set as µc1 =0.7 eV, µc2 =0.5 eV, µc3 = 0.15 eV and µc4 = 0.75 eV
in our simulation. As shown in Fig. 3, when the TM polarized light normally illuminates
the GMs that only include the upper GNRs (blue dashed line) or the lower GNRs (green
dashed line), two obvious dips emerge in the transmission spectrum. Here, the appearances
of the dips are related to the excitation of the SPPs modes on the GNRs. Next, we proceed
to consider the optical characteristics of the complete GMs that include the upper GNRs and
the lower GNRs. From Fig. 3, two transmission peaks, each located between two dips,
emerge in the transmission spectrum, indicating the appearance of the PIT-like effect.
Generally speaking, the PIT effect can be explained by using two alternative ways: bright-dark
mode coupling mechanism and the doublet of dressed states mechanism [15]. For our proposed
GMs, the traditional bright-dark mode coupling mechanism is not suitable for explaining the
PIT effect because it is difficult for us to distinguish the bright mode from the dark mode. Similar to
the PIT effect in [92], this phenomenon is attributed to the constructive interference between
the waves reflected by two mirrors in a metallic plasmonic FP system. We can attribute the
appearance of the PIT effect to the destructive interference between the SPPs modes excited on
the GNRs, and it is equivalent to the case of destructive interference between two closely spaced
broadened resonances decaying to the same continuum. The PIT effect can be applied in the
optical switching and slow light because it has large extinction ratio and wide bandwidth [92].
The optical characteristics of the dips in the PIT effects are similar to those of the single layer
GNRs mentioned above, which suggests that the appearance of the dips is attributed to the excitation
of the SPPs mode on GNRs. The formation of the PIT effects can be explained by the
normalized magnetic field distributions at the transmission peaks (B and E) and dips (A, C, D and F).
As shown in Fig. 3, the dips A, C, D and F are related to
the excitation of SPPs on the graphene ribbons 1, 2, 3 and 4, respectively, while the coupling
between the SPPs modes on graphene ribbons 1 (2) and 3 (4) gives rise to transmission peak B
(E). In order to model the dynamic transmission of the GMs, the transfer matrix method is used
to explain the physical phenomenon. The transfer matrix can be defined as [92]
H = M2 S12 M1 (7)
where M1 (M2 ) and S12 represent the matrices of the upper (lower) GNRs and dielectric layer,
respectively. They are governed by
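$$
S_{12} = \begin{pmatrix} e^{i\phi'} & 0 \\ 0 & e^{-i\phi'} \end{pmatrix}, \quad
M_q = \frac{1}{t_{21}}\begin{pmatrix} t_{12}t_{21} - r_{12}r_{21} & r_{21} \\ -r_{12} & 1 \end{pmatrix}, \quad q = 1, 2 \tag{8}
$$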

Fig. 3. Transmission spectrums of the proposed GMs based on the FDTD simulation (red
solid line) and theoretical model (purple dashed line). The blue dashed line and green dashed
line are the transmission spectrums of the GMs that only includes the upper GNRs and the
lower GNRs, respectively. The normalized magnetic field distributions of the transmission
dips (A (λ=5.30 µm), C (λ=7.04 µm), D (λ=10.40 µm) and F (λ=13.16 µm)) and peaks (B
(λ=6.32 µm) and E (λ=12.35 µm)).

When light is normally incident on the GMs, the Fresnel coefficients in the matrix Mq are
expressed as t12 = t21 = 2nInter/(2nInter + Z0 σq′) and r12 = r21 = −Z0 σq′/(2nInter + Z0 σq′), where Z0 = 376.7
Ω represents the vacuum impedance and ϕ′ = dg nInter ω/c is the phase difference between the upper
GNRs and the lower GNRs. Under the quasistatic approximation, the average sheet
conductivity σq′ is given by
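$$
\begin{cases}
\sigma_1' = 2i\left[\dfrac{r_1 e^2 \mu_{c1} N\omega}{\pi\hbar^2(\omega^2 - \omega_{r1}^2) + i\Gamma_{r1}\omega} + \dfrac{r_1 e^2 \mu_{c2} N\omega}{\pi\hbar^2(\omega^2 - \omega_{r2}^2) + i\Gamma_{r2}\omega}\right] \\[2ex]
\sigma_2' = 2i\left[\dfrac{r_2 e^2 \mu_{c3} N\omega}{\pi\hbar^2(\omega^2 - \omega_{r3}^2) + i\Gamma_{r3}\omega} + \dfrac{r_2 e^2 \mu_{c4} N\omega}{\pi\hbar^2(\omega^2 - \omega_{r4}^2) + i\Gamma_{r4}\omega}\right]
\end{cases} \tag{9}
$$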

where ωrj is the resonance frequency, which is calculated by using Eq. (6) for different µcj
(j=1, 2, 3, 4). And the resonance width Γrj of the GNRs is usually 10% larger than the Drude
scattering width Γj = e vF²/(µ µcj) in the unpatterned graphene, where vF ≈ c/300 is the Fermi velocity
and µ = 10000 cm²/(V·s) is the DC mobility [92]. The phase factor Φj = m − ϕj (m = 1, 2, 3, 4, ...) is a fitting
parameter deduced from the FDTD simulation. From Eqs. (7)-(9), the transmittance of the GMs
can be expressed as

$$
T = \left|\frac{4n_{Inter}^2}{(2n_{Inter} + Z_0\sigma_1')(2n_{Inter} + Z_0\sigma_2')e^{-i\phi'} - Z_0^2\sigma_1'\sigma_2' e^{i\phi'}}\right|^2 \tag{10}
$$

According to Eq. (10), the theoretical transmission spectrum of the GMs is shown by the purple
dashed line in Fig. 3. We find that the theoretical transmission spectrum (purple dashed line)
agrees with the simulated transmission spectrum (solid red line) when the fitting parameters Φj
are fitted as Φ1 =3.77, Φ2 =0.85, Φ3 =5 and Φ4 =0.45.
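The step from the matrices of Eqs. (7)-(8) to the closed form of Eq. (10) can be verified numerically. The sketch below multiplies the three matrices and compares the result with the closed-form expression; it assumes that the transmission amplitude is the reciprocal of the lower-right element of H, and the sheet conductivities used are arbitrary complex test values rather than outputs of Eq. (9).

```python
import cmath


def layer_matrix(sigma, n_inter=1.45, z0=376.73):
    """Matrix M_q of Eq. (8) for a sheet with average conductivity sigma."""
    t = 2.0 * n_inter / (2.0 * n_inter + z0 * sigma)   # t12 = t21
    r = -z0 * sigma / (2.0 * n_inter + z0 * sigma)     # r12 = r21
    return [[(t * t - r * r) / t, r / t],
            [-r / t, 1.0 / t]]


def spacer_matrix(phase):
    """Matrix S12 of Eq. (8) for the dielectric spacer."""
    return [[cmath.exp(1j * phase), 0.0],
            [0.0, cmath.exp(-1j * phase)]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transmittance_from_matrices(sigma1, sigma2, phase, n_inter=1.45, z0=376.73):
    """T = |1 / H22|^2 with H = M2 S12 M1 (Eq. (7)); 1/H22 is an assumption."""
    h = matmul(layer_matrix(sigma2, n_inter, z0),
               matmul(spacer_matrix(phase), layer_matrix(sigma1, n_inter, z0)))
    return abs(1.0 / h[1][1]) ** 2


def transmittance_closed_form(sigma1, sigma2, phase, n_inter=1.45, z0=376.73):
    """Closed-form transmittance of Eq. (10)."""
    denom = ((2.0 * n_inter + z0 * sigma1) * (2.0 * n_inter + z0 * sigma2)
             * cmath.exp(-1j * phase)
             - z0**2 * sigma1 * sigma2 * cmath.exp(1j * phase))
    return abs(4.0 * n_inter**2 / denom) ** 2


# Arbitrary test conductivities (siemens) and spacer phase (radians).
s1, s2, phi = 2e-4 + 5e-4j, 3e-4 - 1e-4j, 0.8
print(transmittance_from_matrices(s1, s2, phi), transmittance_closed_form(s1, s2, phi))
```

The two functions return the same value for any pair of conductivities, which confirms that Eq. (10) follows from Eqs. (7) and (8) under the stated assumption.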

3. Spectrum prediction and inverse design


For the GMs, the slight changes of the structure parameters have significant influence on the
transmission spectrum. If we want to discover the potential relationship between the structure
parameters of the GMs and the transmission spectrum, it requires a high computational cost
to traverse all structure parameters. Actually, we can use Monte Carlo simulation or interval
sampling method to reduce simulation time, but it leads to the loss of accuracy due to the
interpolation and fitting. Another way to improve the efficiency is to train a model based on ML
algorithms by a small part of simulation results [28–50]. The prediction process for transmission
spectrum according to structure parameters is known as ‘forward spectrum prediction’ [20]. By
contrast, inverse design trains a reliable regression model to search for the most suitable structure
parameter for a transmission spectrum. Notably, the principle behind the forward spectrum
prediction and inverse design based on ML models is data regression between the structure
parameters and electromagnetic response. It means that the labels of training data are continuous
variables rather than discrete variables. Several ML algorithms can be used in data
regression. Compared with the traditional ML algorithms, such as SVM and RF, the training
cost of the ANNs is much higher. It has been demonstrated that SVM performs better than the
ANNs in the trend prediction of soil organic carbon and river flow [60–61]. In addition, the
selection of the hyper-parameters for the ANNs (layers, solver, activation function, learning rate,
batch size and so on) is more complex than that of the SVM and RF. Moreover, the training
time and inference time of the ANNs significantly exceed those of the traditional ML algorithms
because the training process of the ANNs includes forward-propagation, back-propagation and
stochastic gradient descent [60–61]. To overcome the defects of the ANNs, we use traditional
regression algorithms to achieve the forward spectrum prediction and inverse design for the GMs.
Similar to k nearest neighbour (kNN) classification, kNN regression calculates the distances
between the targeted instance and each training instance and then it selects the most similar k
data as candidate set to determine the results [93]. And three tree-based regression algorithms,
including decision tree (DT), RF, and extremely randomized trees (ERT), are also used in the
inverse design for the GMs. These tree-based regression algorithms have the same steps, such as
selecting splits and selecting optimal tree [94]. RF is a famous ensemble algorithm based on the
bootstrap aggregating [95]. In comparison to the RF, the split of features for the ERT is more
random, leading to the reduction of the variance for the trained model [96].
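The kNN regression step described above can be sketched in a few lines. This is a toy illustration, not the implementation used in this work: `toy_spectrum` is a hypothetical smooth function standing in for an FDTD result, the inputs are four arbitrary numbers in place of the chemical potentials, and the prediction is the plain average of the k nearest training spectra.

```python
import math
import random


def toy_spectrum(params, n_points=20):
    """Hypothetical smooth stand-in for a simulated transmission spectrum."""
    grid = [j / (n_points - 1) for j in range(n_points)]
    return [0.5 + 0.5 * math.cos(2.0 * math.pi * x + sum(params)) for x in grid]


def knn_regress(train_x, train_y, query, k=5):
    """Predict a spectrum as the mean of the k nearest training spectra."""
    # Rank training instances by Euclidean distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    n_out = len(train_y[0])
    return [sum(train_y[i][j] for i in nearest) / len(nearest) for j in range(n_out)]


rng = random.Random(0)
# Four inputs per instance, mirroring the four structure parameters.
train_x = [[rng.random() for _ in range(4)] for _ in range(500)]
train_y = [toy_spectrum(p) for p in train_x]

query = [0.3, 0.6, 0.1, 0.8]
predicted = knn_regress(train_x, train_y, query)
truth = toy_spectrum(query)
mae = sum(abs(p - t) for p, t in zip(predicted, truth)) / len(truth)
print("mean absolute error:", mae)
```

For inverse design the roles are swapped: the sampled spectrum becomes the query vector and the stored structure parameters become the targets that are averaged.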
First of all, we attempt to use the regression algorithms to replace the FDTD simulation in
the forward spectrum prediction. In Fig. 4(a), it can be found that the regression algorithms
take the structure parameters of the GMs as the input and predict corresponding transmission
spectrum. Here, the potential relationships between the chemical potentials of the GNRs µc1 ,
µc2 , µc3 and µc4 and the transmittances in transmission spectrum are taken into consideration.
In order to train the regression models, we use the repeatable FDTD simulation and Monte
Carlo simulation to generate training sets because the regression algorithms belong to supervised
learning [97]. Each of the 20,000 training instances includes 4 structure parameters (µc1, µc2,
µc3, µc4) and 200 transmittances evenly sampled from the transmission spectrum. All structure
parameters are initialized in different ranges specified by minimum and maximum values: 0.6
eV < µc1 < 0.8 eV, 0.4 eV < µc2 < 0.6 eV, 0.05 eV < µc3 < 0.25 eV and 0.6 eV < µc4 < 0.8 eV. The
chemical potentials of the graphene ribbons are randomly generated from these ranges with a
precision of 0.1 eV. The models are trained by using
20,000 training instances, while another 2,000 instances are left as the test set to validate the
training effect. It should be noted that although the generation of the 22,000 instances takes
23 hours on a high performance server, the prediction of the regression algorithms for
a new set of structure parameters is faster than the 2D FDTD simulation once the model is constructed
[28]. Thus, although inversely designing a single transmission
spectrum based on the evolutionary algorithms is less time-consuming, the regression algorithms can save more time
and energy if several transmission spectrums need to be designed. Once the
model is constructed for a specific photonics device, the advantage of the regression algorithms
is that the model is reusable [20]. Before training the model, we should pay attention to the
influence of hyper-parameters on the training effect, such as the number of trees in the forest
and the maximum depth of the trees for the RF. Here, the determination of hyper-parameters
for the ANNs is relatively complex because a great many hyper-parameters should
be considered [20]. We use the GA to search for the optimal hyper-parameters and network
architecture for the ANNs, and the variations of the loss and accuracy for different generations
are shown in Fig. 4(b). Here, the accuracies of the regression algorithms are represented by the
scores, which measure the similarity between the predicted transmission spectrums and practical
transmission spectrums (the best and worst values for the score are 1 and arbitrary negative,
respectively) [98]. And the scores are regarded as the fitness for the GA used in finding the
optimal hyper-parameters. As shown in Fig. 4(b), the score (loss) increases (decreases) from
86.8 (20) to 95 (0.01), indicating that the optimization of hyper-parameters for the ANNs is
effective. After optimizing the network architectures based on the GA, the suitable network
architecture for the ANNs in the forward spectrum prediction is a fully connected network whose
network topology is 4−200−50−50–200. Besides, we also use the same training set to train
other regression algorithms. Figure 4(c) shows the training time and accuracies for different
regression algorithms. Notably, the scores of all regression algorithms are greater than 91,
indicating that the other regression algorithms are competitive with the ANNs in the forward
spectrum prediction. Although the ANN-based model is effective, the score of the RF (96) exceeds
that of the ANNs (95). To illustrate the effectiveness of the regression models, we compare the
transmission spectra predicted by the regression algorithms with those simulated by the 2D FDTD
method. We randomly select a group of structure parameters from the test set and calculate the
transmission spectra with both the regression algorithms and the FDTD simulation. As shown in
Fig. 4(d), the transmission spectra predicted by the regression algorithms agree with the FDTD
simulation results. In comparison to the transmission spectrum predicted by the ANNs, the
transmission spectrum predicted by the RF is closer to the ground truth (the transmission
spectrum calculated by the FDTD simulation). More importantly, once the hyper-parameters are
determined, the training cost of the ANNs (36 seconds) exceeds those of the other ML algorithms,
especially that of the RF. Considering both training cost and accuracy, the RF is a more suitable
regression algorithm than the ANNs for the forward spectrum prediction of the GMs.
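
The score described above (best value 1, arbitrarily negative for poor predictions) is consistent with the coefficient of determination R^2. The text does not spell out the formula, so treating the score as R^2 is an assumption here, as is reading the reported values (for example 95 or 96) as percentages. Under that assumption, a minimal Python sketch with hypothetical transmittance values is:

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination for two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: how far the prediction is from the target.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: how far the target is from its own mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined for a constant target")
    return 1.0 - ss_res / ss_tot


# Hypothetical transmittances sampled at five wavelengths (not data from the paper).
simulated = [0.10, 0.80, 0.20, 0.90, 0.15]
good_prediction = [0.12, 0.78, 0.22, 0.88, 0.14]
bad_prediction = [0.90, 0.10, 0.90, 0.10, 0.90]

print(r2_score(simulated, simulated))        # perfect prediction gives the best score
print(r2_score(simulated, good_prediction))  # slightly below the best score
print(r2_score(simulated, bad_prediction))   # negative: worse than predicting the mean
```

A prediction that is worse than simply returning the mean of the target spectrum yields a negative value, which is why the score has no fixed lower bound.
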
As in the forward spectrum prediction, the regression algorithms mentioned above can be employed
in the inverse design of the GMs. Figure 5(a) shows the diagram of the inverse design, which
reverses the direction of the forward spectrum prediction: the inputs and outputs of the models
are the transmittances in the transmission spectrum and the structure

Fig. 4. (a) The diagram of the forward spectrum prediction. (b) Score and loss for different
generations of the GA. (c) Training time and accuracies for different regression algorithms
in forward spectrum prediction. (d) The transmission spectra predicted by the regression
algorithms and simulated by the FDTD simulation.

parameters of the GMs, respectively. Note that there is no need to generate new training
instances: we train the inverse design model on the same training instances by swapping the
inputs and outputs of the forward spectrum prediction. In the inverse design, the suitable
network architecture for the ANNs is a fully connected network whose topology is
200-50-200-500-100-4. The training time and accuracies for all regression algorithms in the
inverse design are shown in Fig. 5(b). All regression algorithms achieve excellent performance,
although the score of the DT (93) is lower than those of the ANNs (97), ERT (96), kNN (96.5) and
RF (98). To validate the effectiveness
of the regression algorithms in the inverse design, we randomly select a transmission spectrum
from the test set and input it into the models. The structure parameters (chemical potentials
µc1, µc2, µc3 and µc4) of the GMs predicted by the regression algorithms and the ground truth are
shown in Fig. 5(c). The predicted chemical potentials µc1, µc2, µc3 and µc4 are close to the
actual chemical potentials (red dashed lines), confirming the effectiveness of all regression
algorithms. In addition, we use the chemical potentials predicted by the regression algorithms to
simulate the GMs with the FDTD method. As shown in Fig. 5(d), the accuracy of the RF exceeds those
of the DT, ANNs, ERT and kNN. More importantly, the training time of the RF (6 seconds) is shorter
than that of the ANNs (34 seconds). The results shown in Fig. 5(b)-(d) indicate that the ANNs are
not the best choice for the inverse design of the GMs, and that the RF outperforms the ANNs in
terms of both accuracy and efficiency. Although the RF is superior to the ANNs for the inverse
design of the GMs, this does not mean that the RF is always the better choice for the inverse
design of photonics devices. The choice of ML algorithm usually depends on the application
scenario. The ANNs, especially deep learning models, may be more effective for complicated
photonics devices. For uncomplicated scenarios, however, such as the inverse design and
optimization of a photonics device that contains a small number of structure parameters (<15),
the ANNs may not
be the best choice. Compared with previous ANN-based methods, our proposed scheme provides a
simpler and more time-saving way to design photonics devices.
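
The reuse of training instances for the inverse design can be illustrated with a minimal Python sketch. It is not the authors' implementation: `toy_spectrum` is a hypothetical stand-in for the FDTD solver, the two-parameter device and its ranges are invented for illustration, and a plain k-nearest-neighbour average stands in for the regression algorithms. The only point is that the same (parameters, spectrum) pairs train both directions once inputs and outputs are swapped.

```python
import random


def toy_spectrum(params, n_points=20):
    """Hypothetical stand-in for the FDTD solver: one Lorentzian dip per parameter."""
    spectrum = []
    for i in range(n_points):
        x = i / (n_points - 1)  # normalised wavelength axis
        t = 1.0
        for p in params:
            t *= 1.0 - 0.9 / (1.0 + ((x - p) / 0.1) ** 2)
        spectrum.append(t)
    return spectrum


def knn_predict(train_in, train_out, query, k=3):
    """Average the outputs of the k training inputs closest to the query."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    nearest = sorted(range(len(train_in)),
                     key=lambda i: sq_dist(train_in[i], query))[:k]
    n_out = len(train_out[0])
    return [sum(train_out[i][j] for i in nearest) / k for j in range(n_out)]


rng = random.Random(0)
# One set of training instances: (structure parameters, spectrum) pairs.
params_set = [[rng.uniform(0.1, 0.4), rng.uniform(0.6, 0.9)] for _ in range(500)]
spectra_set = [toy_spectrum(p) for p in params_set]

target_params = [0.25, 0.75]
target_spectrum = toy_spectrum(target_params)

# Forward prediction: structure parameters -> spectrum.
predicted_spectrum = knn_predict(params_set, spectra_set, target_params)
# Inverse design: the same instances with inputs and outputs swapped.
predicted_params = knn_predict(spectra_set, params_set, target_spectrum)
print(predicted_params)
```

In this toy setting the two dips live in disjoint ranges, so the spectrum determines the parameters uniquely. For devices where several parameter sets give similar spectra, the inverse mapping is not unique and a direct swap of inputs and outputs may be less reliable.
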

Fig. 5. (a) The diagram of the inverse design. (b) The training time and accuracies (scores)
for all regression algorithms in inverse design. (c) The structure parameters (chemical
potentials µc1, µc2, µc3 and µc4 for the GNRs in the GMs) predicted by all regression
algorithms and the ground truth. (d) The FDTD simulated transmission spectra for the
chemical potentials predicted by the regression algorithms.

4. Performance optimization
The performance optimization for the transmission spectrum of the GMs can be divided into
two categories: optimization of the complete transmission spectrum and optimization of
performance metrics, such as transmittance or bandwidth. In the first case, similar to inverse
design, the data-driven model predicts the physical parameters that generate the optimized
transmission spectrum. In the second case, specific performance metrics of the transmission
spectrum, such as the transmittance at a given wavelength or the bandwidth of a transparency
window, are optimized directly. In this section, we use evolutionary algorithms to optimize the
GMs and compare the results of single-objective and multi-objective optimization.
The algorithmic details of the single-objective optimizations (GA and PSO) are outlined as
follows: (i) An initial population consisting of N = 40 individuals is randomly generated. Each
individual has four structure parameters, namely, the chemical potentials of the graphene ribbons
(µc1, µc2, µc3, µc4). The structure parameters are initialized within different ranges specified
by minimum and maximum values: 0.6 eV < µc1 < 0.8 eV, 0.4 eV < µc2 < 0.6 eV, 0.05 eV < µc3 <
0.25 eV and 0.6 eV < µc4 < 0.8 eV. (ii) For the N groups of structure parameters, the
transmission spectra are simulated by using the FDTD method. Different performance metrics, such
as the transmittance at a given wavelength, are regarded as the optimization objective and
fitness for the GA and PSO.
If the transmission spectrum with a wide bandwidth is optimized, the fitness can be defined as
$$F = \sum_{\lambda=\lambda_{\min}}^{\lambda_{\max}} \left| S_0(\lambda) - S(\lambda) \right| \qquad (11)$$

where λ, λmin (λmax), and S0(λ) (S(λ)) are the wavelength, the minimum (maximum) wavelength and
the targeted (optimized) transmission spectrum, respectively. Then, the individuals of the
population are sorted according to the fitness in descending order. (iii) A new population is
generated based on the previous generation. Note that the GA and PSO generate a new population in
different ways. For the GA, a new population is generated by using the standard selection,
crossover and mutation procedures. In the selection process, two parent individuals are selected
from the previous generation based on roulette-wheel selection or a tournament strategy [99]. The
structure parameters (chemical potentials) with better fitness are selected with higher
probability. In order to maintain the diversity of the population while keeping some superior
individuals, a small fraction of the superior (inferior) chemical potentials is kept in the next
generation. In the crossover process, the structure parameters (chemical potentials) are first
converted into binary values. Note that the conversion from decimal to binary can lead to a loss
of numerical precision. The optimization variables (chemical potentials) of the parent
individuals then cross over to generate a new population based on uniform crossover or
single-point crossover [99]. In the mutation process, each bit of a binary number has a 5%
probability of flipping from 0 (1) to 1 (0). After the optimization variables are converted from
binary back to decimal, a new population is obtained. For the PSO, the individuals in the
population rely on the globally optimal individual and the historically optimal record of each
individual to search for the optimal structure parameters [69]. When we use the PSO to optimize
the GMs, there is no need to convert decimal
to binary, which effectively avoids the loss of numerical precision. (iv) Finally, the fitness of
the newly generated population for the GA and PSO is evaluated to determine whether the iteration
should stop. If the structure parameters have evolved for 1000 generations or have remained
unchanged for more than 5 generations, the GA stops; otherwise, it returns to Step (ii). For the
PSO, if the population does not meet the termination conditions, the variations of the structure
parameters (so-called velocities) are calculated in the next iteration. The QGA is a parallel
evolutionary algorithm that combines the traditional GA with quantum algorithms [100]. In the
QGA, the optimization variables are encoded as quantum bits rather than binary numbers, and in
the crossover and mutation processes, the QGA uses the quantum rotation gate to update the
individuals.
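
Steps (i) to (iv) for the GA can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: `toy_spectrum` (with its `DEPTHS`) is a hypothetical surrogate for the FDTD simulation, the 12-bit encoding, the tournament size and the two elite individuals are assumed values, only superior individuals are carried over, and a fixed number of generations replaces the stopping rule in the text. The population size, parameter ranges, 5% bit-flip probability and the fitness of Eq. (11) follow the description above, with lower fitness treated as better.

```python
import random

BOUNDS = [(0.6, 0.8), (0.4, 0.6), (0.05, 0.25), (0.6, 0.8)]  # eV, from the text
DEPTHS = [0.9, 0.7, 0.5, 0.3]  # hypothetical dip depths for the toy surrogate
BITS = 12             # assumed bits per parameter (sets the quantisation step)
POP_SIZE = 40         # N = 40, from the text
MUTATION_RATE = 0.05  # per-bit flip probability, from the text
N_ELITE = 2           # assumed number of individuals copied unchanged


def toy_spectrum(mu, n_points=30):
    """Hypothetical surrogate for the FDTD simulation (not a physical model)."""
    out = []
    for i in range(n_points):
        x = i / (n_points - 1)
        t = 1.0
        for m, depth in zip(mu, DEPTHS):
            t *= 1.0 - depth / (1.0 + ((x - m) / 0.05) ** 2)
        out.append(t)
    return out


def fitness(mu, target):
    """Eq. (11): summed absolute deviation from the target spectrum."""
    return sum(abs(s0 - s) for s0, s in zip(target, toy_spectrum(mu)))


def decode(bits):
    """Map a bit list to four real-valued parameters within BOUNDS."""
    mu = []
    for k, (lo, hi) in enumerate(BOUNDS):
        chunk = bits[k * BITS:(k + 1) * BITS]
        value = int("".join(map(str, chunk)), 2)
        mu.append(lo + (hi - lo) * value / (2 ** BITS - 1))
    return mu


def tournament(pop, scores, rng, size=3):
    """Return the best of `size` randomly drawn individuals."""
    picks = rng.sample(range(len(pop)), size)
    return pop[min(picks, key=lambda i: scores[i])]


def evolve(target, generations=60, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(BITS * len(BOUNDS))]
           for _ in range(POP_SIZE)]
    history = []  # best fitness of each evaluated population
    for _ in range(generations):
        scores = [fitness(decode(ind), target) for ind in pop]
        order = sorted(range(POP_SIZE), key=lambda i: scores[i])
        history.append(scores[order[0]])
        new_pop = [pop[i][:] for i in order[:N_ELITE]]  # keep superior individuals
        while len(new_pop) < POP_SIZE:
            a = tournament(pop, scores, rng)
            b = tournament(pop, scores, rng)
            # Uniform crossover: each bit comes from either parent.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            # Mutation: flip each bit with probability MUTATION_RATE.
            child = [1 - g if rng.random() < MUTATION_RATE else g for g in child]
            new_pop.append(child)
        pop = new_pop
    scores = [fitness(decode(ind), target) for ind in pop]
    best = min(range(POP_SIZE), key=lambda i: scores[i])
    history.append(scores[best])
    return decode(pop[best]), history


target = toy_spectrum([0.7, 0.5, 0.15, 0.7])
best_mu, history = evolve(target)
print(best_mu, history[0], history[-1])
```

With 12 bits per parameter and a range of 0.2 eV, the decoded values are spaced about 0.2/4095 eV apart, which illustrates the precision loss of the binary encoding mentioned above.
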
To compare the optimization effects of the GA, QGA and PSO, we randomly select a complete
transmission spectrum (red dashed line in Fig. 6(c)) from the test set as the optimization
objective. Figure 6(a) shows the fitnesses of the GA, QGA and PSO for different generations in
the optimization of the GMs. The fitnesses of the GA, QGA and PSO approach 0, indicating that
these single-objective optimization algorithms converge and that the optimized transmission
spectra gradually approach the targeted transmission spectrum. The GA and PSO converge faster
than the QGA. In the 100th generation, we select the optimized structure parameters for all
optimization algorithms and compare them with the ground truth. Figure 6(b) shows that the
chemical potentials optimized by the GA, QGA and PSO agree well with the targeted chemical
potentials. In Fig. 6(c), the optimized transmission spectra in the first generation (green
solid lines) are randomly generated, while those in the 100th generation (blue solid lines) are
close to the targeted transmission spectrum.
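
The PSO used in this comparison updates real-valued positions directly, guided by each particle's own best record and the global best. A minimal Python sketch of that update is given below. The inertia weight `w` and acceleration coefficients `c1`, `c2` are commonly used values assumed here, not taken from the paper, and `toy_spectrum` (with its `DEPTHS`) is again a hypothetical surrogate for the FDTD simulation.

```python
import random

BOUNDS = [(0.6, 0.8), (0.4, 0.6), (0.05, 0.25), (0.6, 0.8)]  # eV, from the text
DEPTHS = [0.9, 0.7, 0.5, 0.3]  # hypothetical dip depths for the toy surrogate


def toy_spectrum(mu, n_points=30):
    """Hypothetical surrogate for the FDTD simulation (not a physical model)."""
    out = []
    for i in range(n_points):
        x = i / (n_points - 1)
        t = 1.0
        for m, depth in zip(mu, DEPTHS):
            t *= 1.0 - depth / (1.0 + ((x - m) / 0.05) ** 2)
        out.append(t)
    return out


def fitness(mu, target):
    """Eq. (11): summed absolute deviation from the target spectrum."""
    return sum(abs(s0 - s) for s0, s in zip(target, toy_spectrum(mu)))


def pso(target, n_particles=40, iterations=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(n_particles)]
    vel = [[0.0] * len(BOUNDS) for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                       # best position of each particle
    pbest_f = [fitness(p, target) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]          # best position of the swarm
    history = [gbest_f]
    for _ in range(iterations):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(BOUNDS):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the allowed parameter range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            f = fitness(pos[i], target)
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
        history.append(gbest_f)
    return gbest, history


target = toy_spectrum([0.7, 0.5, 0.15, 0.7])
best, history = pso(target)
print(best, history[0], history[-1])
```

Because the positions are stored as floating-point numbers, no decimal-to-binary conversion is involved, in line with the remark above about avoiding the loss of numerical precision.
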
Finally, the GMs are optimized for several performance metrics, such as the transmittances
at different wavelengths. The steepness of the PIT effect is a critical performance indicator,
which affects the bandwidth, group index, figure of merit and so on. In order to achieve steep
transmission characteristics, we use a multi-objective optimization algorithm, the non-dominated

Fig. 6. (a) The fitnesses of the GA, QGA and PSO for different generations in the
performance optimization. (b) Optimization results of chemical potentials for the GA,
QGA and PSO in the 100th iteration. (c) The optimized transmission spectra of the GA,
QGA and PSO in the first iteration (green line) and the 100th iteration (blue line). (d) The
multi-objective optimization results for two differences between one peak (8161 nm) and
two dips (7659 nm and 11620 nm). (e) The multi-objective optimization results for four
differences between two peaks (6110 nm and 12620 nm) and four dips (5150 nm, 6890 nm,
10310 nm and 13220 nm).

sorting genetic algorithm-II (NSGA-II), to optimize the transmittances at different wavelengths.
In comparison to other multi-objective optimization algorithms, NSGA-II finds the Pareto-optimal
solutions based on the fast non-dominated sorting method (FNSM) and an elitist strategy [101]. In
NSGA-II, the crowding distances of the individuals and the levels calculated by the FNSM jointly
determine the order of the individuals [102]. Across all performance indicators, the individuals
in a lower level are better than those in a higher level, while the individuals in the same level
are mutually incomparable. Here, the steepness of the PIT effect is simply characterized by the
differences between the transmission peaks and dips. The algorithmic details of NSGA-II are
outlined as follows: (i) The initial population for NSGA-II is generated in the same way as for
the GA, QGA and PSO. Here, each individual has seven structure parameters: the chemical
potentials of the graphene ribbons (µc1, µc2, µc3, µc4), the filling ratios of the graphene
ribbons (r1, r2) and the distance dg between the upper GNRs and the lower GNRs. All structure
parameters are initialized within different ranges: 0.6 eV < µc1 < 0.8 eV, 0.4 eV < µc2 < 0.6 eV,
0.05 eV < µc3 < 0.25 eV, 0.6 eV < µc4 < 0.8 eV, 0.7 < r1 < 0.9, 0.7 < r2 < 0.9 and
100 nm < dg < 300 nm. (ii) The differences between the transmittances at different wavelengths
are regarded as the fitness for NSGA-II. That is, two differences and four differences between
the transmission peaks and dips are calculated for one transparency window and two transparency
windows, respectively. Unlike the GA, QGA and PSO, the levels of the individuals in a population
for NSGA-II are determined by the FNSM, and the crowding distances are calculated for the
individuals in the same level to maintain the diversity of the population. The individuals in a
population are
sorted based on the levels and crowding distances [101]. (iii) A new population for NSGA-II is
generated in the same way as for the GA, QGA and PSO. (iv) The individuals in the newly generated
population are merged with the previous population to form a large population. The individuals
in the large population are sorted based on the FNSM and crowding distances, and the top N
individuals are selected to form the new population for the next iteration according to the
elitist strategy [101]. (v) The newly generated population for NSGA-II is evaluated in a similar
way to the GA and PSO, and the best individual in the Pareto front is selected as the solution of
NSGA-II. Figure 6(d) and Fig. 6(e) show the multi-objective optimization results for one
transparency window and two transparency windows, respectively. The optimization objective for
one transparency window consists of two differences between a transmission peak and two dips,
while that for two transparency windows consists of four differences between two transmission
peaks and four dips. After 100 iterations, the differences between the transmission peaks and
dips reach 0.76 and 0.97 (0.87, 0.83, 0.79 and 0.69) for one transparency window (two
transparency windows), indicating that NSGA-II is effective for the optimization of the GMs. The
multi-objective optimization can thus be used to achieve steep optical characteristics by
jointly considering several different performance metrics.
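
The sorting and elitist selection in steps (ii) and (iv) can be sketched in Python as follows. This is a generic illustration of fast non-dominated sorting and crowding distance, not the authors' code. The objective vectors are hypothetical peak-minus-dip differences, all to be maximised, and the function names are chosen here for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_sort(objs):
    """Return a list of levels (lists of indices); level 0 is the Pareto front."""
    n = len(objs)
    dominated = [[] for _ in range(n)]  # indices that individual i dominates
    counts = [0] * n                    # number of individuals dominating i
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(objs[i], objs[j]):
                dominated[i].append(j)
            elif dominates(objs[j], objs[i]):
                counts[i] += 1
    fronts = [[i for i in range(n) if counts[i] == 0]]
    while fronts[-1]:
        nxt = []
        for i in fronts[-1]:
            for j in dominated[i]:
                counts[j] -= 1
                if counts[j] == 0:
                    nxt.append(j)
        fronts.append(nxt)
    return fronts[:-1]  # drop the trailing empty level


def crowding_distance(front, objs):
    """Larger distance means a less crowded region of the same level."""
    dist = {i: 0.0 for i in front}
    for m in range(len(objs[front[0]])):
        ordered = sorted(front, key=lambda i: objs[i][m])
        lo, hi = objs[ordered[0]][m], objs[ordered[-1]][m]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")  # keep boundary points
        if hi == lo:
            continue
        for k in range(1, len(ordered) - 1):
            gap = objs[ordered[k + 1]][m] - objs[ordered[k - 1]][m]
            dist[ordered[k]] += gap / (hi - lo)
    return dist


def select(objs, n_keep):
    """Elitist truncation: fill level by level, then prefer less crowded individuals."""
    chosen = []
    for front in non_dominated_sort(objs):
        if len(chosen) + len(front) <= n_keep:
            chosen.extend(front)
        else:
            dist = crowding_distance(front, objs)
            ranked = sorted(front, key=lambda i: -dist[i])
            chosen.extend(ranked[:n_keep - len(chosen)])
            break
    return chosen


# Hypothetical (difference 1, difference 2) pairs for six individuals.
objs = [(0.9, 0.2), (0.7, 0.6), (0.3, 0.9), (0.6, 0.5), (0.2, 0.2), (0.5, 0.75)]
fronts = non_dominated_sort(objs)
print(fronts)
print(select(objs, 3))
```

In a full NSGA-II loop, `select` would be applied to the merged parent and offspring populations with `n_keep` equal to N, which is the elitist step described in (iv).
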

5. Conclusions
In this article, we provide guidance for the intelligent design of photonics devices based on
ML and evolutionary algorithms. We take the GMs as an example, and the structure parameters
of the GMs are designed to obtain the PIT effect in the transmission spectrum. The theoretical
transmission spectrum based on the transfer matrix method agrees well with the FDTD simulated
transmission spectrum. In addition, several traditional ML algorithms are used to achieve the
forward spectrum prediction and inverse design for the GMs. The calculated results demonstrate
that all the algorithms are effective and that the RF has advantages in terms of accuracy and
training speed in comparison to the ANNs. These conclusions can be extended to other physics
topics. Moreover, we use single-objective and multi-objective optimization algorithms to
optimize the GMs by jointly taking several performance metrics into consideration. The maximum
difference between the transmission peaks and dips in the optimized transmission spectrum
reaches 0.97. This work provides guidance for the intelligent design of graphene-based
structures and has important applications in the optimization of advanced materials and
metamaterials.

Funding
National Natural Science Foundation of China (61705015, 61625104, 61431003, 61821001);
Beijing Municipal Science and Technology Commission (Z181100008918011); Fundamental
Research Funds for the Central Universities (2019RC15, 2018XKJC02); National Key Research
and Development program (2019YFB1803504, 2018YFB2201803, 2016YFA0301300).

Disclosures
The authors declare no conflict of interest.

References
1. S. Molesky, Z. Lin, A. Y. Piggott, W. Jin, J. Vučković, and A. W. Rodriguez, “Inverse design in nanophotonics,” Nat.
Photonics 12(11), 659–670 (2018).
2. W. Bogaerts and L. Chrostowski, “Silicon photonics circuit design: methods, tools and challenges,” Laser Photonics
Rev. 12(4), 1700237 (2018).
3. K. Yao, R. Unni, and Y. Zheng, “Intelligent nanophotonics: merging photonics and artificial intelligence at the
nanoscale,” Nanophotonics 8(3), 339–366 (2019).
4. A. A. Balandin, S. Ghosh, W. Bao, I. Calizo, D. Teweldebrhan, F. Miao, and C. N. Lau, “Superior thermal conductivity
of single-layer graphene,” Nano Lett. 8(3), 902–907 (2008).
5. A. Roberts, D. Cormode, C. Reynolds, and T. Newhouse-Illige, “Response of graphene to femtosecond high-intensity
laser irradiation,” Appl. Phys. Lett. 99(5), 051912 (2011).
6. E. Hendry, P. J. Hale, J. Moger, A. Savchenko, and S. Mikhailov, “Coherent nonlinear optical response of graphene,”
Phys. Rev. Lett. 105(9), 097401 (2010).
7. Q. Bao and K. P. Loh, “Graphene photonics, plasmonics, and broadband optoelectronic devices,” ACS Nano 6(5),
3677–3694 (2012).
8. T. Zhang, J. Dai, Y. Dai, Y. Fan, X. Han, J. Li, F. Yin, Y. Zhou, and K. Xu, “Tunable plasmon induced transparency in
a metallodielectric grating coupled with graphene metamaterials,” J. Lightwave Technol. 35(23), 5142–5149 (2017).
9. R. R. Nair, P. Blake, A. N. Grigorenko, K. S. Novoselov, T. J. Booth, T. Stauber, N. M. R. Peres, and A. K. Geim, “Fine
structure constant defines visual transparency of graphene,” Science 320(5881), 1308 (2008).
10. A. K. Geim and K. S. Novoselov, “The rise of graphene,” Nat. Mater. 6(3), 183–191 (2007).
11. M. Liu, X. Yin, E. Ulin-Avila, B. Geng, T. Zentgraf, L. Ju, F. Wang, and X. Zhang, “A graphene-based broadband
optical modulator,” Nature 474(7349), 64–67 (2011).
12. T. J. Echtermeyer, P. Nene, M. Trushin, R. V. Gorbachev, A. L. Eiden, S. Milana, Z. Sun, J. Schliemann, E. Lidorikis,
K. S. Novoselov, and A. C. Ferrari, “Photothermoelectric and photoelectric contributions to light detection in
metal–graphene–metal photodetectors,” Nano Lett. 14(7), 3733–3742 (2014).
13. S.-H. Bae, Y. Lee, B. K. Sharma, H.-J. Lee, J.-H. Kim, and J.-H. Ahn, “Graphene-based transparent strain sensor,”
Carbon 51, 236–242 (2013).
14. M. Amin, M. Farhat, and H. Bağcı, “An ultra-broadband multilayered graphene absorber,” Opt. Express 21(24),
29938–29948 (2013).
15. X. Han, T. Wang, X. Li, S. Xiao, and Y. Zhu, “Dynamically tunable plasmon induced transparency in a graphene-based
nanoribbon waveguide coupled with graphene rectangular resonators structure on sapphire substrate,” Opt. Express
23(25), 31945–31955 (2015).
16. T. Zhang, X. Yin, L. Chen, and X. Li, “Ultra-compact polarization beam splitter utilizing a graphene-based
asymmetrical directional coupler,” Opt. Lett. 41(2), 356–359 (2016).
17. H.-Y. Kim, K. Lee, N. McEvoy, C. Yim, and G. S. Duesberg, “Chemically modulated graphene diodes,” Nano Lett.
13(5), 2182–2188 (2013).
18. S.-X. Xia, X. Zhai, L.-L. Wang, and S.-C. Wen, “Plasmonically induced transparency in double-layered graphene
nanoribbons,” Photonics Res. 6(7), 692–702 (2018).
19. H. Li, C. Ji, Y. Ren, J. Hu, M. Qin, and L. Wang, “Investigation of multiband plasmonic metamaterial perfect
absorbers based on graphene ribbons by the phase-coupled method,” Carbon 141, 481–487 (2019).
20. T. Zhang, J. Wang, Q. Liu, J. Zhou, J. Dai, X. Han, J. Li, Y. Zhou, and K. Xu, “Efficient spectrum prediction and
inverse design for plasmonic waveguide systems based on artificial neural networks,” Photonics Res. 7(3), 368–380
(2019).
21. J. Schmidhuber, “Deep learning in neural networks: An overview,” Neural Netw. 61, 85–117 (2015).
22. T. Young, D. Hazarika, S. Poria, and E. Cambria, “Recent trends in deep learning based natural language processing,”
IEEE Comput. Intell. Mag. 13(3), 55–75 (2018).
23. G. Hinton, L. Deng, D. Yu, G. E. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath,
and B. Kingsbury, “Deep neural networks for acoustic modeling in speech recognition: the shared views of four
research groups,” IEEE Signal Process. Mag. 29(6), 82–97 (2012).
24. M. Längkvist, L. Karlsson, and A. Loutfi, “A review of unsupervised feature learning and deep learning for time-series
modeling,” Pattern Recognit. Lett. 42, 11–24 (2014).
25. M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J.
Zhang, X. Zhang, J. Zhao, and K. Zieba, “End to end learning for self-driving cars,” arXiv preprint arXiv:1604.07316
(2016).
26. V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, “Playing atari with
deep reinforcement learning,” arXiv preprint arXiv:1312.5602 (2013).
27. S. Gu, E. Holly, T. Lillicrap, and S. Levine, “Deep reinforcement learning for robotic manipulation with asynchronous
off-policy updates,” in 2017 IEEE international conference on robotics and automation (ICRA), (IEEE, 2017),
3389–3396.
28. J. Peurifoy, Y. Shen, L. Jing, Y. Yang, F. Canorenteria, B. Delacy, J. D. Joannopoulos, M. Tegmark, and M. Soljačić,
“Nanophotonic particle simulation and inverse design using artificial neural networks,” Sci. Adv. 4(6), eaar4206
(2018).
29. S. Inampudi and H. Mosallaei, “Neural network based design of metagratings,” Appl. Phys. Lett. 112(24), 241102
(2018).
30. I. Balin, V. Garmider, Y. Long, and I. Abdulhalim, “Training artificial neural network for optimization of nanostructured
VO2 -based smart window performance,” Opt. Express 27(16), A1030–A1040 (2019).
31. A. M. Hammond and R. M. Camacho, “Designing integrated photonic devices using artificial neural networks,” Opt.
Express 27(21), 29620–29638 (2019).
32. D. Gostimirovic and N. Y. Winnie, “An open-source artificial neural network model for polarization-insensitive
silicon-on-insulator subwavelength grating couplers,” IEEE J. Sel. Top. Quantum Electron. 25(3), 1–5 (2019).
33. J. He, C. He, C. Zheng, Q. Wang, and J. Ye, “Plasmonic nanoparticle simulations and inverse design using machine
learning,” Nanoscale 11(37), 17444–17459 (2019).
34. M. H. Tahersima, K. Kojima, T. Koike-Akino, D. Jha, B. Wang, C. Lin, and K. Parsons, “Deep Neural Network
Inverse Design of Integrated Photonic Power Splitters,” Sci. Rep. 9(1), 1368 (2019).
35. S. An, C. Fowler, B. Zheng, M. Y. Shalaginov, H. Tang, H. Li, L. Zhou, J. Ding, A. M. Agarwal, C. Rivero-Baleine, K.
A. Richardson, T. Gu, J. Hu, and H. Zhang, “A deep learning approach for objective-driven all-dielectric metasurface
design,” ACS Photonics (2019).
36. C. C. Nadell, B. Huang, J. M. Malof, and W. J. Padilla, “Deep learning for accelerated all-dielectric metasurface
design,” Opt. Express 27(20), 27523–27535 (2019).
37. J. Baxter, A. C. Lesina, J. M. Guay, A. Weck, P. Berini, and L. Ramunno, “Plasmonic colours predicted by deep
learning,” Sci. Rep. 9(1), 8074 (2019).
38. T. Asano and S. Noda, “Optimization of photonic crystal nanocavities based on deep learning,” Opt. Express 26(25),
32704–32717 (2018).
39. T. Asano and S. Noda, “Iterative optimization of photonic crystal nanocavity designs by using deep neural networks,”
Nanophotonics 8(12), 2243–2256 (2019).
40. Y. Li, Y. Xu, M. Jiang, B. Li, T. Han, C. Chi, F. Lin, B. Shen, X. Zhu, L. Lai, and Z. Fang, “Self-Learning Perfect
Optical Chirality via a Deep Neural Network,” Phys. Rev. Lett. 123(21), 213902 (2019).
41. X. Li, J. Shu, W. Gu, and L. Gao, “Deep neural network for plasmonic sensor modeling,” Opt. Mater. Express 9(9),
3857–3862 (2019).
42. Y. Chen, J. Zhu, Y. Xie, N. Feng, and Q. Liu, “Smart inverse design of graphene-based photonic metamaterials by
an adaptive artificial neural network,” Nanoscale 11(19), 9749–9755 (2019).
43. W. Ma, F. Cheng, and Y. Liu, “Deep-learning-enabled on-demand design of chiral metamaterials,” ACS Nano 12(6),
6326–6334 (2018).
44. L. Gao, X. Li, D. Liu, L. Wang, and Z. Yu, “A bidirectional deep neural network for accurate silicon color design,”
Adv. Mater. (2019).
45. D. Liu, Y. Tan, E. Khoram, and Z. Yu, “Training deep neural networks for the inverse design of nanophotonic
structures,” ACS Photonics 5(4), 1365–1369 (2018).
46. Y. Qu, L. Jing, Y. Shen, M. Qiu, and M. Soljačić, “Migrating knowledge between physical scenarios based on
artificial neural networks,” ACS Photonics 6(5), 1168–1174 (2019).
47. Z. Liu, D. Zhu, S. P. Rodrigues, K.-T. Lee, and W. Cai, “Generative model for the inverse design of metasurfaces,”
Nano Lett. 18(10), 6570–6576 (2018).
48. J. Jiang and J. A. Fan, “Simulator-based training of generative neural networks for the inverse design of metasurfaces,”
Nanoscale (2019).
49. W. Ma, F. Cheng, Y. Xu, Q. Wen, and Y. Liu, “Probabilistic representation and inverse design of metamaterials based
on a deep generative model with semi-supervised learning strategy,” arXiv:1901.10819 (2019).
50. J. Jiang, D. Sell, S. Hoyer, J. Hickey, J. Yang, and J. A. Fan, “Free-form diffractive metagrating design based on
generative adversarial networks,” ACS Nano 13(8), 8872–8878 (2019).
51. Z. Huang, X. Liu, and J. Zang, “The inverse design of structural color using machine learning,” Nanoscale 11(45),
21748–21758 (2019).
52. I. Sajedian, T. Badloe, and J. Rho, “Optimisation of colour generation from dielectric nanostructures using
reinforcement learning,” Opt. Express 27(4), 5874–5883 (2019).
53. I. Sajedian, T. Badloe, and J. Rho, “Finding the optical properties of plasmonic structures by image processing using
a combination of convolutional neural networks and recurrent neural networks,” Microsyst. Nanoeng. 5(1), 27 (2019).
54. I. Sajedian, T. Badloe, and J. Rho, “Double-deep Q-learning to increase the efficiency of metasurface holograms,”
Sci. Rep. 9(1), 10899–8 (2019).
55. M. Turduev, E. Bor, C. Latifoglu, I. H. Giden, Y. S. Hanay, and H. Kurt, “Ultra-compact photonic structure design for
strong light confinement and coupling into nano-waveguide,” J. Lightwave Technol. 36(14), 2812–2819 (2018).
56. A. da Silva Ferreira, C. H. da Silva Santos, M. S. Gonçalves, and H. E. H. Figueroa, “Towards an integrated
evolutionary strategy and artificial neural network computational tool for designing photonic coupler devices,” Appl.
Soft Comput. 65, 1–11 (2018).
57. R. S. Hegde, “Photonics inverse design: pairing deep neural networks with evolutionary algorithms,” IEEE J. Sel.
Top. Quantum Electron. 26(1), 1–8 (2020).
58. A. Sakurai, K. Yada, T. Simomura, S. Ju, M. Kashiwagi, H. Okada, T. Nagao, K. Tsuda, and J. Shiomi, “Ultranarrow-
band wavelength-selective thermal emission with aperiodic multilayered metamaterials designed by Bayesian
optimization,” ACS Cent. Sci. 5(2), 319–326 (2019).
59. D. Melati, Y. Grinberg, M. K. Dezfouli, S. Janz, P. Cheben, J. H. Schmid, A. Sánchez-Postigo, and D. Xu, “Mapping
the global design space of nanophotonic components using machine learning pattern recognition,” Nat. Commun.
10(1), 4775 (2019).
60. K. Were, D. T. Bui, Ø.B. Dick, and B. R. Singh, “A comparative assessment of support vector regression, artificial
neural networks, and random forests for predicting and mapping soil organic carbon stocks across an Afromontane
landscape,” Ecol. Indic. 52, 394–403 (2015).
61. A. Ahmad, M. Hassan, M. Abdullah, H. Rahman, F. Hussin, H. Abdullah, and R. Saidur, “A review on applications
of ANN and SVM for building electrical energy consumption forecasting,” Renewable Sustainable Energy Rev. 33,
102–109 (2014).
62. Z. Lin, X. Liang, M. Lončar, S. G. Johnson, and A. W. Rodriguez, “Cavity-enhanced second-harmonic generation via
nonlinear-overlap optimization,” Optica 3(3), 233–238 (2016).
63. T. W. Hughes, M. Minkov, I. A. Williamson, and S. Fan, “Adjoint method and inverse design for nonlinear
nanophotonic devices,” ACS Photonics 5(12), 4781–4787 (2018).
64. T. W. Hughes, M. Minkov, Y. Shi, and S. Fan, “Training of photonic neural networks through in situ backpropagation
and gradient measurement,” Optica 5(7), 864–871 (2018).
65. A. Y. Piggott, J. Lu, K. G. Lagoudakis, J. Petykiewicz, T. M. Babinec, and J. Vučković, “Inverse design and
demonstration of a compact and broadband on-chip wavelength demultiplexer,” Nat. Photonics 9(6), 374–377 (2015).
66. N. M. Estakhri, B. Edwards, and N. Engheta, “Inverse-designed metastructures that solve equations,” Science
363(6433), 1333–1338 (2019).
67. B. Shen, P. Wang, R. Polson, and R. Menon, “An integrated-nanophotonics polarization beamsplitter with 2.4 × 2.4
µm2 footprint,” Nat. Photonics 9(6), 378–382 (2015).
68. H. Cui, X. Sun, and Z. Yu, “Genetic-algorithm-optimized wideband on-chip polarization rotator with an ultrasmall
footprint,” Opt. Lett. 42(16), 3093 (2017).
69. J. C. Mak, C. Sideris, J. Jeong, A. Hajimiri, and J. K. Poon, “Binary particle swarm optimized 2×2 power splitters in
a standard foundry silicon photonic platform,” Opt. Lett. 41(16), 3868 (2016).
70. T. Zhang, J. Wang, Y. Dan, Y. Lanqiu, J. Dai, X. Han, and K. Xu, “Efficient training and design of photonic neural
network through neuroevolution,” Opt. Express 27(26), 37150–37163 (2019).
71. Z. Jin, S. Mei, S. Chen, Y. Li, C. Zhang, Y. He, X. Yu, C. Yu, J. K. W. Yang, B. Luk’yanchuk, S. Xiao, and C. Qiu,
“Complex Inverse Design of Meta-optics by Segmented Hierarchical Evolutionary Algorithm,” ACS Nano 13(1),
821–829 (2019).
72. Y. Xing, D. Spina, A. Li, T. Dhaene, and W. Bogaerts, “Stochastic collocation for device-level variability analysis in
integrated photonics,” Photonics Res. 4(2), 93–100 (2016).
73. S. D. Campbell, D. Sell, R. P. Jenkins, E. B. Whiting, J. A. Fan, and D. H. Werner, “Review of numerical optimization
techniques for meta-device design [Invited],” Opt. Mater. Express 9(4), 1842–1863 (2019).
74. J. Nagar, S. D. Campbell, Q. Ren, J. A. Easum, R. P. Jenkins, and D. H. Werner, “Multiobjective Optimization-Aided
Metamaterials-by-Design With Application to Highly Directive Nanodevices,” IEEE J. Multiscale Multiphys. Comput.
Tech. 2, 147–158 (2017).
75. A. V. Zayats, I. I. Smolyaninov, and A. A. Maradudin, “Nano-optics of surface plasmon polaritons,” Phys. Rep.
408(3-4), 131–314 (2005).
76. M. Jablan, H. Buljan, and M. Soljačić, “Plasmonics in graphene at infrared frequencies,” Phys. Rev. B 80(24), 245435
(2009).
77. T. Zhang, L. Chen, and X. Li, “Graphene-based tunable broadband hyperlens for far-field subdiffraction imaging at
mid-infrared frequencies,” Opt. Express 21(18), 20888–20899 (2013).
78. A. Vakil and N. Engheta, “Transformation optics using graphene,” Science 332(6035), 1291–1294 (2011).
79. L. Ju, B. Geng, J. Horng, C. Girit, M. Martin, Z. Hao, H. A. Bechtel, X. Liang, A. Zettl, Y. R. Shen, and F. Wang,
“Graphene plasmonics for tunable terahertz metamaterials,” Nat. Nanotechnol. 6(10), 630–634 (2011).
80. M. A. Othman, C. Guclu, and F. Capolino, “Graphene-based tunable hyperbolic metamaterials and enhanced
near-field absorption,” Opt. Express 21(6), 7614–7632 (2013).
81. S. Xiao, T. Wang, T. Liu, X. Yan, Z. Li, and C. Xu, “Active modulation of electromagnetically induced transparency
analogue in terahertz hybrid metal-graphene metamaterials,” Carbon 126, 271–278 (2018).
82. R. Alaee, M. Farhat, C. Rockstuhl, and F. Lederer, “A perfect absorber made of a graphene micro-ribbon metamaterial,”
Opt. Express 20(27), 28017–28024 (2012).
83. D. Rodrigo, O. Limaj, D. Janner, D. Etezadi, F. J. G. de Abajo, V. Pruneri, and H. Altug, “Mid-infrared plasmonic
biosensing with graphene,” Science 349(6244), 165–168 (2015).
84. T. Zhang, L. Chen, B. Wang, and X. Li, “Tunable broadband plasmonic field enhancement on a graphene surface
using a normal-incidence plane wave at mid-infrared frequencies,” Sci. Rep. 5(1), 11195 (2015).
85. H. Li, L. Wang, J. Liu, Z. Huang, B. Sun, and X. Zhai, “Investigation of the graphene based planar plasmonic filters,”
Appl. Phys. Lett. 103(21), 211104 (2013).
86. A. Y. Nikitin, F. Guinea, F. J. García-Vidal, and L. Martín-Moreno, “Edge and waveguide terahertz surface plasmon
modes in graphene microribbons,” Phys. Rev. B 84(16), 161407 (2011).
87. T. Zhang, J. Zhou, J. Dai, Y. Dai, X. Han, J. Li, F. Yin, Y. Zhou, and K. Xu, “Plasmon induced absorption in a
graphene-based nanoribbon waveguide system and its applications in logic gate and sensor,” J. Phys. D: Appl. Phys.
51(5), 055103 (2018).
88. R. D. Kekatpure, E. S. Barnard, W. Cai, and M. L. Brongersma, “Phase-coupled plasmon-induced transparency,”
Phys. Rev. Lett. 104(24), 243902 (2010).
89. A. Y. Piggott, J. Lu, K. G. Lagoudakis, J. Petykiewicz, T. M. Babinec, and J. Vučković, “Inverse design and
demonstration of a compact and broadband on-chip wavelength demultiplexer,” Nat. Photonics 9(6), 374–377 (2015).
90. L. F. Frellsen, Y. Ding, O. Sigmund, and L. H. Frandsen, “Topology optimized mode multiplexing in silicon-on-
insulator photonic wire waveguides,” Opt. Express 24(15), 16866–16873 (2016).
91. L. Du, D. Tang, and X. Yuan, “Edge-reflection phase directed plasmonic resonances on graphene nano-structures,”
Opt. Express 22(19), 22689–22698 (2014).
92. C. Zeng, J. Guo, and X. Liu, “High-contrast electro-optic modulation of spatial light induced by graphene-integrated
Fabry-Pérot microcavity,” Appl. Phys. Lett. 105(12), 121103 (2014).
93. M. Maltamo and A. Kangas, “Methods based on k-nearest neighbor regression in the prediction of basal area diameter
distribution,” Can. J. For. Res. 28(8), 1107–1115 (1998).
94. A. Swetapadma and A. Yadav, “A novel decision tree regression-based fault distance estimation scheme for
transmission lines,” IEEE Trans. Power Delivery 32(1), 234–245 (2017).
95. A. Liaw and M. Wiener, “Classification and regression by randomForest,” R News 2, 18–22 (2002).
96. P. Geurts, D. Ernst, and L. Wehenkel, “Extremely randomized trees,” Mach. Learn. 63(1), 3–42 (2006).
97. C. M. Bishop, Pattern Recognition and Machine Learning (Springer, 2006).
98. F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss,
V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, “Scikit-learn:
Machine learning in Python,” J. Mach. Learn. Res. 12, 2825–2830 (2011).
99. A. Chipperfield and P. Fleming, “The MATLAB genetic algorithm toolbox,” in IEE Colloquium on Applied
Control Techniques Using MATLAB, Digest No. 1995/014 (1995).
100. K.-H. Han and J.-H. Kim, “Genetic quantum algorithm and its application to combinatorial optimization problem,”
in Proceedings of the 2000 Congress on Evolutionary Computation. CEC00 (Cat. No. 00TH8512), (IEEE, 2000),
1354–1360.
101. K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,”
IEEE Trans. Evol. Computat. 6(2), 182–197 (2002).
102. Y. Zhang, D. Liu, X. Shen, J. Bai, Q. Liu, Z. Cheng, P. Tang, and L. Yang, “Design of iodine absorption cell for
high-spectral-resolution lidar,” Opt. Express 25(14), 15913–15926 (2017).
