
Journal of Optimization Theory and Applications

https://doi.org/10.1007/s10957-019-01614-8

Machine-Learning Techniques for the Optimal Design of Acoustic Metamaterials

Andrea Bacigalupo1 · Giorgio Gnecco1 · Marco Lepidi2 · Luigi Gambarotta2

Received: 14 June 2019 / Accepted: 13 November 2019


© Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract
Recently, an increasing research effort has been dedicated to analyzing the transmission
and dispersion properties of periodic acoustic metamaterials, characterized by
the presence of local resonators. Within this context, particular attention has been
paid to the optimization of the amplitudes and center frequencies of selected stop and
pass bands inside the Floquet–Bloch spectra of acoustic metamaterials featuring
a chiral or antichiral microstructure. Novel functional applications of such research
are expected in the optimal parametric design of smart tunable mechanical filters and
directional waveguides. The present paper deals with the maximization of the ampli-
tude of low-frequency band gaps, by proposing suitable numerical techniques to solve
the associated optimization problems. Specifically, the feasibility and effectiveness of
Radial Basis Function networks and Quasi-Monte Carlo methods for the interpola-
tion of the objective functions of such optimization problems are discussed, and their
numerical application to a specific acoustic metamaterial with tetrachiral microstruc-
ture is presented. The discussion is motivated theoretically by the high computational
effort often needed for an exact evaluation of the objective functions arising in band
gap optimization problems, when iterative algorithms are used for their approximate
solution. By replacing such functions with suitable surrogate objective functions con-
structed by applying machine-learning techniques, well-performing suboptimal solutions
can be obtained with a smaller computational effort. Numerical results demonstrate the
effective potential of the proposed approach. Current directions of research involving
the use of additional machine-learning techniques are also presented.

Keywords Nonlinear programming · Surrogate optimization · Radial basis function networks · Lattice materials · Dispersion properties

Mathematics Subject Classification 41A05 · 90C26 · 90C90

Giorgio Gnecco (corresponding author)
[email protected]
Extended author information available on the last page of the article


1 Introduction

The extraordinary perspectives offered by the recent theoretical advances and tech-
nological developments in the field of micro-architected materials have attracted
growing attention from a large and multi-disciplinary scientific community [1–3].
In particular, an increasing research effort has been dedicated over the last years to
analyzing the transmission and dispersion properties of elastic waves propagating in peri-
odic microstructures characterized by the presence of one or more auxiliary oscillators
per elementary cell [4–6]. Such periodic systems—known as acoustic metamaterial-
s—can leverage the local resonance mechanism between the cellular microstructure
and the auxiliary oscillators (resonators) to ensure a variety of functional improve-
ments with respect to the corresponding (resonator-free) materials. Indeed, if properly
designed according to a suitable parametric optimization strategy [7], the optimized
geometric, inertial and elastic parameters of the periodic microstructure and of the
resonators may allow the achievement of desirable wave dispersion properties. Conse-
quently, challenging issues in the modern field of theoretical and applied optimization
arise in optimally designing the spectral metamaterial properties for specific objec-
tives, including—for instance—opening, enlarging, closing, or shifting band gaps in
desired acoustic frequency ranges. From the technological viewpoint, a successful
optimization of the dispersion properties for acoustic metamaterials paves the way
for developing a new generation of smart engineering devices, such as directional
waveguides, mechanical filters, negative refractors, sub-wavelength edge detectors,
invisibility cloaks, ultrasound focusers, and mechanical energy propagators [8–12].
A promising technique for the spectral design of acoustic metamaterials is based on
the formulation and numerical solution of constrained nonlinear optimization prob-
lems, typically characterized by continuous optimization variables. Such optimization
problems can be suitably stated as follows. First, some geometric, elastic and inertial
properties (such as the mass and stiffness of the local resonators) are assumed as design
parameters, playing the role of optimization variables. Second, mechanical and geo-
metric limitations (such as validity boundaries for the assumptions of the model) are
taken into account to establish the linear/nonlinear constraints. Third, any desirable
physical property associated with the metamaterial spectrum is adopted as nonlinear
objective function. Depending on functional requirements, the objective function can
be defined by properly weighting different spectral properties [13–17], or by combin-
ing dynamic and static features [18]. Unfortunately, the numerical solution of such
optimization problems by straightforwardly applying classical iterative algorithms
[19] tends to become highly demanding and, consequently, often inconvenient. Specif-
ically, a first time-consuming bottleneck in the application of such algorithms can be
identified in the evaluation of the objective function at each iteration. Indeed, evalu-
ating the objective function typically requires the numerical solution of a sequence
of eigenvalue or generalized eigenvalue subproblems, one for each value of the (dis-
cretized) domain of the dispersion functions. A second, similar issue may arise for
the approximate computation of the partial derivatives of the objective functions with
respect to the optimization variables, whenever the iterative algorithm exploits such
information during its iterations. Finally, it is worth remarking that the computational
effort grows rapidly with (i) the increase in the physical model dimensions (larger
eigenspace and, usually, higher spectral density) and/or (ii) the increase in the number
of design parameters taken as optimization variables (larger domain of the objective
function).
The computational hurdles slowing down the spectral optimization process can
be bypassed by suitably reformulating the problem in a two-phase procedure. By
replacing the original (or true) objective functions (and their partial derivatives) with
more easily computable approximations, surrogate optimization [20, 21] can tackle the
spectral optimization problems in a numerically efficient way, rapidly providing well-
performing suboptimal solutions (first phase). Then, if necessary, the suboptimal
solutions could be re-optimized locally, turning back to the true objective functions,
hence to the original optimization problems (second phase).
Within this challenging framework, the present contribution is focused on the
formulation and application of surrogate optimization techniques—based on inter-
polation through Gaussian Radial Basis Function (RBF) networks and Quasi-Monte
Carlo sequences—for the optimization of the filtering properties of a periodic acoustic
metamaterial with tetrachiral cellular topology. Based on the physical–mathematical
Lagrangian model governing the free propagation of elastic waves through the meta-
material [22], the paper intends to thoroughly revise the methodological approach
presented in [23], by conveniently re-formulating and extending it in the framework
of a surrogate optimization strategy. To this purpose, proper motivations for the appli-
cation of interpolation techniques in the construction of surrogate objective functions
to be used in band gap optimization problems are provided first (Sect. 2). Second, the
new approach adopted for the numerical solution of these optimization problems is
detailed (Sect. 3), with focus on the combination of Gaussian RBF networks, a suit-
able iterative optimization algorithm, and Quasi-Monte Carlo sequences. The latter,
in particular, are applied for both the construction of the Gaussian RBF networks and
the re-initialization of the optimization algorithm. Then, the numerically optimized
designs obtainable for the microstructure of the acoustic tetrachiral metamaterial are
presented, and properly discussed with reference to the existing literature (Sect. 4).
Original ideas and viable directions for further research developments, based on the
use of additional machine-learning techniques are finally outlined (Sect. 5). Then,
some conclusions are drawn (Sect. 6). In order to improve the readability of the paper for the
largest possible audience, some specific mathematical concepts, not strictly necessary
to present and discuss the principal substantial aspects of the research, have been orga-
nized in three thematic Appendices, in which formal notations and complete technical
details are reported.

2 Motivations for Machine Learning in Spectral Design Problems

A fundamental problem in the spectral design of periodic materials and metamateri-
als consists in the maximization of the stop bandwidth (amplitude of the band gap)
separating a selected pair of consecutive dispersion curves in the acoustic part of the
dispersion spectrum [5, 24–26]. From the mathematical viewpoint, the matter can be
approached by adopting the stop bandwidth as objective function f (x) and formulating
a constrained nonlinear optimization problem in the form

$$
\begin{aligned}
\underset{x\,\in\,\mathbb{R}^{p}}{\text{maximize}} \quad & f(x), \\
\text{s.t.} \quad & h_i(x) = 0, \quad i = 1, \dots, n_{\mathrm{eq}}, \\
& g_j(x) \le 0, \quad j = 1, \dots, n_{\mathrm{ineq}},
\end{aligned}
\tag{1}
$$

where the physical properties of the metamaterial microstructure play the role of real-
valued optimization variables, collected in the vector x ∈ R p . The auxiliary functions
h i (x) and g j (x) are employed to express the constraints in the mathematical form of
equalities and inequalities, respectively.
Considering a Lagrangian beam lattice formulation, an M-dimensional physical–
mathematical model can be employed to govern the free propagation of linear elastic
waves in the periodic tetrachiral metamaterial [22]. Denoting by μ the vector col-
lecting the minimal set of independent dimensionless mechanical parameters of the
Lagrangian model, and denoting by k the dimensionless wavevector spanning the Bril-
louin zone of the periodic lattice, the Floquet–Bloch spectrum is characterized by M
real-valued dispersion relations ω(μ, k). Denoting by M(μ) and K(μ, k) the mass
and generalized stiffness matrices of the mechanical model, the dispersion relations
satisfy the characteristic equation F(ω, μ, k) = 0 associated with a linear eigenproblem
[K(μ, k) − ω² M(μ)] ψ = H(ω, μ, k) ψ = 0 governing the free-wave propagation
according to the Floquet–Bloch theory. Therefore, the objective function f (x) describ-
ing the amplitude of a certain spectral band gap in the spectral design optimization
problem (1) can be established as a suitable combination of two consecutive dispersion
relations ω(μ, k) associated with different waveforms ψ(μ, k). Consequently, all (or
part of) the mechanical parameters in the μ-vector play the role of the p optimization
variables, or design parameters, composing the vector x.
The main computational issues arising when trying to solve this kind of optimization
problem numerically can now be discussed. The evaluation of the band gap amplitude
associated with any given choice for the set of design parameters requires two main
computational steps (a minimal sketch in code is given after the list):
(a) Determining the M coefficients of the characteristic polynomial F(ω, μ, k)
related to the determinant of a square (M-by-M) Hermitian matrix H(ω, μ, k),
and then calculating the M polynomial roots. This step has to be repeated sev-
eral times, each time considering a different wavevector k identifying one of the
points necessary to discretize with sufficient fineness the closed boundary ∂ B of
a suitable subdomain B of the Brillouin zone;
(b) Determining the maximum and minimum values of two selected consecutive
dispersion curves and then comparing these values.
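As an illustration of these two steps, a minimal Python sketch follows; the routine build_matrices (returning M(μ) and K(μ, k) for the Lagrangian model) and the list of discretized boundary wavevectors are hypothetical placeholders, and the normalization by the band gap center frequency anticipates the choice made in Sect. 4:

```python
import numpy as np
from scipy.linalg import eigh

def band_gap_amplitude(mu, boundary_points, build_matrices, low_curve=2):
    """Sketch of steps (a)-(b): normalized amplitude of the band gap between
    dispersion curves number `low_curve` and `low_curve + 1`.
    `build_matrices(mu, k)` is a hypothetical routine returning the mass and
    stiffness matrices M(mu), K(mu, k) of the Lagrangian model."""
    max_lower, min_upper = -np.inf, np.inf
    for k in boundary_points:            # discretized closed boundary of B
        M, K = build_matrices(mu, k)
        # Step (a): roots of det(K - w^2 M) = 0, computed here as the
        # generalized eigenvalues of the Hermitian pencil (K, M).
        omega = np.sqrt(np.maximum(eigh(K, M, eigvals_only=True), 0.0))
        # Step (b): extrema of the two selected consecutive dispersion curves
        # (frequencies are simply sorted, so curve crossings are ignored here).
        max_lower = max(max_lower, omega[low_curve - 1])
        min_upper = min(min_upper, omega[low_curve])
    gap = min_upper - max_lower
    center = 0.5 * (min_upper + max_lower)
    return max(gap, 0.0) / center        # zero amplitude if the curves overlap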
Within this two-step procedure, the computational effort is clearly dominated by the
first step, and increases by increasing either the dimension of the square matrix (model
improvement), or the refinement of the discretization of the closed boundary ∂ B. It
is worth remarking that, in this context, a symbolic computation of the coefficients of
the characteristic polynomial does not help to decrease the computational effort (actu-
ally, it may even increase it), unless the square matrix H(ω, μ, k) is sparse, which is
usually not the case in this kind of physical model. Only for a sparse matrix, indeed,
a significant reduction in the number of terms appearing in the symbolic expression of
the matrix determinant is expected. In all the other cases, a numerical evaluation of the
coefficients for each choice of the μ-vector of design parameters has to be preferred.
However, a more feasible way to reduce—possibly significantly—the computational
effort consists in reducing the number of evaluations of the objective function of the
original optimization problem. This computational cost-saving target can be achieved
by introducing acceptable simplifications, including—for instance—the adoption of
supervised machine-learning techniques (i.e., based on a set of labeled training exam-
ples) for function interpolation or approximation. The application of such techniques
to the optimal design of innovative periodic metamaterials for the realization of highly
performing acoustic filters is an original contribution of the paper. It has to be remarked
that the considerations motivating the employment of machine-learning techniques are
quite general, and their validity holds also for other optimizable topologies of periodic
metamaterials [15–17]. It can also be observed that the application of machine-learning
techniques represents an important recent trend of research in materials science as well
(see [27] for a recent review of the state of the art in this field).

3 Surrogate Optimization in Spectral Design

In order to decrease the computational effort associated with the numerical solution of
band gap optimization problems, a surrogate optimization approach can be considered.
The leading idea is that the true objective function can be evaluated exactly on a finite
subset of the domain to which the set of continuous design variables belongs. In
machine-learning terms, this subset is called training set. Then, the function values
are interpolated/extrapolated in the remaining domain of the design variables. As a
consequence, the numerical optimization is performed by replacing the true objective
function with its interpolant, whose evaluation is expected to be easier (once the
interpolant has been constructed in the training phase). According to this general
strategy, the approach proposed in this paper to tackle the band gap optimization
problems under investigation is characterized by the following features (an end-to-end sketch in code is given after the list):

(a) A mesh-free interpolation is used for interpolating the objective function values,
using a finite number Nc of strictly positive-definite RBF computational units
[28]. Technical details about the mesh-free interpolation are reported in “Ap-
pendix A.” Moreover, a Quasi-Monte Carlo discretization [29] of the domain of
design variables is exploited to select the centers of the RBFs, after removing
from the Quasi-Monte Carlo sequence all the points that do not satisfy the con-
straints of the original optimization problem. These choices are motivated by the
nice approximation error guarantees (expressed in the sup norm) that are avail-
able for the mesh-free interpolation method, when both the objective function
to be interpolated is smooth enough and the fill distance (see “Appendix B”) is
small enough (see, for instance, Theorem 14.5 in [28]). Loosely speaking, the fill
distance is a way to quantify how well the training points fill the optimization
domain. A small value of the fill distance can be obtained, e.g., when the train-

123
Journal of Optimization Theory and Applications

ing points are generated using a (mesh-free) Quasi-Monte Carlo discretization


[29]. In the paper, Gaussian RBFs with identical width are used for the inter-
polation. The width is chosen by leave-one-out cross-validation, following the
approach described in [28] (in Sect. 17.1.3) and therein called Rippa method [30].
For simplicity, the centers of the Gaussian RBF computational units are chosen
coincident with the training points. In this way, the strict positive-definiteness
of the Gaussian RBF computational units guarantees the non-singularity of the
associated interpolation matrix, hence, also the uniqueness of the interpolant (see
Chapter 3 in [28]);
(b) A surrogate objective function is constructed by using the mesh-free function
interpolation method, because the values assumed by the true objective function
at the input training points are known exactly. This specific point distinguishes
the approach from alternative machine-learning methods like Support Vector
Regression (see, for instance, Sect. 6.2 in [31]), which exploit, instead, function
approximation. Technical details about surrogate optimization are reported
in “Appendix B”;
(c) A numerical optimization of the surrogate objective function is performed by
applying a Sequential Linear Programming (SLP) iterative algorithm based on
an adaptive trust region. This version of the SLP algorithm has some similarities
with that presented in [32] for a different application of numerical optimization
in materials science. At each iteration of the SLP algorithm, the initial optimiza-
tion problem (whose objective function is the surrogate one) is replaced by its
linearization around the current vector of design variables. Changes in these vari-
able values are also limited by the presence of an additional adaptive trust region
constraint, which typically depends on the quality of the linearization performed
in the previous iterations. Specifically, at each iteration but the first one, the size of
the trust region (which is centered on the current choice for the vector of design
variables) is expanded (up to an upper bound on that size) when the previous
linear approximation is accurate. This means that all the relative errors between
the linear approximations of the partial derivatives of the objective function (and
possibly, also of the constraining functions, when they are nonlinear) and their
true values at the current point are smaller than a given tolerance ς > 0. Oth-
erwise, the size of the trust region is reduced (down to a lower bound). A possible
choice for the adaptive trust region is a hypercube with edges of length L, whose
lower and upper bounds are denoted, respectively, by L min and L max . Its initial
length is denoted by L init . Here, the SLP algorithm is terminated after a fixed
number Ni of iterations, although more sophisticated termination criteria may be
also exploited. Differently from [15–17, 22], the SLP algorithm is used instead of
the Globally Convergent version of the Method of Moving Asymptotes (GCMMA)
[33, 34], since some preliminary simulations suggest that, at least for the spe-
cific optimization problem, the performance of SLP does not depend strongly on
the interpolation quality. Technical details about the iterative optimization algorithms
are reported in “Appendix C”;
(d) The iterative optimization algorithm is applied a finite number Ns of times,
by adopting a Quasi-Monte Carlo multi-start initialization approach (identical
to that employed in [15–17, 22]), which produces Ns different starting points.
In addition, a subset of the same input training points used for the construction
of the interpolant is selected also to perform the Quasi-Monte Carlo multi-start
initialization, since on such points the surrogate objective function reproduces
exactly the true objective function, committing a zero approximation error therein.
Basically, at each initial iteration of the algorithm, the true objective function and
the surrogate one assume the same value.
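A compact end-to-end sketch of features (a), (b) and (d) is given below under simplifying assumptions: only box constraints are present (as in the application of Sect. 4), the names f_true, bounds and n_points are illustrative placeholders, and a generic box-constrained local solver from SciPy stands in for the SLP algorithm of feature (c); it is a sketch, not the authors' implementation.

```python
import numpy as np
from scipy.stats import qmc
from scipy.optimize import minimize

def gaussian_kernel(X, Y, sigma):
    # Gaussian RBF matrix with a common width sigma (strictly positive definite)
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def rippa_loo_errors(A, c):
    # Leave-one-out errors of the Rippa method: eps_k = c_k / (A^{-1})_{kk}
    return c / np.diag(np.linalg.inv(A))

def build_surrogate(f_true, bounds, n_points=500, widths=np.logspace(-2, 1, 20)):
    # (a) Sobol' (Quasi-Monte Carlo) training points scaled into the box constraints;
    # points violating further constraints, if any, would be discarded here.
    lower, upper = np.asarray(bounds).T
    X = qmc.scale(qmc.Sobol(d=len(lower), scramble=False).random(n_points), lower, upper)
    y = np.array([f_true(x) for x in X])        # exact objective values (training set)
    # (b) common width chosen by leave-one-out cross-validation (max-norm criterion)
    def loo_score(s):
        A = gaussian_kernel(X, X, s)
        return np.max(np.abs(rippa_loo_errors(A, np.linalg.solve(A, y))))
    sigma = min(widths, key=loo_score)
    coeff = np.linalg.solve(gaussian_kernel(X, X, sigma), y)
    surrogate = lambda x: float(gaussian_kernel(np.atleast_2d(x), X, sigma) @ coeff)
    return surrogate, X, y

def multistart_maximize(surrogate, X, y, bounds, n_starts=10):
    # (d) multi-start initialization from training points (here, the n_starts best ones);
    # a generic box-constrained local solver stands in for the SLP algorithm of (c).
    starts = X[np.argsort(-y)[:n_starts]]
    runs = [minimize(lambda z: -surrogate(z), x0, bounds=bounds) for x0 in starts]
    best = min(runs, key=lambda r: r.fun)
    return best.x, -best.fun                    # best design and surrogate objective value
```

Selecting the training points with the largest objective values as starting points is just one possible realization of the multi-start initialization of feature (d).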

It is worth remarking that the proposed approach has some similarities with the
methodology applied in [35] to a mechanical design problem. The similarities include
the application of an RBF network interpolant in the context of surrogate optimization,
whereas the differences consist of (i) the use of a Quasi-Monte Carlo sequence both for
constructing the interpolant and for initializing the optimization, (ii) the application of
leave-one-out cross-validation to select the common width parameter of the Gaussian
RBFs, and (iii) the use of the SLP algorithm with the adaptive trust region for the
numerical optimization. In the context of the band gap optimization problem under
investigation, an additional motivation for the application of a Quasi-Monte Carlo
sequence is the Lipschitz continuity of its objective function (see “Appendix B” for
more details about this issue).
As a final remark, the maximum value achieved by the Gaussian RBF network inter-
polant on the whole domain of design variables is typically different from that assumed
by the surrogate/true objective function on the training set. This
occurrence can be explained under the additional simplifying condition that
there are no equality constraints and all the inequality constraints are inactive at opti-
mality (i.e., that they hold with strict inequality). When this condition holds, a
necessary (but not sufficient) condition for getting the same maximum values in the
two cases is that the gradient of the Gaussian RBF network interpolant is equal to
the all-zero vector, for the elements of the training set associated with the maximum
objective value on such a set. However, this condition, combined with the interpolat-
ing conditions, establishes a system of linear equations having typically no solution
(instead, the interpolating conditions alone always determine a unique Gaussian RBF
network interpolant). Consequently, the maximum value of the Gaussian RBF net-
work interpolant on the whole domain of design variables is typically larger than the
maximum value of the true objective function on the set of input training points.

4 Spectral Design of the Tetrachiral Material

The surrogate optimization method can be applied to the design problem of maximiz-
ing the band gap amplitude in the low-frequency range of the dispersion spectrum
associated with the tetrachiral periodic metamaterial. Following the operational pro-
cedure outlined in Sect. 3, the Gaussian RBF network interpolant of the true objective
function is constructed using a training set made of Nc = 500 points, generated starting
from a Sobol’ sequence [29, 36]. The points violating the constraints of the optimiza-
tion problem are discarded. The common width parameter of the Gaussians is tuned
automatically using leave-one-out cross-validation. Then, the numerical optimization
is performed by executing Ns = 10 times the SLP algorithm with the adaptive trust
region, initializing that algorithm each time with a different center. Each repetition
consists of Ni = 100 iterations of the SLP algorithm. The adaptive trust region is
chosen as a hypercube with edges of length L, centered on the parameter vector gen-
erated at each iteration. The initial length is Linit = 0.1, whereas its lower and upper
bounds are, respectively, Lmin = 0.005 and Lmax = 1. At the end of each iteration,
the value of the current length L is doubled (respectively, halved), up to
its upper (respectively, lower) bound, depending on the linearization accuracy, which
is assessed against the tolerance ς = 0.1.
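As a small illustration of the rule just described (with the numerical values of this section), the update of the trust region edge length L can be sketched as follows; the boolean flag abstracts the accuracy test of the relative linearization errors against the tolerance ς:

```python
def update_trust_region_length(L, linearization_accurate, L_min=0.005, L_max=1.0):
    """Double L after an accurate linearization (all relative errors below the
    tolerance), halve it otherwise, always keeping L within [L_min, L_max]."""
    return min(2.0 * L, L_max) if linearization_accurate else max(0.5 * L, L_min)

# starting from L_init = 0.1, e.g.: update_trust_region_length(0.1, True) -> 0.2
```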
Considering the Lagrangian physical–mathematical model of the tetrachiral meta-
material [12, 22], a set of four nondimensional mechanical parameters is selected as
design variables and collected in the vector μ = (wan/w, Er/Es, νr, ρr/ρan) ∈ R4.
With reference to the characteristic microstructure of the tetrachiral periodic cell
(Fig. 5a), the geometric quantities w and wan stand for the transversal widths of the
beams and rings, respectively. The elastic properties Es and Er are the Young moduli of the
beams and rings (subscript s) and of the resonator (subscript r), respectively, while νr is
the Poisson ratio of the resonator. Finally, the inertial properties ρr and ρan represent the different mass densities
of the resonator and ring, respectively. The four design variables have been selected
as the most convenient optimizable quantities among the complete set of indepen-
dent mechanical parameters, according to the outcomes of previous, effective but less
refined, optimization strategies based on an eight-parameter set [22]. For the sake of
completeness, the remaining four (fixed) parameters are w/ε = 3/50, R/ε = 1/5,
r/ε = 3/50 and β = arcsin(2/5), where ε is the periodic cell length, R and r are the
radii of the ring and the resonator, respectively, and finally β is the chirality angle. It
is worth remarking that the dimensional reduction in the design variable space simpli-
fies the problem, without compromising the generality of the surrogate optimization
method. Moreover, it simplifies the comparison of the proposed method with other
algorithms applied directly to the optimization of the true objective function.
According to the selection of the design variables, the spectral optimization prob-
lem for the tetrachiral metamaterial can be focused on maximizing the amplitude of
the band gap separating two low-frequency dispersion curves. Mathematically, the
problem can be formulated as

$$
\begin{aligned}
\underset{\boldsymbol{\mu}\,\in\,\mathbb{R}^{4}}{\text{maximize}} \quad & \omega_{\partial B_1}(\boldsymbol{\mu}) \\
\text{s.t.} \quad & \mu_{l,\min} \le \mu_l \le \mu_{l,\max}, \quad l = 1, \dots, 4,
\end{aligned}
\tag{2}
$$

i.e., as a nonlinear optimization problem with box constraints, where the μ-dependent
objective function ω∂B1(μ) is the amplitude of the partial band gap between the
second and third dispersion curves, properly normalized by dividing the amplitude by
the center frequency of the band gap (see [12, 15–17, 22] for details). The amplitude
is evaluated as the difference between the maximum of the second frequency and the
minimum of the third frequency over the horizontal segment ∂ B1 of the boundary ∂ B
closing a suitable subdomain B of the Brillouin zone. The amplitude is positive if a band
gap exists, and is set to zero if the difference is negative (no band gap). Differently from [22], the problem
constraints are limited to lower and upper bounds μl,min and μl,max imposed on the
admissible range of each component μl of the μ-vector (see Table 1).

Table 1 Lower and upper bounds on the geometrical and mechanical parameters

            wan/w    Er/Es    νr      ρr/ρan
μl,min      1/3      1/10     1/5     1/2
μl,max      10/3     1        2/5     2

Table 2 Best solutions of the optimization problems and corresponding objective values

            wan/w    Er/Es    νr      ρr/ρan    fsurr(μ)    f(μ)
μ∗surr      0.333    0.100    0.400   2.000     0.482       0.450
μ∗train     0.410    0.166    0.212   1.735     0.295       0.295
μ∗true      0.333    0.100    0.280   2.000     –           0.447

Since two or more local maxima of the band gap amplitude can co-exist in the admis-
sible range of the parameters, the largest amplitude among those found by the search
algorithms (best solution) is assumed to actually solve the optimization problem. In
the optimization problem (2), the band gap amplitude represents the true objective
function f (μ), which in the framework of the surrogate optimization approach is
instead replaced by its interpolant f surr (μ). The best solution μ∗surr solving the surro-
gate optimization problem is compared with the best choice μ∗train of the parameter
vector within the training set, i.e., that maximizing the surrogate objective function
on that set (or, equivalently, that maximizing the true objective function therein, since
the two functions coincide on the training set). For a further comparison, the best
solution μ∗true achievable by means of the GCMMA optimization algorithm, applied
to the true objective function using the same number of repetitions and starting from
the same initializations as SLP applied to surrogate optimization, is also considered.
On the one hand, the choice of GCMMA as a benchmark is motivated by its com-
mon use for the numerical solution of band gap optimization problems (see [15] and
the references therein) and by its theoretical guarantees of convergence [33]. On the
other hand, the choice of SLP for surrogate optimization has been made due to its
algorithmic simplicity. Of course, in principle other iterative optimization algorithms
(such as sequential quadratic programming, or SQP [19]) could be applied, as long as
they are able to solve the original optimization problem in a small number of (possi-
bly computationally expensive) iterations. In this comparison, genetic algorithms [37]
have not been considered, since their application to generalizations of problem (2)
including additional nonlinear constraints is expected to be difficult (although they do
not require any gradient evaluation), due to the necessity of defining a crossover
operator capable of preserving the feasibility of its operands. Table 2 summarizes the best
solutions, and the corresponding values of the surrogate/true objective function.
The comparison of the solutions shows that the largest surrogate objective value
f surr (μ∗surr ) found by the surrogate optimization method is larger than the largest sur-
rogate/true objective value f surr (μ∗train ) achieved on the training set, demonstrating the
usefulness of surrogate optimization for the specific application. This finding is aligned
with the remarks in Sect. 3. Moreover, even though the surrogate value fsurr(μ∗surr) actu-
ally overestimates the true value f(μ∗surr), close objective values f(μ∗surr) ≈ f(μ∗true)
are obtained. These results confirm that the surrogate optimization method can achieve
nearly the same maximum objective value obtainable by the GCMMA-based maxi-
mization of the true objective function. In the particular case shown in Table 2, it
provides even a slightly larger value of the true objective.
In order to deepen the analysis of the numerical results, Fig. 1a illustrates, for
each of the Ns repetitions of the surrogate optimization (one line for each repetition),
the values assumed by the surrogate objective function (blue lines) and by the true
objective function (red lines). Both function values vary with respect to the iteration
number, when a total number of Ni = 100 iterations of the SLP algorithm with the
adaptive trust region are performed to solve numerically the surrogate optimization
problem. It is worth remarking that these Ni true objective values are not exploited
by the SLP algorithm for further optimization, but are computed a posteriori, for a
final accuracy check. The numerical results reported in the figure fit the theoretical
expectations, since the two objective functions evolve in a qualitatively similar way
over the repetitions. This qualitative fitting persists even in scenarios of worse approx-
imation by the Gaussian RBF network interpolant, possibly due to an insufficient
number of centers/interpolation points. For comparison, Fig. 1b shows the
results obtained by the GCMMA algorithm applied directly to the true objective func-
tion. It is worth noting that, at each iteration, in this case GCMMA needs to compute
both the true objective function and its gradient (i.e., four partial derivatives, which
are approximated by finite differences). Instead, a negligible computational effort is
required for the evaluation of the gradient of the surrogate objective function, since
this is computed analytically (see Eq. (4) in “Appendix A”), once its coefficients have
been determined during the training phase. Hence, while the surrogate optimization
method requires Nc  500 evaluations of the true objective function (required to con-
struct the interpolant), the GCMMA algorithm applied to the true objective function
requires a one-order of magnitude larger number Ns · Ni · (1 + 4)  10 · 100 · 5  5000
of evaluations. Further advantages of the surrogate optimization method are expected
for higher-dimensional problems, for which the extra computation cost needed to con-
struct the surrogate objective function can be compensated by a large cost of an exact
evaluation of the true objective function [38]. It is also worth remarking that larger
values of Nc are able to produce better guarantees on the approximation error of the
interpolant (see Eq. (9) in “Appendix B”), and on the error of approximate global opti-
mization associated with the Quasi-Monte Carlo sequence (see Eq. (9) in “Appendix
B”). Indeed, in both cases, a smaller fill distance is obtained. Larger values of Ns are
associated with larger (or equal) values of f surr (μ∗surr ). When the surrogate objective
function is a good approximation of the true one, also an increment of f (μ∗surr ) is
expected, although not theoretically guaranteed.
For a deeper discussion of the achieved best solutions, Fig. 2a shows a comparison
between the surrogate and true objective functions in the bi-dimensional subspace of
the (ρr/ρan, wan/w)-parameters, when the other parameters are fixed at the values
of the solution μ∗surr. More specifically, Fig. 2b reports, for the particular repetition
of the SLP algorithm for which the best solution μ∗surr is obtained, the optimization
paths followed by the algorithm iterations in the (ρr/ρan, wan/w)-plane (blue dots).
The path convergence to the maximum of the surrogate objective function (dot on the
blue wireframe surface) can be appreciated. The convergent behavior is reported also
for the repetition of the GCMMA algorithm for which μ∗true is obtained. In this case,


Fig. 1 Objective function values versus iteration number: a comparison of the surrogate and true function
values for the SLP algorithm applied to the surrogate objective function, b true function values for the
GCMMA algorithm applied to the true objective function

the other parameters are fixed at the values of the solution μ∗true. The optimization
path followed by the GCMMA algorithm iterations in the (Er/Es, ρr/ρan)-plane (red
dots) can be observed to differ from that followed by the SLP algorithm. Again,
the path convergence to the maximum of the true objective function (dot on the red
wireframe surface) can be appreciated. Similar comparisons between the surrogate
and the true objective functions, together with the different paths followed by the
SLP and GCMMA algorithms, are reported in Figs. 3 and 4 in the bi-dimensional
subspaces of the (ρr/ρan, Er/Es)-parameters and the (Er/Es, wan/w)-parameters,
respectively. The analyses related to the dependence on the remaining parameter νr
are not reported, since the two objectives can be verified to be essentially independent
of the νr -variations within the two-dimensional neighborhoods of the νr -value of the
best solutions μ∗surr and μ∗true .
Finally, the dispersion spectrum of the tetrachiral metamaterial, designed according
to the surrogate optimization strategy (best solution μ∗surr reported in Table 2), is
portrayed in Fig. 5. First, the boundary ∂ B closing the subdomain B of the Brillouin
zone corresponding to the periodic cell of the metamaterial is presented (Fig. 5a, b).
Second, the six dispersion curves composing the spectrum are illustrated, with focus
on the optimally amplified partial band gap (green region) separating the dispersion
curves of the frequencies ω2 and ω3 over the horizontal segment of the boundary
∂ B, spanned by the local abscissa Ξ (Fig. 5c). Lastly, the six dispersion surfaces of
the designed metamaterial are reported in the entire bi-dimensional Brillouin zone
spanned by the wavevector k = (k1, k2).

5 Future Developments

The machine-learning approach to spectral optimization problems examined in this
work is open to further promising improvements. Specifically, some feasible ideas for
ongoing and future investigations can be found in the following directions:

Fig. 2 Objective functions in the bi-dimensional subspace of the admissible parameters ρr/ρan and wan/w:
a surrogate function (blue surface) and true function (black wireframe), b optimization paths in the parameter
subspace and objective values of the best solutions for the surrogate function (blue wireframe) and true
function (red wireframe)

Fig. 3 Objective functions in the bi-dimensional subspace of the admissible parameters ρr/ρan and Er/Es:
a surrogate function (blue surface) and true function (black wireframe), b optimization paths in the parameter
subspace and objective values of the best solutions for the surrogate function (blue wireframe) and true
function (red wireframe)

(a) A more sophisticated iterative algorithm for surrogate optimization could be
applied, if an updating of the surrogate objective function is provided during the
execution of the algorithm, by evaluating the true objective function on points
available sequentially (online learning [39]), possibly selected by the algorithm
itself (active learning [40]). This would include a first exploration performed on a
coarse grid, followed by a local exploration performed on a finer grid, adaptively
selected by the algorithm. Moreover, the results of a problem-specific analysis of
the properties of the optimal solution (e.g., its belonging to the boundary of the
feasible region) may be combined with the surrogate optimization algorithm, to
effectively reduce the dimension of the parameter vector;

Fig. 4 Objective functions in the bi-dimensional subspace of the admissible parameters Er/Es and wan/w:
a surrogate function (blue surface) and true function (black wireframe), b optimization paths in the parameter
subspace and objective values of the best solutions for the surrogate function (blue wireframe) and true
function (red wireframe)

Fig. 5 Tetrachiral metamaterial, designed according to the surrogate optimization strategy: a, b periodic cell
and closed boundary ∂B of the subdomain B for the corresponding Brillouin zone, c dispersion curves with
the optimally amplified band gap, d dispersion surfaces

(b) Gaussian RBF networks with variable widths and centers could be used, since
it is known that this improvement helps to mitigate the curse of dimensionality
for other (function approximation and optimization) problems [40–43]. Simi-
larly, feedforward neural networks with sigmoidal computational units could be
employed, since they offer similar guarantees [40, 41, 44]. Together with other
classes of approximators, both these kinds of networks are applied extensively by
the so-called extended Ritz method [40, 45–47], allowing one to obtain suboptimal solu-
tions to functional optimization problems with guarantees on their performance.
Like Gaussian RBF networks, feedforward neural networks with sig-
moidal computational units can also be applied to perform surrogate optimization
[48];
(c) For a fixed total computational time, the acceptable precision in the evaluation of
the true objective function on the training set could be decreased. This could
be done, for instance, by using a coarser grid for the closed boundary of the
subdomain of the Brillouin zone, or by decreasing the precision in the eigenvalue
computations. By doing this, the generation of each (less precise) training example
would require less time, hence a larger number of training examples could be
exploited to construct the surrogate objective function;
(d) Still in order to use a coarser grid and allow, at the same time, a precise evalua-
tion of the band gap amplitude, machine-learning methods could be applied for
dispersion curve identification in the presence of curve intersections [49].

Finally, the development and the application of efficient surrogate optimization
algorithms to band gap optimization problems can be argued to be an important pre-
liminary step for the optimal design of more complex multi-field electro-mechanical models
representing smart (e.g., automatically tunable) filters, characterized by a much larger
number of (tunable) parameters. For such problems, an online re-optimization of the
parameters as a consequence of a change in the operating conditions could be extremely
time consuming if it were based on exact evaluations of the true band gap objective
functions (especially if a large number of such evaluations were needed). Then, sur-
rogate optimization may represent a valid alternative to exact computations, possibly
providing good suboptimal solutions in reasonable amounts of time. It could also be
combined with reinforcement learning techniques (see [40] for a description of some
possibilities), for problems where a model of the tunable filter is not available (or it is
difficult to analyze), but it is still possible to measure its output.

6 Conclusions

A viable approach to apply machine-learning techniques to spectral optimization prob-
lems for periodic metamaterials has been investigated in this work, by leveraging a
surrogate optimization strategy based on a suitable combination of Radial Basis Func-
tion network interpolation methods and Quasi-Monte Carlo sequences. Motivated by
the pressing operational need to significantly reduce the high computational effort
involved in tackling band gap optimization problems, the surrogate optimization strat-
egy has been verified to efficiently treat the large size of the physical matrices defining
the spectral eigenproblems and the extra-fine discretization of the dispersion function
domain, which traditionally require unaffordable computational resources for the eval-
uation of the objective functions. Specifically, numerical results have been reported
for a band gap optimization problem related to the maximization of the partial band
gap in the acoustic low-frequency spectrum of the periodic tetrachiral material. The
superior performance of the surrogate optimization strategy (in terms of reduced num-
ber of numerical evaluations to achieve a certain optimal solution) has been clearly
highlighted by the direct comparison with a more traditional approach based on
the exact evaluation of the true objective function, still applicable by virtue of the
relatively small number of variables to be optimized.

Acknowledgments The authors acknowledge financial support of the (MURST) Italian Department for
University and Scientific and Technological Research in the framework of the research MIUR Prin15
project 2015LYYXA8, “Multi-scale mechanical models for the design and optimization of micro-structured
smart materials and metamaterials.” The authors also acknowledge financial support by National Group of
Mathematical Physics (GNFM-INdAM).

Compliance with Ethical Standards

Conflict of interest The authors declare that they have no conflict of interest.

Appendices

Some specific mathematical concepts and the related technical notations employed in
the paper are reported in further details. In particular, “Appendix A” describes mesh-
free interpolation methods, “Appendix B” provides details on surrogate optimization,
and “Appendix C” is concerned with iterative optimization algorithms.

A: Mesh-Free Interpolation Methods

A Radial Basis Function (RBF) network interpolation method [28] based on Nc com-
putational units is characterized by an interpolant f̃ : Ω ⊂ R^p → R of the form
$$
\tilde{f}(x) = \sum_{i=1}^{N_c} c_i\, \varphi_i(x - x_i), \tag{3}
$$

where, for i = 1, . . . , Nc, the ci ∈ R are the coefficients of the linear combination, the
ϕi : R p → R are radial basis functions (i.e., functions whose values depend only on the
distances of their arguments from the origin), and the xi ∈ R p are called centers. RBF
network interpolation methods are examples of mesh-free methods that, differently
from mesh-based ones, do not require the presence of connections between their
nodes, but are rather based on the interaction between each node and all its neighbors.
As a particular case, the Gaussian RBF network interpolation method exploits an
interpolant of the form


$$
\tilde{f}(x) = \sum_{i=1}^{N_c} c_i \exp\left( -\frac{\lVert x - x_i \rVert_2^2}{2\sigma_i^2} \right), \tag{4}
$$


where ‖x − xi‖2 denotes the Euclidean norm of the vector x − xi, and the coefficients
σi > 0 are called widths. In the paper, a Gaussian RBF network interpolation method
with fixed centers and identical widths σi = σ is used, where the coefficients ci are
chosen in such a way to make, for a function f : Ω ⊂ R p → R to be interpolated,
the error associated with its interpolation by f̃ equal to zero on a finite set of input
training points in R p . In the following, such a set is taken to be coincident with the
set of Nc centers. Consequently, the coefficients ci are obtained by solving the linear
system

$$
A\,\mathbf{c} = \mathbf{f}, \tag{5}
$$

where A ∈ R^{Nc×Nc} is called the interpolation matrix, whose generic element, in the
case of Gaussian RBF networks, has the form Aij = exp(−‖xi − xj‖2² / (2σ²)),
whereas f ∈ R^{Nc} is a column vector collecting the values assumed by the function f
on the input training points, and c ∈ R^{Nc} is the column vector collecting the unknown
coefficients of the linear combination.
Typically, radial basis functions used in RBF network interpolation methods are
chosen to be strictly positive-definite. It is recalled here that a function g : R p → R is
called strictly positive-definite if, for every positive integer n and every collection of n
distinct elements x1 , . . . , xn ∈ R p , the symmetric matrix G ∈ Rn×n having elements
of the form Gij = g(xi − xj) is positive definite. Since the interpolation matrix A
belongs to this matrix class, strict positive-definiteness of a fixed (i.e., independent
from the index i) radial basis function guarantees the non-singularity of the associated
interpolation matrix; hence, existence and uniqueness of the associated interpolant,
for every choice of the training set, provided the input training points are distinct (see
also [28], Chapter 3). In this way, indeed, the solution of Eq. (5) is

$$
\mathbf{c} = A^{-1} \mathbf{f}. \tag{6}
$$
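A minimal numerical illustration of Eqs. (5)–(6) in the Gaussian case is sketched below; the training points and the function being interpolated are arbitrary placeholders, chosen only to show that the interpolation error vanishes on the training set:

```python
import numpy as np

def gaussian_interpolation_matrix(X, sigma):
    # A_ij = exp(-||x_i - x_j||_2^2 / (2 sigma^2)) for the training points in X
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

X = np.random.default_rng(0).random((5, 2))      # 5 arbitrary training points in R^2
f_vals = np.sin(X).sum(axis=1)                   # arbitrary function to interpolate

A = gaussian_interpolation_matrix(X, sigma=0.5)
c = np.linalg.solve(A, f_vals)                   # Eq. (6): c = A^{-1} f

def interpolant(x):
    return np.exp(-((x - X) ** 2).sum(-1) / (2 * 0.5 ** 2)) @ c

assert np.allclose([interpolant(x) for x in X], f_vals)   # zero error on the training set
```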

When the training points do not coincide with the centers, then Eq. (5) may not be
solvable and an approximate (typically non-interpolating) solution must be sought.
For instance, the solution c⁺ = A⁺f (where A⁺ is the Moore–Penrose pseudo-inverse
of A) or the Tikhonov-regularized solution cλ = (A^T A + λI)^{-1} A^T f, where λ > 0 is
a suitable regularization parameter, can be adopted. However, for the kind of band
gap optimization problems considered in this paper, interpolation is preferred, since
there is no noise in the training set.
Despite the uniqueness of the interpolation for every choice of the width parameter
σ , different choices provide different interpolants, with varying quality of the approx-
imation outside the training set. A possible way to select a suitable value for σ , aimed
to prevent overfitting, is leave-one-out cross-validation [28], which in the context of
RBF network interpolation is also called Rippa method. According to this method,
for each choice of σ , Nc different RBF network interpolants are constructed starting
from Nc different—but highly overlapping—training sets of size Nc − 1, where each
of them is obtained by removing a different training point from the full training set
of size Nc. Then, for the k-th such interpolant (k = 1, . . . , Nc), the approximation
error εk is evaluated on the unique point that has been excluded from its training set.
All these errors are collected in a column vector ε ∈ R Nc , whose maximum norm is
minimized with respect to σ in order to choose the width optimally. Interestingly, the
evaluation of each εk does not require the actual training of the k-th interpolant, since
it is possible to prove [30] that such an error has the expression
$$
\varepsilon_k = \frac{c_k}{\left(A^{-1}\right)_{kk}}, \tag{7}
$$

where ck is the k-th element of the column vector c of coefficients of the linear
combination defining the interpolant trained on the whole training set, and A is the
associated interpolation matrix (still dependent on the whole training set). Finally, the
output of the leave-one-out cross-validation is made of:
(a) The optimal choice of σ and
(b) The RBF network interpolant trained on the whole training set, for that value
of σ .
If Nc is large, the interpolant is expected to be very similar to those trained—for the
same choice of σ —on the Nc smaller training sets of size Nc − 1, due to their high
overlap.
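Eq. (7) makes the width selection inexpensive, since all the leave-one-out errors follow from a single inversion of the full interpolation matrix. Reusing the Gaussian matrix and training data of the previous sketch, and an arbitrary grid of candidate widths, a possible implementation is:

```python
def max_loo_error(X, f_vals, sigma):
    # Eq. (7): eps_k = c_k / (A^{-1})_{kk}, with c = A^{-1} f
    A_inv = np.linalg.inv(gaussian_interpolation_matrix(X, sigma))
    eps = (A_inv @ f_vals) / np.diag(A_inv)
    return np.max(np.abs(eps))

candidate_widths = np.logspace(-2, 1, 30)        # arbitrary search grid
sigma_opt = min(candidate_widths, key=lambda s: max_loo_error(X, f_vals, s))
```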

B: Surrogate Optimization

Surrogate optimization replaces the objective function of an optimization problem
with a suitable interpolant (or more generally, with a suitable approximating func-
tion), which is called surrogate objective function [20]. Once a suitable optimization
algorithm is chosen to solve the resulting surrogate optimization problem, the perfor-
mance of surrogate optimization can be measured in terms of:
(a) The quality of the suboptimal solution it provides, expressed in terms of the value
assumed by the surrogate objective function at the suboptimal solution, and by
its comparison with the value assumed by the true objective function;
(b) The total computational time needed by the surrogate optimization algorithm to
find the suboptimal solution;
(c) Its number of evaluations of the true objective function.
In the paper, a fixed surrogate objective function is considered, i.e., the inter-
polant of the true objective function is constructed in the initialization phase of the
surrogate optimization algorithm and remains unchanged. More advanced surrogate
optimization algorithms update their interpolants during their iterations, by exploiting
additional information coming from the evaluation of the true objective function on
points selected adaptively by the algorithm itself [21].
Constructing an interpolant requires the choice of a suitable input training set, on
which the function to be interpolated is evaluated. In this way, suitable coefficients of
the linear combination defining the interpolant are obtained (possibly together with
additional coefficients, like the width parameter in Gaussian RBF network interpola-
tion). Two possible choices for the training set are provided by:


(a) Quasi-Monte Carlo discretization, according to which the input training set is
originated as a subsequence of a Quasi-Monte Carlo (deterministic) sequence,
and
(b) Monte Carlo discretization, which exploits, instead, a realization of a Monte
Carlo (random) sequence.

Both kinds of sequences are used extensively in numerical integration [29]. In that
context, Quasi-Monte Carlo sequences are often preferred to realizations of Monte
Carlo ones, because the former are typically able to cover the domain Ω in a more
uniform way than the latter (see, e.g., [17] for an illustrative comparison). In other
words, points generated from Monte Carlo sequences tend to form clusters. This
issue is due to the fact that a Monte Carlo algorithm, once a specific point of the
sequence has been generated, keeps no memory of it when generating the
successive point of the sequence (such points being realizations of independent vector-
valued random variables). An additional feature of Quasi-Monte Carlo sequences
is that, since they are deterministic, they generate perfectly reproducible point sets.
However, this can be achieved also by Monte Carlo discretization, if pseudo-random
(deterministic) sequences, whose statistical properties are similar to the ones of Monte
Carlo sequences, are exploited.
The use of Quasi-Monte Carlo sequences (and of the point sets associated with their
subsequences) also in function optimization and function interpolation can be justified
using the concept of dispersion. Let X = {x1, . . . , xNc} ⊂ Ω be a finite subset of the
domain. Then, its dispersion (or fill distance) is defined as
$$
h_{X,\Omega} = \sup_{x \in \Omega} \, \min_{x_j \in X} \, \lVert x - x_j \rVert_2. \tag{8}
$$
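Since the supremum in Eq. (8) cannot be computed exactly in general, it can be approximated by taking the maximum over a dense set of reference points covering Ω; a rough sketch for the unit hypercube (the number of reference points is an arbitrary choice) is:

```python
import numpy as np
from scipy.spatial import cKDTree

def fill_distance(X, n_reference=200_000, seed=0):
    """Approximate h_{X,Omega} for Omega = [0,1]^p: the sup in Eq. (8) is replaced
    by a maximum over uniformly drawn reference points."""
    reference = np.random.default_rng(seed).random((n_reference, X.shape[1]))
    nearest_distance, _ = cKDTree(X).query(reference)   # distance to closest point of X
    return float(nearest_distance.max())
```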

Then, when f : Ω ⊂ R^p → R is Lipschitz continuous with Lipschitz constant
L_{f,Ω} > 0, and its domain Ω is replaced by X, the following upper bound holds on
the error of approximate global optimization of f:
$$
\left| \max_{x \in X} f(x) - \max_{x \in \Omega} f(x) \right| \le L_{f,\Omega}\, h_{X,\Omega}. \tag{9}
$$

The bound above holds for the specific case of the optimization problem (2), since its
objective function ω∂B1(μ) is Lipschitz continuous (although its Lipschitz constant
may be difficult to compute). The proof of this result (which is only sketched here) follows
from an application of the perturbation bound for generalized eigenvalue problems
provided by Theorem 2.2 in [50], combined with the specific expression of ω∂B1(μ)
(which can be inferred from Eqs. (4), (6), (7) and the Appendix in [22]), and involves
a generalized eigenvalue problem parametrized by both μ and the wavevector k.
Moreover, according to Theorem 14.5 in [28], a large class of RBF network inter-
polants f̃_X : Ω ⊂ R^p → R with centers and input training points coincident with
the elements of X is characterized by the following upper bound on the error (in the
sup norm) in the approximation of a function f : Ω ⊂ R^p → R satisfying a suitable
smoothness condition:
$$
\sup_{x \in \Omega} \left| f(x) - \tilde{f}_X(x) \right| \le C\, h_{X,\Omega}^{k}\, \lVert f \rVert_{H}, \tag{10}
$$

where C is a positive constant (which does not depend on f and on the choice of the
fixed radial basis function), k is a lower bound on the degree of smoothness of the
computational units, and ‖f‖_H is the norm of f in a suitable function space H, which
is associated with the specific choice of the fixed radial basis function. Of course, to
reduce the value of the upper bound above on the approximation error, one possibility
is to choose a low-dispersion sequence.
It has to be observed that the bound (10) requires a smoothness assumption on the
function f , namely its belonging to the space H . When this does not hold, the domain
Ω can be replaced with a subdomain on which f is smooth, or f can be replaced with
a smooth approximation (possibly coincident with f on X ). The bound (10) is then
applied to the f -approximation. In this case, the presence of the error associated with
the additional approximation step must be taken into account.
Interestingly, when the domain Ω is the p-dimensional hypercube [0, 1]^p, the dis-
persion h_{X,[0,1]^p} can be bounded from above in terms of another property of the set X
(which can also be related to the associated sequence x1, . . . , xNc), called discrepancy
and defined as
$$
D_X = \sup_{G \in \mathcal{G}} \left| \frac{S(G)}{N_c} - \prod_{i=1}^{p} (b_i - a_i) \right|, \tag{11}
$$
where $\mathcal{G}$ is the family of all subsets of [0, 1]^p having the form $G = \prod_{i=1}^{p} [a_i, b_i)$,
and S(G) is the cardinality of the intersection X ∩ G. Then, the upper bound on the
dispersion (see, for instance, Theorem 6.6 in [29]) is
$$
h_{X,[0,1]^p} \le p^{\frac{1}{2}} \left( D_X \right)^{\frac{1}{p}}, \tag{12}
$$

hence it goes to zero when the discrepancy vanishes. Since Quasi-Monte Carlo
sequences are characterized by low discrepancy, their dispersions are also low. This
justifies the use of Quasi-Monte Carlo sequences (e.g., the Sobol’ sequence) also in
function interpolation.
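For illustration, a Sobol' point set and a plain Monte Carlo sample can be generated and compared, e.g., with SciPy; note that qmc.discrepancy returns a centered-discrepancy variant, used here only as a rough proxy for the quantity defined in Eq. (11):

```python
import numpy as np
from scipy.stats import qmc

sobol_points = qmc.Sobol(d=4, scramble=False).random_base2(m=9)     # 2^9 = 512 points
mc_points = np.random.default_rng(0).random((512, 4))               # Monte Carlo sample

print(qmc.discrepancy(sobol_points), qmc.discrepancy(mc_points))    # Sobol' is typically lower
```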
To conclude, it is worth remarking that upper bounds on the approximation error
of the function f through its interpolant f̃_X, similar to the one expressed in the sup norm
and defined in Eq. (10), also allow one to bound from above the error in the approximation
of the maximum value of f in terms of the maximum value of f̃_X. Indeed, assuming
that the respective maxima exist, Eq. (10) implies
$$
\left| \max_{x \in \Omega} f(x) - \max_{x \in \Omega} \tilde{f}_X(x) \right| \le C\, h_{X,\Omega}^{k}\, \lVert f \rVert_{H}. \tag{13}
$$
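One way to see this implication: by Eq. (10), for every x ∈ Ω,
$$
f(x) \;\le\; \tilde{f}_X(x) + C\, h_{X,\Omega}^{k}\, \lVert f \rVert_{H}
\;\le\; \max_{z \in \Omega} \tilde{f}_X(z) + C\, h_{X,\Omega}^{k}\, \lVert f \rVert_{H},
\qquad
\tilde{f}_X(x) \;\le\; \max_{z \in \Omega} f(z) + C\, h_{X,\Omega}^{k}\, \lVert f \rVert_{H},
$$
and taking the maximum over x ∈ Ω in both chains of inequalities yields Eq. (13).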

Similar guarantees cannot be obtained if one starts, instead, from upper bounds on the
approximation error expressed in Lebesgue-space norms like the L^2 norm. Analogous
remarks hold for other optimization problems to which Gaussian RBF networks have
been applied [41, 43].

C: Iterative Optimization Algorithms

Both the original and surrogate band gap optimization problems are characterized by
the presence of continuous variables to be optimized, a nonlinear objective function,
and linear and/or nonlinear constraints. To solve them approximately, suitable itera-
tive optimization algorithms can be applied. In general, for the same number of iterations, each such algorithm is expected to produce better suboptimal solutions if it is applied to the original optimization problem rather than to
the surrogate one. This is motivated by the absence of the additional approximation
error associated with the use of the surrogate objective function. Nevertheless, the
iterations of the algorithm are faster when it is applied to a surrogate optimization
problem with a fixed surrogate objective function, since such function—once it has
been constructed—is cheaper to evaluate. This advantage holds also for its partial
derivatives, if they are used during the iterations of the algorithm. For instance, par-
tial derivatives of a Gaussian RBF network interpolant can be computed in closed
form, once the values of the parameters of the interpolant have been chosen. So, even
better suboptimal solutions may be produced starting from such surrogate objective
function, the total computational time being the same. This possibility is particularly
important when there are upper bounds on the available total computational time. It is worth mentioning that these arguments have been proved in [51] for the related problem of regression, whose investigation can be considered as a preliminary step for the analysis of surrogate optimization (indeed, it involves only the construction of the approximating function, but not its subsequent optimization). Possible choices for the iterative optimization algorithm examined in this paper are:
(a) Method of Moving Asymptotes (MMA) [33] and its Globally Convergent upgrade
(GCMMA) [34]. The GCMMA algorithm—which was developed originally with
the specific aim of solving optimization problems arising in structural engi-
neering—replaces the initial (either the original, or the surrogate) optimization
problem with a sequence of approximating optimization subproblems, which are
easier to solve, though being still nonlinearly constrained. In each subproblem,
the objective and constraining functions of the initial optimization problem are
replaced by suitable approximating functions. From a certain viewpoint, this can
be also considered as a form of surrogate optimization. However, such approx-
imations are characterized by a much simpler functional form than Gaussian
RBF network interpolation. In any case, the resulting subproblems are easier to
solve than the initial optimization problem, e.g., because GCMMA applies sep-
arable approximators, which are summations of functions depending on a single
real argument. Nevertheless, just because of their simplicity, the quality of such approximations may be worse than that of Gaussian RBF network interpolation, when a sufficiently large number of basis functions is used in the latter. Moreover, Gaussian RBF
network interpolation guarantees a zero approximation error on the training set.
Still, GCMMA is globally convergent in the sense that, for each possible initialization of the set of variables to be optimized, the algorithm was proved in [34]
to converge to a stationary point of the initial optimization problem (assuming
twice-continuous differentiability of the objective and constraining functions; in
practice, convergence is often observed also for less smooth functions). Finally, it can be recalled that the expression moving asymptotes reflects the fact that, in each iteration, GCMMA exploits approximating functions characterized by suitable asymptotes, which typically change (move) from one iteration to the next;
(b) Sequential Linear Programming (SLP) [19]. Also in this case, the initial optimiza-
tion problem is replaced by a sequence of simpler approximating subproblems.
However, differently from GCMMA, at each iteration of SLP, linearizations of
the objective and constraining functions at the current suboptimal solution are
exploited. Moreover, additional box constraints are inserted, which define a so-
called trust region inside which the approximation error due to the linearizations
above is guaranteed to be small (such a guarantee can be obtained, e.g., by compar-
ing the actual derivatives of the objective and constraining functions on selected
points belonging to the trust region with their constant approximations coming
from the linearizations above). Each resulting optimization subproblem is even
easier to solve than those occurring in the application of GCMMA. Indeed, the
simplex algorithm can be applied, or—if the number of variables to be optimized is not too large—the (linearized) objective function can be evaluated only on the vertices of the admissible region of each subproblem, which is a polytope. Nonetheless,
convergence may be slower in practice with respect to GCMMA, because of the
extreme simplicity of the linear approximations, and of the need to choose a suf-
ficiently small trust region, in order to guarantee a small approximation error in
the linearizations. However, a possible way to improve the rate of convergence is to exploit an adaptive trust region (i.e., a trust region having an adaptive size), which can be constructed by enlarging (shrinking) the trust region when the current linear approximations are good (poor). Still, lower and upper bounds on the size of the adaptive trust region are needed, in order to prevent that size from growing or shrinking indefinitely, which could generate numerical errors (a minimal sketch of one such SLP iteration with an adaptive trust region is given right after this list).
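The following minimal sketch (in Python, using scipy.optimize.linprog on a toy bound-constrained maximization problem; it illustrates the general SLP mechanism described above and is not the implementation used for the results in the paper) realizes the adaptive trust-region logic: at each iteration, the linearized objective is maximized over the intersection of the trust-region box with the global bounds, and the trust region is enlarged or shrunk according to how well the linear model predicts the actual improvement.

import numpy as np
from scipy.optimize import linprog

def slp_maximize(f, grad_f, x0, bounds, delta0=0.1,
                 delta_min=1e-4, delta_max=0.5, n_iter=50):
    # Minimal SLP sketch with an adaptive box trust region, for maximizing
    # a smooth objective f subject to simple bounds only (nonlinear
    # constraints could be linearized and appended to each LP similarly).
    x, delta = np.array(x0, dtype=float), delta0
    lo, hi = np.array(bounds, dtype=float).T
    for _ in range(n_iter):
        g = grad_f(x)
        # Trust-region box intersected with the global bounds
        lp_bounds = list(zip(np.maximum(lo, x - delta),
                             np.minimum(hi, x + delta)))
        # linprog minimizes, so pass -g to maximize the linearized objective
        res = linprog(c=-g, bounds=lp_bounds, method="highs")
        x_new = res.x
        predicted = g @ (x_new - x)              # predicted improvement
        actual = f(x_new) - f(x)                 # actual improvement
        if predicted > 0 and actual >= 0.25 * predicted:
            x = x_new                            # accept the step
            if actual >= 0.75 * predicted:
                delta = min(2.0 * delta, delta_max)   # good model: enlarge
        else:
            delta = max(0.5 * delta, delta_min)       # poor model: shrink
    return x

# Hypothetical usage on a toy concave objective over [0, 1]^2
f = lambda x: -(x[0] - 0.3) ** 2 - (x[1] - 0.7) ** 2
grad_f = lambda x: np.array([-2 * (x[0] - 0.3), -2 * (x[1] - 0.7)])
print(slp_maximize(f, grad_f, x0=[0.9, 0.1], bounds=[(0, 1), (0, 1)]))

Nonlinear constraints, when present, would be linearized at the current iterate and appended to the LP as additional inequality constraints.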

Iterative optimization algorithms need suitable termination criteria [19]. The sim-
plest criterion consists in stopping the algorithm after a given number Ni of iterations.
However, if different algorithms characterized by different computational times per
iteration are used, then a fair approach to compare their performances consists in
stopping each algorithm after the same total computational time. Other termination
criteria involve, e.g., comparing the values assumed by the (true or surrogate) objective function at the suboptimal solutions generated in consecutive iterations of the optimization algorithm, or comparing the values assumed by its gradient with the all-zero vector (limiting the discussion, for simplicity, to the case of unconstrained optimization).
Finally, it is important to recall that iterative optimization algorithms for nonlinearly constrained optimization problems with a nonlinear objective function and continuous variables to be optimized typically find only local maxima (even GCMMA, though globally convergent, is guaranteed to converge only to a stationary point, which may not coincide with a local maximum point). In order to increase the probability of finding the global maximum or a good local maximum (i.e., one whose value of the objective function is near the global maximum), several repetitions of each iterative optimization algorithm can be performed, starting from different initializations, possibly generated using a Quasi-Monte Carlo subsequence (this is called a Quasi-Monte Carlo multi-start initialization approach). In this way, indeed, the probability increases that, in at least one repetition, the iterative optimization algorithm starts either from the basin of attraction of a global maximum, or from that associated with a sufficiently good local maximum [40].
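A minimal sketch of such a Quasi-Monte Carlo multi-start strategy (in Python, with a hypothetical toy objective standing in for the surrogate one, and a generic gradient-based local optimizer from SciPy standing in for GCMMA or SLP) could look as follows.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import qmc

# Toy multimodal objective to be maximized over [0, 1]^2 (a hypothetical
# stand-in for the surrogate objective); the local solver minimizes -f.
f = lambda x: np.sin(5 * x[0]) * np.cos(4 * x[1]) + 0.5 * x[0]

# Quasi-Monte Carlo (Sobol') multi-start initializations
starts = qmc.Sobol(d=2, scramble=True, seed=0).random_base2(m=4)   # 16 starts

best_x, best_val = None, -np.inf
for x0 in starts:
    res = minimize(lambda x: -f(x), x0, bounds=[(0, 1), (0, 1)],
                   method="L-BFGS-B")            # local optimizer (stand-in)
    if -res.fun > best_val:
        best_x, best_val = res.x, -res.fun

print("best suboptimal solution:", best_x, "objective value:", best_val)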

References
1. Fleck, N.A., Deshpande, V.S., Ashby, M.F.: Micro-architectured materials: past, present and future.
Proc. R. Soc. A Math. Phys. Eng. Sci. 466(2121), 2495–2516 (2010)
2. Schaedler, T.A., Carter, W.B.: Architected cellular materials. Annu. Rev. Mater. Res. 46, 187–210
(2016)
3. Meza, L.R., Zelhofer, A.J., Clarke, N., Mateos, A.J., Kochmann, D.M., Greer, J.R.: Resilient 3D
hierarchical architected metamaterials. Proc. Natl. Acad. Sci. 112(37), 11502–11507 (2015)
4. Liu, Z., Zhang, X., Mao, Y., Zhu, Y.Y., Yang, Z., Chan, C.T., Sheng, P.: Locally resonant sonic materials.
Science 289(5485), 1734–1736 (2000)
5. Lu, M.H., Feng, L., Chen, Y.F.: Phononic crystals and acoustic metamaterials. Mater. Today 12(12),
34–42 (2009)
6. Ma, G., Sheng, P.: Acoustic metamaterials: from local resonances to broad horizons. Sci. Adv. 2(2),
e1501595 (2016)
7. Phani, A.S., Hussein, M.I. (eds.): Dynamics of lattice materials. Wiley, New York (2017)
8. Zhang, S., Yin, L., Fang, N.: Focusing ultrasound with an acoustic metamaterial network. Phys. Rev.
Lett. 102(19), 194301 (2009)
9. Craster, R.V., Guenneau, S. (eds.): Acoustic metamaterials: negative refraction, imaging, lensing and
cloaking, vol. 166. Springer, Berlin (2012)
10. Bigoni, D., Guenneau, S., Movchan, A.B., Brun, M.: Elastic metamaterials with inertial locally resonant
structures: application to lensing and localization. Phys. Rev. B 87, 174303 (2013)
11. Molerón, M., Daraio, C.: Acoustic metamaterial for subwavelength edge detection. Nat. Commun. 6,
8037 (2015)
12. Bacigalupo, A., Gambarotta, L.: Simplified modelling of chiral lattice materials with local resonators.
Int. J. Solids Struct. 83, 126–141 (2016)
13. Diaz, A.R., Haddow, A.G., Ma, L.: Design of band-gap grid structures. Struct. Multidiscip. Optim.
29(6), 418–431 (2005)
14. Meng, H., Wen, J., Zhao, H., Wen, X.: Optimization of locally resonant acoustic metamaterials on
underwater sound absorption characteristics. J. Sound Vib. 331(20), 4406–4416 (2012)
15. Bacigalupo, A., Lepidi, M., Gnecco, G., Gambarotta, L.: Optimal design of auxetic hexachiral meta-
materials with local resonators. Smart Mater. Struct. 25(5), 054009 (2016)
16. Bacigalupo, A., Gnecco, G., Lepidi, M., Gambarotta, L.: Optimal design of low-frequency band gaps
in anti-tetrachiral lattice meta-materials. Compos. B Eng. 115(5), 341–359 (2017)
17. Bacigalupo, A., Lepidi, M., Gnecco, G., Vadalà, F., Gambarotta, L.: Optimal design of the band structure
for beam lattice metamaterials. Front. Mater. 6, 1–14 (2019)
18. Bruggi, M., Corigliano, A.: Optimal 2D auxetic micro-structures with band gap. Meccanica 54(13),
2001–2027 (2019)
19. Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear Programming: Theory and Algorithms. Wiley,
New York (2006)
20. Koziel, S., Leifsson, L. (eds.): Surrogate-Based Modeling and Optimization: Applications in Engi-
neering. Springer, Berlin (2013)


21. Wild, S.M., Shoemaker, C.: Global convergence of radial basis function trust-region algorithms for
derivative-free optimization. SIAM Rev. 55, 349–371 (2013)
22. Bacigalupo, A., Gnecco, G., Lepidi, M., Gambarotta, L.: Design of acoustic metamaterials through
nonlinear programming. In: 2nd International Workshop on Optimization, Machine Learning and Big
Data (MOD 2016). Lecture Notes in Computer Science, vol. 10122, pp. 170–181 (2016)
23. Bacigalupo, A., Gnecco, G.: Metamaterial filter design via surrogate optimization. In: International
Conference on Metamaterials and Nanophotonics (METANANO 2018). Journal of Physics: Confer-
ence Series, vol. 1092, pp. 1–4 (2018)
24. Krushynska, A.O., Kouznetsova, V.G., Geers, M.G.D.: Towards optimal design of locally resonant
acoustic metamaterials. J. Mech. Phys. Solids 71, 179–196 (2014)
25. Lepidi, M., Bacigalupo, A.: Multi-parametric sensitivity analysis of the band structure for tetrachiral
acoustic metamaterials. Int. J. Solids Struct. 136, 186–202 (2018)
26. D’Alessandro, L., Zega, V., Ardito, R., Corigliano, A.: 3D auxetic single material periodic structure
with ultra-wide tunable bandgap. Sci. Rep. 8(1), 2262 (2018)
27. Schmidt, J., Marques, M.R.G., Botti, S., Marques, M.A.L.: Recent advances and applications of
machine learning in solid-state materials science. Nat. Comput. Mater. 5, 1–36 (2019)
28. Fasshauer, G.E.: Meshfree Approximation Methods with MATLAB. World Scientific, Singapore
(2007)
29. Niederreiter, H.: Random Number Generation and Quasi-Monte Carlo Methods. SIAM, Philadelphia
(1992)
30. Rippa, S.: An algorithm for selecting a good value for the parameter c in radial basis function interpo-
lation. Adv. Comput. Math. 11, 193–210 (1999)
31. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and Other Kernel-Based
Methods. Cambridge University Press, Cambridge (2000)
32. Lin, C., Lee, Y.H., Schuh, J.K., Ewoldt, R.H., Allison, J.T.: Efficient optimal surface texture design
using linearization. Advances in Structural and Multidisciplinary Optimization, pp. 632–647. Springer,
New York (2017)
33. Svanberg, K.: The method of moving asymptotes—a new method for structural optimization. Int. J.
Numer. Meth. Eng. 24, 359–373 (1987)
34. Svanberg, K.: A class of globally convergent optimization methods based on conservative convex
separable approximations. SIAM J. Optim. 12, 555–573 (2002)
35. Wei, P., Ma, H., Wang, M.Y.: The stiffness spreading method for layout optimization of truss structures.
Struct. Multidiscip. Optim. 49, 667–682 (2014)
36. Sobol, I.M.: The distribution of points in a cube and the approximate evaluation of integrals. USSR
Comput. Math. Math. Phys. 7, 86–112 (1967)
37. Mitchell, M.: An Introduction to Genetic Algorithms. MIT Press, Cambridge (1996)
38. Müller, J., Shoemaker, C.A.: Influence of ensemble surrogate models and sampling strategy on the
solution quality of algorithms for computationally expensive black-box global optimization problems.
J. Global Optim. 60, 123–144 (2014)
39. Gnecco, G., Bemporad, A., Gori, M., Sanguineti, M.: LQG online learning. Neural Comput. 29,
2203–2291 (2017)
40. Zoppoli, R., Sanguineti, M., Gnecco, G., Parisini, T.: Neural Approximations for Optimal Control and
Decision. Springer (2020) (forthcoming)
41. Gaggero, M., Gnecco, G., Sanguineti, M.: Dynamic programming and value-function approximation
in sequential decision problems: error analysis and numerical results. J. Optim. Theory Appl. 156,
380–416 (2013)
42. Girosi, F.: Approximation error bounds that use VC-bounds. In: Proceedings of the International
Conference on Artificial Neural Networks, pp. 295–302 (1995)
43. Gnecco, G., Sanguineti, M.: Suboptimal solutions to dynamic optimization problems via approxima-
tions of the policy function. J. Optim. Theory Appl. 146, 764–794 (2010)
44. Barron, A.R.: Neural net approximation. In: Proceedings of the 7th Yale Workshop on Adaptive and
Learning Systems, pp. 69–72 (1992)
45. Giulini, S., Sanguineti, M.: Approximation schemes for functional optimization problems. J. Optim.
Theory Appl. 140, 33–54 (2009)
46. Kůrková, V., Sanguineti, M.: Error estimates for approximate optimization by the extended Ritz method.
SIAM J. Optim. 15, 461–487 (2005)


47. Zoppoli, R., Sanguineti, M., Parisini, T.: Approximating networks and extended Ritz method for the
solution of functional optimization problems. J. Optim. Theory Appl. 112, 403–439 (2002)
48. Schweidtmann, A.M., Mitsos, A.: Deterministic global optimization with artificial neural networks
embedded. J. Optim. Theory Appl. 180, 925–948 (2019)
49. Gnecco, G.: An algorithm for curve identification in the presence of curve intersections. Mathematical Problems in Engineering, pp. 1–7 (2018)
50. Nakatsukasa, Y.: Absolute and relative Weyl theorems for generalized eigenvalue problems. Linear
Algebra Appl. 432, 242–248 (2010)
51. Gnecco, G., Nutarelli, F.: On the trade-off between number of examples and precision of supervision
in machine learning problems. Optim. Lett. (2019). https://doi.org/10.1007/s11590-019-01486-x

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps
and institutional affiliations.

Affiliations

Andrea Bacigalupo1 · Giorgio Gnecco1 · Marco Lepidi2 · Luigi Gambarotta2

Andrea Bacigalupo
[email protected]
Marco Lepidi
[email protected]
Luigi Gambarotta
[email protected]
1 MUSAM and AXES Research Units, IMT School for Advanced Studies, Piazza S. Francesco,
19, 55100 Lucca, Italy
2 Department of Civil, Chemical, and Environmental Engineering, University of Genoa, Via
Montallegro, 1, 16145 Genoa, Italy

