An Electromagnetism-like Mechanism for Global Optimization
1. Introduction
In recent years global optimization has become a rapidly developing field. Many
real-life problems in areas such as physics, chemistry and molecular biology (Schoen,
1989; Moré and Wu, 1995) involve nonlinear functions of many variables with
attributes (multimodality, discontinuity, etc.) that are difficult to optimize by
conventional mathematical tools such as gradient-based methods.
To overcome the difficulties described above, stochastic search methods, which
rely heavily on computer power, have been developed starting in the 1980s. A
random search algorithm may give a usable solution even when other algorithms
fail because of irregularities or high dimensionality. We refer to Dixon and Szegö
(1978), Kan and Timmer (1987a, b), Törn and Žilinskas (1989) and Boender (1985)
for excellent surveys.
In this paper we study a special class of optimization problems with bounded
variables, in the form of

\[
\min f(x) \quad \text{s.t.} \quad x \in [l, u], \tag{1}
\]

where $[l, u] := \{x \in \mathbb{R}^n : l_k \le x_k \le u_k, \; k = 1, \ldots, n\}$.
2. Motivation
We deal with functions of the form (1), with the following parameters given:
− n: the dimension of the problem,
− u_k: the upper bound in the kth dimension,
− l_k: the lower bound in the kth dimension,
− f(x): a pointer to the function to be minimized.
ALGORITHM 1. EM(m, MAXITER, LSITER, δ)
1: Initialize()
2: iteration ← 1
3: while iteration < MAXITER do
4: Local(LSITER, δ)
5: F ← CalcF()
6: Move(F)
7: iteration ← iteration + 1
8: end while
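For concreteness, the general scheme can also be read as the following Python sketch. This is our illustrative transcription of Algorithm 1, not the authors' C++ code; the helper names initialize, local_search, calc_force, and move are our own, and their bodies are sketched after the corresponding listings (Algorithms 2–5) below.

import numpy as np

def em(f, l, u, m, max_iter, ls_iter, delta, seed=0):
    """Illustrative sketch of Algorithm 1 (the EM general scheme).

    f        : objective function mapping an n-vector to a scalar
    l, u     : lower/upper bound vectors of the feasible box [l, u]
    m        : population size; max_iter ~ MAXITER; ls_iter ~ LSITER
    delta    : local search parameter in (0, 1]
    """
    rng = np.random.default_rng(seed)
    # Algorithm 2: sample m points and record the current best.
    x, fx, best = initialize(f, l, u, m, rng)
    for _ in range(max_iter):
        # Algorithm 3: coordinate-wise random local search.
        x, fx, best = local_search(f, x, fx, l, u, ls_iter, delta, rng)
        # Algorithm 4: charges (Eq. (2)) and total forces (Eq. (3)).
        F = calc_force(x, fx, best)
        # Algorithm 5: move all points except the current best (Eq. (4)).
        x, fx, best = move(f, x, fx, F, l, u, best, rng)
    return x[best], fx[best]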
3.2. INITIALIZATION
The procedure Initialize is used to sample m points randomly from the feasible
domain, which is an n-dimensional hypercube. Each coordinate of a point is
assumed to be uniformly distributed between the corresponding lower and upper
bounds. After a point is sampled from the space, the objective function value
for the point is calculated using the function pointer f(x) (Algorithm 2, line 6). The
procedure ends with m points identified, and the point that has the best function
value is stored in x^best (line 8).
We use the words particle and point interchangeably throughout the paper.
ALGORITHM 2. Initialize()
1: for i = 1 to m do
2: for k = 1 to n do
3: λ ← U(0, 1)
4: x^i_k ← l_k + λ(u_k − l_k)
5: end for
6: Calculate f(x^i)
7: end for
8: x^best ← argmin{f(x^i), ∀i}
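A direct transcription of this procedure into Python might read as follows (a sketch under our own naming; the paper's implementation is in C++ and is not reproduced here). It returns the sample, the objective values, and the index of x^best rather than storing them globally.

import numpy as np

def initialize(f, l, u, m, rng):
    """Sketch of Algorithm 2: sample m points uniformly from the box [l, u]."""
    n = len(l)
    # Lines 1-5: each coordinate x^i_k = l_k + lambda (u_k - l_k), lambda ~ U(0, 1).
    x = l + rng.random((m, n)) * (u - l)
    # Line 6: evaluate the objective at every sampled point.
    fx = np.array([f(xi) for xi in x])
    # Line 8: remember the index of the point with the best function value.
    best = int(np.argmin(fx))
    return x, fx, best

For example, initialize(lambda v: float(np.sum(v**2)), np.array([-5.0, -5.0]), np.array([5.0, 5.0]), 20, np.random.default_rng(0)) draws 20 points from the box [−5, 5]².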
12: y_k ← y_k − λ_2(Length)
13: end if
14: if f(y) < f(x^i) then
15: x^i ← y
16: counter ← LSITER − 1
17: end if
18: counter ← counter + 1
19: end while
20: end for
21: end for
22: x^best ← argmin{f(x^i), ∀i}
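Only lines 12–22 of the Local listing (Algorithm 3) are reproduced above, so the following Python sketch fills in the missing upper half under stated assumptions: each point is searched coordinate by coordinate, a direction per coordinate is fixed by a random λ₁, and the maximum step is Length = δ · max_k(u_k − l_k). These assumptions are consistent with the visible fragment, but the exact missing lines may differ.

import numpy as np

def local_search(f, x, fx, l, u, ls_iter, delta, rng):
    """Sketch of the Local procedure (Algorithm 3); loop structure and the
    step length Length = delta * max_k(u_k - l_k) are our assumptions."""
    m, n = x.shape
    length = delta * np.max(u - l)
    for i in range(m):                       # assumed outer loop over points
        for k in range(n):                   # assumed loop over coordinates
            lam1 = rng.random()              # assumed: fixes the search direction
            counter = 1
            while counter < ls_iter:
                y = x[i].copy()
                lam2 = rng.random()
                if lam1 > 0.5:               # assumed symmetric counterpart of line 12
                    y[k] += lam2 * length
                else:
                    y[k] -= lam2 * length    # line 12 of the fragment
                # note: the fragment shows no clipping of y back into [l, u]
                fy = f(y)
                if fy < fx[i]:               # lines 14-17: accept and end the walk
                    x[i], fx[i] = y, fy
                    counter = ls_iter - 1
                counter += 1                 # line 18
    best = int(np.argmin(fx))                # line 22: refresh x^best
    return x, fx, best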
The charge of each point is determined by its objective function value relative to the rest of the population:

\[
q^i = \exp\left(-n \, \frac{f(x^i) - f(x^{best})}{\sum_{k=1}^{m}\left(f(x^k) - f(x^{best})\right)}\right), \quad \forall i. \tag{2}
\]

In this way, points that have better objective values possess higher charges. We
multiply the fraction by the dimension n because in higher dimensions the number
of points in the population tends to be large; as a result, the fraction may become
very small and cause numerical difficulties in calculating the exponential function.
We define the charge q^i according to the relative efficiency of the objective
function value of the corresponding point in the population. This is clearly neither
the unique nor the optimal choice for this calculation. An alternative calculation,
which ranks the points according to their objective function values, may be used
here. Our experiments have shown that the proposed calculation in Eq. (2) is
satisfactory for our study.
Notice that, unlike electrical charges, no signs are attached to the charge of an
individual point in Eq. (2). Instead, we decide the direction of a particular force
between two points after comparing their objective function values. Hence, the
total force F^i exerted on point i is computed by the following equation:
\[
F^i = \sum_{j \ne i}^{m}
\begin{cases}
(x^j - x^i) \, \dfrac{q^i q^j}{\|x^j - x^i\|^2} & \text{if } f(x^j) < f(x^i) \\[1ex]
(x^i - x^j) \, \dfrac{q^i q^j}{\|x^i - x^j\|^2} & \text{if } f(x^j) \ge f(x^i)
\end{cases}
\quad \forall i. \tag{3}
\]
As seen in Algorithm 4 (lines 7–8), between two points, the point that has the
better objective function value attracts the other one. Conversely, the point with the
worse objective function value repels the other (lines 9–10). Since x^best has the
minimum objective function value, it acts as an absolute point of attraction, i.e., it
attracts all other points in the population.
When we examine the algorithm carefully, we can see that the determination of
a direction via the total force vector resembles a statistical estimation of the gradient
of f. However, the estimate computed by our heuristic is different, since the direction
depends on the Euclidean distances between the points. That is, points that come
close enough to each other may be led in a direction other than the statistically
estimated one.
ALGORITHM 4. CalcF(): F
1: for i = 1 to m do
2: q^i ← exp( −n (f(x^i) − f(x^best)) / Σ_{k=1}^{m} (f(x^k) − f(x^best)) )  {Eq. (2)}
3: F^i ← 0
4: end for
5: for i = 1 to m do
6: for j = 1 to m, j ≠ i do
7: if f(x^j) < f(x^i) then
8: F^i ← F^i + (x^j − x^i) q^i q^j / ‖x^j − x^i‖²  {Attraction}
9: else
10: F^i ← F^i − (x^j − x^i) q^i q^j / ‖x^j − x^i‖²  {Repulsion}
11: end if
12: end for
13: end for
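The two loops of Algorithm 4 translate directly into Python; the sketch below combines the charge formula of Eq. (2) with the attraction-repulsion rule of Eq. (3). It assumes the points are pairwise distinct and that not all objective values are equal (otherwise the denominators vanish).

import numpy as np

def calc_force(x, fx, best):
    """Sketch of Algorithm 4: charges q^i (Eq. (2)) and total forces F^i (Eq. (3)).
    Assumes pairwise-distinct points and fx not all equal."""
    m, n = x.shape
    # Eq. (2): better points (smaller f) receive charges closer to 1.
    q = np.exp(-n * (fx - fx[best]) / np.sum(fx - fx[best]))
    F = np.zeros((m, n))
    for i in range(m):
        for j in range(m):
            if j == i:
                continue                      # Eq. (3) sums over j != i
            d = x[j] - x[i]
            w = q[i] * q[j] / np.dot(d, d)    # q^i q^j / ||x^j - x^i||^2
            if fx[j] < fx[i]:
                F[i] += w * d                 # lines 7-8: attraction
            else:
                F[i] -= w * d                 # lines 9-10: repulsion
    return F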
Each point, except the current best, is then moved along the direction of its normalized total force by a random step length λ:

\[
x^i = x^i + \lambda \, \frac{F^i}{\|F^i\|} \, (RNG), \quad i = 1, 2, \ldots, m, \tag{4}
\]

where RNG denotes the allowed feasible movement toward the upper bound u_k or the lower bound l_k in each dimension (Algorithm 5, lines 6–9).
Algorithm 5 gives the pseudo-code of the Move procedure. Note that the best
point, x^best, is not moved and is carried over to the subsequent iterations (line 2).
This suggests that we may skip the calculation of the total force on the current best
point in Algorithm 4 (although the computational effort for calculating the total
force on the current best point is negligible).
ALGORITHM 5. Move(F)
1: for i = 1 to m do
2: if i ≠ best then
3: λ ← U(0, 1)
4: F^i ← F^i / ‖F^i‖
5: for k = 1 to n do
6: if F^i_k > 0 then
7: x^i_k ← x^i_k + λ F^i_k (u_k − x^i_k)
8: else
9: x^i_k ← x^i_k + λ F^i_k (x^i_k − l_k)
10: end if
11: end for
12: end if
13: end for
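A sketch of the Move procedure in the same vein: each non-best point takes a random step, scaled by the remaining feasible range toward the boundary, in the direction of its normalized force (this is how we read RNG in Eq. (4)). Re-evaluating f and refreshing the best index at the end is our own bookkeeping addition, not part of the listing.

import numpy as np

def move(f, x, fx, F, l, u, best, rng):
    """Sketch of Algorithm 5: move every point except x^best along its
    normalized total force, staying inside the box [l, u]."""
    m, n = x.shape
    for i in range(m):
        if i == best:
            continue                             # line 2: the best point stays put
        lam = rng.random()                       # line 3
        Fi = F[i] / np.linalg.norm(F[i])         # line 4: normalize the force
        for k in range(n):
            if Fi[k] > 0:                        # lines 6-9: step scaled by the
                x[i, k] += lam * Fi[k] * (u[k] - x[i, k])   # room to the bound
            else:
                x[i, k] += lam * Fi[k] * (x[i, k] - l[k])
        fx[i] = f(x[i])                          # bookkeeping (not in the listing)
    best = int(np.argmin(fx))
    return x, fx, best

Since |F^i_k / ‖F^i‖| ≤ 1 and λ ∈ (0, 1), each coordinate step is bounded by the distance to the nearest bound, so the moved points remain feasible.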
3.7. AN EXAMPLE
We use the Spiky function to demonstrate a typical run of the algorithm. In the
following figures, ♦ represents the current best point and * shows the location of
the global optimum. In this example we have selected the population size to be 20.
Figure 2 shows the location of the points when the algorithm is started by
randomly sampling points from the feasible region. Initially the current best point
(x^best) is far away from the global optimum.
The points in the population move towards the region around the current best
point. Note that some points are repelled closer to the global optimum (Figure 3).
In Figure 4, one of the points finds an objective function value better than that
of the current best point, and becomes the new current best point. Thus, the
points start to converge towards this new x^best.
Figure 5 shows that the points in the population are directed towards the
region around the current best point. After this iteration, the global optimum is
located.
Figure 2. The starting position of the particles just after the procedure Initialize.
4. Computational Results
In a recent paper, Törn et al. (1999) classified the well-known global optimization
problems as unimodal, easy, moderately difficult, and difficult problems. They also
suggested that the test problems should represent different classes.
We have included some of the functions discussed by Törn et al. (1999)
and created a test function set consisting of 15 functions. The remaining
functions are taken from Demirhan et al. (1999) and from the web site
http://solon.cma.univie.ac.at/~neum/glopt.html (Neumaier, 2001). We refer to this
test function set as the general test functions.
Initially, we studied three versions of EM, which differ in the local search
procedure. We report our results on the general test functions in terms of
the average number of function evaluations over 25 runs. The average and best
objective function values are also reported.
After selecting the best version, we applied EM to a well-known test set
consisting of seven test functions (Dixon and Szegö, 1978). In addition to reporting
our results, we compare EM with the other methods whose results are collected
by Huyer and Neumaier (1999).
All the computations were conducted on a Pentium-III 450 MHz PC. The
algorithm is coded in C++ and is available upon request.
Figure 4. One of the repelled points observes a better region, and signals for the others.
Function         n    m    MAXITER   LSITER   δ
Complex          2    10   50        10       5.0e-3
Davis            2    20   50        30       5.0e-3
Epistacity(4)    4    30   50        10       1.0e-3
Epistacity(5)    5    40   100       20       1.0e-3
Griewank         2    30   100       20       1.0e-3
Himmelblau       2    10   50        5        1.0e-3
Kearfott         4    10   50        5        1.0e-3
Levy             10   20   75        5        1.0e-3
Rastrigin        2    20   50        10       5.0e-3
Sine Envelope    2    20   75        10       5.0e-4
Stenger          2    10   75        10       1.0e-3
Step             5    10   50        5        1.0e-3
Spiky            2    30   75        10       1.0e-3
Trid(5)          5    10   125       50       1.0e-2
Trid(20)         20   40   500       150      1.0e-3
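To illustrate how a row of the table is used, the following snippet (our own composition, reusing the em sketch given after Algorithm 1) runs the Griewank row with n = 2, m = 30, MAXITER = 100, LSITER = 20 and δ = 1.0e-3. The search box [−600, 600]² is the customary Griewank domain and is our assumption, since the bounds are not listed in the table.

import numpy as np

def griewank(v):
    """Two-dimensional Griewank function; global minimum 0 at the origin."""
    i = np.arange(1, len(v) + 1)
    return 1.0 + np.sum(v * v) / 4000.0 - np.prod(np.cos(v / np.sqrt(i)))

l, u = np.full(2, -600.0), np.full(2, 600.0)   # assumed bounds
x_best, f_best = em(griewank, l, u, m=30, max_iter=100, ls_iter=20, delta=1.0e-3)
print(x_best, f_best)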
Figure 5. The best point attracts the others and the global optimum is found.
Function name | Avg. evals. | Avg. f(x) | Best f(x) | Known optimum
are much better. However, the number of evaluations drastically increases for the
problems with high dimensionality.
4.1.3. EM with the Local procedure applied to the current best point only
To reduce the number of function evaluations, we tried applying the Local
procedure to the current best point only (Table 4). This choice balances the number
of evaluations and the accuracy of the results.
For this case, the average number of evaluations is close to that of the first
version (Section 4.1.1), while the quality of the results is comparable to that of
the second version (Section 4.1.2). However, when the function has good attractors
spread over the feasible region, as in the Sine Envelope and Spiky functions, the
points visiting the neighborhood of the global optimum may not be powerful enough
to attract the other points. For the Davis function, the algorithm rapidly converges
to the neighborhood of the optimum point, but the region around the global optimum
has many local minimizers in steep valleys. Thus, EM is trapped in one of these
local minima.
lar in the neighborhood of the global optimum, or when the global optimum is far
from the highly attractive local minimizers. Although the application of the local
search to all points provides the best results, the improvement over applying local
search only to the current best point is not significant. Therefore, for most of the
problems, applying local search to all points may turn out to be overkill. In the
subsequent experimentation, we utilized the third version of EM for comparison
with other global optimization methods.
For the comparison, we used the standard test functions given by Dixon and
Szegö (1978). The performance of different methods on these functions has been
extensively studied by Huyer and Neumaier (1999), and we use their results in
comparing EM with the existing approaches. As a stopping criterion we use
Eq. (5), which is also applied by Huyer and Neumaier (1999):

\[
\frac{f(x^{best}) - f_{glob}}{|f_{glob}|} \le 10^{-4}, \tag{5}
\]

where f_glob denotes the known global minimum value.
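In code, this criterion is a one-liner; the absolute value in the denominator follows Huyer and Neumaier (1999), who treat the f_glob = 0 case separately.

def converged(f_best, f_glob, tol=1e-4):
    """Stopping test of Eq. (5): relative gap to the known optimum f_glob.
    Assumes f_glob != 0; Huyer and Neumaier (1999) handle f_glob = 0 separately."""
    return (f_best - f_glob) / abs(f_glob) <= tol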
Method               S5     S7     S10    H3     H6     GP     BR     C6     SH
Bremmerman           –a     –a     –a     –a     –a     –a     250
Mod. Bremmerman      –a     –a     –a     –a     515    300    160
Zilinskas            –a     –a     –a     8641   5129
Gomulka-Branin       5500   5020   4860
Törn                 3679   3606   3874   2584   3447   2499   1558
Gomulka-Törn         6654   6084   6144
Gomulka-V.M.         7085   6684   7352   6766   11125  1495   1318
Price                3800   4900   4400   2400   7600   2500   1800
Fagiuoli             2514   2519   2518   513    2916   158    1600
De Biase-Frontini    620    788    1160   732    807    378    587
Mockus               1174   1279   1209   513    1232   362    189
Storn-Price^d        6400   6194   6251   476    7220   1018   1190   416    1371
MCS^e,f              83     129    103    79     111    81     41     42     69
EM                   3368   1782   5620   1114   2341   420    315    233    358
a Method converged to a local minimum.
b Average number of function evaluations when the method converges. For H6, it converged only 70% of the time.
c Global minimum not found within 12 000 function calls.
d Average over 25 cases. For H6, average over 24 cases only; one case did not converge within 12 000 function calls.
e The version that gives the best results is selected.
f Recent methods that use first or second order information.
g Requires a closed form for a particular integral.
h Partitions the search space into hyper-rectangles.
A missing entry means that no result is available in the literature (Huyer and Neumaier, 1999).
− Unlike MCS and the other methods indicated with (f), EM does not use first
or second order information.
− EM is able to provide answers for all the problems.
− All the methods included in Table 6, other than EM, have some limitations
due to their rigid structures. EM has a very flexible design which would
permit using the desirable features of other methods.
− We could have included results for the other methods discussed in
Section 1.2 (such as Genetic Algorithms, Simulated Annealing, etc.). However,
our earlier results (Demirhan et al., 1999) with these methods on some of
the problems above have shown that EM outperforms them.
− In both sets of test problems, we have observed that EM performs well when
there is a pattern in the function. However, if there are multiple attractive local
minimizers spread around the feasible region, EM may be deceived and its
performance may decrease.
− The combination of the general scheme with other powerful local search
methods, such as that of Solis and Wets (1981), may increase the efficiency of
the algorithm.
− The computational results to date suggest that we could improve our results by
using first or second order information, for example the method of Fletcher and
Reeves (1964), in the local search.
In some of the problems, it is not necessary to apply this subprocedure. However,
in certain cases this procedure improves the results critically.
Acknowledgment
We thank the anonymous referee for suggesting the statistical estimation of the
gradient of f.
References
Boender, C. G. E., Kan, A. H. G. R. and Timmer, G. T. (1985), A stochastic approach to global optimization, Computational Mathematical Programming (NATO ASI Series) F15: 291–308.
Cowan, E. W. (1968), Basic Electromagnetism, Academic Press, New York.
Demirhan, M., Özdamar, L., Helvacıoğlu, L. and Birbil, Ş. İ. (1999), FRACTOP: A geometric partitioning metaheuristic for global optimization, Journal of Global Optimization 14: 415–435.
Dixon, L. C. W. and Szegö, G. P. (1978), The global optimization problem: An introduction, in Towards Global Optimization 2, North-Holland, Amsterdam, pp. 1–15.
Fletcher, R. and Reeves, C. (1964), Function minimization by conjugate gradients, Computer Journal 7: 149–154.
Glover, F., Kelly, J. P. and Laguna, M. (1995), Genetic algorithms and tabu search: Hybrids for optimization, Computers and Operations Research 22: 111–134.
Hart, W. E. (1994), Adaptive Global Optimization with Local Search, PhD Thesis, University of California, San Diego.
Huyer, W. and Neumaier, A. (1999), Global optimization by multilevel coordinate search, Journal of Global Optimization 14: 331–355.
Kan, A. H. G. R. and Timmer, G. T. (1987a), Stochastic global optimization methods, Part I: Clustering methods, Mathematical Programming 39: 27–56.
Kan, A. H. G. R. and Timmer, G. T. (1987b), Stochastic global optimization methods, Part II: Multi level methods, Mathematical Programming 39: 57–78.
Michalewicz, Z. (1994), Genetic Algorithms + Data Structures = Evolution Programs, Springer, Berlin.
Moré, J. J. and Wu, Z. (1995), Global Smoothing and Continuation for Large-scale Molecular Optimization, Preprint MCS-P539-1095, Argonne National Laboratory, Illinois.
Neumaier, A. (2001), Global optimization web site, Computational Mathematics Group, University of Vienna, Austria, http://solon.cma.univie.ac.at/~neum/glopt.html.
Pinter, J. (1992), Convergence qualification of adaptive partitioning algorithms in global optimization, Mathematical Programming 56: 343–360.
Schoen, F. (1989), A wide class of test functions for global optimization, in Proceedings of the 3rd International Conference on Genetic Algorithms, pp. 51–60.
Solis, F. J. and Wets, R. J-B. (1981), Minimization by random search techniques, Mathematics of Operations Research 6: 19–30.
Törn, A., Ali, M. M. and Viitanen, S. (1999), Stochastic global optimization: Problem classes and solution techniques, Journal of Global Optimization 14: 437–447.
Törn, A. and Viitanen, S. (1994), Topographical global optimization using pre-sampled points, Journal of Global Optimization 5: 267–276.
Törn, A. and Žilinskas, A. (1989), Global Optimization, Springer, Berlin.