Differential Evolution
Optimization involves finding the best values of the decision variables over all possibilities. The best values would give the smallest objective function value for a minimization problem or the largest objective function value for a maximization problem. In terms of real-world applications, the objective function is often a representation of some physically significant measure such as profit, loss, utility, risk or error.
Evolution in Biology I
Organisms produce a number of offspring similar to themselves, but the offspring can show variations due to mutations (random changes).
Evolution in Biology II
Some offspring survive and produce the next generation, and some don't. Organisms that are better adapted to the environment have a higher chance of surviving. Over time, the generations become more and more adapted because the fittest organisms survive.
Differential Evolution
Population-based optimisation algorithm
Introduced by Storn and Price in 1996. Many practical problems have objective functions that are non-differentiable, hence gradient-based methods cannot be employed. DE was developed to optimise mixed integer-discrete-continuous non-linear programming problems, as opposed to the purely continuous non-linear programming handled by most algorithms.
Some other optimisation algorithms use predetermined probability density functions (PDFs) for parameter perturbation. These PDFs have to be adapted throughout the optimisation, a process that adds complexity to the procedure. DE is self-adjusting, as it deduces the perturbation information from the distances between the vectors that comprise the population.
Problem
The population
A population is a congregation of individuals. An important control parameter of DE is the population size, NP.
Initialisation
Mutation
Cross-over
Selection
During initialisation, random values are assigned to the parameter vectors xi,j. The random number generator, randj[0,1), returns a uniformly distributed random number from within the range [0,1), i.e., 0 ≤ randj[0,1) < 1. The subscript j indicates that a new random value is generated for each parameter.
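As an illustration, a minimal sketch of this initialization in Python (NumPy assumed; the function name and arguments are illustrative, not part of the original material):

```python
import numpy as np

def initialize_population(NP, D, bL, bU, rng=None):
    """Create NP parameter vectors of dimension D, uniformly random in [bL, bU)."""
    rng = np.random.default_rng() if rng is None else rng
    rand = rng.random((NP, D))      # rand_j[0,1), drawn anew for every parameter
    return bL + rand * (bU - bL)    # scale each random number into the search range
```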
DE Mutation Operator
DE generates new parameter vectors by adding the weighted difference between two parameter vectors to a third vector. For each target vector xi,G, i = 1, 2, ..., NP, a mutant vector is generated according to
vi,G+1 = xr1,G + F · (xr2,G − xr3,G)
where vi,G+1 is the mutant vector; r1, r2, r3 ∈ {1, 2, ..., NP} are random integers, mutually different and different from the running index i (which requires NP ≥ 4); and F is a real, constant factor in (0, 1+] that controls the amplification of the differential variation, with a typical value of 0.8-0.9.
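A hedged sketch of this DE/rand/1 mutation step (Python with NumPy; the helper name and default F are assumptions):

```python
import numpy as np

def mutate(pop, i, F=0.8, rng=None):
    """Mutant vector v = x_r1 + F * (x_r2 - x_r3), with r1, r2, r3 distinct and != i."""
    rng = np.random.default_rng() if rng is None else rng
    candidates = [r for r in range(len(pop)) if r != i]   # exclude the target index
    r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
    return pop[r1] + F * (pop[r2] - pop[r3])
```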
Crossover
The classic variant of diversity enhancement is crossover, which mixes parameters of the mutation vector vi,g and the so-called target vector xi,g in order to generate the trial vector ui,g. The most common form of crossover is uniform (binomial) and is defined as
uj,i,g = vj,i,g if randj[0,1) ≤ Cr or j = jrand, and uj,i,g = xj,i,g otherwise,
where the crossover probability Cr controls the fraction of parameter values that are copied from the mutant, and the index jrand guarantees that at least one parameter is taken from the mutant.
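A possible implementation of this uniform (binomial) crossover (Python sketch; the forced index j_rand follows the standard convention of ensuring the trial vector differs from the target, and the parameter name Cr is assumed):

```python
import numpy as np

def crossover(target, mutant, Cr=0.9, rng=None):
    """Copy each parameter from the mutant with probability Cr, otherwise keep the target's."""
    rng = np.random.default_rng() if rng is None else rng
    D = len(target)
    mask = rng.random(D) < Cr          # which parameters come from the mutant
    mask[rng.integers(D)] = True       # j_rand: at least one parameter from the mutant
    return np.where(mask, mutant, target)
```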
Selection
DE uses simple one-to-one survivor selection, where the trial vector ui,g competes against the target vector xi,g. The vector with the lower objective function value survives into the next generation g+1.
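The one-to-one survivor selection can be written in a few lines (sketch; ties here go to the trial vector, which is one common convention):

```python
def select(target, trial, f):
    """Keep whichever of the target and trial vectors has the lower objective value."""
    return trial if f(trial) <= f(target) else target
```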
Termination criteria
- a preset number of maximum generations is reached,
- the difference between the best and worst function values in the population is very small,
- the best function value has not improved beyond some tolerance value for a predefined number of generations,
- the distance between solution vectors in the population is small.
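A sketch of how such termination tests might be combined (Python; the tolerance values are placeholders, not prescribed settings):

```python
import numpy as np

def should_terminate(gen, max_gen, f_values, pop, f_tol=1e-8, x_tol=1e-8):
    """Illustrative termination checks for a DE run."""
    if gen >= max_gen:                                 # preset maximum number of generations
        return True
    if np.max(f_values) - np.min(f_values) < f_tol:    # best-worst spread is very small
        return True
    if np.max(np.ptp(pop, axis=0)) < x_tol:            # population has collapsed in space
        return True
    return False
```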
Typical initial settings are F = 0.8-0.9 and pc = 0.9. However, F = 0.5 and pc = 0.1 are also claimed to be a good first choice.
Population size NP, mutation intensity F and crossover probability pc are the main control parameters. Increasing the population size is suggested if differential evolution does not converge; however, this only helps to a certain extent. The mutation intensity has to be adjusted a little lower or higher than 0.8 if the population size is increased. It is claimed that convergence is more likely to occur, but generally takes longer, with a larger population and weaker mutation (smaller F).
DE dynamics
There are two opposing mechanisms that influence DE's population: 1) the population's tendency to expand and explore the terrain on which it is working, which also prevents the population from converging prematurely; and 2) selection, which counteracts expansion by removing vectors located in unproductive regions.
Sample problem
f2(x) = 100 · (x2 − x1²)² + (1 − x1)², with xj ∈ [−2.048, 2.048] and f*(x) = 0.0.
This is a two-variable problem, so D = 2. The demonstration covers the first two generations, so g = 2. We take a population size of 5, so Np = 5, with xL = −2.048 and xU = 2.048.
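For concreteness, the objective function and the problem settings of the demonstration can be written as (Python sketch; names are illustrative):

```python
def f2(x):
    """Two-variable Rosenbrock function: 100*(x2 - x1^2)^2 + (1 - x1)^2."""
    x1, x2 = x
    return 100.0 * (x2 - x1 ** 2) ** 2 + (1.0 - x1) ** 2

D, Np = 2, 5               # dimension and population size used in the demonstration
bL, bU = -2.048, 2.048     # search bounds for both variables
```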
Step 1: Initialization
bL = −2.048, bU = 2.048
Pop   rand 1   rand 2   Var 1    Var 2
1     0.6557   0.0357   0.6379   -1.9017
2     0.8491   0.9340   1.4300   1.7776
3     0.6787   0.7577   0.7321   1.0557
4     0.7431   0.3922   0.9959   -0.4414
5     0.6555   0.1712   0.6368   -1.3468
The random numbers randj[0,1) are scaled into the search range via xi,j = bL + randj · (bU − bL).
Example of Mutation
vi,G+1 = xr1,G + F · (xr2,G − xr3,G), with F = 0.9
Pop   Var 1    Var 2
1     0.6379   -1.9017
2     1.4300   1.7776
3     0.7321   1.0557
4     0.9959   -0.4414
5     0.6368   -1.3468
Taking xr1 = 1.4300, xr2 = 0.6368 and xr3 = 0.9959 (the Var 1 components of individuals 2, 5 and 4), the mutant component is v = 1.4300 + 0.9 · (0.6368 − 0.9959) = 1.10681.
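The arithmetic of this mutation step can be checked directly (variable names mirror the example):

```python
F = 0.9
x_r1, x_r2, x_r3 = 1.4300, 0.6368, 0.9959
v = x_r1 + F * (x_r2 - x_r3)
print(round(v, 5))   # 1.10681
```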
Generation 2
The function values of the previous generation are: 533.1020893, 7.329829, 27.08365508, 205.4110593, 307.1924338 (for individuals 1-5 respectively).
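These values can be reproduced (to display precision) by evaluating the Rosenbrock function on the initial population listed above:

```python
f2 = lambda x: 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2

pop_gen1 = [(0.6379, -1.9017), (1.4300, 1.7776), (0.7321, 1.0557),
            (0.9959, -0.4414), (0.6368, -1.3468)]
print([round(f2(x), 4) for x in pop_gen1])
# [533.1021, 7.3298, 27.0837, 205.4111, 307.1924]
```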
Cross over
Var 2 components of the trial vectors after crossover:
Pop   Var 2
1     -0.9729
2     -1.9283
3     0.9434
4     -0.4414
5     -0.1686
Selection
f[u(i,2)], the objective values of the trial vectors: 1493.445845, 1189.330898, 459.4931906, 11.29843767, 180.2946151.
Each of these is compared against f[x(i,2)], the objective value of the corresponding target vector, and the better of each pair survives into the next generation.
The generalized Rosenbrock function is
f2(x) = Σ (j = 1 to D/2) [100 · (x2j − x2j−1²)² + (1 − x2j−1)²],
which for two variables reduces to
f2(x) = 100 · (x2 − x1²)² + (1 − x1)².
It is a difficult problem because the global optimum lies inside a long, narrow, parabolic-shaped flat valley.
[Figures: contour plots of f2 over x1, x2 ∈ [−5, 5] for generations 1, 3, 5, 10, 20, 27, 40, 50 and 60. For each generation, one plot shows the population when the mutant is generated using three random vectors, and the other when it is generated using two random vectors and a deterministic vector corresponding to fbest.]
Final solutions obtained by the two strategies:
x1 = 0.9997, x2 = 0.9995
x1 = 0.4367, x2 = 0.7395
Inequality Constraints
Most often, inequality constraints are implemented as penalty functions. One common way to integrate constraint violations into an optimization task is to multiply each penalty by a weight, wm, and add the result to the objective function, f(x):
fp(x) = f(x) + Σm wm · pm(x),
where pm(x) is the penalty incurred by violating the m-th constraint (zero when the constraint is satisfied).
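A small sketch of this weighted-penalty construction (Python; constraints are assumed to be functions g_m with g_m(x) <= 0 when satisfied, and all names are illustrative):

```python
def penalized_objective(f, constraints, weights):
    """Return f(x) plus a weighted penalty for every violated inequality constraint."""
    def fp(x):
        penalty = sum(w * max(0.0, g(x)) for g, w in zip(constraints, weights))
        return f(x) + penalty
    return fp
```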
Individual penalty terms may differ by many orders of magnitude, leaving violations with small penalties underrepresented until those that generate large penalties become just as small. When there are many constraints, the main drawback of the penalty approach is that the pre-specified weights must be well chosen to keep the population from converging upon either infeasible or non-optimal vectors.
Otherwise, one penalty will dominate unless the weights are correctly adjusted. In addition, the population can converge upon an infeasible region if its objective function values there are much lower than those in feasible regions. It can even happen that no set of weights will work. Because weight selection tends to be a trial-and-error optimization problem in its own right, simpler direct constraint-handling methods have been designed that do not require the user to tune penalty weights.
With direct constraint handling, the target vector xi,g is replaced by the trial vector ui,g if:
1. ui,g satisfies all constraints and has a lower or equal objective function value than xi,g, or
2. ui,g is feasible and xi,g is not, or
3. ui,g and xi,g are both infeasible, but ui,g does not violate any constraint more than xi,g.
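These rules can be expressed compactly as a replacement test (Python sketch; constraints are again taken as functions g_m with g_m(x) <= 0 when satisfied, and the helper name is illustrative):

```python
def prefer_trial(trial, target, f, constraints):
    """True if the trial vector should replace the target under the rules above."""
    vu = [max(0.0, g(trial)) for g in constraints]    # constraint violations of the trial
    vx = [max(0.0, g(target)) for g in constraints]   # constraint violations of the target
    u_feasible = all(v == 0.0 for v in vu)
    x_feasible = all(v == 0.0 for v in vx)
    if u_feasible and x_feasible:
        return f(trial) <= f(target)                  # rule 1
    if u_feasible:
        return True                                   # rule 2
    if not x_feasible:
        return all(a <= b for a, b in zip(vu, vx))    # rule 3
    return False                                      # target feasible, trial not
```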
Equality Constraints
Equality constraints can be written in the form hm(x) = 0, m = 1, 2, ....
Eliminating variables
Eliminating an objective function variable with an equality constraint not only ensures that all vectors satisfy the constraint, but also reduces the problem's dimension by 1. In practice, not all equality constraint equations can be solved for a term that also appears in the objective function. When an equality constraint cannot be used to eliminate a variable, it can be recast as a pair of inequality constraints.
Only rarely will a solution exactly satisfy an equality constraint. Otherwise, the finite precision of floating-point number formats limits the degree to which an equality constraint can be satisfied. It is more reasonable, therefore, to demand that an equality constraint violation be less than ε, where ε can be made as small as desired: |hm(x)| ≤ ε.
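One simple way to realise this in practice is to recast the equality constraint as two inequality constraints with a tolerance (Python sketch; eps and the helper name are illustrative):

```python
def as_inequalities(h, eps=1e-6):
    """Recast h(x) = 0 as |h(x)| <= eps, i.e. the pair h(x) - eps <= 0 and -h(x) - eps <= 0."""
    return [lambda x: h(x) - eps, lambda x: -h(x) - eps]
```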
Penalty Approach
Like inequality constraints, equality constraints can be transformed into cost terms that penalize the objective function. Either the absolute value or the square of hm(x) makes a suitable penalty. When all constraints are fulfilled, the penalty term becomes zero:
fp(x) = f(x) + Σm wm · hm(x)²  (or Σm wm · |hm(x)|).
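A minimal sketch of this penalty construction for equality constraints (Python; the squared form is used here, the absolute value works equally well, and all names are illustrative):

```python
def equality_penalized(f, equalities, weights):
    """Return f(x) plus w_m * h_m(x)^2 for each equality constraint h_m(x) = 0."""
    def fp(x):
        return f(x) + sum(w * h(x) ** 2 for h, w in zip(equalities, weights))
    return fp
```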