Genetic Algorithm: Artificial Neural Networks (ANNs)
Introduction:
Applying mathematics to a real-world problem usually means, first, modeling the problem mathematically, perhaps with strong restrictions, idealizations, or simplifications; then solving the mathematical problem; and finally drawing conclusions about the real problem from the solutions of the mathematical problem. Over the last 60 years or so, a shift of paradigms has taken place: in some sense, the opposite approach has come into fashion. The point is that the world got along well even in times when nothing about mathematical modeling was known. More specifically, there is an enormous number of highly sophisticated processes and mechanisms in our world which have always attracted the interest of researchers due to their admirable perfection. Imitating such principles mathematically and using them to solve a broader class of problems has turned out to be extremely helpful in various disciplines.
The fourth class of such methods, Genetic Algorithms (GAs), will be the main object of study in these lectures.
Generally speaking, genetic algorithms are simulations of evolution, of whatever kind. In most cases, however, genetic algorithms are nothing else than probabilistic optimization methods based on the principles of evolution. This idea first appeared in 1967 in J. D. Bagley's thesis "The Behavior of Adaptive Systems Which Employ Genetic and Correlative Algorithms" [1]. The theory and applicability were then strongly influenced by J. H. Holland, who can be considered the pioneer
of genetic algorithms [27, 28]. Since then, this field has witnessed a tremendous
development. The purpose of this lecture is to give a comprehensive overview of
this class of methods and their applications in optimization, program induction, and
machine learning.
Definitions and Terminology:
Definition:
Assume S to be a set of strings (in non-trivial cases with some underlying grammar). Let X be the search space of an optimization problem as above; then a function c : X → S, x ↦ c(x), is called a coding function. Conversely, a function c̃ : S → X, s ↦ c̃(s), is called a decoding function.
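As a simple illustration, the following Python sketch shows one possible coding/decoding pair, assuming the search space X is a real interval [a, b] and S is the set of binary strings of length n; the function names and parameter values are illustrative and not taken from the text.

def encode(x, a, b, n):
    # Coding function c : X -> S for X = [a, b] and S = {0,1}^n (illustrative).
    k = round((x - a) / (b - a) * (2 ** n - 1))   # quantize x to an integer in {0, ..., 2^n - 1}
    return format(k, "0{}b".format(n))            # fixed-length binary string

def decode(s, a, b):
    # Decoding function c~ : S -> X, mapping an n-bit string back into [a, b].
    n = len(s)
    return a + int(s, 2) / (2 ** n - 1) * (b - a)

# Example: 8-bit coding of the interval [0, 1]
s = encode(0.3, 0.0, 1.0, 8)    # an 8-bit string such as '01001100'
x = decode(s, 0.0, 1.0)         # approximately 0.3 (up to quantization error)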
Algorithm.
t := 0;
Compute initial population B0;
WHILE stopping condition not fulfilled DO
BEGIN
select individuals for reproduction;
create offspring by crossing individuals;
possibly mutate some individuals;
compute the new generation Bt+1
END
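To make this control flow concrete, here is a minimal Python skeleton of the loop above, assuming the caller supplies the four operators discussed below (selection, crossover, mutation, and replacement/sampling); the function names are placeholders introduced for illustration, not part of the original text.

def genetic_algorithm(init_population, select, crossover, mutate, replace, stop):
    # Generic skeleton of the loop above; the four operators are supplied by the caller.
    t = 0
    B = init_population()                          # compute initial population B0
    while not stop(B, t):                          # stopping condition not fulfilled
        parents = select(B)                        # select individuals for reproduction
        offspring = crossover(parents)             # create offspring by crossing individuals
        offspring = [mutate(o) for o in offspring] # possibly mutate some individuals
        B = replace(B, offspring)                  # compute the new generation
        t += 1
    return B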
As obvious from the above algorithm, the transition from one generation to the next
consists of four basic components:
Selection:
Mechanism for selecting individuals (strings) for reproduction according to their
fitness (objective function value).
Crossover:
Method of merging the genetic information of two individuals; if the coding is
chosen properly, two good parents produce good children.
Mutation:
In real evolution, the genetic material can be changed randomly by erroneous reproduction or other deformations of genes, e.g. by gamma radiation. In genetic algorithms, mutation can be realized as a random deformation of the strings with a certain probability. The positive effect is the preservation of genetic diversity and, as a consequence, that local maxima can be avoided.
Sampling:
Procedure which computes a new generation from the previous one and its offspring.
Compared with traditional continuous optimization methods, such as Newton or gradient-descent methods, we can state the following significant differences:
1. GAs manipulate coded versions of the problem parameters instead of the
parameters themselves, i.e. the search space is S instead of X itself.
2. While almost all conventional methods search from a single point, GAs always
operate on a whole population of points (strings). This contributes much to the
robustness of genetic algorithms. It improves the chance of reaching the global
optimum and, vice versa, reduces the risk of becoming trapped in a local stationary
point.
3. Normal genetic algorithms do not use any auxiliary information about the objective function, such as derivatives. Therefore, they can be applied to any kind of continuous or discrete optimization problem. The only thing to be done is to specify a meaningful decoding function.
4. GAs use probabilistic transition operators, while conventional methods for continuous optimization apply deterministic transition operators. More specifically, the way a new generation is computed from the current one has some random components.
Algorithm.
t := 0;
Create initial population B0 = (b1,0, . . . , bm,0);
WHILE stopping condition not fulfilled DO
BEGIN
( proportional selection )
FOR i := 1 TO m DO
BEGIN
x := Random[0, 1];
k := 1;
WHILE k < m AND Σ_{j=1..k} f(bj,t) / Σ_{j=1..m} f(bj,t) < x DO
k := k + 1;
bi,t+1 := bk,t
END
( one-point crossover )
FOR i := 1 TO m - 1 STEP 2 DO
BEGIN
IF Random[0, 1] < pC THEN
BEGIN
pos := Random{1, . . . , n - 1};
FOR k := pos + 1 TO n DO
BEGIN
aux := bi,t+1[k];
bi,t+1[k] := bi+1,t+1[k];
bi+1,t+1[k] := aux
END
END
END
( mutation )
FOR i := 1 TO m DO
FOR k := 1 TO n DO
IF Random[0, 1] < pM THEN
invert bi,t+1[k];
t := t + 1
END
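The following Python sketch mirrors the pseudocode above (proportional, i.e. roulette-wheel, selection; one-point crossover; bit-flip mutation on binary strings of length n); the parameter values and the ones-counting fitness in the usage line are illustrative assumptions only.

import random

def run_ga(f, n, m, p_c, p_m, generations):
    # Sketch of the algorithm above; f must be a positive fitness function
    # defined on lists of 0/1 of length n.
    B = [[random.randint(0, 1) for _ in range(n)] for _ in range(m)]   # initial population B0
    for _ in range(generations):
        # proportional (roulette-wheel) selection
        total = sum(f(b) for b in B)
        new_B = []
        for _ in range(m):
            x = random.random()
            cum, k = 0.0, 0
            while k < m - 1 and cum + f(B[k]) / total < x:
                cum += f(B[k]) / total
                k += 1
            new_B.append(B[k][:])                    # copy the selected individual
        # one-point crossover on consecutive pairs
        for i in range(0, m - 1, 2):
            if random.random() < p_c:
                pos = random.randint(1, n - 1)
                new_B[i][pos:], new_B[i + 1][pos:] = new_B[i + 1][pos:], new_B[i][pos:]
        # mutation: invert each bit with probability p_m
        for b in new_B:
            for k in range(n):
                if random.random() < p_m:
                    b[k] = 1 - b[k]
        B = new_B
    return max(B, key=f)

# Illustrative usage with the ones-counting fitness (an assumption, not from the text):
best = run_ga(f=sum, n=20, m=30, p_c=0.6, p_m=0.01, generations=100)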
Simulated Annealing
Simulated annealing (SA) is a random-search technique which exploits an analogy
between the way in which a metal cools and freezes into a minimum energy
crystalline structure (the annealing process) and the search for a minimum in a
more general system; it forms the basis of an optimisation technique for
combinatorial and other problems.
Simulated annealing was developed in 1983 to deal with highly nonlinear problems.
SA approaches the global optimisation problem in a manner similar to a bouncing ball
that can bounce over mountains from valley to valley. It begins at a high
"temperature" which enables the ball to make very high bounces, which enables it
to bounce over any mountain to access any valley, given enough bounces. As the
temperature declines the ball cannot bounce so high, and it can also settle to
become trapped in relatively small ranges of valleys. A generating distribution
generates possible valleys or states to be explored. An acceptance distribution is
also defined, which depends on the difference between the function value of the
present generated valley to be explored and the last saved lowest valley. The
acceptance distribution decides probabilistically whether to stay in a new lower
valley or to bounce out of it. Both the generating and acceptance distributions depend
on the temperature.
It has been proved that by carefully controlling the rate of cooling of the
temperature, SA can find the global optimum. However, this requires infinite time.
Fast annealing and very fast simulated reannealing (VFSR) or adaptive simulated
annealing (ASA) are each in turn exponentially faster and overcome this problem.
The method
SA's major advantage over other methods is an ability to avoid becoming trapped in
local minima. The algorithm employs a random search which not only accepts
changes that decrease the objective function f (assuming a minimisation problem),
but also some changes that increase it. The latter are accepted with a probability
p = exp(-df / T)    (1)
where df is the increase in f and T is a control parameter, which by analogy with the
original application is known as the system ''temperature" irrespective of the
objective function involved. The implementation of the basic SA algorithm is
straightforward.
Its basic structure is sketched below.
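As a rough illustration, here is a minimal Python sketch of this basic loop for a minimisation problem; the neighbour function, the geometric cooling schedule, and all parameter values are assumptions chosen for the example, not prescribed above.

import math
import random

def simulated_annealing(f, x0, neighbour, t0=1.0, cooling=0.95, steps_per_temp=100, t_min=1e-6):
    # Basic SA loop for minimising f, using the acceptance probability of equation (1):
    # a worsening move (df > 0) is accepted with probability exp(-df / T).
    x, fx = x0, f(x0)
    best, f_best = x, fx
    T = t0
    while T > t_min:
        for _ in range(steps_per_temp):
            y = neighbour(x)                   # generate a candidate state
            df = f(y) - fx
            if df <= 0 or random.random() < math.exp(-df / T):
                x, fx = y, fx + df             # accept the move
            if fx < f_best:
                best, f_best = x, fx           # remember the best state seen so far
        T *= cooling                           # geometric cooling schedule (illustrative)
    return best, f_best

# Illustrative usage: minimise a simple one-dimensional function (an assumption, not from the text)
f = lambda x: (x - 2.0) ** 2 + math.sin(5 * x)
nb = lambda x: x + random.gauss(0.0, 0.5)
best, f_best = simulated_annealing(f, x0=0.0, neighbour=nb)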
Strengths
1. Simulated annealing can deal with highly nonlinear models, chaotic and noisy
data and many constraints. It is a robust and general technique.
2. Its main advantages over other local search methods are its flexibility and its
ability to approach global optimality.
3. The algorithm is quite versatile since it does not rely on any restrictive
properties of the model.
4. SA methods are easily "tuned". For any reasonably difficult nonlinear or
stochastic system, a given optimisation algorithm can be tuned to enhance
its performance. Since it takes time and effort to become familiar with a
given code, the ability to tune the same algorithm for use on more than one
problem should be considered an important feature.
Weaknesses
1. Since SA is a metaheuristic, a lot of choices are required to turn it into an
actual algorithm.
2. There is a clear tradeoff between the quality of the solutions and the time
required to compute them.
3. The tailoring work required to account for different classes of constraints and
to fine-tune the parameters of the algorithm can be rather delicate.
4. The precision of the numbers used in the implementation of SA can have a
significant effect upon the quality of the outcome.
Neural nets
The main difference compared with neural nets is that neural nets learn (how to
approximate a function) while simulated annealing searches for a global optimum.
Neural nets are flexible function approximators while SA is an intelligent random
search method. The adaptive characteristics of neural nets are a huge advantage in
modelling changing environments. However, the computational cost of SA limits its use in real-time applications.
Genetic algorithms
Direct comparisons have been made between ASA/VFSR and publicly-available
genetic algorithm (GA) codes, using a test suite already adapted and adopted for
GA. In each case, ASA outperformed the GA. GA is a class of algorithms that are interesting in their own right; GA was not originally developed as an optimisation algorithm, and basic GA does not offer any statistical guarantee of global convergence to an optimal point. Nevertheless, it should be expected that
GA may be better suited for some problems than SA.
Random Search
The brute force approach for difficult functions is a random, or an enumerated
search. Points in the search space are selected randomly, or in some systematic
way, and their fitness evaluated. This is an unintelligent strategy, and is rarely used
by itself.
Suitability
SA appears to be rapidly becoming an algorithm of choice when dealing with financial instruments [1]. Standard nested regression and local-search methods are usually applied to develop hybrid securities, e.g. combining markets in interest
rates, foreign exchange, equities and commodities by linking them via options,
futures, forwards, and swaps, to increase profits and reduce risks in investments as
well as in trading.
The potential uses for SA in stock price modelling may be limited. However,
simulated annealing has been reasonably successfully used in the solution of a
complex portfolio selection model [3]. The algorithm was able to handle more
classes of constraints than most other techniques. Trading constraints were, however, difficult to handle because of the discontinuities they introduced in the
space of feasible portfolios.