Tutorial EA
Introduction to Optimization
Optimization is the process of making something better. In any process, we have a set of
inputs and a set of outputs as shown in the following figure.
Optimization refers to finding the values of inputs in such a way that we get the “best”
output values. The definition of “best” varies from problem to problem, but in mathematical
terms, it refers to maximizing or minimizing one or more objective functions, by varying the
input parameters.
The set of all possible solutions or values which the inputs can take makes up the search
space. Somewhere in this search space lies a point or a set of points which gives the optimal
solution. The aim of optimization is to find that point or set of points in the search space.
In this way we keep “evolving” better individuals or solutions over generations, till we reach
a stopping criterion.
Genetic Algorithms are sufficiently randomized in nature, but they perform much better than
random local search (in which we just try various random solutions, keeping track of the
best so far), as they exploit historical information as well.
Advantages of GAs
GAs have various advantages which have made them immensely popular. These include −
Does not require any derivative information (which may not be available for many
real-world problems).
Optimizes both continuous and discrete functions and also multi-objective problems.
Always gets an answer to the problem, which gets better over time.
Useful when the search space is very large and there are a large number of
parameters involved.
Limitations of GAs
Like any technique, GAs also suffer from a few limitations. These include −
GAs are not suited for all problems, especially problems which are simple and for
which derivative information is available.
Being stochastic, there are no guarantees on the optimality or the quality of the
solution.
If not implemented properly, the GA may not converge to the optimal solution.
GA – Motivation
Genetic Algorithms have the ability to deliver a “good-enough” solution “fast-enough”. This
makes genetic algorithms attractive for use in solving optimization problems. The reasons
why GAs are needed are as follows −
tutorialspoint.com/genetic_algorithms/genetic_algorithms_fundamentals.htm
This section introduces the basic terminology required to understand GAs. Also, a generic
structure of GAs is presented in both pseudo-code and graphical forms. The reader is
advised to properly understand all the concepts introduced in this section and keep them in
mind when reading other sections of this tutorial as well.
Basic Terminology
Before beginning a discussion on Genetic Algorithms, it is essential to be familiar with
some basic terminology which will be used throughout this tutorial.
Genotype − Genotype is the population in the computation space. In the
computation space, the solutions are represented in a way which can be easily
understood and manipulated using a computing system.
Phenotype − Phenotype is the population in the actual real-world solution space, in
which solutions are represented the way they appear in real-world situations.
Decoding and Encoding − For simple problems, the phenotype and genotype
spaces are the same. However, in most cases, the phenotype and genotype
spaces are different. Decoding is the process of transforming a solution from the
genotype to the phenotype space, while encoding is the process of transforming it from
the phenotype to the genotype space. Decoding should be fast, as it is carried out
repeatedly in a GA during the fitness value calculation.
For example, consider the 0/1 Knapsack Problem. The Phenotype space consists of
solutions which just contain the item numbers of the items to be picked.
Fitness Function − A fitness function, simply defined, is a function which takes a
solution as input and produces the suitability of the solution as output. In some
cases, the fitness function and the objective function may be the same, while in
others they may differ depending on the problem.
Genetic Operators − These alter the genetic composition of the offspring. These
include crossover, mutation, selection, etc.
Basic Structure
The basic structure of a GA is as follows −
We start with an initial population (which may be generated at random or seeded by other
heuristics) and select parents from this population for mating. We then apply crossover and
mutation operators on the parents to generate new offspring. Finally, these offspring replace
the existing individuals in the population and the process repeats. In this way genetic
algorithms try to mimic natural evolution to some extent.
Each of these steps is covered as a separate chapter later in this tutorial.
A generalized pseudo-code for a GA is explained in the following program −
GA()
initialize population
find fitness of population
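The pseudo-code above is cut off after its first two steps. As a rough sketch only, and not the tutorial's own listing, the loop described under Basic Structure could look as follows in Python. The toy fitness (number of 1 bits), the tournament of size 3, the one-point crossover, the bit-flip mutation and every parameter value are assumptions chosen purely for illustration.

```python
import random


def run_ga(n_bits=20, pop_size=30, generations=60, p_c=0.9, p_m=0.02, seed=0):
    """Toy generational GA that maximizes the number of 1 bits (assumed problem)."""
    rng = random.Random(seed)
    fitness = sum  # assumed fitness: count of 1 bits in the chromosome

    # initialize population (here: completely random bit strings)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def pick_parent():
        # assumed selection scheme: best of 3 randomly drawn individuals
        return max(rng.sample(population, 3), key=fitness)

    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            a, b = pick_parent(), pick_parent()
            if rng.random() < p_c:  # one-point crossover with probability p_c
                cut = rng.randint(1, n_bits - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):
                # bit-flip mutation: each gene flips with probability p_m
                offspring.append([g ^ 1 if rng.random() < p_m else g
                                  for g in child])
        population = offspring[:pop_size]  # offspring replace the old population
        best = max(population + [best], key=fitness)  # remember best seen so far
    return best
```

Calling `run_ga()` returns the best bit string seen during the run; a fixed generation count is used here as the stopping criterion only to keep the sketch short.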
tutorialspoint.com/genetic_algorithms/genetic_algorithms_genotype_representation.htm
Genotype Representation
One of the most important decisions to make while implementing a genetic algorithm is
choosing the representation for our solutions. It has been observed
that improper representation can lead to poor performance of the GA.
In this section, we present some of the most commonly used representations for genetic
algorithms. However, representation is highly problem specific and the reader might find
that another representation or a mix of the representations mentioned here might suit
their problem better.
Binary Representation
This is one of the simplest and most widely used representations in GAs. In this type of
representation the genotype consists of bit strings.
For some problems, where the solution space consists of Boolean decision variables (yes
or no), the binary representation is natural. Take for example the 0/1 Knapsack Problem. If
there are n items, we can represent a solution by a binary string of n elements, where the
x-th element tells whether item x is picked (1) or not (0).
For other problems, specifically those dealing with numbers, we can represent the numbers
with their binary representation. The problem with this kind of encoding is that different bits
have different significance and therefore mutation and crossover operators can have
undesired consequences. This can be resolved to some extent by using Gray Coding, in which
consecutive integers differ in exactly one bit, so a single-bit change is always enough to
move to a neighbouring value.
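As a small illustration of the Gray coding idea, the following sketch converts between plain binary integers and their reflected Gray code. The function names are hypothetical and chosen for this example only.

```python
def binary_to_gray(n: int) -> int:
    """Return the reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_to_binary(g: int) -> int:
    """Invert binary_to_gray by XOR-ing together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


# 7 -> 8 flips four bits in plain binary (0111 -> 1000),
# but only one bit in Gray code (0100 -> 1100).
print(format(binary_to_gray(7), "04b"), format(binary_to_gray(8), "04b"))
```

In a GA, the chromosome would hold the Gray-coded bits and `gray_to_binary` would be part of the decoding step before the fitness is evaluated.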
Integer Representation
For discrete valued genes, we cannot always limit the solution space to binary ‘yes’ or ‘no’.
For example, if we want to encode the four directions – North, South, East and West, we
can encode them as {0,1,2,3}. In such cases, integer representation is desirable.
Permutation Representation
In many problems, the solution is represented by an order of elements. In such cases
permutation representation is the most suitable.
A classic example of this representation is the travelling salesman problem (TSP). In this
the salesman has to take a tour of all the cities, visiting each city exactly once and come
back to the starting city. The total distance of the tour has to be minimized. The solution to
this TSP is naturally an ordering or permutation of all the cities and therefore using a
permutation representation makes sense for this problem.
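A minimal sketch of a permutation chromosome for the TSP is shown below, assuming cities are numbered from 0 and given as 2D coordinates. The helper names and the Euclidean distance are assumptions made for illustration.

```python
import math
import random


def random_tour(n_cities, rng=random):
    """A chromosome is simply a random ordering of the city indices."""
    tour = list(range(n_cities))
    rng.shuffle(tour)
    return tour


def tour_length(tour, coords):
    """Total length of the closed tour, including the return to the start."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )
```

Since the total distance has to be minimized, a fitness function could for instance use the negative of `tour_length`, or its reciprocal.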
tutorialspoint.com/genetic_algorithms/genetic_algorithms_population.htm
Population is a subset of solutions in the current generation. It can also be defined as a set
of chromosomes. There are several things to be kept in mind when dealing with GA
population −
The population size should not be kept very large as it can cause a GA to slow down,
while a smaller population might not be enough for a good mating pool. Therefore, an
optimal population size needs to be decided by trial and error.
The population is usually defined as a two dimensional array of size population size ×
chromosome size.
Population Initialization
There are two primary methods to initialize a population in a GA. They are −
Random initialization − Populate the initial population with completely random
solutions.
Heuristic initialization − Populate the initial population using a known heuristic for
the problem.
It has been observed that the entire population should not be initialized using a heuristic, as
it can result in the population having similar solutions and very little diversity. It has been
experimentally observed that the random solutions are the ones to drive the population to
optimality. Therefore, with heuristic initialization, we just seed the population with a couple
of good solutions, filling up the rest with random solutions rather than filling the entire
population with heuristic based solutions.
It has also been observed that heuristic initialization, in some cases, only affects the initial
fitness of the population, but in the end, it is the diversity of the solutions which leads to
optimality.
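The seeding strategy described above can be sketched as follows, assuming a binary chromosome. The function name and the `seeds` parameter (a few solutions obtained from some heuristic) are hypothetical.

```python
import random


def init_population(pop_size, chrom_len, seeds=(), rng=random):
    """Seed with a few heuristic solutions, then fill up with random ones."""
    population = [list(s) for s in seeds][:pop_size]
    while len(population) < pop_size:
        population.append([rng.randint(0, 1) for _ in range(chrom_len)])
    return population
```

The result is a list of `pop_size` chromosomes of length `chrom_len`, matching the two dimensional layout mentioned earlier.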
Population Models
There are two population models widely in use −
Steady State
In steady state GA, we generate one or two offspring in each iteration and they replace
one or two individuals from the population. A steady state GA is also known as Incremental
GA.
Generational
In a generational model, we generate ‘n’ offspring, where n is the population size, and the
entire population is replaced by the new one at the end of the iteration.
tutorialspoint.com/genetic_algorithms/genetic_algorithms_fitness_function.htm
The fitness function, simply defined, is a function which takes a candidate solution to the
problem as input and produces as output how “fit” or how “good” the solution is with
respect to the problem in consideration.
In most cases the fitness function and the objective function are the same, as the objective
is to either maximize or minimize the given objective function. However, for more complex
problems with multiple objectives and constraints, the algorithm designer might choose to
have a different fitness function.
It must quantitatively measure how fit a given solution is or how fit individuals can be
produced from the given solution.
In some cases, calculating the fitness function directly might not be possible due to the
inherent complexities of the problem at hand. In such cases, we do fitness approximation to
suit our needs.
The following image shows the fitness calculation for a solution of the 0/1 Knapsack. It is a
simple fitness function which just sums the profit values of the items being picked (which
have a 1), scanning the elements from left to right till the knapsack is full.
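The knapsack fitness described above might be written as follows. This is one reading of "till the knapsack is full": scanning stops at the first picked item that no longer fits. The parameter names are assumptions.

```python
def knapsack_fitness(chromosome, profits, weights, capacity):
    """Sum the profits of picked items, left to right, until one does not fit."""
    total_profit = 0
    total_weight = 0
    for picked, profit, weight in zip(chromosome, profits, weights):
        if not picked:
            continue
        if total_weight + weight > capacity:
            break  # knapsack is full: stop scanning (assumed interpretation)
        total_weight += weight
        total_profit += profit
    return total_profit
```

For example, with profits `[10, 5, 7, 3]`, weights `[4, 3, 5, 2]` and capacity 10, the chromosome `[1, 0, 1, 1]` scores 17, because the last item would push the weight to 11.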
tutorialspoint.com/genetic_algorithms/genetic_algorithms_parent_selection.htm
Parent Selection is the process of selecting parents which mate and recombine to create
offspring for the next generation. Parent selection is crucial to the convergence rate
of the GA, as good parents drive individuals towards better and fitter solutions.
However, care should be taken to prevent one extremely fit solution from taking over the
entire population in a few generations, as this leads to the solutions being close to one
another in the solution space thereby leading to a loss of diversity. Maintaining good
diversity in the population is extremely crucial for the success of a GA. This taking up of
the entire population by one extremely fit solution is known as premature convergence
and is an undesirable condition in a GA.
Consider a circular wheel. The wheel is divided into n pies, where n is the number of
individuals in the population. Each individual gets a portion of the circle which is
proportional to its fitness value.
It is clear that a fitter individual has a greater pie on the wheel and therefore a greater
chance of landing in front of the fixed point when the wheel is rotated. Therefore, the
probability of choosing an individual depends directly on its fitness.
Starting from the top of the population, keep adding the fitnesses to the partial sum P,
till P
It is to be noted that fitness proportionate selection methods don’t work for cases where the
fitness can take a negative value.
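A minimal sketch of the wheel described above, implemented with a running partial sum, is given below. It assumes non-negative fitness values with a positive total; the function name and the explicit check for negative values are additions made for this example.

```python
import random


def roulette_wheel_select(fitnesses, rng=random):
    """Return the index of one individual, chosen proportionally to fitness."""
    if min(fitnesses) < 0:
        raise ValueError("fitness proportionate selection needs fitness >= 0")
    total = sum(fitnesses)
    r = rng.random() * total  # where the fixed point lands on the wheel
    partial = 0.0
    for index, f in enumerate(fitnesses):
        partial += f
        if partial > r:
            return index
    return len(fitnesses) - 1  # guard against floating point rounding
```

An individual with fitness 3 is chosen about three times as often as one with fitness 1, and an individual with fitness 0 is never chosen in practice.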
Tournament Selection
In K-Way tournament selection, we select K individuals from the population at random and
select the best out of these to become a parent. The same process is repeated for
selecting the next parent. Tournament Selection is also extremely popular in literature as it
can even work with negative fitness values.
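K-way tournament selection is short enough to sketch directly. Drawing the K contestants without replacement is an assumption here; the function and parameter names are hypothetical.

```python
import random


def tournament_select(population, fitness, k=3, rng=random):
    """Pick k individuals at random and return the fittest of them."""
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)
```

Because only comparisons between fitness values are used, the sketch works unchanged when fitness values are negative.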
Rank Selection
Rank Selection also works with negative fitness values and is mostly used when the
individuals in the population have very close fitness values (this happens usually at the end
of the run). This leads to each individual having an almost equal share of the pie (like in
case of fitness proportionate selection) as shown in the following image and hence each
individual no matter how fit relative to each other has an approximately same probability of
getting selected as a parent. This in turn leads to a loss in the selection pressure towards
fitter individuals, making the GA to make poor parent selections in such situations.
3/4
In this, we remove the concept of a fitness value while selecting a parent. However, every
individual in the population is ranked according to their fitness. The selection of the parents
depends on the rank of each individual and not the fitness. The higher ranked individuals
are preferred more than the lower ranked ones.
Chromosome   Fitness Value   Rank
A            8.1             1
B            8.0             4
C            8.05            2
D            7.95            6
E            8.02            3
F            7.99            5
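The ranking in the table above can be reproduced with a short sketch. The linear weighting used in `rank_select` (the best rank gets weight n, the worst gets weight 1) is an assumption; the text only states that higher ranked individuals are preferred.

```python
import random


def rank_population(fitness_by_name):
    """Map each individual to its rank, with 1 being the fittest."""
    ordered = sorted(fitness_by_name, key=fitness_by_name.get, reverse=True)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


def rank_select(fitness_by_name, rng=random):
    """Choose one individual with probability based on rank, not raw fitness."""
    ranks = rank_population(fitness_by_name)
    n = len(ranks)
    names = list(ranks)
    weights = [n - ranks[name] + 1 for name in names]  # assumed linear scheme
    return rng.choices(names, weights=weights, k=1)[0]
```

With the six fitness values from the table, A is six times as likely to be chosen as D, even though their raw fitness values differ by less than 2%.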
Random Selection
In this strategy we randomly select parents from the existing population. There is no
selection pressure towards fitter individuals and therefore this strategy is usually avoided.
tutorialspoint.com/genetic_algorithms/genetic_algorithms_crossover.htm
In this chapter, we discuss what a Crossover Operator is, along with its variants,
their uses and benefits.
Introduction to Crossover
The crossover operator is analogous to reproduction and biological crossover. In crossover,
more than one parent is selected and one or more offspring are produced using the genetic
material of the parents. Crossover is usually applied in a GA with a high probability – pc .
Crossover Operators
In this section we will discuss some of the most popularly used crossover operators. It is to
be noted that these crossover operators are very generic and the GA Designer might
choose to implement a problem-specific crossover operator as well.
Uniform Crossover
In a uniform crossover, we don’t divide the chromosome into segments; rather, we treat
each gene separately. We essentially flip a coin for each gene to decide
which parent it is inherited from. We can also bias the coin towards one parent, to
have more genetic material in the child from that parent.
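A sketch of uniform crossover with a per-gene coin flip follows. The `bias` parameter, the probability that the first child takes a gene from the first parent, is a hypothetical name for the biased coin mentioned above.

```python
import random


def uniform_crossover(parent1, parent2, bias=0.5, rng=random):
    """Build two children by deciding the source parent gene by gene."""
    child1, child2 = [], []
    for g1, g2 in zip(parent1, parent2):
        if rng.random() < bias:  # heads: child1 inherits from parent1
            child1.append(g1)
            child2.append(g2)
        else:                    # tails: child1 inherits from parent2
            child1.append(g2)
            child2.append(g1)
    return child1, child2
```

With `bias=0.5` each child gets roughly half of its genes from each parent; values closer to 1.0 keep the first child closer to the first parent.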
Obviously, if α = 0.5, then both the children will be identical as shown in the following
image.
Create two random crossover points in the parent and copy the segment between
them from the first parent to the first offspring.
Now, starting from the second crossover point in the second parent, copy the
remaining unused numbers from the second parent to the first child, wrapping around
the list.
Repeat for the second child with the parents’ roles reversed.
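The three steps above match what is commonly called order crossover for permutations and could be sketched as follows. The segment is taken as the slice `p1[a:b]`; the optional cut point arguments are an addition so that the example is reproducible, and genes are assumed to be hashable and unique.

```python
import random


def order_crossover(p1, p2, a=None, b=None, rng=random):
    """Copy p1[a:b] into the child, then fill the rest in p2's order."""
    n = len(p1)
    if a is None or b is None:
        a, b = sorted(rng.sample(range(n + 1), 2))  # two random cut points
    child = [None] * n
    child[a:b] = p1[a:b]          # segment copied from the first parent
    used = set(p1[a:b])
    pos = b % n                   # next free slot, right after the segment
    for offset in range(n):       # walk p2 from the second cut, wrapping around
        gene = p2[(b + offset) % n]
        if gene not in used:
            child[pos] = gene
            pos = (pos + 1) % n
    return child


p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [9, 3, 7, 8, 2, 6, 5, 1, 4]
print(order_crossover(p1, p2, 3, 7))  # [3, 8, 2, 4, 5, 6, 7, 1, 9]
print(order_crossover(p2, p1, 3, 7))  # [3, 4, 7, 8, 2, 6, 5, 9, 1]
```

Both children are valid permutations, which is the reason this operator suits permutation encodings where a plain one-point crossover would duplicate or drop elements.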
There exist a lot of other crossovers like Partially Mapped Crossover (PMX), Order based
crossover (OX2), Shuffle Crossover, Ring Crossover, etc.
tutorialspoint.com/genetic_algorithms/genetic_algorithms_mutation.htm
Introduction to Mutation
In simple terms, mutation may be defined as a small random tweak in the chromosome, to
get a new solution. It is used to maintain and introduce diversity in the genetic population
and is usually applied with a low probability – pm. If the probability is very high, the GA gets
reduced to a random search.
Mutation is the part of the GA which is related to the “exploration” of the search space. It
has been observed that mutation is essential to the convergence of the GA while crossover
is not.
Mutation Operators
In this section, we describe some of the most commonly used mutation operators. Like the
crossover operators, this is not an exhaustive list and the GA designer might find a
combination of these approaches or a problem-specific mutation operator more useful.
Random Resetting
Random Resetting is an extension of the bit flip for the integer representation. In this, a
random value from the set of permissible values is assigned to a randomly chosen gene.
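Random resetting for an integer chromosome can be sketched in a few lines. The function and parameter names are hypothetical.

```python
import random


def random_resetting(chromosome, allowed_values, rng=random):
    """Assign a random permissible value to one randomly chosen gene."""
    mutant = list(chromosome)  # leave the original chromosome untouched
    i = rng.randrange(len(mutant))
    mutant[i] = rng.choice(allowed_values)
    return mutant
```

Note that the new value may happen to equal the old one, in which case the chromosome is unchanged.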
Swap Mutation
In swap mutation, we select two positions on the chromosome at random, and interchange
the values. This is common in permutation based encodings.
Scramble Mutation
Scramble mutation is also popular with permutation representations. In this, from the entire
chromosome, a subset of genes is chosen and their values are scrambled or shuffled
randomly.
Inversion Mutation
In inversion mutation, we select a subset of genes like in scramble mutation, but instead of
shuffling the subset, we merely invert the entire string in the subset.
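The swap, scramble and inversion operators can be sketched together, since they differ only in what they do to the selected positions. Treating the "subset of genes" as one contiguous slice is an assumption made here; all names are hypothetical.

```python
import random


def _random_slice(n, rng):
    """Return cut points i < j that delimit a contiguous segment."""
    i, j = sorted(rng.sample(range(n + 1), 2))
    return i, j


def swap_mutation(chrom, rng=random):
    """Interchange the values at two random positions."""
    m = list(chrom)
    i, j = rng.sample(range(len(m)), 2)
    m[i], m[j] = m[j], m[i]
    return m


def scramble_mutation(chrom, rng=random):
    """Shuffle the genes inside a random segment."""
    m = list(chrom)
    i, j = _random_slice(len(m), rng)
    segment = m[i:j]
    rng.shuffle(segment)
    m[i:j] = segment
    return m


def inversion_mutation(chrom, rng=random):
    """Reverse the genes inside a random segment."""
    m = list(chrom)
    i, j = _random_slice(len(m), rng)
    m[i:j] = m[i:j][::-1]
    return m
```

All three only rearrange existing genes, so a valid permutation stays a valid permutation after mutation.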
tutorialspoint.com/genetic_algorithms/genetic_algorithms_survivor_selection.htm
The Survivor Selection Policy determines which individuals are to be kicked out and which
are to be kept in the next generation. It is crucial as it should ensure that the fitter
individuals are not kicked out of the population, while at the same time diversity should be
maintained in the population.
Some GAs employ Elitism. In simple terms, it means the current fittest member of the
population is always propagated to the next generation. Therefore, under no circumstance
can the fittest member of the current population be replaced.
The easiest policy is to kick random members out of the population, but such an approach
frequently has convergence issues. Therefore, the following strategies are widely used.
For instance, in the following example, the age is the number of generations for which the
individual has been in the population. The oldest members of the population i.e. P4 and P7
are kicked out of the population and the ages of the rest of the members are incremented
by one.
Fitness Based Selection
In fitness based selection, the children tend to replace the least fit individuals in the
population. The selection of the least fit individuals may be done using a variation of any of
the selection policies described before – tournament selection, fitness proportionate
selection, etc.
For example, in the following image, the children replace the least fit individuals P1 and
P10 of the population. It is to be noted that since P1 and P9 have the same fitness value,
the choice of which of the two to remove from the population is arbitrary.
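A simple deterministic variant of fitness based survivor selection is sketched below: the children always enter the population and push out the least fit current members. Always accepting the children, and requiring no more children than population members, are assumptions of this sketch.

```python
def replace_least_fit(population, children, fitness):
    """Keep the fittest current members and add all children."""
    survivors = sorted(population, key=fitness, reverse=True)
    keep = survivors[: len(population) - len(children)]
    return keep + list(children)
```

Because `sorted` is stable, ties between equally fit individuals are broken by their original order, which is one arbitrary but reproducible choice.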
tutorialspoint.com/genetic_algorithms/genetic_algorithms_termination_condition.htm
For example, in a genetic algorithm we keep a counter which keeps track of the generations
for which there has been no improvement in the population. Initially, we set this counter to
zero. Each time we don’t generate offspring which are better than the individuals in the
population, we increment the counter.
However, if the fitness of any of the offspring is better, then we reset the counter to zero.
The algorithm terminates when the counter reaches a predetermined value.
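The counter logic above can be sketched independently of the rest of the GA. Here `step` is a hypothetical callable that runs one generation and returns the best offspring fitness it produced; `patience` is the predetermined value, and the `max_generations` cap is an extra safeguard not mentioned in the text.

```python
def run_until_stagnation(step, initial_best, patience=10, max_generations=1000):
    """Run generations until `patience` of them in a row bring no improvement."""
    best = initial_best
    stale = 0        # generations without improvement
    generations = 0
    while stale < patience and generations < max_generations:
        candidate = step()
        generations += 1
        if candidate > best:
            best = candidate
            stale = 0    # improvement: reset the counter
        else:
            stale += 1   # no improvement: increment the counter
    return best, generations
```

For instance, if successive generations yield best fitness values 1, 2, 2, 2, 5, 5, 5, 5 and `patience` is 3, the loop stops after the eighth generation with a best fitness of 5.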
Like other parameters of a GA, the termination condition is also highly problem specific and
the GA designer should try out various options to see what suits their particular problem the
best.
tutorialspoint.com/genetic_algorithms/genetic_algorithms_advanced_topics.htm
In this section, we introduce some advanced topics in Genetic Algorithms. A reader looking
for just an introduction to GAs may choose to skip this section.
In such a scenario, crossover and mutation operators might give us solutions which are
infeasible. Therefore, additional mechanisms have to be employed in the GA when dealing
with constrained Optimization Problems.
Using penalty functions which reduce the fitness of infeasible solutions, preferably
so that the fitness is reduced in proportion to the number of constraints violated or
the distance from the feasible region.
Using repair functions which take an infeasible solution and modify it so that the
violated constraints get satisfied.
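As an illustration of the penalty approach, the sketch below penalizes an overweight 0/1 knapsack solution in proportion to how far it exceeds the capacity, that is, its distance from the feasible region. The penalty weight of 10 per unit of excess is an arbitrary assumption, and the names are hypothetical.

```python
def penalized_fitness(chromosome, profits, weights, capacity, penalty_per_unit=10.0):
    """Total profit minus a penalty proportional to the excess weight."""
    profit = sum(p for bit, p in zip(chromosome, profits) if bit)
    weight = sum(w for bit, w in zip(chromosome, weights) if bit)
    excess = max(0, weight - capacity)  # 0 for feasible solutions
    return profit - penalty_per_unit * excess
```

A feasible solution keeps its plain profit, while an infeasible one stays in the population but with a lower fitness the more it violates the constraint.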
Schema Theorem
Researchers have been trying to figure out the mathematics behind the working of genetic
algorithms, and Holland’s Schema Theorem is a step in that direction. Over the years,
various improvements and suggestions have been made to the Schema Theorem to make it
more general.
In this section, we don’t delve into the mathematics of the Schema Theorem; rather, we try
to develop a basic understanding of what the Schema Theorem is. The basic terminology
to know is as follows −
Building Block Hypothesis
Building Blocks are low order, low defining length schemata with above-average
fitness. The building block hypothesis says that such building blocks serve as a foundation
for the GA’s success and adaptation, as the GA progresses by successively identifying and
recombining such “building blocks”.
It means that the more we understand a problem, the more problem specific our GA becomes
and the better it performs, but it pays for that by performing poorly on other
problems.
It should be kept in mind that the standard issues like crossover, mutation, Lamarckian or
Darwinian, etc. are also present in the GBML systems.