Goldberg Gasoml c3
Goldberg Gasoml c3
Goldberg Gasoml c3
of a GeneticAlgorithm
When first approaching genetic algorithms, many users hesitate, not knowing
where to start or how to begin. On the one hand, this aversive reaction seems
strange. After all, in the first two chapters we have seen how genetic algorithms
are mechanically quite simple, involving nothing more than random number gen-
eration, string copies, and partial string exchanges. On the other hand, for many
business, scientific, and engineering users and programmers this stark simplicity
is itself part of the problem; these individuals are familiar with using and program-
ming high-level computer codes involving complex mathematics, interwoven da.
tabases, and intricate computations. Moreover, this same audience is most
cornfortable with the reassuring repeatability of deterministic computer pro-
grams. The direct manipulation ofbit strings, the construction of custom codings,
and even the randomness of GA operatQrs can present a sequence of high hurdles
that prevent effective application.
In this chapter, we leap these obstacles by first constructing the data struc-
tures and algorithms necessary to implement the simple genetic algorithm de-
scribed earlier. Specifically, we write a Pascal computer code called the simple
genetic algorithm (SGA), which contains nonoverlapping string populations, re-
production, crossover, and mutation applied to the optimization of a simple func-
tion of one variable coded as an unsigned binary integer. We also examine some
implementation issues such as discretization of parameters, coding of strings, en-
forcement of constraints, and mapping of fitness that arise in applying GAs to
particular problems.
60 Chapter 3 / Computer Implementation of a Genetic Algorithm
DATA STRUCTURES
INDIVIDUAL INDIVIDUALS
NUMBER STRING X f<X)PTHER
1 01111 15 225
2 01001 9 81
3
n 00111 7 49
FIGURE 3.2 A simple genetic algorithm, SGA, data type declarations in Pascal.
~.
REPRDJIlJ:TJtJI
HUT"TICH
N -1 N -1
N N
The three operators of the simple tripartite algorithm can each be implemented
in straightforward code segments. This comes as no surprise, since we have been
touting the simple mechanics of these operators. Before we look at each routine,
we must remember the common thread running through the three operators:
Reproduction, Crossover, and Mutation 63
eách depends on random choice. In the code segments that follow, we assume
the existence of three random choice routines:
random returns a real pseudorandom number between zero and one (a
uniform random variable on the real interval [O, 1D.
flip returns a boolean true value with specified probability (a Ber-
ilóulli random variable).
rnd returns an integer value between specified lower and upper Íimits
(a uniform rándom variable over a subset of adjacent integers).
A more complete discussion of these routines is contained in Appendix B where
severat progr;imming examples are given.
In the simple genetic aIgorithm, teproduction is iniplemented in the function
setect as a linear search througb a roulette wheel with slots weighted in propor-
tion to string fitness values. In the code shown in Fig. 3.5, we see that select
returns the population index value corresponding to the selected individual. To
do this, the partial sum of the fitness values is accumulated in the real variable
partsum. The real variable rand contains the location where the wheel has
landed after a rándom spin according to the computation:
rand := rahdom * sumfitness
Í-Iere the sum of the population fitnesses (calculated in the procedure statistics)
is multiplied by the normalized pseudorandom number generated by random.
Finally.the repeat-until construct seárches through the weighted roulette wheel
until the partial sum is greater than or equal to the stopping point rand. The
fuilction returns with the current population index valué j assigned to select.
This is perhaps the simplest way to implement selection. There are more
efficient codes to implement this operator (a binary search will certainly speed
things up), and thert: are many other ways to choose o~spring with appropriate
bias toward the best. We will examine some of these in later chapters, but for
now we stay with this basic mechanism. .
The code segment select gives us a straightforward way of choosing offspring
for the next generation. From our previous descriptions, we know our next step
is crossover. Ín SGA the crossover operator is implemented in a procedure that
we, cleverly enough, have called crossover (Fig. 3.6). The routine crossover takes
two parent strings called parentl ánd parent2 and generates two offSpring strings
called childl and child2. The probabilities of crossover and mutation, pcross and
pmutation, are passed to crossover, along with the string length lchrom, a cross-
over count accilmulator ncross, and a mutation count accumulator nmutation.
Within crossover the operations mirror our description in Chapter 1. At the
top of the rdutine, we determine whether we are going to perform crossover on
the current pair of parent chromosomes. Specifically, we toss a biased coin that
comes up heads (true) with probability pcross. The coin toss is simulated in the
boolean function flip, where flip in turn calls on the pseudorandom number
routine random If a cross is called for, a crossing site is selected between l and
the last cross site. The crossing site is seleCted in the function rnd, which returns
a pseudorandom irlteger between specified lower and upper limits (between l
and lchrom - 1). If no cross is to be performed, the cross site is selected as
léhrom (the full string length l) so a bit-by-bit mutation will take place despite
the absence of a cross. Finally, the partial exchange of crossover is carried out in
the two for-do constructs at the end of the codeo The first for-do handles the
partial transfer of bits between parentl and childl and between parent2 and
child2. The second for-do construct handles the transfer and partial exchange of
material between parentl and child2 and between parent2 and childl. In all
cases, a bit-by-bit mutation is carried out by the boolean (or allelean) function
mutation.
Mutation at a point is tarried out by mutation as shown in Fig. 3.7. This
function uses the functión flip (the biased coin toss) to determine whether or
not to change a true to a false (a l to a O) or vice versa. Of course the function
flip will only come up heads (true) pmutation percent of the time as a result of
the call to the pseudbrandom number generator random within flip itself. The
fuñction also keeps ta:bson the number of mutations by incrementing the variable
nmutation. Aswith reproductlon, there are ways to improve our simple mutation
operator. For example, it would be possible to avoid much random number gen-
eration if we decided when the next mutation should occur rather than calling
flip each time. Again, in this chapter, we avoid sophisticated niceties and stick
with the basics.
The three main pieces of our genetic aigorithm puzzle have proven to be
none too puzzling. We have seen in this section how the three may be easily
coded and easily understood. The next sectioil continues piecing together the
bigger GA picture as we coordinate reproduction, crossover, and mutation in a
single generation.
With the Big Three designed and built, creating a new population from an óld
one is no big deal. The l?roper sequencing is shown in Fig. 3.8 in the procedure
generation. Starting at an individual index j = 1 and continuing until the popil-
lation size, popsize, has been exceeded, we pick tWo mates, mate] and mate2,
using succéssive calls to select We cross and mutate the chromosomes using
crossover (which itself contains the necessary invocations of mutation). In a final
tlurry of mad activity, we decode the pair of chromosomes, evaluate the objective
(fitness) function vaiues, ánd increment the population index j by 2.
Having already examined selec~ crossover, and mutation in detail, we need
only concern ourselves with the two problem-dependent routines hinted at
above. For any problem we must create a procedure that decodes the string to
procedure generation;
( Create a new generation through select, crossover. and mutation )
( Note: generationassumes an even-numberedpopsize )
var j, matel, mate2, jcross:integer;
begin
j :- 1;
repeat (select. crossover. and mutation until newpop is filled )
matel:- select(popsize. sumfitness, oldpop); ( pick pair of mates)
mate2 :- seléct(popsize, sumfitness. oldpop);
( Crossover and mutation - mutation embedded with!n crossover )
crossover(oldpop[matel].chrom, oldpop[mate2].chrom.
newpop[j . ].chrom. newpop[j + l].éhrom,
lchrom, ncross. nmutation, jcross, pcross, pmutatlon);
( Decode string. evaluate fitness. & record parentage date on both children )
with newpop[j ] do begin
x :- decode(chrom. lchrom);
fitness :- objfunc(x);
parentl :- matel;
parent2 :- mate2;
xsite :- jcross;
end;
with newpop[j+l] do begin
x :- decode, (chrom lchrom);
fitness :- objfunc(x);
párehtl :- matel;
parent2 :- mate2;
xsite :- jcross;
end;
( Increment population index )
j :- j + 2;
until j>popsize
end;
FIGURE 3.8 Procedure generation generates a new pópúlation from the pre-
vious population.
A lime fo Reproduce, a lime fo Cross 67
function objfunc(x:rea1):rea1;
( Fitness function - f(x) - x**n }
const coef - 1073741823.0; ( Coefficient to norma1ize doma in )
n-lO; ( Power of x )
begin objfunc :- power( xjcoef, n ) end;
We have described the data structures. We have built the genetic operators. We
have decoded the strings, and we have figured the fitness vaIues. Now is the time
lO wrap a ribbon around this parcel, test it, and ship it on for further use. In Fig.
3.11 we see the main program of SGA.At the top of the code, we start innocently
enough by setting the generation counter to O,gen: = o. We build steam as we
read in program data, initialize a random population, calculate initial population
statistics, and print out a special initial report using the procedure initialize. We
won't dwell on the initialization code here. The interested reader should refer to
Appendix e, which contains a complete copy of the SGAcodeo
At long last, with necessary preliminaries complete, we hit the main loop
contained within the repeat-until construct. In rapid succession we increment
the generation counter, generate a new generation in generation, calculate new
generation statistics in statistics, print out the generation report in report, and
advance the population in one fell swoop:
oldpop : = newpop;
AlI this continues, step after relentless step, until the generation counter exceeds
the maximum, thereby forcing the machinery to a grinding hall.
In our rush lO see the big picture, we have missed some important details.
The statistical routine statistics (Fig. 3.12) calculates the average, maximum, and
minimum fitness values; it also calculates the sumfitness required by the roulette
wheel. This version of statistics is again something of a minimalIy acceptable
solution. Many other interesting population statistics could and probably should
be tracked. For example, allele convergence statistics are often tabulated during
a generation. Best string so far or best k strings so far could be slOred for future
reference. Population standard deviation or even population histograms might
also be of interest in doing more detailed run postmortems. The separation of
statistical functions in the routine statistics permits the easy addition of any or all
of these computations.
Get with the Main Program 69
The procedure report presents the fuU population report, including strings,
fitnesses, and parameter values. A listing of report and its single subprocedure
writechrom are presented in Fig. 3.13. Once again, a wide array of tabular and
graphic reporting options may be useful in genetic algorithm work. The simple
report procedure is a good tool because it permits side-by-side comparison of
consecutive generations. In turn, this allows checking of operators and analysis
of the events leading to the construction of the best individuals.
procedure statistics(popsize:integer;
var max,avg,min,sumfitness:rea1;
var pop:popu1ation);
( Ca1cu1ate popu1ation statistics )
var j:integer;
begin
( Initia1ize )
sumfitness :- pop[l].fitness;
min :- pop[l].fitness;
max :- pop[l].fitness;
( Loop for max, min, sumfitness )
for j :- 2 to popsize do with pop[j] do begin
sumfitness :- sumfitness + fitness; ( Accumu1ate fitness sum }
if fitness>max then max :- fitness; ( New max }
if fitness<min then min :- fitness; ( New min }
end;
( Ca1cu1ate average)
avg :- sumfitnessjpopsize;
end;
procedure report(gen:integer);
( Write the popu1ation report )
const 1ine1ength - 132;
var j:integer;
begin
repchar(lst,'-' ,line1ength); write1n(lst);
repchar(lst,' , ,50); write1n(lst,'Popu1ation Report');
repchar(lst, , ',23); write(lst,'Generation' ,gen-1:2);
repchar(lst,' , ,57); write1n(lst,'Generation ',gen:2);
write1n(lst);
write(lst,'# string x fitness');
write(lst,' # parents xsite');
write1n(lst, ' string x fitness');
We have trudged through the SGA code, step by step, inch by inch, gaining a
better feel for some of the ins and outs of genetic algorithm prograrnming. Of
course, this is not the only way to sIdn a GA cat, and a number of public domain
How Well Does It Work? 71
codes are available with numerous bells and whistles for effective optimization
in a variety of domains (Booker and De Jong, 1985; De Jong, 1982; Grefenstette,
1984a, 1984b). Actually, we will be adding several important featUres of our own
in this and later chapters. Let us resist temptation and hold off the faney features.
In this section, we stick with the bare-bones GA and see how well it works.
We have already specified our simple test problem. The bit string decodes as
an unsigned 30-bit integer. The fitness function f is the power function f(x) =
(x/e)", where e has been chosen to Ílormalize .x; and n has been chosen as 10.
Some of you may cry foul, wondering why we have chosen a different function
from the one followed in Chapters 1 and 2 (f(x) = x2). Actually, we have
changed the problem to make things tougher for the GA, as illustrated in Fig.
3.14. With the larger exponent, the average function value is lower, and a smaller
proportion of the domain maps to values above some specified quantity. As a
result, the raódom starting population will not contain very good points to begin;
this is a better test of GA performance.
To specify our computer simulations more precisely, let's choose a trial set
of GA parameters. in De Jong's (1975) study of genetic algorithms in function
optimization, a series of parametric studies across a five-function suite of prob-
lems suggested that good GAperformance requires the choice of a high crossover
probability, a low mutation probability (inversely proportional to the population
size), and a moderaté population size. Following these suggestions we adopt the
following parameters for our first computer simulations:
pmutation = 0.0333 (probability of mutation)
pcross = 0.6 (probability of crossover)
popsize = 30 (population size, n)
f(x)
o
o
x
FIGURE3.14 Comparison of the functions x> and Xl" on the unit interval.
72 Chapter 3 / Computer Implementation of a Genetic Algorithm
The string length for the hand simulations of a genetic algorithm in Chapter
1 was short (short by genetic algorithm standards): l = 5. This translated into a
ridiculously small space with only 25 = 32 points, where there was little practical
need for genetic search. Any enumerative search or random walk would have
found good points quickly. Of couese, at that time oue aim was pedagogical clarity,
and the size bf the search space was of little interest. Now that we are interested
in seeing a stiffer test of GA performance, the string length is increased and the
exponent of the test function is increased. With a string length lcbrom = 30, the
search space is much larger and random walk or enumeration should not be so
profitable. With lcbrom = 30 there are 230 = 1.07( 101°) points. With over 1.07
billion points in the space, one-at-a-time niethods are unlikely to do very much
very quickly. Moreover, the increase in exj>onent has adjusted the space so that
only 1.05 percent of the points have a value greater than 0.9, as shown in Fig.
3.14. These two modifications make the problem a better test of GAperformance.
We start the simple genetic algorithm and let it run for seven generations.
The statistical report for the run is shúwn in Fig. 3.15 and the initial generation
(gen = O) and the first generation are shown side by side in Fi~. 3.16. The initial
population starts out with a populátion average fitness of 0.0347. The average
fitness of the function on the specified interval may be calculated to be 0.0909.
In some sense, we have been unlucky (but not unrealistically so) in oue random
choice of a population. Additionally glancing at our best member fitness in the
initial population,fmax = 0.2824, we should expect to have 30( 1 - 0.2824°.1)=
3.56 or approximately four stríngs in a random population of 30 with fitness
SGA Parameters
................
FIGURE 3.1 $ Initial report from an SGA, simple genetic algoritlun, ruD.
How Well Does It Work? 73
oo...................................
Population Report
Generat ion o Generatiqn 1
greater than 0.2824. We have not only been unlucky on average, we have been
unlucky at the topo Despite the unfortunate initial depression of the population,
once the genetic algorithm gets started, it quickly finds good performance, as we
can see vividly after the first round of reproduction, crossover, and mutation. In
the first generation, a very good string is found with fitness 0.8715. As the run
continues, further improvement is found in both maximum and average popula-
tion fitness as demonstrated in Fig. 3.17. Toward the end of the run, a form of
convergence is observed as displayed in Fig. 3.18. In generation 7, if we scan up
and down the bit strings we notice that there is a fair amount of agreement at
74 Chapter 3 / Computer Impl~mentation of a Genetic Algorithm
'.0
2
'<-
0.5
b-6 max.
>+--K aVQ.
0.0
Ó 2 3 4 5 6 7
generation number
FIGURE 3.17 SGA run, best-of-generation (max) results and generation aver-
age (avg) results to generation 7.
most bit positions. This has occurred even though we have not reached the best
point in the space; we have gotten close, however. !n generation 6 an individual
has appeared with fitness f = 0.9807. This is ne~r optimal but not optimal (this
point is in the top 0.19 percent of the points in the space). Convergent behavior
without guarantee of optimality bothers many people who approach gene tic al-
gorithms from other, more traditional, optimization backgrounds. There are ways
to slow down this premature convergence, as it has been called, and we shall
look at some of these methods in this and later chapters; however, the fact of the
matter is that genetic algorithms have no convergence gu~rantees in arbitrary
problems. They do sort out interesting areas of a. space quickly, but they are a
weak method, without the guarantees of more convergent procedures. This does
not reduce their utility. Quite tQe contrary, more cOQvergent methods sacrifice
globality and flexibility for their convergence. Additionally, many methods are
limited to a narrow class of problem. As a result, genetic algorithms can be used
where more convergent techniques dare not tread. Moreover, if you are solving
problems where known local, but convergent, methods exist, the idea of a hybrid
scheme is natural. Start your search using a gen~tic algorithm to sort out the
interesting hills in your problem. Once the GA ferrets out the best regions, then
take your locally convergent scheme and climb the local peaks. In this way, you
can combine the globality and parallelism of the GA with the more convergent
behavior of the local technique. In a moment we will look at one practical way
of reducing the premature convergence problem through fitn~ss scaling. First we
must devise techniqu~s to transfQrm arbitrary objective functio~s to proper fit-
ness function formo
Mapping Objedive Functions to Fitness Form 7S
............................................................................................................................................................................
Population Report
Generation 6 Generation 7
FITNESS SCALlNG
m 2f'
m
w
z
1-
¡¡:
fiJ f'avg,
...J u_-------------
«
ü ---
'" 1'''';01---
al I
a fmax
RAWFITNESS
Toward the end of a run, this choice of Cmultstretches the raw fitness signifi.
cantly. This may in turn cause difficulty in applying the linear scaling rule as
shown in Fig. 3.19. As we can see, at first there is no problem applying the linear
scaling rule, because the few extraordinary individuals get scaled down and the
lowly members of the population get scaled up. The more difficult situation is
shown in Fig. 3.20. This type of situation is common in a mature run when a few
lethals (bad strings) are far below the population average and maximum, which
are relatively close together. If the scaling rule is applied in this situation, the
stretching required on the relatively close average and maximum raw fitness val.
ues causes the low fitness values to go negative after scaling. A number of solu-
tions are available to solve this problem; here, when we cannot scale to the
desired Cmult,we still maintain equality of the raw and scaled fitness averages and
we map the minimum raw fitnessfmin to a scaled fitnessf'min = O.
The simple scaling procedure may easily be added to the simple genetic al.
gorithm code through the use of three routines shown in Fig. 3.21:preseale, seale,
and sealepop. The procedure preseale takes the average, maximum, and mini.
mum raw fitness values, ealled umax, uavg, and umin, and ealeulates linear seal.
ing eoeffieients a and b based on the logic deseribed above. If it is possible to
seale to the desired multiple, Cmult(in the code it is ealled fmultiple), then that
is the eomputation performed. Otherwise, scaling is performed by pivoting about
the average value and stretehing the fitness until the minimum value maps to
zero. The proeedure sealepop is ealled after the preparation routine presea le to
2f'avg
C/)
C/)
w
------
~
ü:
f'avg
o
W
-1
<§ O
C/) favg fmax
!fmin RAWFITNESS
f'min L l Negative fitness violates
nonnegativity requirement
FIGURE 3.20 Difficulty with linear scaling procedure in mature run. Points
with low fitness can be scaled to negative values.
Fitness Scaling 79
FIGURE 3.21 Scaling routines: procedure preseale, function seale, and pro-
cedure sealepop.
scale all the ind~vidual raw fitness values using the simple function seale. Here
we have assumed that the raw fitness values are stored in the individual record
in a real value called objeetive (popfj].objeetive). The scaled fitness is placed in
the real parameter fitness, and the sum of the fitness values sumfitness is recal-
culated. Installation and testing of the scaling procedure is left as an exercise.
In this way, simple scaling helps prevent the earty domination of extraordi-
nary individuals, while it later on encourages a healthy competition among near
equals. This does not complete our examination of the possible objective func-
tion transfprl!lations, and we shall return to some more examples in Chapter 4.
At the moment, we examine some of the coding options available to us in GA
work \Jeyónd the simple codings we have used thus faro
80 Chapter 3 / Computer Implementation of a Genetic Algorithm
CODINGS
We only have examined a very limited number of string coding alternatives for
mapping a finite-Iength string to the pararneters of an optimization problem. We
have introduced a simple binary coding in response to a simple binary switctling
problem. In this coding we have concatenated a string of O's and l's ~oding,
wQere the ith O(or 1) has meant that the ith switch is off (or on). We have also
decoded a binary string as an um¡igned int~ger where the string A = apl- ¡ . . .
aZa¡ has decoded to the pararneter value x = ¡ai.~I-l. Although these codin~
have given us some flexibility, they do not provide the variety of options we
require to tackle the spectrum of problems we face in science, business, and
engineering. In this section, we examine two fQndarnental principies of genetic
algorithm coding to help guide our coding design in different problems. Later on
we will look at a multiparameter, mapped, binary string coding ~at has proved
and should continue to prove useful in a variety of problems.
In one sense, coding a problem for genetic search is no problem because the
genetic algorithm prograrnmer is limited largely by his imaginati°!1. As we saw in
the last chapter, genetic algorithms exploit similarities in arbitrary codings as long
as bq.ilding blocks (short, high-perfoqIlance schemata) lead to I).ear optima. In
another sense, this freedom of choice is a mixed blessing to the new user; the
array of possible coding alternatives i~ both invigorating and bewildering. Given
this freedom, how does a new user choose a good coding? Fortunately, geneti~
algorithms are forgiving because they are robust, and in that sense there is ~slJally
no need to agonize over coding decisions. Additionally, we offer two basic prin-
cipies for choosing a GA coding: me principie 01 meaningful building blocks
and the principie 01 minimal qlpbabets.
The principie of meaningful building blocks is simply this:
Theusershould select a coding so that shor_? low-order schemata
are relevant to the underiying problem and reiatively unre~ted
to schemata over other tixed positions.
Although this principie cal).be checked rigorously using the Walsh analysis men-
tioned in Chapter 2, mis procedure is rarely practical and as a result, coding
design for meaningful building blocks is something of an ;p-t.Nonetheless, when
we design a coding we 1'hould check the distances betwee:;nrelated bit positions.
CQapter 5 presents w-aysto rearrange the ordering of a string coding, as wdl as
several operators that search for good codings while they search for good
so.utions.
The second coding cule, the principie of minimal alpha~ets, is simply stated:
The user ~hould select the smallest alphabet that permits a natu-
ral expression of the problem.
Until now we:;have been almost opsessed with the idea of binary codings. Has
this been accidental or has there been method to our coding madness? That there
has been qtethod can be best illustrated by returning to our tireq but illustrative
Codings 81
five-bit example started in Chapter 1. In Table 3.1, we see the same old fopr
binary strings with their sap¡e old fitness values (which we obtained by decoding
the strings as unsigned binary integers and thereby evaluated the fitness accord-
ing to the relation f(x) = X2). Recall that one of our original motivations for
considering schemata was the natural attempt to associate high fitness with sim-
ilarities among strings in the population. In the table, we also consider a nonbi-
nary coding. In fact, we consider an extreme example. Consider a oQe-to-one
mapping of the binary integers [O, 31] to the 32-letter alphabet consisting ofthe
26 letter alphabe~ {A-Z} and the six digits {1...,.6},
as shown in Table 3.2.
In the binary case, as we sean the list (Table 3.1), the hunt fQr important
similarities is made possible by the small cardinality of the alphabet. In the non-
binary case, as we scan the lis! 'Ve only have the four single-Ietter strings and
their fitness values; there are no coding similarities to exploit. This is sur ely an
extreme example, but the sam~ principIe holds true in less tlagra,nt cases.
To see this a bit more matl1ematically, we should really compare the number
of schemata available in a binary codin~ to the number of sche~ata available in
a nonbinary coding. Of eours~, both the binary and nonbinary codings shoul(l
1 100 1 Z
110 1 O 1
110 1 1 2
1 1 111 6
82 Chapter 3 / Computer Implementation of a Genetic Algorithm
code the same number of alternatives; however, the different alphabet cardinali-
ties require different string lengths. For equality of the number of points in each
space, we require 21 = kl', where l is the binary code string length and l' is the
nonbinary code string length. The number of schemata for each coding may then
be calculated using the respective string length: 31 in the binary case and
(k + 1Y' in the nonbinary case. It is easy to show that the binary alphabet offers
the maximum number of schemata per bit of inform'ation of any cocfing. Since
these similarities are the essence of our search, when we design acode we should
maximize the number of them available for the GA to exploit.
Our two principies give us some cIues for designing effective codings for simple
genetic algorithms. They do not, however, suggest practical methods for coding
a particular problem. One successfuIly used method of coding multiparameter
optimization problems of real parameters is the concatenated, multiparameter,
mapped, fixed-point coding. .
We have aIready considered an unsigned fixed-point integer coding; how-
ever, what happens if we are not very much interested in a parameter x E [O,
21)?One way to circumvent this apparent limitation is to map the decod~d un-
signed integer linearly from [0,21] to ~ specified interval [U~in'Umax].In this way,
we can carefuIly control the range and precision of the decision variables. The
precision of this mapped coding may be caIculated:
1T=
Umax - Umin
21 - 1
SINGLE
o OOO
1111 -
-
U1 PARAMETER (l,
UMln
UMQX
= 4)
O O 01: O1 O1 : 11100111111
1 . I 1
U1 I U2 1 1 U9 1 U10 1
I 1 I 1 I
var parms:parmspecs;
FIGURE 3.23 Coding routines for use in SGA: procedure extract~arm, func-
tion map~arm, and procedure decode~arms.
84 Chapter 3 / Computer Implementation of a Genetic Algorithm
DISCRETIZATION
F,
-continuous
'
FORCE I ", F;
I
F 1'0 I
:\,,,....--linear
\
I \
I '
\
I \
I \
I
I I
I I
I I
I I
I I
I I
to tI t2 t3 t. ts
TIME t
CONSTRAINTS
Thus far, we have only discussed genetic algorithms for searching unconstrained
objective functions. Many practical problems contain one or more constraints
that must also be satisfied. In this section, we consider the incorporation of con-
straints into gene tic algorithm search.
Constraints are usually classified as equality or inequality relations. Since
equality constraints may be subsumed into a system model-the black box-we
are really only concerned with inequality constraints. At fiest, it would appear
that inequality constraints should pose no particular problem. A genetic algo-
rithm generates a sequence of parameters to be tested using the system model,
objective function, and the constraints. We simply run the model, evaluate the
objective function, and check to see if any constraints are violated. If not, the
parameter set is assigned the fitness value corresponding to the objective func-
tion evaluation. If constraints are violated, the solution is infeasible and thus has
no fitness. This procedure is fine except that many practical problems are highly
constrained; finding a feasible point is almost as difficult as finding the best. As a
result, we usually want to get some information out of infeasible solutions, per-
haps by degrading their fitness ranking in relation to the degree of constraint
violation. This is what is done in a penalty metbod.
In a penalty method, a constrained problem in optimization is transformed
to an unconstrained problem by associating a cost or penalty with all constraint
violations. Tbis cost is included in the objective function evaluation. Consider,
for example, the original constrained problem in minimization form:
minimize g( x)
subject to b,(x) ~ O i = 1,2,...,n
where x is an m vector
We transform this to the unconstrained form:
n
SUMMARY
This chapter has unveiled some of the genetic algorithm's mystery, through
careful examination of the data structures, procedures, and details necessary to
implement a practical, yet simple, genetic algorithm. Specifically, we have imple-
mented a bare-bones genetic algorithm called the simple genetic algorithm
(SGA), written in Pascal programming language for execution on commonly
available microcomputers.
As might be expected, the primary data structure of the simple genetic al-
gorithm is the string population. The SGAformulation uses two nonoverlapping
populations to make birth and replacement as easy as possible. The populations
themselves consist of an array of individuals that contain the bit strings, the de-
coded parameter, and the fitness function value along with other important aux-
iliary information.
The primary work of the SGAis performed in three routines, seiect, crossover,
and mutation. Select performs simple stochastic selection with replacement,
what we have been calling roulette wheel selection. Crossover and mutation per-
form their namesake operations as described in Chapter 1. Their action is coor-
dinated by a procedure called generation that generates a new population at each
successive generation.
After building SGA we have tested it in a small run on a simple function,
f(x) = C"XIO.While concrete conclusions are impossible on a single trial of a
stochastic process, the GA does find near optimal results quickly after searching
a small portion of the search space.
Various details of implementation have also been discussed. Objective-to-
fitness transformations have been examined to convert normal objecti~e func-
tion formulations, whether maximization or minimization formulations, to proper
fitness function formo Additionally, fitness scaling has been suggested for main-
taining more careful control over the allocation of trials to the best strings.
Some of the issues underlying the coding problem have been examined. The
principie of minimai aipbabets and the principie of effective building biocks
have been laid down to help the GA user design more effective codings. A coding
that has been useful in a number of problems has been presented in Pascal com-
puter code formo This coding, the multiparameter, mapped, fixed-point coding,
should be useful in many parameter optimization problems.
Problems 87
The need for discretization is not restricted to that imposed by codings. Many
practical optimization problems require the specification of a control functiop or
functions over a continuum. These optimizat~on pro~lel11s-more prqperly, op-
timal control problems-must first pe reduced to finite parameter problems,
which may in tuco be discretized by the coding procedure discussed in this chap-
ter. Discretization of this type is performed by selecting appropriate interpolation
functions (often linear interpolation will suffice) and the parameters associated
with the chosen interpolation form are then coded into a concaten~ted, qnite-
length bit string.
Last, we have recognized the need for special methods to adjoin inequality
constraints to a problem. Genetic algorithms are n¡lturally cast as an uncon-
strained search technique. After all, nature tries and tries again, only finding the
constraints of its environment through the survival or death of its trials. Similarly,
genetic algorithms must h:we fitness functions that reflect informatipn about bQttI
the quality and the feasibUity of solutions. Exterior penalty methods have been
used successfully in a number of problems. With these ~ethods, whenever a con-
straint is violated, the unconstrained objective function value is pepalized by an
amount related to a function of tne constraint violation.
The examination of conCrete code examples and ~ number of implementa-
tion issues has made GAs more accessible for our use _n practical scientific, en-
gineering, and business problems. In the next chapter, we look at a number of
early and current applications of straightforward genetjc algorithms.
. PROBLEMS
3.1. Consider the fitness functionf(x) = x" on the interval x E [0,1]. Calculate
the expected population average of a randomly selected population of points.
Calculate the probability of selecting a point x > xo. Compare numerical values
of the populatiQn average for n = 2 and n = 10. Compare probabiUties of se-
lecting one point x > 0.9 for the same two exponents.
3.2. A search space contains 2,097,152 poiQts. A binary-cQded genetic algorithm
is compared to an octal-coded genetic algorithm. Calculate and compare the fol-
lowing quantities in the two cases, binary and octal:
a) Total number of schemata
b) Total number of search points
c) Number of schemata contained within single individual
d) Upper and lower bpunds on number of schemata iQ population of size
n = 50
1.0. The desired precision for X, y, and z are 0.5, 10\ and 0.001 respectively.
Design a concatenated, mapped, fIXed-point coding for this problem. What is the
minimum number of bits required to obtain the desired precision? With the se-
lected coding, determine the bit strings that represent each of the following
points: ( - 20, O, - 1), (125.0, 1.~E6, 1.0), (50, 100000, 0.597).
. COMPUTERASSIGNMENTS
A. Install the scaling routine of Fig. 3.21 in the simple genetic algorithm (SGA)
code and reproduce the experiment on functionf(x) = c.x1O.Compare and con-
trast the res~lts obtained with and without scaling instal1ed.
B. Test the multiparameter, mapped, fixed-point coding ro~tines of Fig. 3.23 on
a three-parameter problem with parameter maximums, minimums, and substriQg
lengths as follows:
Max¡ = 20, 100, 300
~ngthl = s, lO, 15
Test the coding routine on the all-O string, the all-l string, and the string
010101...0101. Check the computer calculation with a hand calculation. AIso de-
termine the precision of each of the subcodes within the coding. After testing,
install the routines in the SGAcode with appropriate initialization procedures.
C. Minimize the::function f(x, y, z) = x2 + y2 + Z2. where X, y, and z are
permitted to vary between - 512 and 512. Use a lO-bit coding for each substring
(this is De Jong's function Fl ).
D. Improve the efficiency of the selection procedure by implementing a binary
search using cumulative selection probability distribution values.
E. Implement and test a routine to perform mutation as a 11lu~ationdock. (Hint:
Use an exponential distribution and calqdate the time until next mutation.)
F. Implement a coding routine to implement a floating-point code with speci-
fied mantissa and exponent.
G. Compare the performance of a binary-coded genetic algorithm to a nonbi-
nary-coded genetic algorithm. Specifically, compare the performance of a binary-
coded GA on the fitness function ofthis chapter f(x) = x10 to an octal-coded GA
on the same problem. Use a 30-bit code and a lQ-position octal code {O,1,2,3,
4, 5, 6, 7}. Compare and contrast the rate of cOQvergence and ultimate conver-
gence under both codings.