Graph Based Genetic Algorithms

Download as pdf or txt
Download as pdf or txt
You are on page 1of 7

Graph Based Genetic Algorithms

Daniel Ashlock Mark Smucker John Walker


Iowa State University Firefly Network Iowa State University
Department of Mathematics One Broadway, 6th Floor Mechanical Engineering Department
Ames, Iowa, 50010 Cambridge, MA 02142 Ames, Iowa, 50010
[email protected] [email protected] [email protected]

Abstract- Genetic algorithms use crossover to blend pairs ulation as the algorithm progresses.
of putative solutions to a problem in hopes of creat- Various approaches to preventing diversity loss have been
ing novel solutions. At its best, crossover takes distinct tried. This include using a high mutation rate, reducing the
good features from each of the two structures involved fitness of organisms in proportion to the number of organ-
in the crossover. This creates a conflict: progress results isms representing similar solutions (niche specialization), di-
from crossing over distinct types of structures but such rectly rejecting duplicate solutions, and imposing a geogra-
crossover produces new structures that are like their par- phy upon the population[1, 10]. We expand on this last no-
ents, reducing the diversity on which successful crossover tion by imposing geographical structures coded as combina-
depends. In this paper we describe and test genetic al- torial graphs on the simple genetic algorithm. In Section 2
gorithms that use a combinatorial graph to limit choice we give the necessary mathematical definitions and three in-
of crossover partner. This gives a computationally cheap variants that estimate ability permit heterogeneous crossover
method of picking a level of tradeoff between having het- (crossover between genetically distinct individuals), the abil-
erogeneous crossover (crossover between genetically dis- ity to preserve population diversity, and an aggregate measure
tinct individuals) and preservation of population diver- of both these qualities, geometrically averaged. In Section 3
sity. Statistics for estimating the degree to which a given we define graph based genetic algorithms and describe the
graphical population structure favors population diver- three problems we use to test them. Section 4 gives the pre-
sity or heterogeneous crossover are given. These statistics cise design of the experiments performed and their outcomes.
are computed for ten example graphs. These graphs are Section 5 gives conclusions and discusses the experimental
then used as population structures for genetic algorithms results. Section 6 discusses future directions for this work.
of three test problems: a trivial string evolver, the plus-
one-recall-store (PORS) test suite for genetic program- 2 Mathematical Background
ming [3, 4], and simple string controllers for Astro Teller’s
Tartarus problem. [13] A combinatorial graph or graph, G, is a set V (G) of vertices
and E(G) of edges where E(G) is a subset of the unordered
1 Introduction pairs that can be drawn from V (G). Two vertices of the graph
are neighbors if they are members of the same edge. For an
In nature we find constraints such as geography or mutual introduction to graph theory, we refer the reader to [14]. We
infertility imposed on an organism’s ability to sexually repro- will term a graph used to constrain mating in a population
duce with other organisms. In the Simple Genetic Algorithm the population structure. The general strategy is to use the
(SGA) [5] the only constraint on reproduction is that more fit graph to specify the geography on which a population lives,
individuals have a higher probability of being selected for re- permitting mating only between neighbors, and find graphs
production. In nature individuals who are separated by great that preserve diversity without hindering progress due to het-
distances, no matter what their respective fitnesses may be, erogeneous crossover.
have a very low probability of reproducing with each other. We use a nonstandard operation on graphs that generates
Within higher order species one also finds cultural constraints a valuable substructure, simplexification. Simplexification at
on the probability of two individuals reproducing, and actu- a vertex v replaces v with a cluster of vertices, one for each
ally in almost all “intelligent” animals the individuals in the neighbor of v so that all the new vertices are neighbors of one
population select their partner. another and each is a neighbor of exactly one of v’s former
One of the central problems of population genetics is ex- neighbors. Simplexification of a vertex with four neighbors
plaining why there are not problems with loss of diversity in is shown in Figure 1. The effect of simplexification creates
natural populations even though standard mathematical mod- small groups of vertices that are closely coupled to one an-
els show that diversity should vanish rapidly. Sewall Wright other but less closely coupled to the rest of the graph. This
has his theory of Isolation by distance [16]. Kimura and Crow crates an analog of biological refuges in the graphical con-
[6] examine the rate different graphical population structures nection topology.
lose their genetic diversity under simple reproduction with- A coloring of a graph is a mapping, f : V (G) → C,
out selection. Similarly, one of the fundamental problems in where C is a set of colors. In this research the different col-
genetic algorithms is maintaining useful diversity in the pop- ors will represent distinct genotypes and hence each possible
ger value (this is a population with maximal edge and en-
tropy). We repeatedly generate mating events that consist of
picking a vertex uniformly at random and copying the integer
currently on one of its neighbors, also selected uniformly at
random, onto it. As this happens we compute the edge and
entropy of the population as the function of the number of
mating events. The averages approximate the way a popu-
lation structure G permits populations living on it to evolve
their entropy and edge under neutral selection. For a popula-
tion structure G, denote the neutral selection entropy function
Figure 1: Simplexification of a vertex with four neighbors N E G (t) and the neutral edge function N QG (t) where t is the
number of mating events.
A high edge value in a population usually implies that it is
population corresponds to a coloring f. We will measure the losing entropy (diversity) quickly. This is because crossover
diversity of a population by the entropy of the frequency dis- events between distinct types of parents often replaces one
tribution of colors. More precisely, if there are n members of of them with a child less distinct from the surviving par-
a population, k genotypes (or colors) with ai members of the ent. From this we see that these two qualities of a popula-
population having genotype i, 1 ≤ i ≤ k, then the entropy tion structure, that it maintain high entropy and high edge
E(f) of the population f is over time, conflict with one another. We define the following
X ai n summary statistics and compare them to performance on test
E(f) = · log2 ( ). (1) problems since it is problem dependent which measure is a
n ai
ai 6=0 more useful predictor of performance in a genetic algorithm
using a given population structure. For all three statistics we
This is the Shannon entropy of the distribution of genotypes
and is one standard measure of population diversity used in take the time weighted average of the quality of interest as
conservation biology. preserving edge or entropy later in evolution is more impres-
If our only goal were to preserve the diversity then we sive that preserving it early on. The entropy preservation of a
would generate a random population, inherently richly di- population structure G is:
verse, and allow no mating to take place. Since we want to Z
N E G(t)
permit the population to evolve, we must permit some mat- EP (G) = t · · dt. (3)
Log2 (|V (G)|)
ing. Connection topologies that preserve diversity do so by
allowing only limited mating to take place. In Ackley and The edge preservation of a population structure G is
Littman[1] an evolving population is situated on a grid of Z
processors in a multiprocessor machine. The authors observe QP (G) = t · N QG (t) · dt. (4)
that the useful crossover-based computation takes place in ar-
eas where different types of creatures are adjacent. Ackley
The mean quality of a population structure G is
and Littman’s geography quickly self-organizes into geneti-
cally homogeneous cells with useful crossover based compu- Z s
tation taking place mostly at the cell boundaries where dis- N E G(t)
M Q(G) = t · · N QG(t) · dt. (5)
similar creatures can mate. Motivated by this example we Log2 (|V (G)|)
approximate the useful crossover-based computation taking
place in a graphically structured population f by the edge, Division by log2 (|V (G)|) normalizes entropy to have a max-
Q(f), of the population defined to be: imum value of 1 no matter the number of vertices, |V (G)| in
the population structure. A logical question to ask is “even if
|{{u, v}εE(G) : f(u) 6= f(v)}| preserving edge and entropy is good, why would the neutral
Q(f) = . (2)
|E(G)| selection versions matter?” There are two answers. First, we
will present experiments to test their utility. Second, many
The edge of population on a graph is the fraction of potential evolving populations reach a point with little fitness varia-
matings that are between organisms with different genotypes. tion. Neutral selection behavior may be a good model for this
Intuitively this fraction should be closely related to the faction regime.
of heterogeneous crossover events.
The neutral selection behavior of a population is the way 2.1 Desirability of Edge and Entropy Preservation
a population changes in the absence of any selective pres-
sure. We approximate the average neutral selection behavior Edge and entropy are statistical surrogates for qualities that
of edge and entropy for a population structure (graph) by re- conventional wisdom believes worth preserving in a genetic
peating the following computation many times and averaging algorithm. The entropy statistic is a surrogate for popula-
the results. Each vertex is initially assigned a distinct inte- tion diversity, borrowed from population biology. From an
optimization perspective, a diverse population is one not yet
trapped in a local optima or scattered about widely within an
optima and so more likely to escape it. It is clear that di-
versity preservation is valuable in problems with multimodal
fitness landscapes. If we depart from optimization and exam-
ine adaption there is an even more pressing need to preserve
diversity. An algorithm with agents undergoing adaption to
changing conditions, be it other agents or a varying figure of
merit imposed by the experimenter, can be viewed as having
a moving fitness landscape. A picture of such a fitness land-
scape might be like the surface of an ocean on a choppy day.
With a diverse population there are more likely to be individ-
uals on or near what is presently a high spot.
The need for a high value of the edge preservation statis-
tic is less clear, but may be seen by examining a situation in
which crossover is of no worth. One such situation is a pop-
ulation in which most of the diversity is gone and individuals
are very much like one another. Here, the typical crossover
produces two children that look like their parents. This hap-
pens because the parents are identical or differ in only one
location, with crossover deciding which child is a copy of
which parent, but generating no new structures. In this case
any effort spent manipulating structures to perform crossover
is wasted. If we want crossover to be between genes that are
different, then the preservation of edge is at least in the di-
rection of keeping the rate of such crossovers relatively high Figure 2: K4,4 and the result of simplexifying every vertex of
over time. K4,4 .
Imagine that we have a graph with a high mean quality
(Equation 5). Then something intrinsic in the topology of
the graph keeps both edge and and entropy at at least some fitness (roulette selection) and let the new individual replace
minimal level over time; this is implicit in the definition of the old if it is at least as fit as the individual it replaces. We
mean quality. Individuals with distinct genotypes are cross- call this local mating rule locally elite roulette mating.
ing over fairly often and no one genotype is coming to dom- We report results on graphical genetic algorithms for four
inate the population too rapidly. This should then improve problems on ten graphs. See West[14] for definition of all of
performance on problems where heterogeneous crossover and the following graphs except the last. These graphs all avail-
diversity preservation are valuable. able by electronic mail, as lists of neighbors, from the first
author. The graphs are as follows: the complete graph K512 ,
3 Definition of Graph Based Genetic Algorithms the generalized Petersen graphs P256,1, P256,3, P256,17, the
4-neighbor tori T4,128, T8,64 , T16,32, the cycle C512, the 9-
A graph based genetic algorithm is a genetic algorithm run dimensional hypercube H9 , and a graph Z derived by the
with a graphical population structure. We assume the reader following procedure. Start with the complete bipartite graph
to be familiar with the standard terminology of genetic al- K4,4 . Simplexify every vertex (see Section 2). Repeat sim-
gorithms as given in [5]. One individual is placed on each plexification twice more. The graph K4,4 and first step of this
vertex of the graph. A type of steady state genetic algorithm process are shown in Figure 2.
[11, 12, 15] is used where evolution proceeds by individual All of these graphs contain 512 vertices. The complete
mating events. A mating event is performed as follows. Pick graph was chosen as a baseline - a graph based genetic al-
a vertex v of the graph uniformly at random. A neighbor gorithm using the complete graph simulates a fairly standard
of v is chosen for mating with a fitness bias. Crossover and genetic algorithm. The other graphs were chosen because ei-
probabilistic mutation are used to produce a single new indi- ther (i) they are the connection topology of a popular type of
vidual which may or may not be used to replace the individ- multiprocessor machine or (ii) they added to the diversity of
ual on vertex v. The details of how the neighbor is picked the graphs tested as measured by our summary statistics. The
for mating and how to decide if the new individual replaces summary statistics for these ten graphs are given in Figure 3
the individual on vertex v are called the local mating rule of together with the graphs listed in increasing order for each
the graph based genetic algorithm. In this research the local statistic.
mating rule will pick neighbors in direct proportion to their
G EP(G) QP(G) MQ(G) gorithm uses two point crossover and mutation consists of
K512 79281 41396 152588 changing one character, selected uniformly at random, from
P256,1 69518 177679 27408 zero to one or one to zero. The confidence interval upper
P256,3 87384 128892 59312 bounds for this experiment are given in Figure 4.
P256,17 103432 84733 126898
T4,128 81037 140117 47044 4.2 Plus-One-Recall-Store Experiments
T8,64 94873 103610 86881
T16,32 102961 86249 123394 This experiment tests the value of graph based genetic algo-
C512 59703 213902 16829 rithms for genetic programming [8, 7]. The test problem,
H9 11957 7550 19005 called the plus-one-recall store (PORS) efficient node use
Z 98851 127435 76807 problem, is to find parse trees in a very limited language that,
when executed, generates the largest integer result possible
given a fixed maximum number of parse tree nodes. The lan-
EP(G) QP(G) MQ(G) guage has two operations, integer addition and a store that
H9 H9 C512 places its argument in an external memory location, and two
C512 K512 H9 terminals, the integer 1 and recall from an external memory.
P256,1 P256,17 P256,1 This test problem is described in detail in [3, 4].
K512 T16,32 T4,128 The difficulty of the PORS-efficient node use problem
T4,128 T8,64 P256,3 varies very strongly according to the upper bound on the num-
P256,3 Z Z ber of nodes modulo three. We ran experiments on n = 15
T8,64 P256,3 T8,64 nodes, the hardest of the three classes. In this set of experi-
Z T4,128 T16,32 ment the initial population was made of randomly generated
T16,32 P256,1 P256,17 trees with n nodes. A successful individual was defined to be
P256,17 C512 K512 a tree that produces the largest possible number (these num-
bers are computed in [4]). Crossover was performed by the
Figure 3: Summary statistics for graphical connection topolo-
usual subtree exchange [8]. If this produced a tree with more
gies and graphs ordered by the statistics in increasing order.
than n nodes then a subtree of the root node iteratively re-
placed the root node until the tree had less than n nodes. We
4 Experimental Design and Results term this operation chopping. Mutation was performed by
replacing a subtree uniformly at random with a new random
For each test problem we give a definition of a successful in- subtree the same size, with probability 0.2 for each new tree
dividual. For each graph we run a graph based genetic algo- produced. The confidence interval upper bounds for this ex-
rithm 200 times, saving the number of mating events until the periment are given in Figure 5.
first successful individual appears in each run. We treat the
fraction of populations that contain a successful individual 4.3 Tartarus String Experiments
after k mating events as a Bernoulli variable with probabil-
ity pk of success. For each graph we compute the minimum The Tartarus problem was first posed by Astro Teller [13]. A
number of mating events k for which half the populations as- small autonomous robot is places in a six-by-six grid. The
sociated with that graph contain a successful individual. For robot occupies one grid square and may turn right, turn left,
each other graph we then use a normal approximation to the or attempt to advance during each time step. Six boxes, each
binomial distribution to construct a 95% confidence interval filling a grid square, are placed away from the edges of the
for pk . Other graphs are judged significantly worse as popula- grid with no two-by-two region containing four boxes. (The
tion structures for the test problem if this confidence intervals robot heading, position, and box positions at the start of a
upper bound is less than 0.5. A table of these upper bounds test define a “board”.) The robot may advance into an empty
is supplied with each experiment For details of the method of square or pushing one box but two boxes or a wall stop the
constructing this type of confidence interval see Larsen and robot. The robot’s task is to push boxes into corners (worth
Marx[9]. two points) or against a wall (worth one point). The robot
is given eighty time steps and then its score is assessed for
a given starting configuration. Teller and we both sum fit-
4.1 Trivial String Evolver Experiment
nesses over 40 boards to approximate the quality of a robot
The first experiment operates on twenty character binary controller.
strings. The initial population consists of strings chosen uni- Research on baselines for the Tartarus problem reported
formly at random and the fitness of a string is the number of in [2] indicate that non-reactive controllers with memory, im-
ones in the string. This problem is easy, high-dimensional, plemented as strings of moves, can obtain fitness comparable
and unimodal. A successful individual in this experiment is to those obtained by Teller’s reactive GP-tree controllers with
one composed entirely of ones. The graph based genetic al- memory. In this experiment we use strings of sixteen moves
H9 C512 Z K512 P256,17 P256,3 P256,1 T16,32 T41 28 T8,64

H9 . 0.16 ∗ 0.40 ∗ 0.29 ∗ 0.34 ∗ 0.36 ∗ 0.43 ∗ 0.50 ∗ 0.45 ∗ 0.48
C512 0.99 . 0.94 0.69 0.92 0.92 0.95 0.97 0.95 0.97
Z 0.76 ∗ 0.26 . ∗
0.38 ∗ 0.48 0.55 0.61 0.69 0.58 0.65
K512 0.95 ∗ 0.45 0.84 . 0.79 0.83 0.84 0.90 0.85 0.90
P256,17 0.83 ∗ 0.29 0.64 ∗
0.43 . 0.63 0.71 0.75 0.67 0.74
∗ ∗
P256,3 0.78 0.27 0.60 0.38 ∗ 0.50 . 0.64 0.70 0.61 0.68
P256,1 0.72 ∗ 0.24 0.55 ∗
0.36 ∗ 0.47 0.51 . 0.65 0.54 0.62
T16,32 0.64 ∗ 0.17 ∗ 0.45 ∗ 0.32 ∗ 0.39 ∗ 0.43 ∗ 0.48 . ∗
0.47 0.54
T41 28 0.74 ∗ 0.25 0.55 ∗
0.37 ∗ 0.47 0.54 0.59 0.68 . 0.64
∗ ∗ ∗
T8,64 0.68 0.20 0.50 0.34 ∗ 0.42 ∗ 0.45 0.52 0.61 0.51 .
* Values Less than 0.5 indicate the graph indexing the column performed significantly
worse than the graph indexing the row.

Figure 4: Confidence interval upper bounds for trivial string evolver experiment.

H9 C512 Z K512 P256,17 P256,3 P256,1 T16,32 T41 28 T8,64


H9 . 0.59 0.67 0.65 0.65 0.64 0.75 0.67 0.64 0.66
C512 0.51 . 0.62 0.59 0.63 0.59 0.72 0.65 0.60 0.61

Z 0.47 0.52 . 0.52 0.58 0.54 0.64 0.59 0.56 0.55
K512 0.51 0.55 0.61 . 0.61 0.57 0.69 0.63 0.59 0.59
P256,17 ∗ 0.47 0.51 0.56 0.51 . 0.53 0.63 0.58 0.56 0.54
P256,3 ∗ 0.49 0.54 0.60 0.55 0.59 . 0.68 0.62 0.57 0.57
P256,1 ∗ 0.42 ∗ 0.46 ∗ 0.50 ∗ 0.42 0.52 ∗
0.49 . ∗
0.49 ∗ 0.48 ∗
0.49
T16,32 ∗ 0.46 ∗ 0.50 0.55 ∗
0.49 0.56 0.52 0.62 . 0.54 0.52

T4,128 0.47 0.53 0.59 0.53 0.58 0.55 0.67 0.61 . 0.57
T8,64 ∗ 0.47 0.52 0.58 0.53 0.58 0.55 0.65 0.60 0.56 .
* Values less than 0.5 indicate the graph indexing the column performed significantly worse
than the graph indexing the row.

Figure 5: Confidence interval upper bounds for the PORS-efficient node use experiment on n = 15 node trees.

(left, right, forward) which the algorithm cycles through five 5 Conclusions
times to obtain 80 moves.
The termination requirement, i.e. definition of success- The first analysis we performed on the significance data in
ful individual, is modified to require the average per-board Figures 4, 5, 6 was to partially order the graphs by relative
fitness of the entire population to be greater than 3.0 (about performance advantage. In the trivial string evolver exper-
60% of the maximum known fitness for string controllers of iment 36 of the 81 pairs of graphs exhibited a statistically
this type) for forty boards. The requirement that success is significant difference in performance. The partial order seg-
demonstrated over long time scales is necessary due to the regates the graphs nicely as follows:
stochasticity of the fitness function. The set of forty boards H9 ≥ T16,32 , T8,64 ≥ P256,1, T4,128, P256,3, Z ≥ P256,17 ≥
used in a fitness evaluation are generated uniformly at ran- K512 ≥ C512 .
dom. Every 256 mating events (half the population size) the Figure 7 shows this relationship graphically with an arrow
set of forty test boards is replaced with a new, randomly gen- indicating statistically significant dominance.
erated set of forty. The fitness test thus samples the space
of possible boards in a manner consistent with a steady state Z
genetic algorithm. We use two point crossover of the strings T8,64
of moves and a probabilistic mutation that changes between P256,3
zero and three positions in the string to a new randomly gen- H9 P256,17 K512 C512
erated action. The confidence interval upper bounds for this T4,128
experiment are given in Figure 6. T16,32
P256,1
Figure 7: Graph dominance figure for trivial strings.
H9 C512 Z K512 P256,17 P256,3 P256,1 T16,32 T41 28 T8,64

H9 . 0.01 ∗ 0.14 0.74 ∗
0.20 ∗ 0.21 ∗ 0.10 ∗ 0.33 ∗ 0.29 ∗
0.33
C512 1.00 . 1.00 1.00 1.00 0.99 0.99 1.00 1.00 1.00

Z 0.99 0.10 . 1.00 0.65 0.61 0.54 0.81 0.84 0.83

K512 0.40 ∗ 0.00 ∗ 0.07 . ∗
0.06 ∗ 0.12 ∗ 0.01 ∗ 0.18 ∗ 0.17 ∗
0.19

P256,17 0.97 0.05 ∗ 0.47 0.98 . 0.55 ∗
0.42 0.70 0.71 0.74
∗ ∗
P2563 0.98 0.07 0.51 0.99 0.62 . 0.48 0.76 0.78 0.80

P256,1 1.00 0.12 0.62 1.00 0.72 0.69 . 0.86 0.87 0.85

T16,32 0.90 0.04 ∗ 0.38 0.95 ∗
0.45 ∗ 0.42 ∗ 0.28 . 0.61 0.60

T4,128 0.90 0.04 ∗ 0.38 0.95 ∗
0.45 ∗ 0.42 ∗ 0.28 0.60 . 0.60
∗ ∗ ∗
T8,64 0.90 0.04 0.38 0.95 0.45 ∗ 0.42 ∗ 0.28 0.60 0.61 .
* Values less than 0.5 indicate the graph indexing the column performed significantly worse
than the graph indexing the row.

Figure 6: Confidence interval upper bounds for the string based Tartarus controller experiments.

In the PORS experiment only 15 of 81 pairs of graphs ex- T16,32


hibited a performance advantage and the partial order indi-
cated unambiguous dominance by P256,1 but otherwise gave P256,17 Z
a somewhat ragged ranking of the graphs:
P256,1 ≥ P256,17 ≥ T4,128 , T8,64, T16,32, P256,3 ≥
K512 H9 T8,64 C512
K512 , H9 , C512. P256,3 P256,1
Figure 8 shows this relationship graphically with an arrow
indicating statistically significant dominance. T4,128
Figure 9: Graph dominance figure for Tartarus.
P256,17 T8,64
C512
intuition of the authors is that more subtle measures of graph-
P256,3 H9 ical connectivity may be the key predictor of performance.
P256,1 In any case we have some support for Ackley and Littman’s
T16,32 hypothesis that having lots of boundary area between regions
K512 with distinct genotypes is good.
Z T4,128 To finish our conclusions on a positive note, we can re-
port a very strong effect from changing population structures.
While we at present have little of worth to say about which
Figure 8: Graph dominance figure for PORS. population structures are good for what problem (the sum to-
tal of general result is that C512 is bad) we see that (i) chang-
The Tartarus experiment yielded 39 of 81 pairs of graphs ing the population structure has a significant effect on solu-
with a statistically significant performance advantage. The tion time and (ii) different population structures are good for
partial order gave a firm segregation of the graphs in the fol- different problems. Note, for example, the positions of the
lowing fashion: nine-hypercube H9 and the complete graph K512 in each of
K512 ≥ H9 ≥ T4,128, T8,64 , T16,32 ≥ P256,3, P256,17 ≥ the three test problems.
P256,1, Z ≥ C512.
Figure 9 shows this relationship graphically with an arrow
indicating statistically significant dominance.
6 Discussion and Future Work
Comparing the partial orders of graphs by relative perfor- An important point to make about the graph based genetic
mance with the summary statistics in Figure 3 we find that algorithm is that when use of a given graphical connection
preservation of diversity as estimated by EP(G) had little pre- topology helps, it does so with little run time cost. The cost
dictive value, that the mean quality statistic MP(G) had lit- of locating the good graph is paid before the experimenter
tle predictive value, but that edge preservation was some- runs his genetic algorithm. Only the overhead for maintain-
what predictive of success in the trivial string evolver and ing the list of graphical connections is charged to the graph
strongly predicted success in the Tartarus experiments. Ei- based genetic algorithm’s run time. The crop of experiments
ther our measure of diversity preservation is not a good one reported here show that graphical connection topologies can
or diversity preservation is not an issue for the test problems help but give only a a modest amount of help in choosing
and population sizes we have presented. The graph-theoretic them (the edge preservation statistic, Equation 4, has a lim-
ited, empirical correlation with performance on two of three [5] David E. Goldberg. Genetic Algorithms in Search, Opti-
test problems). mization, and Machine Learning. Addison-Wesley Pub-
Knowing that the choice of connection topology is impor- lishing Company, Inc., Reading, MA, 1989.
tant, it remains to figure out good predictors of what graphical
topologies will help with which sorts of problems. In addition [6] Motoo Kimura and James Crow. On the maximum
to trying graph based genetic algorithms on a wider variety avoidance of inbreeding. Genetical Research, 4:399–
of problems (and on harder problems) there are a number of 415, 1963.
other issues that need to be examined. It is intuitive that the [7] Kenneth Kinnear. Advances in Genetic Programming.
local mating rule is important and we have tested exactly one, The MIT Press, Cambridge, MA, 1994.
highly elitist, local mating rule. Experiments for well-mixed
populations that speak to optimum allocation of breeding tri- [8] John R. Koza. Genetic Programming. The MIT Press,
als, mutation rates, and crossover rates need to be repeated Cambridge, MA, 1992.
in the graph based context to see if the change in popula-
tion structure modifies results. The experimental results from [9] Richard J. Larsen and Morris L. Marx, editors. An intro-
this paper and the ordering of graphs they induce are sugges- duction to mathematical statistics and its applications.
tive and far from conclusive: a much larger number of graphs Prentice-Hall Inc., 1981.
should be studied. Researchers wishing to have a the graph ¨
[10] Heinz M{u}hlenbein. Darwin’s continent cycle theory
based genetic algorithm code used in this paper as a starting and its simulation by the prisoner’s dilemma. Complex
point should contact the first author. Systems, 5:459–478, 1991.
Another, more difficult, direction for future research in-
volves locating good graphical topologies. If a good predictor [11] Craig Reynolds. An evolved, vision-based behavioral
of performance of graphical population structures is available model of coordinated group motion. In Jean-Arcady
for a class of problems then a genetic algorithm or other op- Meyer, Herbert L. Roiblat, and Stewart Wilson, editors,
timization method could be used to locate good graphs. It From Animals to Animats 2, pages 384–392. MIT Press,
also might be possible to vary the local structure of the graph 1992.
and so locate good connection topologies in an online man-
ner during the execution of a genetic algorithm. This second [12] Gilbert Syswerda. A study of reproduction in gener-
suggested direction is a meta-algorithms, a notoriously tricky ational and steady state genetic algorithms. In Foun-
and time-consuming effort. dations of Genetic Algorithms, pages 94–101. Morgan
Kaufmann, 1991.
Acknowledgments [13] Astro Teller. The evolution of mental models. In Ken-
neth Kinnear, editor, Advances in Genetic Program-
The authors would like to thank the Iowa Center for Emerging ming, chapter 9. The MIT Press, 1994.
Manufacturing Technology for their hardware and logistical
support of this research. We also would like to thank the ref- [14] Douglas B. West. Introduction to Graph Theory. Pren-
erees for their proofreading and helpful suggestions. tice Hall, Upper Saddle River, NJ 07458, 1996.
[15] Darrel Whitley. The genitor algorithm and selection
Bibliography pressure: why rank based allocation of reproductive tri-
[1] D. L. Ackley and M. L. Littman. A case for distributed als is best. In Proceedings of the 3rd ICGA, pages 116–
Lamarckian evolution. Working Paper, Cognitive Sci- 121. Morgan Kaufmann, 1989.
ence Research Group, Bellcore, New Jersey, 1992. [16] Sewall Wright. Evolution. University of Chicago
[2] Dan Ashlock and Mark Joenks. ISAc lists, a different Press, 1986. Edited and with introductory Materials by
representation for program induction. In Genetic Pro- William B. Provine.
gramming 98, proceedings of the third annual genetic
programming conference., pages 3–10, San Francisco,
1998. Morgan Kaufmann.
[3] Dan Ashlock and James I. Lathrop. An arithmetic test
suite for genetic programming. ISU Applied Mathemat-
ics Report AM98-1, 1998.
[4] Dan Ashlock and James I. Lathrop. A fully character-
ized test suite for genetic programming. In Evolution-
ary Programming VII, pages 537–546, New York, 1998.
Springer.

You might also like