Graph Partitioning Using Genetic Algorithms
Christian Hohn
Institut für Grundlagen Elektrotechnik/Elektronik
Technische Universität Dresden
Colin R Reeves
School of Mathematical and Information Sciences
Coventry University
Email:
[email protected]
Abstract
The efficient implementation of parallel processing architectures generally requires the solution of hard combinatorial problems involving allocation and scheduling of
tasks. Such problems can often be modelled with considerable accuracy as a graph partitioning problem, a problem well known from complexity theory to be NP-hard,
which implies that for practical purposes the global optimum will not be found by any exact algorithm, except
for trivially small problems.
Recently genetic algorithms (GAs) have been proposed
as a means of finding good solutions to the graph partitioning problem. In this paper two methods of implementing GAs are explored. In the first case, they are
combined with a heuristic `greedy algorithm'. Unlike previous approaches the heuristic is directly linked to the recombination operator and this leads to a new `heuristic
neighbourhood search' operator. Comparisons with existing approaches for combining heuristics with genetic
algorithms show that the direct approach leads to improved results. In addition the heuristic neighbourhood
search operator gives rise to a more computationally efficient scheme.
Secondly, an approach which uses a structural
crossover operator is investigated. Again, improved versions of existing approaches are developed, and finally
the two forms of operator are compared using a number
of different graphs.
1. Introduction
Modern parallel processing architectures give rise to
a number of interesting problems of implementation.
Despite the considerable potential increase in computational efficiency from using parallel processors, what is
achieved in practice may be significantly below this potential if the overall load is not distributed in a balanced
way.
One of the most fundamental questions is how to allocate computational tasks to many processors, when
tasks are known to be inter-related, in such a way
that the overall computational load on each processor is
equalized as much as possible. Given a set of algorithms
to be implemented on a processor, we can derive information on the number and type of simple arithmetic or
logical operations, and consequently on how much time
they need. It is also necessary to know what dependencies exist between the operations, all of which can be
represented on a directed graph. We can also represent
the processor information on a graph which defines the
speed of the processors, and the nature and speed of
the communication links between processors. A recent
survey of this problem is given in [1].
Assuming such information has been gathered, the
allocation problem can be modelled quite closely by the
graph-partitioning problem, which we formulate below.
1.1. The k-way Graph Partitioning Problem
Given an undirected graph $G = (V, E, \omega_v, \omega_e)$, where
$V = \{v_1, v_2, \ldots, v_m\}$ represents the set of vertices, $E \subseteq V \times V$ denotes the set of edges, and the weight functions
$\omega_v : V \mapsto \mathbb{Z}^+$ and $\omega_e : E \mapsto \mathbb{Z}^+$ define non-negative integral weights for all vertices and for all edges respectively,
then for a positive integer $k$ the k-way graph partitioning
problem (GPP) consists of finding $k$ non-empty mutually
disjoint subsets $V_1, V_2, \ldots, V_k$ such that $\bigcup_{i=1}^{k} V_i = V$. A
partition is said to be feasible if
$$\sum_{v \in V_i} \omega_v(v) \geq q_i, \quad \forall i \in [1, k]$$
where $q_i$ represents the lower bound for the size of the
$i$th partition. By formulating the problem in this way,
we allow the possibility of partitioning into subsets of
different sizes. However, by choosing $q_i = \sum_{v \in V} \omega_v(v) / k$,
we can force all subsets to be of exactly equal size. In
some problems, this restriction may not be necessary
(or even possible), and the constraint can be relaxed by
decreasing $q_i$.
The cost of a partition is the sum of $\omega_e(i, j)$ over all edges $(i, j)$ whose endpoints $i$
and $j$ are in different subsets (i.e. the sum of all external
edge weights). Then the optimization task consists of
finding the partition which minimizes the external edge
weights (or equivalently that which maximizes the internal edge weights).
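As an illustration of this cost function, the following minimal sketch sums the external edge weights of a partition. The dictionary-based edge representation and the function name `cut_cost` are assumptions made for the example and are not taken from the paper.

```python
def cut_cost(edge_weights, assignment):
    """Sum the weights of edges whose endpoints lie in different subsets.

    edge_weights: dict mapping (u, v) vertex pairs to integer weights,
                  one entry per undirected edge (hypothetical representation).
    assignment:   sequence where assignment[v] is the subset label of vertex v.
    """
    return sum(w for (u, v), w in edge_weights.items()
               if assignment[u] != assignment[v])


# A 2 x 2 grid with unit edge weights: vertices 0, 1 on top and 2, 3 below.
edges = {(0, 1): 1, (2, 3): 1, (0, 2): 1, (1, 3): 1}
print(cut_cost(edges, [0, 0, 1, 1]))  # the two vertical edges are cut
```

Minimizing this quantity is equivalent to maximizing the internal edge weight, since the two always add up to the total edge weight of the graph.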
The k-way graph partitioning task has proved to be
a good model for many real world problems, such as
memory partitioning and circuit partitioning, as well as
the problem of load balancing for multiprocessor systems. In the latter case one can think of the vertices
as the tasks (or jobs) in a multiprocessor system, where
$\omega_v$ represents the computational time required by each
job, and $\omega_e$ expresses the amount of communication between the jobs. The problem is to assign each job to one
of the $k$ processors, such that the load is balanced according to $q_1, q_2, \ldots, q_k$, and the communication between
the processors is minimal.
In this paper initially two examples of graph partitioning tasks are considered. For a simple example a two
dimensional grid of $10 \times 10$ nodes, all of the same weight,
is to be partitioned into $k$ subsets, for $k \in \{2, 4, 7, 15\}$.
As a second example some geometric graphs reported in
[2] with different weights at the edges and vertices are
considered.
To obtain subsets of nearly equal size, in all experiments the $q_i$ were evaluated according to
$$q_i = \left\lfloor \frac{1}{k} \sum_{v \in V} \omega_v(v) \right\rfloor, \quad \forall i \in [1, k]. \qquad (1)$$
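The bound of equation (1) and the feasibility condition can be sketched as follows. The sketch reads the feasibility condition as a lower bound on the total vertex weight of each subset, which is how $q_i$ is described in the text. The function names are hypothetical.

```python
def lower_bounds(vertex_weights, k):
    """Equation (1): q_i = floor(total vertex weight / k) for every subset."""
    return [sum(vertex_weights) // k] * k


def is_feasible(vertex_weights, assignment, q):
    """Check that each subset i carries a total vertex weight of at least q[i]."""
    load = [0] * len(q)
    for v, subset in enumerate(assignment):
        load[subset] += vertex_weights[v]
    return all(load[i] >= q[i] for i in range(len(q)))


# 10 x 10 grid with unit vertex weights, split into 7 subsets.
print(lower_bounds([1] * 100, 7))
```

For the grid example with $k = 7$ every subset must therefore hold at least 14 vertices, which leaves two vertices that can be placed freely.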
2. Genetic algorithms
Following Holland's introduction of the concept [3],
genetic algorithms (GAs) have received increasing attention as a means of solving complex nonlinear optimization problems. They have proved to be robust and
well suited to many optimization tasks including problems involving multi-variable solutions for nonlinear objective functions.
Essentially, GAs solve optimization problems in an
analogous manner to the genetic processes of nature.
The physical solution of a given problem (corresponding
to a `phenotype') is coded into a string, termed a chromosome, which represents the `genotype'. Each physical
variable is associated with a gene (or set of genes) and a
value of a gene is called an allele. The theory was originally developed for binary representations, but often
higher cardinality alphabets may be more natural.
To generate a new solution two `parents' are drawn
from the present population and parts of their chromosomes are exchanged. For instance, suppose two strings
$(a_1, a_2, a_3, a_4, a_5)$ and $(b_1, b_2, b_3, b_4, b_5)$ represent two solutions to a problem. A crossover point is chosen at
random from the numbers $\{1, \ldots, 4\}$, and a new solution is produced by combining the pieces of the original `parents'. If the crossover point were
2, then the `child' solutions would be $(a_1, a_2, b_3, b_4, b_5)$
and $(b_1, b_2, a_3, a_4, a_5)$.
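The one-point crossover just described can be written down in a few lines. This is a generic sketch rather than code from the paper; the optional `point` argument is an assumption added so that the worked example above can be reproduced.

```python
import random


def one_point_crossover(a, b, point=None, rng=random):
    """Exchange the tails of two equal-length chromosomes after `point`."""
    if len(a) != len(b):
        raise ValueError("parents must have equal length")
    if point is None:
        point = rng.randint(1, len(a) - 1)  # inclusive bounds: 1 .. len - 1
    return a[:point] + b[point:], b[:point] + a[point:]


a = ["a1", "a2", "a3", "a4", "a5"]
b = ["b1", "b2", "b3", "b4", "b5"]
print(one_point_crossover(a, b, point=2))
```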
The child is thus constructed from components which
have already been tested in one context, and which can
now be tested in another. If it is important to ensure
that the child chromosome is a `pure' recombination the
recombination operator used should possess the property of `strict transmission' [4]: all alleles of the child
chromosome are chosen from one of the corresponding
parent values. While this definition appears straightforward, there are many practical problems where the
solution space is restricted by a set of constraints, so
that operators will differ in the degree to which they
strictly transmit the parent alleles.
For the reproduction process parents are drawn from
the population according to their fitness. The better
a solution, the more likely its chromosome is to be selected for creating new offspring. Thus, it is hoped that
the population will naturally be directed towards `better' regions of the solution space. However, in the case
of combinatorial problems, this assumption is often not
fulfilled, and the basic GA needs some modification.
Recently, Reeves [5] has shown how the GA can be
regarded as a neighbourhood search (NS) procedure,
which differs from traditional NS, and its modern variants, in two important respects. Firstly, the neighbourhood is defined by two solutions (the parents) instead
of only one. Furthermore, the recombination operator
used in a GA is more likely to produce a new solution
that is further away from the parent solutions than the
operators that are used in common NS techniques. Indeed, these are the fundamental differences that lead to
two significant properties of the GA as a neighbourhood
search methodology: on the one hand it is less likely
to become trapped at local optima; on the other hand
it produces a relatively poor local performance. The
latter property has led to several attempts to combine
a GA with local hill-climbing procedures which effectively refine the solution obtained by the GA as a post-processing tandem operation, as in [6] for example. In
[5] it is suggested that the neighbourhood search should
be embedded within the recombination step rather than
as an `add-on' extra.
Another approach is to hybridize the GA with an
on-line heuristic which continually assists the genetic
search process. However, it is not entirely clear how
best to arrange this combination such that the resulting
hybrid scheme works most efficiently. In this paper a
novel methodology of combining the GA with an on-line
heuristic is proposed and its performance demonstrated
when applied to the graph partitioning problem.
2.1. Genetic algorithms and the GPP
For general partitioning tasks where a set of objects is
required to be decomposed into a set of subsets, `adding'
heuristics are often useful. The objects are ordered in
some fashion and are then assigned to one of the subsets
following the rules of the adding heuristic. In the case
of the GPP an attempt is made to find the best subset
for each vertex with respect to the vertices that have already been placed. This may be achieved by the `greedy
algorithm' shown in Fig. 1. Obviously, the resulting solution may depend strongly on the initial sequence chosen. Thus, it would appear to be helpful to use a GA to
optimize the initial sequence and this is the basis of the
traditional approach [7].
For m objects and k subsets a sequence-based chromosome of length m is required. This way of encoding
has been applied to many combinatorial problems where
sequences are of interest, e.g. the travelling salesman
problem or machine sequencing. Furthermore, many operators have been developed to combine permutations.
Supposing the vertex vi is to be assigned to a subset:
1. Determine for each candidate subset the internal cost
which will result if vi is assigned to it.
2. Assign the vertex to the subset which maximizes the
internal cost.
3. If there is more than one possible subset assign the
vertex to the one of the smallest size at present.
4. If there is still more than one possible subset assign
the vertex randomly.
Figure 1. The greedy algorithm
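The four steps of Fig. 1 can be sketched as a single placement routine. The data layout, the function name `greedy_assign` and the `candidates` argument are assumptions made for illustration; in particular the paper does not state how the candidate subsets are determined, so that choice is left to the caller, and subset size is counted in vertices (unit vertex weights).

```python
import random


def greedy_assign(v, adjacency, assignment, sizes, candidates, rng=random):
    """Place vertex v following the four steps of Fig. 1.

    adjacency:  dict vertex -> dict neighbour -> edge weight
    assignment: dict of already placed vertices -> subset label (updated)
    sizes:      list of current subset sizes (updated, unit weights assumed)
    candidates: subset labels that may still accept v (caller's choice)
    """
    # Step 1: internal weight gained by joining each candidate subset.
    gain = {s: 0 for s in candidates}
    for u, w in adjacency.get(v, {}).items():
        s = assignment.get(u)  # None if u has not been placed yet
        if s in gain:
            gain[s] += w
    # Step 2: keep the subsets with maximal internal gain.
    best = max(gain.values())
    tied = [s for s in candidates if gain[s] == best]
    # Step 3: among ties prefer the currently smallest subset.
    smallest = min(sizes[s] for s in tied)
    tied = [s for s in tied if sizes[s] == smallest]
    # Step 4: break any remaining tie at random.
    choice = rng.choice(tied)
    assignment[v] = choice
    sizes[choice] += 1
    return choice
```

Calling this routine once per vertex, in the order given by the chromosome, yields the greedy decoding of a sequence-encoded solution.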
This type of hybrid method was used by Jones and
Beltramo [7], whose experiments suggested the best approach for partitioning problems was to use the PMX operator combined with a greedy decoding of the sequence-encoded chromosome.
Although the combination of the GA and an on-line
heuristic has also been successfully applied to other hard
combinatorial problems [8, 9], there still remain some
strong objections. In particular, it can be argued that
by interposing an on-line heuristic the phenotype might
become too remote from the present coding to find good
`building blocks'. If such fears are well-founded, the recombination of `good' genotypes will not necessarily lead
to a chromosome which represents a `good' phenotype.
Thus, much of the information provided by the parent
solutions may be lost.
To investigate this further, an experiment to analyze
the constitution of a newly generated solution was set
up. The parent solutions were compared with the child
to ascertain the degree of `strict transmission' by recording the number of `randomly' placed vertices within the
child (i.e. vertices that are placed in subsets to which
neither of the parents assigned them). Two different
GAs were run, one using the PMX operator combined
with the greedy algorithm shown in Fig. 1 and the other
a `strict transmission' (ST) operator inspired by the random assortment operator proposed in [10]. The ST operator uses group-number encoding, where the $i$th gene
encodes the subset to which vertex $v_i \in V$ is assigned,
and it constructs a child by recombining the values of the
parent solutions as long as the offspring remains within
the solution space (see Fig. 2). In the experiment the
1. Choose two chromosomes x1 and x2.
2. For (i = 0; i < m; i++) if (x1[i] == x2[i]) then put 2
copies of x1[i] into the bag. Otherwise put one copy
of x1[i] and one copy of x2[i] into the bag.
3. Draw randomly an allele x[i] from the bag.
4. If equation (1) is satisfied then y[i] = x[i] and all
remaining values x[i] are removed from the bag.
5. Go back to 3 until the child is completed or the bag
has become empty.
6. If the bag is empty fill the child chromosome randomly, but obeying equation (1).
Figure 2. The ST operator
ST approach serves as an ideal comparison because, although the transmission may not be completely strict, in
practice only a very small number of `randomly' chosen
values can be expected. For the problem under consideration a $10 \times 10$ grid was to be partitioned into $k$ subsets,
$k \in \{2, 4, 7, 15\}$, and a population of 100 chromosomes
was used. The number of `randomly' placed vertices was
averaged over all solutions generated in 5000 trials and
the results are given in Table 1.
k         2        4        7        15
PMX+Gr    24.9%    56.2%    73.3%    86.3%
ST        1.3%     2.5%     4.7%     11.9%
Table 1. Percentage of `randomly' placed vertices averaged over 30 runs
It was found that the number of `randomly' placed
nodes increased with the number of subsets. In fact, for
the hybridized GA the values for 25-85% of the variables
were found to be set exclusively by the greedy algorithm,
with the information offered by the parent solutions being ignored. It appears that the heuristic substitutes for
rather than assists the GA. In recognition of the potential shortfalls of the traditional approach, the hypothesis
put forward in this paper is that the performance of the
`hybridized' GA may be improved if a better gene transmission can be achieved.
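The transmission measure used in this experiment, i.e. the share of vertices that the child places in a subset chosen by neither parent, can be computed as in the following sketch. It assumes that all three solutions are available in group-number encoding; the function name is hypothetical.

```python
def random_fraction(parent1, parent2, child):
    """Fraction of genes whose allele in the child matches neither parent."""
    foreign = sum(1 for a, b, c in zip(parent1, parent2, child)
                  if c != a and c != b)
    return foreign / len(child)


# Only the second vertex lands in a subset that neither parent proposed.
print(random_fraction([0, 0, 1, 1], [0, 1, 1, 0], [0, 2, 1, 1]))
```

A value of zero corresponds to strict transmission in the sense of [4].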
3. Heuristic neighbourhood search
To circumvent the problems discussed so far the
heuristic can be combined directly with the recombination operator as suggested in [5]. Again group-number
encoding is used. Assuming that the loci of the chromosomes are numbered from $1, \ldots, m$, two mutually exclusive and exhaustive index sets $I_{fix}$ and $I_{free}$ can be
constructed by assigning the index of each gene either
to $I_{fix}$ or $I_{free}$. The child is then created by fixing the
alleles for all genes defined by $I_{fix}$ while the indices of
the remaining positions build up an initial sequence for
the greedy algorithm to complete the child solution. In
this way the operation of the heuristic is restricted to
a neighbourhood search in the hyperplane defined by
$I_{fix}$. Therefore, the group of operators working on this
basis are termed heuristic neighbourhood search (HNS)
operators.
In essence, this approach offers the opportunity to
control the influence of the heuristic by manipulating
the sets $I_{fix}$ and $I_{free}$. However, since the greedy algorithm acts on a particular subset only once, the initial
sequence cannot be optimized (even if that were possible in many trials). It would thus seem reasonable to
present the sequence in a random order, which also reduces the possibility of the greedy algorithm becoming
trapped in local optima. However, under such conditions
the HNS operators are likely to perform worse than the
traditional approach if the range of the neighbourhood
becomes too large. Attention must therefore be focused
toward a proper adjustment of the neighbourhood size
in order to optimize the performance of the heuristic. In
what follows methods for constructing $I_{fix}$ and manipulating the neighbourhood size in a sensible manner are
discussed.
3.1. The respectful HNS operator
Radcliffe [4] suggests that an important property of
a recombination operator is that it is `respectful', by
which he means that if both parents share a common
characteristic the child should inherit it. In the context
of this paper, if the HNS operator is to pay full `respect'
to both parents then it is essential that $I_{fix}$ includes the
indices of all positions where the parent chromosomes
share the same allele. The resulting operator is termed
the respectful HNS (rHNS) operator. Obviously, the
size of the neighbourhood area $|I_{free}|$ will depend on
the cardinality of the alphabet used. The higher the
cardinality becomes, the less likely it is that two parents
will share a common value. Consequently, the neighbourhood becomes larger and the task for the greedy
algorithm increases.
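A minimal sketch of the rHNS construction is given below: the loci on which the parents agree form the fixed index set, and the remaining loci are handed to a placement routine in random order. The callback `place` is a hypothetical stand-in for the greedy algorithm of Fig. 1, and the function names are assumptions of this example.

```python
import random


def respectful_split(parent1, parent2):
    """Return (fix, free): loci where the parents agree and disagree."""
    pairs = list(enumerate(zip(parent1, parent2)))
    fix = [i for i, (a, b) in pairs if a == b]
    free = [i for i, (a, b) in pairs if a != b]
    return fix, free


def rhns_child(parent1, parent2, place, rng=random):
    """Fix the shared alleles, then let `place` fill the free loci.

    place(child, i) is a hypothetical callback standing in for the greedy
    algorithm; it must return a subset label for locus i.
    """
    fix, free = respectful_split(parent1, parent2)
    child = [None] * len(parent1)
    for i in fix:
        child[i] = parent1[i]
    rng.shuffle(free)  # the free loci are presented in random order
    for i in free:
        child[i] = place(child, i)
    return child


print(respectful_split([1, 2, 2, 3], [1, 3, 2, 1]))
```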
Redundancy arises in the GPP since there are $k!$ possible ways to encode the same solution (the labelling of
groups is arbitrary). This will also affect the size of the
neighbourhood, as the greater the redundancy, the fewer
positions the parents will share. Since the redundancy
increases factorially, large values of $k$ may reduce the
cardinality of $I_{fix}$ significantly, thus preventing the GA
from finding good `building blocks'. Redundancy can be
removed by renumbering the chromosomes before applying recombination as suggested in [7]. Here renumbering
is indicated by `+R', e.g. rHNS+R.
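One common way of renumbering a group-number chromosome is to relabel the groups in order of first appearance, so that all $k!$ labellings of the same partition collapse to a single string. The sketch below illustrates this idea only; whether it coincides with the renumbering scheme of [7] is an assumption.

```python
def renumber(chromosome):
    """Relabel groups in order of first appearance: 0, 1, 2, ..."""
    mapping = {}
    result = []
    for label in chromosome:
        if label not in mapping:
            mapping[label] = len(mapping)
        result.append(mapping[label])
    return result


# Two different labellings of the same partition give the same string.
print(renumber([3, 3, 1, 2, 1]), renumber([2, 2, 0, 1, 0]))
```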
3.2. The escape mechanism
As with traditional GAs the hybridized version runs
the risk of becoming trapped at local optima. As soon
as the whole population has converged to a region which
excludes global optima, the GA gets trapped because the
rHNS operator offers no way to escape. To prevent the
GA from getting trapped a mutation operator is often
used which randomly alters the value of a variable so as
to escape. A similar operator could be introduced which
acts on $I_{fix}$, such that each index belonging to $I_{fix}$ can
be removed and inserted into $I_{free}$ with a given probability. This approach has the possibly useful advantage
that the influence of the mutation operator increases
as the population converges. However, in this paper the
avoidance of local optima is attempted in a deterministic
manner which makes use of problem-specific knowledge.
Fig. 3 shows four plots representing solutions for
the partitioning of a $4 \times 4$ grid into four subsets. The
first plot shows one of the optimal solutions, the second
and the third show two potential parent solutions and
the fourth plot represents the resulting neighbourhood.
From the last plot it may be seen that this neighbourhood excludes the global optimum because (for example) the node in the third row and second column is
misplaced. To establish access to the global optimum
this node must be removed, which involves moving the
associated index from $I_{fix}$ to $I_{free}$.
To achieve this an algorithm is implemented which
scans $I_{fix}$ for nodes blocking the global optimum and
subsequently removes them. Several characteristics can
be used to recognize such nodes. In the algorithm proposed here the index of a vertex is moved from $I_{fix}$ to
$I_{free}$ if the vertex possesses an external weight but does
not possess an internal weight. Thus the neighbourhood
is extended in some sensible manner and the GA will be
less likely to become trapped. However, this is achieved
at the cost of slower convergence. In the following experiments the use of the escape mechanism is labeled by
`+E', e.g. rHNS+E.
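The escape rule can be sketched as follows. The sketch assumes that internal and external weights are measured against the other fixed vertices only, since the free vertices have no subset yet; the paper does not spell this detail out, and the function name is hypothetical.

```python
def escape(fix, free, child, adjacency):
    """Move blocking vertices from the fixed set to the free set.

    A fixed vertex is released if it has at least one fixed neighbour in a
    different subset (external weight) and none in its own subset (no
    internal weight). Counting only fixed neighbours is an assumption.
    """
    fixed = set(fix)
    released = []
    for v in fix:
        internal = external = 0
        for u, w in adjacency.get(v, {}).items():
            if u in fixed:
                if child[u] == child[v]:
                    internal += w
                else:
                    external += w
        if external > 0 and internal == 0:
            released.append(v)
    new_fix = [v for v in fix if v not in released]
    return new_fix, free + released


# Path 0 - 1 - 2 - 3 with unit weights; vertices 2 and 3 have no internal edge.
path = {0: {1: 1}, 1: {0: 1, 2: 1}, 2: {1: 1, 3: 1}, 3: {2: 1}}
print(escape([0, 1, 2, 3], [], [1, 1, 2, 1], path))
```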
It should also be recognized that any redundancy
within the code acts on the neighbourhood. As stated
above, the greater the redundancy, the larger the neighbourhood, thus reducing the possibility of the GA becoming trapped. Whilst for large values of $k$ the redundancy has a detrimental effect on the GA, there might
be a range of small values of $k$ where it is advantageous,
helping the GA to avoid local optima. However, to make
use of the redundancy one has to find a parameter to
tune its influence properly.
3.3. The neighbourhood reduction operator
Optimal      Parent 1     Parent 2     Neighbourhood
1 1 2 2      1 2 2 2      1 1 2 2      1 . 2 2
1 1 2 2      3 3 2 1      3 1 2 2      3 . 2 .
3 3 4 4      3 1 1 4      3 1 4 4      3 1 . 4
3 3 4 4      3 4 4 4      3 3 4 4      3 . 4 4
Figure 3. $4 \times 4$ grid to be partitioned into 4 subsets (a dot marks a position on which the parents disagree)
It has been argued that poor performance will result if the neighbourhood becomes too large. To reduce
the neighbourhood it has been suggested that all redundancy should be removed. However, this effect is limited
and it may be desirable to make a further reduction of
the neighbourhood size. This will involve a corresponding increase in the cardinality of the set $I_{fix}$. To achieve
this, additional variables are required to be fixed in their
values. One could use random values within the feasible
range, but this would impair the aim of a stricter transmission. A simple approach is to make use of the parent
solutions by taking the values in a random fashion from
both parents or strictly from only one of the parent solutions. The latter approach will bias the neighbourhood
towards one of the parents and is more likely to improve the local performance. Therefore, an algorithm
is implemented which moves half of the elements belonging to the set $I_{free}$ to the set $I_{fix}$, adjusting the
corresponding position in the child chromosome to the
values of one parent solution. However, by reducing the
neighbourhood in this manner, the GA may be forced
directly into a local optimum. To avoid this case the
neighbourhood reduction operator was only used in connection with the escaping algorithm described earlier.
In the following experiments the use of the neighbourhood reduction operator in conjunction with the escape
mechanism is marked by `+E+N', e.g. rHNS+E+N.
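The reduction step can be sketched as below. The text specifies that half of the free indices are fixed to the values of one parent; which half is chosen is not stated, so the random selection used here is an assumption, as is the function name.

```python
import random


def reduce_neighbourhood(fix, free, child, parent, rng=random):
    """Move half of the free indices to the fixed set, copying one parent.

    The indices to be moved are drawn at random (an assumption); the child
    is updated in place with the alleles of `parent` at those positions.
    """
    chosen = rng.sample(free, len(free) // 2)
    for i in chosen:
        child[i] = parent[i]
    remaining = [i for i in free if i not in chosen]
    return fix + chosen, remaining
```

The greedy algorithm then only has to complete the positions that are left in the free set.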
4. Experimental design
To compare the traditional approach with the proposed HNS operators the first experiment used the
$10 \times 10$ grid. For this task a steady state GA was
used where an offspring replaces an individual randomly
drawn from the lower half of the present population.
The population size was set to 100 and the number of trials to 5000. The parents for the recombination operators
were chosen randomly and no mutation operator was
used. The initial population was obtained in two steps.
Initial permutations for the traditional approach were
generated by `shuffling' the numbers $0, \ldots, 99$; then, to
get an initial population, these were translated into partitions by randomly assigning each vertex to a subset
obeying eq. (1). All experiments were run 30 times
using different initial populations.
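One trial of the steady state scheme described above can be sketched as follows. The cost function and the recombination operator are placeholders supplied by the caller, and drawing two distinct parents is an assumption of this sketch; `lower half' is read as the half of the population with the worse cost.

```python
import random


def steady_state_step(population, cost, recombine, rng=random):
    """One trial: random parents, child replaces a random lower-half member.

    population: list of chromosomes (updated in place)
    cost:       function to be minimised
    recombine:  function (parent1, parent2) -> child
    """
    p1, p2 = rng.sample(population, 2)
    child = recombine(p1, p2)
    # Rank the indices from best (lowest cost) to worst.
    ranked = sorted(range(len(population)), key=lambda i: cost(population[i]))
    worse_half = ranked[len(ranked) // 2:]
    population[rng.choice(worse_half)] = child
    return child
```

Because only members of the worse half are ever replaced, the best solutions found so far always survive.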
k     PMX+Gr   rHNS    rHNS+E   rHNS+R   rHNS+E+R   rHNS+E+N
2     16.0     11.1    10.6     11.6     10.6       10.6
4     38.1     22.2    22.4     23.9     22.2       21.4
7     51.8     37.9    37.5     38.6     37.3       37.6
15    73.6     77.8    75.3     74.0     72.5       68.9
Table 2. Average best value
Table 2 shows that for $k < 15$ all HNS operators produced roughly the same performance and converged toward better solutions than the traditional approach. For
$k = 15$, apart from the rHNS+E+N operator, the HNS
operators did not achieve appreciably better results. However, they
differ in the computational time required. The results
in Table 3 show that in the cases where the neighbourhood is reduced either by using a non-redundant code
or by the rHNS+E+N operator, the computational time
decreases significantly. Furthermore, one can observe
that for $k < 15$ the rHNS operator is slightly quicker
on average than rHNS+R, whilst for $k = 15$ rHNS+R
is better. This illustrates the different influences of the
redundancy on a GA as argued in the previous section.
k     PMX+Gr   rHNS    rHNS+E   rHNS+R   rHNS+E+R   rHNS+E+N
2     47       10      17       10       14         12
4     58       13      17       15       17         18
7     76       25      27       25       26         27
15    140      101     100      77       76         78
Table 3. Computational time in seconds
4.1. Geometric graphs
To demonstrate the performance of the proposed operators when applied to graphs with a less regular structure, geometric graphs with different weights at the
edges and vertices were considered. The first to be investigated was a geometric graph like the G 600 graph
reported in [2]. Using the same set of parameters, i.e.
600 nodes and an expected degree of 10, a graph was generated with 2844 edges, the sum of vertex weights was
1779, and the sum of edge weights 20096. The graph
was to be partitioned into 20 subsets.
Besides the operators used in the previous experiment, a standard GA using the ST operator was also
used. To set up a fair competition, noting the fact that
the algorithms differ in their computational complexities, the computational time was limited to 100 seconds
per run using a SPARCstation IPX. The population
size was set to 50.
Table 4 summarizes the results obtained after 30 runs.
It shows the best value and the average best values in
the first two rows, and the number of trials completed
in 100 seconds in the third row. To highlight the different influences of the heuristic, the number of `randomly'
placed vertices at the start and end of a typical run are
given in the fourth and fifth rows. In this experiment the
rHNS+E+N operator clearly outperforms the other operators. The number of `randomly' placed nodes seemed
to fall in an `optimal' range of roughly 60% at the beginning, decreasing rapidly to 15% at the end of the run. In
comparison the ST operator, which placed only 3.5% of the vertices `randomly', performed very poorly. Furthermore,
the performance of a standard GA was no better than
that produced by any other random technique, so that
the `pure' recombination of two solutions cannot substitute for the knowledge incorporated by a heuristic (at
least not in the time allowed). On the other hand the
PMX/greedy algorithm placed nearly 90% `randomly'
(i.e. according to the heuristic) which led to poor results
although the initial sequences for the greedy algorithm
were optimized during the run. Therefore, it appears
that the best results will be obtained if the influence of
the heuristic can be tuned properly as in the case of the
rHNS+E+N operator.
          PMX+Gr   rHNS    rHNS+E   rHNS+R   rHNS+E+R   rHNS+E+N   ST
G 600 10 graph
Best      6721     6378    6270     5729     5856       3263       18653
Aver      7207     6976    6806     6408     6495       3692       18718
Trials    350      550     500      650      600        1000       650
Rstart    90.2%    76.6%   78.4%    72.7%    76.5%      57.1%      3.5%
Rend      90.2%    74.5%   73.1%    56.6%    57.4%      15.0%      3.5%
G 600 4 graph
Best      1948     1848    1570     1627     1532       484        7487
Aver      2070     2016    1876     1868     1775       749        7570
Trials    300      450     400      450      450        850        650
Rstart    90.5%    78.1%   80.3%    76.4%    78.3%      54.2%      3.5%
Rend      90.3%    77.6%   78.5%    69.8%    51.8%      20.5%      3.4%
Table 4. Results for the geometric graphs
In order to increase the confidence of the results a second geometric graph was generated with 600 nodes and
an expected degree of 4. The resulting graph had 1171
edges, the sum of node weights was 1757 and the sum
of edge weights 8229. The results for this graph generally confirm the previous observations, although there is
an interesting difference between the best value and the
average best value for the rHNS+E+N operator, indicating a more complex search space in which the operator
becomes trapped more often. This observation is supported by the fact that the operators which made use
of the escaping mechanism achieved significantly better
results than those which did not.
5. Structural crossover
The above results are encouraging, but it is clear that
the HNS operators do not really perform any recombination as traditionally understood in the realm of genetic
algorithms. The second parent merely determines the
nodes belonging to $I_{fix}$ but contributes minimal additional information concerning the partition. Although
the results appear to be fairly good, one has to recognize that the GA does not use a proper recombination
operator, which is usually regarded as one of the hallmarks of the GA approach.
5.1. Redundancy
The main problem in the construction of appropriate
recombination operators is caused by the highly redundant nature of the group-number encoding. For example, if solution A assigns the vertices $v_i$ and $v_j$ to subset
$a$, while solution B assigns the vertices $v_i$ and $v_j$ to subset $b$, then both solutions put $v_i$ and $v_j$ into a common
subset. However, the difference in labelling hides this
similarity, so that the redundancy prevents the GA from
finding similarities among the solutions and increases
the difficulty of finding good `building blocks'. In the
following, a procedure is given which allows the recombination of two partitions, while preserving this kind of
structural information.
Given $G = (V, E, \omega_v, \omega_e)$, two partitions can be written as $V_1 = V_{11} \cup V_{12} \cup \cdots \cup V_{1k} = V$ and $V_2 = V_{21} \cup V_{22} \cup \cdots \cup V_{2k} = V$. Then
$$V = V_1 \cap V_2 = \bigcup_{i=1}^{k} \bigcup_{j=1}^{k} \left( V_{1i} \cap V_{2j} \right).$$
The resulting subsets $V_{1i} \cap V_{2j}$ contain only those vertices which are assigned by both parents to a common
subset. Let $V'$ be the set of all resulting subsets. Then
a feasible partition of $V'$ will also be a feasible partition
of $V$, and a respectful combination of $V_1$ and $V_2$, in the
sense that all vertices assigned to a common subset by
both parents will remain in one.
However, to implement this procedure directly would
be computationally expensive, because the intersections of $k^2$
pairs of subsets have to be calculated. In the following, several
approaches will be discussed for realizing such a procedure more effectively.
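To make the construction concrete, the non-empty intersections $V_{1i} \cap V_{2j}$ can be obtained by grouping the vertices on their pair of parent labels, as in the sketch below. This only illustrates the set of blocks defined above and is not one of the procedures proposed in the paper; the function name is hypothetical.

```python
from collections import defaultdict


def common_blocks(parent1, parent2):
    """Group vertices by the pair (label in parent1, label in parent2).

    Each group is one non-empty intersection of a subset of the first
    parent with a subset of the second parent.
    """
    blocks = defaultdict(list)
    for v, pair in enumerate(zip(parent1, parent2)):
        blocks[pair].append(v)
    return dict(blocks)


print(common_blocks([0, 0, 1, 1], [1, 1, 1, 0]))
```

Any child that keeps each of these blocks together is respectful towards both parents, whatever labels the parents happen to use.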
5.2. The SR operator
The structural recombination operator proposed by
von Laszewski [11] can be considered as an approximation to the procedure given above. One subset, say $V_{2j}$,
is chosen randomly from the second parent and copied
into the partition of the first parent. This temporary and
probably illegal partition is then repaired as follows. If
$V_{1j}$ is the subset replaced by $V_{2j}$ then a repairing mechanism detects all vertices belonging to $V_{1j} \cap V_{2j}$ and assigns them randomly to the remaining subsets to obtain
a legal solution. However, in this approach, the operator
still identifies the subset $V_{1j}$ as the corresponding subset to $V_{2j}$ according to the group-number. Although von
Laszewski did propose a modified procedure to remove
the influence of the redundancy, it has a high computational cost.
In this paper a modified version of this structural
crossover is used. The `SR' operator differs from the
original in the following respects:
- For generating the initial population a clustering algorithm is used to partition the graph into $k$ clusters using some randomly chosen vertices as initial points.
- The subset which is copied is chosen according to some partial fitness value, to prevent a bad partial solution from being inherited by the offspring. The only candidate subsets are those which have an above average internal cost within the partition.
- The subset $V_{2j}$ does not simply replace the subset bearing the same group-number, but the one which has the most nodes in common, say $V_{1i}$. In this way, as much as possible of the information encoded in the parent solutions is preserved.
- All vertices belonging to $V_{1i} \cap V_{2j}$ are re-assigned by means of a greedy algorithm instead of randomly.
As a final variation, the `SR+Gr' operator additionally re-assigns all those vertices whose internal costs are
less than or equal to the external costs by means of a
greedy algorithm.
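The third modification, choosing the subset of the first parent that has the most nodes in common with the copied subset, can be sketched as follows. The function name and the argument layout are assumptions; ties are resolved in favour of the label encountered first, which the paper does not specify.

```python
from collections import Counter


def best_matching_subset(parent1, donor_members):
    """Label of the subset in parent1 that shares most vertices with the donor.

    parent1:       group-number chromosome of the first parent
    donor_members: vertices of the subset copied from the second parent
    """
    overlap = Counter(parent1[v] for v in donor_members)
    return overlap.most_common(1)[0][0]


# The donor subset {2, 3, 4, 5} overlaps mostly with subset 1 of parent1.
print(best_matching_subset([0, 0, 1, 1, 1, 2], [2, 3, 4, 5]))
```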
6. Results
In this section, some computational experiments are
reported for the operators `SR' and `SR+Gr' as described above. The results of the `PMX+Gr' and
`rHNS+E+N' approaches (as defined earlier) are repeated for comparison.
6.1. The grid instances
The initial experiments again used the $10 \times 10$ grid
with equal weights at the edges and vertices, to be partitioned into $k \in \{2, 4, 7, 15\}$ subsets. Tables 5 and 6
display the performances achieved. The undoubted winner of this competition is the structural recombination
operator `SR+Gr'. The speed of this approach is particularly impressive. It appears that the very regular
structure of the graph favours this approach.
k     PMX+Gr   rHNS+E+N   SR+Gr   SR
2     16.0     10.6       10.0    10.0
4     38.1     21.4       20.0    20.0
7     51.8     37.6       37.7    39.8
15    73.6     68.9       66.3    75.3
Table 5. Average best value
k     PMX+Gr   rHNS+E+N   SR+Gr   SR
2     47       12         3       3
4     58       18         4       4
7     76       27         4       4
15    140      78         8       9
Table 6. Computational time in seconds
6.2. Geometric graphs
Table 7 shows the best and the average best solutions
found within 30 runs for the geometric graphs. Furthermore, the number of trials performed within 100 seconds highlights the computational effort required. Finally, the fourth and fifth rows of the tables display the
percentage of vertices placed by the heuristic at the beginning and end of the search respectively, in order to
illustrate the influence of the heuristic during the search.
          PMX+Gr   rHNS+E+N   SR+Gr    SR
G 600 10 graph
Best       6721      3263      2964    3781
Aver       7207      3692      3087    4063
Trials      350      1000      1800    2200
Hstart     100%     79.5%     12.6%    3.5%
Hend       100%     39.1%      9.3%    3.3%
G 600 4 graph
Best       1948       484       368     854
Aver       2070       749       459    1004
Trials      300       850      3600    4000
Hstart     100%     73.5%     10.0%    5.9%
Hend       100%     41.0%      3.3%    2.9%
Table 7. Results for the geometric graphs
The basic SR operator is inferior here to the HNS
operator, but SR+Gr dominates again, presumably because of the reduced influence of the heuristic, and the
corresponding increased influence of recombination.
6.3. Random graph
In the final set of experiments, a random graph was
generated with 400 vertices and an expected degree
d = 8. The final graph had 1762 edges and was to be
partitioned into 8 subsets. For all algorithms the initial
conditions were as above.
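The generator used for this graph is not described in this section. Purely as an illustration, the sketch below uses one common model, in which each of the n(n-1)/2 possible edges is included independently with probability p = d/(n-1), so that every vertex has expected degree d. The function names and the choice of model are assumptions. Under this model the expected number of edges is nd/2 = 1600, whereas the graph reported here has 1762 edges (mean degree 2 · 1762/400 = 8.81), so the generator actually used may differ.

```python
import random
from itertools import combinations

def random_graph(n, d, seed=None):
    """Edge list of a random graph with n vertices and expected degree d.

    Assumed model: each possible edge is included independently
    with probability p = d / (n - 1).
    """
    rng = random.Random(seed)
    p = d / (n - 1)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

def mean_degree(n, m):
    """Mean vertex degree of a graph with n vertices and m edges."""
    return 2 * m / n

edges = random_graph(400, 8, seed=1)
print(len(edges), mean_degree(400, len(edges)))
```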
          PMX+Gr   rHNS+E+N   SR+Gr    SR
Best       3000      2593      2638    2806
Aver       3040      2655      2697    2831
Trials     1100      3300      2100    4300
Hstart     100%     50.4%     32.2%    4.5%
Hend       100%      5.0%     54.0%   11.3%
Table 8. Results for the R 400 8 graph
The main difference between this experiment and the
previous one was that the number of vertices placed by
the greedy algorithm increased for SR+Gr, considerably
slowing down the operator. It appeared that the random
graph offers less structural information, so that recombination disturbs the parent solutions rather than leading
to an improvement. Thus the HNS operator, which actually performs no recombination, seems to cope better
with the random graph than the SR operators. However,
a more thorough analysis has revealed a problem with
the selection of the vertices which are to be re-assigned
by the greedy algorithm. This aspect is currently under
investigation, but preliminary results seem to indicate
that the performance of SR+Gr can be enhanced by using a more flexible approach.
7. Conclusion
A new group of heuristic neighbourhood search operators has been proposed and investigated for the GPP.
They are constructed by combining a heuristic directly
with the recombination operator. In contrast to the traditional hybridized GAs the operation of the heuristic
is restricted, so that the HNS operators are less computationally intensive and preserve the parent solutions
better. Several operators have been investigated for controlling the influence of the heuristic by manipulating
the size of the neighbourhood. This approach demonstrates better performance than the simple `PMX+Gr'
approach of [7], in terms of computational time and
quality of results when applied to two examples of the
graph partitioning problem. On the whole, the results
are fairly encouraging and it is believed that the approach may be applicable to a wider range of combinatorial problems.
However, there are reasons for believing that in many
cases a GA can only outperform other search techniques
if full use of recombination is made. Therefore, an alternative recombination operator has been investigated to
find one which best preserves the information gathered
by the parent solutions while being computationally efficient. The results achieved suggest that some version
of the structural recombination operator is likely to produce the best results overall, although for random graphs
it does not do quite as well as the best HNS approach.
Overall, the results obtained indicate that the performance of a GA approach to graph partitioning can be
considerably enhanced if account is taken of problem-specific factors in designing and implementing operators.
References
[1] T.Chockalingam and S.Arunkumar (1995) Genetic algorithm based heuristics for the mapping
problem. Computers & Opns Res., 22, 55-64.
[2] L.Tao and Y.Zhao (1993) Multi-way graph partitioning by stochastic probe. Computers &
Opns Res., 20, 321-347.
[3] J.H.Holland (1975) Adaptation in Natural and
Artificial Systems. University of Michigan Press,
Ann Arbor.
[4] N.J.Radcliffe (1991) Forma analysis and random
respectful recombination. In [12], 222-229.
[5] C.R.Reeves (1994) Genetic algorithms and
neighbourhood search. In T.C.Fogarty (Ed.)
(1994) Evolutionary Computing: AISB Workshop, Leeds, UK, April 1994; Selected Papers.
Springer-Verlag, Berlin.
[6] D.E.Goldberg (1989) Genetic Algorithms in
Search, Optimization, and Machine Learning.
Addison-Wesley, Reading, Mass.
[7] D.R.Jones and M.A.Beltramo (1991) Solving
partitioning problems with genetic algorithms.
In [12], 442-449.
[8] C.R.Reeves (1996) Hybrid genetic algorithms
for bin-packing and related problems. To appear
in Annals of Operations Research.
[9] J.N.Bhuyan, V.V.Raghavan and V.K.Elayavalli
(1991) Genetic algorithm for clustering with an
ordered representation. In [12], 408-415.
[10] N.J.Radcliffe and F.A.W.George (1993) A
study in set recombination. In S.Forrest (Ed.)
(1993) Proceedings of 5th International Conference on Genetic Algorithms. Morgan Kaufmann, San Mateo, CA, 23-30.
[11] G.von Laszewski (1991) Intelligent structural
operators for the k-way graph partitioning problem. In [12], 45-52.
[12] R.K.Belew and L.B.Booker (Eds.) (1991) Proceedings of 4th International Conference on Genetic Algorithms. Morgan Kaufmann, San Mateo, CA.
[13] S.Forrest (Ed.) (1993) Proceedings of 5th International Conference on Genetic Algorithms.
Morgan Kaufmann, San Mateo, CA.