An Experimental Study of LP-Based Approximation Algorithms For Scheduling Problems
An Experimental Study of LP-Based Approximation Algorithms For Scheduling Problems
An Experimental Study of LP-Based Approximation Algorithms For Scheduling Problems
R. N. Uma
Department of Computer Science, University of Texas at Dallas, M/S EC 31, Richardson, Texas 75083-0688, USA,
[email protected]
Joel Wein
Department of Computer Science, Polytechnic University, Brooklyn, New York 11201, USA, [email protected]
R ecently there has been much progress on the design of approximation algorithms for a variety of scheduling
problems in which the goal is to minimize the average weighted completion time of the jobs scheduled.
Many of these approximation algorithms have been inspired by polyhedral formulations of the scheduling
problems and their use in computing optimal solutions to small instances.
In this paper we demonstrate that the progress in the design and analysis of approximation algorithms for
these problems also yields techniques with improved computational efficacy. Specifically, we give a compre-
hensive experimental study of a number of these approximation algorithms for 1rj wj Cj , the problem of
scheduling jobs with release dates on one machine so as to minimize the average weighted completion time
of the jobs scheduled. We study both the quality of lower bounds given for this problem by different linear-
programming relaxations and combinatorial relaxations, and the quality of upper bounds delivered by a number
of approximation algorithms based on them. The best algorithms, on almost all instances, come within a few per-
cent of the optimal average weighted completion time. Furthermore, we show that this can usually be achieved
with On log n computation.
In addition we observe that on most kinds of synthetic data used in experimental studies a simple greedy
heuristic, used in successful combinatorial branch-and-bound algorithms for the problem, outperforms (on aver-
age) all of the LP-based heuristics. We identify, however, other classes of problems on which the LP-based
heuristics are superior and report on experiments that give a qualitative sense of the range of dominance of
each. We consider the impact of local improvement on the solutions as well.
We also consider the performance of the algorithms for the average weighted flow-time criterion, which,
although equivalent to average weighted completion time at optimality, is provably much harder to approx-
imate. Nonetheless, we demonstrate that for most instances we consider that the algorithms give very good
results for this criterion as well.
Finally, we extend the techniques to a rather different and more complex problem that arises from an actual
manufacturing application: resource-constrained project scheduling. In this setting as well, the techniques yield
algorithms with improved performance; we give the best-known solutions for a set of instances provided by
BASF AG, Germany.
Key words: single-machine scheduling; linear programming; approximation algorithms
History: Accepted by Jan Karel Lenstra, Area Editor; received August 2002; revised August 2003; accepted
August 2003.
established to be a small-constant-factor
approxima- combinatorial relaxations and study the quality of
tion algorithm for 1rj wj Cj (Hall et al. 1997). their performance.
Thus results from the enumerative “community” have Another optimality criterion that is closely related
had a direct impact on the approximation commu- to average weighted completion
time is the average
nity. Subsequently, several papers appeared that gave weighted
flow time, w j C j − rj (often denoted
more ingenious variants of the algorithms in Hall wj Fj ), which in many settings is a much better cri-
et al. (1997) with improved performance guaran- terion of good average service. These two criteria
tees (Goemans 1997, Chekuri et al. 2000, Schulz and are equivalent at optimality, but from the perspective
Skutella 1997a, Wang 1996, Goemans et al. 2002). In of approximation they are very different there is no
this paper, we demonstrate experimentally that these -approximation algorithm for the nonpreemptive
improved approximation algorithms yield improved minimization of average flow time √ of jobs with release
empirical performance as well. We also generalize dates on one machine with = o n unless P = NP
the techniques to yield improved algorithms for a (Kellerer et al. 1999). Nonetheless, we demonstrate
more complex scheduling problem that arises in an that for many of the instances that we considered,
actual manufacturing application from BASF AG, the approximation algorithms perform very well for
Germany. We feel that the experimental results in this the average weighted flow-time criterion as well. We
paper, taken in conjunction with a number of previ- also identify, however, several categories of prob-
ous papers on both enumerative and approximation lem instances for which the performance of these
approaches, represent an example of good synergy techniques degrades dramatically for the average
between theory, experimentation, and practice. weighted flow-time criterion.
In this paper, we demonstrate the impact that We also study a number of more detailed issues
progress in the approximation approach can have about the performance of the algorithms, including
the impact of the quality of the linear-programming
on enumerative/computational approaches. We show
relaxation to which they are applied and the power
that the ideas that lead to improved approximation
of randomization. In addition, we note that simple
algorithms also lead to heuristics that are quite effec-
local-improvement techniques are often very success-
tive in empirical experiments. Furthermore we show
ful in giving good solutions to scheduling prob-
that they can be extended to give improved heuristics
lems (Anderson et al. 1997). We therefore consider
for more complex problems that arise in practice.
the impact of some simple local-improvement tech-
Specifically, we study the performance of a suite of
niques when applied both “from scratch” and to the
different algorithms based both on the Cj -relaxation
solutions yielded by the various heuristics that we
and on the xjt -relaxation. We demonstrate that, consider.
although the xjt formulation is known to provide Finally, we show that the ideas behind the more
stronger lower bounds, for most of the instances we sophisticated algorithms lead to improved heuristics
considered, the Cj formulation, at a significantly less for a rather different sort of scheduling problem that
computational cost, provides both a lower bound arises in practice. Specifically, we consider a prob-
(through the solution of its relaxation) and an upper lem and data arising from a manufacturing appli-
bound (through associated heuristics) that are on cation from BASF AG, Germany, and give the best
average just a few percent from those given by known solutions for several specific instances of this
the time-indexed formulation. Furthermore, the best problem.
algorithms give very high-quality upper bounds for This last result highlights
average weighted completion time; the ratio of the an important aspect of
our study. Although 1rj wj Cj is a natural and
computed upper bound to the lower bound almost much-studied problem, instances of scheduling prob-
always comes within a few percent of optimal. lems that are exactly of this form are rare in practice.
In parallel with work on linear-programming The rationale for its importance is captured by the fol-
lower bounds for 1rj wj Cj there has been sig- lowing quote due to Dyer and Wolsey (1990, p. 255).
nificant
work on branch-and-bound algorithms for This is not an end in itself, but it is we believe one
1rj wj Cj based on combinatorial lower bounds of the inherently difficult single-machine problems for
(Belouadah et al. 1992, Bianco and Ricciardelli 1982, which it is a challenge to obtain strong lower bounds.
Dessouky and Deogun 1981, Hariri and Potts 1983). We hope that ultimately this approach will allow us to
The most successful of these is due to Belouadah tackle and solve problems including many machines
et al. (1992) who made use of two combinatorial and other types of constraints including deadlines,
lower bounds based on job splitting (Posner 1985, precedence constraints and order dependent process-
Belouadah 1985, Belouadah et al. 1992), and an upper ing times.
bound based on a simple greedy heuristic. We evalu- In the development of approximation algorithms
ate their lower bounds and the simple greedy heuris- their hope has been borne out, as techniques devel-
tic. We also apply the LP-based heuristics to their oped for the one-machine problem led to the design of
Savelsbergh, Uma, and Wein: Experimental Study of LP-Based Approximation Algorithms for Scheduling Problems
INFORMS Journal on Computing 17(1), pp. 123–136, © 2005 INFORMS 125
approximation algorithms for a wide variety of prob- where the binary variable xjt for each job j j =
lems (Hall et al. 1997; Chakrabarti et al. 1998; Chekuri 1 n and time period t − 1 t t = 1 T indi-
et al. 2000; Schulz and Skutella 1997a, b; Goemans cates whether job j completes in period t − 1 t
et al. 2002). We view the experimental results in this xjt = 1 or not xjt = 0. The assignment constraints
paper as further evidence towards the validation of (2) state that each job has to be completed exactly
their thesis. once, and the capacity constraints (3) state that the
Finally, we note that our goal is not an in-depth machine can handle at most one job during any time
understanding of the running times of the different period. This formulation can be used to model several
algorithms, but rather an understanding of the ap- single-machine scheduling problems by an appropri-
proximation performance of two different approaches ate choice of the objective coefficients and possibly a
with radically different running times; therefore we do restriction on the set of variables. For instance, if the
not report on running times in depth in this paper. objective is to minimize the weighted sum of the com-
We also note that we do not claim that our best algo- pletion times, we take coefficients cjt = wj t, where wj
rithms are the best known, as we have not conducted denotes the weight of job j; if there are release dates
in-depth comparisons with all other approaches. Our rj , i.e., job j becomes available at time rj , then we dis-
goal is simply to understand the behavior of the card the variables xjt for t = 1 rj + pj − 1.
above-mentioned approximation algorithms and to The xjt -LP is obtained from (1)–(4) by relaxing the
demonstrate the possible impact of these approaches integrality constraint on xjt (4) to
on real problems.
0 ≤ xjt ≤ 1 j = 1 n t = 1 T (5)
2. Background: Formulations and The xjt -LP has been observed to give strong lower
Algorithms bounds (de Sousa and Wolsey 1992, van den Akker
1994) but is very difficult to solve due to its size.
2.1. Time-Indexed Formulations Because the number of constraints is n + T and the
A time-indexed formulation is based on time dis-
number of variables is roughly nT with T > j pj ,
cretization. That is, time is divided into periods, even for instances with relatively few jobs, the size
where period t starts at time t − 1 and ends at time t. can be enormous. As a result, the memory required to
The planning horizon is denoted by T and there- store an instance and the time required to solve just
fore we consider the time periods 1 2 T . Dyer the LP relaxation may be prohibitively large. There-
and Wolsey (1990) had proposed a number of time- fore, for time-indexed formulations to be useful, ways
indexed formulations for scheduling problems. Here are needed to reduce the memory requirements and
we focus on two of their formulations, xjt -formulation the solution times of the LP relaxation. van den Akker
and yjt -formulation. These two formulations have et al. (1999, 2000) show that Dantzig-Wolfe decompo-
been studied extensively by de Sousa and Wolsey sition techniques and column-generation techniques
(1992), van den Akker et al. (1999, 2000), van den can be used for partial alleviation of the difficul-
Akker (1994), and Queyranne and Schulz (1994) and ties associated with the size of time-indexed formu-
have been useful in the design of approximation algo- lations. Even though Dantzig-Wolfe decomposition
rithms (Hall et al. 1997, Goemans 1997, Chekuri et al. techniques and column-generation techniques allow
2000, Schulz and Skutella 1997a, Wang 1996, Goemans the solution of the LP relaxation of much larger in-
et al. 2002). We describe below these two formulations stances, there are some inherent difficulties with that
for single-machine scheduling problems. approach too. Column-generation techniques tend to
xjt -Formulation. This formulation is general in that converge very slowly, especially on larger more diffi-
it models several single-machine scheduling problems cult instances.
The xjt -LP is a nonpreemptive relaxation. That is,
including 1rj wj Cj .
the relaxation is obtained by slicing jobs into pieces in
n
T a horizontal fashion such that any one piece requires
minimize cjt xjt (1) only a fractional capacity of the machine but the full
j=1 t=1
processing requirement of the job that it belongs to.
T A feasible solution to the xjt -LP for the instance in
subject to xjt = 1 j = 1 n (2) Table 1 is given in Figure 1. The solution in Figure 1
t=1
can be viewed as a schedule of such pieces (slices) of
t+p −1
n j jobs. For example, the slice of job j corresponding to
xjs ≤ 1 t = 1 T (3) xjt = 025 utilizes one fourth of the machine’s capacity
j=1 s=t
and is scheduled nonpreemptively from t − pj to t, for
xjt ∈ 0 1 j = 1 n t = 1 T (4) some t.
Savelsbergh, Uma, and Wein: Experimental Study of LP-Based Approximation Algorithms for Scheduling Problems
126 INFORMS Journal on Computing 17(1), pp. 123–136, © 2005 INFORMS
Table 1 A Problem Instance date rj and T . The capacity constraints (7) state that
the machine can handle at most one job during any
j rj pj
time period.
1 0 10 In a relaxation to the integer program (6)–(9), the
2 4 5 integrality constraint (9) is relaxed to
3 6 2
4 7 4 0 ≤ yjt ≤ 1 j = 1 n t = rj + 1 T (10)
We will refer to this linear program as yjt -LP. This lin-
yjt -Formulation. Another formulation for the ear program is a valid relaxation to the optimal pre-
emptive schedule as well (Hall et al. 1997).
single-machine scheduling problem 1rj wj Cj is:
Although the yjt formulation is of exponential size,
T with n + T constraints and roughly nT variables
n pj 1 1
minimize wj + t− y (6) where T ≥ j pj , Dyer and Wolsey (1990) showed that
j=1
2 pj t=rj +1 2 jt it is a transportation problem with a very special
n structure and thus can be solved in On log n time
subject to yjt ≤ 1 t = 1 T (7) (see also Goemans 1997). The structure of the solution
j=1 is simple: at any point in time, schedule the available
T unfinished job with maximum wj /pj (this may involve
yjt = pj j = 1 n (8) preemption). Note that pj denotes the total processing
t=rj +1 requirement of job j and not the remaining processing
requirement. The yjt -LP is a preemptive relaxation.
yjt ∈ 0 1 j = 1 n
A feasible solution to the yjt -LP for the instance in
t = rj + 1 T (9) Table 1 is given in Figure 2.
Although both formulations are exponential in size,
where the binary variable yjt for each job j (j = the relaxation to the yjt -formulation can be solved
1 n) and time period t − 1 t (t = 1 T ) indi- in polynomial time whereas the relaxation to the xjt -
cates whether job j is processed in period t − 1 t formulation requires a pseudo-polynomial-time solu-
(yjt = 1) or not (yjt = 0). The term in the objective func- tion. As a result, any approximation algorithm that is
tion, namely pj /2 + 1/pj Tt=rj +1 t − 12 yjt , corresponds based on the yjt -relaxation is a polynomial-time algo-
to the actual completion time of job j if the job were rithm whereas any approximation algorithm based on
continuously processed from Cj − pj to Cj . The second the xjt -relaxation is only a pseudo-polynomial-time
algorithm. Note however, that the xjt -relaxation gives
term in this expression, namely 1/pj Tt=rj +1 t − 12 yjt ,
stronger lower bounds than does the yjt -relaxation.
computes the midpoint of the computation for each
job j (using the middle of each time unit when the 2.2. Completion-Time Formulations
job runs). Adding the remaining half of the process- A different approach to model the problem is by using
ing time pj /2 to this midpoint of computation of job variables Cj that represent the completion time of
j gives the effective completion time of job j. This job j in a schedule. Let N be the set of all n jobs
effective completion time of job j is earlier than the and define for any set S ⊆ N , rmin S = minj∈S rj and
completion time of the last piece of job j even if the yjt pS = j∈S pj . Further, define pS2 = j∈S pj 2 and
are integral (unless all pieces of the job are scheduled 2
p2 S = j∈S pj . We will refer to the following relax-
consecutively). Thus the solution to the yjt -LP relax- ation to 1rj wj Cj as the Cj -relaxation:
ation provides only a lower bound to the preemp-
tive schedule. The constraints (8) require that each job
n
minimize wj C j
should be processed in its entirety between its release j=1
subject to pj Cj ≥ S for each S ⊆ N (11)
0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 j∈S
1 4 4 1 0 2 4 6 8 10 12 14 16 18 20
2 3 1
1 2 3 4 2 1
3 3 4 2 2
Figure 1 A Feasible Solution to the xjt -LP (1–3,5) for the Instance in Figure 2 A Feasible Solution to the yjt -LP (6–8,10) for the Instance in
Table 1 Table 1
Note. The circled numbers indicate the job id’s. The x-axis denotes time, and Note. The circled numbers indicate the job id’s. The x-axis denotes time and
the y -axis denotes the capacity of the machine (which is 1 unit). the y -axis denotes the capacity of the machine (which is 1 unit).
Savelsbergh, Uma, and Wein: Experimental Study of LP-Based Approximation Algorithms for Scheduling Problems
INFORMS Journal on Computing 17(1), pp. 123–136, © 2005 INFORMS 127
For all of these synthetic instances we solved the Table 2 Quality of the Lower Bounds with Respect to the Weighted
associated Cj -based relaxation (and also the BPP2 Flow Time Given by yjt , BPP2, and xjt Relaxations
relaxation). We solved the xjt -based relaxation, using Optimal Hard Synthetic 1 Synthetic 2
the column-generation code, for as many instances as
feasible. After a significant amount of computation yjt (BPP1) 4.506 19.169 1.729 1.848
BPP2 2.969 15.539 0.941 0.000
time, we solved this relaxation exactly for all of the xjt 1.322 0.000 0.013 N/A
n = 50 instances and the set of n = 100 instances with
arrival rates 2 and 5; for n = 100 we cut off the col- Note. We report on BEST − LB/LB × 100, where BEST is the best available
lower bound and LB is the corresponding lower bound. The values reported
umn generation somewhat early and therefore only are averaged over all the instances in each case.
produced approximate solutions. This demonstrates
an obvious and major advantage of the Cj -based LP-
relaxations over the xjt : They are solvable efficiently. approximate the wj Cj criterion within a few percent
The Hard set was designed to provoke poor per- of optimal, unless the instances are constructed in a
formance from the LP-based heuristics. All of these very careful fashion. Therefore we focus for the rest of
instances attempt to exploit the fact that the yjt relax- the paper on the wj Fj criterion. Note that the rela-
ation is also a valid relaxation of the optimal preemp- tive error with respect to average weighted flow time
tive schedule (Hall et al. 1997), and that therefore on is an upper bound on the relative error with respect
instances for which the optimal preemptive schedule to average weighted completion time.
is much better than the optimal nonpreemptive sched- We next examine the quality of the lower bound
ule, the yjt -based relaxation should perform poorly. delivered by each of the relaxations. Table 2 reports
Therefore these instances have one or several very on the relative performance of the different lower
large jobs, and a large number of tiny jobs that are bounds. We note that the BPP2 lower bound does
released regularly at small intervals. For the Hard set provide some improvement over the yjt bound at
we created instances with processing times 1 and 20 modest additional computational cost (On log n to
or 1 and 30. The size 1 jobs were generated on aver- On2 ), but because both are relaxations of the optimal
age 9 times more frequently than the size 20 or size preemptive schedule, on the Hard data set the BPP2
30 jobs and n ∈ 50 100. bound is still far from the xjt -lower bound. Further-
In this paper we focus on subsets of the data sets more, note that on the Optimal and Synthetic 1 data
that turn out to be among the most difficult in their sets sometimes BPP2 is better than the xjt -relaxation,
categories for the heuristics and that are at the same and this explains why none of the numbers in those
time representative of the overall behavior of that set. columns is 0—the comparison is always with respect
Specifically, we focus on the Optimal, Synthetic 1 and to the best lower bound for that instance. We note that
Synthetic 2, and Hard data sets. The Synthetic 1 set the maximum improvement observed by BPP2 over
corresponds to the subset of the Synthetic set where yjt on any instance was 9492%, on an instance in the
n ∈ 50 100 and a ∈ 2 5 and the Synthetic 2 data Hard data set.
set corresponds to a variation of the Synthetic set We next study the performance of the upper-
with release dates being generated uniformly and n ∈ bounding techniques (both approximation algorithms
50 100 200 500. and heuristics) on the different data sets.
On each solution to either linear program we Optimal and Synthetic Sets. For the instances in the
ran a suite of algorithms, including Schedule-by-C
j , Synthetic and Hard data sets we do not have the opti-
mal solutions (as n = 50 was beyond the range of
Schedule-by-Fixed- (for = 1/ 2), Schedule-by-
problems solvable optimally by the branch-and-cut
Best- , Schedule-by-Random- (based on the uni-
code of van den Akker et al. (1999). However, because
form distribution), and Schedule-by-Random- j .
the algorithms will solve the linear-programming
3.2. Experimental Results relaxations, yielding a lower bound, and then con-
struct a schedule based on this relaxation, yielding
3.2.1. Relative Strengths of Lower and Upper an upper bound, we report as an upper bound on
Bounds. We now look more carefully at the perfor- performance the ratio of the upper bound to the
mance of the algorithms, first those based on the Cj lower bound. We report these ratios with respect
relaxations. With respect to the wj Cj optimality cri- to the xjt -based relaxation if we know its solution
terion the two basic algorithms (Schedule-by-C
j and and otherwise with respect to the Cj -based relax-
Schedule-by-Fixed- ) give solutions that are almost ation. The Optimal and Synthetic data sets give similar
always within 95% of optimal, and Schedule-by- results qualitatively, both with respect to the gen-
Best- is almost always within 2% of optimal, with a eral size of factors of approximation achieved by the
maximum observed performance of a factor of 1.037 algorithms and their relative performance. Here we
times optimal. As this suggests, in general it is easy to present results for the Synthetic 1 data set.
Savelsbergh, Uma, and Wein: Experimental Study of LP-Based Approximation Algorithms for Scheduling Problems
130 INFORMS Journal on Computing 17(1), pp. 123–136, © 2005 INFORMS
Table 3 Performance of Algorithms Applied to Solution of Cj -Relaxation for wj Fj
n a Mean Std. dev. Max Mean Std. dev. Max Mean Std. dev. Max
50 2 1.271 0.144 1.729 1.184 0.092 1.495 1.066 0.047 1.282
50 5 1.107 0.045 1.245 1.073 0.032 1.197 1.018 0.013 1.071
100 2 1.268 0.125 1.786 1.190 0.096 1.686 1.071 0.043 1.271
100 5 1.079 0.030 1.152 1.056 0.018 1.104 1.014 0.010 1.055
Frequency
on different distributions. Therefore, in reporting 30
results we group into one set all wj pj distribution
combinations for a particular n a pair. 20
Despite the worst-case results on the difficulty of
10
the wj Fj objective, the performance of these algo-
rithms is quite reasonable, as illustrated in Table 3. 0
The basic algorithms are observed performing as
1.015–1.098
1.099–1.182
1.183–1.266
1.267–1.350
1.351–1.434
1.435–1.518
1.519–1.602
1.603–1.686
1.687–1.770
1.771–1.854
badly as 1.786 times optimal, but Schedule-by-Best-
is observed to perform only as badly as 1282 times
optimal. We also note that on average and in its ALG/LP ratios for N = 50, A = 2,(WFT - Cj ) (90 instances)
maximum observed values Schedule-by-Fixed- is
better than Schedule-by-C
j , and Schedule-by-Best- Figure 3 Distribution of Ratios of ALG/LP
gives significant improvement over both. However, Note. ALG is the cost of the solution obtained from running the corre-
this need not be true on every instance—on about sponding algorithm on the Cj -based relaxation and LP is the solution to the
Cj -relaxation with respect to wj Fj criterion for the case n a = 50 2
15% (over the entire Synthetic data set, which is the
(a total of 90 instances).
superset of the Synthetic 1 data set) Schedule-by-
jobs, and a large number of tiny jobs that are released The Schedule-by-Best- algorithm is observed to
regularly at small intervals. For such an instance, its perform as badly as 2.428 times optimal and inci-
average flow time with preemption allowed is much dentally, this performance was obtained when sched-
smaller than its nonpreemptive average flow time uled using the stronger xjt -relaxation than the weaker
because with preemption, each tiny job can be snuck yjt -relaxation. Later we will discuss more about the
in preemptively without overly disturbing the pro- impact of the quality of the underlying relaxation to
cessing of the large job. We would expect the xjt - the algorithms’ performance.
formulations to give stronger lower bounds, as it is Conclusions. To conclude, the observed quality of
only a relaxation of the nonpreemptive problem, but, the lower bounds directly correspond to the time
with respect to average weighted flow time they will spent on computing them, with yjt -relaxation giv-
still be weak on such instances because the small jobs ing the weakest bounds and xjt -relaxation giving the
can be snuck in in a fractional sense. strongest bounds. (Hence, if the quality of the lower
To generate difficult instances of this form we bounds is not highly critical to the end applica-
need processing times in a larger range than 1 10 tion, then using yjt -relaxation is better than using xjt -
and thus we experimented with instances with pj relaxation because of the additional time it takes to
generated in 1 100. We generated job sizes in a solve the xjt -relaxation.) Likewise, there is a direct
bimodal distribution, with the modes at 5 and 85; correspondence of the quality of the schedule given
the size 5 jobs were generated on average 9 times by an approximation algorithm with its performance
more frequently. On this set of instances, which we guarantee. In general, Schedule-by-Best- outper-
investigated for n ∈ 50 100 200 500, the worst per- forms all the other approximation algorithms.
formance (upper bound/lower bound ratio) recorded
over all algorithms exceeds 4 and the average perfor- 3.2.2. Randomness and Sampling. On most in-
mance measure of the best algorithm approaches 2. stances, we see that the Schedule-by-Best- algo-
We note that the algorithms are actually doing better rithm yields significant improvement over the other
than this because the Cj -based lower bound to which algorithms. However, it requires extra computation,
we compare is weak. as one must check n different values of , and thus
Based on this experiment, we experimented with is an On2 algorithm. Note that Schedule-by-Best-
instances with a very large spread—for example, jobs actually arose as a derandomization of the Schedule-
of size 1 and 200. On one such instance we recorded by-Random- algorithm (Goemans 1997, Chekuri
a ratio for Schedule-by-Best- of 16.502, which was et al. 2000). To what extent does randomization afford
in part due to the weak lower bound. However, the the performance guarantees of Schedule-by-Best-
actual approximation factor, when the instance was at a reduced computational cost? Qualitatively we
solved optimally, was still 7.110. In this instance we summarize that both Schedule-by-Random- and
have one large job released at time 0 and a sequence Schedule-by-Random- j , on average, give perfor-
of small jobs that arrive periodically over time. In the mance better than Schedule-by-C
j (although occa-
optimal schedule the large job should be processed sionally they can do worse) but they do not do nearly
last. However, in the Cj -based relaxation it is com- as well as Schedule-by-Best- .
pleted, preemptively, much earlier, and therefore for It is, however, a natural idea to run the randomized
all its -point is earlier than a large number of algorithm several times and choose the best answer it
small jobs. As a result, many small jobs are delayed gives. In a similar spirit, perhaps we need not check
unduly by all of our algorithms. We note that these all , as does Schedule-by-Best- ; maybe checking
particular instances were very similar in spirit to the just a few will be sufficient. Thus, we introduce three
instances given by Queyranne (1993) and Wang (1996) additional algorithms. Schedule-by-k-Best- divides
that demonstrate a lower bound of e/e − 1 on the 0 1 into k equal subintervals and tries each of
worst-case ratio of (Optimal Schedule value)/(Value the resulting endpoints as candidate , returning the
of Cj -based relaxation). best answer. Goemans (1997) has shown that this is
Unfortunately, instances with pj ∈ 1 100 are well a 2 + 1/k-approximation algorithm. Schedule-by-k-
beyond our ability to solve the xjt time-indexed relax- Random- and Schedule-by-k-Random- j run the
ation. Therefore, in an attempt to understand the rel- respective randomized algorithms k times.
ative performance of the lower bound given by the We conclude that it is necessary to test relatively
xjt -relaxation for such hard instances we created the few values of to achieve performance very close
Hard data set with pj ∈ 1 20 and 1 30 (with pro- to Schedule-by-Best- . For example, in the Synthetic
cessing times 1 and 20 or 30). For these we could solve data set Schedule-by-k-Best- , with k = 5, on aver-
the xjt -relaxation except for the 100 job instances with age is less than one percent from Schedule-by-Best- ,
pj ∈ 1 30. On this set, on average the xjt -relaxation with standard deviation 12%; on 48% of instances it
is about 19% stronger than the Cj -based relaxation. achieved exactly the same bound. Even on the Hard
Savelsbergh, Uma, and Wein: Experimental Study of LP-Based Approximation Algorithms for Scheduling Problems
132 INFORMS Journal on Computing 17(1), pp. 123–136, © 2005 INFORMS
wj Fj
3,560
k-Random- j , for k large enough, yields on average
moderate performance improvements. For example 3,540
on the Synthetic set with n a = 50 2, it on aver-
age improved on Schedule-by-Best- around k = 50 3,520
3.2.3. Algorithm Performance vs. Quality of Figure 4 Performance of the Schedule-by-Best-
Heuristic with Suc-
Linear-Program Solution. Overall, our results sug- cessive LP’s That were Generated Compared with the Opti-
mal Value
gest that the higher the quality of the LP relaxation
solution, the better the algorithms perform. This is root node, and run the algorithms on them. Some-
true across all the data sets, when we compare the what surprisingly, the algorithms’ performance does
performance of the algorithms based on the BPP2 not monotonically improve with improved LP solu-
relaxation and the yjt -relaxation, as is evident from tion. For instance, see Figure 4 where we have shown
Table 5. Recall that both BPP2 and yjt are preemp- the plots for one of the 10 instances. It is an interest-
tive relaxations and BPP2 is stronger than yjt . But ing question to understand this phenomenon better
this is not always the case when we consider the and its implications. If the goal is only approximately
xjt -relaxation also. For example, on the 50 5 Syn- optimal performance, these results suggest that there
thetic data Schedule-by-Fixed- does on average may be little benefit in doing the most intensive com-
slightly worse when it is based on the xjt -relaxation, putations.
and on the Hard instances for which we had solu- 3.2.4. SWPT vs. LP-based Heuristics. Of greater
tions to both relaxations the algorithms based on the interest is to compare the performance of all this
Cj relaxation were also better on average. We also LP “stuff” to the simple and naive heuristic used
explored this relationship from a different perspec- by Belouadah et al. (1992) in their combinatorial
tive. Specifically, recall that a branch-and-cut code branch-and-bound code. This heuristic, SWPT, sim-
for the solution of an integer program produces ply, when idle, selects for processing the unprocessed
a sequence of solutions to the linear-programming available job with the smallest pj /wj ratio (or equiv-
relaxation that are of increasing quality (i.e. increas- alently the largest wj /pj ratio) and schedules it non-
ingly close to the optimal integer solution). For 10 preemptively. It is trivial to see that the worst-case
instances we have generated the sequence of solu- performance of this heuristic is unbounded. Table 5
tions to the time-indexed linear-programming relax- provides a summary of the performance of the vari-
ation produced by the branch-and-cut code at the ous heuristics.
Table 5 Performance of the Algorithms Based on the Three (yjt BPP2 xjt ) Formulations
Mean Std. dev. Max Mean Std. dev. Max Mean Std. dev. Max
Schedule-by-fixed-
(yjt ) 2.207 0.623 3.906 1.126 0.093 1.686 1.334 0.126 1.892
Schedule-by-Best-
(yjt ) 1.683 0.263 2.164 1.042 0.042 1.282 1.155 0.067 1.371
Schedule-by-5-Best-
(yjt ) 1.775 0.301 2.244 1.048 0.046 1.358 1.171 0.079 1.419
Schedule-by-fixed-
(BPP2) 1.985 0.507 3.902 1.116 0.084 1.643 1.292 0.109 1.615
Schedule-by-Best-
(BPP2) 1.628 0.245 2.122 1.040 0.039 1.233 1.144 0.059 1.311
Schedule-by-5-Best-
(BPP2) 1.710 0.274 2.296 1.046 0.043 1.233 1.157 0.067 1.356
Schedule-by-Best-
(xjt ) 1.854 0.320 2.428 1.048 0.045 1.303 — — —
SWPT 1.874 0.372 3.015 1.033 0.032 1.148 1.121 0.048 1.307
Note. We report the mean, standard deviation, and maximum of the ratio of the performance of the algorithm to the best lower bound.
Savelsbergh, Uma, and Wein: Experimental Study of LP-Based Approximation Algorithms for Scheduling Problems
INFORMS Journal on Computing 17(1), pp. 123–136, © 2005 INFORMS 133
Table 6 Performance of the Algorithms Based on the Three (yjt BPP2 xjt ) Formulations After Some Local Improvement
Mean Std. dev. Max Mean Std. dev. Max Mean Std. dev. Max
Schedule-by-fixed-
(yjt ) 1.594 0.264 2.137 1.088 0.061 1.330 1.200 0.073 1.438
Schedule-by-20-Best-
(yjt ) 1.402 0.148 1.679 1.022 0.022 1.138 1.096 0.035 1.196
Schedule-by-5-Best-
(yjt ) 1.420 0.164 1.770 1.024 0.023 1.140 1.100 0.037 1.213
Schedule-by-fixed-
(BPP2) 1.591 0.270 2.141 1.081 0.058 1.339 1.175 0.066 1.371
Schedule-by-20-Best-
(BPP2) 1.398 0.133 1.624 1.022 0.022 1.143 1.096 0.035 1.191
Schedule-by-5-Best-
(BPP2) 1.413 0.142 1.652 1.024 0.023 1.143 1.100 0.036 1.213
Schedule-by-20-Best-
(xjt ) 1.429 0.197 1.871 1.021 0.022 1.125 — — —
SWPT 1.746 0.284 2.321 1.031 0.030 1.147 1.110 0.042 1.241
Note. We report the mean, standard deviation, and maximum of the ratio of the performance of the algorithm to the best lower bound.
Savelsbergh, Uma, and Wein: Experimental Study of LP-Based Approximation Algorithms for Scheduling Problems
134 INFORMS Journal on Computing 17(1), pp. 123–136, © 2005 INFORMS
local search is a better approach than trying to get a (2001). The algorithm employs the Schedule-by-C
j
better lower bound by solving the xjt -relaxation. and the Schedule-by-Fixed- (with = 05) heuris-
tics as well as some local improvement techniques to
obtain feasible schedules at each node of the search
4. Extending the Techniques: A tree. The solutions obtained for the four available
Real-Life Scheduling Problem data sets, each corresponding to a different number of
In this section, we demonstrate that the approxima- available workers (18, 20, 24, 27), were 469, 418, 366,
tion algorithms evaluated in this paper also have and 363, respectively.
practical value by illustrating their effectiveness on a We investigated whether the use of improved
real-life complex scheduling problem arising at BASF heuristics would result in improved overall per-
AG in Ludwigshafen, Germany. The problem is a formance. Therefore, we replaced the heuristics
resource-constrained project-scheduling problem and Schedule-by-C
j and Schedule-by-Fixed- by Best-
is described, together with methods that have been and Schedule-by-Random- j . The new solutions
proposed for its solution, in Kallrath and Wilson obtained for the four available data sets (with an
(1997). imposed time limit on cpu time of 4 hours), were
A number of orders have to be scheduled. Each 457, 401, 363, and 363, respectively, which are the best
order is broken down and produced in several identi- known solutions for all these instances. We note that
cal pieces. Each piece in turn consists of several tasks, the biggest advantage of these multiple- heuristics is
representing processing steps. Each task is character- that they supply a number of high-quality schedules
ized by a specific personnel requirement and a spe- for which to apply local improvement. We form all the
cific duration. The production of some of the orders schedules, one for each , apply the local improve-
are linked, i.e., the output of one order serves as input ment to each schedule, and then choose the best one.
for another. Labor is the limiting resource. A fixed Table 7 provides a little more insight into the value
number of workers are available. The objective is min- of the various heuristics. For each of the four data
imizing the schedule length. sets, we present the value of the makespan obtained
The problem is modeled by introducing a job for by the heuristics, as well as the value of the makespan
each piece of an order. As all pieces of an order after local search techniques have been applied to
are identical, the ordering of identical pieces is fixed improve the initial schedule, at the root node of the
using precedence constraints to avoid symmetries. search tree.
Due to the fact that some orders are connected, addi- We note that the quality of the initial schedule is
tional precedence relations between jobs exist. Each important. We have conducted a small experiment in
job has, in addition to a processing time, a labor pro- which we applied the local improvement techniques
file (duration and personnel requirement of the tasks). to 1,000 randomly generated feasible schedules, but
The jobs have to be scheduled on a single machine were unable to produce schedules of comparable
with limited capacity (number of workers). The objec- quality.
tive is to minimize the makespan.
Various time-indexed formulations for the above
problem have been proposed and investigated. We 5. Discussion and Future Directions
have used the so-called block formulation, which Our main goal in this paper is simply to understand
aggregates certain jobs into composite jobs, for all the empirical impact of ideas and techniques that
our computational experiments. The block formula- arose in the quest for improved worst-case perfor-
tion also formed the basis for an LP based branch- mance guarantees for algorithms for 1rj wj Cj . Our
and-bound algorithm developed by Cavalcante et al. results demonstrate that theoretical progress can lead
18 20 24 27
Note. The column headings indicate the number of available workers (18, 20, 24, 27). The first row gives the best known makespan
of the schedule prior to our work and the second row gives the best known makespan of the schedule through our work. The last four
rows give the makespan before local improvement and after doing some local improvement at the root node.
Savelsbergh, Uma, and Wein: Experimental Study of LP-Based Approximation Algorithms for Scheduling Problems
INFORMS Journal on Computing 17(1), pp. 123–136, © 2005 INFORMS 135
to progress in practice. Our goal is not to attempt to Dessouky, M. I., J. S. Deogun. 1981. Sequencing jobs with unequal
develop the best possible heuristics or to compare our ready times to minimize mean flow-time. SIAM J. Comput. 10
192–202.
work with all previous computational work on the
Dyer, M. E., L. A. Wolsey. 1990. Formulating the single machine
problem. Therefore, the experimental results in this
sequencing problem with release dates as a mixed integer pro-
paper, taken in conjunction with a number of previ- gram. Discrete Appl. Math. 26 255–270.
ous papers on both enumerative and approximation Goemans, M. X. 1996. A supermodular relaxation for scheduling
approaches, represent an example of good synergy with release dates. Proc. 5th MPS Conf. on Integer Programming
between theory, experimentation, and practice. and Combin. Optim., LNCS Vol. 1084, Springer-Verlag, Berlin,
In this study we touch on several approaches to Germany, 288–300.
scheduling problems: Linear-programming, approx- Goemans, M. X. 1997. Improved approximation algorithms for
scheduling with release dates. Proc. 8th ACM-SIAM Sympos.
imation, combinatorial relaxations and branch-and- Discrete Algorithms, LNCS Vol. 1099, Springer-Verlag, Berlin,
bound, and local improvement, and made a modest Germany, 591–598.
contribution to further our understanding of their Goemans, M. X., M. Queyranne, A. Schulz, M. Skutella, Y. Wang.
relationship. We feel that it is an important direction 2002. Single machine scheduling with release dates. SIAM J.
to continue to try to understand, in a unified fashion, Discrete Math. 15 165–192.
the different roles that these elements can play in the Hall, L. A., D. B. Shmoys, J. Wein. 1996. Scheduling to minimize
average completion time: Off-line and on-line algorithms. Proc.
theory and practice of scheduling.
7th ACM-SIAM Sympos. Discrete Algorithms, ACM, New York,
142–151.
Acknowledgments Hall. L. A., A. S. Schulz, D. B. Shmoys, J. Wein. 1997. Schedul-
The authors are grateful to Jan Karel Lenstra, Chris Potts, ing to minimize average completion time: Off-line and on-line
Maurice Queyranne, David Shmoys, and Cliff Stein for approximation algorithms. Math. Oper. Res. 3 513–544.
helpful discussions. Martin W. P. Savelsbergh’s research Hariri, A. M. A., C. N. Potts. 1983. An algorithm for single machine
was partially supported by NSF Grant DMI-9410102. R. N. sequencing with release dates to minimize total weighted com-
Uma’s research was partially supported by NSF Grant CCR- pletion time. Discrete Appl. Math. 5 99–109.
9626831 and NSF Grant DMI-9970063. Joel Wein’s research Kallrath, J., J. M. Wilson. 1997. Business Optimisation Using Mathe-
was partially supported by NSF Grant CCR-9626831, NSF matical Programming. Macmillan Publishers, Hampshire, UK.
Grant DMI-9970063 and a grant from the New York State Kellerer, H., T. Tautenhahn, G. J. Woeginger. 1999. Approximability
Science and Technology Foundation, through its Center for and nonapproximability results for minimizing total flow-time
Advanced Technology in Telecommunications. This work on a single machine. SIAM J. Comput. 28 1155–1166.
was done while the second author was a graduate stu- Phillips, C., C. Stein, J. Wein. 1998. Minimizing average completion
dent at Polytechnic University. Some of this work was done time in the presence of release dates. Math. Programming B 82
while the third author was visiting the IBM T. J. Watson 199–224.
Research Center. Posner, M. E. 1985. Minimizing weighted completion times with
deadlines. Oper. Res. 33 562–574.
Potts, C. N., L. N. van Wassenhove. 1983. An algorithm for single
References machine sequencing with deadlines to minimize total weighted
Anderson, E. J., C. A. Glass, C. N. Potts. 1997. Machine schedul- completion time. Eur. J. Oper. Res. 12 379–387.
ing. E. Aarts, J. K. Lenstra, eds. Local Search in Combinatorial Queyranne, M. 1993. Structure of a simple scheduling polyhedron.
Optimization. John Wiley and Sons, Chichester, UK, 361–414. Math. Programming 58 263–285.
Belouadah, H. 1985. Scheduling to minimize total cost. M.Sc. thesis, Queyranne, M., A. S. Schulz. 1994. Polyhedral approaches to
University of Keele, Staffordshire, UK. machine scheduling. Technical report 408/1994, Institut für
Belouadah, H., M. E. Posner, C. N. Potts. 1992. Scheduling with Mathematik, Technische Universität Berlin, Berlin, Germany.
release dates on a single machine to minimize total weighted Rinaldi, G., A. Sassano. 1977. On a job scheduling problem with
completion time. Discrete Appl. Math. 36 213–231. different ready times: Some properties and a new algorithm to
Bianco, L., S. Ricciardelli. 1982. Scheduling of a single machine determine the optimal solution. Tech. report R.77–24, Istituto
to minimize total weighted completion time subject to release di Automatica, Università di Roma, Rome, Italy.
dates. Naval Res. Logist. Quart. 29 151–167. Schulz, A. S. 1996. Scheduling to minimize total weighted com-
Cavalcante, C. C. B., C. Carvalho de Sousa, M. W. P. Savelsbergh, pletion time: Performance guarantees of LP based heuristics
Y. Wang, L. A. Wolsey. 2001. Scheduling projects with labour and lower bounds. Proc. 5th MPS Conf. Integer Programming
constraints. Discrete Appl. Math. 112 27–52. Combin. Optim., LNCS Vol. 1084. Springer-Verlag, Berlin, Ger-
Chakrabarti, S., C. Phillips, A. S. Schulz, D. B. Shmoys, C. Stein, many, 301–315.
J. Wein. 1998. Improved bounds on relaxations of a parallel Schulz, A. S., M. Skutella. 1997a. Scheduling LPs bear probabilities:
machine scheduling problem. J. Combin. Optim. 1 413–426. Randomized approximations for min-sum criteria. R. Burkard,
Chandra, R. 1979. On n/1/F
dynamic deterministic systems. Naval G. Woeginger, eds. Algorithms–ESA’97, LNCS Vol. 1284, Proc.
Res. Logist. Quart. 26 537–544. 5th Annual Eur. Sympos. Algorithms. Springer, Berlin, Germany,
Chekuri, C., R. Motwani, B. Natarajan, C. Stein. 2000. Approxima- 416–429.
tion techniques for average completion time scheduling. SIAM Schulz, A. S., M. Skutella. 1997b. Random-based scheduling: New
J. Comput. 31. approximations and LP lower bounds. J. Rolim, ed. Random-
De Sousa, J. P., L. A. Wolsey. 1992. A time-indexed formulation ization and Approximation Techniques in Computer Science, LNCS
of non-preemptive single-machine scheduling problems. Math. Vol. 1269, Proc. Internat. Workshop RANDOM’97. Springer,
Programming 54 353–367. Berlin, Germany, 119–133.
Savelsbergh, Uma, and Wein: Experimental Study of LP-Based Approximation Algorithms for Scheduling Problems
136 INFORMS Journal on Computing 17(1), pp. 123–136, © 2005 INFORMS
Uma, R. N., J. Wein, D. P. Williamson. 2003. On the relationship ing problems: Column generation. INFORMS J. Comput. 12
between combinatorial and LP-based approaches to NP-hard 111–124.
scheduling problems. Technical report, Department of Com- van den Akker, M., C. P. M. Van Hoesel, M. W. P. Savelsbergh. 1999.
puter Science, University of Texas at Dallas, Richardson, TX.
A polyhedral approach to single-machine scheduling prob-
(Preliminary version appeared in IPCO: 6th Integer Program-
lems. Math. Programming 85 541–572.
ming Combin. Optim. Conf. 1998.)
van den Akker, M. 1994. LP-based solution methods for single- Wang, Y. 1996. Bicriteria job scheduling with release dates.
machine scheduling problems. Ph.D. thesis, Department of Tech. Rep., Max-Planck-Institut für Informatik, Saarbrücken,
Mathematics and Computing Science, Eindhoven Univ. of Germany.
Technology, Eindhoven, The Netherlands. Wolsey, L. A. 1985. Mixed integer programming formulations for
van den Akker, M., C. A. J. Hurkens, M. W. P. Savelsbergh. production planning and scheduling problems. 12th Internat.
2000. A time-indexed formulation for single-machine schedul- Sympos. Math. Programming, MIT, Cambridge, MA.