An Owner-Centric Metric For The Evaluation of Online Job Schedules
An Owner-Centric Metric For The Evaluation of Online Job Schedules
An Owner-Centric Metric For The Evaluation of Online Job Schedules
p
j
size
j
C
j
. As reference problems we use corresponding problems with the
makespan objective C
max
= max
j
{C
j
} that have extensively been discussed in earlier
publications.
For a given scheduling algorithm, we determine upper bounds for the ratio
j
p
j
size
j
C
j
(S)
j
p
j
size
j
C
j
(OPT)
with S and OPT being the schedules produced by the algorithm
and the optimal schedule, respectively. This bound is called an approximation factor
for deterministic scheduling problems and a competitive factor for online problems.
Note that this is a worst case evaluation. An algorithm may perform much better in
case of a real workload. Nevertheless, a worst case analysis may provide interesting
information about possible bad situations.
For the total weighted completion time problem on a single processor 1||
w
j
C
j
without release dates, a nondelay scheduling in descending order of the Smith ratio
w
j
p
j
generates an optimal schedule. We say that a schedule is nondelay, if no processor is
kept idle while a job is waiting, see Pinedo [12]. Therefore, any nondelay schedule is
optimal for the 1|ncv|
p
j
C
j
problem. This also holds for the corresponding online
problem 1|r
j
, ncv|
p
j
C
j
while the problem with arbitrary weights (1|r
j
|
w
j
C
j
)
is NP-hard in the strong sense. Concluding, we can state that the objective
p
j
C
j
produces the same results as the makespan objective for single processors. Remember
that although the processing time is part of the objective, its knowledge is not required
by the algorithm. In the next section, we address machines with parallel identical
processors.
3 Parallel Machines and Sequential Jobs
The problem P
m
||C
max
is strongly NP-hard for m 3. List scheduling with an ar-
bitrary list order guarantees the approximation factor 2
1
m
, see Graham [8]. This
approximation factor is tight for nonclairvoyant scheduling. This result also holds for
oine and online problems with rigid parallel jobs, see Graham [13] and Naroska
Multidisciplinary International Conference on Scheduling : Theory and Applications (MISTA 2009)
10-12 August 2009, Dublin, Ireland
561
and Schwiegelshohn [14], respectively. The problem P
m
||
p
j
C
j
is NP-hard as parti-
tion reduces to P2||
p
j
C
j
. List scheduling with an arbitrary list order guarantees a
tight approximation factor of
1+
2
2
for this problem, see Kawaguchi and Kyan [15].
Schwiegelshohn and Yahyapour [11] showed a competitive factor of 2 for the problem
P
m
|r
j
, size
j
, ncv|
p
j
C
j
if size
j
m
2
holds for all jobs. While this demonstrates the
existence of a constant competitive factor for the problem P
m
|r
j
, ncv|
p
j
C
j
, the gap
between the upper bound 2 and the lower bound
1+
2
2
is still rather large. In this
section, we close this gap signicantly.
Theorem 1 List scheduling has a competitive factor of at most 1.25 for the problem
P
m
|r
j
, ncv|
p
j
C
j
.
Before presenting the proof for this problem, we use a lemma that has already been
introduced as Corollary 6 in a publication by Schwiegelshohn and Yahyapour [11]:
Lemma 1 Let S and OPT be a legal schedule and the optimal schedule of an arbitrary
problem instance P
m
|r
j
|
p
j
C
j
with n jobs, respectively. We split an arbitrary job J
k
into two jobs J
n+1
and J
n+2
such that 0 < p
n+1
< p
k
, p
n+2
= p
k
p
n+1
, r
n+1
= r
k
,
and r
n+2
= r
k
+ p
n+1
hold. Schedule S
j
p
j
C
j
(S
j
p
j
C
j
(OPT
j
p
j
C
j
(S)
j
p
j
C
j
(OPT)
.
Although Lemma 1 always produces legal schedules the resulting schedule need not
be nondelay, see the example of Fig. 3 that describes a system with all original jobs
being released at time 0. There, splitting of the long job does not yield a nondelay
schedule.
(a) (b)
Fig. 3 Nondelay schedule (a) and schedule after job splitting (b)
We can use Lemma 1 to transform an arbitrary schedule into a target schedule: We
split every job J
j
if p
j
> and the splitting does not lead to a nondelay schedule. Due to
Lemma 1, it is sucient to consider this type of schedule to determine the competitive
factor of nondelay schedules. Finally, we reduce the release dates of each job without
changing the schedule until any further reduction would violate the nondelay property
of this schedule. As this reduction cannot increase the total completion time of the
optimal schedule the ratio
j
p
j
C
j
(S)
j
p
j
C
j
(OPT)
cannot decrease. Now, a target schedule
contains short jobs (p
j
) and long jobs (p
j
> ), and has four main properties:
1. The schedule is nondelay.
Multidisciplinary International Conference on Scheduling : Theory and Applications (MISTA 2009)
10-12 August 2009, Dublin, Ireland
562
2. At least one machine is idle at the starting time of every long job.
3. For every long job J
j
and every positive value r
j
< t < r
j
+ p
j
, there is at least
one processor idle at some time instance of the interval [t, t + C
j
(S) r
j
p
j
).
4. All jobs that start in an interval (including the boundaries) without any idle pro-
cessor have the same release date.
If the second property does not hold then we can split a long job at time C
j
(S)p
j
+.
If the third property does not hold then we can split a long job J
j
at time t +C
j
(S)
r
j
p
j
, see Fig. 4. If the fourth property does not hold then we can further reduce the
release date of a job.
For any target schedule S, we consider a nondelay schedule OPT that starts all long
jobs at their release dates. Due to the properties of target schedules, such a schedule
always exists: The rst property of schedule S guarantees that for a long job J
i
, all
processors are busy in the interval [r
i
, C
i
(S) p
i
). Assume that job J
i
and a long job
J
j
with r
j
< r
i
are executed on the same processor in schedule S. Then the inequality
r
j
+ p
j
> r
i
violates the third property of target schedules, see Fig. 5. Therefore, no
long job J
j
that starts at r
j
can prevent the next long job on this processor to start at
its release date as well. For reasons of convexity, schedule OPT has the minimal total
completion time if all short jobs have the same processing time. This can be achieved
with an arbitrarily small deviation if is small enough.
J
j
r
j
t
Fig. 4 Splitting of a long job in a busy interval
J
i
J
j
r
i
r
j
Fig. 5 Rearranging long jobs in a schedule
Now we describe the proof of Theorem 1:
Multidisciplinary International Conference on Scheduling : Theory and Applications (MISTA 2009)
10-12 August 2009, Dublin, Ireland
563
Proof We use induction on the number of dierent release dates. Problem P
m
||
p
j
C
j
is the basis of the proof as
1+
2
2
< 1.25 holds. As hypothesis we assume that Theorem 1
is correct for all problem instances with at most i dierent release dates. Next we
consider an arbitrary problem instance with i +1 dierent release dates and a nondelay
target schedule S. Let r be the maximum release date and = {J
j
|r
j
= r} be the set
of jobs that are released at time r.
Due to the fourth property of target schedules, no job J
j
with r
j
< r starts at time
r or later. Let t
s
(S) be the last start time of any job in schedule S.
The processing time of any job J
j
with r
j
< r, p
j
> r r
j
, and is reduced to r r
j
.
This reduction does not decrease the completion time of any job J
i
in schedule S
while it cannot increase the contribution of jobs in to the total completion time of
schedule OPT. Although the ratio
j
p
j
C
j
(S)
j
p
j
C
j
(OPT)
may increase or decrease due to this
reduction, the total contribution of all jobs with a smaller release date than r remains
bounded by our hypothesis.
Now assume a job J
j
with r
j
< r, p
j
> r r
j
, and C
j
(S) t
s
(S) < p
j
+r
j
r. We
split this job into one long job with release date r
j
and processing time r r
j
and one
(long) job with release date r and processing time p
j
r + r
j
. Note that in schedule
S, this splitting occurs in the interval [r, t
s
(S)) and does not violate the nondelay
property. However, we may have to do some further splitting in order to preserve the
third property of target schedules among the jobs of . Therefore, no long job J
j
with
r
j
< r can complete after time r in schedule OPT.
If some short jobs complete after time r in schedule OPT although they have been
submitted before r then we simply split a long job J
i
with r
i
< r and r < C
i
(S)
at some time in the interval [r, t
s
(S)] such that no job J
j
completes after time r in
schedule OPT. Observe that this splitting reduces the total processor-time product
of all jobs submitted before r. To exactly achieve the desired result, we may have to
repeat this splitting or to (repeatedly) duplicate the number of machines and all jobs.
The latter procedure does not change the ratio
j
p
j
C
j
(S)
j
p
j
C
j
(OPT)
.
With W
r
being the total processor-time product of all jobs with a release date
smaller than r in the interval [r, ) of schedule S we assume that these jobs completely
occupy the interval [r, r +
W
r
m
), see Fig. 6. This can only increase the total completion
time contribution of jobs J
i
in schedule S. Now, we use the utilization bound
4
3
of Hussein and Schwiegelshohn [16]. This leads to W
r
1
4
m r.
r
(a)
r
(b)
Fig. 6 Schedule S (a) and assumed processor occupation (b)
Multidisciplinary International Conference on Scheduling : Theory and Applications (MISTA 2009)
10-12 August 2009, Dublin, Ireland
564
Finally, we can combine the total completion times to obtain the desired result:
j
p
j
C
j
(S)
j
p
j
C
j
(OPT)
=
J
j
p
j
C
j
(S) +
J
j
p
j
C
j
(S)
J
j
p
j
C
j
(OPT) +
J
j
p
j
C
j
(OPT)
<
J
j
p
j
(C
j
(S) 1.25r) + 1.25r
J
j
p
j
+ 1.25
J
j
p
j
C
j
(OPT)
J
j
p
j
(C
j
(OPT) r) + r
J
j
p
j
+
J
j
p
j
C
j
(OPT)
1+
2
2
J
j
p
j
(C
j
(OPT) r) + 1.25r
J
j
p
j
+ 1.25
J
j
p
j
C
j
(OPT)
J
j
p
j
(C
j
(OPT) r) + r
J
j
p
j
+
J
j
p
j
C
j
(OPT)
< 1.25
The determined competitive factor for the problem P
m
|r
j
, ncv|
p
j
C
j
is 1.25 while
1+
2
2
1.207 is a lower bound. Therefore, the gap is small (< 0.05).
4 Parallel Machines and Parallel Jobs
In this section, we address rigid parallel jobs that require a xed number of processors
during all of their processing time. We use the notation P
m
|r
j
, size
j
, ncv|
p
j
size
j
C
j
to describe the corresponding scheduling problem. Unfortunately, List scheduling with
an arbitrary order does not perform very well for this problem contrary to the problem
P
m
|size
j
, ncv|C
max
that guarantees the tight competitive factor 2
1
m
, see Garey [13].
A simple example using only two jobs J
1
and J
2
with p
1
=
m, p
2
= 1, size
1
= 1,
and size
2
= m shows that the competitive factor of List scheduling may be as bad as
m
m+2m
2m+
m
, see Fig. 7. As both jobs cannot be executed concurrently, both schedules
have the same optimal makespan. But in an online scenario it is better to schedule the
parallel job rst as this order oers options for the scheduling of additional jobs (left
schedule). In the right schedule, any idle processor in the interval [0,
m) can only be
utilized by the backlling method [4]. But backlling requires at least an estimate of
the processing times. Therefore, the new objective allows to distinguish between these
two schedules in an intuitive fashion.
1
(a) (b)
1 m + 1 m +
m
Fig. 7 Optimal schedule (a) and bad List schedule (b)
However, we can guarantee acceptable solutions for the problem
P
m
|size
j
, ncv|
p
j
size
j
C
j
by sorting the jobs in descending order of their degree of
parallelism (size
j
) before applying List scheduling.
Multidisciplinary International Conference on Scheduling : Theory and Applications (MISTA 2009)
10-12 August 2009, Dublin, Ireland
565
Theorem 2 List scheduling in descending order of the size
j
value has an approxima-
tion factor of 2 for the problem P
m
|size
j
, ncv|
p
j
size
j
C
j
.
Again we use a well known lemma, see Queyranne [17], before presenting the proof
for Theorem 2.
Lemma 2 Let S be a nondelay schedule for the problem 1|ncv|
p
j
C
j
. Then we have
j
p
j
C
j
(S) =
1
2
((
j
p
j
)
2
+
j
p
2
j
).
Lemma 2 conrms our previous statement that the objective value of nondelay
schedules for the problem 1|ncv|
p
j
C
j
is independent of the scheduling order. Now,
we return to the proof of Theorem 2.
Proof Let t
s
(S) be the rst time instance at which at most
m
2
processors are used.
Then no job can start later than t
s
(S). Let be the set of all jobs that start at time
t
s
(S), and let job J
i
be the job with the smallest size
j
value among all jobs in . Due
to List scheduling, there are less than size
i
processors idle at any time instance in the
interval [0, t
s
(S)). Next we use several steps to transform schedule S ((a) in Fig.8).
1. We split every job J
j
into size
j
sequential jobs with the same processing time
((b) in Fig.8). Note that the resulting schedule is not a nondelay schedule anymore.
This transformation does not change the objective value of schedule S while the
corresponding value of the optimal schedule cannot increase.
2. We apply Lemma 1 until no job that completes before t
s
(S) has a larger processing
time than ((c) in Fig.8). This may add further jobs to and transforms schedule
S into schedule S
.
3. We combine all jobs in into a single job J
k
with p
k
=
J
j
p
2
j
J
j
p
j
and size
k
=
(
J
j
p
j
)
2
J
j
p
2
j
((d) in Fig.8). Note that size
k
size
i
holds.
Further, we rearrange schedule S
and results in t
s
(S
) t
s
(S) ((e) in Fig.8). Therefore, the objective value of
schedule S
cannot decrease while the objective value of the optimal schedule after
the transformations ((f) in Fig.8) cannot increase due to reasons of convexity, see
Lemma 2.
Finally, we must determine the worst case values for p
k
and size
k
as functions of
m and t
s
(S
j
p
j
C
j
(S)
j
p
j
C
j
(OPT)
= 2
1
(a)
(b)
2k
1
2k
Fig. 9 Worst case example for List scheduling in descending order of the size
j
parameter
Multidisciplinary International Conference on Scheduling : Theory and Applications (MISTA 2009)
10-12 August 2009, Dublin, Ireland
567
Unfortunately, List scheduling in descending order of the size
j
value is not applica-
ble for the problem P
m
|r
j
, size
j
, ncv|
p
j
size
j
C
j
as the nondelay property may lead
to a preferred scheduling of sequential jobs and therefore highly parallel jobs may be
subject to starvation despite the order of the list. Moreover, there is no algorithm with
a constant competitive factor as demonstrated by Theorem 3.
Theorem 3 The lower bound of the competitive factor for the problem
P
m
|r
j
, size
j
, ncv|
p
j
size
j
C
j
approaches
m
m+2m
2m+
m
.
Proof First, the adversary submits a single job J
1
with size
1
= 1 at time 0. Eventually,
the scheduler must schedule this job to prevent an arbitrarily bad competitive factor.
Then the adversary immediately submits a parallel job J
2
with size
2
= m. Finally,
he selects the processing times of both jobs to match the example of Fig. 7 while
considering the submission time r
2
.
In practice however, this result is not really disturbing as there are only few highly
parallel jobs in real workloads. If each parallel job uses at most 50% of the processors
of a machine then simple List schedule guarantees already a competitive factor of 2,
see Schwiegelshohn and Yahyapour [11]. This competitive factor decreases further if
the bound on the parallelism of the individual jobs becomes tighter. The new metric
has the advantage that we can incorporate preferences of user groups or special jobs by
simply multiplying the weight of these jobs with an owner dened constant. Further, it
is easy to consider only recently submitted jobs in an online schedule. This enables an
online evaluation of the schedule and therefore adaptation of the scheduling strategy.
References
1. Graham, R., Lawler, E., Lenstra, J., Kan, A.R.: Optimization and approximation in de-
terministic sequencing and scheduling: A survey. Annals of Discrete Mathematics 15,
287326 (1979)
2. Feitelson, D., Rudolph, L., Schwiegelshohn, U., Sevcik, K., Wong, P.: Theory and practice
in parallel job scheduling. In: D. Feitelson, L. Rudolph (eds.) IPPS97 Workshop: Job
Scheduling Strategies for Parallel Processing, pp. 134. SpringerVerlag, Lecture Notes in
Computer Science LNCS 1291 (1997)
3. Bharadwaj, V., Robertazzi, T.G., Ghose, D.: Scheduling Divisible Loads in Parallel and
Distributed Systems. IEEE Computer Society Press, Los Alamitos, CA, USA (1996)
4. Lifka, D.: The ANL/IBM SP Scheduling System. In: D. Feitelson, L. Rudolph (eds.)
IPPS95 Workshop: Job Scheduling Strategies for Parallel Processing, pp. 295303.
SpringerVerlag, Lecture Notes in Computer Science LNCS 949 (1995)
5. Lee, C.B., Schwartzman, Y., Hardy, J., Snavely, A.: Are user runtime estimates inherently
inaccurate? In: Proceedings of the 10th Workshop on Job Scheduling Strategies for Parallel
Processing, pp. 153161 (2004)
6. Lee, C.B., Snavely, A.E.: Precise and realistic utility functions for user-centric performance
analysis of schedulers. In: HPDC 07: Proceedings of the 16th international symposium
on High performance distributed computing, pp. 107116 (2007)
7. Franke, C., Homann, F., Lepping, J., Schwiegelshohn, U.: Development of scheduling
strategies with genetic fuzzy systems. Applied Soft Computing Journal 8(1), 706721
(2008)
8. Graham, R.L.: Bounds for certain multiprocessor anomalies. Bell System Technical Journal
45, 15631581 (1966)
9. Bansal, N., Pruhs, K.: Server scheduling in the l
p
norm: a rising tide lifts all boats. In:
ACM Symposium on Theory of Computing (STOC), pp. 242250 (2003)
10. Avidor, A., Azar, Y., Sgall, J.: Ancient and new algorithms for load balancing in the
p
norm. Algorithmica 29(3), 422441 (2001)
Multidisciplinary International Conference on Scheduling : Theory and Applications (MISTA 2009)
10-12 August 2009, Dublin, Ireland
568
11. Schwiegelshohn, U., Yahyapour, R.: Fairness in parallel job scheduling. Journal of Schedul-
ing 3(5), 297320 (2000)
12. Pinedo, M.: Scheduling: Theory, Algorithms, and Systems, second edn. Prentice-Hall, New
Jersey (2002)
13. Graham, R.: Bounds on multiprocessor timing anomalies. SIAM Journal of Applied Math-
ematics 17, 416429 (1969)
14. Naroska, E., Schwiegelshohn, U.: On an online scheduling problem for parallel jobs. Infor-
mation Processing Letters 81(6), 297304 (2002)
15. Kawaguchi, T., Kyan, S.: Worst case bound of an LRF schedule for the mean weighted
ow-time problem. SIAM Journal on Computing 15(4), 11191129 (1986)
16. Hussein, M., Schwiegelshohn, U.: Utilization of nonclairvoyant online schedules. Theoret-
ical Computer Science 362, 238247 (2006)
17. Queyranne, M.: Structure of a simple scheduling polyhedron. Mathematical Programming
58(2), 263285 (1993)
Multidisciplinary International Conference on Scheduling : Theory and Applications (MISTA 2009)
10-12 August 2009, Dublin, Ireland
569