Article info
Article history: Received 4 August 2015; Received in revised form 18 March 2016; Accepted 10 June 2016; Available online 16 June 2016.
Keywords: Online scheduling; Anytime algorithms; Machine learning; Adaptive systems

Abstract
An acceptable response time of a server is an important aspect in many client–server applications; this is evident in situations in which the server is overloaded by many computationally intensive requests. In this work, we consider that the requests, or in this case tasks, generated by the clients are instances of optimization problems solved by anytime algorithms, i.e. the quality of the solution increases with the processing time of a task. These tasks are submitted to the server which schedules them to the available computational resources where the tasks are processed. To tackle the overload problem, we propose a scheduling algorithm which combines traditional scheduling approaches with a quality control heuristic which adjusts the requested quality of the solutions and thus changes the processing time of the tasks. Two efficient quality control heuristics are introduced: the first heuristic sets a global quality for all tasks, whereas the second heuristic sets the quality for each task independently. Moreover, in practice, the relationship between the processing time and the quality is not known a priori. Because it is crucial for scheduling algorithms to know at least an estimate of these relationships, we propose a general procedure for estimating these relationships using information obtained from already executed tasks. Finally, the performance of the proposed scheduling algorithm is demonstrated on a real-world problem from the domain of personnel rostering with very good results.
© 2016 Elsevier Ltd. All rights reserved.
1. Introduction
An important aspect of client–server applications is the response time of the server. In the case of computationally intensive requests, e.g. instances of optimization problems, the issue of the response time is even more pressing because the server can be overwhelmed even by a small number of requests.
Due to financial reasons, the computational capacity of a server is commonly scaled to handle a typical workload, i.e. the arrival rate and the computational complexity of the requests, so that the response time under this typical workload is kept at an acceptable level. In the case of a sudden increase in requests, the server may easily become overloaded and the response time increases significantly, resulting in user dissatisfaction. One way to mitigate the increased response time during an overload is to buy more computational resources, but such a solution is not financially suitable if the overload occurs only a few times a day. However, if the requests, or some of them, are instances of optimization problems, it is possible to maintain an acceptable response time by a moderate degradation of the solution quality, i.e. to trade off a small decrease in solution quality for a significantly shorter response time.
In this paper, we consider a scheduling problem illustrated in Fig. 1. Users work with client applications which generate tasks. The tasks
are sent to a scheduling system which schedules the received tasks to computational resources. The resources are heterogeneous, i.e. each
resource may have a different processing power and, therefore, the processing time of the tasks may vary on each resource. The task is
processed on the assigned resource, and once it is finished, its result is sent back to the scheduling system which distributes the result to
the respective client application. The scheduling system receives the tasks progressively through time, i.e. we are dealing with an online
scheduling problem [14,26].
⁎ Corresponding author.
E-mail addresses: [email protected] (I. Módos), [email protected] (P. Šůcha), [email protected] (R. Václavík), [email protected] (J. Smejkal), [email protected] (Z. Hanzálek).
http://dx.doi.org/10.1016/j.cor.2016.06.008
0305-0548/© 2016 Elsevier Ltd. All rights reserved.
96 I. Módos et al. / Computers & Operations Research 76 (2016) 95–117
The tasks are instances of some optimization problems and are solved by anytime algorithms. The defining property of an anytime algorithm is that the processing of a task can be interrupted at any time and the algorithm returns a feasible solution if such a solution exists. The quality of the solution depends on the processing time of a task, i.e. a longer processing time may result in a better solution (the quality of a solution is defined using the objective function of the tasks' optimization problem, i.e. how close the solution is to the optimal/near-optimal solution). This behavior is typical for the majority of metaheuristics and hyperheuristics solving optimization problems. The
relationships between the processing time and the solution quality are defined by processing time functions. A typical example of a
processing time function of one task is illustrated in Fig. 2. In general, these functions have an increasing character: to get a better
solution, an anytime algorithm must perform more operations or explore a larger part of the solution space. From Fig. 2, it can also be
seen that a slight deterioration of the solution quality can significantly shorten the processing time of the task and thus reduce the
response time of the system. From the user's point of view, a good solution is better than an excessive waiting time for the near-optimal/optimal solution.
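The anytime behavior described above can be illustrated by a minimal interruptible local search; the cost function, neighborhood, and time budget below are illustrative placeholders, not the algorithms actually used on the server side.

```python
import random
import time

def anytime_minimize(cost, neighbors, start, budget_s):
    """Sketch of an anytime local search: it can be stopped at any moment
    and always returns the best feasible solution found so far, so the
    solution quality is non-decreasing in the time budget."""
    best, best_cost = start, cost(start)
    deadline = time.monotonic() + budget_s
    while time.monotonic() < deadline:
        candidate = random.choice(neighbors(best))
        c = cost(candidate)
        if c < best_cost:          # keep only improving solutions
            best, best_cost = candidate, c
    return best, best_cost
```

Running the search longer can only improve (never worsen) the returned cost, which is exactly the monotone quality-versus-time relationship captured by the processing time functions.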
In reality, the processing time functions are not known a priori, as is usually assumed in the related literature. The reason for this is that anytime algorithms search the solution space and are generally not aware of where the good solutions are. Without
any knowledge of the processing time functions, the scheduling system cannot guarantee the response time because the scheduling
system does not know how long the processing of the tasks will take. However, using either statistical or machine learning methods, the
processing time for the given quality can be estimated from the previous executions of similar tasks.
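As a toy illustration of such an estimation (not the estimator proposed later in this paper), one could fit a simple parametric curve to (quality, processing time) pairs observed from finished tasks; the exponential shape assumed here is purely illustrative.

```python
import math

def fit_processing_time(observations):
    """Least-squares fit of log t = a + b*q to observed (quality, time)
    pairs from previously executed tasks. Returns a predictor
    t(q) = exp(a + b*q), which is increasing in q whenever b > 0."""
    n = len(observations)
    qs = [q for q, _ in observations]
    ys = [math.log(t) for _, t in observations]
    qbar = sum(qs) / n
    ybar = sum(ys) / n
    b = sum((q - qbar) * (y - ybar) for q, y in zip(qs, ys)) / \
        sum((q - qbar) ** 2 for q in qs)
    a = ybar - b * qbar
    return lambda q: math.exp(a + b * q)
```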
In this study, we focus on the situations in which the scheduling system is overloaded, i.e. the response time of the system
increases significantly due to increased workload. The idea of how to tackle the system overload is to control the requested quality of
solutions, i.e. when the overload of the system is detected, the system trades off the quality of solutions so that the response time is
kept close to the acceptable level. On the other hand, the system requests the highest quality of the solutions if the overload is not
detected.
We want to emphasize that our solution does not substitute clouds [9]. In fact, a cloud can be integrated into the scheduling system as a
cluster of computational resources. However, the proposed scheduling system replaces the default cloud load balancer because the scheduling policy (see Section 4.1) makes more informed decisions than the default cloud load balancer and thus increases the responsiveness more efficiently. For example, Amazon Elastic Load Balancer uses a simple round-robin algorithm for TCP listeners and least-number-of-requests for HTTP listeners [1].
The validity of the proposed scheduling system is evaluated on an existing web-based personnel rostering application called Roslab.1
Personnel rostering [8] is a combinatorial optimization problem where a set of shifts is assigned to employees so that hard constraints
(typically given by the labor code) are satisfied and the penalty accumulated due to soft constraints violations (representing the quality of a
solution) is minimized. An assignment of shifts to employees is called a roster.
The main functionality provided by the Roslab client application is: (i) rostering, i.e. an automatic design of a roster, (ii) rerostering, i.e. a partial roster correction due to external events such as the illness of an employee, and (iii) validation of the roster changes introduced by users. Different user interactions with the client application generate different computationally intensive tasks (e.g. the automatic roster design) and these tasks are solved by anytime heuristic algorithms [4] on the server side.
At present, the available computational capacity of Roslab is scaled to handle the average workload. However, the response time
increases considerably when many users interact with the system at the same time, which causes a higher risk of not prolonging the
service contract. Because these critical situations occur only a few times a day, it is not financially suitable to buy new computational
resources. To increase the responsiveness, we can exploit the fact that the algorithms implemented in Roslab for solving the tasks have
typically non-linear progress of the solution quality, i.e. the quality difference between the optimal and a high-quality solution is small
while the processing time difference is significant (see Fig. 2). Therefore, it is possible to apply the proposed algorithms to control the
trade-off between the solution quality and responsiveness.
1 http://www.merica.cz/products/roslab/benefits.
From the summarized related work it can be seen that no work fully addresses the problem of online scheduling of computationally
intensive tasks which are solved by anytime algorithms and for which the processing time functions are not known beforehand. The
existing literature either assumes exact knowledge or specific shape of the processing time functions which is not realistic in practice.
Moreover, some works allow preemption [20,6,5], optimize a different objective function of the scheduling problem [28] or the presented
algorithms have a high complexity [11]. Therefore, we introduce a new modular scheduling system which can guarantee an acceptable
response time even in the case of overload. Moreover, since the processing time functions of the tasks are not known a priori, we propose
an estimator which can provide the estimation of the whole processing time function using knowledge acquired from the execution of
similar tasks. The proposed scheduling system is evaluated on a real client–server application from the domain of personnel rostering.
Our approach to tackling the overload problem combines a traditional scheduling policy with a quality control algorithm. We propose
two novel and efficient quality control algorithms: (i) bisection control, which sets one global quality for all tasks and (ii) independent
control, which controls the quality of the solutions for each task independently. The experiments in Section 6 show that the proposed
scheduling policy and quality control algorithms: (i) can decrease the response time significantly in situations in which the system would
be overloaded if the quality control was disabled, (ii) outperform a simple control approach which always stops the task that was running
for the longest time, and (iii) are robust to small errors in the estimation of the processing time functions.
The rest of the paper is organized as follows: in Section 2 we formulate our scheduling problem formally. The overview of the scheduling system is presented in Section 3. Section 4 contains the description of the presented scheduling policy and quality control algorithms. Section 5 explains how the estimation of the processing time functions works. In Section 6, the proposed system is experimentally tested on real-world instances. Finally, the last section concludes the paper.
2. Problem statement
In the whole text, it is assumed that time is discrete and that one time unit equals one millisecond. T denotes the set of time instants t. The quality of solutions (or just quality) is denoted as φ ∈ Φ, where Φ = [0, 1]. Interval Φ can be understood as normalized values of the tasks' objective functions, i.e. 1 represents the best possible solution while 0 represents the worst possible solution.
The scheduling problem is defined by a tuple Π = (N, M, Ψ, f, p, WCT), where N = {1, …, n} is a set of tasks, M = {1, …, m} is a set of heterogeneous resources on which the tasks are processed, Ψ is a set of all possible instances, f : Ψ → ℝ^D represents a feature function, which maps each instance ψ ∈ Ψ to a feature vector with dimensionality D ∈ ℤ>0, p : Ψ × Φ → ℝ>0 represents a normalized processing time function, and WCT : Ψ → ℝ>0 is a normalized worst-case processing time function (the meaning of the normalization will be explained later in this section). The instances in this context are instances of the optimization problem solved by the computational resources, e.g. personnel rostering in the considered case study. For an arbitrary instance ψ ∈ Ψ, it is assumed that p(ψ, φ) is a continuous and increasing function, i.e. ∀φ₁, φ₂ ∈ Φ: φ₁ < φ₂ ⟹ p(ψ, φ₁) < p(ψ, φ₂). The definition allows the normalized processing time function to be non-linear in the general case. The maximum normalized processing time of instance ψ is the normalized processing time for quality 1. The normalized worst-case processing time WCT(ψ) denotes an upper bound on the processing time after which the anytime algorithm solving the given task is stopped.
In the whole text, notation i ∈ N and j ∈ M is used to denote the tasks and resources, respectively. To highlight that an object is indexed
by a task or a resource, we will use superscripts (i ) or (j ) , respectively.
Each task i ∈ N is represented by a tuple (a^(i), rrt^(i), ψ^(i)), where a^(i) ∈ T is the arrival time of the task, rrt^(i) ∈ ℝ>0 is the requested response time of the task, and ψ^(i) ∈ Ψ is the task instance. Roughly, a task is an instance which arrived in the scheduling system. It is assumed that the tasks cannot be preempted. Time rdd^(i) = a^(i) + rrt^(i) represents the requested due date of task i, i.e. a soft deadline. The response time is defined as the duration between the arrival of the task to the scheduling system and its completion. The requested response time represents the preferred maximum response time set by the client applications.
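The task tuple and its derived due date can be represented directly; the field names below are illustrative, not from the paper.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Task:
    a: int             # arrival time a^(i)
    rrt: int           # requested response time rrt^(i)
    instance: Any      # task instance psi^(i)

    @property
    def rdd(self) -> int:
        # requested due date rdd^(i) = a^(i) + rrt^(i), a soft deadline
        return self.a + self.rrt
```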
A speed parameter s^(j) ∈ ℝ>0 is defined for each resource j. The speed of resource j can be understood as a constant processing power,
i.e. it is the amount of work done by the algorithm running on resource j per time unit. For example, consider an algorithm consisting of
one loop which multiplies two numbers in each iteration. A resource with the speed of 1 can make only one iteration per time unit
whereas a resource with the speed of 10 can make ten iterations per time unit. By a normalized processing time we mean a processing time
abstracted from the heterogeneity of the resources, i.e. it represents the processing time on a resource with the speed of 1 and, therefore, it
also represents the amount of work to be done to get a solution of the given quality on such a resource.
A quality function q : T → Φ assigns a value of the requested quality of solutions to each time t ∈ T . The quality function is not defined by
the problem but is part of the decision made by the scheduling system.
If task i is started at time t on resource j for some quality function q, its completion time is defined as

ct(i, j, t, q) = min{ t_min : t_min ∈ T, t_min ≥ t, (t_min − t)·s^(j) ≥ p(ψ^(i), q(t_min)) }.   (1)

The definition can be understood as the shortest time at which the amount of work undertaken by an algorithm is greater than or equal to the currently requested amount of work (the requested amount of work equals the normalized processing time). Note that this definition allows the quality function q to vary over time. If this were not the case, the completion time could simply be defined as ⌈p(ψ^(i), φ)/s^(j)⌉ + t, where φ is a constant quality value.
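Eq. (1) can be evaluated directly by scanning the time instants; the names `p`, `q`, and `horizon` below are illustrative, and the linear scan is a naive sketch rather than anything the paper prescribes.

```python
def completion_time(t, speed, p, q, horizon):
    """Direct evaluation of Eq. (1): the first time instant t_min >= t at
    which the work done, (t_min - t) * speed, covers the normalized
    processing time p(q(t_min)) requested at that instant. `p` maps a
    quality to a normalized processing time and `q` maps a time instant
    to the requested quality; `horizon` bounds the scan."""
    for t_min in range(t, horizon + 1):
        if (t_min - t) * speed >= p(q(t_min)):
            return t_min
    return None  # no completion within the horizon
```

Note how a quality function that drops over time (e.g. when the system detects an overload) shortens the completion time, which is exactly the lever the quality control exploits.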
The lateness of task i if started at time t on resource j for some quality function q is defined as

L(i, j, t, q) = ct(i, j, t, q) − rdd^(i).   (2)

The solution quality of task i if started at time t on resource j for some quality function q is defined as

φ(i, j, t, q) = max{ φ : φ ∈ Φ, (ct(i, j, t, q) − t)·s^(j) ≥ p(ψ^(i), φ) }.   (3)

Again, this definition allows the quality to change over time. The reason for the inequality is to handle the case when the requested amount of work for the maximum quality is less than the actual amount of work done.
A solution to problem Π is a tuple S = (r_S, st_S, q_S), where r_S = (r_S^(1), r_S^(2), …, r_S^(n)) is a vector in which r_S^(i) ∈ M ∪ {∞} maps task i to some resource or to ∞, st_S = (st_S^(1), st_S^(2), …, st_S^(n)) is a vector in which st_S^(i) ∈ T ∪ {∞} maps task i to a start time or to ∞, and q_S = (q_S^(1), q_S^(2), …, q_S^(n)) is a vector of quality functions, one for each task. Value ∞ represents an uninitialized value, i.e. r_S^(i) = ∞ denotes that task i is not assigned to any resource and st_S^(i) = ∞ denotes that task i does not have a start time in solution S. Solution S = (r_S, st_S, q_S) is feasible if the following conditions are satisfied for each task i ∈ N:

r_S^(i) ≠ ∞,   (4)

st_S^(i) ≠ ∞,   (5)

st_S^(i) ≥ a^(i),   (6)

∀i, i′ ∈ N: i ≠ i′ ∧ r_S^(i) = r_S^(i′) ⟹ ct(i, r_S^(i), st_S^(i), q_S^(i)) ≤ st_S^(i′) ∨ ct(i′, r_S^(i′), st_S^(i′), q_S^(i′)) ≤ st_S^(i).   (7)

The constraints require that: a task is assigned to some resource and has a start time (4) and (5); a task cannot start before its arrival time (6); and the processing of two tasks on the same resource cannot overlap (7). The set of all feasible solutions is denoted as 𝒮.
For the feasible solutions, the following functions are defined: the average solution quality (8) and the average normalized lateness (9):

φ̄(S) = (1/n) ∑_{i∈N} φ(i, r_S^(i), st_S^(i), q_S^(i)),   (8)

L̄(S) = (1/n) ∑_{i∈N} (ct(i, r_S^(i), st_S^(i), q_S^(i)) − rdd^(i)) / rrt^(i).   (9)
The goal in the scheduling problem is to find a feasible solution which minimizes the average normalized lateness (9) while the average solution quality (8) is maximized, i.e.

min L̄(S), max φ̄(S), over all feasible solutions S.   (10)
Because the described problem is a multi-objective scheduling problem [13], the outcome is a set of solutions called the Pareto front. Each solution in the Pareto front is not dominated by any other solution in the Pareto front. Since the scheduling system has to react in an automatic manner, it needs to select some solution from the Pareto front so that the desired behavior of the system is achieved. Obviously, an increase in the average solution quality leads to an increase in the average normalized lateness and vice versa. Solutions which only optimize the average solution quality or the average normalized lateness are clearly unacceptable since the other objective is ignored. Therefore, solutions which balance the solution quality and the normalized lateness are sought.
One possible way of achieving the balance is to minimize a weighted sum of the objectives. However, due to the non-linearity of the processing time functions, such solutions may lead to a decreased average solution quality even if the system is not overloaded. Moreover, it is not obvious how to set the weights since the scales of the objectives differ.
We argue that if the tasks are computed within the requested response time, the client applications are sufficiently responsive from the
users' point of view. Therefore, we propose to maximize the average quality such that, on average, the tasks are processed very close to
their requested due date. This requirement can be expressed by pushing the average normalized lateness as close to 0 as possible. This
aggregation is correct even if the requested response time is significantly larger than the time needed to find the near optimal solutions
since we perform “tail-cutting” on the processing time functions (see Section 5 for more details).
The problem described above considers the complete information about tasks and resources. However, in our setting, some quantities
are unknown to the scheduler until time t or until some conditions are satisfied (this relates to the online scheduling paradigm where
tasks arrive over time [26]):
The values of function p for instance ψ (i) of task i can be observed only after task i has finished.
The values (a(i), rrt (i), ψ (i) ) of each task i are unknown until its arrival time a(i).
The completion time and the solution quality of task i are known only after task i has finished.
On the other hand, the scheduler has full knowledge of the following information: (i) the normalized worst-case processing time function WCT, (ii) the feature function f, (iii) the dimensionality D of the feature vectors, and (iv) the set of resources M and the speed of each resource (the speeds of the resources can be acquired through benchmarking [17]).
With respect to the problem statement described in Section 2, we propose a modular architecture of the scheduling system. Each module is
implemented as an independent thread. Therefore, the introduced architecture can better utilize the available computational capacity of the
server on which the scheduling system is deployed. The proposed architecture is depicted in Fig. 3. It consists of three blocks:
1. Client applications, which generate tasks and send them to the scheduling system.
2. Scheduling system, which is responsible for assigning the received tasks from the client applications to the available computational
resources. The scheduling system can be further divided into the following modules:
Service facade, which acts as a communication interface between the scheduling system and the client applications. The interface
allows the client applications to send tasks, abort tasks and to receive the solution of a previously sent task.
Central scheduler, which stores the received tasks and assigns these tasks to the available resources using a scheduling policy. The data
structure containing the information about the assignments of the tasks is called a schedule. The central scheduler also converts the
requested quality of the solutions (given by the quality control module) to the actual processing time of each task and sends them to
the computational resources through the resource management.
Resource management, which monitors the computational resources and abstracts the communication between the central scheduler
and the computational resources, e.g. communication over a socket with remote resources.
Estimator, which estimates the normalized processing time functions, i.e. it provides p̂ : Ψ × Φ → ℝ>0. These estimations are found by the estimation algorithm which uses the regression model.
Quality control, which analyzes the current schedule created by the central scheduler and sets the requested quality of the solutions,
i.e. it provides qS(i) (t ). The qualities are found by a control algorithm.
Fig. 3. (Color online.) The proposed architecture. Red circles and green squares represent the stages through which the tasks pass before and after they are processed,
respectively.
3. Computational resources, which process the assigned tasks and send their solutions back to the scheduling system. The computational
resources can be local threads of the server, remote high-performance resources, etc. On each computational resource, a supervisor is
executed which communicates with the scheduling system and is responsible for monitoring the computation of a task. We remind the
reader that each resource can process only one task at a time.
To explain how the tasks flow through the system, Fig. 3 is used. The letters in red circles represent the order of the modules for an
unprocessed task, whereas the letters in green rectangles represent the order of modules for a processed task.
When the client application (A) generates task i, the client application sends that task to the central scheduler through the service facade (B). The central scheduler (C) inserts that task into an array of tasks to be estimated, i.e. tasks for which the normalized processing time function needs to be estimated. Then, the central scheduler sends that task to the estimator (D). If a regression model is already trained,
then the regression model is used to estimate the normalized processing time function for task i . If the regression model is not trained, the
normalized processing time function is computed from the normalized worst case processing time. The estimated normalized processing
time function is provided to the central scheduler (E) which then moves task i from the array of tasks to be estimated to an array of pending
tasks, i.e. tasks which were received by the scheduling system, are not completed yet and for which the normalized processing time
function is already estimated. The central scheduler creates a schedule of the pending tasks on the available resources using the scheduling
policy. When some resource j becomes free, i.e. it is not processing any tasks, the scheduling system takes the first task from the schedule
of the respective resource j and assigns that task to that resource. The task assignment is performed through the resource management
(F) module which communicates with the computational resources. When the task is received by the supervisor (G) of the resource, the
supervisor spawns a new thread which executes the received task. The processing time of task i is computed from the requested quality
and the speed of resource j. Since the processing time is computed from the requested quality which may change over time, resource j is
constantly notified about these changes. The anytime algorithms on the computational resources run approximately for the duration of
these processing times.
When the task has finished, the supervisor (H) sends the solution back to the central scheduler through the resource management (I). The
central scheduler (J) deletes task i from the array of pending tasks and sends the solution to the respective client application (L) through the
service facade (K1). If the progress of the solution quality over time was measured by the task algorithm, it is sent to the estimator (K2)
which uses it to refine the regression model (see Section 5.1).
If we consider the offline version of our problem without the quality control, then the closest scheduling problem can be represented in Graham's notation [3] as Qm|r_i|∑w_iL_i. This problem considers heterogeneous resources with quantifiable speeds, and the tasks arrive in the system at release time r_i, which is equal to the arrival time a^(i) from our problem statement in Section 2. The aim is to find a solution which minimizes the weighted lateness where the weight is defined as w_i = 1/rrt^(i). The optimal solution to problem Qm|r_i|∑w_iL_i is equal to the optimal solution to problem Qm|r_i|∑w_iC_i; the only difference is a constant in the objective functions. Unfortunately, even the simpler problem 1|r_i|∑w_iC_i is NP-hard [3]; therefore, problem Qm|r_i|∑w_iL_i is also NP-hard.
For the online version (again, without the quality control), the situation is even more complicated [26]; this is due to the lack of
knowledge of the tasks that will arrive later from the client application. In such a case, the online algorithm might assign a longer task on
an available resource, even though it would be more beneficial to wait if a shorter task arrives very soon. Without prediction of the arrival
of future tasks, an offline scheduling rule can be employed to create a schedule of all pending tasks.
Our approach to handling the online scheduling with the quality control is to divide this problem into two steps performed by the
central scheduler and the quality control module:
1. Central scheduler: For the maximum quality of the solutions, find a schedule of all pending tasks which minimizes the average nor-
malized lateness. The schedule is found by the scheduling policy and is recreated whenever some event occurs (see Section 4.1 for a list
of events).
2. Quality control module: For a fixed schedule, find the requested quality of the solution for each task such that the average normalized
lateness is close to 0. The quality control procedure is started whenever a new schedule is created by the central scheduler.
In algorithms, an array data structure is used. An array is an ordered sequence of some objects. Bracket notation is used for accessing the
elements of an array, e.g. if a = (1, 3, 5) is an array, then a [3] = 5. The length of an array is computed using a len function, e.g. the length
of the array a is len (a) = 3. To append an element to the end of an array, bracket notation is used with end + 1 as the index, e.g. after
calling a [end + 1] ← 7, the content of a will be (1, 3, 5, 7).
An estimated completion time of task i started at time t on resource j with fixed quality φ is defined as

ĉt(i, j, t, φ) = t + ⌈ p̂(ψ^(i), φ) / s^(j) ⌉.   (11)

The number of pending tasks is denoted as n^(*) and the number of pending tasks assigned to resource j is denoted as n^(j).
The policy used for scheduling is described in Algorithm 1. It is a combination of an Earliest Due Date (EDD) selection policy and a
Minimal Completion Time (MCT) assignment policy. Although there are other policies, such as First In First Out and Shortest Task First, the
EDD policy is chosen because it considers the requested due date and it does not suffer from starvation of longer tasks.
The global variables of the central scheduler are pendingTasks and currentlyBeingProcessed. pendingTasks is the array of pending tasks and currentlyBeingProcessed is a boolean array indexed by the tasks, where currentlyBeingProcessed [i] denotes whether task i is currently being processed by some resource. Once a task is assigned to some resource and the resource starts to process it, the task cannot be moved to another resource. The global variables are kept in memory for the whole running time of the scheduling system.
The scheduling policy receives two input arguments (in addition to the global variables). The first one, denoted as t, is the time when
the event leading to rescheduling occurred. The second one, denoted as S , is the current solution to the scheduling problem.
The scheduling policy returns a schedule of the pending tasks, which is an array sch indexed by the resources. Each element sch [j] of this array is another array which defines the order of the assigned tasks on that resource.
The policy starts by performing the initialization of the earliestStartTime and sch arrays. The earliestStartTime array represents the
earliest start time of the tasks on each resource with respect to the previous assignments. Then, in the loop at line 7, the currently
processed tasks on the resources are added to the start of the schedule of each resource. The loop at line 11 ensures that the rest of the
pending tasks cannot be assigned in the past. The next part sorts the tasks by their requested due dates, i.e. this is the selection policy. The
last loop at line 15 finds a resource for each pending task such that it can finish that particular task at the earliest with respect to the
previously assigned tasks, i.e. this is the assignment policy.
Algorithm 1. EDDMCT (Earliest Due Date + Minimal Completion Time) scheduling policy.
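The EDD selection and MCT assignment steps described above can be sketched as follows; the dictionary-based task representation is an assumption, and the handling of currently processed tasks and the global bookkeeping of the full policy are omitted for brevity.

```python
import math

def eddmct(pending, speeds, now):
    """Sketch of Algorithm 1: sort pending tasks by requested due date
    (EDD selection), then greedily assign each task to the resource that
    would finish it first (MCT assignment), cf. the estimated completion
    time of Eq. (11). Each task carries 'rdd' (requested due date) and
    'work' (estimated normalized processing time); speeds[j] is the
    speed of resource j."""
    m = len(speeds)
    earliest_start = [now] * m          # earliest start time per resource
    schedule = [[] for _ in range(m)]   # schedule[j]: task order on j
    for task in sorted(pending, key=lambda x: x['rdd']):   # EDD
        # MCT: resource on which this task would complete the earliest
        best_j = min(range(m), key=lambda j: earliest_start[j]
                     + math.ceil(task['work'] / speeds[j]))
        schedule[best_j].append(task)
        earliest_start[best_j] += math.ceil(task['work'] / speeds[best_j])
    return schedule
```

The sort accounts for the n^(*) log n^(*) term of the policy's complexity and the inner minimum over resources for the n^(*)·m term.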
Since the environment in which the scheduling system operates is not static, the scheduling system needs to adapt to the changes in
this environment. These changes are propagated to the scheduling system as events. When the environment changes, the scheduling
system receives a corresponding event and runs the scheduling policy to adapt to the new state of the environment. The following events
are recognized by the scheduling system (if an additional procedure needs to be performed before the scheduling policy is run, the
description of that procedure is also provided):
New task i was received by the scheduling system: task i is added to the array of pending tasks, i.e.
currentlyBeingProcessed [i] ← false
pendingTasks ← pendingTasks ∪ {i}
The solution for some task i was received by the scheduling system: task i is removed from the array of pending tasks, i.e.
currentlyBeingProcessed [i] ← false
pendingTasks ← pendingTasks⧹{i}
The complexity of the presented policy is O(n(*) log n(*) + n(*)·m). The term n(*) log n(*) is due to the sorting of the pending tasks. The term n(*)·m is added because it is necessary to iterate over all the resources to find the resource which completes the given task at the earliest.
We propose two quality control algorithms: (i) bisection control and (ii) individual control. Both algorithms are able to handle the
overload situations. However, they have different behavior and assumptions.
$$\frac{1}{n}\sum_{i\in N}\frac{l(i,j,st_S(i),q_S^{(i)}(t))}{rrt(i)}=\frac{1}{n}\sum_{i\in N}\frac{ct_S(i)-rdd(i)}{rrt(i)}=\frac{1}{n}\sum_{i\in N}\frac{st_S(1)+\sum_{k=1}^{i}q_S^{(k)}(t)\,\frac{\hat{p}(\psi^{(k)},1)}{s^{(j)}}-rdd(i)}{rrt(i)}$$
$$=\left(\sum_{i\in N}\sum_{k=1}^{i}q_S^{(k)}(t)\,\frac{\hat{p}(\psi^{(k)},1)}{s^{(j)}\cdot n\cdot rrt(i)}\right)+\left(\sum_{i\in N}\frac{st_S(1)-rdd(i)}{n\cdot rrt(i)}\right).\quad(12)$$
The second term on the right-hand side of Eq. (12) is a constant, i.e. its value does not depend on the values $q_S^{(i)}(t)$. This term is denoted as cons in Algorithm 4. To further simplify the first term, the sums for each i are expanded and the multipliers of each $q_S^{(i)}(t)$ are collected. By doing so, a weight of each task is obtained
$$w^{(i)}=\frac{\hat{p}(\psi^{(i)},1)}{s^{(j)}\cdot n}\sum_{k=i}^{n}\frac{1}{rrt(k)},\quad(13)$$
with which the average normalized lateness from Eq. (12) can be rewritten as
$$\left(\sum_{i\in N}q_S^{(i)}(t)\cdot w^{(i)}\right)+\left(\sum_{i\in N}\frac{st_S(1)-rdd(i)}{n\cdot rrt(i)}\right).\quad(14)$$
From this equation, it can be seen that the tasks that contribute the most to the average normalized lateness are those with the highest weights. The idea of the individual control algorithm is to greedily compress the tasks with the highest weights until the average normalized lateness is close to zero.
The algorithm starts by computing the weight of each task and the constant part cons (see line 10). Then, the tasks are sorted non-
increasingly by their weights (see line 15) and the algorithm proceeds by greedy compression of the tasks with the highest weights (see
line 17). The compression is at line 31 where the qualities for all tasks except i are fixed and the algorithm is trying to find the root of Eq.
(14). The last part of the algorithm recomputes the start time of each task using the computed qualities (see line 34).
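The weight computation and the greedy compression can be sketched as follows for a single resource. This is a simplified illustration under our own naming (`p_max[i]` stands for the estimated maximal normalized processing time of task i); it ignores the two complications discussed below, i.e. currently processed tasks and the phantom time.

```python
def individual_control(p_max, s, rrt, st1, rdd, q_min):
    """Greedy per-task quality compression following Eqs. (12)-(14).
    Tasks are indexed by their order on one resource; names illustrative."""
    n = len(p_max)
    # suffix sums of 1/rrt(k): an early task delays all later tasks
    suffix = [0.0] * (n + 1)
    for k in range(n - 1, -1, -1):
        suffix[k] = suffix[k + 1] + 1.0 / rrt[k]
    w = [p_max[i] / (s * n) * suffix[i] for i in range(n)]        # Eq. (13)
    cons = sum((st1 - rdd[i]) / (n * rrt[i]) for i in range(n))   # constant term
    q = [1.0] * n
    for i in sorted(range(n), key=lambda i: -w[i]):  # highest weights first
        lateness = sum(q[k] * w[k] for k in range(n)) + cons      # Eq. (14)
        if lateness <= 0:
            break
        # fix the other qualities and solve q[i]*w[i] + rest = 0 for q[i]
        rest = lateness - q[i] * w[i]
        q[i] = max(q_min, min(1.0, -rest / w[i]))
    return q

q = individual_control(p_max=[10.0, 10.0], s=1.0, rrt=[10.0, 10.0],
                       st1=0.0, rdd=[5.0, 5.0], q_min=0.1)
```

In this toy run the first task carries the larger weight (it delays both tasks), so it is compressed to the minimum quality first; the second task only needs partial compression to bring the average normalized lateness to zero.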
There are two complications which need to be considered when implementing this idea. The first complication is that if some task is currently
being processed by a resource, generally it cannot be compressed to ϕ because the solution found until time t can already be of better quality than ϕ ,
i.e. the quality of the current solution is a new lower bound for the quality compression. Therefore, the quality of the current solution must be reflected
in the algorithm when compressing the first task (see line 28). The second complication arises when the requested quality for the first task is set to one
and the solution is not received within the estimated completion time, e.g. the solution is being sent from a resource to the scheduling system. In such
a case, there is an idle time between the estimated completion time of the first task and time t called the phantom time, which needs to be considered
when computing the average normalized lateness (see lines 14 and 23).
In addition to the input arguments already introduced for the bisection control, the individual control algorithm has one new input argument j which denotes the resource for which the qualities are currently computed.
The complexity of the algorithm is O(n(j) log n(j)) because it is dominated by the sorting of the tasks. From Theorem 1, the total complexity of computing Algorithm 4 for all resources is O(n(*) log n(*)).
Theorem 1. The total worst case complexity of computing Algorithm 4 for all resources is O(n(*) log n(*)).
Since the normalized processing time function p is not known a priori, the scheduling system needs to estimate it.² This estimation can be used by the scheduler instead of the worst-case estimates to find better assignments.
Estimation is based on the assumption that p, for the given instance, can be approximated by a piecewise linear function which is parameterized by approximated parameters ỹ. Each parameter ỹ_k represents the normalized processing time of the endpoint of the k-th segment. An endpoint is a point where two neighboring segments of a piecewise linear function meet. The approximation is made for performance reasons: a function approximation with a few parameters can be estimated more efficiently than the whole p. However, since ỹ^(i) are not known a priori (because p is also not known a priori), the question remains how to estimate them when task i arrives in the scheduling system. To solve this problem, regression analysis is used. Using a set of previously collected parameters ỹ, a regression model is trained. That regression model is then able to estimate ỹ^(i) for the given task i; the estimated parameters for task i are denoted as ŷ^(i). Fig. 4 summarizes the flow of the normalized processing time estimation.
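Any multi-output regression method can play the role of the model g. A minimal k-nearest-neighbours sketch (the method later used in the experiments) with made-up toy data; the function and variable names are ours:

```python
import numpy as np

def knn_estimate(X_train, Y_train, x, k=3):
    """Estimate the parameters of a new task as the mean of the
    parameters of its k nearest previously collected observations."""
    dist = np.linalg.norm(X_train - x, axis=1)   # distances in feature space
    nearest = np.argsort(dist)[:k]               # indices of the k closest
    return Y_train[nearest].mean(axis=0)         # averaged parameter vector

# toy data: 1-d features, two approximated parameters per observation
X = np.array([[0.0], [1.0], [2.0], [10.0]])
Y = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [100.0, 100.0]])
y_hat = knn_estimate(X, Y, np.array([1.0]), k=3)
```

The distant outlier observation is ignored because it is not among the k closest neighbours of the query.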
The training of the regression model can be performed offline, online, or in a combination of both approaches.
In the combined approach, the estimator starts with an initial regression model which is iteratively refined using information acquired
online. The offline model can be built by sampling the observation space, i.e. the space of tuples representing the features and the
approximated parameters of the instances. However, this method assumes that we are provided with such samples. The online model
does not need such samples beforehand. However, its disadvantage is that until enough training observations are collected, the estimator
must use the default parameters such as the worst case normalized processing time. All approaches also fail to provide satisfactory
estimations if the observation space is insufficiently sampled, e.g. observations used for training the regression model are sampled from a
different region of the observation space than the observations to estimate.
This subsection describes how the training observations are collected and processed. When online learning is employed, the steps
are performed during the runtime of the scheduling system. If the offline learning is used, the steps are performed on the sampled
observations before the scheduling system is used in production. A new training observation from task i is created in the following steps:
1. The scheduling system assigns task i to some resource j with normalized processing time equal to the normalized worst case processing
time of that task. The normalized worst case processing time represents some fixed time after which the algorithm is stopped even
though the solution could still be improved. However, it is our assumption that the improvements after the normalized worst case
processing time are insignificant.
2. The algorithm solving the task must record how the objective value of the best-known solution evolves over time. This progress is recorded in an objective curve Z^(i) of task i, which is a sequence of tuples ((t_1^(i), z_1^(i)), (t_2^(i), z_2^(i)), …, (t_{h_i}^(i), z_{h_i}^(i))), where z_k^(i) is the objective value of the best-known solution until time t_k^(i) and h_i is the number of collected tuples. Both z_k^(i) and t_k^(i) are assumed to be increasing sequences, i.e. it is assumed that the objective function is maximized. The objective function can also be minimized, but then the algorithm solving the task is responsible for transforming the decreasing objective curve into an increasing one.
² For simplicity, we assumed in this work that the instances are only of one optimization problem, i.e. personnel rostering. However, it is possible that the scheduling system would have to process more optimization problems. In such a case, the processing time functions would be indexed by the problems and the estimation would be performed for each problem individually.
Fig. 5. (Color online.) An example of a measured objective curve with highlighted worst case processing time (WCT) and the threshold time denoting the processing time for
solution quality of 1. The trimmed points are highlighted by larger yellow circles.
3. When the task is finished, resource j sends the solution and objective curve Z (i) back to the central scheduler which sends the objective
curve to the estimator.
4. The estimator rescales the times t_k^(i) in the objective curve by the estimated speed of resource j to the normalized processing times t_k^(i)·s^(j).
5. The objective curve is trimmed to retain only the times t_k^(i) ≤ WCT(ψ^(i)).
6. The objective curve is trimmed to retain only the part where the approximated slope of the objective value is higher than some threshold, i.e. this step performs the "tail cutting". The slope for each time t_k^(i) is computed as
$$slope_k=\frac{z_{k+1}^{(i)}-z_k^{(i)}}{t_{k+1}^{(i)}-t_k^{(i)}}\quad(15)$$
and the threshold is computed as 0.05·max_k slope_k. This is the last step where the objective curve is trimmed; let us denote the index of the last remaining tuple from the objective curve as k_max.
The reason for this step is that even if the normalized worst case processing time represents the stopping time of the algorithm, it does
not mean that the solution of the maximum quality cannot be found much earlier or that all solutions found after some threshold time
are significantly better than the solution found by the threshold time. Consider Fig. 5, which illustrates this idea. The threshold time represents the time after which all measurements are trimmed because the improvements of the solution are not significant. The best-known solution until this threshold time is then declared as the solution of the maximum quality.
7. Objective values z_k^(i) are linearly scaled to [0, 1] using min-max normalization. The normalized values are denoted as ϕ_k^(i).
8. The curve from the previous step represents the dependence of the normalized processing time on the solution quality. The number of tuples in this curve can be large and, therefore, a compact representation by an approximated curve is used. For our purposes, a piecewise linear function with ten segments is used. The quality endpoints of this piecewise linear function are fixed to 0.1, 0.2, …, 0.9, 1. To acquire the approximated parameters ỹ^(i), the curve from the previous step is sampled at those fixed endpoints, e.g. ỹ_3^(i) represents the normalized processing time needed to find the solution of quality 0.3. A linear interpolation between the neighboring qualities is used if the value at the given fixed quality does not exist in the curve.
9. A new training observation (x^(i), ỹ^(i)) is added to the database of the training observations, where x^(i) = f(ψ^(i)) is the feature vector of task i.
When a sufficient number of training observations is collected, a regression model is trained. The regression model g maps the feature vectors x^(i) = f(ψ^(i)) to the estimated parameters, i.e. ŷ^(i) = g(x^(i)). The found parameters are then used to construct the estimated normalized processing time function. Since a piecewise linear function is used for approximating the objective curve, the same function type is also used for the estimation. Therefore, the coordinates of the endpoints in the estimated normalized processing time function are (0.1·k, ŷ_k^(i)), where k = 1, …, 10.
Since most regression methods learn only one output, an independent regression model is trained for each quality endpoint 0.1·k and
outputs of those models are then combined into one function. However, this approach may result in a non-monotonic function; this can be
corrected by taking a running maximum of the processing time values.
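With NumPy, for instance, the monotonic correction is a one-liner; the endpoint values below are made up purely for illustration:

```python
import numpy as np

# per-endpoint models may produce a non-monotonic sequence of estimates
y_hat = np.array([0.8, 1.9, 1.7, 3.2, 4.0, 3.8, 5.1, 6.0, 6.5, 7.2])
y_mono = np.maximum.accumulate(y_hat)   # running maximum restores monotonicity
# endpoint coordinates of the estimated function: (0.1*k, y_mono[k-1])
endpoints = list(zip(np.arange(1, 11) / 10, y_mono))
```

The dips at qualities 0.3 and 0.6 are replaced by the preceding maximum, so the resulting piecewise linear function is non-decreasing in the quality.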
Fig. 6. An example of the measured processing time (drawn in red, solid line), its approximation (drawn in cyan, solid line with triangles), its full estimation (drawn in blue,
solid line with points) and its linear estimation (drawn in green, dashed line). The triangles and points in approximation and full estimation, respectively, represent the
endpoints of the piecewise linear function. The estimation is found using the k-nn method.
1. Measured processing time: No estimation is employed, the scheduling system has full knowledge of the processing time function (the function is acquired by running all instances before the scheduling system is run and measuring their processing times). Since in most cases this knowledge is unavailable, such a scenario is unrealistic in practice. However, in experiments such a scenario allows us to determine how the performance of the scheduling system differs from a more realistic scenario where the estimation is used.
2. Linear estimation: Instead of using a piecewise linear function to approximate the measured curve, a simple linear function can be used. In such a case, only one point needs to be estimated to obtain the estimated processing time function.
6. Experiments
In this section, it is verified that our proposed scheduling system can keep the average normalized lateness near 0. It is shown that if the quality control is disabled, the lateness can be 60 times larger than the requested lateness, which is unacceptable.
First, the experimental results for the offline estimation of the processing time of real-world instances from the domain of personnel rostering are provided. Then, the instances from the previous experiment and their estimations are used for the experiments with the scheduling system. The computer on which the scheduling system ran has an Intel Core i7-3520M @ 2.90 GHz processor and 8 GB of RAM.
6.1.2. Results
The results of the estimation experiment are reported in Table 1 and in Fig. 7. As an error metric, the absolute percentage error is used, defined as
$$ape(p,\hat{p},\phi,\psi)=\frac{|\hat{p}(\psi,\phi)-p(\psi,\phi)|}{p(\psi,\phi)}\cdot 100\%.\quad(16)$$
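As a direct transcription of Eq. (16) for one (instance, quality) pair; the function and variable names are ours:

```python
def ape(p_hat, p_true):
    """Absolute percentage error between the estimated and the measured
    normalized processing time for one (instance, quality) pair."""
    return abs(p_hat - p_true) / p_true * 100.0

err = ape(120.0, 100.0)   # estimate of 120 ms vs. a measured 100 ms
```

An overestimate of 120 ms against a measured 100 ms thus yields an error of 20%.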
Table 1
Absolute percentage error of the learned models on the testing set for different qualities. The values in bold represent the best value among all regression methods. The most
interesting values are in the column for the 0.75 quantile which represents the majority of the observations.
Fig. 7. The relationship between the absolute percentage error and the maximum normalized processing time. Each point represents one observation estimated using the k-
nn method. The total number of observations is 200, i.e. the whole testing set.
The absolute percentage error for the given solution quality ϕ and instance ψ is an error ratio between the normalized processing time of
instance ψ for quality ϕ and the estimated normalized processing time of instance ψ for quality ϕ. The qualities in Table 1 correspond to the
endpoint coordinates of a piecewise linear function. Fig. 7 shows the absolute percentage error of each testing observation for maximum quality.
Notice that for higher qualities the absolute percentage error of the majority of observations (the 0.75 quantile) is around 20% for both methods.
For this quantile, the k-nn method gives better estimations with the exception of the 0.2 quality. The higher estimation errors occur when the processing time is low, as can be seen in Fig. 7. This is not a considerable issue, because the absolute error, i.e. the absolute value of the difference between y and ŷ, of these short instances is small. Therefore, the overall effect on the scheduling system is also small if the system processes both short and long tasks. To overcome the problem of underestimating short tasks, the minimum processing time can be set to some fixed lower bound, e.g. 1 s. Because the k-nn method achieves a lower estimation error on the majority of the observations, the k-nn method is used in the following experiments with the scheduling system.
Fig. 6 shows an example of one observation with its measured processing time, approximated processing time function and estimated
processing time functions (full and linear estimation). The estimated processing time overestimates the measured processing time function
which is better than if the processing time functions were underestimated. To explain this, assume that there are two tasks i, i′ and the
maximum normalized processing time of both tasks is 100 ms. The estimated maximum normalized processing time of task i is 120 ms (i.e.
overestimation) and the estimated maximum normalized processing time of task i′ is 80 ms (i.e. underestimation). Next, assume that for task i′
the normalized processing time of 80 ms relates to quality 0.8. When the scheduling system is not overloaded, i.e. the requested quality of the
solution is 1, the actual quality of the solution for task i would be 1 because the processing time is equal to 120 ms which means that the
algorithm is processing task i for more time than is required to get the solution of quality 1. On the other hand, the actual quality of the solution for task i′ would be 0.8 because the processing time equals 80 ms, which means that the algorithm is processing task i′ for less time than is required to get the solution of quality 1. Therefore, underestimation is undesirable if, most of the time, the scheduling system is not overloaded.
In the following experiments, the ability of the scheduling system to keep the average normalized lateness near 0 is analyzed. The
instances from the previous experiment are used.
In the experiments, the proposed bisection and individual control algorithms (see Section 4.2) are compared to alternative control
algorithms:
1. Max quality: The quality control is disabled, i.e. the requested quality is set to 1 for all tasks. However, due to imprecision in the execution time estimation, the actual quality of the returned solutions does not always have to be 1.
2. Min quality: The requested quality is set to some minimum quality which is denoted as ϕ .
3. Random control: Before a task is assigned to some resource, the requested quality of that task is fixed to some value randomly sampled from the uniform distribution U(ϕ, 1).
4. Naive: When an overload is detected, the algorithm iteratively sets the lowest possible quality for the tasks which are currently being processed and which have been processed for the longest time, until the average normalized lateness is less than or equal to 0.
For each group, six client applications (i.e. 18 client applications in total) were created which generated tasks at some rate and sent them to
the scheduling system. The inter-arrival time of the tasks, i.e. the difference between the arrival time of consecutive tasks, was sampled
from the exponential distribution with a mean of 100 ms. In total, 600 tasks were generated from the first group, 510 from the second
group and 420 from the third group. Therefore, the total number of generated tasks was n = 1430.
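The workload generation can be reproduced with a few lines; the seed is purely illustrative, and the sketch generates all inter-arrival gaps for a single stream:

```python
import random

random.seed(7)                       # illustrative, for reproducibility
mean_ms = 100.0                      # mean inter-arrival time of the tasks
# exponential inter-arrival times with mean 100 ms => Poisson arrival process
gaps = [random.expovariate(1.0 / mean_ms) for _ in range(1430)]
arrivals, t = [], 0.0
for g in gaps:                       # cumulative sums give the arrival times
    t += g
    arrivals.append(t)
```

Note that `random.expovariate` takes the rate (1/mean), not the mean itself; with 1430 samples the empirical mean gap lands close to 100 ms.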
Fig. 8. The dependence of the normalized lateness and the solution quality of tasks on the workload when the quality control is disabled (i.e. Max quality control algorithm)
and measured processing time is used.
Fig. 9. The dependence of the normalized lateness and the solution quality of the tasks on the workload when the bisection control is enabled and measured processing time
is used.
Fig. 10. The dependence of the normalized lateness and the solution quality of the tasks on the workload when the bisection control is enabled and full estimation is used.
The results for the bisection control with the measured processing time are shown in Fig. 9 (notice that the scale of the normalized lateness is different from Fig. 8). The control algorithm detects the overload and decreases the requested quality of the solutions so that the normalized lateness is around 0. The requested quality of the solutions gradually increases as
the overload decreases. In Fig. 10, the results for the same control method with the full estimation are shown. Notice that the overall shape
of the requested quality of the solutions is similar to Fig. 9. Due to the inexact estimation, the spread of the quality is larger, however even
in such a situation the response time of the tasks is still maintained at an acceptable level.
The results for the individual control with the measured processing time are shown in Fig. 11 (the individual control with the full
estimation is not included because the effect of the estimation is not clearly visible as with the bisection control). The individual control
algorithm is also able to keep the normalized lateness around 0. From the quality of the solutions, it can be seen that the bisection and
individual control algorithms behave differently. Whereas the bisection control sets the same quality for all the tasks, the individual control can specifically address those tasks whose contribution to the average normalized lateness is the highest. If the minimum quality were set to 0, the individual control would behave similarly to an admission control, i.e. it would drop the tasks with the highest contribution to the average normalized lateness.
Finally, Fig. 12 illustrates the situation when the naive control with the measured processing time is used. Similarly to the individual control, the naive control sets the quality for each task and is also able to keep the response time around the acceptable level.
Fig. 11. The dependence of the normalized lateness and the solution quality of the tasks on the workload when the individual control is enabled and measured processing
time is used.
Fig. 12. The dependence of the normalized lateness and the solution quality of the tasks on the workload when the naive control is enabled and measured processing time is
used.
This is a good result, considering that it was not obtained at the cost of significantly decreasing the quality of the solutions. The quality of the solutions
is also affected by the estimation but, as with the bisection control, the difference is small in comparison with the situation when the
measured processing time is used. In comparison with the naive control, the individual control has much higher median of the solution
quality. In the case of the full estimation, the difference in the median of the solution quality is almost 0.3. We note that even though the normalized lateness of the naive control is smaller compared to the individual control, the difference is not relevant because the normalized lateness of both control methods is negative; we aim for a non-positive average normalized lateness.
The reason why the quality for the Min quality control with the measured processing time has a large nonzero spread is that the smallest time unit is set to 1 ms, which influences the instances with a small processing time. For example, consider a task instance with a linear processing time function and with a maximum normalized processing time of 100 ms. If the requested quality is set to 0.2, the normalized processing time equals 20 ms. If the scheduling system decides to assign that task to a resource j with a speed of s(j) = 100, it computes that the estimated processing time of that task is
$$\left\lceil\frac{p(\psi,0.2)}{s^{(j)}}\right\rceil=\left\lceil\frac{20}{100}\right\rceil=\lceil 0.2\rceil=1,\quad(17)$$
i.e. resource j should process the task for 1 ms. After converting that processing time to the normalized processing time, i.e. 100 ms, it is
obvious that the quality of the solution is 1 even though the requested quality is set to 0.2.
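The rounding effect of Eq. (17) can be checked directly with the values from the example above:

```python
import math

s_j = 100.0        # speed of resource j (from the example)
p_req = 20.0       # normalized processing time for requested quality 0.2 (ms)
# the smallest schedulable time unit is 1 ms, hence the ceiling
proc_ms = math.ceil(p_req / s_j)     # Eq. (17)
normalized = proc_ms * s_j           # convert back to normalized time
```

The requested 0.2 ms of wall-clock time is rounded up to a full 1 ms, which corresponds to 100 ms of normalized processing time, i.e. the full quality-1 run.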
The overall comparison of various quality control algorithms and estimation methods is shown in Fig. 15. The methods are compared by plotting the average quality of the solutions and the average normalized lateness. It can be seen that the full estimation does not significantly influence the average values, i.e. the scheduling system is robust against small errors in the estimation. When the bisection control is used, the average quality of the solutions for the full estimation achieves the highest quality, whereas the linear estimation is the worst. The full estimation is only slightly worse than using the measured processing time. When the individual control is used, the difference between the full and linear estimation is small because the individual control, according to Eq. (12), uses only the estimation of
Fig. 13. Comparison of various quality control and estimation methods by normalized lateness. The edges of the boxes represent the lower and upper quartiles, the red vertical line inside each box is the median and the whiskers extend from the minimum to the maximum value.
Fig. 14. Comparison of various quality control and estimation methods by solution quality. The edges of the boxes represent the lower and upper quartiles, the red vertical line inside each box is the median and the whiskers extend from the minimum to the maximum value.
the maximum normalized processing time. The difference arises from the fact that the scheduling policy uses the linear and full estimation in their original form. The result of the random control is not surprising because the mean of the uniform distribution U(ϕ, 1) = U(0.2, 1) is μ = 0.6.
Fig. 15. Comparison of various quality control and estimation methods by the average quality of the solutions and the average normalized lateness. The markers for random
control are overlapping.
Fig. 15 also shows that the total solution quality is approximately the same for the individual and bisection control. The difference of
these methods is how the quality is distributed among the tasks. Whereas the bisection control distributes the quality uniformly among
the pending tasks, the individual control can compress the processing time of those tasks that have a large contribution to the average
normalized lateness. The effect of this is that more solutions have a quality of 1 at the cost of having a few solutions of minimum quality. If
the minimum quality were set to 0, the individual control would compress some tasks completely and would, therefore, behave similarly to an admission control. Such behavior may not be desirable in some applications and, therefore, we cannot declare that the individual control is better than the bisection control even though the numeric results are in favor of the individual control.
To compare the naive and the individual control, we note that the difference in the average solution quality between the corresponding
estimation methods is at least 10% in favor of the individual control. Since the objective functions on the application layer (i.e. the objective
functions of the task instances) relate to the quality of the solutions, such difference can be significant, e.g. consider an objective function
denoting the saved money achieved by a solution. Moreover, in other scenarios the naive control may perform much worse, e.g. consider a situation when a short task 1 and two much longer tasks {2, 3} arrive in the system at the same time. Assume that the scheduler creates schedule (1, 2, 3) on one resource and the last task 3 is missing its requested due date. The naive control decreases the quality of the short task 1 even though it has a negligible effect on the completion time of the longer tasks and, therefore, the requested due date of task 3 is still
being missed. After the solution of task 1 is received, the quality is recomputed. The naive control decreases the quality of task 2, which
considerably affects the completion time of task 3, and the requested due date of task 3 is then satisfied. On the other hand, the individual control detects that the congestion is caused by task 2, leaving the quality of task 1 intact.
A question may arise about the low normalized lateness of the individual control: could the quality of the solutions be increased so that the normalized lateness is closer to zero? Recall that the average normalized lateness in the individual control is computed per resource.
Consider an example with two resources {1, 2} of speed 1 and three tasks {1, 2, 3} with parameters from Table 2. If the average normalized
lateness is computed over all resources for these tasks, then it is equal to −0.1296. However, if the average normalized lateness is
computed for each resource independently, then it is equal to 0.056 for resource 1 and for resource 2 it is equal to −0.5. Since all tasks are
running to the maximum quality and the overall average normalized lateness is less than 0, the bisection control would do nothing
whereas the individual control would compress the tasks on resource 2 so that the average normalized lateness is closer to zero. By
compressing the tasks, the solutions are received earlier and, therefore, the average normalized lateness of all the tasks in the experiments
for the individual control is less than for the bisection control. The average quality of both control algorithms is, in the end, similar because the individual control is able to compress a single bottleneck task without affecting the rest of the tasks.
Table 2
Parameters of the tasks for the example explaining why the average normalized lateness of the individual control is less than 0.
1 10 10 0 1 0 10
2 90 90 0 1 10 100
3 10 20 0 2 0 10
Fig. 16. Comparison of the control algorithms by maximum running time. For each number of pending tasks there is, at most, one point (this is not obvious due to the
density of the points).
Fig. 17. Maximum running time of the scheduling policy (since all control algorithms are using the same scheduling policy, the points are not distinguished by the control
algorithms). For each number of pending tasks there is, at most, one point (this is not obvious due to the density of the points).
The reason for the small running times of the individual control algorithm is that the case when all pending tasks are assigned to one resource is not common in typical problem instances. The number of pending tasks on each resource is rather balanced, which leads to lower running times.
7. Conclusion
In this work, a problem of scheduling tasks in a heterogeneous environment is considered, i.e. the scheduling system has to determine
the assignments of the tasks to the available computational resources. The goal of the scheduling system is to handle overload situations in
which many computationally intensive tasks arrive in the system. In such a situation, the response time of the server increases considerably and this leads to user dissatisfaction.
The first characteristic of the tasks is that they are solved by anytime algorithms, i.e. the quality of the solutions depends on the
processing time of the tasks. The second characteristic is that the relationship between the solution quality and the processing time of the
tasks is not known a priori. Therefore, to make intelligent decisions the scheduling system has to estimate it.
Since the problem in its entirety has not been addressed in the existing literature, we proposed a modular system architecture, two efficient quality control algorithms and a procedure for estimating the processing time functions of the tasks. Both quality control algorithms exploit the anytime property, i.e. when an overload is detected, the algorithms decrease the requested quality of the solutions,
thus, decreasing the response time of the server. The algorithms differ in how the requested quality is controlled; the bisection control sets
one global quality for all tasks, whereas the individual control controls the quality of each task independently. The algorithmic complexities
( )
of the bisection and individual control algorithms are 6 maxIters·( m + n(*) ) and 6 ( n(*) log n(*) ), respectively ( m is a number of resources
and n(*) is a number of tasks in the system). Thanks to the low algorithmic complexity, the system can be used in online environments and
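The core of the bisection control can be sketched as a standard bisection search over the global quality. The interface below is hypothetical: `predicted_time` stands in for the system's workload model and the linear time model in the usage line is purely illustrative; only the bisection structure itself is taken from the text, with each of the maxIters steps evaluating one candidate global quality.

```python
def bisect_quality(predicted_time, target, max_iters=20):
    """Bisection search for the largest global quality q in [0, 1] whose
    predicted server response time stays within `target`.  The callable
    `predicted_time(q)` is assumed monotonically increasing in q
    (a higher requested quality means longer processing)."""
    lo, hi = 0.0, 1.0
    for _ in range(max_iters):
        mid = (lo + hi) / 2
        if predicted_time(mid) <= target:
            lo = mid   # this quality is feasible, try a higher one
        else:
            hi = mid   # overload predicted, lower the requested quality
    return lo

# Illustrative linear model: 10 tasks, each taking 1 + 9*q seconds,
# with a target response time of 50 seconds.
q = bisect_quality(lambda q: 10 * (1 + 9 * q), target=50.0)
# q ≈ 0.444 (= 4/9 up to the bisection tolerance)
```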
116 I. Módos et al. / Computers & Operations Research 76 (2016) 95–117
Acknowledgment
This work was supported by ARTEMIS FP7 EU and by the Ministry of Education of the Czech Republic under the project DEMANES 295372, and by the Grant Agency of the Czech Republic under the project GACR FOREST P103-16-23509S.
$$\prod_{i=1}^{n} \left( \frac{\sum_{j=1}^{n} x_j}{x_i} \right)^{x_i} > 1. \qquad \square \tag{A.3}$$
Theorem 1. The total worst-case complexity of computing Algorithm 4 for all resources is O(n(*) log n(*)).
Proof. In Section 4.2.2, the complexity of computing Algorithm 4 for a single resource was already discussed. To prove Theorem 1, it has
to be shown that the worst case is when all pending tasks are assigned to only one resource.
Assume that not all pending tasks are assigned to one resource, i.e.

$$n^{(*)} = \sum_{j \in M \,:\, n^{(j)} \geq 1} n^{(j)}. \tag{A.7}$$
By substituting n(*) into inequality (A.6), we get
This inequality proves that the worst case is when all pending tasks are assigned to only one resource. □
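One way to see why inequality (A.3) implies this worst case is to take logarithms; writing $S = \sum_{j=1}^{n} x_j$, a sketch of the step:

```latex
\log \prod_{i=1}^{n} \left( \frac{S}{x_i} \right)^{x_i}
  = \sum_{i=1}^{n} x_i \left( \log S - \log x_i \right)
  = S \log S - \sum_{i=1}^{n} x_i \log x_i > 0,
```

so $S \log S > \sum_{i=1}^{n} x_i \log x_i$. Substituting $x_j = n^{(j)}$ shows that splitting the pending tasks over several resources gives $\sum_{j} n^{(j)} \log n^{(j)} < n^{(*)} \log n^{(*)}$, i.e. the total cost is maximized when all pending tasks are on a single resource.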