Distributed Scheduling & Load Balancing


Distributed Scheduling

1
Contents
• Distributed Scheduler
• Motivation
• Issues in load distributing
• Load distributing algorithms
• Load sharing policies

2
Introduction
• Need for a good resource allocation scheme for a
distributed system (DS)
• Distributed scheduler:
• A resource management component of a
distributed operating system that focuses on
judiciously and transparently redistributing the
load of the system among the computers such
that the overall performance of a
system is maximized.
• More suitable for LANs than WANs

3
4
Motivation
• Need for load distributing because of
– Random arrival of tasks
– Random CPU service requirements
• Needed for both heterogeneous and homogeneous systems
• E.g. a system of N identical and independent servers:
Let p be the utilization of each server
P0 = 1 – p is the probability that a server is idle
P is the probability that the system is in a state in which at least one
task is waiting for service while at least one server is idle
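The value of P can be sketched numerically under the common textbook assumption that each of the N servers behaves as an independent M/M/1 queue at utilization p, so that per server: P(idle) = 1 - p, P(exactly one task) = p(1 - p), and P(at least one task waiting) = p². The function name and the example points below are ours:

```python
def wait_while_idle_prob(p: float, n: int) -> float:
    """Probability that at least one task waits at some server while at
    least one other server is idle, for n independent M/M/1 servers at
    utilization p (inclusion-exclusion over the complementary events)."""
    no_idle = p ** n                    # every server is busy
    no_waiting = (1 - p ** 2) ** n      # every server holds <= 1 task
    neither = (p * (1 - p)) ** n        # every server holds exactly 1 task
    return 1 - no_idle - no_waiting + neither

# Moderate utilization: high potential for load distribution.
print(round(wait_while_idle_prob(0.5, 10), 3))   # ~0.943
# High utilization with few servers: low potential.
print(round(wait_while_idle_prob(0.9, 2), 3))    # ~0.162
```

These two points match the observations on the next slide: P is high at moderate utilization and drops when the system is small and heavily loaded.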

5
Motivation
contd…

6
Motivation
contd…
Some Observations:
– For moderate system utilization the value of P is
high i.e. higher potential for load distribution
– At high system utilization the value of P is low i.e.
lower potential for load distribution
– At low system utilization the value of P is again low
– As the number of servers in the system
increases, P remains high even at high system
utilization

7
Issues in Load Distributing
• Some terminology
– Performance of a system
• One metric is the average response time of a task:
the length of the time interval between its
origination and completion
– Defining proper load index

8
Issues in Load Distributing
contd…
• Load
– CPU queue length as a load indicator
– CPU utilization as a load indicator

9
Classification of load distributing
algorithms
• Goal of Load distributing algorithm
– To transfer load from heavily loaded computers to
idle or lightly loaded computers
• broadly characterized as
– Static
• Decision is hard-wired into the algorithm using a priori
knowledge of the system
– Dynamic
• Make use of system state information to
make load distributing decisions
10
Classification of load distributing
algorithms contd…
– Adaptive
• Special class of dynamic algorithm
• they adapt their activities by dynamically
changing the parameters of the algorithm to suit the
changing system state

11
Load balancing vs. Load sharing
• unshared state :
– A state in which one computer lies idle while at the same
time tasks contend for service at another computer
• Load sharing algo:
– Strive to reduce the likelihood of an unshared state
• Load balancing algo:
– Attempt to equalize the loads at all computers
– Higher overhead than load sharing algo
• anticipatory task transfer
– To reduce the duration of unshared state

12
Preemptive vs. Nonpreemptive
transfers
• Preemptive:
– Transfer of a task that is partially executed
• Non-preemptive:
– Transfer of a task that has not yet begun
executing

13
Components of Load Distributing
algorithm
• Four components
– Transfer policy
• Determines whether a node is in a suitable state to participate in a
task transfer
– Selection policy
• determines which task should be transferred
– Location policy
• determines to which node a task selected for transfer
should be sent
– Information policy
• responsible for triggering the collection of
system state information
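A minimal sketch of how the four components might fit together; the queue-length load index, the threshold value, and all names here are illustrative assumptions, not part of the slides:

```python
THRESHOLD = 2  # illustrative load threshold, in units of CPU queue length

def transfer_policy(queue_len: int) -> str:
    """Decide whether this node should take part in a task transfer."""
    if queue_len > THRESHOLD:
        return "sender"
    if queue_len < THRESHOLD:
        return "receiver"
    return "ok"

def selection_policy(tasks):
    """Simplest choice: pick the newly originated task (the last arrival)."""
    return tasks[-1] if tasks else None

def location_policy(nodes, poll):
    """Serially poll candidate nodes (a demand-driven information policy)
    and return the first one that reports itself a receiver, or None."""
    for node in nodes:
        if poll(node) == "receiver":
            return node
    return None  # no suitable node found: process the task locally
```

Here the information policy is implicit in `poll`: state is collected on demand, only when a transfer decision has to be made.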

14
Transfer Policy
• Threshold policy
– Thresholds are expressed in terms of units of load
– Decision is made upon the origination of a new task
– Concept of sender and receiver nodes
• Alternatively, triggered on detecting an imbalance in load
amongst the nodes in the system

15
Selection Policy
• Selects a task for transfer
• Simplest approach: to select the newly originated
task
• Overhead incurred in task transfer should be
compensated by the reduction in the response time
realized by the task
• Factors to consider:
– Overhead incurred by transfer should be minimal
– Number of location dependent system calls made by the
selected task should be minimal

16
Location Policy
• To find suitable nodes to share load with (a sender or a
receiver)
• Widely used method : polling
– Either serially or in parallel
– Either randomly or on a nearest-neighbor basis
• Alternative to polling
– Broadcast a query to find out if any
node is available for load sharing

17
Information Policy
• To decide when, where and what information
about the states of other nodes on the system
should be collected
• One of three types:
– Demand driven
• Node collects the state of the other nodes only when it
becomes either a sender or a receiver
• dynamic policy
• Can be sender-initiated, receiver-initiated, or
symmetrically initiated
18
Information Policy
contd…
– Periodic
• Nodes exchange load information periodically
• Do not adapt their activity to the system state
• Benefits are minimal at high system loads ???
– State change driven
• Nodes disseminate state information whenever
their state changes by a certain degree
• Centralized and decentralized policy

19
Load distributing algorithms
• Sender initiated algorithms
• Receiver initiated algorithms
• Symmetrically initiated algorithms

20
Sender-Initiated algorithms
• Initiated by an overloaded node (the sender), which transfers a
task to an underloaded node (the receiver)
• Transfer policy :
– Threshold policy based on CPU queue length
• Selection Policy:
– Consider only newly arrived tasks for transfer
• Location policy:
– Random :
• no remote state information
• task is transferred to a node selected at random

21
Sender-Initiated algorithms
– Random:
• Useless task transfers can occur
• A transferred task is treated as a new arrival at the destination,
so it may be transferred yet again
• Thrashing problem:
– Solution: limit the number of times a task can be transferred
• Substantial performance improvement over no
load sharing at all

22
Sender-Initiated algorithms
– Threshold
• Poll a node to determine whether it is a receiver or
not
• PollLimit: a limit on the number of nodes to poll
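The threshold location policy might be sketched as follows; the PollLimit value, the function names, and the random choice of candidates are illustrative assumptions:

```python
import random

POLL_LIMIT = 5  # illustrative cap on the number of polls per task

def threshold_location_policy(nodes, is_receiver):
    """Poll up to POLL_LIMIT randomly selected nodes; transfer to the
    first one that reports it is a receiver, else run the task locally
    (signalled here by returning None)."""
    candidates = random.sample(nodes, min(POLL_LIMIT, len(nodes)))
    for node in candidates:
        if is_receiver(node):  # one poll message per candidate
            return node
    return None
```

Because each polled node is asked whether accepting the task would push it past its own threshold, the useless transfers of the purely random policy are avoided.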

23
Sender-Initiated algorithms

24
Sender-Initiated algorithms
– Shortest
• Chooses the best receiver for a task
• Makes use of CPU queue length: among the polled nodes, the
one with the shortest queue is chosen
• Information policy
– Demand driven
• Stability
– Instability at high system load

25
Receiver-Initiated Algorithms
• Initiation by an underloaded node (receiver)
• Transfer policy
– Threshold policy based on CPU queue length
– Triggered when a task departs
• Selection policy
– Any
• Location policy
– Threshold policy

26
Receiver-Initiated Algorithms
(cont.)

27
Receiver-Initiated Algorithms
(cont.)
• Information policy
– Demand-driven type.
• Stability
– Do not cause system instability at high load. Why????
– Do not cause system instability at low load. Why????
• Drawback
– Most transfers are preemptive.
– What about sender-initiated algorithms????

28
Comparison of Sender-Initiated and
Receiver-Initiated Algorithms
• Stability
• Robustness
– Receiver-initiated policies have an edge over the sender-initiated
policies
• They perform acceptably with a single threshold value over the
entire load spectrum, while sender-initiated policies
require an adaptive location policy

29
Comparison of Sender-Initiated and
Receiver-Initiated Algorithms
contd…

30
Symmetrically Initiated Algorithms
• Both senders and receivers search
for receivers and senders respectively
• Advantages and disadvantages of both sender
and receiver initiated algorithms.
• Above average algorithm.

31
The above average algorithm
• Proposed by Krueger and Finkel
• Tries to maintain the load at each node within
an acceptable range of the system average
• Why not exact system average ????

32
Transfer Policy
• Use two adaptive thresholds:
– Equidistant from the node’s estimate of the average load
across all nodes
– E.g. if the average load is 2, the lower threshold = 1 and the
upper threshold = 3
• A node whose load is greater than the upper threshold is
a sender
• A node whose load is less than the lower threshold is
a receiver
• Nodes that have loads between these thresholds lie
within the acceptable range, so they are neither
senders nor receivers.
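The two-threshold classification can be sketched directly; the margin of 1 load unit around the estimated average is an illustrative assumption matching the example above:

```python
def classify_node(load: float, avg_estimate: float, margin: float = 1.0) -> str:
    """Classify a node using two thresholds equidistant from its own
    estimate of the average load across all nodes."""
    lower, upper = avg_estimate - margin, avg_estimate + margin
    if load > upper:
        return "sender"
    if load < lower:
        return "receiver"
    return "acceptable"

# With an estimated average load of 2, the thresholds are 1 and 3:
print(classify_node(4, 2))  # sender
print(classify_node(0, 2))  # receiver
print(classify_node(2, 2))  # acceptable
```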

33
Location Policy
• The location policy has the following
two components:
– Sender-initiated component
– Receiver-initiated component

34
Sender-initiated component
• Sender node:
– TooHigh message
– TooHigh timeout alarm
• Receiver node
– TooLow timeout alarm
– Accept message
– AwaitingTask timeout alarm
– Increases load before accepting a task. Why????
• What if sender receives TooLow message
while waiting for Accept message??

35
Sender-initiated component
contd…
• On expiration of TooHigh timeout, if no Accept
message is received,
– Sender infers that its estimate of the average
system load is too low
– Hence, it broadcasts a ChangeAverage message to
increase the average load estimate at the other
nodes.

36
Receiver-initiated component
• A node, on becoming a receiver, broadcasts a TooLow
message, sets a TooLow timeout, and starts listening
for a TooHigh message.
• If a TooHigh message is received, the receiver
performs the same actions that it does under
sender-initiated negotiation
• If the TooLow timeout expires before receiving any
TooHigh message, the receiver broadcasts a
ChangeAverage message to decrease the average
load estimate at the other nodes

37
Selection and Information Policy

• Selection policy
– This algorithm can make use of any of the
approaches discussed earlier.
• Information policy
– Demand-driven.

38
Symmetrically initiated algorithm
• average system load is determined individually
at each node
• Load balancing actions adapt to the state
of the communication network as well

39
Adaptive Algorithms
• A stable symmetrically initiated algorithm
• A stable sender initiated algorithm

40
A stable symmetrically initiated
algorithm
• Instability in the previous algorithms is due to
indiscriminate polling by the sender's negotiation
component.
• Utilize the information gathered during polling to
classify the nodes in the system as either
Sender/overloaded, Receiver/underloaded, or OK.
• The knowledge concerning the states of nodes is
maintained by a data structure at each node: a
senders list, a receivers list, and an OK list.
• Initially, each node assumes that every other node is
a receiver.

41
Transfer policy
• A threshold policy where decisions are based on CPU
queue length.
• Triggered when a new task originates or when a task
departs.
• Two threshold values: a lower threshold (LT), an
upper threshold (UT).
• A node is said to be a sender if its queue length > UT,
a receiver if its queue length < LT, and OK if LT ≤
node’s queue length ≤ UT.

42
Location policy
• Sender initiated component
• Receiver initiated component

43
Sender initiated component
• Triggered when node becomes sender
• The sender polls the node at the head of the receivers list
to determine whether it is still a receiver
• Processing at the polled node:
• Processing when the response arrives from the polled node:
• Polling stops if:
– A suitable receiver is found
– The number of polls reaches the PollLimit
– The receivers list at the sender node becomes empty
• If no receiver is found, the task is processed locally
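The sender-initiated component above might look like this sketch: the list handling follows the slide (poll the head of the receivers list, reclassify each polled node by its reply), while the PollLimit value and function names are illustrative assumptions:

```python
from collections import deque

POLL_LIMIT = 5  # illustrative

def sender_poll(receivers, ok, senders, poll_state):
    """Poll nodes from the head of the receivers list, moving each polled
    node to the head of the list matching its reply; return the receiver
    to transfer the task to, or None (task is processed locally)."""
    polls = 0
    while receivers and polls < POLL_LIMIT:
        node = receivers.popleft()       # head of the receivers list
        state = poll_state(node)         # "receiver", "ok", or "sender"
        polls += 1
        if state == "receiver":
            receivers.appendleft(node)   # still a receiver: keep at head
            return node                  # transfer the task to this node
        elif state == "ok":
            ok.appendleft(node)
        else:
            senders.appendleft(node)
    return None
```

Reclassifying every polled node is what makes the polling pay for itself: stale entries drain out of the receivers list, so later polls become increasingly accurate.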

44
Receiver initiated component
• Nodes to poll are selected in the following order:
– Head to tail in senders list
– Tail to head in OK list
– Tail to head in receivers list
• The receiver polls the selected node to determine whether it is a
sender
• Processing if polled node is sender
• Processing if polled node is not a sender
• Polling process stops if,
– A sender is found
– If the receiver is no longer a receiver
– No. of polls reaches a PollLimit
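The receiver's polling order can be sketched directly from the three lists; flattening them into a single sequence is our illustrative simplification:

```python
def receiver_poll_order(senders, ok, receivers):
    """Order in which a receiver polls nodes: head to tail through the
    senders list, then tail to head through the OK list, and finally
    tail to head through the receivers list."""
    return list(senders) + list(reversed(ok)) + list(reversed(receivers))

print(receiver_poll_order(["s1", "s2"], ["k1", "k2"], ["r1", "r2"]))
# ['s1', 's2', 'k2', 'k1', 'r2', 'r1']
```

Polling the tails of the OK and receivers lists first targets the stalest entries, the ones most likely to have since become senders.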

45
Selection and Information Policy
• Selection policy:
– The sender initiated component considers
only newly arrived tasks for transfer.
– The receiver initiated component can make use of
any of the approaches discussed earlier.
• Information policy: demand-driven.

46
Discussion
• Future sender initiated polls at high
system loads are prevented. How???
• What about Receiver initiated component
at low system load ???
• Positive effect of updating the receiver list

47
A stable sender initiated algorithm
• Two desirable properties:
– It does not cause instability
– Load sharing is due to non-preemptive transfer only.
• Uses the sender initiated load sharing component of
the stable symmetrically initiated algorithm
• Has a modified receiver-initiated component to
attract future non-preemptive task transfers from
sender nodes.

48
A stable sender initiated algorithm
• The data structure (at each node) of the stable
symmetrically initiated algorithm is augmented by an
array called the statevector.
• The statevector is used by each node to keep track of
which list (senders, receivers, or OK) it belongs to at
all the other nodes in the system.
• When a sender polls a selected node, the sender's
statevector is updated to reflect that the sender now
belongs to the senders list at the selected node; the
polled node updates its statevector, based on the reply
it sends to the sender node, to reflect which list it will
belong to at the sender

49
A stable sender initiated algorithm
• The receiver initiated component is
replaced by the following protocol:
– When a node becomes a receiver, it informs all the nodes that are
misinformed about its current state. The misinformed nodes are those
whose receivers lists do not contain the receiver's ID.
– The statevector at the receiver is then updated to reflect that it now
belongs to the receivers list at all those nodes that were informed of
its current state.
– By this technique, this algorithm avoids the receivers sending
broadcast messages to inform other nodes that they are receivers.
• No preemptive transfers of partly
executed tasks here.
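The receiver-side protocol above could be sketched with the statevector kept as a plain dict mapping each peer to the list this node believes it occupies there; the names and the notify callback are illustrative assumptions:

```python
def become_receiver(statevector, peers, notify):
    """On becoming a receiver, inform only the misinformed peers (those
    whose receivers lists, per our statevector, do not contain us) and
    record that we now sit on their receivers lists. No broadcast needed."""
    misinformed = [p for p in peers if statevector.get(p) != "receivers"]
    for peer in misinformed:
        notify(peer)                   # point-to-point message
        statevector[peer] = "receivers"
    return misinformed
```

Only the misinformed subset is contacted, which is exactly how the algorithm avoids the broadcast a naive receiver announcement would require.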

50
Performance comparison
• Symmetrically initiated load sharing
• Stable load sharing algorithms
• Performance under heterogeneous workloads

51
Symmetrically initiated load sharing

52
Stable load sharing algorithms

53
Selecting a suitable load sharing
algorithm

1. Systems under consideration that never attain high loads
2. Systems that can reach high loads
3. Systems that experience a wide range of load
fluctuations
4. Systems that experience a wide range of load
fluctuations and have a high cost for migrating
partly executed tasks
5. Systems that experience heterogeneous work
arrival

54
