Analysis and Synthesis of Computer Systems
2nd Edition
Erol Gelenbe
Imperial College, UK
Isi Mitrani
University of Newcastle upon Tyne, UK
Distributed by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to
photocopy is not required from the publisher.
ISBN-13 978-1-84816-395-9
Printed in Singapore.
The book has been revised and extended, in order to reflect important
developments in the field of probabilistic modelling and performance
evaluation since the first edition. Notable among these is the introduction
of queueing network models with positive and negative customers. A large
class of such models, together with their solutions and applications, is
described in Chapter 4. Another recent development concerns the solution
of models where the evolution of a queue is controlled by a Markovian
environment. These Markov-modulated queues occur in many different
contexts; their exact and approximate solution is the subject of Chapter 5.
Finally, the queue with a server of walking type described in Chapter 2 is
given a more general treatment in Chapter 10.
Erol Gelenbe
Isi Mitrani
February 2010
February 11, 2010 13:12 spi-b749 9in x 6in b749-fm
January 11, 2010 12:17 spi-b749 9in x 6in b749-ch01
Chapter 1
Basic Tools of Probabilistic Modelling
Fig. 1.1.
of a possible sample path for this process is shown in Fig. 1.1: customers
arrive at moments a_1, a_2, ... and depart at moments d_1, d_2, ...
An examination of the sample paths of a queueing process can disclose
some general relations between different quantities associated with a given
path. For instance, in the single-server system, if N(t_1) = N(t_2) for
some t_1 < t_2, and there are k arrivals in the interval (t_1, t_2), then
there are k departures in that interval. Since a sample path represents
a system in operation, relations of the above type are sometimes called
operational laws or operational identities (Buzen [1]). We shall derive
some operational identities in section 1.7. Because they apply to individual
sample paths, these identities are independent of any probabilistic
assumptions governing the underlying stochastic process. Thus, the operational
approach to performance evaluation is free from the necessity to make such
assumptions. It is, however, tied to specific sample paths and hence to
specific runs of an existing system where measurements can be taken.
The probabilistic approach involves studying the stochastic process
which represents the system. The results of such a study necessarily depend
on the probabilistic assumptions governing the process. These results are
themselves probabilistic in nature and concern the population of all possible
sample paths. They are not associated with a particular run of an existing
system, or with any existing system at all. It is often desirable to evaluate
not only the expected performance of a system, but also the likely deviations
from that expected performance. Dealing with probability distributions
makes this possible, at least in principle.
We shall be concerned mainly with steady-state system behaviour,
that is, with the characteristics of a process which has been running for
a long time and has settled down into a statistical equilibrium regime.
Long-run performance measures are important because they are stable;
P(S(t + y) = j | S(u), u ≤ t) = P(S(t + y) = j | S(t)),   t, y ≥ 0,  j = 0, 1, ... .   (1.1)
The right-hand side of (1.1) may depend on t, y, j and the value of S(t). If,
in addition, it is independent of t, i.e. if
state i and of anything that happened before that time. This very important
property will be referred to as the memoryless property.
The probability p_{i,j}(y), regarded as a function of y, is called the
transition probability function. The memoryless property immediately
implies the following set of functional equations:

p_{i,j}(x + y) = Σ_{k=0}^∞ p_{i,k}(x) p_{k,j}(y),   x, y ≥ 0,  i, j = 0, 1, ... .   (1.3)
These equations express simply the fact that, in order to move from state
i to state j in time x + y, the process has to be in some state k after time
x and then move to state j in time y (and the second transition does not
depend on i and x). They are the Chapman-Kolmogorov equations of the
Markov process. Introducing the infinite matrix P(y) of transition functions
p_{i,j}(y), we can rewrite (1.3) as
That assumption, together with (1.3), ensures that p_{i,j}(y) is continuous,
and has a continuous derivative, for all y ≥ 0; i, j = 0, 1, ... (we state this
without proof).
A special role is played by the derivatives a_{i,j} of the transition functions
at y = 0. By definition,

a_{i,i} = lim_{y→0} [p_{i,i}(y) − 1]/y,   i = 0, 1, ...,
a_{i,j} = lim_{y→0} p_{i,j}(y)/y,   i ≠ j = 0, 1, ... .   (1.6)
Hence, if h is small,
In fact, since P(y) is a stochastic matrix (its rows sum up to 1), the rows
of P′(y) must sum up to 0 for all y ≥ 0.
Let A = [a_{i,j}], i, j = 0, 1, ..., be the matrix of instantaneous transition
rates. Differentiating (1.4) with respect to x and then letting x → 0
yields a system of equations known as the Chapman-Kolmogorov backward
differential equations:
Either (1.10) or (1.11) can be solved for the transition probability functions,
subject to the initial conditions P(0) = I (the identity matrix) and P′(0) =
A. In a purely formal way, treating P(y) as a numerically valued function
and A as a constant, (1.10) and (1.11) are satisfied by
H̄_i(x)H̄_i(y) = H̄_i(x + y),   x, y ≥ 0.   (1.14)
Any distribution function which satisfies (1.14) must fall into one of
the following three categories:

(i) H̄_i(x) = 1 for all x ≥ 0. If this is the case, once the process enters
state i it remains there forever (properly speaking, the holding time
does not have a distribution function then). States of this type are
called absorbing.
(ii) H̄_i(x) = 0 for all x > 0. In this case the process bounces out of state
i as soon as it enters it. Such states are called instantaneous.
(iii) H̄_i(x) is monotone decreasing from 1 to 0 on the interval [0, ∞) and
is differentiable. States in this category are called stable.
From now on, we shall assume that all states are stable. Differentiating
Eq. (1.14) with respect to y and letting y → 0 we obtain H̄_i′(x) = −σ_i H̄_i(x),
where σ_i = −H̄_i′(0). Hence

H̄_i(x) = e^{−σ_i x},   x ≥ 0,
H_i(x) = 1 − e^{−σ_i x},   x ≥ 0,   (1.15)

σ_i = −a_{i,i},   i = 0, 1, ... .   (1.16)
From (1.15), (1.7) and the memoryless property it follows that the
probability that the process remains in state i for time x and then moves
to state j in the infinitesimal interval (x, x + dx) is equal to

e^{−σ_i x} a_{i,j} dx,   x ≥ 0,  j ≠ i.

Integrating this expression over all x ≥ 0 gives us the probability that the
next state to be entered will be state j:

q_{i,j} = ∫_0^∞ e^{−σ_i x} a_{i,j} dx = a_{i,j}/σ_i = −a_{i,j}/a_{i,i},   i ≠ j = 0, 1, ... .   (1.17)
We derived (1.15) and (1.17) under the assumption that the Markov
process was observed at some arbitrary, but fixed, moment t. These results
continue to hold if, for example, the process is observed just after it enters
state i. Moreover, a stronger assertion can be made (we state it without
proof): given that the process has just entered state i, the time it spends
there and the state it enters next are mutually independent.
The behaviour of a Markov process can thus be described as follows:
at time t = 0 the process starts in some state, say i; it remains there for an
interval of time distributed exponentially with parameter σ_i (average length
1/σ_i); the process then enters state j with probability q_{i,j}, remains there for
an exponentially distributed interval with mean 1/σ_j, enters state k with
probability q_{j,k}, etc. The successive states visited by the process form a
Markov chain: that is, the next state depends on the one immediately
before it, but not on all the previous ones and not on the number of moves
made so far. This Markov chain is said to be embedded in the Markov
process.
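The description above translates directly into a simulation scheme: draw an exponential holding time with parameter σ_i = −a_{i,i}, then jump according to the embedded-chain probabilities q_{i,j} = a_{i,j}/σ_i. A minimal sketch, using a hypothetical three-state generator matrix chosen purely for illustration:

```python
import random

# Hypothetical 3-state generator: off-diagonal a[i][j] are transition rates,
# each diagonal entry a[i][i] is minus the sum of the rest of its row.
A = [[-3.0, 2.0, 1.0],
     [1.0, -4.0, 3.0],
     [2.0, 2.0, -4.0]]

def simulate_ctmc(A, t_end, state=0, seed=1):
    rng = random.Random(seed)
    time_in_state = [0.0] * len(A)
    t = 0.0
    while t < t_end:
        sigma = -A[state][state]
        hold = rng.expovariate(sigma)          # holding time, mean 1/sigma
        time_in_state[state] += min(hold, t_end - t)
        t += hold
        # embedded Markov chain: next state j with probability q_ij = a_ij/sigma
        r = rng.random() * sigma
        acc, nxt = 0.0, state
        for j, rate in enumerate(A[state]):
            if j == state:
                continue
            acc += rate
            nxt = j
            if r <= acc:
                break
        state = nxt
    return [x / t_end for x in time_in_state]

fractions = simulate_ctmc(A, 10_000.0)
print(fractions)   # long-run fractions of time spent in each state
```

For this particular generator the balance equations happen to give the uniform distribution (1/3, 1/3, 1/3), which the simulated time fractions approach over a long run.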
We shall conclude this section by examining a little more closely
the exponential distribution defined in (1.15). That distribution plays a
central role in most probabilistic models that are analytically tractable. It
owes its preeminent position to the memoryless property. If the duration
ξ of a certain activity is distributed exponentially with parameter λ,
and if that activity is observed at time x after its beginning, then the
remaining duration of the activity is independent of x and is also distributed
exponentially with parameter λ:
P(ξ > x + y | ξ > x) = P(ξ > x + y)/P(ξ > x) = e^{−λ(x+y)}/e^{−λx} = e^{−λy} = P(ξ > y).   (1.18)
On the other hand, we have seen in the derivation of (1.15) that (excluding
the degenerate cases) the memoryless property implies the exponential
distribution. There are, therefore, no other distributions with that
property.
Let ξ_1 and ξ_2 be two independent random variables with distribution
functions

F_1(x) = 1 − e^{−λ_1 x};   F_2(x) = 1 − e^{−λ_2 x},

and density functions

f_1(x) = λ_1 e^{−λ_1 x};   f_2(x) = λ_2 e^{−λ_2 x}.

The density f(x) of ξ = min(ξ_1, ξ_2) satisfies

f(x)dx = P(ξ = x) = P(min(ξ_1, ξ_2) = x)
= P(ξ_1 = x)P(ξ_2 > x) + P(ξ_1 > x)P(ξ_2 = x)
= f_1(x)dx[1 − F_2(x)] + f_2(x)dx[1 − F_1(x)]
= λ_1 e^{−λ_1 x} e^{−λ_2 x} dx + λ_2 e^{−λ_2 x} e^{−λ_1 x} dx
= (λ_1 + λ_2) e^{−(λ_1+λ_2)x} dx.   (1.19)
The time until the first completion is thus distributed exponentially with
parameter (λ_1 + λ_2). The probability that activity 1 will complete first is
given by

q_1 = P(ξ_1 < ξ_2) = ∫_0^∞ f_1(x)[1 − F_2(x)] dx = λ_1/(λ_1 + λ_2).   (1.20)
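Relations (1.19) and (1.20) are easy to check numerically. A small sketch (the rates λ_1 = 2, λ_2 = 3 are arbitrary illustrative values):

```python
import random

rng = random.Random(42)
lam1, lam2 = 2.0, 3.0
n = 200_000

total_min = 0.0
count_1_first = 0
for _ in range(n):
    x1 = rng.expovariate(lam1)   # duration of activity 1
    x2 = rng.expovariate(lam2)   # duration of activity 2
    total_min += min(x1, x2)
    if x1 < x2:
        count_1_first += 1

mean_first = total_min / n
q1 = count_1_first / n
print(mean_first, q1)   # close to 1/(lam1+lam2) = 0.2 and lam1/(lam1+lam2) = 0.4
```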
Restriction (ii), together with (1.9) and (1.17), imply that the
instantaneous transition matrix of the Poisson process has the form

A =  | −λ   λ   0   0  ... |
     |  0  −λ   λ   0  ... |
     |  0   0  −λ   λ  ... |
     |  .   .   .   .      |   = λ(U − I).   (1.22)

Here, I is the (infinite) identity matrix and U is the matrix which has
ones on the first upper diagonal and zeros everywhere else:

U =  | 0  1  0  0  ... |
     | 0  0  1  0  ... |
     | 0  0  0  1  ... |
     | .  .  .  .      |.

P(t) = e^{λ(U−I)t} = e^{−λt} e^{λUt} = e^{−λt} Σ_{n=0}^∞ ((λt)^n/n!) U^n.   (1.23)
Now, the matrix U^n has ones on the n-th upper diagonal and zeros
everywhere else. Therefore, the first row of the matrix defined by the series
on the right-hand side of (1.23) is (1, λt, (λt)²/2!, ...). The probability of k
arrivals in the interval (0, t] is equal to

p_k(t) = e^{−λt}(λt)^k/k!,   k = 0, 1, ... .   (1.24)
Because of the memoryless property, the probability of k arrivals in any
interval of length t is also given by (1.24). In a small interval of length h,
there is one arrival with probability p_1(h) = λh + o(h). The probability that
there are two or more arrivals in an interval of length h is P_{>1}(h) = o(h).
These last properties (plus the memoryless one) are sometimes given as
defining axioms for the Poisson process.
Since the Poisson process is a Markov process, the holding times, i.e.
the intervals between consecutive arrivals, are independent and distributed
exponentially with parameter λ. This property too, can be taken as a
definition of the Poisson process; it implies the Markov property and
everything else. The expected length of the interarrival intervals is 1/λ.
Therefore, the average number of arrivals per unit time is λ. For that reason,
the parameter λ is called the rate of the Poisson process. The average
Fig. 1.2.
where we have used (1.24). We see that the processes resulting from the
decomposition are both Poisson (with rates λ_1 and λ_2, respectively). Not
only that: these processes are independent of each other. This result, too,
generalises to an arbitrary number of components.
The superposition and decomposition of Poisson processes are
illustrated in Fig. 1.2.
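The decomposition property can be illustrated by "thinning" a simulated Poisson stream. The sketch below (rate and routing probability are illustrative values) checks that the counts of one substream per unit interval have the Poisson signature of equal mean and variance:

```python
import random

rng = random.Random(7)
lam = 5.0      # rate of the combined Poisson stream
p1 = 0.4       # each arrival independently routed to substream 1
T = 20_000     # number of unit-length intervals observed

counts1 = []
for _ in range(T):
    n1 = 0
    t = rng.expovariate(lam)            # exponential interarrival times
    while t < 1.0:
        if rng.random() < p1:           # decomposition: mark as substream 1
            n1 += 1
        t += rng.expovariate(lam)
    counts1.append(n1)

mean1 = sum(counts1) / T
var1 = sum((c - mean1) ** 2 for c in counts1) / T
print(mean1, var1)   # both close to lam * p1 = 2.0, as for a Poisson stream
```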
In analysing system performance, we frequently employ the technique
of tagging an incoming customer and following his progress through the
system. It is therefore important to know something about the system state
distribution that customers see when they arrive. In this respect, Poisson
arrivals have a very useful, and apparently unique property: they behave
like random observers. More precisely, let {S(t), t ≥ 0} be a stochastic
process representing the state of a queueing system. That system is fed with
customers by one or more arrival streams. Consider an arbitrary moment
t_0; let S(t_0⁻) be the system state just prior to t_0. Then, if the arrival streams
are Poisson, the random variable S(t_0⁻) is independent of whether there is
an arrival at t_0 or not (Strauch [8]). This is because S(t_0⁻) is influenced
only by the past history of the arrival processes, and that is independent of
whether there is an arrival at t_0 (looking backwards in time, the interarrival
intervals are still exponentially distributed and hence memoryless).
Thus, an arrival from a Poisson stream sees the same system state
distribution as someone who just happens to look at the system, having
otherwise nothing to do with it (a random observer).
To appreciate this remarkable property better, let us take a contrasting
example where the arrival stream is decidedly not Poisson. Imagine a
conveyor belt bringing machine parts to an operator at intervals ranging
between 20 and 30 minutes; the operation performed on each part lasts
between 10 and 18 minutes. Two hours after starting the belt, a random
observer (the shop floor supervisor?) may well see the operator diligently
is the n-stage Erlang density function. The mean and variance of T_n are,
respectively, n/λ and n/λ².
the limits

π_j = lim_{t→∞} p_{i,j}(t),   j = 0, 1, ...,

exist and are independent of the initial state, and (ii), these limits constitute
a probability distribution:

Σ_{j=0}^∞ π_j = 1.   (1.28)

πP(y) = π,   y ≥ 0.   (1.29)
In other words, if at any moment the process state has the steady-state
distribution, then it has the steady-state distribution at time y later, no
matter how large or small y is. The state distribution becomes invariant
with respect to time.
There are two important questions which arise in this connection.
First, under what conditions does a steady-state regime exist for a Markov
process? Second, how does one determine the steady-state distribution of
the process? We shall leave the question of existence until the end of this
section and concentrate now on the determination of the vector π, assuming
that it exists.
Differentiating (1.29) at y = 0, and remembering that P′(0) = A, we
obtain a system of linear equations for π:

πA = 0.   (1.30)
This is known as the system of balance equations, for reasons which will
become apparent shortly. Being homogeneous, that system determines the
vector π up to a multiplicative constant; the normalising equation (1.28)
then completes the determination.
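For a finite state space (or a finite truncation of an infinite one), equations (1.30) and (1.28) can be solved directly. A sketch using plain Gaussian elimination, applied to a hypothetical four-state birth-and-death generator chosen for illustration:

```python
def stationary(A):
    """Solve pi A = 0 with sum(pi) = 1: replace the last balance equation
    by the normalising condition and eliminate."""
    n = len(A)
    M = [[A[i][j] for i in range(n)] for j in range(n - 1)]  # columns of A
    M.append([1.0] * n)                                      # normalisation
    b = [0.0] * (n - 1) + [1.0]
    for col in range(n):                      # elimination with pivoting
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= f * M[col][c]
            b[r] -= f * b[col]
    pi = [0.0] * n
    for r in range(n - 1, -1, -1):            # back substitution
        s = b[r] - sum(M[r][c] * pi[c] for c in range(r + 1, n))
        pi[r] = s / M[r][r]
    return pi

# Birth-and-death generator truncated at 4 states, lambda = 1, mu = 2:
lam, mu = 1.0, 2.0
A = [[-lam, lam, 0, 0],
     [mu, -(lam + mu), lam, 0],
     [0, mu, -(lam + mu), lam],
     [0, 0, mu, -mu]]
pi = stationary(A)
print(pi)   # proportional to (lam/mu)^i: (8, 4, 2, 1)/15
```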
The balance equations have a strong intuitive appeal. To see this, let
us write the i-th equation in the form
−a_{i,i} π_i = Σ_{j≠i} a_{j,i} π_j.   (1.31)
Fig. 1.3.
λ_0 π_0 = μ_1 π_1,
(λ_1 + μ_1)π_1 = λ_0 π_0 + μ_2 π_2,
...                                    (1.32)
(λ_i + μ_i)π_i = λ_{i−1} π_{i−1} + μ_{i+1} π_{i+1},
...

(there are two arcs going out and two arcs coming into each cut, except
for node 0). Alternatively, cutting off the group of states (0, 1, ..., i), for
i = 0, 1, ..., we obtain an equivalent system of balance equations:

λ_0 π_0 = μ_1 π_1,
λ_1 π_1 = μ_2 π_2,
...                                    (1.33)
λ_i π_i = μ_{i+1} π_{i+1},
...
(one arc going out and one arc coming into each cut). The general solution
of (1.33) is easily obtained by successive elimination:

π_i = [(λ_0 λ_1 ... λ_{i−1})/(μ_1 μ_2 ... μ_i)] π_0,   i = 1, 2, ... .   (1.34)

This leaves one unknown constant, π_0, which is determined from the
normalising condition (1.28):

π_0 = [1 + λ_0/μ_1 + (λ_0 λ_1)/(μ_1 μ_2) + ...]^{−1}.   (1.35)
Further, let R_{i,j} be the total average amount of time spent in state j, given
that S(0) = i:

R_{i,j} = lim_{t→∞} R_{i,j}(t) = ∫_0^∞ p_{i,j}(u) du.   (1.37)
(The first inequality is obvious; the second follows from the Chapman-
Kolmogorov equations (1.3); the irreducibility of the process ensures that
p_{r,i}(v) > 0 and p_{j,k}(w) > 0.) Hence, either all states are transient, or all
states are recurrent.
The case of all transient states can be disposed of quickly: if R_{i,j} is
finite for all i, j then, according to (1.37),
This important result is the point of departure for most analytic and
numerical studies of systems modelled by Markov processes.
Returning to the Birth and Death process considered earlier, we can
assert now that the necessary and sufficient condition for existence of
steady-state is the convergence of the series appearing on the right-hand
side of (1.35); when it exists, the steady-state distribution is given by (1.34)
and (1.35). That assertion follows from the result above and from the fact
that the Birth and Death process is irreducible; the probability pi,j (t) of
moving from state i to state j in time t is obviously non-zero, for all i, j
and all t > 0.
Fig. 1.4.
π_i = ρ^i π_0,   i = 0, 1, ... .   (1.41)
The necessary and sufficient condition for the existence of a solution whose
elements sum up to 1, and hence for the existence of steady-state, is
ρ < 1. When the system is in equilibrium, the number of customers in it is
distributed geometrically:
P(N = i) = π_i = ρ^i(1 − ρ),   i = 0, 1, ... .   (1.42)
The expectation, E[N], and the variance, Var[N], of that number are
given by

E[N] = Σ_{i=1}^∞ i π_i = ρ/(1 − ρ),   (1.43)

and

Var[N] = E[N²] − E²[N] = Σ_{i=1}^∞ i² π_i − E²[N] = ρ/(1 − ρ)².   (1.44)
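Summing the geometric distribution numerically confirms (1.43) and (1.44); a tiny sketch with the illustrative value ρ = 0.7:

```python
rho = 0.7
# P(N = i) = rho^i (1 - rho), truncated far enough for the tail to vanish
pmf = [(rho ** i) * (1 - rho) for i in range(2000)]
EN = sum(i * p for i, p in enumerate(pmf))
EN2 = sum(i * i * p for i, p in enumerate(pmf))
print(EN, EN2 - EN ** 2)   # rho/(1-rho) = 2.333..., rho/(1-rho)^2 = 7.777...
```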
condition is, in both cases, ρ < 2. We shall use the expected number
of customers in the system, E[N], as a measure of performance. In the
M/M/1 system we have, from (1.43),

E[N]_{M/M/1} = ρ/(2 − ρ).

For the M/M/2 system, we find first
A similar inequality holds for any number of servers. The reason for the
worse performance of the M/M/c system is that its full service capacity
is not always utilised: when there are fewer than c customers in the system,
some servers are idle. The M/M/c system is, in its turn, more efficient than
c independent servers with separate queues (i.e. c M/M/1 systems), where
each new arrival joins any of the queues with equal probability. We leave
that comparison as an exercise to the reader. The lesson that emerges from
all this is that, other things being equal, a pooling of resources leads to
improved performance.
A limiting case of the M/M/c system is the system with infinitely many
servers, M/M/∞. Clearly, there can be no queue of waiting customers here.
The solution of the balance equations is as in (1.48), top case, for all i:

π_i = (ρ^i/i!)π_0,   i = 0, 1, ... .   (1.50)

π_i = ρ^i π_0,   i = 0, 1, ..., K,   (1.51)
Fig. 1.5.
Our other example is the M/M/c/c system, where only customers who
find idle servers are admitted. The classic application for such a model
is a telephone exchange with c lines. The steady-state distribution of the
number of busy servers is

π_i = (ρ^i/i!)π_0,   i = 0, 1, ..., c,   (1.52)

where

π_0 = [Σ_{i=0}^c ρ^i/i!]^{−1}.
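Formula (1.52) evaluated at i = c gives the probability that all lines are busy, i.e. the fraction of calls lost. A minimal sketch, with illustrative load and line count:

```python
import math

def erlang_b(rho, c):
    # pi_c from (1.52): steady-state probability that all c servers are busy
    weights = [rho ** i / math.factorial(i) for i in range(c + 1)]
    return weights[c] / sum(weights)

print(erlang_b(2.0, 4))   # fraction of arrivals lost with 4 lines, offered load 2
```

Adding one more line reduces the loss, as expected from the pooling argument above.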
T = E[t_{j+1} − t_j].
Let T_i be the total expected amount of time that the process spends in
state i during a regeneration period (i = 0, 1, ...). By the same argument
that led to equations (1.38) and (1.39) it can be shown that the long-run
fraction of time that the process spends in state i, and hence the steady-
state probability of state i, is given by

π_i = T_i/T,   i = 0, 1, ... .   (1.54)
average time the process remains in state 0 on each visit is 1/λ, and the
average time it remains in state i (i > 0) on each visit is 1/(λ + μ), we have

T_0 = M_0/λ,
T_i = M_i/(λ + μ),   i = 1, 2, ... .   (1.55)
λT_0 = μT_1,
(λ + μ)T_i = λT_{i−1} + μT_{i+1},   i = 1, 2, ... .   (1.57)

These last equations, together with T_0 = 1/λ (there is only one visit to state
0 during a regeneration interval) can be solved by successive elimination:

T_i = ρ^i/λ,   i = 0, 1, ... .   (1.58)
Substituting (1.58) and (1.59) into (1.54) we finally obtain the desired
distribution

π_i = ρ^i(1 − ρ),   i = 0, 1, ... .
Note that the above approach can be applied to the general Birth and
Death process as well, with obvious minor modifications.
The first rigorous proof of that relation was given by Little [6]; hence, it
is known as Little's result, or Little's theorem. However, the validity
of the result had been realised earlier and there were also proofs for some
special cases.
Consider an arbitrary queueing system in equilibrium, and let N, W
and λ be the average number of customers in the system, the average time
customers spend in the system and the average number of arrivals per unit
time, respectively. Little's theorem states that

N = λW,   (1.60)
Fix an arbitrary moment t in the steady-state. The customers who are in the
system at that moment are those who arrived before t and will depart after
t. Since the arrival process is stationary with rate λ and customers arrive
one at a time, the probability that there was an arrival at time t − u is λdu.
Such an arrival is still in the system at time t with probability 1 − F_w(u).
Therefore, point t − u contributes an average of λ(1 − F_w(u))du customers
to the ones present at time t. Integrating over all values of u yields

N = λ ∫_0^∞ (1 − F_w(u)) du = λW,
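The argument above also has an exact operational counterpart: on any finite FIFO sample path observed up to the last departure, the time-average number in system equals the observed arrival rate times the observed mean sojourn time, identically. A sketch generating an M/M/1 sample path via the recursion "start of service = max(arrival time, time the server frees up)" (rates are illustrative):

```python
import random

rng = random.Random(3)
lam, mu = 1.0, 2.0
n = 50_000

arrivals, departures = [], []
t = 0.0
free_at = 0.0
for _ in range(n):
    t += rng.expovariate(lam)              # Poisson arrivals
    start = max(t, free_at)                # FIFO: wait for the server
    free_at = start + rng.expovariate(mu)  # exponential service
    arrivals.append(t)
    departures.append(free_at)

T = departures[-1]                          # observation horizon
sojourn_total = sum(d - a for a, d in zip(arrivals, departures))
W_bar = sojourn_total / n                   # observed mean time in system
lam_hat = n / T                             # observed arrival rate
N_bar = sojourn_total / T                   # area under N(t), divided by T
print(N_bar, lam_hat * W_bar)               # identical: Little's law
```

The identity is exact because the integral of N(t) over the observation period is just the sum of the individual sojourn times; with these rates (ρ = 0.5) both sides are also close to the steady-state value E[N] = 1.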
W = W_0 + (N − ρ)(1/μ) + (1/μ).
Fig. 1.6.
W_0 = ρM_2/(2m) = λM_2/2.   (1.66)
q_j = π_j + π_{j+1} + ...,   j = 1, 2, ... ,
q_j = q_{j−1}λ/μ = ρq_{j−1},   j = 1, 2, ... ,
q_j = ρ^j,  j = 1, 2, ...,  or  π_j = q_j − q_{j+1} = ρ^j(1 − ρ),  j = 0, 1, ... ;
T(n), the total amount of time the sample path remains at level n during
[a, b];
A(n), the number of jumps from n to n + 1 during [a, b] (i.e. the number
of arrivals who find n customers in the system);
D(n), the number of jumps from n to n − 1 during [a, b] (i.e. the number
of departures who leave n − 1 customers behind). Clearly, A(n) > 0 for
n = m, m + 1, ..., M − 1 and D(n) > 0 for n = m + 1, m + 2, ..., M.
Note the similarity of form between (1.71) and the balance equations
(1.33) of the Birth and Death process. It should be realised, however,
that the content is very different. The relations (1.33) were between the
parameters λ_i, μ_i of a certain stochastic process, and the probabilities π_i,
taken over the set of all sample paths of that process. Those relations could
be used to determine the probabilities. Here, on the other hand, we have
identities valid for any sample path of any queueing process. The equations
(1.71) can also be solved for p(n):
p(n) = p(m) Π_{k=m}^{n−1} α(k)/δ(k + 1),   n = m + 1, ..., M,   (1.72)

together with the normalisation

Σ_{n=m}^M p(n) = 1.
The fractions p(n) can thus be determined in terms of the fractions α(n)
and δ(n). The latter are not, however, parameters of the process; they are
characteristics of the same sample path for which the former are sought.
Knowing the values of α(n) and δ(n) for one sample path does not help to
find the values of p(n) for another sample path.
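The point is easy to demonstrate on a concrete sample path: record T(n), A(n) and D(n) along any path, and the identities hold exactly, whatever mechanism generated the path. A sketch using deliberately non-exponential (uniform) holding times; all parameters are illustrative:

```python
import random

rng = random.Random(11)

# An arbitrary sample path: +/-1 jumps with uniform holding times,
# starting at level 0 and forced back to level 0 at the end.
events = []            # (time spent at current level, jump of +1 or -1)
level = 0
for _ in range(10_000):
    jump = +1 if (level == 0 or rng.random() < 0.45) else -1
    events.append((rng.uniform(0.1, 3.0), jump))
    level += jump
while level > 0:
    events.append((rng.uniform(0.1, 3.0), -1))
    level -= 1

T_at, A, D = {}, {}, {}
level = 0
for dur, jump in events:
    T_at[level] = T_at.get(level, 0.0) + dur
    if jump == +1:
        A[level] = A.get(level, 0) + 1    # jump n -> n+1: arrival finding n
    else:
        D[level] = D.get(level, 0) + 1    # jump n -> n-1: departure from n
    level += jump

T = sum(T_at.values())
for n in range(4):
    lhs = (A[n] / T_at[n]) * (T_at[n] / T)                 # alpha(n) p(n)
    rhs = (D[n + 1] / T_at[n + 1]) * (T_at[n + 1] / T)     # delta(n+1) p(n+1)
    print(n, lhs, rhs)   # equal: on a path returning to 0, A(n) = D(n+1)
```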
Suppose now that the sample path N (t) is observed over longer and
longer periods of time, and that during those periods it attains wider and
exist and are non-zero for all n = 0, 1, ... (except for δ_0). Continuing the
analogy with the Birth and Death process, one would naturally expect the
fractions p_n to be the unique solution of the infinite system of equations

Σ_{n=0}^∞ p_n = 1,
α_n p_n = δ_{n+1} p_{n+1},   n = 0, 1, ... .   (1.74)
This is not necessarily the case, as can be seen from the following example.
Consider the sample path illustrated in Fig. 1.7. N(t) goes through
alternating busy and idle periods of unit length. During the i-th busy period
(i = 2, 3, ...), it spends time ε/2^{n−1} at level n, n = 1, 2, ..., i − 1, and the
rest of the time at level i (0 < ε < 1/2). It is easily seen that the limits (1.73)
for this sample path are

α_0 = 1,   α_n = δ_n = 2^{n−1}/ε,   n = 1, 2, ...,
p_0 = 1/2,   p_n = ε/2^n,   n = 1, 2, ... .
Fig. 1.7.
If we were dealing with a Birth and Death process with the above
parameters, then a sample path should spend, in the long run, a fraction
1/(1 + 2ε) of its time in state 0, with probability 1. A pathological sample
path like the one in Fig. 1.7 may occur, but the probability of such an event
is zero.
T(1 + ρ_1 + ρ_1² + ...) = T/(1 − ρ_1)
ρ_H = ρ_1 + ρ_2 + ... + ρ_{j−1}.
where we have used (1.75). The average number of class j customers in the
system is obtained, of course, from Little's theorem: N_j = λ_j W_j.
It is intuitively clear that, with priority scheduling, higher-priority
customers receive better treatment at the expense of lower-priority ones.
The above expressions make that intuition quantitative. They also allow
one to address various optimisation problems. For instance, given the arrival
and service characteristics, and a cost function of the form

C = Σ_{i∈R} c_i W_i,
completion, the customer with the shortest service time of those waiting
is selected and served to completion. Customers arrive in a Poisson stream
with rate λ; the probability distribution function of their service times is
F(x).
This model can be reduced to the one with head-of-the-line priorities
by introducing an infinity of artificial customer classes, using the service
time x as a class index (for a rigorous derivation, service times should be
first assumed discrete and then a limit taken). Customers of class x arrive
at rate λ_x = λdF(x); the first and second moments of their service times
are, of course, x and x², respectively. The traffic intensity for class x is
ρ_x = λx dF(x). Substituting these parameters into (1.78) and replacing the
sums by integrals we obtain the conditional expected response time W_x of
a customer whose service time is x (Phipps [7]):

W_x = x + [λ ∫_0^∞ u² dF(u)/2] / {[1 − λ ∫_0^{x−} u dF(u)][1 − λ ∫_0^{x+} u dF(u)]}
    = x + (λM_2/2) / {[1 − λ ∫_0^{x−} u dF(u)][1 − λ ∫_0^{x+} u dF(u)]},   (1.79)

where M_2 is the second moment of F(x), and x− and x+ denote limits from
the left and from the right (if F(u) is continuous at point x, the two are
identical). The unconditional expected response time W is given by

W = ∫_0^∞ W_x dF(x).   (1.80)
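For a concrete service time distribution, (1.79) can be evaluated directly. A sketch for exponential service with mean m, where λ∫_0^x u dF(u) has the closed form λ(m − (x + m)e^{−x/m}) and F is continuous, so the left and right limits coincide (all parameters illustrative):

```python
import math

lam, m = 0.5, 1.0          # arrival rate and mean service time (rho = 0.5)
M2 = 2 * m * m             # second moment of the exponential distribution

def rho_below(x):
    # lam * integral_0^x u dF(u), closed form for F(u) = 1 - exp(-u/m)
    return lam * (m - (x + m) * math.exp(-x / m))

def W_x(x):
    r = rho_below(x)       # F is continuous, so the x- and x+ limits coincide
    return x + (lam * M2 / 2) / ((1 - r) ** 2)

print(W_x(0.1), W_x(1.0), W_x(5.0))   # short jobs wait far less than long ones
```

The monotone growth of W_x in x is exactly the favouritism towards short jobs that the shortest-service-time discipline is designed to produce.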
Note the similarity between (1.81) and (1.78). The numerator in the
second term of (1.81) also represents expected residual service, this time
averaged over classes 1, 2, . . . , j only.
References
1. Buzen, J. P. (1976). Fundamental operational laws of computer system
performance. Acta Informatica, 7, 167-182.
2. Buzen, J. P. (1977). Operational Analysis: An Alternative to Stochastic
Modelling. Research Report, Harvard University.
3. Cinlar, E. (1975). Introduction to Stochastic Processes. Prentice-Hall,
Englewood Cliffs, New Jersey.
4. Cobham, A. (1954). Priority assignment in waiting-line problems. Operations
Research, 2, 70-76.
5. Foster, F. G. (1972). Stochastic Processes. Proc. IFORS Conference, Dublin.
6. Little, J. D. C. (1961). A proof for the queueing formula L = λW. Operations
Research, 9, 383-387.
7. Phipps, T. E. (1961). Machine repair as a priority waiting-line problem.
Operations Research, 9, 732-742.
8. Strauch, R. E. (1970). When a queue looks the same to an arriving customer
as to an observer. Man. Sci., 17, 140-141.
January 11, 2010 12:17 spi-b749 9in x 6in b749-ch02
Chapter 2
The Queue with Server of Walking Type
and Its Applications to Computer System
Modelling
2.1. Introduction
Several important classes of computer subsystems can be modelled in
a unified manner using a single server queue, whose service becomes
unavailable in a manner which depends on the queue length after each
service epoch. Such models are particularly useful in the study of the
performance of certain secondary memory devices (paging disks or drums,
for instance) and in evaluating the behaviour of multiplexed data
communication channels.
In this chapter we shall first examine the properties of the basic
theoretical model, and then develop various applications. This will provide
us with a more economical presentation of the results. The performance
measures of each application will thus be obtained as special instances of
the more general results which will be derived first.
Section 2.2 will be devoted to the presentation and analysis of the
queue with server of walking type which serves as the metamodel for
the computer system models. We first derive the stationary queue length
distribution related to a special Markov chain embedded in the general
queue length process. Then, using general results from Markov renewal
theory, we obtain the stationary probability distribution for the model at
arbitrary instants. We prove that the latter is identical to the stationary
distribution at instants of departure (and hence at the instants of arrival of
customers); this generalises a similar result (due to Khintchine) which has
been proved for the M/G/1 queue. We also show in this section how the
M/G/1 queue's analysis can be immediately obtained from the preceding
associated with the queue length at those instants. Under the present
assumptions it is easy to see that {Q_n}_{n≥0} is a Markov chain. Therefore
the p_k, if they exist, must satisfy

p_0 = p_0 â_0 + p_1 a_0,
p_k = p_0 â_k + Σ_{j=0}^{k} p_{k−j+1} a_j,   k ≥ 1,   (2.1)
or, in terms of the generating function G(x) = Σ_{k=0}^∞ p_k x^k,

G(x) = p_0 U(x) + V(x)[G(x) − p_0]/x,   (2.2)

where U(x) = Σ_k â_k x^k and V(x) = Σ_k a_k x^k. The quantities U(x), V(x)
are readily obtained. Notice that due to the Poisson arrivals of rate λ we have

â_k = ∫_0^∞ ((λy)^k/k!) e^{−λy} dS̃(y),

so that

U(x) = E[e^{λ(x−1)S̃}],   (2.3)

and similarly V(x) = E[e^{λ(x−1)S}]. But solving (2.2) gives

G(x) = p_0 [xU(x) − V(x)]/[x − V(x)],

so that

lim_{x→1} G(x) = p_0 [1 + λ(E[S̃] − E[S])]/(1 − λE[S]).

Therefore

p_0 = (1 − λE[S])/(1 + λ(E[S̃] − E[S])),   (2.4)

yielding

G(x) = [(1 − λE[S])/(1 + λ(E[S̃] − E[S]))] · [xE[e^{λ(x−1)S̃}] − E[e^{λ(x−1)S}]] / (x − E[e^{λ(x−1)S}]).   (2.5)

Notice that p_0 > 0 implies λE[S] < 1; this is the stability condition for a
queue with server of walking type.
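The expression for p_0 is easy to evaluate; when S̃ = S it reduces to the familiar M/G/1 value 1 − λE[S]. A sketch with illustrative parameter values:

```python
def p0(lam, ES, ESt):
    # lam = arrival rate, ES = E[S], ESt = E[S~]  (illustrative interface)
    assert lam * ES < 1, "stability requires lam * E[S] < 1"
    return (1 - lam * ES) / (1 + lam * (ESt - ES))

print(p0(0.5, 1.0, 1.0))   # S~ = S: reduces to the M/G/1 value 1 - rho = 0.5
print(p0(0.5, 1.0, 1.5))   # a longer "walking" block lowers p0 to 0.4
```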
Consider the stationary queue length distribution g_k, k = 0, 1, ...,
measured at instants just after a departure occurs, i.e. when the server
has just left the service time block of Fig. 2.1. The following relation is
obtained because a departure takes place given that Q > 0 in Fig. 2.1:

g_k = Σ_{j=0}^{k} p_{k−j+1} w_j/(1 − p_0),   k = 0, 1, ...,

where

W(x) = Σ_{k=0}^∞ w_k x^k = E[e^{λ(x−1)s}].
Let us now consider the single-server queue with Poisson arrivals of rate λ,
and independent service times S of probability distribution function S(x).
Notice that if the queue length is zero after a departure, the following
departure will correspond to the first customer which will arrive. Let F(x)
be the generating function, for |x| ≤ 1,

F(x) = Σ_{k=0}^∞ p_k x^k.
Notice that G(x) of (2.5) is identical to F(x) when S̃ = S.
The expressions (2.5) and (2.8) give the generating functions for
the stationary queue length probability distribution at instants of time
embedded in the queue length process for the queue with server of walking
type, and for the M/G/1 queue, respectively. What we really would like to
have for both systems is the stationary distribution defined as
which we shall derive using some general results from Markov renewal
theory.
P[Q_{n+1} = j, T_{n+1} ≤ t | Q_0 = i_0, Q_1 = i_1, ..., Q_n = i_n]
= P[Q_{n+1} = j, T_{n+1} ≤ t | Q_n = i_n].
For our purposes the Q_n will take integer values and the T_n will be real-
valued; furthermore, both will be non-negative and Q_0 corresponds to the
initial state at time zero. In our queueing models Q_n will be the queue
length at the instant Σ_{i=1}^n T_i. The quantity A(i, j, t) appearing below
is the probability that at some instant t between two instants of the
Markov renewal process the state is j, given that it was i just after the
most recent instant. By the term instant we refer to a time Σ_{i=1}^n T_i, n ≥ 1.
The result we seek will be obtained by applying the key renewal theorem
[5]. It states that

lim_{t→∞} P[Q_t = k] = Σ_j (v(j)/m) ∫_0^∞ A(j, k, t) dt,   (2.11)

where the integral ∫_0^∞ A(j, k, t) dt is the average time the process spends
in state k between two successive instants, given that it was in state j at
the most recent instant. Also, m is the average time between instants. Thus
the right-hand side of (2.11) is merely the average time spent in state k
between each successive pair of instants, divided by m.
Let us now apply this result to the two queueing systems which we are
examining here.
Stationary queue length process for the M/G/1 system: For this system, we see that the queue length process just after each departure (see Fig. 2.2) is Markov renewal. This is because the time between two successive departures is totally determined by the state just after a departure, as is the queue length just after the following departure. The stationary probability v(j) in (2.11) is therefore replaced by p_j of (2.7). Also we see from Fig. 2.2 that m of (2.12) is given by

m = E[S] + p_0/λ = 1/λ.

We will now show that L(x) = F(x); i.e. the stationary queue length distribution (2.11) is identical to the stationary distribution just after departure instants, for the M/G/1 queue.
Notice that

∫_0^∞ λe^{-λt} dt ∫_0^t λx e^{λxy}(1 - S(y)) dy = λx ∫_0^∞ e^{λt(x-1)}(1 - S(t)) dt.

Therefore

L(x) = p_0 [1 + λ(x - 1) ∫_0^∞ e^{λt(x-1)}(1 - S(t)) dt] + F(x) λ ∫_0^∞ e^{λt(x-1)}(1 - S(t)) dt.

But

λ ∫_0^∞ e^{λt(x-1)}(1 - S(t)) dt = -1/(x - 1) + E[e^{λ(x-1)S}]/(x - 1) = [V(x) - 1]/(x - 1).

Hence

L(x) = p_0 V(x) + F(x)[V(x) - 1]/(x - 1).   (2.17)
Differentiating (2.17) and letting x → 1 we obtain

lim_{x→1} dL(x)/dx = Σ_{k=0}^∞ k p_k,

whence, after some algebra, the Pollaczek–Khintchine formula

lim_{t→∞} E[Q_t] = Σ_{k=0}^∞ k p_k = λE[S][1 + λE[S](1 + C_S^2)/(2(1 - λE[S]))],   (2.20)

where C_S^2 is the squared coefficient of variation of the service time S.

In fact, Khintchine's result is that the stationary queue length distribution and the stationary distribution at instants of arrival are identical; but the latter is identical to the stationary queue length distribution at departure instants.
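Formula (2.20) is straightforward to evaluate numerically. The sketch below (Python; the function name and parameter values are ours, not the book's) checks two familiar special cases: exponential service, C_S^2 = 1, reduces to the M/M/1 mean ρ/(1 - ρ), and constant service, C_S^2 = 0, gives the M/D/1 mean.

```python
def pk_mean_queue(lam, es, cs2):
    """Mean stationary queue length from (2.20).

    lam: arrival rate; es: mean service time E[S];
    cs2: squared coefficient of variation of the service time.
    """
    rho = lam * es
    assert rho < 1.0, "stability requires lam * E[S] < 1"
    return rho * (1.0 + rho * (1.0 + cs2) / (2.0 * (1.0 - rho)))

# Exponential service (cs2 = 1) reduces to the M/M/1 mean rho/(1 - rho)
assert abs(pk_mean_queue(0.7, 1.0, 1.0) - 0.7 / 0.3) < 1e-12
# Constant service (cs2 = 0) gives the M/D/1 mean rho + rho^2/(2(1 - rho))
assert abs(pk_mean_queue(0.7, 1.0, 0.0) - (0.7 + 0.49 / 0.6)) < 1e-12
```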
Call q_k the left-hand side of this expression and define the generating function

M(x) = Σ_{k=0}^∞ q_k x^k,  |x| ≤ 1.   (2.24)

Consider this expression separately for each case of (2.22). Take first k - j = 0; this contributes the following term to M(x):

(p_0/m) ∫_0^∞ (1 - Ŝ(t)) e^{-λt} Σ_{k=0}^∞ ((λtx)^k/k!) dt = (p_0/m) ∫_0^∞ (1 - Ŝ(t)) e^{λt(x-1)} dt
  = p_0 [U(x) - 1]/(mλ(x - 1)).
The two cases of (2.22) covered by k ≥ j - 1, j > 0, contribute the expression

(1/m) Σ_{j=1}^∞ p_j x^j Σ_{k=j}^∞ ∫_0^∞ e^{-λt} (1 - s(t)) ((λtx)^{k-j}/(k - j)!) dt
  + (1/(mx)) Σ_{j=1}^∞ p_j x^j Σ_{k=j-1}^∞ ∫_0^∞ e^{-λt} [(1 - S(t)) - (1 - s(t))] ((λtx)^{k-j+1}/(k - j + 1)!) dt
  = [(G(x) - p_0)/(mλ(x - 1))] [W(x) - 1 + (V(x) - W(x))/x].

Therefore

M(x) = [p_0/(mλ(x - 1))] {U(x) - 1 + [G(x)/p_0 - 1][W(x) - 1 + (V(x) - W(x))/x]}.
Let S̃ be the random variable with density

dP[S̃ ≤ t]/dt = P[Ŝ ≥ t]/E[Ŝ] = [1 - Ŝ(t)]/E[Ŝ].

For a proof the reader may see Cox [7]. Intuitively, S̃ is the amount of time that a person arriving at a bus-stop will have to wait if buses pass by at epochs 0, S_1, S_1 + S_2, ..., and all the S_i are independent and identically distributed with common probability distribution function Ŝ(t). This is also the residual lifetime of Chapter 1, section 1.6.
Now notice that

Σ_{k=0}^∞ x^k ∫_0^∞ e^{-λt} ((λt)^k/k!) [1 - Ŝ(t)]/E[Ŝ] dt = E[e^{λ(x-1)S̃}] = (U(x) - 1)/(λ(x - 1)E[Ŝ]).
Therefore, let Q̂ denote the number in queue in stationary state for the queue with server of walking type, and let Q be the corresponding quantity for the M/G/1 queue; (2.28) then leads immediately to the following important identity:

Q̂ = a(W) + a(s) + a(S̃),   (2.29)

where W is the stationary waiting time in the M/G/1 queue and a(z) is the random variable representing the number of arrivals (from the Poisson arrival stream of rate λ) during an interval distributed as the random variable z. Since (2.28) is a relationship between probability distributions, (2.29) is an identity in the sense of the probability distributions. We can now compute directly E[Q̂] using the Pollaczek–Khintchine formula (2.20) for E[Q]:

E[Q̂] = λ^2 (E[S])^2 (1 + C_S^2)/(2(1 - λE[S])) + λE[s] + λ ∫_0^∞ t(1 - Ŝ(t)) dt / E[Ŝ].   (2.30)
Another deeper and more general result is concealed in (2.29). We shall state this fact without proof; the result is due to Gelenbe–Iasnogorodski [11]. Let Ŵ be the limit as n → ∞ of the waiting time Ŵ_n of the n-th customer arriving at a queue with server of walking type and with general independent interarrival times, and let W be the limit as n → ∞ of W_n, the waiting time of the n-th customer arriving at the corresponding GI/G/1 queue. The service time of this GI/G/1 queue is S = s + T (as in the case of the M/G/1 queue corresponding to the queue with Poisson arrivals and server of walking type). The result obtained in [11] is that

Ŵ = W + S̃   (2.31)

and that W and S̃ are independent, the equality being in the sense of the probability distributions of the random variables on the left- and right-hand sides of (2.31).
Let us see how (2.31) implies the result given in (2.29) when arrivals are Poisson. In this case, the numbers of arrivals to the system during disjoint intervals of time are independent. As in (2.29), let a(z) denote the number of arrivals in time z for the Poisson arrival process. Then (2.31) implies that

a(Ŵ) = a(W) + a(S̃),

the two terms on the right being independent. But a(Ŵ) + a(s) and a(W) + a(S) are the numbers of customers remaining in queue just after the departure of a customer in stationary state for the queue with server of walking type and for the M/G/1 queue, respectively. We have shown that, for both systems, the stationary queue length distribution is identical to the stationary queue length distribution just after departure instants; therefore

Q̂ = a(Ŵ) + a(s),  Q = a(W) + a(S),

and (2.29) follows.
Fig. 2.3. (a) The physical model. (b) The mathematical model.
type of Fig. 2.1. The instant at which the test "is Q > 0?" is performed corresponds to the time when the beginning of the k-th sector passes under the read/write heads. The service time s is Y/N, the time necessary for transferring a page. The rest period T after a service is Y(N - 1)/N, the time necessary for the beginning of the k-th sector to return under the read/write heads. Finally, the idle period Ŝ (if the queue is empty after the rest in Fig. 2.1) is simply the time Y for one complete rotation of the PD. Therefore, the model of a PD sector queue will be a special case of the queue with server of walking type with

s = Y/N,  T = Y(N - 1)/N,  Ŝ = Y.

The PD without sector queueing, on the other hand, behaves as an M/G/1 queue with service time S, where E[S] = Y(N + 1)/2N, since the total time for serving a page transfer will be uniformly distributed over the set of values Y k/N, k = 1, 2, ..., N; C^2 is the squared coefficient of variation of this service time, so that
(Y(N + 1)/(2N))^2 (1 + C^2) = (1/N) Σ_{k=1}^N (Y k/N)^2 = (Y^2/(6N^2))(N + 1)(2N + 1)

and

1 + C^2 = (2/3)(2N + 1)/(N + 1).
The saturation arrival rates with and without sector queueing, λ̂ and λ, are therefore in the ratio

λ̂/λ = (N + 1)/2,

so that the PD with sector queueing can support (N + 1)/2 times as much page traffic as the PD without sector queueing, provided the page traffic is uniformly distributed among all of the sectors.

These results are illustrated on Fig. 2.4, where we show the average queue lengths n̂ and n for a PD with eight sectors.
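The moments used above are easy to verify directly. The short computation below (Python; the values are illustrative) lists the N equally likely transfer times Y k/N and checks both E[S] = Y(N + 1)/2N and 1 + C^2 = (2/3)(2N + 1)/(N + 1) for N = 8.

```python
N, Y = 8, 1.0
times = [Y * k / N for k in range(1, N + 1)]   # equally likely transfer times
mean = sum(times) / N
second = sum(t * t for t in times) / N         # second moment E[S^2]

assert abs(mean - Y * (N + 1) / (2 * N)) < 1e-12
# 1 + C^2 = E[S^2] / E[S]^2
assert abs(second / mean**2 - (2.0 / 3.0) * (2 * N + 1) / (N + 1)) < 1e-12
```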
In fact, the sector queueing policy may be viewed as a shortest access
(or service) time rst scheduling policy; such policies tend to optimise the
performance of service systems, as we shall see in Chapter 6.
Fig. 2.4. Average queue length for paging drum with and without sector queueing
(N = 8).
1 ≤ ℓ ≤ L,  r_m ≤ r(ℓ) ≤ r_M.

The clock time, or time necessary for moving one bit in the shift register, may be varied at will by appropriate electronic circuitry between the two limits, as indicated earlier. Therefore, if the need arises, a variable clock rate can be implemented in this system.

To simplify the discussion, and with no loss of generality, we shall assume that the k-th sector which we are examining begins at ℓ = 1 and ends at ℓ = L/N. Notice that these are cell positions (each cell containing one bit) rather than units of rotation time as was the case with the paging
drum. If we use the queue with server of walking type to model the k-th queue we must take

s = ∫_1^{L/N} r(ℓ) dℓ,  T = ∫_{(L/N)+1}^{L} r(ℓ) dℓ,  Ŝ = ∫_1^{L} r(ℓ) dℓ

for the corresponding service time and idle times. The integrals in the above expressions should be, in the strict sense, summations. The loss in accuracy in treating ℓ as a continuous variable will be insignificant, however, since L/N can be expected to be of the order of magnitude of 100 bits.
In view of (2.31), and the corresponding results (2.29) and (2.30) in the case of Poisson arrivals of transfer requests to each sector, we see that system performance will be optimised by setting r(ℓ) to its minimum value r_m, leading to a minimisation of queue lengths and transfer times. If the arrivals of transfer requests to the k-th sector form a Poisson stream of rate λ_k, we can use (2.30) to obtain its average queue length n̂_k:

n̂_k = (λ_k r_m L)^2/(2(1 - λ_k r_m L)) + λ_k r_m L/N + λ_k r_m L/2.
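Since all three cycle components are deterministic here, (2.30) specialises neatly. The sketch below (Python; the helper function and all numerical values are our own) confirms that the three terms of (2.30) reproduce the expression for the average queue length above.

```python
def walking_server_mean(lam, es_total, cs2, es_first, resid_idle):
    # the three terms of (2.30): P-K term, lam*E[s], lam*E[residual idle]
    rho = lam * es_total
    assert rho < 1.0
    return (lam * es_total)**2 * (1 + cs2) / (2 * (1 - rho)) \
        + lam * es_first + lam * resid_idle

L_bits, N, rm, lam_k = 128, 8, 1e-3, 5.0
R = rm * L_bits                      # one full rotation at maximum speed
# deterministic cycle: S = R (cs2 = 0), s = R/N, residual of idle R is R/2
nk = walking_server_mean(lam_k, R, 0.0, R / N, R / 2.0)
closed = (lam_k * R)**2 / (2 * (1 - lam_k * R)) \
    + lam_k * R / N + lam_k * R / 2.0
assert abs(nk - closed) < 1e-12
```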
In the previous analysis we assumed that the charge-coupled device
speed could be varied only as a function of angular position but not of
queue length. The obvious conclusion was that its rotation speed should
be maintained at its highest possible level at all times. The performance
of this device can be improved, however, if its speed can be varied as a
function of angular position, and also of queue length. This possibility has
been analysed in [8].
Let us assume for the time being that transfers to and from the device
occur in blocks of L bits so that there is only one queue of transfer requests
(N = 1). Consider now an idle period for the device, that is one in which the
queue is empty. It is clear that during such periods the initial address ℓ = 1
should dwell as long as possible in the vicinity of the read/write head so
that as soon as a transfer request occurs it may advance at maximum speed
to the head in order to minimise the latency delay preceding the beginning
of the transfer. Furthermore, as soon as the initial address passes under
the read/write head, the information cells of the device should be moved
as quickly as possible to the vicinity of the read/write head once again if
no arrivals have occurred.
In order to examine this behaviour more closely, assume that an idle period begins at an instant which we arbitrarily fix at t = 0, just after the cell ℓ = 1 passes under the read/write head. Let D(t) be the number of
cells separating the address at time t from the initial address, this number
being counted in the direction of the motion of the cells. Thus D(0+ ) = L.
We shall assume that, as long as there are no transfer requests, the address ℓ = 1 visits the read/write head every T seconds (T being fixed).
Suppose that f(t)dt is the probability that an arrival occurs, ending the idle period, in the interval (t, t + dt), for some t ≥ 0. Then the average distance, in number of cells to be traversed, to the starting address for the arriving transfer request is

D̄ = ∫_0^∞ D(t) f(t) dt.
Our problem is to choose the function D(t) which will minimise D̄, since as soon as an arrival occurs the optimum strategy will be to rotate the charge-coupled device using the minimum clock time r_m. The average latency, or delay before the arriving request can begin its transfer, is then r_m D̄.

Since both f(t) and D(t) are non-negative quantities, D̄ is minimised simply by letting D(t) be as small as possible for each value of t. We know that D(kT) = 0 and D(kT+) = L for k = 0, 1, ...; furthermore, D(t) is a decreasing function for every other value of t. It cannot decrease any faster than 1/r_m, because of the limitation on rotation speed, nor any slower than 1/r_M, because of the need to refresh the contents of each cell at least once every r_M time units. This leads immediately to the optimum form for D(t) shown in Fig. 2.5. This form guarantees that D(t) is as small as possible within the given constraints, so that D̄ is minimised.
The time τ at which the speed of rotation must be changed during each rotation is easily obtained from

(T - τ)/r_M = L - τ/r_m,

yielding

τ = r_m(T - L r_M)/(r_m - r_M),
D(τ) = L - (T - L r_M)/(r_m - r_M).
Fig. 2.5. Optimum form for D(t), the number of cells separating the initial address
from the read/write heads during an idle period of the charge-coupled device memory.
f(t) = λe^{-λt}

when there are λ arrivals per second to the system. This yields

D̄ = Σ_{k=0}^∞ e^{-λkT} ∫_0^T D(x) λe^{-λx} dx.

Therefore

D̄ = [L + (1/λ)((e^{-λτ} - 1)/r_m + (e^{-λT} - e^{-λτ})/r_M)] / (1 - e^{-λT}).
For a device rotating at constant speed, with clock time r_M, we have T = L r_M, which yields

D̄ = L/(1 - e^{-λL r_M}) - 1/(λ r_M).
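The closed-form expression for D̄ can be checked by direct numerical integration of D(t) against the exponential arrival density. The sketch below (Python; all parameter values are illustrative and assume L r_m ≤ T ≤ L r_M) compares the two.

```python
import math

lam, L, rm, rM = 0.8, 100.0, 0.01, 0.05
T = 2.0                              # cycle time; requires L*rm <= T <= L*rM
tau = rm * (T - L * rM) / (rm - rM)  # speed change point

def D(t):                            # optimal profile: fast sweep, slow dwell
    return max(L - t / rm, (T - t) / rM)

# numerical Dbar: integrate D(x)*lam*exp(-lam*x) over one cycle, then sum
# the geometric series over cycles, i.e. divide by 1 - exp(-lam*T)
n = 100_000
h = T / n
integral = sum(D((i + 0.5) * h) * lam * math.exp(-lam * (i + 0.5) * h) * h
               for i in range(n))
dbar_num = integral / (1.0 - math.exp(-lam * T))

dbar_closed = (L + ((math.exp(-lam * tau) - 1) / rm
                    + (math.exp(-lam * T) - math.exp(-lam * tau)) / rM) / lam) \
    / (1.0 - math.exp(-lam * T))
assert abs(dbar_num - dbar_closed) < 1e-3
```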
Fig. 2.6. A multiplexed data communication channel with polling time y and fixed message (packet) transmission time Y.
The service time of Fig. 2.1 will take the value Y. As far as the k-th buffer queue is concerned, we can also identify two cases which yield the best and the worst performance. These two limiting cases can be analysed exactly and will provide performance bounds for the k-th buffer queue.
2.4.1. Best and worst case analysis for the buffer queues

Let us first consider the best case analysis for the k-th buffer queue. Here we will have s = Y + y, T = (N - 1)y and Ŝ = N y for the queue with server of walking type model of the buffer service mechanism. The average queue length will be the measure of performance which we will examine; let b_k be this quantity for the best case. Using (2.30) we can write

b_k = [λ_k(Y + N y)]^2/(2(1 - λ_k(Y + N y))) + λ_k(Y + y) + λ_k N y/2,

where λ_k is the rate of arrival of packets to the k-th buffer.
For the worst case we have again s = Y + y, while T = (N - 1)(y + Y) and Ŝ = N y + (N - 1)Y. If we denote by B_k the worst case average queue length we have, again using (2.30),

B_k = [λ_k N(y + Y)]^2/(2(1 - λ_k N(y + Y))) + λ_k(Y + y) + λ_k[N y + (N - 1)Y]/2.
We see that the difference between the best and worst cases is due both to the values of the arrival rate of packets which will saturate the system, which are [Y + N y]^{-1} and [N(y + Y)]^{-1}, respectively, and to the additional terms which appear in the formulae. In Fig. 2.7 we show the form of these results.
Fig. 2.7. Range of values taken by the average queue length of the k-th buer.
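Both bounds follow from (2.30) with deterministic cycle components. A small sketch (Python; the helper function and parameter values are hypothetical) makes the comparison concrete.

```python
def mean_queue_bound(lam, es_total, es_first, idle):
    # (2.30) with deterministic components; the residual idle mean is idle/2
    rho = lam * es_total
    assert rho < 1.0
    return (lam * es_total)**2 / (2 * (1 - rho)) \
        + lam * es_first + lam * idle / 2.0

Y, y, N, lam_k = 1.0, 0.1, 4, 0.2
b_k = mean_queue_bound(lam_k, Y + N * y, Y + y, N * y)
B_k = mean_queue_bound(lam_k, N * (y + Y), Y + y, N * y + (N - 1) * Y)
assert b_k < B_k                     # the worst case dominates the best case
```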
Between these two extremes, suppose that in each cycle j of the other N - 1 buffers transmit a packet. Then

E[S] = N y + (j + 1)Y,  Ŝ = N y + jY,  s = y + Y,

and (2.30) yields

b_k(j) = λ_k^2 [N y + (j + 1)Y]^2/(2(1 - λ_k[N y + (j + 1)Y])) + (λ_k/2)[(N + 2)y + (j + 2)Y].
References
1. Adams, C., Gelenbe, E. and Vicard, J. (1977). An Experimentally Validated Model of the Paging Drum. IRIA Research Report, No. 229.
2. Borovkov, A. A. (1976). Stochastic Processes in Queueing Theory.
Springer, New York.
Chapter 3
Queueing Network Models
(i) the nature of each node: how many servers there are, how fast they are,
what scheduling strategy is employed there;
(ii) the nature of the jobs: their arrival patterns, their routing patterns, the amounts of service they demand from nodes on their route. Typically, there will be different classes of jobs in the system, with different characteristics.
Fig. 3.1.
at this node represents the time users spend thinking between receiving a
response to one job and submitting the next.
Node 2 contains a single server representing the CPU. A queue may
form here and, this being a time-sharing system, let us say that the
scheduling discipline is processor-sharing (we shall discuss processor-sharing
strategies in detail later). Nodes 3 and 4 are also single-server nodes,
representing the drum and the disk, respectively. The scheduling strategy
at both nodes is FIFO (first-in-first-out).
To model the fact that job execution consists of alternating CPU and Input/Output intervals (the latter correspond to either page or file record transfers) we impose the following routing rules: after leaving nodes 1, 3 or 4 jobs always go to node 2; after leaving node 2 they may go to nodes 1, 3 or 4 with certain probabilities. These probabilities will normally be assumed fixed but may, in some applications, be allowed to depend on past job history. Think times, CPU times and I/O intervals are also governed by probabilistic assumptions.
There may be jobs of different types in the system. For example, k of the M terminals may be reserved for users with short, I/O-bound jobs while the others are occupied by users with long, CPU-bound jobs. This can be modelled by introducing two job classes with different routing probabilities, think time and service time distributions.
What do we expect to learn from the model? Some system performance
measures one may be interested in are: the average response time (the
time between leaving node 1 and returning to it) for jobs of class i; the
proportions of time that the CPU, the drum and the disk spend servicing
jobs of class i, the marginal and joint distributions of queue sizes at the
various nodes, etc.
We shall return to this model at the end of the chapter, after the
tools of analysis have been developed. We shall then be able to write
down expressions for these performance measures in terms of the system
parameters.
Case 1. There is a single job class. Jobs arrive into node i from outside in a homogeneous Poisson stream with rate λ_{0i}, i = 1, 2, ..., N. The amount of service they require at node i is distributed exponentially with mean 1/μ_i. After service at node i (i = 1, 2, ..., N) jobs either leave the network (j = 0) or go to node j (j > i), with fixed probability p_ij (p_ij ≥ 0, Σ_j p_ij = 1). Node i contains c_i identical servers of unit speed (the latter is not an important restriction; it is made only to avoid extra notation), with a common queue served in FIFO order. All external arrival and service processes are independent.

The state of the QN at any given time is defined as the integer vector

n = (n_1, n_2, ..., n_N),  n_i ≥ 0,  i = 1, 2, ..., N.
Fig. 3.2.
when it exists.

We begin by observing that node 1 is entirely unaffected by nodes 2, 3, ..., N. It behaves like the classic M/M/c queue with parameters λ = λ_{01}, μ = μ_1 and c = c_1; the stationary distribution of the latter exists when λ < cμ and is given by (see Chapter 1)

p(n) = β(n) / Σ_{k=0}^∞ β(k),  n = 0, 1, ...,   (3.1)

where

β(0) = 1,  β(k) = (λ/μ)^k / Π_{j=1}^k α(j),  α(j) = min(j, c),  j, k = 1, 2, ....
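Expression (3.1) can be evaluated by truncating the sum Σβ(k) at a point where the remaining terms are negligible. A minimal sketch (Python; the truncation point and parameter values are our own choices):

```python
def mmc_distribution(lam, mu, c, kmax=400):
    """Stationary distribution (3.1) of the M/M/c queue, truncated at kmax."""
    assert lam < c * mu
    beta = [1.0]
    for k in range(1, kmax + 1):
        # beta(k) = beta(k-1) * lam / (mu * alpha(k)), alpha(k) = min(k, c)
        beta.append(beta[-1] * lam / (mu * min(k, c)))
    norm = sum(beta)
    return [b / norm for b in beta]

p = mmc_distribution(lam=0.8, mu=1.0, c=1)
# for c = 1 this is the geometric M/M/1 distribution (1 - rho) * rho^n
assert abs(p[0] - 0.2) < 1e-9
assert abs(p[3] - 0.2 * 0.8**3) < 1e-9
```

For c = 1 the factors α(j) are all 1, so the β(k) reduce to (λ/μ)^k and the familiar geometric M/M/1 distribution is recovered, which the assertions check.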
Thus (3.1) can be used to obtain the marginal stationary distribution p_1(n_1) of the number of jobs at node 1.

If the stream of arrivals into node 2 is also Poisson, say with total rate λ_2, then node 2 would also behave like an M/M/c queue with parameters λ_2, μ_2 and c_2; the stationary distribution of the number of jobs at node 2, p_2(n_2), would exist if λ_2 < c_2 μ_2 and we could again use (3.1) to write an expression for it. Furthermore, if n_1 and n_2 were mutually independent in the steady-state, we could obtain their joint distribution by multiplying p_1(n_1) and p_2(n_2).
What constitutes the input into node 2? It is formed in general by splitting off part of the output from node 1 (a fraction p_12) and merging it with the external arrivals into node 2. Since Poisson streams remain Poisson after splitting and merging (if the streams are independent), it will be sufficient, and necessary, to show that the total departure stream from node 1 must be Poisson in order for the total arrival stream into node 2 to be Poisson.
Perhaps the best way to approach this problem is via the notion of
reversibility, introduced by Reich [20]. He observed that, in equilibrium, the
Going back to our feedforward network, the output theorem shows that the total arrival stream into node 2 is Poisson with rate λ_2 = λ_{02} + p_12 λ_{01}. Thus the marginal stationary distribution of the number of jobs at node 2, p_2(n_2), is given by (3.1) with λ = λ_2, μ = μ_2 and c = c_2. The theorem also shows that, at any time t in the steady-state, the number of jobs at node 2 is independent of the number of jobs at node 1. This is because only departures from node 1 prior to t influence the state of node 2 at t, and those departures are independent of the state of node 1 at t.
These arguments carry through to all other nodes in the network. The total arrival stream into node j is Poisson with rate

λ_j = λ_{0j} + Σ_{i=1}^{j-1} λ_i p_ij,  j = 2, 3, ..., N,   (3.2)

where λ_i is the total arrival (and hence departure) rate for node i, i = 1, 2, ..., j - 1. The derivation of (3.2) is obvious; it takes into account the external arrivals into node j and those parts of the departure streams from other nodes which are directed to node j. Furthermore, at any moment in the steady-state, the states of the various nodes are independent of each other, because we have shown that the past departure stream is independent of the present state at a node. Therefore, the stationary distribution of the network state is equal to the product of the marginal distributions (3.1) of the individual nodes.
Fig. 3.3.
queue behind it will go to node 4 via node 3, in which case they will arrive
there before J and cause it to wait longer. Thus the conditional probability
of a long sojourn time at node 4 given a long sojourn time at node 1 is
higher than the corresponding unconditional probability, i.e. the two are
not independent.
There is only one known case of a QN where the sojourn times of a given job at different nodes are all independent: N nodes strictly in tandem; all except the first and the last contain a single exponential server; the first node is an M/M/c queue and the last can be an M/G/c queue (Burke [5]; Reich [20]). Another curious aspect of this problem is that if waiting times are defined to exclude service times, then even in the above case the waiting times at different nodes are not independent (Burke [5]).
λ_j = λ_{0j} + Σ_{i=1}^N λ_i p_ij,  j = 1, 2, ..., N.   (3.4)

We saw a special case of the traffic equations in (3.2); there, the associated matrix was triangular and the equations always had a unique solution; the steady-state distribution existed if, and only if, that solution satisfied λ_i < c_i μ_i (i = 1, 2, ..., N).
It is readily seen that if a general Jackson QN has a steady-state regime then the corresponding traffic equations have a solution. Indeed, they are satisfied by the total rates of input (number of arrivals per unit time), λ_1, λ_2, ..., λ_N, into nodes 1, 2, ..., N. To justify that statement it is enough to observe that, in the steady-state, λ_i is also the total rate of output from node i (i = 1, 2, ..., N). The right-hand side of (3.4) then contains the rate of external input into node j (λ_{0j}), plus all the output rate fractions which are directed to node j (λ_i p_ij, i = 1, 2, ..., N), i.e. the total rate of input into node j. Thus the existence of a solution to (3.4) is a necessary condition for the existence of a steady-state distribution of the Jackson QN. A rigorous proof of this can be found in [12].
Before examining the sufficiency of that condition we shall introduce a classification of the individual nodes of the network. This follows loosely the one adopted by Melamed [18]. A node is called open if any job which visits it is certain (will do so with probability 1) to leave the network eventually. A node is called closed if any job which visits it is certain to remain in the network forever. A node is called recurrent if any job which visits it is certain to return to it eventually (clearly all recurrent nodes are closed but not vice versa). For example, in the network of Fig. 3.4, nodes 5 and 6 are open, nodes 3 and 4 are closed and recurrent, node 2 is closed and non-recurrent (transient), and node 1 is neither open nor closed.

Let A be the set of open nodes in the network, B be the set of the non-open nodes and R be the set of the recurrent nodes. It can be demonstrated that the traffic equations (3.4) have a solution if, and only if, λ_{0j} = 0 for all j ∈ B [18].
Fig. 3.4.
We shall give an outline of the proof. Suppose that λ_{0j} > 0 for some j ∈ B and that j is either recurrent or has a path leading from it to some node r ∈ R (otherwise j would be open). In either case, a fraction (perhaps all) of the external arrivals into j find their way to some recurrent node r and hence keep on visiting it ad infinitum. Therefore r saturates in the long run; the traffic through it does not balance and (3.4) does not have a solution. If, on the other hand, λ_{0j} = 0 for all j ∈ B, one solution of (3.4) can be obtained by setting λ_j = 0 for all j ∈ B and solving only those equations in (3.4) corresponding to j ∈ A. That will be possible because all nodes j ∈ A are transient (in Markov chain terminology) and therefore the submatrix of (3.4) associated with them has an inverse.

If the traffic equations (3.4) have a solution, λ_1, λ_2, ..., λ_N, then necessarily λ_j = 0 for all j ∈ B - R [18]. This can be explained intuitively by remarking that jobs may leave the set of nodes B - R but never arrive into it (from the definitions of A and R and from the fact that λ_{0j} = 0, j ∈ B). Hence that set of nodes eventually drains of jobs completely and the traffic through it, when balanced, is zero.
Bearing in mind that p_ij = p_ji = 0 for all i ∈ A and j ∈ R, we can summarise the above results in the following manner.
Theorem 3.1. The traffic equations (3.4) have a solution if, and only if, they are equivalent to the three independent sets of equations

λ_j = λ_{0j} + Σ_{i∈A} λ_i p_ij,  j ∈ A,   (3.5)

λ_j = 0,  j ∈ B - R,   (3.6)

λ_j = Σ_{i∈R} λ_i p_ij,  j ∈ R.   (3.7)

Note that (3.5) always has a unique solution, because its matrix has an inverse (due to Σ_{j∈A} p_ij < 1 for all i ∈ A). (3.7), if present, has infinitely many solutions.
Corollary 3.1. The traffic equations have a unique solution if, and only if, all nodes of the network are open.
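For a completely open network, (3.5) is a non-singular linear system and can be solved directly. A sketch (Python with NumPy; the routing matrix and external rates are invented for illustration):

```python
import numpy as np

# routing probabilities p_ij between the three nodes; the leftover mass
# in each row, 1 - sum_j p_ij, is the exit probability p_i0
P = np.array([[0.0, 0.5, 0.2],
              [0.3, 0.0, 0.3],
              [0.0, 0.4, 0.0]])
lam0 = np.array([1.0, 0.5, 0.0])     # external arrival rates lambda_0j

# (3.5): lambda = lambda0 + P^T lambda, i.e. (I - P^T) lambda = lambda0
lam = np.linalg.solve(np.eye(3) - P.T, lam0)
assert np.allclose(lam, lam0 + P.T @ lam)   # fixed point of the equations
assert (lam > 0).all()
```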
If the set R of the recurrent nodes is not empty, there is (after all the jobs have drained from the nodes in B - R) a constant number of jobs circulating in it. Furthermore, R is split into non-intersecting equivalence classes by the relation "communicate" (nodes i and j communicate if there is a path from i to j and a path from j to i). There is a constant number of jobs circulating in each of these communicating classes and the system of equations (3.7) splits into independent subsystems, one for each communicating class.

Thus, in order to understand the steady-state behaviour of general Jackson networks it is important to study two special cases:

(i) open networks all of whose nodes are open (we shall call these networks completely open);
(ii) closed networks consisting of a single communicating class, with a fixed number of jobs circulating inside (we shall call such networks completely closed).
= Σ_{j=1}^N p(n_1, ..., n_j - 1, ..., n_N) I(n_j > 0) λ_{0j}
  + Σ_{j=1}^N p(n_1, ..., n_j + 1, ..., n_N) μ_j α_j(n_j + 1) p_j0
  + Σ_{j=1}^N Σ_{i=1, i≠j}^N p(n_1, ..., n_i + 1, ..., n_j - 1, ..., n_N) μ_i α_i(n_i + 1) p_ij I(n_j > 0).
λ_j < c_j μ_j,  j = 1, 2, ..., N,   (3.9)

then the steady-state distribution of the network state exists and has the form

p(n_1, n_2, ..., n_N) = Π_{j=1}^N [β_j(n_j) / Σ_{k=0}^∞ β_j(k)].   (3.10)
Proof. First we verify that (3.10) satisfies the balance equations (3.8). We substitute (3.10) into (3.8) and use the identities (see (3.1)):

β_j(n_j - 1) = β_j(n_j) μ_j α_j(n_j)/λ_j,  n_j > 0;
β_j(n_j + 1) = β_j(n_j) λ_j/(μ_j α_j(n_j + 1)),  n_j ≥ 0.
The factors β_j(n_j)/Σ_{k=0}^∞ β_j(k), j = 1, 2, ..., N, cancel out and (3.8) is reduced to

Σ_{j=1}^N λ_{0j} + Σ_{j=1}^N μ_j α_j(n_j) I(n_j > 0)(1 - p_jj)
  = Σ_{j=1}^N λ_{0j} [μ_j α_j(n_j)/λ_j] I(n_j > 0) + Σ_{j=1}^N λ_j p_j0
    + Σ_{j=1}^N [μ_j α_j(n_j)/λ_j] I(n_j > 0) Σ_{i=1, i≠j}^N λ_i p_ij.
This last equation always holds. Indeed, individual terms on the left- and right-hand sides can be equated: from (3.4) we have

1 - p_jj = (1/λ_j)[λ_{0j} + Σ_{i=1, i≠j}^N λ_i p_ij],  j = 1, 2, ..., N,

whence

μ_j α_j(n_j) I(n_j > 0)(1 - p_jj) = [μ_j α_j(n_j)/λ_j] I(n_j > 0) λ_{0j} + [μ_j α_j(n_j)/λ_j] I(n_j > 0) Σ_{i=1, i≠j}^N λ_i p_ij.
As an important aside, we should point out that the last two identities mean, in effect, that
(i) the rate of transition out of state n, due to a job leaving node j, is
equal to the rate of transition into state n, due to a job arriving into
node j; and
(ii) the total rate of arrivals into the network is equal to the total rate of
departures from the network.
Property (i) is usually called local balance (to distinguish it from the
global balance equations (3.8)) and it appears to be intimately connected
with the existence of product-form solutions.
Having established that (3.10) satisfies (3.8), we next verify, by direct summation, that it also satisfies the normalising equation

Σ_{n≥0} p(n) = 1
when (3.9) holds. Jackson's theorem now follows from the theorem in section 1.4 which states that if the balance equations of an irreducible Markov process have a positive solution which satisfies the normalising equation, then the steady-state distribution of the Markov process exists and is given by that solution. The state process of a completely open Jackson
QN is, indeed, irreducible; this follows from the fact that the state n = 0
is accessible from every state [18].
Jackson's theorem implies that the states n_j of individual nodes (j = 1, 2, ..., N) at a given moment in the steady-state are independent random variables. This is an even more remarkable result than in the case of feedforward networks because, a priori, it would seem that the nodes of a general Jackson network have more opportunities to influence each other. Furthermore, as we shall see at the end of this section, the total input process into a given node is no longer Poisson, in general. Yet the node behaves as if it were!
Remark: The theorem states that conditions (3.9) are sufficient for the existence of a steady-state distribution. Clearly, they are also necessary, because if a steady-state exists then the total rate of output from node j is λ_j (j = 1, 2, ..., N) and, since the servers there are occasionally idle, that rate of output is less than c_j μ_j (which is what it would be if all the servers were busy all the time).
Suppose now that we are dealing with a completely closed network with K jobs circulating inside. The state process of the network is a finite Markov chain; it is irreducible (since all nodes communicate) and therefore always has a steady-state distribution. The form of that distribution was discovered by Gordon and Newell [11] (although it can be derived as a special case from one of Jackson's theorems).

The proof of this theorem is also by direct substitution of (3.11) into (3.8) and verifying that the latter are satisfied.
Since jobs are being served at node j for an average time of 1/μ_j and they arrive there at rate λ_j, the average number of jobs being served at node j is, according to Little's theorem, ρ_j = λ_j/μ_j (j = 1, 2, ..., N). If there is only one server at node j then ρ_j is its utilisation factor (the fraction of time the server is busy). The total average number of jobs being served (not waiting in queues) in the network is ρ_1 + ρ_2 + ... + ρ_N and hence, again according to Little's theorem, the total average amount of service a job obtains during its residence in the network is

E[S] = (1/λ) Σ_{j=1}^N ρ_j,

where λ = λ_{01} + λ_{02} + ... + λ_{0N} is the total external arrival rate. Similarly, the average number of visits that a job makes to node j during its residence in the network is

e_j = λ_j/λ.
So far we have not used the distribution of the network state; the same arguments would apply, for example, if interarrival and service times had general distributions. One needs the state distribution if one is interested in the numbers of jobs at various nodes or the time jobs spend there. In particular, the average number of jobs at node j is equal to

E[n_j] = Σ_{k=1}^∞ k p_j(k).

The total average number of jobs in the network is E[n] = E[n_1] + E[n_2] + ... + E[n_N], which gives, once more according to Little's theorem, the average response time W (the time between the arrival of a job into, and its departure from, the network):

W = E[n]/λ.
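The chain of performance measures just described, from the traffic solution through utilisations and mean queue sizes to the response time via Little's theorem, can be strung together in a few lines. The sketch below (Python with NumPy; a hypothetical three-node network of single exponential servers, using the M/M/1 marginal E[n_j] = ρ_j/(1 - ρ_j)):

```python
import numpy as np

# hypothetical completely open network of three single exponential servers
P = np.array([[0.0, 0.6, 0.0],
              [0.2, 0.0, 0.5],
              [0.1, 0.0, 0.0]])
lam0 = np.array([1.0, 0.0, 0.0])
mu = np.array([4.0, 3.0, 2.0])

lam = np.linalg.solve(np.eye(3) - P.T, lam0)   # traffic equations (3.4)
rho = lam / mu                                 # utilisations rho_j
assert (rho < 1).all()                         # steady-state exists

lam_total = lam0.sum()
ES = rho.sum() / lam_total              # mean total service per job
En = (rho / (1 - rho)).sum()            # M/M/1 marginals: E[n_j] = rho_j/(1-rho_j)
W = En / lam_total                      # Little's theorem: mean response time
assert W > ES                           # response time includes queueing
```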
Let us now look at some traffic processes of jobs between nodes. Very little is known about these and the results that are available are mostly of a negative nature. For example, the total input process into a node is not, in general, Poisson. To demonstrate this, consider the single-node network of Fig. 3.5 (Burke [6]). There is a single server at node 1; upon completion of service a job leaves with probability p_10 and is fed back with probability p_11 = 1 - p_10. The traffic equation is λ_1 = λ_{01} + p_11 λ_1, yielding λ_1 = λ_{01}/p_10. Steady-state exists when λ_1 < μ_1 and the system, as far as the queue size distribution is concerned, is equivalent to an M/M/1 queue with traffic intensity ρ = λ_1/μ_1:

p(n) = ρ^n(1 - ρ),  n = 0, 1, ....

Now, the queue size distribution left behind by departing (not fed-back) jobs is the same as that seen by jobs arriving from the outside; the latter
Fig. 3.5.
The j-th term in the sum is equal to the probability that the j-th customer
in the queue will be the first to be fed back, multiplied by the density
function of j service times (Erlang with parameters j, μ_1). Next,

g(x) = Σ_{n=0}^{∞} p(n) g_n(x) = p_11 μ_1 e^{−(μ_1 − λ_01)x},

after substitution of p(n) and g_n(x) and inverting the order of summation.
This gives

G(x) = ∫_0^x g(t) dt = [p_11 μ_1/(μ_1 − λ_01)][1 − e^{−(μ_1 − λ_01)x}].

We see that the mean of F(x) is 1/λ_1 (as expected), but F(x) is not
exponential and hence the input stream is not Poisson.
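Burke's observation can be probed numerically. The following discrete-event sketch (not from the book; all parameter values are invented) simulates the queue of Fig. 3.5 and collects the interarrival times of the merged input stream, i.e. external arrivals together with fed-back jobs. Their mean should be 1/λ_1, in agreement with the traffic equation; the empirical squared coefficient of variation of the gaps is also computed, as one simple way to examine the departure from the Poisson assumption.

```python
# Illustrative sketch (not from the book): M/M/1 queue with Bernoulli feedback.
import random

random.seed(1)
lam01, mu1, p10 = 1.0, 4.0, 0.5          # external rate, service rate, exit prob.
lam1 = lam01 / p10                        # solution of the traffic equation

t, queue, inputs = 0.0, 0, []
next_arrival = random.expovariate(lam01)
next_departure = float('inf')
while t < 50000.0:
    if next_arrival < next_departure:     # external arrival joins the queue
        t = next_arrival
        inputs.append(t)
        queue += 1
        if queue == 1:
            next_departure = t + random.expovariate(mu1)
        next_arrival = t + random.expovariate(lam01)
    else:                                 # service completion
        t = next_departure
        queue -= 1
        if random.random() >= p10:        # fed back: counts as a new input
            inputs.append(t)
            queue += 1
        next_departure = (t + random.expovariate(mu1)) if queue > 0 else float('inf')

gaps = [b - a for a, b in zip(inputs, inputs[1:])]
mean = sum(gaps) / len(gaps)
scv = sum((g - mean) ** 2 for g in gaps) / len(gaps) / mean ** 2
print(mean, 1 / lam1, scv)   # mean close to 1/lambda_1; see text re: non-Poisson
```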
This situation raises the question of what is the network state
distribution at the moments when jobs move from one node to another
(or arrive from the outside). If the total input into a node is Poisson, and
independent of the network state, then the network state distribution at
input instants is the same as the steady-state distribution. We saw in an
example that the input process is not necessarily Poisson but we also saw
that input jobs may still see the steady-state distribution. This is, in fact,
the case for any completely open Jackson network: jobs arriving into a node
(externally or internally) do not, in general, form a Poisson process but they
see the steady-state distribution of the network state (Sevcik and Mitrani
[23]). In a closed network with K jobs circulating in it, a job coming into a
node sees the steady-state distribution of a network with K − 1 jobs. These
results are generalised in [23] to a large class of networks with many job
classes; the networks may be open with respect to some job classes and
closed with respect to others.
A generalisation of the output theorem holds for the processes of
departure from Jackson networks in equilibrium: the stream of jobs
leaving the network from node j is Poisson with rate λ_j p_{j0} and its past
is independent of the network state. Moreover, these streams are mutually
independent [18].
(k_1, k_2, …, k_R) ≥ 0,   (3.13)

where k = k_1 + k_2 + ⋯ + k_R.
It is not difficult to verify, by direct substitution, that the solution of
(3.13) which satisfies the normalising equation

Σ_{k ≥ 0} p(k) = 1
is given by

p(k_1, k_2, …, k_R) = (1 − ρ) k! ∏_{r=1}^{R} (ρ_r^{k_r}/k_r!),  (k_1, k_2, …, k_R) ≥ 0.   (3.14)
The balance equations for this state description are of the form

p(r_1, r_2, …, r_k)[λ + μ_{r_1} I(r_1 > 0)]
  = p(r_2, …, r_k) λ_{r_1} I(r_1 > 0) + Σ_{s=1}^{R} p(s, r_1, …, r_k) μ_s.   (3.15)
p(r_1, r_2, …, r_k) = (1 − ρ) ∏_{i=1}^{k} ρ_{r_i}   (3.16)
p(k_1, k_2, …, k_R) = (1 − ρ) k! ∏_{r=1}^{R} (ρ_r^{k_r}/k_r!)   (3.17)
Fig. 3.6.
Let f(x) be the probability density function of the required service time
and let f*(s) be its Laplace transform. Since the Laplace transform of the
stage l service time is μ_l/(μ_l + s), we can write

f*(s) = Σ_{l=1}^{L} A_l b_l ∏_{i=1}^{l} [μ_i/(μ_i + s)].   (3.21)
where the increment h, the staircase steps d_i and their number m are chosen
so that the approximant F̃(x) matches F(x) with the desired accuracy. Each of
the unit step functions I(t ≥ ih) is the distribution function of a constant (ih)
where A_{rl} = a_{r0} a_{r1} ⋯ a_{r,l−1} is the probability that a class r job
reaches the l-th stage of its service.
When the required service times are not distributed exponentially,
the stochastic process defined by the number (or vector of numbers) of
jobs in the system is not Markov and one cannot find its steady-state
distribution by means of balance equations. However, if those distributions
are Coxian, the Markov property can be reinstated by a suitable
redefinition of the system state. The new process can then be studied in the
usual way.
In the case of the processor-shared server, dene the system state
as a vector of vectors (v1 , v2 , . . . , vR ), where vr = (kr1 , kr2 , . . . , krLr )
is a vector whose l-th element is the number of class r jobs which are
in the l-th stage of their service. As defined, the system state forms a
Markov process because all stages are distributed exponentially. We can
therefore write a set of balance equations for the steady-state distribution
of (v1 , v2 , . . . , vR ). These equations take into account transitions out of
and into a state due to arrivals of class r jobs and due to completions
of stage l of a class r service (r = 1, 2, . . . , R; l = 1, 2, . . . , Lr ). The
solution of the balance equations, subject to the normalising equation, is
given by
p(v_1, v_2, …, v_R) = (1 − ρ) k! ∏_{r=1}^{R} λ_r^{k_r} ∏_{l=1}^{L_r} [(A_{rl}/μ_{rl})^{k_{rl}}/k_{rl}!]   (3.22)
where

ρ = Σ_{r=1}^{R} (λ_r/μ_r) = Σ_{r=1}^{R} λ_r Σ_{l=1}^{L_r} (A_{rl}/μ_{rl}).

Aggregating (3.22) over the numbers of jobs of each class gives

p(k_1, k_2, …, k_R) = (1 − ρ) k! ∏_{r=1}^{R} [(λ_r/μ_r)^{k_r}/k_r!] = (1 − ρ) k! ∏_{r=1}^{R} (ρ_r^{k_r}/k_r!),

using the same notation as in the previous case (and assuming that ρ < 1).
The product on the right-hand side is defined as 1 if (r_1, l_1) = (0, 0).
Aggregating over all states such that the class index of the rst job in
the LCFS order is r1 , that of the second job is r2 , etc., gives
p(r_1, r_2, …, r_k) = (1 − ρ) ∏_{j=1}^{k} (λ_{r_j}/μ_{r_j}) = (1 − ρ) ∏_{j=1}^{k} ρ_{r_j}
p(v_1, v_2, …, v_R) = e^{−ρ} ∏_{r=1}^{R} ∏_{l=1}^{L_r} [(λ_r A_{rl}/μ_{rl})^{k_{rl}} (1/k_{rl}!)].   (3.24)
Steady-state exists for all values of the parameters. In this case, since
e^{−ρ} factorises into a product of e^{−λ_r A_{rl}/μ_{rl}} over all r and l,
the random variables k_{rl} (the number of class r jobs in the l-th stage of
their service) are mutually independent. Aggregation of (3.24) yields (3.19).
with probability

p_{ir,0} = 1 − Σ_{j,s} p_{ir,js}   (i, j = 1, 2, …, N; r, s = 1, 2, …, R).
The pair (i, r) associated with a job at a node is called job state. The set of
job states is split into one or more non-intersecting subsets (or subchains)
in the following way: two job states belong to the same subchain if there is
a non-zero probability that a job will be in both job states during its life
in the network. Denote these subchains by E_1, E_2, …, E_m (m ≥ 1). (For
example, if jobs never change class when they go from node to node, there
will be at least R subchains.)
It may be that some subchains are closed, having a constant number
of jobs in them at all times, while others are open with external arrivals
and departures. Moreover, the external arrival processes may be state-
dependent in a restricted way. Let S be the state of the network (to be
dened later), let M (S) be the total number of jobs in the network in state
S and let M (S, Ek ) be the number of jobs in subchain Ek when the network
is in state S. The external arrivals may be generated in either, but not both,
of the following two ways:
Type 1 node: The service requirements for all job classes are distributed
exponentially with mean 1/μ_i. Jobs are served in order of arrival. The state
S_i of the node is defined as the vector (r_1, r_2, …, r_{n_i}), where n_i is
the number of jobs present and r_j is the class index of the j-th job in the
FCFS order. There is a single server whose speed C_i(n_i) depends on the
number of jobs and satisfies C_i(1) = 1 (multiple servers can be modelled by
setting
In this model, the following equations play the role which the traffic
equations played in Jackson networks:

e_{js} = p_{0,js} + Σ_{(i,r) ∈ E_k} e_{ir} p_{ir,js};   (j, s) ∈ E_k, k = 1, 2, …, m.   (3.27)

The quantity e_{ir} is proportional to the total arrival rate of class r jobs
into node i (i = 1, 2, …, N; r = 1, 2, …, R). Since p_{ir,js} = 0 when the
job states (i, r) and (j, s) belong to different subchains, there are in fact
m independent subsystems.
where:
otherwise, if there are external arrivals and they are of type (ii), then

d(S) = ∏_{k=1}^{m} ∏_{n=0}^{M(S,E_k)−1} λ_k(n),
where G is the same constant as in (3.28); d(n) is defined in the same way
as d(S) in (3.28); the factor g_i(n_i) depends on the type of node i (i =
1, 2, …, N):
if node i is of type 1 then
g_i(n_i) = n_i! ∏_{r=1}^{R} (e_{ir}^{n_{ir}}/n_{ir}!) ∏_{j=1}^{n_i} [μ_i C_i(j)]^{−1},
1/μ_{ir} is the average required service time for class r jobs at node i
(node types 2, 3 and 4):

1/μ_{ir} = Σ_{l=1}^{L_{ir}} (A_{irl}/μ_{irl}).
where

p_i(n_i) = (1 − ρ_i) ρ_i^{n_i}   if node i is of type 1, 2 or 4,
p_i(n_i) = e^{−ρ_i} ρ_i^{n_i}/n_i!   if node i is of type 3,

provided that ρ_i < 1 for nodes of type 1, 2 or 4. We see that in this case
the nodes behave like N independent M/M/1 (for types 1, 2 and 4) or
M/M/∞ (for type 3) queues.
Some remarks are in order concerning the assumptions, generality and
usefulness of the BCMP model. Clearly, the introduction of different job
classes and node types widens considerably the field of application of the
model. We shall give two examples of systems which can be modelled as
BCMP but not as Jackson networks.
g(z) = ∏_{i=1}^{N} g_i(z) = ∏_{i=1}^{N} Σ_{n_i=0}^{∞} ρ_i^{n_i} z^{n_i}   (3.33)

defined whenever the component series converge. g(z) will be called the
generating function of the network, and the factors g_i(z) the generating
functions of the individual nodes (i = 1, 2, …, N). Clearly, the coefficient
of z^K in g(z) is precisely our normalising constant G: that coefficient, like G,
is the sum of terms of the type ρ_1^{n_1} ρ_2^{n_2} ⋯ ρ_N^{n_N}, one term for
each composition n_1 + n_2 + ⋯ + n_N = K.
Denote by φ_i(z) the partial products in (3.33): φ_i(z) = ∏_{l=1}^{i} g_l(z),
with coefficients G_i(j). Since φ_i(z) = φ_{i−1}(z) g_i(z) and each g_i(z) is
a geometric series, φ_i(z)(1 − ρ_i z) = φ_{i−1}(z), which implies the following
recurrence relation for the coefficients G_i(j):

G_i(j) = G_{i−1}(j) + ρ_i G_i(j − 1).
U_1 is the probability that there is at least one job at node 1. A similar
statement is true, of course, for any other node. Since we are dealing with
geometric series,

g(z)[g_i(z) − 1]/g_i(z) = ρ_i z g(z),

so that U_i = ρ_i G_N(K − 1)/G_N(K).
Hence

E[n_i] = (1/G_N(K)) Σ_{j=1}^{K} ρ_i^{j} G_N(K − j).   (3.38)
where α_i(0) = 1, α_i(j) = ρ_i^{j}/[C_i(1) C_i(2) ⋯ C_i(j)], j ≥ 1, with the
previous notation for ρ_i. The network generating function is

g(z) = ∏_{i=1}^{N} g_i(z) = ∏_{i=1}^{N} Σ_{j=0}^{∞} α_i(j) z^j,

whence

G_i(j) = Σ_{s=0}^{j} G_{i−1}(s) α_i(j − s).

This recurrence relation, together with the initial conditions G_1(j) = α_1(j),
j = 0, 1, …, allows G_N(K) to be computed in O(NK^2) steps.
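The computation just described is straightforward to implement. The following sketch uses invented example data; the function name `normalising_constants` is ours, not the book's. `alpha[i][j]` stands for α_i(j), and the nodes are convolved one at a time exactly as in the recurrence above.

```python
# Illustrative sketch (not from the book): convolution computation of G_N(0..K).
def normalising_constants(alpha, K):
    """Convolve one node at a time: G_i(j) = sum_s G_{i-1}(s) * alpha_i(j-s)."""
    G = list(alpha[0][:K + 1])                     # G_1(j) = alpha_1(j)
    for a in alpha[1:]:
        G = [sum(G[s] * a[j - s] for s in range(j + 1)) for j in range(K + 1)]
    return G

# Example: three single-server nodes (C_i(j) = 1), so alpha_i(j) = rho_i^j.
rho = [0.5, 0.5, 0.4]
K = 4
alpha = [[r ** j for j in range(K + 1)] for r in rho]
G = normalising_constants(alpha, K)
print(G[K])                                        # the normalising constant G_N(K)
```

The nested list comprehension performs the O(NK^2) work mentioned in the text: N − 1 convolutions, each costing O(K^2).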
To find the utilisation of node i, U_i, we proceed as before: G·U_i is the
coefficient of z^K in the series

h_i(z) = g(z)[g_i(z) − 1]/g_i(z) = g(z)[1 − d_i(z)],   (3.39)
where di (z) is the inverse of gi (z). Denoting the coecients of hi (z) and
di (z) by Hi (j) and Di (j), respectively, the convolution (3.39) yields a
recurrence relation
H_i(j) = G_N(j) − Σ_{s=0}^{j} G_N(s) D_i(j − s).   (3.40)
The coefficients D_i(j) are obtained from the identity

d_i(z) g_i(z) = 1,

which yields

D_i(0) = 1,   Σ_{s=0}^{j} D_i(s) α_i(j − s) = 0,  j ≥ 1,
or
D_i(j) = −Σ_{s=0}^{j−1} D_i(s) α_i(j − s),  j ≥ 1.   (3.41)
Thus (3.41) can be used to compute D_i(j) and then (3.40) to compute
H_i(j). The utilisation of the i-th node is given by U_i = H_i(K)/G_N(K), and

E[n_i] = Σ_{j=1}^{K} j p_i(j),  i = 1, 2, …, N.

The throughput of node i is

λ_i = e_i G_N(K − 1)/G_N(K).
The methods described so far generalise to networks with more than one
job class. The generating functions of such networks are multi-variate (there
is one variable for each job class if jobs do not change classes; one variable
for each subchain if they do). The normalisation constant and various
quantities of interest are obtained by multi-variate convolutions (Reiser [21];
Reiser and Kobayashi [22]; Wong [25]). Some results remain unchanged: for
example, the throughput λ_{ir} of class r jobs through node i, in a closed
network with K_r jobs of class r circulating inside (r = 1, 2, …, R), is given
by

λ_{ir} = e_{ir} G(K − 1_r)/G(K),

where K = (K_1, …, K_R) and 1_r denotes the unit vector in the r-th direction.
Here S_i is the state of node i, p_i(S_i) is the probability of that state, and
n_{ir}/n_i is the fraction of server capacity allocated to class r jobs (for type 2
nodes), or the probability of a class r job being in service (type 1 or 4
nodes). Such a procedure would involve the computation of the normalising
constant and then of the marginal probabilities p_i(S_i).
When we talk about response times in the context of a closed network,
we usually mean the time between leaving a certain node and returning
to it. For example, in a terminal-driven system the collection of terminals
is modelled by one node (of type server-per-job). Let that be node i and
suppose that there are Kr terminals of class r, r = 1, 2, . . . , R (in a heavily
loaded system, when the terminals are busy all the time, jobs can be
identied with terminals). The response time for a class r job is dened as
the interval between the job leaving its terminal (the user presses carriage
return) and returning to it (the keyboard unlocks). Denote the average
response time for class r jobs by Wir . Let ir be the throughput of class r
jobs at node i and let E[nir ] be the average number of class r jobs at node
i (users in think state). The average number of class r jobs in the rest of
the system is K_r − E[n_{ir}] and, by Little's theorem,

W_{ir} = (K_r − E[n_{ir}])/λ_{ir}.

On the other hand, node i being of type 3, jobs do not wait there; the
average sojourn time for class r jobs is equal to their average service time
(or think time) 1/μ_{ir}; again by Little's theorem, E[n_{ir}] = λ_{ir}/μ_{ir}. Hence
W_{ir} = K_r/λ_{ir} − 1/μ_{ir},   (3.48)

so that the average response time and the throughput (for the particular
job class) uniquely determine each other. Moreover, (3.48) and (3.47), like
(3.46), are valid under much more general assumptions than those of the
BCMP model.
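As a quick numerical illustration of the response-time law (with invented numbers, not taken from the text): with 50 terminals, a mean think time of 10 seconds and a measured throughput of 4 jobs per second,

```python
# Illustrative sketch: the response-time law W = K/lambda - 1/mu.
K = 50            # terminals (jobs) of the class considered
think = 10.0      # mean think time 1/mu, in seconds
throughput = 4.0  # lambda, jobs per second

W = K / throughput - think
print(W)          # 2.5 seconds
```

Note that only the population, the think time and the throughput are needed; no detail of the rest of the network enters the formula.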
Let us now take, as an example, the terminal system introduced at
the beginning of this chapter (see Fig. 3.1) and obtain for it expressions
for some performance measures of interest. The system consisted of M
terminals (modelled by a node of type 3), one CPU (a type 2 node), one
paging drum and one filing disk (type 1 nodes). Suppose that there is only
one job class and that on leaving the CPU jobs go to the terminals, the drum
and the disk with probabilities p_1, p_3 and p_4, respectively (p_1 + p_3 + p_4 = 1).
On leaving the terminals, the drum and the disk, jobs go to the CPU with
probability 1. Let 1/μ_i, i = 1, 2, 3, 4 be, respectively, the average think
times, the average CPU intervals, the average drum transfer times and
the average disk transfer times (the latter two include rotational and/or
seek delays). The corresponding distributions may be arbitrary Coxian for
i = 1, 2, but have to be assumed exponential for i = 3, 4 (see section 3.5).
The flow equations, (3.7) or (3.27), are

e_1 = p_1 e_2,   e_3 = p_3 e_2,
e_2 = e_1 + e_3 + e_4,   e_4 = p_4 e_2.

Setting e_2 = 1 (the flow equations determine the e_i only up to a
multiplicative constant), the CPU utilisation and throughput are

U_2 = ρ_2 G_4(M − 1)/G_4(M),
λ_2 = μ_2 U_2 = G_4(M − 1)/G_4(M).
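The example can be evaluated numerically with the convolution algorithm of the previous section. The following sketch uses invented parameter values (routing probabilities, rates and M are not taken from the text); node 1, being of type 3, contributes the infinite-server coefficients α_1(j) = ρ_1^j/j!.

```python
# Illustrative sketch (not from the book): evaluating the terminal system.
from math import factorial

M = 20                                    # number of terminals / jobs
p1, p3, p4 = 0.4, 0.35, 0.25              # routing probabilities out of the CPU
mu = {1: 0.1, 2: 10.0, 3: 25.0, 4: 20.0}  # mu_i: think, CPU, drum, disk rates

e = {2: 1.0, 1: p1, 3: p3, 4: p4}         # flow equations with e_2 fixed to 1
rho = {i: e[i] / mu[i] for i in e}

def alpha(i, j):
    """Node 1 (type 3) is an infinite server; the others are single servers."""
    return rho[i] ** j / factorial(j) if i == 1 else rho[i] ** j

G = [alpha(1, j) for j in range(M + 1)]
for i in (2, 3, 4):                       # convolve the remaining nodes
    G = [sum(G[s] * alpha(i, j - s) for s in range(j + 1)) for j in range(M + 1)]

U2 = rho[2] * G[M - 1] / G[M]             # CPU utilisation
lam2 = mu[2] * U2                         # CPU throughput = G4(M-1)/G4(M)
print(U2, lam2)
```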
References
1. Barbour, A. D. (1976). Networks of queues and the method of stages. Adv.
Appl. Prob., 8(3), 584–591.
2. Baskett, F., Chandy, K. M., Muntz, R. R. and Palacios, F. G. (1975). Open,
closed and mixed networks of queues with different classes of customers.
J.A.C.M., 22(2), 248–260.
3. Baskett, F. and Palacios, F. G. (1972). Processor Sharing in a Central Server
Queueing Model of Multiprogramming with Applications. Proc. 6th Ann.
Princeton Conf. on Information Science and Systems, pp. 598–603. Princeton,
New Jersey.
4. Burke, P. J. (1968). The output process of a stationary M/M/s queueing
system. Ann. of Math. Stat., 39(4), 1144–1152.
5. Burke, P. J. (1972). Output Processes and Tandem Queues. Proc. Symp.
Computer Communications Networks and Telecommunications, Brooklyn.
6. Burke, P. J. (1976). Proof of a conjecture on the interarrival-time distribution
in an M/M/1 queue with feedback. IEEE Trans. on Comm., 24(5), 175–176.
7. Buzen, J. P. (1972). Queueing Network Models of Multiprogramming.
Ph.D. Thesis, Harvard University, Cambridge, Massachusetts.
8. Chandy, K. M. (1972). The Analysis and Solutions for General Queueing
Networks. Proc. 6th Ann. Princeton Conf. on Information Science and
Systems, pp. 224–228. Princeton, New Jersey.
9. Chandy, K. M., Howard, J. H. and Towsley, D. F. (1977). Product form and
local balance in queueing networks. J.A.C.M., 24(2), 250–263.
10. Cox, D. R. (1955). A use of complex probabilities in the theory of stochastic
processes. Proc., Cambridge Phil. Soc., 51, 313–319.
11. Gordon, W. J. and Newell, G. F. (1967). Closed queueing systems with
exponential servers. Operations Research, 15, 254–265.
12. Jackson, J. R. (1957). Networks of waiting lines. Operations Research,
5(4), 518–521.
Chapter 4
Queueing Networks with Multiple
Classes of Positive and Negative Customers
and Product Form Solution
4.1. Introduction
In papers dating from the end of the 1980s and early 1990s [3, 6], new
models of queueing networks were introduced, in which customers can be
either "negative" or "positive". Positive customers are the ones that we
are used to when we model service systems: they enter a queue, wait,
receive service, and then move on to another queue, and the same thing
may happen until they finally leave the network (or continue cycling inside
it indefinitely). However, in this new model, called a Gelenbe Network or
G-Network, a positive customer may mutate into a negative customer when
it enters another queue. A negative customer vanishes if it arrives at an
empty queue; otherwise it reduces by one the number of positive customers
in the queue it enters. Furthermore, negative customers do not receive
service, so that their only effect is to reduce the amount of work at the
queue which they enter or to destroy other customers; hence the term
"negative".
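The effect of negative customers on a single queue in isolation can be illustrated by simulation. For one queue, the theory developed for these models gives a geometric stationary queue length with parameter q = λ⁺/(μ + λ⁻); the sketch below (not from the book, with invented parameter values) checks the probability of an empty queue against 1 − q.

```python
# Illustrative simulation sketch (not from the book; parameters invented).
# A single queue receives positive customers (rate lam_plus) and negative
# customers (rate lam_minus), and serves at rate mu.
import random

random.seed(7)
lam_plus, lam_minus, mu = 3.0, 1.0, 5.0
q = lam_plus / (mu + lam_minus)              # predicted utilisation; here 0.5

n, t, t_empty, horizon = 0, 0.0, 0.0, 30000.0
while t < horizon:
    rate = lam_plus + lam_minus + (mu if n > 0 else 0.0)
    dt = random.expovariate(rate)
    if n == 0:
        t_empty += min(dt, horizon - t)      # time spent empty
    t += dt
    u = random.uniform(0.0, rate)
    if u < lam_plus:
        n += 1                               # positive customer joins
    elif u < lam_plus + lam_minus:
        n = max(n - 1, 0)                    # negative customer kills one, or vanishes
    else:
        n -= 1                               # service completion (only possible if n > 0)
print(t_empty / horizon, 1 - q)              # empirical vs predicted P(empty)
```

With these rates q = 0.5, so the queue should be found empty about half the time.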
It has been shown [6] that networks of queues with a single class
of positive and negative customers have a product form solution if the
external positive or negative customer arrivals are Poisson, the service
times of positive customers are exponential and independent, and if the
movement of customers between queues is Markovian. This chapter will
discuss the theory of G-networks as it applies to networks of queues with
multiple classes of positive and negative customers, with direct relations
of destruction among negative customers of certain classes and positive
customers of certain other classes. We will also allow changes among
customer classes, as is usual in such models. Of course, as indicated in
previous chapters of this book, the classical reference for multiple class
queueing network models is [2] and the related theory is discussed there, and
in other sources. Multiple class queueing networks which include negative
customers were first developed in [19] and generalised in [20]. The extension
of the original model [6] to multiple classes has also been discussed in [12].
Some applications of G-networks are summarised in [18]. G-Networks
can be used to represent a variety of systems. The initial model [6] was
motivated by the analogy with neural networks [4, 11]: each queue represents
a neuron, and customers represent excitation (positive) or inhibition (negative)
signals. Indeed, signals in biophysical neurons, for instance in the brain
of mammals, also take the form of random trains of impulses of constant
size, just like customers travelling through a queueing network. Results
similar to the ones presented in this chapter have been used in [9, 25], where
signal classes correspond to different "colours" in images. Other applications,
including to networking problems [17], have also been developed.
Another application is to multiple resource systems: positive customers
can be considered to be resource requests, while negative customers can
correspond to decisions to cancel such requests. G-Networks have been
applied to model systems where redundancy is used to protect the system's
operation against failures: work is scheduled on two different processors
and then cancelled at one of the two processors as soon as the work is
successfully completed at the other, as detailed in [8].
The single server queue with negative and positive customers has
been discussed in [7], while stability conditions for G-Networks were first
obtained under general conditions in [10]. G-Networks with "triggers", which
are specific customers which can re-route other customers [14], and batch
removal of customers by negative customers, have been introduced in [15].
Additional primitives for these networks have also been introduced in [13].
The computation of numerical solutions to the non-linear traffic equations,
which will be examined in detail below, has been discussed in [5].
In this chapter we focus on G-Networks with multiple classes of
positive customers and one or more classes of negative customers,
together with three types of service centers and service disciplines:
Type 1: first-in-first-out (FIFO),
Type 2: processor sharing (PS),
Type 4: last-in-first-out with preemptive resume priority (LIFO/PR).
With reference to the usual terminology related to the BCMP
theorem [2], we exclude from the present model the Type 3 service centers
The following constraint must hold for all stations i of Type 1 and classes
of negative customers m such that Σ_{j=1}^{N} Σ_{l=1}^{R} P⁻[j, i][l, m] > 0:

for all classes of positive customers k and p,  K_{i,m,k} = K_{i,m,p}.   (3)

This constraint implies that a negative customer of some class m arriving
from the network does not distinguish between the positive customer
classes it will try to delete, and that it will treat them all in the same
manner.
For a Type 2 server, the probability that any one positive customer of
the queue is selected by the arriving negative customer is 1/c if c is the
total number of customers in the queue.
For Type 1 service centers, one may consider the following conditions,
which are simpler than (2) and (3):

μ_{i,k} = μ_{i,p},   K_{i,m,k} = K_{i,m,p},   (4)

for all classes of positive customers k and p, and all classes of negative
customers m. Note however that these new conditions are more restrictive,
though they do imply that (2) and (3) hold.
q_{i,k} = (Λ_{i,k} + λ⁺_{i,k}) / (μ_{i,k} + Σ_{m=1}^{S} K_{i,m,k}[λ_{i,m} + λ⁻_{i,m}]),   (5)

λ⁺_{i,k} = Σ_{j=1}^{N} Σ_{l=1}^{R} P⁺[j, i][l, k] μ_{j,l} q_{j,l},   (6)

λ⁻_{i,m} = Σ_{j=1}^{N} Σ_{l=1}^{R} P⁻[j, i][l, m] μ_{j,l} q_{j,l},   (7)

where Λ_{i,k} and λ_{i,m} denote the external Poisson arrival rates of positive
customers of class k and of negative customers of class m at station i.
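Equations (5)-(7) are non-linear, but, as discussed in [5], they can be solved numerically by successive substitution. The following sketch (not from the book) sets up a small invented instance with N = 2 stations, R = 2 positive classes and S = 1 negative class, iterates (5)-(7) to a fixed point, and then checks the flow equation (12) at the solution. The array index order `P_plus[j, i, l, k]` encodes P⁺[j, i][l, k], and all numerical values are assumptions.

```python
# Illustrative sketch (not from the book): fixed point of the traffic equations.
import numpy as np

N, R, S = 2, 2, 1
Lam = np.array([[1.0, 0.5], [0.2, 0.3]])      # external positive rates Lambda[i,k]
lam_ext = np.array([[0.1], [0.2]])            # external negative rates lambda[i,m]
mu = np.array([[4.0, 5.0], [6.0, 4.0]])       # service rates mu[i,k]
K = np.ones((N, S, R))                        # K[i,m,k]: delete one customer

P_plus = np.zeros((N, N, R, R))               # P_plus[j,i,l,k] = P+[j,i][l,k]
P_minus = np.zeros((N, N, R, S))              # P_minus[j,i,l,m] = P-[j,i][l,m]
P_plus[0, 1, 0, 0] = 0.5                      # station 1 class 1 -> station 2, positive
P_plus[1, 0, 1, 1] = 0.4
P_minus[0, 1, 1, 0] = 0.3                     # station 1 class 2 -> station 2, negative
d = 1 - P_plus.sum(axis=(1, 3)) - P_minus.sum(axis=(1, 3))   # departure probs d[j,l]

q = np.zeros((N, R))
for _ in range(200):                          # successive substitution
    lp = np.einsum('jilk,jl->ik', P_plus, mu * q)            # eq. (6)
    lm = np.einsum('jilm,jl->im', P_minus, mu * q)           # eq. (7)
    q = (Lam + lp) / (mu + np.einsum('imk,im->ik', K, lam_ext + lm))   # eq. (5)

lp = np.einsum('jilk,jl->ik', P_plus, mu * q)
lm = np.einsum('jilm,jl->im', P_minus, mu * q)
assert (q < 1).all()                          # the network is stable
# flow equation (12): departures balance internal positive and negative arrivals
assert abs((q * mu * (1 - d)).sum() - (lp.sum() + lm.sum())) < 1e-8
print(q)
```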
where each g_i(x_i) depends on the type of service center i. Summing the
traffic equations over all stations and all classes yields the following flow
equation:

Σ_{i=1}^{N} Σ_{k=1}^{R} q_{i,k} μ_{i,k} (1 − d[i, k]) = Σ_{i=1}^{N} Σ_{k=1}^{R} λ⁺_{i,k} + Σ_{i=1}^{N} Σ_{m=1}^{S} λ⁻_{i,m}.   (12)
Proof. Consider (6), then sum it for all the stations and all the classes and
exchange the order of summations in the right-hand side of the equation:

Σ_{i=1}^{N} Σ_{k=1}^{R} λ⁺_{i,k} = Σ_{j=1}^{N} Σ_{l=1}^{R} μ_{j,l} q_{j,l} (Σ_{i=1}^{N} Σ_{k=1}^{R} P⁺[j, i][l, k]),
and, adding the corresponding sum of (7),

Σ_{i=1}^{N} Σ_{k=1}^{R} λ⁺_{i,k} + Σ_{i=1}^{N} Σ_{m=1}^{S} λ⁻_{i,m}
  = Σ_{j=1}^{N} Σ_{l=1}^{R} μ_{j,l} q_{j,l} (Σ_{i=1}^{N} Σ_{k=1}^{R} P⁺[j, i][l, k] + Σ_{i=1}^{N} Σ_{m=1}^{S} P⁻[j, i][l, m]).
Since the routing probabilities out of the pair (j, l) sum to 1 − d[j, l], this gives

Σ_{i=1}^{N} Σ_{k=1}^{R} λ⁺_{i,k} + Σ_{i=1}^{N} Σ_{m=1}^{S} λ⁻_{i,m} = Σ_{j=1}^{N} Σ_{l=1}^{R} μ_{j,l} q_{j,l} (1 − d[j, l]).
The state dependent service rates for customers at service center j will
be denoted by M_{j,l}(x_j), where x_j refers to the state of the service center
and l is the class of the customer concerned. From the definition of the
service rate μ_{j,l}, we obtain for the three types of stations:

FIFO and LIFO/PR:  M_{j,l}(x_j) = μ_{j,l} 1{r_{j,1} = l},
PS:  M_{j,l}(x_j) = μ_{j,l} x_{j,l}/|x_j|.

N_{j,l}(x_j) is the deletion rate of class l positive customers due to external
arrivals of all the classes of negative customers:

FIFO and LIFO/PR:  N_{j,l}(x_j) = 1{r_{j,1} = l} Σ_{m=1}^{S} K_{j,m,l} λ_{j,m},
PS:  N_{j,l}(x_j) = (x_{j,l}/|x_j|) Σ_{m=1}^{S} K_{j,m,l} λ_{j,m}.
and

λ⁻_{i,m} = Σ_{j=1}^{N} Σ_{l=1}^{R} [g_j(x_j + e_{j,l})/g_j(x_j)] M_{j,l}(x_j + e_{j,l}) P⁻[j, i][l, m].   (17)
Then, for the three types of service centers,

1{|x_i| > 0} Φ_i(x_i) = 1{|x_i| > 0} Σ_{m=1}^{S} λ⁻_{i,m}.
Then, we substitute the values of Y_{i,m}, M_{i,k}, N_{i,k} and A_{i,k} for a LIFO
station:

1{|x_i| > 0} Φ_i(x_i) = 1{|x_i| > 0} Σ_{k=1}^{R} 1{r_{i,1} = k} (Λ_{i,k} + λ⁺_{i,k})/q_{i,k}
  − 1{|x_i| > 0} Σ_{k=1}^{R} 1{r_{i,1} = k} μ_{i,k}
  − 1{|x_i| > 0} Σ_{k=1}^{R} Σ_{m=1}^{S} 1{r_{i,1} = k} K_{i,m,k} λ_{i,m}
  + 1{|x_i| > 0} Σ_{m=1}^{S} λ⁻_{i,m} Σ_{k=1}^{R} 1{r_{i,1} = k} (1 − K_{i,m,k}).
We use the value of q_{i,k} from equation (5) to obtain, after some
cancellations of terms:

1{|x_i| > 0} Φ_i(x_i) = 1{|x_i| > 0} Σ_{k=1}^{R} 1{r_{i,1} = k}
  [Σ_{m=1}^{S} K_{i,m,k} λ⁻_{i,m} + Σ_{m=1}^{S} λ⁻_{i,m} (1 − K_{i,m,k})]
  = 1{|x_i| > 0} Σ_{m=1}^{S} λ⁻_{i,m} Σ_{k=1}^{R} 1{r_{i,1} = k},

and as 1{|x_i| > 0} Σ_{k=1}^{R} 1{r_{i,1} = k} = 1{|x_i| > 0}, we finally get the result:

1{|x_i| > 0} Φ_i(x_i) = 1{|x_i| > 0} Σ_{m=1}^{S} λ⁻_{i,m}.   (18)
m=1
Similarly, for a FIFO station, we substitute the values of Y_{i,m}, M_{i,k},
N_{i,k}, A_{i,k} and q_{i,k}:

1{|x_i| > 0} Φ_i(x_i) = 1{|x_i| > 0} Σ_{k=1}^{R} 1{r_{i,|x_i|} = k}
  [μ_{i,k} + Σ_{m=1}^{S} K_{i,m,k} λ_{i,m} + Σ_{m=1}^{S} K_{i,m,k} λ⁻_{i,m}]
  − 1{|x_i| > 0} Σ_{k=1}^{R} 1{r_{i,1} = k} μ_{i,k}
  − 1{|x_i| > 0} Σ_{k=1}^{R} Σ_{m=1}^{S} 1{r_{i,1} = k} K_{i,m,k} λ_{i,m}
  + 1{|x_i| > 0} Σ_{m=1}^{S} λ⁻_{i,m} Σ_{k=1}^{R} 1{r_{i,1} = k} (1 − K_{i,m,k}).
We separate the last term into two parts, and regroup terms:

1{|x_i| > 0} Φ_i(x_i) = 1{|x_i| > 0} Σ_{k=1}^{R} 1{r_{i,|x_i|} = k}
  [μ_{i,k} + Σ_{m=1}^{S} K_{i,m,k} λ_{i,m} + Σ_{m=1}^{S} K_{i,m,k} λ⁻_{i,m}]
  − 1{|x_i| > 0} Σ_{k=1}^{R} 1{r_{i,1} = k}
  [μ_{i,k} + Σ_{m=1}^{S} K_{i,m,k} λ_{i,m} + Σ_{m=1}^{S} K_{i,m,k} λ⁻_{i,m}]
  + 1{|x_i| > 0} Σ_{m=1}^{S} λ⁻_{i,m} Σ_{k=1}^{R} 1{r_{i,1} = k}.
Conditions (2) and (3) imply that the following relation must hold:

Σ_{k=1}^{R} 1{r_{i,|x_i|} = k} [μ_{i,k} + Σ_{m=1}^{S} K_{i,m,k} λ_{i,m} + Σ_{m=1}^{S} K_{i,m,k} λ⁻_{i,m}]
  = Σ_{k=1}^{R} 1{r_{i,1} = k} [μ_{i,k} + Σ_{m=1}^{S} K_{i,m,k} λ_{i,m} + Σ_{m=1}^{S} K_{i,m,k} λ⁻_{i,m}].

Thus, as 1{|x_i| > 0} Σ_{k=1}^{R} 1{r_{i,1} = k} = 1{|x_i| > 0}, we finally get the
expected result:

1{|x_i| > 0} Φ_i(x_i) = 1{|x_i| > 0} Σ_{m=1}^{S} λ⁻_{i,m}.   (19)
Finally, for a PS station,

1{|x_i| > 0} Φ_i(x_i) = 1{|x_i| > 0} Σ_{k=1}^{R} A_{i,k}(x_i)(Λ_{i,k} + λ⁺_{i,k}) g_i(x_i − e_{i,k})/g_i(x_i)
  − 1{|x_i| > 0} Σ_{k=1}^{R} M_{i,k}(x_i) − 1{|x_i| > 0} Σ_{k=1}^{R} N_{i,k}(x_i)
  + 1{|x_i| > 0} Σ_{m=1}^{S} λ⁻_{i,m} Y_{i,m}(x_i).
Finally we have:

1{|x_i| > 0} Φ_i(x_i) = 1{|x_i| > 0} Σ_{k=1}^{R} (x_{i,k}/|x_i|) Σ_{m=1}^{S} λ⁻_{i,m}.   (20)

As 1{|x_i| > 0} Σ_{k=1}^{R} x_{i,k}/|x_i| = 1{|x_i| > 0}, once again we establish the
relation we need. This concludes the proof of Lemma 3.
Let us now turn to the proof of Theorem 1. The global balance equation
of the networks considered is:

π(x) Σ_{j=1}^{N} Σ_{l=1}^{R} [Λ_{j,l} + M_{j,l}(x_j) 1{|x_j| > 0} + N_{j,l}(x_j) 1{|x_j| > 0}]
  = Σ_{j=1}^{N} Σ_{l=1}^{R} π(x − e_{j,l}) Λ_{j,l} A_{j,l}(x_j) 1{|x_j| > 0}
  + Σ_{j=1}^{N} Σ_{l=1}^{R} π(x + e_{j,l}) N_{j,l}(x_j + e_{j,l})
  + Σ_{j=1}^{N} Σ_{l=1}^{R} π(x + e_{j,l}) M_{j,l}(x_j + e_{j,l}) d[j, l]
  + Σ_{i=1}^{N} Σ_{j=1}^{N} Σ_{k=1}^{R} Σ_{l=1}^{R} M_{j,l}(x_j + e_{j,l}) π(x − e_{i,k} + e_{j,l}) P⁺[j, i][l, k] A_{i,k}(x_i) 1{|x_i| > 0},

together with the analogous terms in which the customer completing
service at station j moves to station i as a negative customer of some class
m (with probability P⁻[j, i][l, m]) and either deletes a positive customer of
some class k (factor K_{i,m,k}), fails to delete one, or finds an empty queue.
Substituting the product form (8) and dividing by π(x), this becomes:
Σ_{j=1}^{N} Σ_{l=1}^{R} [Λ_{j,l} + M_{j,l}(x_j) 1{|x_j| > 0} + N_{j,l}(x_j) 1{|x_j| > 0}]
  = Σ_{j=1}^{N} Σ_{l=1}^{R} Λ_{j,l} A_{j,l}(x_j) 1{|x_j| > 0} g_j(x_j − e_{j,l})/g_j(x_j)
  + Σ_{j=1}^{N} Σ_{l=1}^{R} Σ_{m=1}^{S} λ_{j,m} K_{j,m,l} q_{j,l} + Σ_{j=1}^{N} Σ_{l=1}^{R} μ_{j,l} q_{j,l} d[j, l]
  + Σ_{i=1}^{N} Σ_{j=1}^{N} Σ_{k=1}^{R} Σ_{l=1}^{R} μ_{j,l} q_{j,l} P⁺[j, i][l, k] A_{i,k}(x_i) [g_i(x_i − e_{i,k})/g_i(x_i)] 1{|x_i| > 0}
  + Σ_{i=1}^{N} Σ_{j=1}^{N} Σ_{k=1}^{R} Σ_{l=1}^{R} Σ_{m=1}^{S} μ_{j,l} q_{j,l} P⁻[j, i][l, m] K_{i,m,k} q_{i,k}
  + Σ_{i=1}^{N} Σ_{j=1}^{N} Σ_{l=1}^{R} Σ_{m=1}^{S} μ_{j,l} q_{j,l} P⁻[j, i][l, m] Y_{i,m}(x_i) 1{|x_i| > 0}
  + Σ_{i=1}^{N} Σ_{j=1}^{N} Σ_{l=1}^{R} Σ_{m=1}^{S} μ_{j,l} q_{j,l} P⁻[j, i][l, m] 1{|x_i| = 0}.
After some substitution, we group the first and the fourth terms of the right
side of the equation:

Σ_{j=1}^{N} Σ_{l=1}^{R} [Λ_{j,l} + M_{j,l}(x_j) 1{|x_j| > 0} + N_{j,l}(x_j) 1{|x_j| > 0}]
  = Σ_{j=1}^{N} Σ_{l=1}^{R} 1{|x_j| > 0} A_{j,l}(x_j)(Λ_{j,l} + λ⁺_{j,l}) g_j(x_j − e_{j,l})/g_j(x_j)
  + Σ_{j=1}^{N} Σ_{l=1}^{R} Σ_{m=1}^{S} λ_{j,m} K_{j,m,l} q_{j,l} + Σ_{j=1}^{N} Σ_{l=1}^{R} μ_{j,l} q_{j,l} d[j, l]
  + Σ_{i=1}^{N} Σ_{k=1}^{R} Σ_{m=1}^{S} λ⁻_{i,m} K_{i,m,k} q_{i,k} + Σ_{i=1}^{N} Σ_{m=1}^{S} λ⁻_{i,m} Y_{i,m}(x_i) 1{|x_i| > 0}
  + Σ_{i=1}^{N} Σ_{m=1}^{S} λ⁻_{i,m} 1{|x_i| = 0}.
We add to both sides the quantity Σ_{j=1}^{N} Σ_{l=1}^{R} μ_{j,l} q_{j,l} (1 − d[j, l]) and
factorise three terms in the right side:

Σ_{j=1}^{N} Σ_{l=1}^{R} [Λ_{j,l} + M_{j,l}(x_j) 1{|x_j| > 0} + N_{j,l}(x_j) 1{|x_j| > 0} + μ_{j,l} q_{j,l}(1 − d[j, l])]
  = Σ_{j=1}^{N} Σ_{l=1}^{R} 1{|x_j| > 0} A_{j,l}(x_j)(Λ_{j,l} + λ⁺_{j,l}) g_j(x_j − e_{j,l})/g_j(x_j)
  + Σ_{j=1}^{N} Σ_{l=1}^{R} q_{j,l} [μ_{j,l} + Σ_{m=1}^{S} λ_{j,m} K_{j,m,l} + Σ_{m=1}^{S} λ⁻_{j,m} K_{j,m,l}]
  + Σ_{i=1}^{N} Σ_{m=1}^{S} λ⁻_{i,m} Y_{i,m}(x_i) 1{|x_i| > 0} + Σ_{i=1}^{N} Σ_{m=1}^{S} λ⁻_{i,m} 1{|x_i| = 0}.
We substitute on the r.h.s. the value of q_{i,k} in the second term. Then we
cancel the term Λ_{j,l}, which appears on both sides, and we group terms to
obtain:

Σ_{j=1}^{N} Σ_{l=1}^{R} μ_{j,l} q_{j,l} (1 − d[j, l])
  = Σ_{j=1}^{N} Σ_{l=1}^{R} λ⁺_{j,l} + Σ_{i=1}^{N} 1{|x_i| > 0} Φ_i(x_i) + Σ_{i=1}^{N} Σ_{m=1}^{S} λ⁻_{i,m} 1{|x_i| = 0},   (21)

where

Φ_i(x_i) = Σ_{m=1}^{S} λ⁻_{i,m} Y_{i,m}(x_i) − Σ_{k=1}^{R} M_{i,k}(x_i) − Σ_{k=1}^{R} N_{i,k}(x_i)
  + Σ_{k=1}^{R} A_{i,k}(x_i)(Λ_{i,k} + λ⁺_{i,k}) g_i(x_i − e_{i,k})/g_i(x_i).

By Lemma 3, 1{|x_i| > 0} Φ_i(x_i) = 1{|x_i| > 0} Σ_{m=1}^{S} λ⁻_{i,m}, so that

Σ_{j=1}^{N} Σ_{l=1}^{R} μ_{j,l} q_{j,l} (1 − d[j, l])
  = Σ_{j=1}^{N} Σ_{l=1}^{R} λ⁺_{j,l} + Σ_{i=1}^{N} Σ_{m=1}^{S} λ⁻_{i,m} (1{|x_i| = 0} + 1{|x_i| > 0}).
As in the BCMP theorem [2], we can also compute the steady state
distribution of the number of customers of each class in each queue. Let y_i
be the vector whose elements (y_{i,k}) are the numbers of customers of class k
in station i, and let y be the vector of vectors (y_i). We omit the proof of the
following result.

Theorem 2. If the system of equations (5), (6) and (7) has a solution,
then the steady state distribution π(y) is given by

π(y) = ∏_{i=1}^{N} h_i(y_i).   (22)
λ⁺ = (Λ + λ⁺) F P⁺,   λ⁻ = (Λ + λ⁺) F P⁻,   (24)

which can be rewritten as

(Λ + λ⁺)(I − F P⁺) = Λ,   (25)

λ⁻ = (Λ + λ⁺) F P⁻.   (26)
z = G(z)   (29)

has a fixed point z*. This fixed point will yield the solution of (25) and
(26) as:

λ⁻(z*) = z*,   Λ + λ⁺(z*) = Λ Σ_{n=0}^{∞} (F(z*) P⁺)^n,   (30)
and to notice that 0 ≤ F_{i,k} ≤ 1, and that (6) and (7) have now taken the
form of the generalised traffic equations (21). This completes the proof of
Proposition 2.

The above two propositions state that the traffic equations always have
a solution. Of course, the product form (8) will only exist if the resulting
network is stable. The stability condition is summarised below, and the proof
is identical to that of a similar result in [10].
4.5. Conclusion
In this chapter we have studied G-Networks. However, rather than developing
all of the theory, starting from networks with a single customer class,
we have dealt directly with G-Networks with multiple classes of positive
and negative customers. We have developed in detail both the existence
and uniqueness results for the steady-state solution of the model, and the
explicit product form solution. In the model considered, the service centers
are identical to the service centers considered in the BCMP theorem [2],
with the exception of the infinite server case, which is not considered.
However, all service times considered are exponentially distributed, with
different service rates for different classes of positive customers.

Beyond this model, and the results discussed in [20] where multiple
classes of signals are allowed (a signal being a generalisation of a negative
customer which has the ability either to destroy another customer or to move
it to another queue), further extensions of these results can be expected to
emerge from future research.
We have mentioned applications of some of these results to algorithms
for colour texture generation [9, 25], using a neural network analogy where
colours are represented by customers of different types. The model we
have described, in a simpler single class version, has also been applied to
texture recognition in medical images [21], and to optimisation problems in
computer-communication networks [22]. Other important characteristics of
these networks include their ability to approximate continuous and bounded
functions [23], which we think will lead to new developments in the field of
stochastic networks and their applications.
References
1. Kemeny, J. G. and Snell, J. L. (1965). Finite Markov Chains. Van Nostrand,
Princeton.
2. Baskett, F., Chandy, K., Muntz, R. R. and Palacios, F. G. (1975). Open,
closed and mixed networks of queues with different classes of customers.
Journal ACM, 22(2), 248–260.
Chapter 5
Markov-Modulated Queues
When the jumps of the random variable J are of size 1, i.e. when
jobs arrive and depart one at a time, the process is said to be of the
Quasi-Birth-and-Death type, or QBD (the term "skip-free" is also used,
Latouche et al. [7]). The state diagram for this common model, showing
some transitions out of state (i, j), is illustrated in Fig. 5.1.
The requirement that all transition rates cease to depend on the size of
the job queue beyond a certain threshold is not too restrictive. Note that
there is no limit on the magnitude of the threshold M , although it must be
pointed out that the larger M is, the greater the complexity of the solution.
Similarly, although jobs may arrive and/or depart in fixed or variable (but
bounded) batches, the larger the batch size, the more complex the solution.
The object of the analysis of a Markov-modulated queue is to determine
the joint steady-state distribution of the environmental phase and the
number of jobs in the system:

p_{i,j} = lim_{t→∞} P(I(t) = i, J(t) = j).

That distribution exists for an irreducible Markov process if, and only if,
the corresponding set of balance equations has a positive solution that can
be normalised.
Note that only transition (d) has a rate which depends on j, and that
dependency vanishes when j ≥ N.
Remark. The breakdown and repair processes could be generalised without
destroying the QBD nature of the process. For example, the servers could
break down and be repaired in batches, or a server breakdown could trigger
a job departure. The environmental state transitions can be arbitrary, as
long as the queue changes in steps of size 1.
In this example, as in all models where the environment state transitions
do not depend on the number of jobs present, the marginal distribution
of the number of operative servers can be determined without finding the
joint distribution first. Moreover, since the servers break down and are
repaired independently of each other, that distribution is binomial:
p_{i,·} = C(N, i) [η/(ξ + η)]^i [ξ/(ξ + η)]^{N−i} ;  i = 0, 1, ..., N, (5.4)

where ξ and η are the breakdown and repair rates of a single server.
Hence, the steady-state average number of operative servers is equal to

E(X_t) = Nη/(ξ + η). (5.5)
The overall average arrival rate is equal to

λ̄ = Σ_{i=0}^{N} p_{i,·} λ_i . (5.6)

This gives us an explicit condition for stability. The offered load must be
less than the processing capacity:

λ̄ < μ Nη/(ξ + η). (5.7)
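The binomial law (5.4) and the stability condition (5.7) are easy to check numerically. The following is a minimal sketch, assuming for simplicity a constant arrival rate λ (in general the rate λ_i may depend on the phase i); the function names are illustrative, not from the text.

```python
from math import comb

def operative_server_dist(N, xi, eta):
    # Each of the N servers is independently operative with
    # probability eta/(xi + eta), giving the binomial law (5.4).
    p_up = eta / (xi + eta)
    return [comb(N, i) * p_up**i * (1 - p_up)**(N - i) for i in range(N + 1)]

def is_stable(lam, mu, N, xi, eta):
    # Condition (5.7): offered load below average processing capacity.
    return lam < mu * N * eta / (xi + eta)

p = operative_server_dist(10, 0.05, 0.1)
mean_up = sum(i * pi for i, pi in enumerate(p))   # equals N*eta/(xi+eta)
```

For example, with N = 10, ξ = 0.05 and η = 0.1, the average number of operative servers is 20/3, so (with μ = 1) an arrival rate of 6 is stable while 7 is not.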
The only dependency on j comes from the fact that transitions (b), (c)
and (d) are not available when j = 0. In this example, the j-independency
threshold is M = 1. Note that the state (N, 0) is not reachable: node 1 may
be blocked only if there are jobs present.
F(x) = 1 − Σ_{i=0}^{N−1} α_i e^{−μ_i x} .
back to the last established checkpoint; all transactions which arrived since
that moment either remain in the queue, if they have not been processed,
or return to it, in order to be processed again (it is assumed that repeated
service times are resampled independently).
This system can be modelled as an unbounded queue of (uncompleted)
transactions, which is modulated by an environment consisting of completed
transactions and checkpoints. More precisely, the two state variables, I(t)
and J(t), are the number of transactions that have completed service since
the last checkpoint, and the number of transactions present that have not
completed service (including those requiring re-processing), respectively.
The Markov-modulated queueing process X = {[I(t), J(t)]; t ≥ 0} has
the following transitions out of state (i, j):

(a) to state (0, j + i), with rate γ;
(b) to state (0, j) (if i = N), with rate β;
(c) to state (i, j + 1), with rate λ;
(d) to state (i + 1, j − 1) (0 ≤ i < N, j > 0), with rate μ.
Because transitions (a), resulting from arrivals of faults, cause the queue
size to jump by more than 1, this is not a QBD process.
A_j = A ;  B_j = B ;  C_j = C ,  j ≥ M. (5.8)
Note that transitions (b) may represent a job arrival coinciding with
a change of phase. If arrivals are not accompanied by such changes, then
the matrices Bj and B are diagonal. Similarly, a transition of type (c) may
represent a job departure coinciding with a change of phase. Again, if such
coincidences do not occur, then the matrices Cj and C are diagonal.
By way of illustration, here are the transition rate matrices for the
model of the multiserver queue with breakdowns and repairs. In this case
the phase transitions are independent of the queue size, so the matrices Aj
are all equal:
        | 0     Nξ                      |
        | η     0     (N−1)ξ            |
A_j = A = |       2η    0      ...        |
        |             ...    ...    ξ   |
        |                    Nη     0   |

i.e. A(i, i+1) = (N − i)ξ and A(i, i−1) = iη, all other elements being 0.
Denoting by v_j = (p_{0,j}, p_{1,j}, ..., p_{N,j}) the row vector of probabilities
associated with queue size j, let D_j^A, D_j^B and D_j^C be the diagonal
matrices whose ith diagonal element is equal to the ith row sum of A_j, B_j
and C_j, respectively. Then the balance equations (5.9), for j = 0, 1, ..., can
be written as:

v_j [D_j^A + D_j^B + D_j^C] = v_{j−1} B_{j−1} + v_j A_j + v_{j+1} C_{j+1} , (5.11)

which, by (5.8), become equations with constant coefficients,

v_j [D^A + D^B + D^C] = v_{j−1} B + v_j A + v_{j+1} C , (5.12)

for j = M + 1, M + 2, ... .
where Q_0 = B, Q_1 = A − D^A − D^B − D^C and Q_2 = C.
Associated with equation (5.14) is the so-called characteristic matrix
polynomial, Q(x), defined as

Q(x) = Q_0 + Q_1 x + Q_2 x² . (5.15)

Its eigenvalues x_k and left eigenvectors u_k are defined by

det[Q(x_k)] = 0 ;  u_k Q(x_k) = 0 ;  k = 1, 2, ..., d, (5.16)
On the other hand, it is known that there cannot be more than d linearly
independent solutions (Gohberg et al., [4]). Therefore, any solution of (5.14)
can be expressed as a linear combination of the d solutions (5.17):
v_j = Σ_{k=1}^{d} α_k u_k x_k^j ;  j = M, M + 1, ..., (5.18)
Proposition 5.1. The QBD process has a steady-state distribution if, and
only if, the number of eigenvalues of Q(x) strictly inside the unit disk, each
counted according to its multiplicity, is equal to the number of states of
the Markovian environment, N + 1. Then, assuming that the eigenvectors
of multiple eigenvalues are linearly independent, the spectral expansion
solution of (5.12) has the form
v_j = Σ_{k=1}^{N+1} α_k u_k x_k^j ;  j = M, M + 1, ... . (5.20)
1. Compute the eigenvalues of Q(x), xk , inside the unit disk, and the
corresponding left eigenvectors uk . If their number is other than N + 1,
stop; a steady-state distribution does not exist.
2. Solve the finite set of linear equations (5.11), for j = 0, 1, ..., M, and
(5.13), with v_M and v_{M+1} given by (5.20), to determine the constants
α_k and the vectors v_j for j < M.
3. Use the obtained solution in order to determine various moments,
marginal probabilities, percentiles and other system performance mea-
sures that may be of interest.
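Step 1 can be carried out by reducing the quadratic eigenvalue problem for Q(x) to an ordinary one via a companion linearisation, assuming Q_2 is non-singular. The sketch below (illustrative names, not from the text) recovers, for the scalar case of an M/M/1 queue, the single eigenvalue ρ = λ/μ inside the unit disk.

```python
import numpy as np

def qbd_eigenvalues(Q0, Q1, Q2):
    """Eigenvalues x_k and left eigenvectors u_k of Q(x) = Q0 + Q1*x + Q2*x^2,
    via companion linearisation of the transposed problem (Q2 non-singular)."""
    n = Q0.shape[0]
    Q2inv = np.linalg.inv(Q2.T)
    # x [w; z] = C [w; z] with w = u^T, z = x*w
    C = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-Q2inv @ Q0.T, -Q2inv @ Q1.T]])
    x, w = np.linalg.eig(C)
    # left eigenvectors of Q(x) are the first n components, transposed
    u = w[:n, :].T
    return x, u

# Scalar example: an M/M/1 queue (one phase), lam = 1, mu = 2.
lam, mu = 1.0, 2.0
Q0 = np.array([[lam]]); Q1 = np.array([[-(lam + mu)]]); Q2 = np.array([[mu]])
x, u = qbd_eigenvalues(Q0, Q1, Q2)
inside = sorted(xk.real for xk in x if abs(xk) < 1 - 1e-9)
```

The companion form doubles the matrix dimension, but lets a standard eigensolver do all the work; here the roots of μx² − (λ+μ)x + λ are 1 and ρ = 0.5, and only ρ lies strictly inside the unit disk.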
u[Q_0 + Q_1 x + Q_2 x²] = 0. (5.21)

Multiplying this equation on the right by Q_2^{−1}, it becomes
(a) Phase transitions leaving the queue unchanged: from state (i, j) to state
(k, j) (0 ≤ i, k ≤ N; i ≠ k), with rate a_j(i, k);
(b) Transitions incrementing the queue by s: from state (i, j) to state
(k, j + s) (0 ≤ i, k ≤ N; 1 ≤ s ≤ r_1; r_1 ≥ 1), with rate b_{j,s}(i, k);
(c) Transitions decrementing the queue by s: from state (i, j) to state
(k, j − s) (0 ≤ i, k ≤ N; 1 ≤ s ≤ r_2; r_2 ≥ 1), with rate c_{j,s}(i, k),
provided of course that the source and destination states are valid.
Obviously, if r1 = r2 = 1 then this is a Quasi-Birth-and-Death process.
Denote by Aj = [aj (i, k)], Bj,s = [bj,s (i, k)] and Cj,s = [cj,s (i, k)], the
transition rate matrices associated with (a), (b) and (c), respectively. There
is a threshold M , such that
Defining again the diagonal matrices D^A, D^{B_s} and D^{C_s}, whose ith
diagonal element is equal to the ith row sum of A, B_s and C_s, respectively,
the balance equations for j > M + r_1 can be written in a form analogous
to (5.12):

v_j [D^A + Σ_{s=1}^{r_1} D^{B_s} + Σ_{s=1}^{r_2} D^{C_s}] = Σ_{s=1}^{r_1} v_{j−s} B_s + v_j A + Σ_{s=1}^{r_2} v_{j+s} C_s . (5.28)
Similar equations, involving A_j, B_{j,s} and C_{j,s}, together with the
corresponding diagonal matrices, can be written for j ≤ M + r_1.
The characteristic matrix polynomial is now of degree r_1 + r_2, with the
coefficient of x^{r_1} equal to

Q_{r_1} = A − D^A − Σ_{s=1}^{r_1} D^{B_s} − Σ_{s=1}^{r_2} D^{C_s} ,
where x_k are the eigenvalues of Q(x) in the interior of the unit disk, u_k are
the corresponding left eigenvectors, and α_k are constants (k = 1, 2, ..., c).
These constants, together with the probability vectors vj for j < M , are
determined with the aid of the state-dependent balance equations and the
normalising equation.
There are now (M + r_1)(N + 1) so-far-unused balance equations (the
ones where j < M + r_1), of which (M + r_1)(N + 1) − 1 are linearly
independent, plus one normalising equation. The number of unknowns is
M(N + 1) + c: the vectors v_j for j = 0, 1, ..., M − 1, plus the c constants
α_k. Hence, there is a unique solution when c = r_1(N + 1).
where H_ℓ = −Q_ℓ Q_r^{−1}. Introducing the vectors y_ℓ = x^ℓ u, ℓ = 1, 2, ..., r − 1,
one obtains the equivalent linear form

                        | 0              H_0     |
[u, y_1, ..., y_{r−1}]  | I   0          H_1     |  = x [u, y_1, ..., y_{r−1}],
                        |     ...  ...   ...     |
                        |          I     H_{r−1} |

where the sub-diagonal blocks are identity matrices and the last block
column contains H_0, H_1, ..., H_{r−1}. As in the quadratic case, if Q_r is
singular then the linear form can be achieved by an appropriate change of
variable.
v_j ≈ α_{N+1} u_{N+1} γ^j ,  where γ = x_{N+1}. (5.35)
This product form implies that when the queue is large, its size is
approximately independent of the environmental phase. The tail of the
marginal distribution of the queue size is approximately geometric:
v_j = α u_{N+1} γ^j , (5.37)
eigenvalues that are near a given number. Here we are dealing with the
eigenvalue that is nearest to but strictly less than 1.
If (5.37) is applied to all v_j, for j = 0, 1, ..., then the approximation
depends on just one unknown constant, α. Its value is determined by (5.13)
alone, and the expressions for v_j become

v_j = (1 − γ) [u_{N+1}/(u_{N+1} 1)] γ^j ;  j = 0, 1, ..., (5.38)

where 1 is a column vector of 1s.
This last approximation avoids completely the need to solve a set of
linear equations. Hence, it also avoids all problems associated with ill-
conditioned matrices. Moreover, it scales well. The complexity of computing
γ and u_{N+1} grows roughly linearly with N when the matrices A, B and C
are sparse. The price paid for that convenience is that the balance equations
for j ≤ M are no longer satisfied.
Despite its apparent over-simplicity, the geometric approximation
(5.38) can be shown to be asymptotically exact when the offered load
increases.
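A minimal sketch of the geometric approximation (5.38), assuming the dominant eigenvalue γ and the eigenvector u_{N+1} have already been computed (the toy values below are made up for illustration):

```python
import numpy as np

def geometric_approximation(gamma, u, jmax):
    # v_j = (1 - gamma) * [u / (u . 1)] * gamma**j, as in (5.38)
    u_norm = u / u.sum()
    return [(1 - gamma) * u_norm * gamma**j for j in range(jmax + 1)]

# toy example: two phases, dominant eigenvalue gamma = 0.8
v = geometric_approximation(0.8, np.array([0.3, 0.7]), 200)
total = sum(vj.sum() for vj in v)   # close to 1 (tail beyond jmax truncated)
```

Note that every v_j is proportional to the same normalised vector, so under this approximation the phase marginal is the same at every queue size, which is exactly the product form discussed above.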
lim_{γ→1} v_j = 0 ;  j = 0, 1, ... . (5.39)
where
The objective will be to show that both the exact distribution, where the
vectors vj are given by (5.20), and the geometric approximation, where
they are given by (5.38), have the same limiting distribution.
Consider first the exact distribution. When all eigenvalues are simple,
the equations (5.20) and (5.39) imply that

lim_{γ→1} α_k u_k = 0 ;  k = 1, 2, ..., N + 1. (5.43)
qG = 0 ;  q1 = 1, (5.44)
Moreover, in view of (5.39), equation (5.45) holds if the lower index of the
summation is j = M (or any other non-negative integer), instead of j = 0.
Substituting (5.20) into (5.45) and changing the lower summation index
to j = M yields

lim_{γ→1} Σ_{k=1}^{N+1} α_k u_k x_k^M/(1 − x_k) = q. (5.46)
However, the first N eigenvalues do not approach 1, while the last one,
x_{N+1} = γ, does. Hence, according to (5.43), the first N terms in (5.46)
vanish and leave

lim_{γ→1} α_{N+1} u_{N+1}/(1 − γ) = q. (5.47)
Now, substituting (5.20) into (5.42), and arguing as for (5.47), we see
that only the term involving the dominant eigenvalue survives:
h(s) = lim_{γ→1} Σ_{j=M}^{∞} e^{−s(1−γ)j} Σ_{k=1}^{N+1} α_k u_k x_k^j
     = lim_{γ→1} Σ_{k=1}^{N+1} α_k u_k Σ_{j=M}^{∞} x_k^j e^{−s(1−γ)j}
     = lim_{γ→1} Σ_{k=1}^{N+1} α_k u_k x_k^M e^{−s(1−γ)M}/(1 − x_k e^{−s(1−γ)})
     = lim_{γ→1} α_{N+1} u_{N+1}/(1 − γ e^{−s(1−γ)}). (5.48)

In view of (5.47), this can be written as

h(s) = q lim_{γ→1} (1 − γ)/(1 − γ e^{−s(1−γ)}) = q/(1 + s). (5.49)
The last limit follows from L'Hospital's rule. The Laplace transform
appearing in the right-hand side of (5.49) is that of the exponential
distribution with mean 1. Thus we have established the following rather
general result:
h̃(s) = lim_{γ→1} (1 − γ) [u_{N+1}/(u_{N+1} 1)] [1/(1 − γ e^{−s(1−γ)})]
     = [1/(1 + s)] lim_{γ→1} u_{N+1}/(u_{N+1} 1), (5.50)

again using L'Hospital's rule.
The last limit in the right-hand side of (5.50) is simply the vector
q. This can be seen by arguing that the normalised left eigenvector of
the eigenvalue γ must approach the normalised left eigenvector of the
eigenvalue 1. Alternatively, multiply both sides of (5.47) by the column
vector 1:

lim_{γ→1} α_{N+1} (u_{N+1} 1)/(1 − γ) = 1. (5.51)
Hence rewrite (5.47) as

lim_{γ→1} u_{N+1}/(u_{N+1} 1) = q. (5.52)
Thus we have

h̃(s) = q [1/(1 + s)] = h(s). (5.53)
So, in heavy traffic, the geometric approximation is asymptotically
exact, in the sense that it yields the same limiting normalised distribution
of environmental phase and queue size as the exact solution.
Fig. 5.4. Manufacturing blocking: Average node 1 queue size against arrival rate,
N = 10; both service rates equal to 1.
at nodes 1 and 2 are the same. Hence, the busier node 1 is, the higher the
likelihood that the buffer will fill up and cause blocking. Because of that,
the saturation point is not at λ = 1 (as it would be if node 1 were isolated),
but at approximately λ = 0.909.
The geometric approximation for the marginal distribution of the
environmental variable, I, indicating the number of jobs at node 2 and
whether or not node 1 is blocked, is given by (5.38) as q ≈ u_{N+1}/(u_{N+1} 1).
Since there are two environmental states, I = N − 1 and I = N, representing
N − 1 jobs at node 2, the average length of the node 2 queue, L_2, is given by

L_2 = Σ_{i=1}^{N−1} i q_i + (N − 1) q_N ,
where q_i is the (i+1)st element of the vector q. Figure 5.5 compares the exact
value of L2 with that provided by the geometric approximation, for the same
parameters as in Fig. 5.4. It can be seen that this time the approximation
is relatively less accurate, and converges to the exact solution more slowly.
Intuitively, this is due to the fact that, in order to obtain an accurate value
for L_2, all elements of q need to be accurate, whereas in a heavily loaded
unbounded queue only the tail of the distribution is important.
In Fig. 5.6, the average unbounded queue size is plotted against N .
Increasing the size of the finite buffer enlarges the environmental state
Fig. 5.5. Manufacturing blocking: Average node 2 queue size against arrival rate,
N = 10; both service rates equal to 1.
Fig. 5.6. Manufacturing blocking: Average node 1 queue size against N, λ = 0.8;
both service rates equal to 1.
Fig. 5.7. Breakdowns and repairs: Average queue size against arrival rate, N = 10,
μ = 1, ξ = 0.05, η = 0.1.
Fig. 5.8. Breakdowns and repairs: Average queue size against number of servers, λ = 6,
μ = 1, ξ = 0.05, η = 0.1.
5.11. Remarks
The presentation in this chapter is based on material from [8, 10, 11]. It
is perhaps worth mentioning that there are two other solution techniques
that can be used in the context of Markov-modulated queues. These are
the matrix-geometric method (Neuts, [12]) and the generating functions
method (as applied, for example, in [9]). However, we have chosen to
concentrate on the spectral expansion solution method because it is
versatile, readily implementable and efficient. A strong case can be made
for using it, whenever possible, in preference to the other methods [10].
An additional point in its favour is that it provides the basis for a simple
approximate solution.
The geometric approximation is valid for a large class of heavily loaded
systems. The arguments presented here do not rely on any particular model
structure. One could relax the QBD assumption and allow batch arrivals
References
1. Buzacott, J. A. and Shanthikumar, J. G. (1993). Stochastic Models of
Manufacturing Systems, Prentice-Hall.
2. Daigle, J. N. and Lucantoni, D. M. (1991). Queueing systems having phase-
dependent arrival and service rates, in Numerical Solutions of Markov
Chains, (ed. W. J. Stewart), Marcel Dekker.
3. Gail, H. R., Hantler, S. L. and Taylor, B. A. (1996). Spectral analysis of
M/G/1 and G/M/1 type Markov chains, Adv. in Appl. Prob., 28, 114–165.
4. Gohberg, I., Lancaster, P. and Rodman, L. (1982). Matrix Polynomials,
Academic Press.
5. Jennings, A. (1977). Matrix Computations for Engineers and Scientists,
Wiley.
6. Konheim, A. G. and Reiser, M. (1976). A queueing model with finite waiting
room and blocking, JACM, 23(2), 328–341.
7. Latouche, G., Jacobs, P. A. and Gaver, D. P. (1984). Finite Markov chain
models skip-free in one direction, Naval Res. Log. Quart., 31, 571–588.
8. Mitrani, I. (2005). Approximate Solutions for Heavily Loaded Markov
Modulated Queues, Performance Evaluation, 62, 117–131.
9. Mitrani, I. and Avi-Itzhak, B. (1968). A many-server queue with service
interruptions, Operations Research, 16(3), 628–638.
10. Mitrani, I. and Chakka, R. (1995). Spectral expansion solution for a class
of Markov models: Application and comparison with the matrix-geometric
method, Performance Evaluation.
11. Mitrani, I. and Mitra, D. (1991). A spectral expansion method for random
walks on semi-infinite strips, IMACS Symposium on Iterative Methods in
Linear Algebra, Brussels.
12. Neuts, M. F. (1981). Matrix Geometric Solutions in Stochastic Models, Johns
Hopkins University Press.
13. Neuts, M. F. and Lucantoni, D. M. (1979). A Markovian queue with N servers
subject to breakdowns and repairs, Management Science, 25, 849–861.
Chapter 6
Diffusion Approximation Methods
for General Queueing Networks
6.1. Introduction
Although considerable progress has been made in obtaining exact solutions
for large classes of queueing network models, one particularly simple type of
network, an arbitrary network with first-come-first-served (FCFS) service
discipline and general distribution function of service time at the servers,
has proved to be resilient to all approaches except for approximate solution
techniques. In this chapter our attention is limited to this type of queueing
network.
Several approximation methods have been suggested for its treatment.
On the one hand there are diffusion approximations [9, 10, 20] applicable
to two-station networks or to general queueing networks [11, 12, 13, 16, 22]
and on the other hand we have iterative techniques [5, 23]. The convergence
of the latter to the exact solution is not an established fact, whereas the
former are known to tend, in certain simple cases, to the exact solution.
Most of the work published in the literature has concentrated on
evaluating the joint probability distribution of queue lengths for all the
queues in a network, but it is seldom possible to make use of this complete
information. In measurements on computer systems it is difficult enough to
collect data on the performance of a single resource, and the measurement
of joint data for several resources could become very time- and space-
consuming. The same can be said of simulation experiments where the task
of computing confidence intervals for estimated joint statistics becomes
impractical. Furthermore, when it comes to computing average response
times or queue lengths it suffices to know the average response time
encountered in each individual queue. Therefore it would suffice in many
cases to be able to compute with satisfactory accuracy the probability
distribution for the queue length at each individual resource.
There are two ways in which we may intuitively understand the basis
for diffusion approximations. The first uses a numerical analysis analogy,
while the second calls upon the central limit theorem.
∂f(x,t)/∂t + b ∂f(x,t)/∂x − (α/2) ∂²f(x,t)/∂x² = 0 (6.1)
where f (x, t) is a function of space, the x variable, and of time t. f (x, t) is
chosen to be, for each t, the probability density function of a non-negative
random variable X(t):
Equation (6.1) can be solved if an initial condition (f(x, 0) for all values
of x ≥ 0) and a boundary condition (conditions which must be satisfied
by f(0, t) and ∂f(0, t)/∂t) are provided. We shall consider the following
boundary condition, given in terms of P(t), a probability mass, function of
time, located at the boundary point x = 0:

dP(t)/dt = −cP(t) + lim_{x→0+} [−b f(x,t) + (α/2) ∂f(x,t)/∂x] ,
f(0, t) = 0. (6.2)
Equations (6.1) and (6.2) are to be viewed, for the moment, simply as
formal relations. The interpretation in terms of queueing phenomena can
be obtained either in terms of their discretisation (as will be done here), or
via the central limit theorem as in section 2.2.
so that

Σ_{i=0}^{M} p_i(t) ≈ ∫_0^M f(x, t) dx + P(t).
c = α/2 ,  b = 0 ,  Δx = 1 ,

we see that (6.5) are the Chapman–Kolmogorov equations for the M/M/1
queue with arrival rate equal to the service rate (see Chapter 1). Thus
we can conclude that for this special case (λ = μ), the diffusion equation is
approximated by the equation for the M/M/1 queue and vice versa.
If we seek the stationary solution of (6.1), (6.2) or (6.5), then these
equations are approximations of each other under general conditions.
λ = (α + b)/2 ,  μ = (α − b)/2  (i.e. b = λ − μ, α = λ + μ),
c = λ(1 + b/α) = 2λ²/α ,

where P and f(x) denote the stationary solution. Similarly, for (6.5) we
have

p_i = (λ/μ)^i (1 − λ/μ) ,
P ≈ p_0 ;  f(i) ≈ p_i ,  i ≥ 1.
or
E[Q(T)] = (λ − μ)T = bT
and
∂f(x,t)/∂t + b ∂f(x,t)/∂x − (α/2) ∂²f(x,t)/∂x² = 0
where {X(t), t ≥ 0} is the continuous-path stochastic process approximating
the number in queue.
Since the approach was initially intended for heavy traffic conditions,
it is also assumed that the lower boundary at x = 0 for the process
{X(t), t ≥ 0} should act as a reflecting boundary. This last assumption
implies that no probability mass can collect at x = 0.
From (6.1) we may write (for f = f(x, t))

∫_{0+}^{∞} (∂f/∂t) dx = ∫_{0+}^{∞} [−b (∂f/∂x) + (α/2)(∂²f/∂x²)] dx = [−b f + (α/2)(∂f/∂x)]_{0+}^{∞} .

The left-hand side must be zero because the total probability mass is one,
and no probability mass collects at x = 0. Therefore, for all t ≥ 0,

b f(0+, t) = (α/2)(∂f/∂x)(0+, t) .
f(x) = θ e^{−θx} ,  x ≥ 0,
     = 0 ,          x < 0,

where θ = −2b/α; the corresponding discrete distribution is

p(0) = 1 − ρ̂ ;  p(i) = (1 − ρ̂) ρ̂^i ,  i ≥ 1,

for ρ̂ = e^{2b/α}.
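The geometric discretisation of the reflecting-boundary solution can be checked against the exact M/M/1 distribution, for which b = λ − μ and α = λ + μ. A minimal sketch under those assumptions (function name illustrative):

```python
import math

def reflecting_diffusion_pmf(lam, mu, imax):
    # p(i) = (1 - rho_hat) * rho_hat**i with rho_hat = exp(2b/alpha),
    # taking b = lam - mu and alpha = lam + mu (M/M/1 moments)
    b, alpha = lam - mu, lam + mu
    rho_hat = math.exp(2 * b / alpha)
    return [(1 - rho_hat) * rho_hat**i for i in range(imax + 1)]

p = reflecting_diffusion_pmf(0.9, 1.0, 5000)
mean_approx = sum(i * pi for i, pi in enumerate(p))
mean_exact = 0.9 / (1 - 0.9)   # M/M/1 mean queue length: rho/(1 - rho)
```

At ρ = 0.9 the approximate mean queue length is within about 0.1% of the exact value, illustrating the heavy-traffic accuracy of the diffusion approximation.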
where

a_i = 1                if i = 1,
a_i = b_1 ··· b_{i−1}  if i > 1,  with 0 < b_i ≤ 1;

and, correspondingly,

A_i = 1                if i = 1,
A_i = B_1 ··· B_{i−1}  if i > 1,  with 0 < B_i ≤ 1.
At the end of the holding time at the upper boundary the particle
jumps back instantaneously to a random point in ]0, M [ whose position
is determined by the probability density function f_2(x). f_1(x) and f_2(x)
may be taken to be functions of the instants at which the jumps occur.
Notice that
E[h] = Σ_{i=1}^{n} a_i/λ_i ;
similarly,
E[H] = Σ_{i=1}^{m} A_i/μ_i .
λ = (E[h])^{−1} ,  μ = (E[H])^{−1} .
A_{x,t} f = −(∂f/∂t) − b (∂f/∂x) + (α/2)(∂²f/∂x²)

and

C_{x,t} f = −b f + (α/2)(∂f/∂x) .
Also, let P_i(t), 1 ≤ i ≤ n, be the probability that the particle is in the
i-th stage of the holding time at the lower boundary at time t, while Q_i(t),
1 ≤ i ≤ m, is the probability that it is in the i-th stage of the holding time
at the upper boundary at time t. The equations describing the evolution of
the particle are
A_{x,t} f + Σ_{i=1}^{n} λ_i (1 − b_i) P_i(t) f_1(x) + Σ_{i=1}^{m} μ_i (1 − B_i) Q_i(t) f_2(x) = 0 (6.6)

dP_i(t)/dt = −λ_1 P_1(t) + C_{0,t} f                   if i = 1
           = −λ_i P_i(t) + λ_{i−1} b_{i−1} P_{i−1}(t)  if 1 < i ≤ n (6.7)

dQ_i(t)/dt = −μ_1 Q_1(t) − C_{M,t} f                   if i = 1
           = −μ_i Q_i(t) + μ_{i−1} B_{i−1} Q_{i−1}(t)  if 1 < i ≤ m (6.8)
where

C_{0,t} f = lim_{x→0} [−b f + (α/2)(∂f/∂x)]

C_{M,t} f = lim_{x→M} [−b f + (α/2)(∂f/∂x)]

and λ = (E[h])^{−1}, μ = (E[H])^{−1}.
Define P(t) as the probability that the particle is at the lower boundary
at time t, and let Q(t) be the corresponding quantity for the upper
boundary:
P(t) = Σ_{i=1}^{n} P_i(t) ,   Q(t) = Σ_{i=1}^{m} Q_i(t) .
dQ(t)/dt = −Σ_{i=1}^{m} μ_i (1 − B_i) Q_i(t) − C_{M,t} f . (6.10)

+ Σ_{i=1}^{m} μ_i (1 − B_i) Q_i(t) ∫_Γ f_2(x) dx (6.11)
which states that the rate of change of the probability mass in Γ is equal
to the rate of flow of the probability mass out of Γ (the first term on the
right-hand side of (6.11)) plus the rate of flow into Γ from x = 0 and from
x = M (the second and third terms, respectively, on the right-hand side). In
order to deduce (6.7), notice that for 1 < i ≤ n we may write, for any t ≥ 0,
since the time the particle spends in any one of the stages of the Cox
distribution is exponentially distributed; by collecting terms, dividing both
sides by Δt and taking Δt → 0, this yields (6.7) for 1 < i ≤ n in the
P_1 = λ_1^{−1} C_0 f ,   P_i = (λ_{i−1} b_{i−1}/λ_i) P_{i−1} ,  1 < i ≤ n,

so that

P_i = λ_i^{−1} b_1 ··· b_{i−1} C_0 f = (a_i/λ_i) C_0 f ,  1 < i ≤ n.

Therefore,

P = Σ_{i=1}^{n} P_i = λ^{−1} C_0 f ;

similarly,

Q = −μ^{−1} C_M f .
But

Σ_{i=1}^{n} λ_i (1 − b_i) P_i = Σ_{i=1}^{n} a_i (1 − b_i) C_0 f = C_0 f

and similarly

Σ_{i=1}^{m} μ_i (1 − B_i) Q_i = −C_M f .
Since these equations depend on E[h] and E[H] only we have proved
that the stationary probabilities f (x), P, Q are independent of the higher
moments of h and H.
This result shows that if we are interested in approximating the
stationary queue length probability distribution using the instantaneous
return process, it suffices to use a model where h and H are exponentially
distributed. This is the assumption we will make in the sequel.
b = λ − μ ,
α = λ³ V_a + μ³ V_s = λ K_a² + μ K_s² .
queue in a busy period to the first arrival of the next busy period. If the
arrival process is Poisson it is natural to take λ = (E[h])^{−1}. However, if the
arrival process is not Poisson then the interarrival time distribution and
the distribution of h need not be the same. Let us consider the case where
(E[h])^{−1} = λ.
The instantaneous return process approximation in stationary state for
the GI/G/1 queue is represented by the equations obtained from (6.12),
(6.13):
−b (∂f/∂x) + (α/2)(∂²f/∂x²) + λ P δ(x − 1) = 0 (6.15)

λP = C_0 f . (6.16)
Notice that the term f_1(x) in (6.12) has been replaced by the Dirac density
function concentrated at x = 1, δ(x − 1). This represents the fact that when
an arrival occurs, the queue length jumps instantaneously from x = 0 to
x = 1. In addition to (6.15) and (6.16) we also use f(0) = 0 and

P + ∫_{0+}^{∞} f(x) dx = 1.
R = /( + ). (6.19)
L̂ = L − (1 − K_s²)/2 ,

so that the relative error (L̂ − L)/L tends to zero as ρ → 1.
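This behaviour is easy to check numerically against the Pollaczek–Khinchine formula for the M/G/1 queue. The sketch below uses the diffusion estimate L_1 = ρ/(1 − ρ̂) with ρ̂ = e^{2b/α}, b = λ − μ, α = λK_a² + μK_s², taking μ = 1 and K_a² = 1; the names are illustrative:

```python
import math

def L_pk(rho, Ks2):
    # Exact Pollaczek-Khinchine mean queue length for M/G/1 (Ka2 = 1)
    return rho + rho**2 * (1 + Ks2) / (2 * (1 - rho))

def L_diffusion(rho, Ks2, Ka2=1.0):
    # Diffusion estimate L1 = rho/(1 - rho_hat), rho_hat = exp(2b/alpha)
    b, alpha = rho - 1.0, rho * Ka2 + Ks2   # mu = 1, lam = rho
    return rho / (1 - math.exp(2 * b / alpha))

err = lambda rho: abs(L_diffusion(rho, 2.0) - L_pk(rho, 2.0)) / L_pk(rho, 2.0)
```

With K_s² = 2, the relative error drops from about 16% at ρ = 0.6 to below 2% at ρ = 0.95, consistent with the heavy-traffic claim above.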
Fig. 6.1.
where ρ = λ/μ, and

K = (1 − ρ² e^{γ(M−1)})^{−1} ,  with γ = 2b/α.
Fig. 6.2. Maximum percentage relative error of diffusion approximation for closed two-
server system: exponential IOD; constant IOD; confidence interval for exponential IOD;
confidence interval for constant IOD.
P = 1 − ρ .
which is

p_1(0) = 1 − ρ ;  p_1(i) = ρ [1 − ρ̂] ρ̂^{i−1} ,  i ≥ 1,

where ρ̂ = e^{2b/α}; notice that this is identical to (6.5). The average queue
length obtained is then

L_1 = Σ_{i=1}^{∞} i p_1(i) = ρ/(1 − ρ̂) ,
Property (ii) is important since it states that as K_s² increases, the relative
error depends only on ρ; but the factor 2(1 − ρ) will be unacceptably
high for small values of ρ. Property (i) is a general property of diffusion
approximations: the relative error tends to zero under heavy traffic
conditions.
Let us examine how L_1 behaves when K_s² is large and ρ small, i.e. when
L_2 is a poor approximation. With K_a² = 1, we have

ρ̂ = exp[−(2/K_s²)(1 − ρ/K_s²)(1 − ρ)]

for K_s² ≫ 1, or

ρ̂ ≈ 1 − 2(1 − ρ)/K_s² ,

so that

L_1 ≈ [ρ/(2(1 − ρ))] K_s² .
L_PK = [ρ²/(2(1 − ρ))] K_s² + ρ(2 − ρ)/(2(1 − ρ)) .

Thus, for ρ → 1, L_1/L_PK → 1,

just as for L_2.
Consider now the case where ε = (1 − ρ) ≪ 1. This analysis has been
carried out in [25]. We will have
ρ̂ = 1 − 2ε/(ρ + K_s²) + 2ε²/(ρ + K_s²)² + O(ε³)

so that

L_1 = [ρ(ρ + K_s²)/(2(1 − ρ))] [1 + (1 − ρ)/(ρ + K_s²) + O(ε²)]

and

L_1 − L_PK = (K_s² − 1)/2 + O(ε) .
Therefore:

(iii) lim_{ε→0} (L_1 − L_PK)/L_PK = 0, and for ε = (1 − ρ) ≪ 1, we have

(iv) lim_{K_s²→∞} (L_1 − L_PK)/L_PK = ε + O(ε²),

while for L_2 we have (ii); thus we see that for ρ close to 1, L_1 has a relative
accuracy which is twice as good as that of L_2.
In fact, the form of the Pollaczek–Khintchine formula suggests a new
approximation, which was noticed in [10] and further developed in [6].
Instead of choosing γ as has been done above, suppose that we take

γ̂ = −2(1 − ρ)/(K_a² + K_s²) .

We may then derive

L̂_2 = ρ(1 − ρ̂)^{−1} ≈ ρ[1 + (K_a² + K_s²)/(2(1 − ρ))] ,
using the same form for the probabilities as given in (6.25), except that we
replace γ by γ̂ and ρ̂ by

ρ̂ = e^{γ̂} .
Table 6.1. Exact and approximate average queue lengths of the M/G/1
queue for ρ = 0.8
Table 6.2. Approximate average queue length for the E2/H2/1 system
compared with simulation results (95% confidence intervals) for Ka² = 0.5
The results of Table 6.3 use the well-known fact that in general the
stationary solution of the GI/M/1 queue is given by (see, for instance, [14]):

p_0 = 1 − ρ ;  p_i = ρ(1 − σ)σ^{i−1} ,  i ≥ 1,

where σ is the unique root in (0, 1) of

σ = A*(μ − μσ) ,

and A*(s) = ∫_0^∞ e^{−sx} dA(x), where A(x) is the interarrival time distribution.
Therefore, the average stationary queue length of the GI/M/1 queue
is ρ/(1 − σ).
The various analytical results and numerical examples are evidence
to the effect that the heuristic modification γ̂ should be chosen. The
corresponding diffusion parameters are b = λ − μ and α = μ(K_a² + K_s²).
These will be the values retained in the following sections, so that the
discretised approximation which we shall use is

p_0 = 1 − ρ , (6.26a)
p_i = ρ(1 − ρ̂) ρ̂^{i−1} ,  i ≥ 1, (6.26b)

with ρ̂ = e^{γ̂} .
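The discretised approximation with the heuristic exponent γ̂ = −2(1 − ρ)/(K_a² + K_s²) can be sketched as follows; for K_a² = K_s² = 1 it is directly comparable with the exact M/M/1 mean queue length (function name illustrative):

```python
import math

def gg1_diffusion_pmf(rho, Ka2, Ks2, imax):
    # p0 = 1 - rho;  p_i = rho*(1 - rho_hat)*rho_hat**(i-1), i >= 1,
    # with rho_hat = exp(-2*(1 - rho)/(Ka2 + Ks2))  (heuristic gamma_hat)
    rho_hat = math.exp(-2 * (1 - rho) / (Ka2 + Ks2))
    return [1 - rho] + [rho * (1 - rho_hat) * rho_hat**(i - 1)
                        for i in range(1, imax + 1)]

p = gg1_diffusion_pmf(0.9, 1.0, 1.0, 4000)
mean_approx = sum(i * pi for i, pi in enumerate(p))   # ~ rho/(1 - rho_hat)
```

For ρ = 0.9 the approximate mean is about 9.46 against the exact M/M/1 value of 9, an error of roughly 5%, which shrinks as ρ → 1.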
Fig. 6.3. General open queueing network with first-in-first-out service discipline.
which is unique under these assumptions (see Chapter 3). Then ei is the
expected number of visits which a customer of the network will make to
station i. The arrival rate of customers to station i is λ_i = λ₀e_i at
steady-state; also the steady-state probability ρ_i that station i contains at
least one customer is given by

ρ_i = λ₀ e_i/μ_i   if λ₀ e_i < μ_i ,
where

μ_i^{−1} = ∫_0^∞ t dF_i(t)
is the average service time for a customer at station i. This fact can be
easily established rigorously; one way is to treat an open network of this
kind as a limiting case of a closed network when one station is saturated
and to apply the work-rate theorem (see Chapter 3).
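The visit ratios e_i solve the linear traffic equations e = p₀ + eP, where P is the routing matrix among stations and p₀ the vector of entry probabilities. A minimal sketch with a hypothetical two-station network with feedback (all numbers made up for illustration):

```python
import numpy as np

def visit_ratios(P, p0):
    # Solve e = p0 + e @ P, i.e. (I - P^T) e = p0
    n = len(p0)
    return np.linalg.solve(np.eye(n) - P.T, p0)

P = np.array([[0.0, 0.5],    # station 1 -> station 2 w.p. 0.5, else exit
              [0.2, 0.0]])   # station 2 -> station 1 w.p. 0.2, else exit
p0 = np.array([1.0, 0.0])    # all external arrivals enter at station 1
e = visit_ratios(P, p0)      # e = [1/0.9, 0.5/0.9]
```

The arrival rates are then λ_i = λ₀e_i and the load factors ρ_i = λ_i/μ_i, as in the text.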
The approach we develop in this section is based on the following
assumption, which in general is unjustified: the departure process from
any station in the open network is a renewal process, i.e. times between
successive departures are independent and identically distributed. We shall
make use of this assumption in order to compute the first two moments of
the interdeparture time distribution, although it is in general not satisfied.
This assumption is valid in the open network with Poisson arrivals and
exponentially distributed service times. It is also valid for the output of
station i when λ₀e_i/μ_i → 1, or when all ρ_j → 0. Let C_i, 1 ≤ i ≤ n, be
the squared coefficient of variation of the interdeparture times at station
i, and denote by A_i the interarrival time, S_i the service time, I_i the idle
time, and by δ_i the interdeparture time. We shall define C_0 = K_0² in order
to maintain a uniform presentation.
For t large enough, and assuming that the output processes from each
individual queue are independent, the total number of arrivals to station i
in the interval [0, t] will be normally distributed with mean λ_i t and variance

Σ_{j=0}^{n} [(C_j − 1)p_{ji} + 1] λ_j p_{ji} t .
j=0
Here we have used the fact that the sum of independent normal random
variables is normal with variance being the sum of individual variances. In
the usual diffusion equations for approximating the length of each queue
(6.15), (6.16), the following parameters will be chosen, following (6.26) (see
section 6.2.6):
b_i = λ_i − μ_i ,  λ_i = λ₀ e_i ,  ρ_i = λ_i/μ_i ,

α_i = ρ_i μ_i K_i² + Σ_{j=0}^{n} [(C_j − 1)p_{ji} + 1] λ_j p_{ji} , (6.27)

where the subscript i refers to the parameters of the equations of the i-th
queue, and K_i² is the squared coefficient of variation of service time at the
i-th queue.
In order to complete the development we must obtain C_i, 1 ≤ i ≤ n. We
shall assume that δ_i is a service time S_i with probability ρ_i, or an interarrival
time plus a service time A_i + S_i with probability (1 − ρ_i). We then have

E[δ_i] = ρ_i μ_i^{−1} + (1 − ρ_i)(λ_i^{−1} + μ_i^{−1}) = λ_i^{−1} ,

E[δ_i²] = λ_i^{−2}(1 + C_i) = E[S_i²] + (1 − ρ_i)(E[A_i²] + 2E[A_i]E[S_i]) ,

so that
Finally, it is the renewal assumption on the merged arrival streams which
gives

Σ_{j=0}^{n} [(C_j − 1)p_{ji} + 1] λ_j p_{ji} = λ_i³ (E[A_i²] − λ_i^{−2}) .
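The resulting system can be solved by fixed-point iteration on the C_i. The sketch below uses the merge formula quoted in the text for the arrival-stream variance; the departure closure C_i = ρ_i²K_i² + (1 − ρ_i²)Ca_i is one common two-moment choice, not necessarily the exact coefficients of (6.28), and all names are illustrative:

```python
import numpy as np

def interdeparture_scv(lam0, p0, P, mu, Ks2, n_iter=200):
    n = len(mu)
    e = np.linalg.solve(np.eye(n) - P.T, p0)   # visit ratios
    lam = lam0 * e                              # station arrival rates
    rho = lam / mu
    C = np.ones(n)                              # start from Poisson flows
    for _ in range(n_iter):
        # variance rate of merged arrivals at station i:
        # sum over sources j of [(C_j - 1)*p_ji + 1]*lam_j*p_ji,
        # with the external source Poisson (C_0 = 1)
        var_rate = lam0 * p0
        for j in range(n):
            var_rate = var_rate + ((C[j] - 1) * P[j] + 1) * lam[j] * P[j]
        Ca = var_rate / lam                     # arrival scv at each station
        C = rho**2 * Ks2 + (1 - rho**2) * Ca    # departure closure (assumed)
    return lam, rho, C

lam, rho, C = interdeparture_scv(
    0.5, np.array([1.0, 0.0]),
    np.array([[0.0, 0.5], [0.2, 0.0]]),
    mu=np.array([1.0, 1.0]), Ks2=np.array([1.0, 1.0]))
```

With Poisson input and exponential servers (K_s² = 1) the fixed point is C_i = 1 for every station, in agreement with the remark that the renewal assumption holds exactly in that case.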
f_i(x_i, t) is the density function approximating the length of the i-th queue
and P_i(t) is the probability that the i-th queue is empty. The stationary
solution will be obtained as

f_i(x_i) = ρ_i (e^{−γ_i} − 1) e^{γ_i x_i} ,  x_i ≥ 1,
f_i(x_i) = ρ_i (1 − e^{γ_i x_i}) ,  0 ≤ x_i ≤ 1,

P_i = 1 − ρ_i ,

L_i = ρ_i [1/2 − α_i/(2b_i)] .
The system of equations (6.28) is then solved with the modified values p_{ij}
and K_i². Notice that the arrival rate to a queue is modified only if p_{ii} ≠ 0
in the original network; however, the value of the load factor ρ_i = λ_i/μ_i
is unchanged, since λ_i becomes λ_i(1 − p_{ii}) and μ_i becomes μ_i(1 − p_{ii}).
The queue length is also preserved, since service times are being replaced
by longer service times (see (ii)) which are the sum of a geometrically
distributed number of service times corresponding to the feedback of
customers to the queue. That is, S_i is being replaced by

S̃_i = Σ_{k=1}^{l} S_{ik} ,

where l is the (geometrically distributed) number of passes through the queue.
Fig. 6.4.
being [(1 21 )/2 + K12 /21 ]; the value of C1 /2 obtained from equation
(6.28) is exactly this value.
Example 6.1
In order to illustrate the degree of accuracy which can be obtained from the
approximation techniques for open single-customer class queueing networks
which we have presented in this section, we shall apply the preceding
results to the system model presented on Fig. 6.5. The model which is
shown was introduced in [1] in order to evaluate the performance of an
interactive system. In this model, jobs arrive at the system in a Poisson
stream of rate λ₀; after passing through server 1 (which represents a
central processing unit) they either leave the system with probability
(1 − p₁) or enter the queue of server 2 (representing an input-output
device). A customer will either enter once again the queue of server 2, with
probability p₂, or proceed to server 1 after finishing its service at server 2.
The following simulations and numerical results have been obtained by
Dinh [7]. The results shown in Table 6.5 provide a comparison, with
respect to simulation results, of the accuracy of the method developed
earlier in this section as well as of the approach of Reiser and Kobayashi
which we have summarised in section 6.3.2. Confidence intervals for the
values estimated from the simulation experiments have not been provided.
However, the precise conditions under which the simulations were carried
out are shown in Table 6.4 for the five simulation runs. These data indicate
that it was impossible in any of the simulations to obtain the same arrival
Table 6.4. Conditions of the five simulation runs.

Experiment No. | p1    | p2    | λ0    | K0²
1              | 0.510 | 0.503 | 0.512 | 0.941
2              | 0.509 | 0.499 | 0.410 | 0.944
3              | 0.516 | 0.506 | 0.342 | 0.945
4              | 0.512 | 0.502 | 0.293 | 0.967
5              | 0.504 | 0.507 | 0.257 | 0.952

Table 6.5. Comparison of simulated and computed mean queue lengths and utilisations.

Experiment No. | Queue No. | μi      | Ki²   | Queue length (Simulated, Computed) | Utilisation (Simulated, Computed)
1              | 1         | 0.91123 | 0.427 | 1.0489  1.045                      | 0.957  0.953
1              | 2         | 0.84000 | 0     | 1.078   1.073                      | 0.905  0.901
2              | 1         | 0.91591 | 0.423 | 0.835   0.836                      | 0.764  0.766
2              | 2         | 0.84000 | 0     | 0.848   0.849                      | 0.712  0.713
3              | 1         | 0.91443 | 0.414 | 0.707   0.706                      | 0.646  0.646
3              | 2         | 0.84000 | 0     | 0.738   0.738                      | 0.620  0.620
4              | 1         | 0.90436 | 0.432 | 0.601   0.602                      | 0.544  0.544
4              | 2         | 0.84000 | 0     | 0.619   0.619                      | 0.520  0.520
5              | 1         | 0.91094 | 0.422 | 0.519   0.519                      | 0.472  0.473
5              | 2         | 0.84000 | 0     | 0.530   0.531                      | 0.444  0.446
Table: comparison of simulation results with the approximation of this section (G/P) and that of Reiser and Kobayashi (R/K), experiments 4 and 5 (column headings only partially recoverable).

Experiment No. | Queue No. | Values (Simulation, G/P, Simulation, G/P, R/K, Simulation, R/K, G/P no mod., G/P with mod.)
4              | 1         | 1.070 0.785 1.189 0.898 0.729 1.046 0.815 0.821 0.818
4              | 2         | 2.858 0.663 3.008 0.860 0.603 1.005 0.545 0.502 0.781
5              | 1         | 1.184 0.824 1.284 0.912 0.728 0.758 0.632 0.619 0.617
5              | 2         | 3.341 0.740 3.438 0.890 0.599 0.731 0.453 0.382 0.589
    π_1 = (c_1, d_1) = r_{ij}
    π_{k+1} = (c_{k+1}, d_{k+1}) = r_{d_k, j},   if d_k ≠ j
    L = k,   if d_k = m.

    p_{kl} = Σ_{(i,j)} λ_{ij} d_{ij} / λ_k,
or the proportion of packets which enter link l after having entered link k.
The λ_k and p_kl obtained from (6.30) and (6.31) can now be used in (6.28)
where ρ_k = λ_k/μ_k, and μ_k^{-1} is the average transit time of a packet along link k.
If the usual formula [15] based on Jackson's theorem had been used, we would
have had (see Chapter 3):
Example 6.2
Let us now apply these results to a numerical example given in [4]. The
network is shown in Fig. 6.6 and the external arrival process is Poisson
with rates:

    λ_1 = 6;   λ_2 = 8.25;   λ_3 = 7.5;   λ_4 = 6.75;   λ_5 = 1.5.
Packet lengths are assumed to be constant, so that K_k² = 0 for all links
1 ≤ k ≤ 12, and the packet length is 1000 bits. Links 1, 2, 7, 8, 11, 12 have
a data-transmission capacity of 4800 bits/second, while links 3, 4, 5, 6, 9,
10 have a capacity of 48,000 bits/second. Therefore (in packets/second)

    μ_1 = μ_2 = μ_7 = μ_8 = μ_11 = μ_12 = 4.8
    μ_3 = μ_4 = μ_5 = μ_6 = μ_9 = μ_10 = 48.
        |  0   3   3   3   2 |
        |  4   0   5   5   4 |
    R = |  6   6   0   9   8 |
        | 10  10  10   0  12 |
        |  1   1   7  11   0 |
The system as described was simulated until 6000 packets were received
at their destinations. Although the simulation results have not been
analysed in order to compute statistical confidence intervals, this simulation
experiment is comparable in duration to a measurement session on a real
computer network, the main point being that the diffusion approximation
is capable of making predictions which are as accurate as simulation of such
systems. The results obtained are given in Table 6.6.
We see that for buffer queues which are lightly loaded, Jackson's
formula (6.33) yields results which are of the same degree of accuracy as
the diffusion approximation. However, for link 2, which is heavily loaded,
the diffusion approximation is considerably more accurate.
Let us complete this section by deriving another formula which is of
interest in the analysis of packet-switching networks. A useful performance
measure is the average source-destination transit delay T(i,j) for the source-
destination pair (i, j). For fixed routing this corresponds simply to the
average time to traverse the path (i, j) which corresponds to (l, m).
Therefore

    T(i,j) = Σ_{k∈(i,j)} L_k/λ_k.    (6.34)

Table 6.6. Average buffer queue lengths for the packet-switching network of Fig. 6.6.
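Formula (6.34) is Little's result applied link by link; a minimal sketch (with made-up queue lengths and throughputs, not the values of Table 6.6):

```python
# T(i,j) = sum over links k on the path of L_k / lambda_k  -- cf. (6.34).

def path_delay(L, lam, path):
    """L[k]: mean buffer queue length of link k; lam[k]: its throughput."""
    return sum(L[k] / lam[k] for k in path)

T = path_delay({1: 2.0, 2: 1.0}, {1: 4.0, 2: 2.0}, path=[1, 2])
```

Each link contributes its mean sojourn time L_k/λ_k, here 0.5 + 0.5.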
The important case of networks with finite storage capacity will not be
analysed here, and the interested reader is referred to [3] for results on this
subject.
i = ei
where

    γ_i = 2(λ_i − μ_i)/(λ_i K_A²(i) + μ_i K_i²).    (6.37)
The approximation proposed for the joint probability distribution is

    p(m_1, . . . , m_n) = Π_{i=1}^{n} p_i(m_i)    (6.38)

for the n queues in an open network. Obviously the result holds only if
λ_i < μ_i, 1 ≤ i ≤ n, which is the usual stability condition.
For a closed network, the following treatment has been suggested.
Suppose ρ_i is the utilisation of server i; the joint probability distribution is
taken to be (for M customers in the network):

    p(m_1, . . . , m_n) = G Π_{i=1}^{n} p_i(m_i)    (6.39)

where

    p_i(m_i) = 1 − ρ_i,                       m_i = 0
    p_i(m_i) = ρ_i (1 − ρ̂_i) ρ̂_i^{m_i − 1},   m_i = 1, 2, . . . , M    (6.40)
i = Xi K /Xk i (6.41)
(i) a stream of arrivals to the network which is a renewal process: its rate
is λ_{0,r} and the squared coefficient of variation of interarrival times is
K_{0,r}²;
(ii) a general service time distribution function F_{ri}(x) at the i-th service
station, 1 ≤ i ≤ n. μ_{i,r}^{-1} will be its average and K_{i,r}² its squared
coefficient of variation.
Let us denote ρ_{i,r} = λ_{i,r}/μ_{i,r}: it can be viewed as the load imposed by class
r customers on station i. We shall define
    ρ_i = Σ_{r=1}^{R} ρ_{i,r}    (6.43)

and

    λ_i = Σ_{r=1}^{R} λ_{i,r},   a_{i,r} = λ_{i,r}/λ_i   for 1 ≤ i ≤ n, 1 ≤ r ≤ R,

    λ_0 = Σ_{r=1}^{R} λ_{0,r},    (6.44)
We also obtain:

    C_i + 1 = λ_i² E[S_i²] + (1 − ρ_i)(G_i + 1) + 2ρ_i(1 − ρ_i).

The service time S_i is S_{i,r} if it is the service of a class r customer (i.e. with
probability a_{i,r} = λ_{i,r}/λ_i). Therefore

    E[S_i²] = Σ_{r=1}^{R} E[S_{i,r}²] a_{i,r}
            = Σ_{r=1}^{R} (K_{i,r}² + 1) μ_{i,r}^{-2} λ_{i,r}/λ_i

so that

    C_i + 1 = λ_i Σ_{r=1}^{R} λ_{i,r} μ_{i,r}^{-2} (K_{i,r}² + 1) + (1 − ρ_i)(G_i + 1 + 2ρ_i).    (6.46)
Using an argument similar to the one used in section 6.3, assuming that
the output processes of the n queues are mutually independent renewal
processes, we can write that the variance of the number of arrivals at the
i-th queue in a long interval (0, t) will be

    E[A_i²] − (E[A_i])² ≈ λ_i G_i t = Σ_{j=0}^{n} [(C_j − 1)p_{ji}² + 1] λ_j p_{ji} t    (6.47)

where p_{ji} is the probability that a job leaving station j enters station i, or

    p_{ji} = Σ_{r=1}^{R} Σ_{r'=1}^{R} a_{j,r} p_{j,r;i,r'}    (6.48)
Using (6.46) and (6.47) we now obtain the system of n linear equations
for the C_i, 1 ≤ i ≤ n:

    C_i = λ_i Σ_{r=1}^{R} λ_{i,r} μ_{i,r}^{-2} (K_{i,r}² + 1) + Σ_{j=0}^{n} (1 − p_{ji}) p_{ji} (λ_j/λ_i) − 2ρ_i² + Σ_{j=0}^{n} (λ_j/λ_i) p_{ji}² C_j.    (6.50)
Begin
Step 1: Obtain the λ_{i,r}, 1 ≤ i ≤ n, 1 ≤ r ≤ R, from the linear system of nR equations (6.42).
Step 2: Compute λ_i, ρ_i, ρ_{i,r} from (6.43), (6.44).
Step 3: Use (6.48) to obtain the p_{ji}, 0 ≤ j ≤ n, 1 ≤ i ≤ n.
Step 4: Obtain the C_i, 1 ≤ i ≤ n, by solving the n linear equations (6.50), using C_0 from (6.49).
Step 5: Compute G_i and G_{i,r}, 1 ≤ i ≤ n, 1 ≤ r ≤ R, from (6.47) using the result of Step 4, and using (6.51).
End.
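Step 4 is the computational core of the procedure. The sketch below (with purely hypothetical coefficients standing in for the terms of (6.50)) shows its structure: collect the equations C_i = b_i + Σ_j a_ij C_j into matrix form (I − a)C = b and solve:

```python
# Structural sketch of Step 4 (hypothetical coefficients, not the book's
# network): the C_i satisfy C_i = b_i + sum_j a[i][j] * C_j.

def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

a = [[0.0, 0.2], [0.3, 0.0]]   # hypothetical coupling terms of (6.50)
b = [0.9, 1.1]                 # hypothetical constant terms of (6.50)
# (I - a) C = b
A = [[(1.0 if i == j else 0.0) - a[i][j] for j in range(2)] for i in range(2)]
C = solve_linear(A, b)
```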
    ∂f_i/∂t = −b_i ∂f_i/∂x_i + (α_i/2) ∂²f_i/∂x_i² + λ_i P_i(t) δ(x_i − 1)

    (d/dt) P_i(t) = −λ_i P_i(t) + lim_{x_i→0+} [−b_i f_i + (α_i/2) ∂f_i/∂x_i]

with lim_{x_i→0+} f_i(x_i, t) = 0, and where P_i(t) is the probability that the queue
length is x_i = 0 at time t. The parameters for the diffusion process are
chosen to be

    b_i = λ_i − μ_i,   α_i = λ_i G_i + μ_i³ V_i    (6.52)

    P_i = 1 − λ_i/μ_i = 1 − ρ_i    (6.54)

    f_i(x_i) = ρ_i [1 − e^{γ_i x_i}],                0 ≤ x_i ≤ 1
    f_i(x_i) = ρ_i [1 − e^{γ_i}] e^{γ_i(x_i − 1)},   x_i ≥ 1    (6.55)
From the distribution for the total number in queue we will now work back
to the distribution of the number of customers of each class in queue. We
proceed as follows. Discretise the probability density function f_i(x) by using
(6.26b). p_i(n_i) will be the discrete approximation to the stationary queue
length distribution at station i. Let p_{i,r}(l_i) be the probability of finding
    W_{i,r} = L_i/λ_i − μ_i^{-1}

and

    T_{i,r} = L_i/λ_i − μ_i^{-1} + μ_{i,r}^{-1}.    (6.58)

Therefore

    T_r = Σ_{i=1}^{n} T_{i,r} λ_{i,r}/λ_{0,r}    (6.59)

since a class r customer will visit station i on average λ_{i,r}/λ_{0,r} times.
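Equation (6.59) in code form (hypothetical per-station values):

```python
# T_r = sum_i T_ir * (lam_ir / lam0r): visit-ratio weighted response times.

def class_response_time(T_ir, lam_ir, lam0r):
    return sum(T * l / lam0r for T, l in zip(T_ir, lam_ir))

# Two stations, visited on average 1 and 2 times per class-r customer.
Tr = class_response_time(T_ir=[2.0, 0.5], lam_ir=[1.0, 2.0], lam0r=1.0)
```

This gives 2.0·1 + 0.5·2 = 3.0 time units per class-r customer.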
A detailed validation of this model's predictions is given in [7] for a
computer system with two job classes. The accuracy seems to be very good.
Comparisons with simulation results reported in [7] yield a relative error of
less than 10% in average queue lengths for each class.
6.5. Conclusion
Many important practical cases of large-scale computer systems are too
complex to be represented exactly by a mathematical model. Even when a
precise mathematical model can be constructed, the analyst is faced with
a problem of dimension. Models with a number of states of the order of
10⁶ are easy to obtain, but program packages capable of solving Markov
chains of this dimension are not yet available. Often the mathematical
models which arise from computer systems have properties which make
them particularly difficult to handle numerically; for instance, the time
constants related to various parts of the system will vary widely, leading to
References
1. Anderson, H. A. and Sargent, R. (1972). The statistical evaluation of the
performance of an experimental APL/360 system. In Statistical Computer
Performance Evaluation (W. Freiberger, Ed.), pp. 73–98. Academic Press,
London.
Chapter 7
Approximate Decomposition and Iterative
Techniques for Closed Model Solution
7.1. Introduction
In general, the computer system analyst has to adapt the panoply of tools at
his disposal to the specific problem at hand, and often his problem will not
fit exactly into any available framework. If the analyst has a mathematical
orientation, and enough time available, he may attempt an original solution
method. If he is pressed for time, or if he is not mathematically inclined,
he will tend to program a simulation of the system he has to analyse, if the
programming and computer time can be afforded.
There is, however, a third approach he might take: the use of numerical
approximation, which in some cases provides only a first-order approximation,
and which in others provides highly accurate results. The diffusion
approximation developed in Chapter 4 is an example of this approach. In
this chapter we will examine a set of approximations which retain, contrary
to diffusion approximations, the discrete nature of the model. They will all
be based on a similar approach to a heuristic iterative solution of the steady-
state birth and death equations. However, in certain cases, a formal
justification will be available on the basis of problem structure, while in
other cases the only justification will be the intuitive appeal of the approach
and its similarity with techniques used in other areas of applied science.
under the effect of the rest of the system viewed as an aggregate. In certain
cases, we consider several subsystems interacting with each other and with
the rest of the model. The simplest such structure is shown in Fig. 7.1,
where the subsystem is a single queue while the aggregate contains the
remaining queues of the system. An iterative solution technique based on
an aggregate/subsystem decomposition will often iterate between several
different decompositions of the type shown in Fig. 7.1 in order to reach an
approximate solution. The stationary solution will then be framed either in
terms of marginal distributions for each subsystem or as a product of the
marginal distributions when the stationary distribution for the global network
state is desired.
In order to illustrate this common heuristic approach, we will first apply
it to a class of queueing networks for which it yields an exact result: a closed
network of exponential queues with FIFO service discipline and a single
class of customers.
n = (n1 , . . . , nN )
Finally, we relate the input flow to any queue to the output flows from all
the queues:

    λ_i = Σ_{j=1}^{N} λ_j P_{ji}
Now, assuming that the aggregate and the subsystem (the N-th queue)
interact very infrequently, we can write using (3.37)

    p(n_1, . . . , n_{N−1} | n_N) = Π_{i=1}^{N−1} (e_i/μ_i)^{n_i} / G_{N−1}(K − n_N),   Σ_{j=1}^{N−1} n_j = K − n_N,

or using (7.1):

    p_N(n_N)/p_N(n_N − 1) = (e_N/μ_N) G_{N−1}(K − n_N)/G_{N−1}(K − n_N + 1).    (7.3)
This procedure yields in fact the exact result for a Jackson network.
Using (3.37) and the argument leading to (3.38) we can see that

    p_N(n_N) = [(e_N/μ_N)^{n_N} G_N(K − n_N) − (e_N/μ_N)^{n_N+1} G_N(K − n_N − 1)]/G_N(K).

But using (3.35) we obtain

    p_N(n_N) = (e_N/μ_N)^{n_N} G_{N−1}(K − n_N)/G_N(K)
which obviously satisfies (7.3). Therefore (7.2) is in fact the exact form of
the solution.
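The normalising constants in this argument can be computed by the convolution recursion G_m(k) = G_{m−1}(k) + (e_m/μ_m) G_m(k−1) (Buzen's algorithm; this recursion is my assumed reading of (3.35)). A sketch with hypothetical relative utilisations:

```python
# Normalising constants for a closed exponential network with fixed-rate
# servers, x_i = e_i / mu_i (hypothetical values below).

def norm_constants(x, K):
    """Return G_m(.) for m = 1..N via G_m(k) = G_{m-1}(k) + x_m * G_m(k-1)."""
    G = [1.0] + [0.0] * K          # G_0(k): 1 if k == 0 else 0
    table = []
    for xm in x:
        for k in range(1, K + 1):
            G[k] = G[k] + xm * G[k - 1]   # in-place convolution step
        table.append(G[:])
    return table

x, K = [1.0, 2.0, 0.5], 4
tab = norm_constants(x, K)
GN, GN1 = tab[-1], tab[-2]         # G_N(.) and G_{N-1}(.)
# Marginal distribution of queue N, as derived above:
pN = [x[-1] ** n * GN1[K - n] / GN[K] for n in range(K + 1)]
```

Unrolling the recursion shows G_N(K) = Σ_n x_N^n G_{N−1}(K − n), so the marginal probabilities sum to one exactly.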
with (7.4) instead. We maintain the usual assumption that the system
is strongly connected (irreducible and aperiodic), i.e. each state can be
reached from any other state with non-zero probability in a finite amount
of time.
Let Γ = {Γ_1, . . . , Γ_l} be a partition of the set of service centres
{1, . . . , N}. We shall say that two state vectors n and n′ are Γ-equivalent
if and only if

    Σ_{i∈Γ_j} n_i = Σ_{i∈Γ_j} n′_i,   j = 1, . . . , l.
where n ≁_Γ n′ means that n and n′ are not Γ-equivalent. Inequality (7.5)
stipulates that transitions between states which are Γ-equivalent are far
more likely than others, and hence far more frequent. Thus, if a queueing
network is NCD on a partition Γ of its service centres, changes in the
number of customers in each element of Γ will be relatively infrequent with
respect to state transitions which do not modify that number.
We shall say that a partition Γ = {Γ_1, . . . , Γ_l} is non-trivial if
1 < l < N. An element of a non-trivial partition will be non-trivial if it
contains more than one service centre. Henceforth we will consider a non-
trivial partition; let Γ_k be one of its non-trivial elements. Each element
of Γ corresponds to a set of Γ-equivalent states:

    Ω = {Ω_1, . . . , Ω_k}.
Q may be written as

    Q = D + εE.    (7.6)

Since both Q and D are stochastic matrices, it follows that the row sums of
E will be zero. Furthermore, |e(n, n′)| ≤ 1.
It is clear from (7.7) that by a permutation of rows and columns, D
may be rewritten in block diagonal form as:

    D = diag(D_1, . . . , D_k).    (7.9)
    q = qQ    (7.10)

with

    Σ_n q(n) = 1.

The solution d of

    d = dD    (7.11)

will be used to approximate q. In fact (7.11) does not have a unique solution,
even when the condition

    Σ_n d(n) = 1

is imposed.

    q = qQ = qD + εqE    (7.12)

    q = d + δ    (7.13)

    d + δ = (d + δ)D + εqE

or

    δ(I − D) = εqE.    (7.14)
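A small numerical experiment (my own toy chain, not one of the book's models) makes the approximation concrete: for a 4-state chain built from two weakly coupled 2-state blocks, the exact stationary vector q stays within O(ε) of a block-wise vector d:

```python
# NCD illustration: Q = D + eps*E with two weakly coupled 2-state blocks.

def stationary(P, iters=20000):
    """Power iteration for the stationary row vector of a stochastic matrix."""
    n = len(P)
    v = [1.0 / n] * n
    for _ in range(iters):
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
    return v

eps = 0.01
Q = [[0.7 - eps, 0.3, eps, 0.0],       # rows sum to 1; off-block mass = eps
     [0.4, 0.6 - eps, 0.0, eps],
     [eps, 0.0, 0.2 - eps, 0.8],
     [0.0, eps, 0.5, 0.5 - eps]]
q = stationary(Q)

# Within-block stationary vectors, weighted by the exact block masses
# (one way of resolving the non-uniqueness of d = dD).
w1, w2 = q[0] + q[1], q[2] + q[3]
d1 = stationary([[0.7, 0.3], [0.4, 0.6]])
d2 = stationary([[0.2, 0.8], [0.5, 0.5]])
d = [w1 * d1[0], w1 * d1[1], w2 * d2[0], w2 * d2[1]]
err = max(abs(a - b) for a, b in zip(q, d))
```

Here `err` is of the order of ε, as the expansion in powers of ε predicts.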
Example 7.1
Let

        | 1/8  3/8  1/2 |
    A = | 1/4  1/4  1/2 |
        | 1/3  1/3  1/3 |

A is lumpable on the partition Ω = {(1, 2), (3)} and

        | 1/2  1/2 |
    Â = | 2/3  1/3 |
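Example 7.1 can be checked mechanically: a stochastic matrix is lumpable on a partition exactly when, within each block, every state has the same total transition probability into each block. A short verification in exact rational arithmetic:

```python
from fractions import Fraction as F

A = [[F(1, 8), F(3, 8), F(1, 2)],
     [F(1, 4), F(1, 4), F(1, 2)],
     [F(1, 3), F(1, 3), F(1, 3)]]
blocks = [(0, 1), (2,)]           # the partition {(1, 2), (3)}, 0-indexed

def lump(A, blocks):
    """Return the lumped matrix, or None if A is not lumpable on blocks."""
    out = []
    for B in blocks:
        rows = [[sum(A[i][j] for j in C) for C in blocks] for i in B]
        if any(r != rows[0] for r in rows):
            return None           # block sums differ within B: not lumpable
        out.append(rows[0])
    return out

A_hat = lump(A, blocks)
```

The result reproduces the matrix Â of Example 7.1.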
Example 7.2
The matrix D defined by (7.4) and shown in (7.9) is lumpable on
Ω = {Ω_1, . . . , Ω_k} and

         | 1      0 |
    D̂ =  |   ⋱     |
         | 0      1 |

is the k × k unit matrix.
    d = dD.

    δ = δ[Q − εE] + εqE.    (7.17)

    a_1 = a_1 Q + qE,
    a_{i+1} = a_{i+1} Q − a_i E,   i ≥ 1.    (7.18)
a1 .
q(n, n′), otherwise.
and by subtracting the sum of the quantities added from the diagonal
elements; ε is taken to be

    ε = max_i Σ_{j=1, j≠i}^{k} ε_{ij}.    (7.23)

This guarantees that all elements of U will be less than one in absolute
value. The following points should be noticed:
Let q̄ be the vector of stationary probabilities associated with Q̄:

    q̄ = q̄ Q̄.

Write

    q = q̄ + δ̄;

then

    q̄ + δ̄ = (q̄ + δ̄) Q̄ + εqU

or

    δ̄ = δ̄ [Q − εU] + εqU,   with   Q̄ = Q − εU.    (7.24)

We write

    δ̄ = Σ_{i=1}^{∞} ε^i b_i    (7.25)

and, to first order in ε,

    δ̄ ≈ ε b_1    (7.27)

    q ≈ d + ε b_1 + ε a_1.    (7.28)
if we write

    q − d = Σ_{i=1}^{∞} ε^i a_i
    p_1(K_1) = p_2(K − K_1) = p_1(0) Π_{i=1}^{K_1} μ_2(K − i + 1)/μ_1(i)

where

    p_1(0) = [1 + Σ_{K_1=1}^{K} Π_{i=1}^{K_1} μ_2(K − i + 1)/μ_1(i)]^{-1}.
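These expressions are easy to evaluate; the sketch below (with assumed constant rates μ1(n) = 2 and μ2(n) = 1) builds the unnormalised products and normalises:

```python
# Marginal distribution of queue 1 in a closed two-queue cyclic network,
# following the balance-equation product above. mu1, mu2 may be
# queue-length-dependent service rate functions.

def marginal(mu1, mu2, K):
    w = [1.0]                                  # unnormalised weight for K1 = 0
    for K1 in range(1, K + 1):
        w.append(w[-1] * mu2(K - K1 + 1) / mu1(K1))
    total = sum(w)
    return [x / total for x in w]

# Constant rates: p1(K1) is then proportional to (mu2 / mu1)^K1.
p = marginal(lambda n: 2.0, lambda n: 1.0, K=3)
```

With these rates the weights are 1, 1/2, 1/4, 1/8, so p1(0) = 8/15.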
    Σ_{j=1}^{N} e_j p_{ji} = e_i

    Σ_{j=1}^{N} U_{j,m} p_{ji} = U_{i,m}
    (i) (1 − ε)K ≤ Σ_{i=1}^{N} n̄_{i,m} ≤ (1 + ε)K,

    (ii) (1 − ε) (1/N) Σ_{j=1}^{N} λ_{j,m} ≤ λ_{i,m} ≤ (1 + ε) (1/N) Σ_{j=1}^{N} λ_{j,m},
Fig. 7.4.
If conditions (i) and (ii) are satisfied,
then the procedure stops at the m-th iteration: the quantities computed
in step (1.2) are considered to be a satisfactory approximation to R.
Otherwise start at step (1) with m replaced by m + 1 and λ_{i,m+1} computed
in step (1.5).
Step (2): If (i) is not satisfied but (ii) is satisfied, go to step (2.2);
otherwise proceed to step (2.1).
Step (2.1): Compute, for 1 ≤ i ≤ N,

    λ_{i,m+1} = λ_{i,m} − [λ_{i,m} − (1/N) Σ_{j=1}^{N} λ_{j,m}].

If
The role of the steps of this iterative technique merits some explanation.
All parts of step (1) are used to compute, from R_m, an approximation
to the related quantities of the original network R. Step (2) constructs
modifications to R_m which will be incorporated in R_{m+1}.
References
1. Chandy, K. M., Herzog, U. and Woo, L. (1975). Parametric analysis of general
queueing networks. IBM J. Res. and Dev., 19, 36–42.
2. Chandy, K. M., Herzog, U. and Woo, L. (1975). Approximate analysis of
general queueing networks. IBM J. Res. and Dev., 19, 43–49.
3. Courtois, P. J. (1972). On the Near-Complete Decomposability of Networks of
Queues and of Stochastic Models of Multiprogramming Computer Systems.
Computer Science Report, CMU-CS-72, III, Carnegie-Mellon University,
Pittsburgh, Pennsylvania.
4. Courtois, P. J. (1977). Decomposability: Queueing and Computer System
Applications. Academic Press, New York.
5. Marie, R. (1978). Méthodes itératives de résolution de réseaux de files
d'attente. Doctoral Thesis, Université de Rennes.
6. Shum, A. V. and Buzen, J. P. (1977). A method for obtaining approximate
solutions to closed queueing networks with general service times. In
Modelling and Performance Evaluation of Computer Systems (H. Beilner
and E. Gelenbe, Eds), pp. 201–220. North-Holland, Amsterdam.
Chapter 8
Synthesis Problems in Single-Resource
Systems: Characterisation and Control
of Achievable Performance
W = (W1 , W2 , . . . , WR )
S → W.
Fig. 8.1.
Since VS (t), and hence E[VS (t)], is independent of S for every t, we have
the following basic result.
Theorem 8.1 (General Conservation Law). For any single-server
queueing system in equilibrium there exists a constant V, determined only
by the parameters of the arrival and required service time processes,
such that

    Σ_{r=1}^{R} V_S(r) = V    (8.1)

where V_S(r) is the expected steady-state virtual load due to jobs of class r
(the sum of the average remaining service times of all class r jobs in the
system at a random point in the steady-state) under scheduling strategy S.
For a given r, the value of VS (r) depends on S in general (e.g. if the priority
of class r jobs is increased, VS (r) is likely to decrease). Theorem 8.1 asserts
that the vector (VS (1), VS (2), . . . , VS (R)) always varies with S in such a
way that the sum of its elements remains constant. Note that the truth of
this statement does not rely on any assumptions about interarrival times,
service times or independence between them.
Intuitively, the average virtual load due to class r is related to the
average number of class r jobs in the system, and hence to the average
response time for class r jobs. The general conservation law (8.2) should
therefore imply a relation among the elements of the response time vector
W. However, in order to render such a relation explicit we have to make
more restrictive assumptions regarding the nature of the demand processes
and the complexity of the scheduling strategies.
A scheduling strategy is a procedure for deciding which job, if any,
should be in service at any moment of time. It takes as input any
information that is available (the time of day, the types of job in the system,
their arrival instants, the amounts of service they have received, etc.) and
returns the identier of the job to be served, or zero if the server is to
be idle. The only restriction imposed so far has been that the procedure
returns zero if, and only if, there are no jobs in the system. Now we add
two more restrictions:
(i) every time the server becomes idle the procedure's memory is cleared;
the scheduling decisions made during one busy period are not based on
information about previous busy periods;
(ii) only information about the current state and the past of the queueing
process is used in making scheduling decisions; thus, it is possible to
discriminate among jobs on the basis of their expected remaining service
times (since their types and attained service are known), but not on
the basis of exact remaining service times.
    V_S(r) = N_r/μ_r,   r = 1, 2, . . . , R,
    V_S(r) = ρ_r W_r,   r = 1, 2, . . . , R    (8.3)
Theorem 8.2. When the required service times are distributed exponentially,
there exists a constant V determined only by the interarrival time
distributions and by the parameters μ_r, such that

    Σ_{r=1}^{R} ρ_r W_r = V.    (8.4)
job in service, given that it is of class r, is equal to the average residual life
s̄_r of the class r service time. s̄_r is given by

    s̄_r = (1/2) M_{2r} μ_r,   r = 1, 2, . . . , R,    (8.5)

where M_{2r} is the second moment of the class r service time distribution
(see (1.66), Chapter 1).
We can now write, for the average virtual load due to class r jobs,
r = 1, 2, . . . , R,

    V_S(r) = n_r (1/μ_r) + ρ_r s̄_r = λ_r w_r (1/μ_r) + ρ_r s̄_r = ρ_r W_r − ρ_r (1/μ_r − s̄_r).    (8.6)
    ρ = ρ_1 + ρ_2 + · · · + ρ_R,
where

    w_0 = Σ_{r=1}^{R} ρ_r s̄_r    (8.9)

    Σ_{r=1}^{R} ρ_r W_r = (1/(1 − ρ)) Σ_{r=1}^{R} ρ_r/μ_r.    (8.10)

    Σ_{r=1}^{R} ρ_r W_r = w_0/(1 − ρ) + Σ_{r=1}^{R} (ρ_r/μ_r − ρ_r s̄_r) = ρ w_0/(1 − ρ) + Σ_{r=1}^{R} ρ_r/μ_r    (8.11)
where
Does such a minimising strategy S* exist? If pre-emptions are allowed,
the answer is clearly yes: any strategy which gives pre-emptive priority to
g-jobs over non-g-jobs can be taken as S*, since all these strategies eliminate
the g-intervals completely. (8.12) can then be rewritten as

    Σ_{r∈g} V_S(r) ≥ V^g,   for all S    (8.13)
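Both the conservation law and the effect of priorities can be checked numerically. The sketch below uses Cobham's classical non-pre-emptive priority formulas for a two-class M/M/1 example (my own check; these formulas are standard results, assumed rather than taken from the text): the weighted sum Σ ρ_r W_r comes out the same under both priority orderings, while giving a class priority lowers its own response time.

```python
# Conservation check for a two-class M/G/1 queue under head-of-the-line
# priorities (Cobham's formulas), with exponential service here.

def priority_response(lams, mus, M2s, order):
    """Mean response times; order[0] is the highest-priority class."""
    w0 = sum(l * m2 / 2.0 for l, m2 in zip(lams, M2s))
    rhos = [l / m for l, m in zip(lams, mus)]
    W = [0.0] * len(lams)
    sigma = 0.0
    for r in order:
        prev = sigma
        sigma += rhos[r]
        W[r] = w0 / ((1.0 - prev) * (1.0 - sigma)) + 1.0 / mus[r]
    return W, rhos

lams, mus = [0.2, 0.3], [1.0, 1.5]
M2s = [2.0 / mus[0] ** 2, 2.0 / mus[1] ** 2]   # exponential: M2 = 2 / mu^2
W12, rhos = priority_response(lams, mus, M2s, [0, 1])
W21, _ = priority_response(lams, mus, M2s, [1, 0])
lhs12 = sum(r * w for r, w in zip(rhos, W12))
lhs21 = sum(r * w for r, w in zip(rhos, W21))
```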
Note that the Poisson input assumptions were used only in order to
write a closed-form expression for the right-hand side of (8.13); if we leave
it as V^g, Theorem 8.4 will continue to hold.
The situation is less straightforward if one is restricted to non-pre-emptive
scheduling strategies only. Now, if g is a proper and non-empty
subset of {1, 2, . . . , R}, the influence of the jobs whose classes are in
{1, 2, . . . , R} − g cannot be eliminated completely. There is no scheduling
strategy which minimises the g-intervals for every realisation of the demand
processes. However, for a given realisation, the strategy which minimises
the g-intervals has to be one that gives head-of-the-line priority to g-jobs
(eliminating all g-intervals except, perhaps, those at the start of g-job
busy periods). Therefore, only such a priority strategy can minimise the
steady-state average virtual load due to g-jobs, V_S^g. Making the appropriate
assumptions and using (8.6) we can rephrase the above statement thus: in
order to minimise the linear combination Σ_{r∈g} ρ_r W_r it is necessary to give
non-pre-emptive priority to g-jobs.
Now suppose that the input streams are Poisson. If the g-jobs have
non-pre-emptive priority, the only way their average response time can be
influenced by the non-g-jobs is through the probability that an incoming
g-job finds a non-g-job in service. But with Poisson arrivals, that
probability is independent of the scheduling strategy (see Chapter 1).
Hence, if g-jobs have non-pre-emptive priority, V_S^g is independent of the
order of service among the non-g-jobs. It is also independent of the order
of service among the g-jobs because, once the g-jobs have started being
served, there are no g-intervals until the end of the busy period. Thus, the
minimal value of Σ_{r∈g} ρ_r W_r can be obtained by lumping all g-jobs in one
class, all non-g-jobs in another class, and giving head-of-the-line priority
to the g-jobs. Performing these calculations yields a generalisation of
conservation law 2:

    Σ_{r∈g} ρ_r W_r ≥ w_0 Σ_{r∈g} ρ_r [1 − Σ_{r∈g} ρ_r]^{-1} + Σ_{r∈g} (ρ_r/μ_r)    (8.15)
In the next section, the relations derived here will lead to a characterisation
of the sets of achievable performance vectors. Before proceeding,
however, we should take another look at the assumptions that have been
made and at the possibilities for relaxing them.
First, we shall consider the scheduling strategies. It is evident that
if the strategies are not required to be work-conserving, Theorem 8.1 and
all that follows from it will hold no more. The necessity of condition (ii) for
the special conservation laws is less obvious (that condition is very rarely
mentioned in the literature) but it, too, turns out to be unavoidable. We
shall give examples of both pre-emptive and non-pre-emptive scheduling
strategies where exact service times are known in advance and where (8.10)
and (8.11) do not hold.
Could we drop the exponential service times assumption and still allow
pre-emptions? The answer is again, alas, no. Neither (8.10) nor (8.11) is
satisfied in the case of the classic pre-emptive priority disciplines with
general service times.
Finally, we know that the Poisson inputs assumption is not necessary
for the validity of Theorem 8.4. What is not known is whether Theorem 8.5
continues to hold (perhaps with different constants on the right-hand side
of the inequalities) if that assumption is relaxed.
    H_1 ⊆ H̄_1.    (8.16)
Fig. 8.2.
    H_2 ⊆ H̄_2.    (8.17)
W1 = W1 (i1 , i2 , . . . , iR ).
W2 = W2 (i1 , i2 , . . . , iR ).
where one of the g_j is the set {1, 2, . . . , R} and the others are proper, non-
empty and different subsets. Using the notation a_r = ρ_r/μ_r, a_g = Σ_{r∈g} a_r
and ρ_g = Σ_{r∈g} ρ_r, we rewrite these equations as

    Σ_{r∈g_j} ρ_r W_r = a_{g_j}/(1 − ρ_{g_j});   j = 1, 2, . . . , R.    (8.18)
where g_{jk} = g_j ∩ g_k; the last term is zero by definition if g_{jk} is empty. Since
(8.14) must hold for g = g_{jk} (if non-empty), we can write

    Σ_{r∈G_{jk}} ρ_r W_r ≤ [a_{g_j}/(1 − ρ_{g_j})] + [a_{g_k}/(1 − ρ_{g_k})] − [a_{g_{jk}}/(1 − ρ_{g_{jk}})]    (8.19)
which violates (8.14) for g = G_{jk}. Thus we must have (perhaps after
renumbering)

    g_1 = {i_1}
    g_2 = {i_1, i_2}
    . . .
    g_{R−1} = {i_1, i_2, . . . , i_{R−1}}
    g_R = {i_1, i_2, . . . , i_R} = {1, 2, . . . , R}
Proof of Lemma 8.2. This proof is almost identical to the above and need
not be given in full. It suffices to note that the right-hand side of (8.15) is
also of the form

    Σ_{r∈g} b_r [1 − Σ_{r∈g} ρ_r]^{-1} + . . .
    H̄_1 ⊆ H_1    (8.21)

and

    H̄_2 ⊆ H_2.    (8.22)

    H_1 = H̄_1 = H̃_1.    (8.23)

Similarly, if it can be shown that the set H_2 is convex, it would follow that

    H_2 = H̄_2 = H̃_2.    (8.24)
    W = αW_1 + (1 − α)W_2;   0 ≤ α ≤ 1.    (8.25)
Equations (8.23) and (8.24) are now established; these are important results
which will be referred to as characterisation theorems.
    H = {W | ∃ S ∈ Σ; S → W}.
    α_j ≥ 0 (j = 1, 2, . . . , R!),   β_j ≥ 0 (j = 0, 1, . . . , R),

    β_0 + Σ_{j=1}^{R!} α_j = 1   and   β + Σ_{j=1}^{R!} α_j Q_j = W

(using only the first R − 1 elements of Q_j and W). An initial basis for (8.28)
is obtained by setting α_j = 0 (j = 1, 2, . . . , R!), β_0 = 1, β = W. When an
objective value of 1 is reached (as we know it will be if W is achievable),
the corresponding α_j's and Q_j's define a mixing strategy S ∈ Σ_1 which
achieves W.
Note that there may be (and probably are) many solutions to this
problem. Figure 8.3 shows an example of the set H_1 for R = 3 (the priority
ordering at each vertex is indicated in brackets) and a target vector W in
the interior of H_1. This particular target vector can be achieved by mixing
the priority disciplines (123), (321) and (213); or by mixing the disciplines
(213), (132) and (231), etc. The figure suggests that there are eight mixing
strategies which achieve W.
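For R = 2 the mixing computation is elementary. The sketch below (with hypothetical vertex vectors Q1 and Q2, not the vertices of Fig. 8.3) recovers the mixing probability α for a target on the segment joining them:

```python
# With R = 2 the achievable set is the segment [Q1, Q2]; a target on it
# is met by using priority discipline 1 with probability alpha.

def mixing_weight(Q1, Q2, target):
    """alpha with target = alpha*Q1 + (1-alpha)*Q2, or None if off-segment."""
    alpha = (target[0] - Q2[0]) / (Q1[0] - Q2[0])
    ok = 0.0 <= alpha <= 1.0 and all(
        abs(alpha * a + (1 - alpha) * b - t) < 1e-9
        for a, b, t in zip(Q1, Q2, target))
    return alpha if ok else None

Q1, Q2 = (1.0, 4.0), (3.0, 2.0)    # hypothetical response-time vertices
alpha = mixing_weight(Q1, Q2, (1.5, 3.5))
```

A target off the segment (i.e. not achievable by mixing these two vertices) is reported as None.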
The other case of interest is when the target W is not achievable but
dominates an achievable performance vector (i.e. it satisfies the inequalities
(8.14) but lies above the conservation law hyperplane). Then, before solving
the linear program (8.28), we have to find an achievable performance
vector W̄ dominated by W. This is another initial basis problem: W̄
has to satisfy the inequalities (8.14) plus the conservation law equality,
plus the inequality W̄ ≤ W. Again one can introduce artificial variables
Fig. 8.3.
and solve an auxiliary linear program. Since the solution of that program
is a vertex of the set of feasible vectors, the vector W̄ thus obtained is
either one of the vectors Q_j (j = 1, 2, . . . , R!), or it satisfies W̄_r = W_r for
some r = 1, 2, . . . , R. This suggests the following alternative algorithm for
finding W̄.
For r = 1, 2, . . . , R check whether the projection of W along the r-th
coordinate axis onto the conservation law hyperplane is achievable; if yes,
then take that projection as W̄ and stop. For j = 1, 2, . . . , R! check whether
Q_j is dominated by W; if yes, take Q_j as W̄ and stop.
That algorithm may, in some cases, be more efficient than the linear
programming one. For example, if one of the elements of W is obviously
too large, a projection along the corresponding coordinate axis is likely to
yield the result.
The above results apply, with straightforward modifications, to M/G/1
systems where pre-emptions are disallowed. The family Σ_2 of scheduling
strategies, formed by mixing up to R of the R! head-of-the-line priority
disciplines, is M/G/1-complete. To find a strategy from Σ_2 which achieves
a pre-specified (and achievable) performance vector W, one solves a linear
program similar to (8.28); the vectors Q_j (j = 1, 2, . . . , R!) are replaced
by the performance vectors of the head-of-the-line priority disciplines. If
the target W is not achievable but dominates an achievable performance
vector, one such vector can be obtained either by solving an initial basis
problem or by a vertex searching algorithm.
So, the families Σ_1 and Σ_2 have several attractive features: they are
conceptually simple, easily implementable, parametrised, complete; there
are algorithms for selecting a strategy that achieves (or improves upon) a
given performance vector. We should point out, however, that these mixing
strategies have one important disadvantage: the variances in response times
which they introduce may be unacceptably large, especially in heavily
loaded systems. Suppose, for example, that in an M/M/1 system with two
job classes, the pre-emptive priority disciplines (1, 2) and (2, 1) are mixed in
the proportion α = 0.9. Suppose, further, that the system is heavily loaded,
most of the load being contributed by class 2 (say ρ_1 = 0.15, ρ_2 = 0.8).
Then, while most of the class 1 jobs have very short waiting times, a
significant proportion (approximately 10%) will have to wait much longer;
the over-all mean response time may be as required, but the variance will
be rather large. Managers of computer installations usually avoid such
strategies because the unlucky 10% of the users tend to be more vociferous
than the satisfied 90%.
    A = lim_{ε→0} ∪_E A_{ε,E}

and therefore

    H = lim_{ε→0} ∪_E H_{ε,E}.

    B = lim_{ε→0} ∪_E B_{ε,E}.
The proof can now be completed by remarking that, since the surfaces
B_{ε,E} approach the boundary of H_1, every performance vector W in the
interior of H_1 is in the interior of some B_{ε,E}. As we have seen, this implies
that W ∈ H_{ε,E} and hence W ∈ H. Thus H contains all points in H_1
except its boundary.
    W_r(t) = 1 + Σ_{j=1}^{R} (μ_r/μ_j) n_j(t),   r = 1, 2, . . . , R.    (8.31)
Finally, substituting the sums of n and n̄ into (8.31) yields a system
of integro-differential equations:

    W_r(t) = 1 + Σ_{j=1}^{R} (λ_j μ_j/μ_r) [ e^{−μ_j t} ∫_0^t e^{μ_j u} W_j(u) du + ∫_0^t e^{−μ_j (t−u)} W_r(u) du ],   r = 1, 2, . . . , R.    (8.32)

    W_r = 1/μ_r + Σ_{j=1}^{R} λ_j μ_j (W_j + W_r)/(λ_j μ_j + λ_r μ_r),   r = 1, 2, . . . , R.    (8.33)
    C(W) = Σ_{r=1}^{R} c_r W_r,   c_r ≥ 0;   r = 1, 2, . . . , R.    (8.34)
From the characterisation theorems of section 8.3 we know that the set of
achievable performance vectors is a polytope (H_1 in the case of M/M/1
systems, H_2 for M/G/1 systems). Moreover, we know exactly what the
vertices of that polytope are (Lemmas 8.1 and 8.2). Since the minimum of a
linear function over a polytope is always reached at one of the vertices, we
immediately obtain the following results.
Lemma 8.4. In an M/M/1 system any cost function of the type (8.34) is
minimised by one of the R! pre-emptive priority disciplines.
    W_j < W̄_j;   W_{j+1} > W̄_{j+1};
    W_r = W̄_r   for r < j or r > j + 1.

    ρ_j W_j + ρ_{j+1} W_{j+1} = ρ_j W̄_j + ρ_{j+1} W̄_{j+1}.    (8.35)

Suppose that (c_{j+1}/ρ_{j+1}) > (c_j/ρ_j). Then, since the bracketed term which
multiplies (c_{j+1}/ρ_{j+1}) is positive,

    C(W) − C(W̄) > (c_j/ρ_j)(ρ_j W_j − ρ_j W̄_j + ρ_{j+1} W_{j+1} − ρ_{j+1} W̄_{j+1}) = 0
(ii) choose the discipline which gives highest priority to class r_1, second
highest priority to class r_2, . . . , lowest priority to class r_R.

    C(W) = Σ_{r=1}^{R} (λ_r/λ) W_r,
    W_FIFO = 1/μ + λ M_2/[2(1 − ρ)]    (8.37)

where M_2 is the second moment of the service time distribution (see (8.11)).
The restriction on scheduling strategies can be avoided by introducing
articial job classes. Assume rst that the service times can take only
a nite number of values: they are equal to xj with probability pj , j =
1, 2, . . . , J; p1 + p2 + + pJ = 1. The system can be treated as an M/G/1
queue with J job classes, class j having arrival rate j = pj , mean service
time (1/j ) = xj , trac intensity j = pj xj and second moment of service
time distribution M2j = x2j . Scheduling strategies which use information
about exact service times are now admissible because that information is
contained in the class identifiers. To minimise the over-all average response
time one has to give class j non-pre-emptive priority over class k if x_j < x_k.
This is the Shortest-Processing-Time-first discipline (SPT): when the server
is ready to begin a new service it selects the shortest job of those present
in the system.
The above argument generalises easily to an arbitrary service time
distribution. Thus, in any non-pre-emptive M/G/1 system the average
response time is minimised by the SPT discipline.
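The advantage of SPT over FIFO can be observed directly by simulation. The sketch below is our own illustration, with arbitrarily chosen parameters: it runs both disciplines, non-pre-emptively, on the same sampled arrival and service time sequences.

```python
# Discrete-event simulation of a non-pre-emptive single-server queue,
# comparing FIFO with Shortest-Processing-Time-first (SPT) on the same
# arrival/service sample. Parameters are illustrative only.
import heapq, random

def mean_response(jobs, spt):
    """jobs: list of (arrival, service), sorted by arrival time."""
    heap, t, total, i = [], 0.0, 0.0, 0
    while i < len(jobs) or heap:
        while i < len(jobs) and jobs[i][0] <= t:
            a, s = jobs[i]
            heapq.heappush(heap, (s if spt else a, a, s))  # SPT keys on service time
            i += 1
        if not heap:
            t = jobs[i][0]          # server idle: jump to the next arrival
            continue
        _, a, s = heapq.heappop(heap)
        t += s                      # serve to completion (non-pre-emptive)
        total += t - a              # response time of this job
    return total / len(jobs)

random.seed(1)
jobs, t = [], 0.0
for _ in range(20000):
    t += random.expovariate(0.5)                 # Poisson arrivals, rate 0.5
    s = 0.5 if random.random() < 0.9 else 5.0    # highly variable service times
    jobs.append((t, s))

print(mean_response(jobs, spt=False), mean_response(jobs, spt=True))
```

With this highly variable service time distribution the SPT average is markedly lower than the FIFO average, as the optimality result predicts.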
We see that the optimal strategy to be followed depends on the amount
of information available. If only the distribution of service times is known
then the best strategy is SEPT which, in the case of one job class, reduces to
serving jobs in order of arrival; the resulting average response time is given
by (8.37). If individual service times are known then the optimal strategy
is SPT; the corresponding average response time is (see expression (1.79),
Chapter 1)
W_SPT = 1/μ + (λM_2/2) ∫_0^∞ dF(x)/{[1 − G(x⁻)][1 − G(x⁺)]}. (8.38)
Fig. 8.4.
Lemma 8.6. The SRPT scheduling strategy achieves the lowest possible
value of the over-all average response time in a G/G/1 queueing system.
time t. The average amount of processor time that the job will actually use
is equal to

Q_r(t, y) = ∫_0^y [1 − F_r(t + x)] dx / [1 − F_r(t)],

where F_r(x) is the required service time distribution function for class r.
The probability that the job will complete within the allocated time y is
equal to

S_r(t, y) = [F_r(t + y) − F_r(t)] / [1 − F_r(t)].

These two quantities are used to define the rank v_r(t) of a class r job with
attained service t:

v_r(t) = min_y { Q_r(t, y) / [c_r S_r(t, y)] },

where c_r is the cost coefficient associated with class r. The minimum is
taken over the set of permissible allocations y (if jobs can be interrupted
at any point, all y > 0 are permissible). The smallest value of y for which
the minimum is reached is called the rank quantum. At each scheduling
decision point, the processor is assigned to the job with the smallest rank
for the duration of the corresponding rank quantum.
When vr (t) is a non-increasing function of the attained service t for
every r = 1, 2, . . . , R, the SR strategy behaves like a pre-emptive priority
discipline based on the ordering (8.36), except that the average service times
are replaced by the average remaining service times. This tends to happen
when the service time distributions have coefficients of variation not greater
than 1 (e.g. uniform, Erlang or exponential distributions). If, on the other
hand, v_r(t) is an increasing function of t for every r, then SR behaves like a
Processor-Sharing discipline (e.g. for hyperexponential distributions). This
confirms the intuitive idea that when the variations in service times are
small, jobs should not be interrupted too much, and when the variations
are large, it is better to interrupt them often.
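The rank v_r(t) can be evaluated numerically for any given distribution. The following sketch is our own illustration: it computes v_r(t) on a grid of allocations y for a hyperexponential F_r (coefficient of variation greater than 1), for which v_r(t) should be increasing in t. The distribution parameters, the cost coefficient and the grids are arbitrary choices.

```python
# Numerical evaluation of the rank v_r(t) = min_y Q_r(t,y) / (c_r S_r(t,y))
# for a hyperexponential required service time distribution. Illustrative
# sketch: the distribution, c and the y-grid are chosen arbitrarily.
import math

p, m1, m2 = 0.5, 0.2, 5.0        # F(x) = 1 - (p e^{-m1 x} + (1-p) e^{-m2 x})
c = 1.0

def tail(x):                      # 1 - F(x)
    return p * math.exp(-m1 * x) + (1 - p) * math.exp(-m2 * x)

def rank(t):
    best = float("inf")
    for k in range(1, 500):
        y = 0.01 * k
        n, h = 200, y / 200
        # Q(t,y): expected processor time actually used out of an allocation y
        Q = sum(tail(t + (j + 0.5) * h) for j in range(n)) * h / tail(t)
        # S(t,y): probability of completing within the allocation y
        S = (tail(t) - tail(t + y)) / tail(t)
        best = min(best, Q / (c * S))
    return best

print(rank(0.0), rank(2.0), rank(10.0))   # increasing: SR behaves like Processor-Sharing
```

For this decreasing-failure-rate distribution the rank grows with attained service, so the job with the least attained service keeps the smallest rank, which is exactly the Processor-Sharing-like behaviour described above.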
The optimality of the SR discipline for the case of no arrivals is proved
by induction on the number of jobs present. We shall omit that proof here;
the interested reader is referred to [13].
A few words should be said about minimising cost functions which are
non-linear in the elements of W. The general problem (min C(W), subject
to the constraints W H1 , or W H2 , depending on whether the system
is M/M/1 or M/G/1) can be tackled by classic mathematical programming
methods. The fact that the constraints are always linear (they are given
by Theorems 8.4 and 8.5) may facilitate the solution. Take, as an example,
a two-class M/M/1 system and consider the problem

min (W_1² + W_2²)

(find the point (W_1, W_2) which is closest to the origin). The constraints are
W̃_1 = (K/2)/ρ_1;  W̃_2 = (K/2)/ρ_2,
References
1. Coffman, E. G., Jr. and Mitrani, I. (1980). A characterisation of waiting time
performance realisable by single-server queues. Operations Research.
2. Fayolle, G., Iasnogorodski, R. and Mitrani, I. (1978). On the sharing of a
processor among many job classes. Research Report No. 275, IRIA-Laboria;
also to appear (1980), J.A.C.M.
3. Fife, D. W. (1965). Scheduling with random arrivals and linear loss functions.
Man. Sci., 11(3), 429–437.
4. Kleinrock, L. (1965). A conservation law for a wide class of queueing
disciplines. Nav. Res. Log. Quart., 12, 181–192.
5. Kleinrock, L. (1967). Time-shared systems: A theoretical treatment.
J.A.C.M., 14(2), 242–261.
6. Kleinrock, L. (1976). Queueing Systems. Vol. 2. John Wiley, New York.
7. Kleinrock, L. and Coffman, E. G., Jr. (1967). Distribution of attained service
in time-shared systems. J. Comp. Sys. Sci., 287–298.
8. Mitrani, I. and Hine, J. H. (1977). Complete parameterised families of job
scheduling strategies. Acta Informatica, 8, 61–73.
9. O'Donovan, T. M. (1974). Distribution of attained and residual service in
general queueing systems. Operations Research, 22, 570–575.
10. Schrage, L. E. (1968). A proof of the optimality of the SRPT discipline.
Operations Research, 16(3), 687–690.
11. Schrage, L. (1970). An alternative proof of a conservation law for the queue
G/G/1. Operations Research, 18, 185–187.
12. Schrage, L. E. and Miller, L. W. (1966). The queue M/G/1 with the shortest
remaining processing time discipline. Operations Research, 14, 670–683.
13. Sevcik, K. C. (1974). A proof of the optimality of smallest rank scheduling.
J.A.C.M., 21, 66–75.
14. Smith, W. E. (1956). Various optimisers for single-stage production. Nav.
Res. Log. Quart., 3, 59–66.
January 11, 2010 12:17 spi-b749 9in x 6in b749-ch09
Chapter 9
Control of Performance
in Multiple-Resource Systems
Fig. 9.1.
e(m) = 2b / [1 + (c/m)²];  b, c > 0 (9.2)
(Chamberlin, Fuller and Lin [4]). The two functions are shown in Fig. 9.2.
When a new job is admitted into the inner system, i.e. becomes active,
it is allocated a certain number of page frames. This implies, in general, that
the allocation of one or more other jobs is reduced; they begin, therefore, to
operate at a different point on their lifetime curves (closer to the origin)
and start visiting node 1 more often. Similarly, when a job departs from
Fig. 9.2.
the inner system, one or more other jobs start visiting node 1 less often.
Thus, not only the service times at node 0 but also the probability that a
job goes to node j after leaving node 0 depend on the number and types of
active jobs.
This behaviour of the inner system means, unfortunately, that the
results of Chapter 3 are not directly applicable to the present model. No
matter what assumptions we make about queueing disciplines, required
service time distributions, etc., a queueing network model of a multi-
programmed system such as the one in Fig. 9.1 will not have a product-
form solution; and without a product-form solution we cannot hope to
have efficient methods for the exact evaluation of performance measures. It
is almost imperative, therefore, to look for approximate solutions.
The parameters of most real-life systems are such that the corre-
sponding models lend themselves easily to decomposition. The interactions
between the inner and the outer systems are weak compared to those within
the inner systems: jobs are admitted into, and depart from, the inner system
at a much lower rate than that at which they circulate inside it. This
allows one to assume that the inner system reaches equilibrium in between
consecutive changes in the degree of multiprogramming. For a given set of
active jobs, the inner system can be treated as a closed queueing network;
that network can be analysed in the steady-state to obtain the rates at
which jobs of various classes obtain CPU service; those rates can be used
to replace the whole inner system by a single server whose rate of service
depends on the system state (via the set of active jobs).
We shall elaborate further on this approach when we apply it to specic
synthesis problems. Our interest will be directed primarily towards the
design of admission and memory allocation policies.
Fig. 9.3.
Geometrically, m* is such that the ray from the origin passing through the
point (m*, e(m*)) dominates the entire curve {e(m), m ≥ 0}. For example,
the knee of the lifetime function defined by (9.2) is at m* = c (see Fig. 9.2).
In order to give the intuitive justification for the knee criterion we
need to introduce a quantity called the space-time product, and establish a
relation between it and the throughput. The space-time product, Y, for a job whose
execution time is D (this is total real time spent in the inner system, not
virtual CPU time), is defined as

Y = ∫_0^D m(t) dt, (9.5)

where m(t) is the number of page frames that the job holds at time t of its
execution. If m̄ is the average number of page frames held by the job, then

Y = m̄D. (9.6)
MV. On the other hand, the average number of jobs executed during the
period is T(n)V (since an average of T(n) jobs depart per unit time). Hence,
the average space-time product per job is

Y = MV / [T(n)V] = M / T(n). (9.7)
Taking the job's total CPU requirement as the unit of time, and denoting
by κ the average I/O delay per unit of CPU time and by δ the average delay
per page fault, the execution time of a job with memory allocation m is

D = 1 + κ + δ/e(m), (9.8)

and, by (9.6), its space-time product is

Y = m(κ + 1) + δm/e(m). (9.9)

Substituting the lifetime function (9.2) and setting dY/dm = 0, we obtain

m_opt = c / [1 + 2b(κ + 1)/δ]^{1/2}.

Hence if 2b(κ + 1)/δ is not large (usually it is less than 1), m_opt ≈ c = m*;
the optimal allocation is indeed close to the knee of the lifetime function.
An implementation of the knee criterion would involve, for each active
job, a continuing monitoring of its paging activity, an estimation of its
lifetime function and an allocation of memory corresponding to the knee
point. The degree of multiprogramming is thus controlled indirectly via the
memory allocation. Such a control policy can be expected to be expensive,
both in instrumentation and in overheads. On the other hand, as Denning
and Kahn's experiments suggest (see [6]), the knee criterion is robust and
yields near-optimal degrees of multiprogramming over a wide range of
loading conditions. One cannot, of course, apply the knee criterion if jobs
behave according to a lifetime function that has no knee, such as the one
defined by (9.1). A finite value for m_opt may still exist, as we shall see
shortly, and it may be possible to obtain an estimate for it. The memory
allocation controller can then use that estimate.
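The closed-form optimum can be checked against a direct numerical minimisation of the space-time product. The sketch below is our own illustration: it uses the lifetime function (9.2), the expression Y(m) = m(κ + 1) + δm/e(m) derived above, and arbitrary illustrative values for b, c, κ and δ.

```python
# Numerical check that m_opt = c / sqrt(1 + 2b(kappa+1)/delta) minimises
# the space-time product Y(m) = m(kappa+1) + delta*m/e(m) for the lifetime
# function e(m) = 2b / (1 + (c/m)^2). Parameter values are illustrative.
import math

b, cc, kappa, delta = 1.0, 100.0, 0.5, 1.0

def e(m):
    return 2 * b / (1 + (cc / m) ** 2)

def Y(m):
    return m * (kappa + 1) + delta * m / e(m)

m_closed = cc / math.sqrt(1 + 2 * b * (kappa + 1) / delta)  # closed form
m_grid = min((0.1 * k for k in range(1, 3000)), key=Y)      # brute-force minimum
print(m_closed, m_grid)
```

Here 2b(κ + 1)/δ = 3, so m_opt is well below the knee c; with 2b(κ + 1)/δ < 1 the two nearly coincide, which is the regime discussed above.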
where ej is the j-th most recent interval, regardless of which job gener-
ated it.
The L = S control policy attempts to balance the system lifetime L(n)
and the paging drum average service time S: it maintains the degree of
multiprogramming n at such a level that L(n) ≈ cS,
where c is a constant not much greater than 1. The intuition behind this
criterion derives from the bounds that device service times place on the
CPU utilisation.
Let 1/bi and Ui be, respectively, the mean service time and the
utilisation of node i in the inner system (i = 0, 1, . . . , K; 1/b1 = S).
The parameters b_1, b_2, . . . , b_K are independent of n but b_0 depends on it via
the memory allocation and lifetime functions. The utilisations Ui depend, of
course, on n. Denote further by 1/ai the mean CPU interval (not necessarily
continuous) between consecutive requests for device i; i = 1, 2, . . . , K. These
are averages over all active jobs. 1/a_1 is the system lifetime L(n); all other
a_i's are program characteristics and are independent of n.
While the CPU is busy, jobs depart from it in the direction of node i
at rate ai (i = 1, 2, . . . , K); the probability that the CPU is busy is
U0 ; therefore, the rate at which jobs arrive into device i is equal to
U0 ai . Similarly, the rate at which jobs leave device i is equal to Ui bi
(i = 1, 2, . . . , K). Since these two rates are equal in the steady-state, we have
U_0 a_i = U_i b_i,  i = 1, 2, . . . , K. (9.12)
From the above, the CPU utilisation is bounded by two functions. One of
them is constant and the other is decreasing with n (the more active jobs,
the less memory per job and hence the smaller CPU intervals between page
faults). These bounds are illustrated in Fig. 9.4. If the two bounds intersect,
they do so at a point n̂ which satisfies L(n̂) = IS. Intuitively, the CPU
utilisation should start decreasing for n > n̂, i.e. n̂ is slightly larger than,
but close to, the optimal value of n.
Fig. 9.4.
a m_opt^k = δ(k − 1) / (κ + 1).

If the system is not I/O-bound, the constant κ (the average I/O delay per
unit of CPU time) is small; the constant k of the Belady lifetime function
is usually less than 2. Assuming that δ (the average delay per page fault) is
not much greater than the drum service time S, we conclude that e(m_opt)
is near S: we arrive again at the L = S criterion.
A controller based on the L = S rule is simpler and easier to implement
than one based on the knee criterion. All that is required is an estimate
of the current system lifetime L(n), obtained as in (9.10). However, this
policy appears to be less robust and in some cases (especially in I/O-bound
systems) leads to a degree of multiprogramming which is significantly lower
than the optimal (see [6]).
The supporting argument for the 50% criterion is equally simple. It proceeds
as follows.
One of the manifestations of thrashing is a high rate of page faults,
hence a high rate of requests for the paging drum. A high rate of requests
implies a long queue; thus, a queue at the paging drum is symptomatic of a
thrashing system and, to prevent thrashing, queues should not be allowed
to develop. An average of one request at the paging drum is a good target
to aim for. If we treat that device as an independent M/M/1 queue we can
write for the average number n̄ of requests there (see Chapter 1)

n̄ = U_1 / (1 − U_1),

where U_1 is the traffic intensity, or the probability that the drum is busy. If
we wish that average to be n̄ = 1 we should keep the utilisation at U_1 = 0.5.
Tenuous though the above argument may appear, the 50% rule seems
to perform reasonably well, especially in systems where the average drum
service time S is lower than the knee lifetime e(m ) (see [6]). In other
cases this criterion tends to be less reliable than the other two and to
underestimate the optimal degree of multiprogramming. On the other hand,
it is the simplest of the three and the most straightforward to implement.
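The arithmetic behind the 50% rule is elementary; a one-line sketch:

```python
# M/M/1 mean number of requests as a function of drum utilisation U1:
# n = U1 / (1 - U1). The target n = 1 is reached exactly at U1 = 0.5.
def mean_requests(u):
    return u / (1.0 - u)

print(mean_requests(0.5))   # 1.0
print(mean_requests(0.8))   # ~4.0: the queue grows quickly past the target
```

The steep growth of n̄ as U_1 approaches 1 is what makes the drum queue such a visible symptom of thrashing.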
To summarise, we have described here three heuristic control rules
for maintaining the degree of multiprogramming at, or near, its optimal
level. We say that they are heuristic because there are no mathematical
proofs establishing their validity, only intuitive arguments. There is,
however, a certain amount of empirical and numerical evidence [1, 12, 13]
which suggests that these rules can be applied successfully in practical
systems.
It should be emphasised that any dynamic control is necessarily
involved with transient phenomena whereas the theoretical support for the
proposed control procedures is based on steady-state analysis. The degree
of multiprogramming is assumed to remain constant long enough for the
inner system to reach steady-state. If the loading conditions change rapidly
in relation to the control actions (or, alternatively, if the control procedure
reacts slowly to changes in the load conditions) then this assumption is
violated and the control, if it works at all, will be unstable. There are no
general assertions that can be made in this connection; much depends on
implementation and instrumentation, as well as on the control algorithm.
For example, some experiments (Leroudier and Potier [11]) indicate that
the 50% rule responds rapidly to load fluctuations.
Fig. 9.5.
jobs should remain impeded longer; when the situation improves, impeded
jobs can be reintroduced into the active set at a higher rate. Under the page
fault rate control policy, the average time 1/ν that jobs remain in the impeded
set is proportional to the average interval 1/σ between departures from the
active set. We set

ν = hσ, (9.17)

where h is a small constant (in the numerical evaluations of the policy its
value was chosen as h = 0.01).
Thus, an implementation of the policy would involve two types of
monitoring: (i) counting page faults for each active job in order to decide
when to remove jobs to the impeded set, and (ii) estimating the total rate
of departures from the inner system in order to regulate the average times
that jobs spend in the impeded set. This compromise between program-
driven and load-driven control allows the policy to prevent thrashing by
discriminating against the jobs which contribute to it most: the jobs
with the most page faults.
An exact analysis of the system performance under RCP is not feasible
for the reasons mentioned in the last section: the behaviour of active jobs
depends on their number. However, an approximate evaluation can be
obtained by applying decomposition. First, we consider the inner system
as a closed network with a fixed number n_r of class r jobs circulating
inside (r = 1, 2, . . . , R). An analysis of the closed network will enable
us to replace the whole inner system by a single aggregate server which
gives simultaneous service to all active jobs, at rates depending on the
state n = (n_1, n_2, . . . , n_R). To do this we need the steady-state probability
π_r(n_1, n_2, . . . , n_R) that, in the closed network, a class r job is in service
at the CPU: that probability determines the rate at which class r jobs depart
to the outer system in state n.
Also necessary is the steady-state probability π_{rJ}(n_1, n_2, . . . , n_R) that a
class r job which has already had J page faults is in service at the CPU: it
determines the rate at which jobs leave the inner system to join the impeded
set.
An important distinguishing feature of the jobs of a given job class is
their paging behaviour. In the model, a different lifetime function e_r(m)
is associated with each job class (r = 1, 2, . . . , R). The counting of page
faults in the closed network is modelled by splitting class r into J + 1
artificial job classes (r, 0), (r, 1), . . . , (r, J). At each visit to the paging
drum, a job of class (r, j) becomes a job of class (r, j + 1) if j = 0, 1, . . . , J − 1
and it becomes a job of class (r, 0) if j = J.
Exact expressions for π_r(n) and π_{rJ}(n) can be obtained, under a suitable
set of assumptions, by applying the BCMP theorem of Chapter 3 (the
formulae for a special case can be found in [8]). However, the computational
effort associated with the solution of a multiclass network (calculation of the
normalisation constant, aggregation of states to find marginal distributions,
etc.) is considerable. What is more, that effort grows rather rapidly with
the size of the model, in particular with the number of artificial job classes
resulting from a large value of J. On the other hand, an exact solution
is rarely necessary, especially in view of the fact that the whole model is
approximate (because of the decomposition and because parameters have to
be estimated). Good approximations for the probabilities π_r(n) and π_{rJ}(n)
can be obtained rather easily as follows.
Solve a single-class closed network with n = n_1 + n_2 + ⋯ + n_R jobs
circulating inside. As a lifetime function, use the linear combination

e = Σ_{r=1}^R (n_r/n) e_r.
If other parameters vary across job classes, they are also averaged in a
similar fashion. That single-class solution yields the over-all CPU utilisation
U0 (n).
Now return for a moment to the full model (including the outer system
and the impeded set) and let 1/μ_r be the average total CPU time required
by a class r job; assume that the distribution of that time is exponential.
Then, while a class r job is being served by the CPU, it leaves for the outer
system at rate μ_r. Similarly, while a class r job is in service at the CPU it
leaves for the impeded set at rate 1/[(J + 1)e_r] (it is ejected at the (J + 1)-st
page fault). Therefore, in the virtual CPU time of a class r job the average
interval between consecutive departures from the inner system is equal to

Q_r = [μ_r + 1/((J + 1)e_r)]^{-1} = (J + 1)e_r / [1 + (J + 1)e_r μ_r];  r = 1, 2, . . . , R. (9.18)
Intuitively, the proportion of all CPU busy time which is devoted to
class r jobs can be approximated by
q_r = n_r Q_r / Σ_{s=1}^R n_s Q_s;  r = 1, 2, . . . , R. (9.19)
Furthermore, the class r jobs which have had J page faults (these are
the class (r, J) jobs) occupy approximately a fraction 1/(J + 1) of that
time. Hence,

π_r(n_1, n_2, . . . , n_R) = q_r U_0(n_1, n_2, . . . , n_R),
π_{rJ}(n_1, n_2, . . . , n_R) = [q_r/(J + 1)] U_0(n_1, n_2, . . . , n_R);  r = 1, 2, . . . , R. (9.20)
The total rate at which class r jobs depart from the aggregate server, σ_r(n),
is the sum of the rate of departures to the outer system and the rate of
departures to the impeded set. The rate at which each impeded job returns
to the aggregate server is given by (9.17), with σ = σ_1 + σ_2 + ⋯ + σ_R.
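The approximation (9.18)–(9.20) is cheap to compute. The sketch below is our own illustration for a two-class example; the lifetime values e_r, rates μ_r, population vector n, threshold J and single-class utilisation U_0 are arbitrary assumptions, and pi_r stands for the CPU-occupancy probabilities.

```python
# Approximate CPU-occupancy probabilities for the aggregate server,
# following (9.18)-(9.20). All parameter values are illustrative.
def aggregate_rates(n, e, mu, J, U0):
    R = len(n)
    Q = [(J + 1) * e[r] / (1 + (J + 1) * e[r] * mu[r]) for r in range(R)]  # (9.18)
    tot = sum(n[r] * Q[r] for r in range(R))
    q = [n[r] * Q[r] / tot for r in range(R)]                              # (9.19)
    pi = [q[r] * U0 for r in range(R)]                                     # (9.20)
    piJ = [q[r] / (J + 1) * U0 for r in range(R)]
    return q, pi, piJ

q, pi, piJ = aggregate_rates(n=[3, 2], e=[0.05, 0.01], mu=[0.02, 0.02],
                             J=30, U0=0.9)
print(q, pi, piJ)
```

The proportions q_r sum to one by construction, and π_{rJ} is always the fraction 1/(J + 1) of π_r, mirroring the argument in the text.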
The model has thus been reduced to a queueing network with three
nodes: the outer system, the aggregate server and the impeded set. The
state of the network (under exponential assumptions) is a 2R-dimensional
Markov process (n; k) = (n1 , n2 , . . . , nR ; k1 , k2 , . . . , kR ), where nr and kr
are the numbers of class r jobs at the aggregate server and in the impeded
set, respectively (r = 1, 2, . . . , R). While it is easy to write the steady-state
balance equations for that process, solving them is by no means easy: there
is no product-form solution because the behaviour of the aggregate server
depends on the vector n and not just on the total number of jobs there.
However, in some cases another level of decomposition may simplify matters
considerably.
Consider the case of two job classes and suppose that the outer system
consists of two nite sources: N1 terminals of class 1 and N2 terminals
of class 2. There are thus N1 and N2 class 1 and class 2 jobs circulating
endlessly among the three nodes; the system state is (n_1, n_2; k_1, k_2), where
n_1 + k_1 ≤ N_1, n_2 + k_2 ≤ N_2. The number of states, and hence the number
of balance equations, is C(N_1 + 2, 2) C(N_2 + 2, 2) = [(N_1 + 1)(N_1 + 2)/2][(N_2 + 1)(N_2 + 2)/2].
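The count of states can be verified by enumeration; a minimal sketch (the values of N_1, N_2 are arbitrary):

```python
# Verify that the number of pairs (n1, k1) with n1 + k1 <= N1 is C(N1+2, 2),
# so that the two-class state space has C(N1+2, 2) * C(N2+2, 2) states.
from math import comb

def states(N):
    return sum(1 for n in range(N + 1) for k in range(N + 1) if n + k <= N)

N1, N2 = 5, 7
print(states(N1), comb(N1 + 2, 2))      # equal counts for one class
print(states(N1) * states(N2))          # total number of balance equations
```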
If the two job classes have very dierent lifetime function character-
istics, we can attempt another decomposition. Suppose that the ejection
threshold J can be chosen in such a way that jobs of one class, say class 2,
are ejected from the inner system much more often than the others (the
choice of J will be examined later). Then, if the aggregate server and the
impeded set are considered in isolation, with I1 and I2 class 1 and class
2 jobs circulating among them, it can be assumed that all I1 jobs are at
the aggregate server. The probabilities P (I1 , n2 | I1 , I2 ) that there are n2
class 2 jobs at the aggregate server, given I1 and I2 (n2 = 0, 1, . . . , I2 ) can
be obtained by solving a simple I2 + 1 state Markov process.
As the next step, the aggregate server and the impeded set are replaced
by a single server which, when in state I1 , I2 , sends class r jobs to the
terminals at rate

σ_r(I_1, I_2) = Σ_{n_2=0}^{I_2} σ_r(I_1, n_2) P(I_1, n_2 | I_1, I_2),  r = 1, 2,
Fig. 9.6.
parameter k of the lifetime function: k = 1.8 for the "good" class 1 jobs
and k = 1.5 for the "bad" class 2 jobs. Main memory was divided equally
among the active jobs and an ejection threshold J = 30 was used.
On the basis of some numerical comparisons (see [7] and [8]) it appears
that if the load characteristics do not vary rapidly with time, the control
exercised by the page fault rate control policy is close to optimal. Those
evaluations, however, ignored the overheads associated with the policy and,
in particular, the times taken to move jobs between the active and impeded
sets. The ability of the policy to react quickly to variations in program
behaviour is also open to investigation.
The choice of J: The function of the control parameter J is twofold:
first, to prevent thrashing effectively and second, to ensure that the "good"
jobs (those that do not have page faults often) are not ejected from
the active set often. To fulfil the first objective J should not be too large
(otherwise there would be no control), and to fulfil the second it should not
be too small. This trade-off can be assessed by evaluating the probability
p_r that a class r job is ejected from the active set before it is completed.
Under exponential assumptions about total execution and inter-page fault
times we can write (see Chapter 1)

p_r = [(1/e_r) / (μ_r + 1/e_r)]^J = 1 / (e_r μ_r + 1)^J,  r = 1, 2, . . . , R.
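Taking p_r in the form just derived, the trade-off in the choice of J can be tabulated directly; the lifetime and rate values in this sketch are our own illustrative choices, not taken from the text.

```python
# Probability that a class r job is ejected before completion,
# p_r = 1 / (e_r * mu_r + 1) ** J, for a "good" and a "bad" class.
# The e and mu values are illustrative.
def p(e, mu, J):
    return 1.0 / (e * mu + 1) ** J

e1, mu1 = 0.1, 1.0      # "good" class: long inter-fault intervals
e2, mu2 = 0.002, 1.0    # "bad" class: frequent page faults
for J in (1, 10, 30, 100):
    print(J, p(e1, mu1, J), p(e2, mu2, J))
```

For these values, J around 30 already gives p_1 close to 0 and p_2 close to 1, so both objectives are met; with less extreme class differences no such J exists and the choice becomes the compromise discussed below.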
In the case of two job classes suppose that class 1 is much "better" than
class 2 (i.e. e_1 ≫ e_2) and that it is possible to choose J so that Je_1μ_1 ≫ 1
and Je_2μ_2 ≪ 1. Then we would have p_1 ≈ 0, p_2 ≈ 1 and both objectives
would be satisfied. If the difference between the two classes is not so extreme
and it is not possible to achieve p_1 ≈ 0 without at the same time having
p_2 ≈ 0, then the decision is much less clear-cut. One could proceed by
The memory allocated to class r is divided equally among the active class r
jobs. This memory allocation strategy must be accompanied by a job
admission strategy, in order to avoid thrashing.
One could then decide to admit a class r job into the active set if there are
cr free pages in the allocation for class r (the knee criterion, section 9.3).
In this last case, the number of class r jobs in the active set would be
Here, as before, 1/r is the average CPU time required by a class r job;
the distribution of that time is assumed to be exponential.
The global system state is now described by the vector N =
(N_1, N_2, . . . , N_R), where N_r is the number of class r jobs submitted for
execution. Using the appropriate mapping N → n (equation (9.25) is an
example) in conjunction with (9.26), one can write balance equations for the
steady-state distribution of N. These equations have to be solved numeri-
cally (perhaps using an approximation technique: see [10]), since there are
no closed-form solutions for R-dimensional Markov processes. From the dis-
tribution of N one can find the mean number of class r jobs submitted and
hence, by Little's theorem, the average response time W_r (r = 1, 2, . . . , R).
Some results for the case of two job classes are illustrated in Fig. 9.7.
The outer system in that example consisted of two independent Poisson
streams of jobs, with rates 1 and 2 for class 1 and class 2, respectively.
The inner system comprised a CPU, a paging drum and a filing disk;
Fig. 9.7.
Fig. 9.8.
The total CPU utilisation U_0 is obtained by summing (9.28) over all job
classes:

U_0 = Σ_{r=1}^R N_r s_{0r} / (W_r + ⋯). (9.29)
When U_0 tends to zero, that curve moves away from the origin and flattens
out; conversely, when U_0 increases the curve moves towards the origin and
becomes more convex.
Thus, in a well-tuned system where the CPU utilisation is close to the
maximum attainable, the performance vectors of all scheduling strategies
which maintain that utilisation lie on a hyperbola. Of course, not all points
on the hyperbola are achievable. As in the single-server case, the achievable
performance vectors must satisfy inequalities of the type
W_r ≥ W_r^min,  r = 1, 2,
where Wrmin is the average response time for class r jobs in a system where
the other job class does not exist and where the CPU utilisation is the
maximum attainable. The set of achievable performance vectors is therefore
contained in a region such as the shaded area in Fig. 9.9. The situation is
similar when the number of job classes is R > 2.
We should point out that this characterisation of achievable perfor-
mance has limited practical value, if any. The maximum attainable CPU
utilisation is not usually known and the position of the bounding hyperbola
(or R-dimensional surface) may be very sensitive to its estimate. Moreover,
even if the constraints are calculated accurately there is no guarantee that
Fig. 9.9.
all points which satisfy them are, in fact, achievable. What we have obtained
is an idea of the likely shape of the set of achievable performance vectors.
That idea is consistent with the results of the last section (the curve in
Fig. 9.7b closely resembles a hyperbola). It also confirms the observation
made before, namely that the trade-offs between response times for different
job classes are likely to be non-linear.
References
1. Adams, M. C. and Millard, G. E. (1975). Performance Measurements on
the Edinburgh Multi-Access System (EMAS). Proc. ICS 75, Antibes.
2. Belady, L. A. and Kuehner, C. J. (1969). Dynamic space sharing in computer
systems. Comm. A.C.M., 12, 282–288.
3. Buzen, J. P. (1976). Fundamental operational laws of computer system
performance. Acta Informatica, 7, 167–182.
4. Chamberlin, D. D., Fuller, S. H. and Lin, L. Y. (1973). A Page Allocation
Strategy for Multiprogramming Systems with Virtual Memory. Proc. 4th
Symp. on Operating Systems Principles, pp. 66–72.
5. Denning, P. J. and Kahn, K. C. (1975). A Study of Program Locality and
Lifetime Functions. Proc. 5th Symp. on Operating Systems Principles,
pp. 207–216.
6. Denning, P. J., Kahn, K. C., Leroudier, J., Potier, D. and Suri, R. (1976).
Optimal multiprogramming. Acta Informatica, 7, 197–216.
7. Gelenbe, E. and Kurinckx, A. (1978). Random injection control of multi-
programming in virtual memory. IEEE Trans. on Software Engng., 4, 2–17.
8. Gelenbe, E., Kurinckx, A. and Mitrani, I. (1978). The Rate Control Policy
for Virtual Memory Management. Proc. 2nd Int. Symp. on Operating
Systems, IRIA, Rocquencourt.
9. Hine, J. H. (1978). Scheduling for Pre-specified Performance in Multipro-
grammed Computer Systems. Research Report, University of Wellington.
10. Hine, J. H., Mitrani, I. and Tsur, S. (1979). Control of response times in
multi-class systems by memory allocation. C.A.C.M., 22(7), 415–423.
11. Leroudier, J. and Potier, D. (1976). Principles of Optimality for Multipro-
gramming. Proc. Int. Symp. on Computer Performance Modelling, Measur-
ing and Evaluation, pp. 211–218. Cambridge, Massachusetts.
12. Rodriguez-Rosell, J. and Dupuy, J. P. (1972). The Evaluation of a Time
Sharing Page Demand System. Proc. AFIPS, SJCC 40, pp. 759–765.
13. Sekino, A. (1972). Performance Evaluation of a Multiprogrammed
Time Shared Computer System. MIT Project MAC, Research Report
MAC-TR-103.
Chapter 10
A Queue with Server of Walking Type
10.1. Introduction
Queues with autonomous service (QAS) represent service systems in which
the server becomes unavailable for a random time after each service epoch.
Such systems have been used to model secondary memory devices in
computer systems (e.g. paging disks or drums) as was done in Chapter 2.
The queue with server of walking type studied by Skinner [1] is a special
instance of our model. This model has also been considered by Borovkov [4].
Assuming general independent interarrival times we obtain an opera-
tional formula relating the waiting time in the stationary state of a QAS to
the waiting time of the GI/G/1 queue. This result dispenses with the need
for a separate analysis of the QAS in special cases and generalizes the result
of Skinner [1], and that of Coffman [2] for a paging drum. Sufficient
conditions for stability or instability of the system are also obtained.
January 11, 2010 12:17 spi-b749 9in x 6in b749-ch10
That is, service will resume for the (k + 1)-th customer, which arrives at
time a_{k+1} = Σ_{i=1}^{k+1} A_i, at time s_k + T_k + Σ_{i=1}^{l(a_{k+1})} S̃_i^k, where

l(a_{k+1}) = inf{l : s_k + T_k + Σ_{i=1}^l S̃_i^k ≥ a_{k+1}}.
We assume that the {S̃_n^k}_{n,k≥1} are i.i.d. and independent of the interarrival
times and of the sequence {S_n}_{n≥1}. In the sequel, we shall drop the index
k associated with S̃_n^k in order to simplify the notation, though it will be
understood that the variables associated with the end of different busy
periods are distinct.
The model we consider arises in many applications. In computer
systems [2, 3, 5] it serves as a model of a paging drum (in this case S
and S̃ are constant and equal). In data communication systems it can
serve to represent a data transmission facility where transmission begins
at predetermined instants of time.
Using the terminology of Skinner [1], who analyzed the model assuming
Poisson arrivals, we shall call it a queue with server of walking type: after
each service the server takes a walk. Borovkov [4] studies a related model
which he calls a queue with autonomous service.
The purpose of this chapter is to obtain a general formula relating the
waiting time $W_n$ of the $n$-th customer in our model to the waiting time
$V_n$ of the $n$-th customer in an equivalent GI/G/1 queue, $n \ge 1$. This
equivalent GI/G/1 queue has the same arrival process, but its service times
are $S_1, S_2, \ldots, S_n, \ldots$ and $V_{n+1} = [V_n + S_n - A_{n+1}]^+$. This result allows
us to dispense with a special analysis of our queueing model in stationary
state, since we can obtain the result directly from the known analysis of the
corresponding GI/G/1 queue.
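The two recursions can be compared side by side in a small simulation. The following sketch is an illustration, not part of the original analysis: the exponential interarrival, service and walk-time distributions and all parameter values are arbitrary assumptions. It runs the Lindley recursion $V_{n+1} = [V_n + S_n - A_{n+1}]^+$ for the equivalent GI/G/1 queue and, on the same random input, the QAS recursion in which a waiting time that would become negative is pushed up by successive walk times $\tilde S_i$ until it is non-negative; pathwise, $W_n \ge V_n$.

```python
import random

def simulate(n=10_000, lam=0.5, mu=1.5, walk_mean=0.4, seed=1):
    """Run the Lindley recursion V_{n+1} = max(V_n + S_n - A_{n+1}, 0)
    for the equivalent GI/G/1 queue and, on the same random input,
    the QAS recursion in which a 'negative' waiting time is pushed
    up by successive walk times until it becomes non-negative."""
    rng = random.Random(seed)
    v = w = 0.0
    vs, ws = [], []
    for _ in range(n):
        xi = rng.expovariate(mu) - rng.expovariate(lam)  # xi_n = S_n - A_{n+1}
        v = max(v + xi, 0.0)                  # GI/G/1 (Lindley recursion)
        w += xi
        while w < 0.0:                        # server is away on a walk
            w += rng.expovariate(1.0 / walk_mean)
        vs.append(v)
        ws.append(w)
    return vs, ws

vs, ws = simulate()
```

On every sample path produced this way, $W_n \ge V_n \ge 0$ for all $n$, which is the pathwise counterpart of the comparison used below.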
The formula (Theorem 4) is derived in Section 10.2, together with sufficient
conditions for ergodicity. Section 10.3 contains an application to the paging
drum model.
distribution of the $S_n$ and the $A_n$. His main result is that the queue
length distribution (where the queue does not include the customer in
service) of the above system is identical to the queue length distribution of
a conventional queue (with batch arrivals and batch service) if the service
times are exponentially distributed. The model considered by Skinner [1] is
a special case of the one we study, since he assumes that the arrival process is
Poisson; otherwise it is identical to ours. He obtains the generating function
for the queue length distribution in stationary state.
Lemma 1.
Proof. Notice from (10.1) that $\varphi(x) \ge x$ for all $x$ with probability 1: if
$x > 0$ the statement is obvious; since $\varphi(x) \ge 0$ with probability 1, it follows
that $\varphi(x) \ge x$ if $x \le 0$. Therefore, by Lemma 1 we have
$$
W_{n+1} \ge W_n + \xi_n, \qquad n \ge 1.
$$
Therefore $W_{n+1} \ge \sum_{i=1}^{n} \xi_i$, $n \ge 1$. If $E\xi_n > 0$, then the sum on the RHS
tends with probability 1 to $+\infty$ as $n \to \infty$.
Henceforth we shall assume that $E\xi_n < 0$ for all $n \ge 1$.
We shall now study the characteristic function $E e^{itW_{n+1}}$ for the waiting
time process. Using (10.1) we have, for any real $t$,
$$
E e^{itW_{n+1}} = E\, e^{it(W_n+\xi_n)} I[W_n+\xi_n \ge 0]
+ \sum_{k=0}^{\infty} E\, e^{it\left(W_n+\xi_n+\sum_{i=1}^{k+1}\tilde S_i\right)}
\cdot I\Big[\, W_n+\xi_n+\sum_{i=1}^{k+1}\tilde S_i \ge 0 > W_n+\xi_n+\sum_{i=1}^{k}\tilde S_i \,\Big] \tag{10.5}
$$
Let $f(t) = E e^{it\tilde S}$.
Then
$$
E e^{itW_{n+1}} = E\, e^{it(W_n+\xi_n)}
+ [f(t)-1] \sum_{k=0}^{\infty} E\, e^{it\left(W_n+\xi_n+\sum_{i=1}^{k}\tilde S_i\right)}
\cdot I\Big[\, W_n+\xi_n+\sum_{i=1}^{k}\tilde S_i < 0 \,\Big] \tag{10.6}
$$
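The passage from (10.5) to (10.6) is a telescoping computation; the following reconstruction of the omitted step (the abbreviations $Z$, $B_k$, $\psi_k$ are introduced here only for this check) uses the independence of $\tilde S_{k+1}$ from $Z, \tilde S_1, \ldots, \tilde S_k$, and the fact that the events $B_k$ are decreasing:

```latex
% Write Z = W_n + \xi_n, B_k = \{ Z + \sum_{i=1}^{k}\tilde S_i < 0 \} (so B_{k+1}\subseteq B_k),
% and \psi_k = E\, e^{it(Z+\sum_{i=1}^{k}\tilde S_i)} I[B_k].
\begin{align*}
\sum_{k=0}^{\infty} E\, e^{it(Z+\sum_{i=1}^{k+1}\tilde S_i)}\, I[B_k \setminus B_{k+1}]
  &= \sum_{k=0}^{\infty}\big( f(t)\,\psi_k - \psi_{k+1} \big)
     && \text{($\tilde S_{k+1}$ independent of $B_k$)}\\
  &= f(t)\sum_{k=0}^{\infty}\psi_k - \Big(\sum_{k=0}^{\infty}\psi_k - \psi_0\Big)
   = [f(t)-1]\sum_{k=0}^{\infty}\psi_k + \psi_0 .
\end{align*}
% Since E\, e^{itZ} I[Z \ge 0] = E\, e^{itZ} - \psi_0,
% adding the two parts of (10.5) yields (10.6).
```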
(a) the random variable $\xi$ is not arithmetic; that is, $g(t) = E e^{it\xi}$ has a single
real value $t$ ($t = 0$) for which $g(t) = 1$;
(b) $E\xi < 0$, and $E\tilde S < \infty$.
Then:
$$
Y = \lim_{x\to\infty} \Big( \sum_{i=1}^{l(x)} \tilde S_i - x \Big)
$$
exists (limit in probability), and is independent of $V$.
Proof. Define
$$
\Psi_n(t) = \sum_{k=0}^{\infty} E\, e^{it\left(W_n+\xi_n+\sum_{i=1}^{k}\tilde S_i\right)}
\cdot I\Big[\, W_n+\xi_n+\sum_{i=1}^{k}\tilde S_i < 0 \,\Big]
= \sum_{k=0}^{\infty} \int_{-\infty}^{0} e^{itx}\, d\,P\Big( W_n+\xi_n+\sum_{i=1}^{k}\tilde S_i < x \Big) \tag{10.7}
$$
and $\Phi_n(t) = E e^{itW_n}$. Since $\xi_n$ is independent of $W_n$, (10.6) may then be
written as
$$
\Phi_{n+1}(t) = g(t)\,\Phi_n(t) + [f(t)-1]\,\Psi_n(t) \tag{10.8}
$$
Our proof will be complete if we can prove the existence and uniqueness
of the characteristic function $\Phi(t) \equiv E e^{itW}$ of a positive random variable
$W$, which is the solution of the stationary equation
$$
\Phi(t) = g(t)\,\Phi(t) + [f(t)-1]\,\Psi(t) \tag{10.9}
$$
obtained from (10.8), such that (i) and (ii) are satisfied.
Uniqueness. We shall first show that if the solution $\Phi(t)$ to (10.9) exists,
then it is unique. If $\Phi(t)$ exists, it must be continuous for real $t$, and $\Psi(t)$
must exist. Using (10.7):
$$
\Psi(t) = \int_{-\infty}^{0} e^{ity}\, dG(y)
$$
where
$$
G(y) \equiv \sum_{k=0}^{\infty} P\Big( W + \xi + \sum_{i=1}^{k} \tilde S_i < y \Big).
$$
This sum is finite for $y \le 0$: for any $x \ge 0$,
$$
\sum_{k=0}^{\infty} P\Big( x + \xi + \sum_{i=1}^{k} \tilde S_i < 0 \Big)
\le \sum_{k=0}^{\infty} P\Big( \xi + \sum_{i=1}^{k} \tilde S_i < 0 \Big) = E H(-\xi),
$$
where $H$ is the renewal function of the $\tilde S_i$, so that
$$
\sum_{k=0}^{\infty} P\Big( W + \xi + \sum_{i=1}^{k} \tilde S_i < y \Big)
\le \sum_{k=0}^{\infty} P\Big( \xi + \sum_{i=1}^{k} \tilde S_i < 0 \Big).
$$
Consider the LHS of (10.10). $\Phi(t)$, $f(t)$ and $E e^{itV}$ are characteristic
functions of positive random variables; they are therefore analytic in the
upper half-plane ($\mathrm{Im}(t) > 0$), continuous on the real line, and bounded.
Consider the RHS of (10.10). $\Psi(t)$ is the characteristic function of a negative
random variable and so is $E e^{itX}$; therefore the RHS of (10.10) is analytic
in the lower half-plane ($\mathrm{Im}(t) < 0$), continuous on the real line, and
bounded. Therefore, by Liouville's Theorem, the expression (10.10) is a
constant, call it $C$. Let us write:
$$
\Phi(t) = \frac{C\,(f(t)-1)}{it\,P(V=0)}\, E e^{itV}.
$$
Taking
$$
1 = \Phi(0) = \frac{C\,E\tilde S}{P(V=0)}
$$
we have $C = P(V=0)/E\tilde S$ and
$$
\Phi(t) = \frac{f(t)-1}{it\,E\tilde S}\, E e^{itV} \tag{10.11}
$$
Therefore if $\Phi(t)$ exists, then it is unique, since it is given by (10.11).
In fact, we have also shown that if it exists, it satisfies (ii), since (10.11) is
simply the Fourier transform of the statement in (ii).
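The factor multiplying $E e^{itV}$ in (10.11) can be identified explicitly; the following short computation (added here as a check, with $F_{\tilde S}$ denoting the distribution function of $\tilde S$) shows that it is the characteristic function of the equilibrium (stationary residual) distribution of $\tilde S$, which has density $P(\tilde S > y)/E\tilde S$ on $y \ge 0$. Thus (10.11) states that $W = V + Y$ with $Y$ so distributed and independent of $V$:

```latex
\int_0^\infty e^{ity}\,\frac{P(\tilde S > y)}{E\tilde S}\,dy
= \frac{1}{E\tilde S}\int_0^\infty e^{ity}\int_y^\infty dF_{\tilde S}(u)\,dy
= \frac{1}{E\tilde S}\int_0^\infty \frac{e^{itu}-1}{it}\,dF_{\tilde S}(u)
= \frac{f(t)-1}{it\,E\tilde S}\,.
```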
Existence. We must now prove the existence of the solution $\Phi(t)$, given by
(10.11), of the equation (10.9).
Using (10.7), we shall show that the $\Phi(t)$ of (10.11) is a solution to (10.9).
We write, from (10.7):
$$
\Psi(t) = \sum_{k=0}^{\infty} \int_{-\infty}^{0} e^{itx}\, d\,P\Big( W + \xi + \sum_{n=1}^{k} \tilde S_n < x \Big) \tag{10.12}
$$
Consider the measure $\mu$ defined by the sum of the distributions appearing
in (10.12), and denote by $\mu_2$ its restriction to $\mathbf{R}^+$:
$$
\mu_2\,[0,x[\; = \sum_{k=0}^{\infty} P\Big( 0 \le W + \xi + \sum_{n=1}^{k} \tilde S_n < x \Big).
$$
The Fourier transform of $\mu$ is
$$
\hat\mu(t) = \frac{g(t)\,\Phi(t)}{1-f(t)} = -\,\frac{g(t)}{it\,E\tilde S}\, E e^{itV},
$$
using (10.11). We deduce $\mu = \mu_1 + \mu_2$, where
$\mu_2$ is restricted to $\mathbf{R}^+$.
$\mu_1$ is the restriction of $\mu$ to $\mathbf{R}^-$ and therefore has the Fourier transform
$\Psi(t)$, which is
$$
\Psi(t) = P(V=0)\,\frac{1 - E(e^{itX})}{it\,E\tilde S}.
$$
Hence, replacing $\Psi(t)$ above and (10.11) in (10.9), we see that the equality
(10.9) is satisfied, completing the existence proof.
We have established the existence and uniqueness of the stationary
solution $\Phi(t)$ of equation (10.8). We now have to prove that
$$
\lim_{n\to\infty} W_n = W \quad \text{(in probability)},
$$
i.e. that this stationary solution is the limit in the above equation. For this
we shall call upon general results on the ergodicity of Markov chains, as
presented by Revuz [6]. In particular:
1. We first show that $W_n$ is irreducible.
2. We use the theorem (Revuz [6], Theorem 2.7, Chapter 3) that states
that if a chain is irreducible and if a finite invariant measure exists, then
it is recurrent in the sense of Harris (i.e. a Harris chain). Thus we show
that $W_n$ is a Harris chain.
3. Finally, we use Orey's theorem (Revuz [6], Theorem 2.8, Chapter 6),
which states that if a finite invariant measure $m$ exists for an aperiodic
Harris chain $W_n$, then $W_n \to_p W$; if the measure $m$ is a probability
measure, then it is the measure of $W$.
1. To show irreducibility, consider the measure $m$ whose Fourier transform
is $\Phi(t)$. By (10.11) we can write
$$
m = v * s
$$
for a subset $A$ of the non-negative real line. We shall show that, for each
initial state $x \in [0, \infty[$, there exists a positive integer $n$ such that
$$
P(W_n \in A \mid W_0 = x) > 0.
$$
But since $P(V_m = W_m \in A) > 0$ for each finite $m$ (the case where the
queue with autonomous server does not empty up to, and including, the $m$-th
customer), then
2. Theorem 2.7, Chapter 3 of Revuz [6] states that $W_n$ is a Harris chain
if it is $v$-irreducible and if there exists a finite invariant measure $m$ such
that $v(A) > 0 \Rightarrow m(A) > 0$ for all $A$ ($v \ll m$, in Revuz's notation).
This has already been proved. Therefore $W_n$ is indeed recurrent in the
sense of Harris.
3. We now have to show, in order to use Orey's theorem, that $W_n$ is
aperiodic. We call again upon the classical result that $V_n$ is ergodic
if $E\xi < 0$ and $\xi$ is not arithmetic (both of which we have assumed).
Therefore $V_n$ is aperiodic, and so is $W_n$, since for each finite $m$,
$$
P(V_m = W_m \in A) > 0.
$$
Hence
$$
\lim_{n\to\infty} W_n = W \quad \text{(in probability)},
$$
where
$$
W = V + Y.
$$
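For the paging drum case ($S = \tilde S = \tau$ constant, Poisson arrivals), the factor $(f(t)-1)/(it\tau) = (e^{it\tau}-1)/(it\tau)$ is the characteristic function of a uniform variable on $[0,\tau]$, so the theorem asserts $W = V + U$ with $U \sim U(0,\tau)$ independent of $V$. A quick Monte Carlo sanity check of this decomposition (an illustration only, not from the text; the arrival rate and the drum period are arbitrary choices):

```python
import random

def drum_waits(n=100_000, lam=0.4, tau=1.0, seed=7):
    """Simulate the paging-drum QAS (constant service and walk times
    equal to tau, Poisson arrivals of rate lam < 1/tau) together with
    the equivalent M/D/1 queue driven by the same input, and return
    the per-customer differences W_n - V_n."""
    rng = random.Random(seed)
    v = w = 0.0
    diffs = []
    for _ in range(n):
        xi = tau - rng.expovariate(lam)   # xi_n = S_n - A_{n+1}
        v = max(v + xi, 0.0)              # M/D/1 Lindley recursion
        w += xi
        while w < 0.0:                    # wait for the next drum revolution
            w += tau
        diffs.append(w - v)
    return diffs

diffs = drum_waits()
mean_diff = sum(diffs) / len(diffs)  # theorem predicts E(W - V) = tau/2 in steady state
```

With this coupling, each difference lies in $[0, \tau)$ on every sample path, and the long-run average difference should settle near $\tau/2$, in agreement with $W = V + U(0,\tau)$.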
References
1. Skinner, C. E. (1967). A priority queueing model with server walking type,
Operations Research, 15, 278–285.
2. Coffman, E. G. (1969). Analysis of a drum input-output queue under scheduled
operation, J.A.C.M., 16(1), 73–90.
3. Gelenbe, E., Lenfant, J. and Potier, D. (1975). Response time of a fixed-head
disk to transfers of variable length, S.I.A.M. J. on Computing, 4(4), 461–473.
Index

little theorem, 87, 88, 108, 112, 236, 257, 264, 285, 289
local balance, 85, 92, 93, 97, 102, 105, 226
locality of reference, 269
lost arrival, 106
lumpability of stochastic matrices, 218
main memory, 59, 269–272, 274, 281, 286–288, 290, 291
M/D/1 system, 190
M/G/1 system, 50, 58, 242, 243, 248, 249, 252, 260–262, 265
M/G/c system, 80
M/M/1 system, 92, 225, 240, 242, 243, 246, 248–250, 252, 253, 260, 261, 265–267
M/M/c system, 84
marginal distribution, 110, 212, 213, 215, 225, 283, 285
Markov chain, 43, 45, 49, 50, 82, 86, 87, 105, 186, 201, 206, 215, 216
    embedded, 43, 45
Markov process, 85, 96, 97, 100, 284, 285, 289
    irreducible, 85–87
    transient, 82
Markov renewal process, 49, 50
Markov renewal theory, 43, 49
mass service systems, 1
measures of system performance, 106
memory allocation, see also Selective memory allocation
memoryless property, 235
mixing scheduling strategy, see Scheduling strategy
MM property, 105
multiplexed communication channels, 43, 44
multiprogrammed computer system, 74, 267
negative customers, 117, 118
network generating function, 109
network, see queueing network
node, 73–77, 79–90, 93–95, 98–114, 193, 195, 196, 199, 271–274, 278, 284, 285, 292
    closed, 74, 81, 111
    generating function, 107
    open, 74, 81, 83, 101
    recurrent, 81–83, 87
    transient, 81, 82
    utilisation, 87, 106, 108–110, 112, 278, 292
normal distribution, 166, 187
normalising constant, 106, 107, 110, 111, 113, 114, 200
normalising equation, 85, 91, 93, 94, 96, 101
Norton's theorem, 224, 226, 229
numerical discretisation, see Discretisation
optimal scheduling strategies, 259
output theorem, 78, 79, 90
page fault, 272, 274, 276, 278, 280–283, 286
page fault rate control policy, see rate control policy
page frames, 271, 272, 274–276
paging device, 271, 272
paging drum, 44, 58, 63–65, 74, 113, 275, 277, 279–282, 289, 292
path traffic, 196
performance evaluation, 1, 58, 201, 229, 231
performance measures, 43, 44, 75, 76, 106, 107, 113, 198, 231, 273
performance objective, 231, 232, 248, 270
performance vector, 232, 236, 237, 239, 241, 242, 247–256, 258–260, 267, 270, 271, 287, 288, 290–294
    achievable, 231, 232, 236, 237, 239, 242, 247–253, 258, 260, 270, 288, 290, 291, 293, 294
Poisson
    arrival stream, 57, 63, 78, 79, 238
    non-homogeneous, 99
Poisson process, 59, 90, 99