Queuing Theory


Queuing Theory (Waiting Line Models)


Queuing theory is the mathematical study of waiting lines, which are among the most
frequently encountered problems in everyday life: for example, queues at a cafeteria,
library, or bank.

Common to all of these cases are the arrivals of objects requiring service and the
attendant delays when the service mechanism is busy. Waiting lines cannot be
eliminated completely, but suitable techniques can be used to reduce the waiting time of
an object in the system. A long waiting line may result in loss of customers to an
organization. Waiting time can be reduced by providing additional service facilities, but it
may result in an increase in the idle time of the service mechanism.

Queuing Theory Definitions


Queuing theory (or the waiting line model) is based on mathematical theory and deals with
problems arising from the flow of customers toward a service facility.

The waiting line models help management balance the cost associated with waiting
against the cost of providing service. Thus, queuing or waiting line models can
be applied in situations where decisions have to be taken to minimize the waiting
time with minimum investment cost.

Basic Terminology: Queuing theory (Waiting Line Models)


The present section focuses on the standard vocabulary of Waiting Line Models
(Queuing Theory).

Queuing Model
It is a model used to represent a service-oriented problem, where customers
arrive randomly to receive some service, with the service time also being a random variable.

Arrival
The statistical pattern of arrivals can be described by the probability distribution
of the number of arrivals in an interval.

Service Time
The time taken by a server to complete service is known as service time.
Server
It is a mechanism through which service is offered.

Queue Discipline
It is the order in which the members of the queue are offered service.

Poisson Process
It is a probabilistic phenomenon where the number of arrivals in an interval of length t
follows a Poisson distribution with parameter λt, where λ is the rate of arrival.
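As an illustrative check of this definition (a minimal sketch with hypothetical parameter values), the snippet below generates arrivals with exponentially distributed inter-arrival gaps and verifies that the average number of arrivals in an interval of length t is close to λt:

```python
import random

def poisson_count(lam, t, rng):
    """Count arrivals in [0, t] when inter-arrival gaps are Exp(lam)."""
    n = 0
    clock = rng.expovariate(lam)
    while clock <= t:
        n += 1
        clock += rng.expovariate(lam)
    return n

rng = random.Random(42)
lam, t = 3.0, 2.0  # hypothetical: rate 3 arrivals per unit time, interval of length 2
samples = [poisson_count(lam, t, rng) for _ in range(20000)]
mean_count = sum(samples) / len(samples)  # should be close to lam * t = 6
```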

Queue
A group of items waiting to receive service, including those receiving the service, is
known as a queue.

Waiting time in queue


Time spent by a customer in the queue before being served.

Waiting time in the system


It is the total time spent by a customer in the system. It can be calculated as follows:

Waiting time in the system = Waiting time in queue + Service time

Queue length
The number of customers in the system at any time.

Average length of line


The number of customers in the queue per unit of time.

Average idle time


The average time for which the system remains idle.

FIFO
It is the first in first out queue discipline.
Bulk Arrivals
If more than one customer enters the system at an arrival event, this is known as bulk
arrivals.
Please note that bulk arrivals are not covered by the models of the subsequent
sections.

Queueing theory is the mathematical study of waiting lines, or queues.[1] A queueing model is
constructed so that queue lengths and waiting time can be predicted.[1] Queueing theory is generally
considered a branch of operations research because the results are often used when making business
decisions about the resources needed to provide a service.

Queueing theory has its origins in research by Agner Krarup Erlang when he created models to describe
the Copenhagen telephone exchange.[1] The ideas have since seen applications including
telecommunication, traffic engineering, computing[2] and, particularly in industrial engineering, in the
design of factories, shops, offices and hospitals, as well as in project management.[3][4]

Contents

1 Spelling

2 Single queueing nodes

3 Overview of the development of the theory

4 Service disciplines

5 Simple two-equation queue

6 Queueing networks

6.1 Example analysis of an M/M/1 queue

6.2 Routing algorithms

7 Mean field limits

8 Fluid limits

9 Heavy traffic/diffusion approximations

10 See also
11 References

12 Further reading

13 External links

Spelling

The spelling "queueing" over "queuing" is typically encountered in the academic research field. In fact,
one of the flagship journals of the profession is named Queueing Systems.

Single queueing nodes

Simple description and analogy

A queue, or "queueing node", can be thought of as nearly a black box. Jobs or "customers" arrive to the
queue, possibly wait some time, take some time being processed, and then depart from the queue (see
Fig. 1).

Fig. 1. A black box. Jobs arrive to, and depart from, the queue.

The queueing node is not quite a pure black box, however, since there is some information we need to
specify about the inside of the queuing node. The queue has one or more "servers" which can each be
paired with an arriving job until it departs, after which that server will be free to be paired with another
arriving job (see Fig. 2).

Fig. 2. A queueing node with 3 servers. Server a is idle, and thus an arrival is given to it to process. Server
b is currently busy and will take some time before it can complete service of its job. Server c has just
completed service of a job and thus will be next to receive an arriving job.

An analogy often used is that of the cashier at a supermarket. There are other models, but this is one
commonly encountered in the literature. Customers arrive, are processed by the cashier, and depart.
Each cashier processes one customer at a time, and hence this is a queueing node with only one server.
A setting where a customer leaves immediately upon arriving to find the cashier busy is called a
queue with no buffer (or no "waiting area", or similar terms). A setting with a waiting zone for up to n
customers is called a queue with a buffer of size n.

Birth-death process

The behaviour/state of a single queue (also called a "queueing node") can be described by a birth–
death process, which describes the arrivals and departures from the queue, along with the number of
jobs (also called "customers" or "requests", or any number of other things, depending on the field)
currently in the system. An arrival increases the number of jobs, k, by 1, and a departure (a job completing
its service) decreases k by 1 (see Fig. 3).

Fig. 3. Birth / death process. The values in the circles represent the state of the birth-death process, k.
The system transitions between values of k by "births" and "deaths" which occur at rates given by
various values of λi and μi, respectively. For a queueing system, k is the number of jobs in the system
(either being serviced or waiting if the queue has a buffer of waiting jobs). Further, for a queue, the
arrival rates and departure rates are generally considered not to vary with the number of jobs in the
queue so that we consider a single average rate of arrivals/departures per unit time to the queue.
Hence, for a queue, this diagram has arrival rates λ = λ1 = λ2 = … = λk and departure rates
μ = μ1 = μ2 = … = μk (see Fig. 4).

Fig. 4. A queue with 1 server, arrival rate λ and departure rate μ.

Kendall's notation

Single queueing nodes are usually described using Kendall's notation in the form A/S/c where A
describes the distribution of durations between each arrival to the queue, S the distribution of service
times for jobs and c the number of servers at the node.[5][6] For an example of the notation, the M/M/1
queue is a simple model where a single server serves jobs that arrive according to a Poisson process
(inter-arrival durations are exponentially distributed) and have exponentially distributed service times.
In an M/G/1 queue, the G stands for general and indicates an arbitrary probability distribution for
service times.

Overview of the development of the theory


In 1909, Agner Krarup Erlang, a Danish engineer who worked for the Copenhagen Telephone Exchange,
published the first paper on what would now be called queueing theory.[7][8][9] He modeled the
number of telephone calls arriving at an exchange by a Poisson process and solved the M/D/1 queue in
1917 and M/D/k queueing model in 1920.[10] In Kendall's notation:

M stands for Markov or memoryless and means arrivals occur according to a Poisson process;

D stands for deterministic and means jobs arriving at the queue require a fixed amount of service;

k describes the number of servers at the queueing node (k = 1, 2, …).

If there are more jobs at the node than there are servers, then jobs will queue and wait for service.
The M/G/1 queue was solved by Felix Pollaczek in 1930,[11] a solution later recast in probabilistic terms
by Aleksandr Khinchin and now known as the Pollaczek–Khinchine formula.[10][12]

After the 1940s queueing theory became an area of research interest to mathematicians.[12] In 1953
David George Kendall solved the GI/M/k queue[13] and introduced the modern notation for queues,
now known as Kendall's notation. In 1957 Pollaczek studied the GI/G/1 using an integral equation.[14]
John Kingman gave a formula for the mean waiting time in a G/G/1 queue: Kingman's formula.[15]

Leonard Kleinrock worked on the application of queueing theory to message switching and packet
switching. His initial contribution to this field was his doctoral thesis at the Massachusetts Institute of
Technology in 1962, published in book form in 1964 in the field of digital message switching. His
theoretical work after 1967 underpinned the use of packet switching in the ARPANET, a forerunner to
the Internet.

The matrix geometric method and matrix analytic methods have allowed queues with phase-type
distributed inter-arrival and service time distributions to be considered.[16]

Problems such as performance metrics for the M/G/k queue remain an open problem.[10][12]

Service disciplines
First in first out (FIFO) queue example.

Various scheduling policies can be used at queuing nodes:

First in first out

Also called first-come, first-served (FCFS),[17] this principle states that customers are served one at a
time and that the customer that has been waiting the longest is served first.[18]

Last in first out

This principle also serves customers one at a time, but the customer with the shortest waiting time will
be served first.[18] Also known as a stack.

Processor sharing

Service capacity is shared equally between customers.[18]

Priority

Customers with high priority are served first.[18] Priority queues can be of two types, non-preemptive
(where a job in service cannot be interrupted) and preemptive (where a job in service can be
interrupted by a higher-priority job). No work is lost in either model.[19]

Shortest job first

The next job to be served is the one with the smallest size.

Preemptive shortest job first

The next job to be served is the one with the smallest original size.[20]

Shortest remaining processing time

The next job to serve is the one with the smallest remaining processing requirement.[21]
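The effect of a discipline can be made concrete with a small illustrative comparison (hypothetical job sizes, all jobs assumed present at time 0): serving the shortest job first reduces the average waiting time relative to first in first out.

```python
def avg_wait(service_times):
    """Average waiting time when jobs are served in the given order,
    all jobs being present at time 0."""
    t, waits = 0.0, []
    for s in service_times:
        waits.append(t)  # this job waits until the server becomes free
        t += s
    return sum(waits) / len(waits)

jobs = [5.0, 1.0, 3.0]              # hypothetical service times
fifo_wait = avg_wait(jobs)          # (0 + 5 + 6) / 3
sjf_wait = avg_wait(sorted(jobs))   # (0 + 1 + 4) / 3
```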

Service facility

Single server: customers line up and there is only one server

Several parallel servers–Single queue: customers line up and there are several servers

Several servers–Several queues: there are many counters and customers can decide which queue to
join
Customer's behavior of waiting

Balking: customers deciding not to join the queue if it is too long

Jockeying: customers switch between queues if they think they will get served faster by doing so

Reneging: customers leave the queue if they have waited too long for service

Arriving customers not served (either due to the queue having no buffer, or due to balking or reneging
by the customer) are also known as dropouts and the average rate of dropouts is a significant parameter
describing a queue.

Simple two-equation queue

A common basic queuing system is attributed to Erlang, and is a modification of Little's Law. Given an
arrival rate λ, a dropout rate σ, and a departure rate μ, length of the queue L is defined as:

L = (λ − σ)/μ.

Assuming an exponential distribution for the rates, the waiting time W can be defined as the proportion
of arrivals that are served. This is equal to the exponential survival rate of those who do not drop out
over the waiting period, giving:

μ/λ = e^(−Wμ)

The second equation is commonly rewritten as:

W = (1/μ) ln(λ/μ)

The two-stage one-box model is common in epidemiology.[22]
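With illustrative rates (λ = 10, σ = 2 and μ = 4, chosen only so that λ > μ, which keeps the logarithm positive), the two equations give:

```python
import math

lam, sigma, mu = 10.0, 2.0, 4.0  # illustrative arrival, dropout and departure rates

L = (lam - sigma) / mu           # queue length: (10 - 2) / 4 = 2.0
W = math.log(lam / mu) / mu      # waiting time, solved from mu/lam = exp(-W*mu)
```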

Queueing networks

Networks of queues are systems in which a number of queues are connected by what's known as
customer routing. When a customer is serviced at one node it can join another node and queue for
service, or leave the network.
For networks of m nodes, the state of the system can be described by an m–dimensional vector (x1, x2,
…, xm) where xi represents the number of customers at each node.

The simplest non-trivial network of queues is the tandem queue.[23] The first significant results in
this area were Jackson networks,[24][25] for which an efficient product-form stationary distribution
exists, and the mean value analysis,[26] which allows average metrics such as throughput and sojourn
times to be computed.[27] If the total number of customers in the network remains constant, the
network is called a closed network and has also been shown to have a product–form stationary
distribution in the Gordon–Newell theorem.[28] This result was extended to the BCMP network[29]
where a network with very general service time, regimes and customer routing is shown to also exhibit a
product-form stationary distribution. The normalizing constant can be calculated with Buzen's
algorithm, proposed in 1973.[30]
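As a sketch of how that normalizing constant is computed (a minimal version of Buzen's convolution algorithm assuming single-server stations; the rho values below are hypothetical relative utilizations):

```python
def buzen_g(rhos, N):
    """Buzen's convolution algorithm for a closed product-form network.

    rhos: relative utilization of each single-server station
          (visit ratio times mean service time, up to a common factor).
    Returns g with g[n] = G(n), the normalizing constant for n customers.
    """
    g = [1.0] + [0.0] * N
    for rho in rhos:
        # one convolution pass per station
        for n in range(1, N + 1):
            g[n] += rho * g[n - 1]
    return g

g = buzen_g([1.0, 1.0], 2)  # two identical stations, 2 customers: G(2) = 3
```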

Networks of customers have also been investigated: in Kelly networks, customers of different classes
experience different priority levels at different service nodes.[31] Another type of network is the G-
network, first proposed by Erol Gelenbe in 1993:[32] these networks do not assume exponential time
distributions like the classic Jackson network.

Example analysis of an M/M/1 queue

Consider a queue with 1 server and the following variables:

λ: the average arrival rate (customers arriving to the system per unit time, e.g. per 30 seconds);

μ: the average departure rate (customers leaving the system—completing service—per the same unit
time, e.g. per 30 seconds);

n: the number of customers in the system at the given time;

Pn: the probability of there being n customers in the system.

Further, let En represent the number of times the system enters state n, and Ln represent the number of
times the system leaves state n. For all n, we have |En − Ln| ∈ {0, 1}, that is, the number of times the
system leaves a state differs by at most 1 from the number of times it enters that state, since it will
either return into that state at some time in the future (En = Ln) or not (|En − Ln| = 1).
When the system arrives at a steady state, the arrival rate should be equal to the departure rate.

Balance equations
situation 0: λP0 = μP1
situation 1: (λ + μ)P1 = λP0 + μP2
situation n: (λ + μ)Pn = λPn−1 + μPn+1, n ≥ 1
By the balance equations, P1 = (λ/μ)P0.
By mathematical induction, Pn = (λ/μ)^n P0.
Because ΣPn = 1 (summing over n = 0, 1, 2, …),
we get P0 = 1 − λ/μ, and hence Pn = (1 − λ/μ)(λ/μ)^n.
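The balance equations and the resulting geometric solution can be verified numerically; the sketch below uses illustrative rates λ = 2 and μ = 5:

```python
lam, mu = 2.0, 5.0  # illustrative rates with lam < mu
rho = lam / mu
P = [(1 - rho) * rho**n for n in range(200)]  # Pn = (1 - lam/mu)(lam/mu)^n

# situation 0: lam*P0 = mu*P1
balance0 = abs(lam * P[0] - mu * P[1])
# situation n: (lam + mu)*Pn = lam*P(n-1) + mu*P(n+1)
balance_n = max(abs((lam + mu) * P[n] - lam * P[n - 1] - mu * P[n + 1])
                for n in range(1, 100))
total = sum(P)  # the probabilities should sum to (essentially) 1
```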

M/M/1 queue

An M/M/1 queueing node

In queueing theory, a discipline within the mathematical theory of probability, an M/M/1
queue represents the queue length in a system having a single server, where arrivals are
determined by a Poisson process and job service times have an exponential distribution. The model
name is written in Kendall's notation. The model is the most elementary of queueing models[1] and an
attractive object of study, as closed-form expressions can be obtained for many metrics of interest in
this model. An extension of this model with more than one server is the M/M/c queue.

Contents

 1 Model definition
 2 Transient solution
 3 Stationary analysis
o 3.1 Number of customers in the system
o 3.2 Busy period of server
o 3.3 Response time
 3.3.1 First-come, first-served discipline
 3.3.2 Processor sharing discipline
 4 Diffusion approximation
 5 References

Model definition
An M/M/1 queue is a stochastic process whose state space is the set {0,1,2,3,...} where the value
corresponds to the number of customers in the system, including any currently in service.

 Arrivals occur at rate λ according to a Poisson process and move the process from
state i to i + 1.
 Service times have an exponential distribution with rate parameter μ in the M/M/1 queue,
where 1/μ is the mean service time.
 A single server serves customers one at a time from the front of the queue, according to
a first-come, first-served discipline. When the service is complete the customer leaves the queue
and the number of customers in the system reduces by one.
 The buffer is of infinite size, so there is no limit on the number of customers it can contain.
The model can be described as a continuous time Markov chain with transition rate matrix
on the state space {0,1,2,3,...}. This is the same continuous time Markov chain as in a birth–
death process. The state space diagram for this chain is as below.

Transient solution
We can write a probability mass function dependent on t to describe the probability that the
M/M/1 queue is in a particular state at a given time. We assume that the queue is initially in
state i and write pk(t) for the probability of being in state k at time t. Then[2]

pk(t) = e^(−(λ+μ)t) [ ρ^((k−i)/2) I(k−i)(at) + ρ^((k−i−1)/2) I(k+i+1)(at) + (1 − ρ)ρ^k Σ(j = k+i+2 to ∞) ρ^(−j/2) Ij(at) ],

where ρ = λ/μ, a = 2√(λμ), and Ik is the modified Bessel function of the first kind. Moments for the transient
solution can be expressed as the sum of two monotone functions.[3]

Stationary analysis
The model is considered stable only if λ < μ. If, on average, arrivals happen faster than
service completions the queue will grow indefinitely long and the system will not have a
stationary distribution. The stationary distribution is the limiting distribution for large values
of t.
Various performance measures can be computed explicitly for the M/M/1 queue. We write
ρ = λ/μ for the utilization of the buffer and require ρ < 1 for the queue to be stable. ρ
represents the average proportion of time which the server is occupied.
Number of customers in the system
The probability that the stationary process is in state i (contains i customers, including those
in service) is[4]:172–173

Pi = (1 − ρ)ρ^i.
We see that the number of customers in the system is geometrically distributed with
parameter 1 − ρ. Thus the average number of customers in the system is ρ/(1 − ρ) and
the variance of number of customers in the system is ρ/(1 − ρ)2. This result holds for any
work conserving service regime, such as processor sharing. [5]
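The closed-form mean ρ/(1 − ρ) can be compared against a small discrete-event simulation. The sketch below (illustrative parameters; a Lindley-style recursion for a FIFO M/M/1 queue) estimates the average number in system via Little's law:

```python
import random

def mm1_avg_in_system(lam, mu, n_customers, seed=1):
    """Estimate the mean number in system for a FIFO M/M/1 queue.

    A customer starts service at the later of its arrival time and the
    previous customer's departure.  By Little's law, the time-average
    number in system is (total sojourn time) / (simulated horizon).
    """
    rng = random.Random(seed)
    t_arrival = 0.0
    last_departure = 0.0
    total_sojourn = 0.0
    for _ in range(n_customers):
        t_arrival += rng.expovariate(lam)
        start = max(t_arrival, last_departure)
        last_departure = start + rng.expovariate(mu)
        total_sojourn += last_departure - t_arrival
    return total_sojourn / t_arrival

est = mm1_avg_in_system(1.0, 2.0, 200_000)  # rho = 0.5, so theory gives 0.5/0.5 = 1
```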
Busy period of server
The busy period is the time measured from the instant a customer arrives to
an empty system until the instant a customer departs leaving behind an empty system.
The busy period has probability density function[6][7][8][9]

f(t) = (1/(t√ρ)) e^(−(λ+μ)t) I1(2t√(λμ)),   t > 0,

where I1 is a modified Bessel function of the first kind,[10] obtained by using Laplace
transforms and inverting the solution.[11]
The Laplace transform of the M/M/1 busy period is given by[12][13][14]:215

E[e^(−sT)] = (1/(2λ)) (λ + μ + s − √((λ + μ + s)² − 4λμ)),

which gives the moments of the busy period; in particular the mean is 1/(μ − λ)
and the variance is (λ + μ)/(μ − λ)³.
Response time
The average response time or sojourn time (total time a customer spends in
the system) does not depend on scheduling discipline and can be
computed using Little's law as 1/(μ − λ). The average time spent waiting is
1/(μ − λ) − 1/μ = ρ/(μ − λ). The distribution of response times experienced
does depend on scheduling discipline.
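These relations can be checked with a few lines of arithmetic (illustrative rates λ = 3 and μ = 5):

```python
lam, mu = 3.0, 5.0  # illustrative rates with lam < mu
rho = lam / mu

W = 1.0 / (mu - lam)   # mean sojourn (response) time: 1/(mu - lam) = 0.5
Wq = W - 1.0 / mu      # mean waiting time: 0.5 - 0.2 = 0.3 = rho/(mu - lam)
L = lam * W            # Little's law: mean number in system = 1.5 = rho/(1 - rho)
```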
First-come, first-served discipline
For customers who arrive and find the queue as a stationary process, the
response time they experience (the sum of both waiting time and service
time) has transform (μ − λ)/(s + μ − λ)[15] and therefore probability density
function[16]

f(t) = (μ − λ) e^(−(μ − λ)t),   t > 0.
Processor sharing discipline
In an M/M/1-PS queue there is no waiting line and all jobs receive an
equal proportion of the service capacity.[17] If the single server
serves at rate 16 and there are 4 jobs in the system, each job
experiences service at rate 4. The rate at which jobs receive service
changes each time a job arrives at or departs from the system.[17]
For customers who arrive to find the queue as a stationary process,
the Laplace transform of the distribution of response times experienced
by customers was published in 1970,[17] for which an integral
representation is known.[18] The waiting time distribution (response time
less service time) for a customer requiring x amount of service has
transform[4]:356
where r is the smaller root of the equation
The mean response time for a job arriving and requiring
amount x of service can therefore be computed as x μ/(μ − λ).
An alternative approach computes the same results using
a spectral expansion method.[5]
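A quick numeric illustration of the processor-sharing quantities above (the server rate 16 comes from the example in the text; the arrival rate and the job size x are hypothetical):

```python
lam, mu = 4.0, 16.0               # illustrative arrival rate; service rate from the example
jobs_present = 4
per_job_rate = mu / jobs_present  # each of the 4 jobs is served at rate 16/4 = 4

x = 0.5                              # hypothetical service requirement of an arriving job
mean_response = x * mu / (mu - lam)  # x * mu / (mu - lam) = 0.5 * 16 / 12
```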
