Fitting Self-Similar Traffic by A Superposition of Mmpps Modeling The Distribution at Multiple Time Scales

Download as ps, pdf, or txt
Download as ps, pdf, or txt
You are on page 1of 11

PAPER

Fitting Self-Similar Trafc by a Superposition of MMPPs Modeling


the Distribution at Multiple Time Scales
Ant onio NOGUEIRA
a)
, Paulo SALVADOR
b)
, Rui VALADAS
c)
, Regular Members,
and Ant onio PACHECO
d)
, Nonmember
SUMMARY Measuring and modeling network trafc is of key impor-
tance for the trafc engineering of IP networks, due to the growing diversity
of multimedia applications and the need to efciently support QoS differen-
tiation in the network. Several recent measurements have shown that Inter-
net trafc may incorporate long-range dependence and self-similar charac-
teristics, which can have signicant impact on network performance. Self-
similar trafc shows variability over many time scales, and this behavior
must be taken into account for accurate prediction of network performance.
In this paper, we propose a new parameter tting procedure for a superpo-
sition of Markov Modulated Poisson Processes (MMPPs), which is able to
capture self-similarity over a range of time scales. The tting procedure
matches the complete distribution of the arrival process at each time scale
of interest. We evaluate the procedure by comparing the Hurst parameter,
the probability mass function at each time scale, and the queuing behavior
(as assessed by the loss probability and average waiting time), correspond-
ing to measured trafc traces and to traces synthesized according to the
proposed model. We consider three measured trafc traces, all exhibiting
self-similar behavior: the well-known pOct Bellcore trace, a trace of aggre-
gated IP WAN trafc, and a trace corresponding to the popular le sharing
application Kazaa. Our results show that the proposed tting procedure is
able to match closely the distribution over the time scales present in data,
leading to an accurate prediction of the queuing behavior.
key words: Trafc modeling, self-similar, time scale, Markov Modulated
Poisson Process.
1. Introduction
The growing diversity of services and applications in the In-
ternet places a strong requirement on the use of efcient de-
sign and control procedures. In particular, the main char-
acteristics of the supported trafc must be known with suf-
cient detail. Several recent studies have shown that Inter-
net trafc may exhibit properties of self-similarity and long-
range dependence (LRD) [1][7]. In general, self-similarity
implies long-range dependence and vice-versa. Self-similar
trafc shows identical statistical characteristics over a wide
range of time scales, which may have a signicant impact
on network performance. Therefore, it is important to make
frequent measurements of packet ows and to describe them
through appropriate trafc models.
Manuscript received May 27, 2003.
Manuscript revised October 14, 2003.
Final manuscript received November 15, 2003.

University of Aveiro / Institute of Telecommunications


Aveiro, Campus de Santiago, 3810-193 Aveiro, Portugal.

Instituto Superior T ecnico - UTL, Department of Mathemat-


ics and CEMAT, Av. Rovisco Pais, 1049-001 Lisboa, Portugal.
a) E-mail: [email protected]
b) E-mail: [email protected]
c) E-mail: [email protected]
d) E-mail: [email protected]
The impact of LRD on network performance has been
addressed by several works . References [5], [8][10] study
the case of a single queue and conclude that the buffer occu-
pancy is not affected by autocovariance lags that are beyond
the so-called critical time scale (CTS) or correlation hori-
zon (CH), which depends on system parameters such as the
buffer capacity. Similar conclusions are observed for the
case of tandem queues in [11]. Thus, matching the LRD is
only required within the time scales specic to the system
under study. One of the consequences of this result is that
more traditional trafc models, such as Markov Modulated
Poisson Processes (MMPPs), can still be used to model traf-
c exhibiting LRD. The use of MMPPs also benets from
the existence of several tools for calculating the queuing be-
havior and the effective bandwidth.
In this paper we propose a new parameter tting pro-
cedure for a superposition of Markov Modulated Poisson
Processes (MMPPs), which captures self-similar behavior
over a range of time scales. Each MMPP models a specic
time scale. The parameter tting procedure matches, at each
time scale, an MMPP to a probability mass function (PMF)
that describes the contribution of that time scale to the over-
all trafc behavior. The number of states of each MMPP
is not xed a priori; it is determined as part of the tting
procedure. The accuracy of the tting procedure is eval-
uated by applying it to several measured trafc traces that
exhibit self-similar behavior: the well-known pOct Bellcore
trace, a trace of aggregated IP WAN trafc, and a trace cor-
responding to the le sharing application Kazaa. We se-
lected the Kazaa application given its present popularity in
the Internet. We compare the PMF at each time scale, and
the queuing behavior (as assessed by the loss probability and
average waiting time), corresponding to the measured and to
synthetic traces generated from the inferred models. Our re-
sults show that the proposed tting method is very effective
in matching the PMF at the various time scales and leads to
an accurate prediction of the queuing behavior.
When measuring network trafc data, we can either
record the individual arrival instants or the number of ar-
rivals in a predened sampling (time) interval. The former
approach brings more detail but the latter one has the advan-
tage of producing a xed amount of data that is known in
advance. This allows the recording of longer traces, which
clearly pays off the loss of detail in data recording, if the
sampling interval is chosen appropriately. In this work,
we consider discrete-time MMMPs (dMMPPs) instead of
continuous-time MMPPs, since they are more natural model
for data corresponding to the number of arrivals in a sam-
pling interval. Note that discrete-time and continuous-time
MMPPs are basically interchangeable (through a simple pa-
rameter rescaling) as models for arrival processes, whenever
the sampling interval used for the discrete-time version is
small compared with the average sojourn times in the states
of the modulating Markov chain.
Several tting procedures have been proposed in the
literature for estimating the parameters of MMPPs from em-
pirical data ([12][21], among many others). However, most
procedures only apply to 2-MMPPs (e.g. [12], [14], [15],
[18]). This model can capture trafc burstiness but the num-
ber of states is not enough to reproduce variability over a
wide range of time scales. On the other hand, the tting
procedures for MMPPs with an arbitrary number of states
mainly concentrate on matching rst- and/or second-order
statistics, without addressing directly the issue of modeling
on multiple time scales [13], [16], [17], [20], [21]. Yoshihara
et al. [19] developed a tting procedure for self-similar traf-
c based on the superposition of 2-MMPPs, that matches the
variance over specied time scales. In this way, the resulting
MMPP reproduces the variance-scale curve characteristic of
self-similar processes. Kasahara [22] addressed in detail the
queuing behavior achieved by this tting procedure. Sal-
vador et al. [21] proposed a tting procedure that matches
simultaneously the autocovariance and marginal distribution
of the packet counts. The resulting MMPP is constructed as
the superposition of L 2-MMPPs matching the autocovari-
ance, and one M-MMPP matching the marginal distribution.
The autocovariance modeling is such that each 2-MMPP (in
the set of L 2-MMPPs) models a characteristic time con-
stant (also called time scale) of the autocovariance function.
In this paper we develop a procedure that matches the com-
plete distribution of the packet counts at each time scale.
Here, however, the concept of time scale is not directly re-
lated to second-order statistics; instead it refers to the char-
acterization of the trafc process when aggregated over a
number of time intervals. Thus, the proposed tting method
addresses the scaling paradigm in a more natural way than
the one proposed in [21], achieving even higher accuracy in
the tting of self-similar trafc, although usually including
a higher number of parameters in the resulting MMPP.
The paper is organized as follows. Section 2 intro-
duces self-similarity and long-range dependence, motivat-
ing the need for a trafc model that matches the different
time scales of the data. Section 3 gives the required back-
ground on MMPPs. Section 4 presents the various steps of
the parameter tting procedure. Section 5 briey describes
the data traces used in the numerical evaluation and in sec-
tion 6 we discuss the results. Finally, section 7 presents the
main conclusions.
2. Self-similarity, long-range dependence, and time
scales
Consider the continuous-time process Y (t) representing the
trafc volume (e.g. in bytes) from time 0 up to time t and
let X(t) = Y (t) Y (t 1) be the corresponding increment
process (e.g. in bytes/second). Consider also the sequence
X
(m)
(k) which is obtained by averaging X(t) over non-
overlapping blocks of length m, that is
X
(m)
(k) =
1
m
m

i=1
X((k 1)m + i), k = 1, 2, ... (1)
The tting procedure developed in this work will be
based on the aggregated processes X
(m)
(k).
We start by introducing the notion of distributional self-
similarity. Y (t) is exactly self-similar when it is equiv-
alent, in the sense of nite-dimensional distributions, to
a
H
Y (at), for all t > 0 and a > 0, where H (0 < H < 1)
is the Hurst parameter. Clearly, the process Y (t) can not
be stationary. However, if Y (t) has stationary increments
then again X(k) = X
(1)
(k) is equivalent, in the sense of
nite-dimensional distributions, to m
1H
X
(m)
(k). This il-
lustrates that a trafc model developed for tting self-similar
behavior must preferably enable the matching of the distri-
bution on several time scales.
Long-range dependence is associated with stationary
processes. Consider now that X(k) is second-order sta-
tionary with variance
2
and autocorrelation function r(k).
Note that, in this case, X
(m)
(k) is also second-order sta-
tionary. The process X(k) has long-range dependence
(LRD) if its autocorrelation function is nonsummable, that
is,

n
r(n) = . Intuitively, this means that the pro-
cess exhibits similar uctuations over a wide range of time
scales. Taking the case of the pOct Bellcore trace, it can be
seen in Figure 1 that the uctuations over the 0.01, 0.1 and
1s time scales are indeed similar.
Equivalently, one can say that a stationary process is
LRD if its spectrum diverges at the origin, that is f(v)
c
f
|v|

, v 0. Here, is a dimensionless scaling expo-


nent, that takes values in (0, 1); c
f
takes positive real values
and has dimensions of variance. On the other hand, a short
range dependent (SRD) process is simply a stationary pro-
cess which is not LRD. Such a process has = 0 at large
scales, corresponding to white noise at scales beyond the so-
called characteristic scale or correlation horizon. The Hurst
parameter H is related with by H = ( + 1)/2.
There are several estimators of LRD. In this study we
use the semi-parametric estimator developed in [23], which
is based on wavelets. Here, one looks for alignment in the
so-called Logscale Diagram (LD), which is a log-log plot
of the variance estimates of the discrete wavelet transform
coefcients representing the trafc process, against scale,
completed with condence intervals about these estimates
at each scale. It can be thought of as a spectral estima-
tor where large scale corresponds to low frequency. The
main properties explored in this estimator are the stationar-
ity and short-term correlations exhibited by the process of
discrete wavelet transform coefcients and the power-law
dependence in scale of the variance of this process. We will
represent the scale by j and the logarithm of the variance
0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000
0
2
4
6
8
10
12
14
16
18
20
sample number
a
v
e
r
a
g
e

n
u
m
b
e
r

o
f

a
r
r
iv
a
ls
0.01s scale
0.1s scale
1s scale
Fig. 1 LRD processes exhibit uctuations over a wide range of time
scales (Example: trace pOct).
estimate by y
j
. Trafc is said to be LRD if, within the limits
of the condence intervals, the log of the variance estimates
fall on a straight line, in a range of scales from some initial
value j
1
up to the largest one present in data and the slope
of the straight line, which is an estimate of the scaling expo-
nent , lies in (0, 1).
There is a close relationship between long-range de-
pendent and self-similar processes. In fact, if Y (t) is self-
similar with stationary increments and nite variance then
X(k) is long-range dependent, as long as
1
2
< H < 1. The
process X(k) is said to be exactly second-order self-similar
(
1
2
< H < 1) if
r (n) = 1/2
_
(n + 1)
2H
2n
2H
+ (n 1)
2H
_
(2)
for all n 1, or is asymptotically self-similar if
r (n) n
(22H)
L(n) (3)
as n , where L(n) is a slowly varying function at in-
nity. In both cases the autocovariance decays hyperbol-
ically, which indicates LRD. Any asymptotically second-
order self-similar process is LRD, and vice-versa.
3. Markov Modulated Poisson Processes
The discrete-time Markov Modulated Poisson Process
(dMMPP) is the discrete-time version of the popular
(continuous-time) MMPP and may be regarded as an
Markov random walk where the increments in each instant
have a Poisson distribution whose parameter is a function
of the state of the modulator Markov chain. More precisely,
the (homogeneous) Markov chain (Y, J) = {(Y
k
, J
k
), k =
0, 1, . . .} with state space IN
0
S is a dMMPP if and only
if for k = 0, 1, . . .,
P(Y
k+1
= m, J
k+1
= j|Y
k
= n, J
k
= i) =
=
_
0 m < n
p
ij
e

mn
j
(mn)!
m n
(4)
for all m, n IN
0
and i, j S, with
j
, j S, being
nonnegative real constants and P = (p
ij
) being a stochastic
matrix. Note that the distribution of Y
k+1
Y
k
given J
k+1
=
j is Poisson with mean
j
, so that
j
represents the mean
increment of the process Y when the modulating Markov
chain is in state j.
Whenever (4) holds, we say that (Y, J) is a dMMPP
with set of modulating states S and parameter (matrices) P
and , and write
(Y, J) dMMPP
S
(P, ) (5)
where = (
ij
) = (
i

ij
), with being the Kronecker
function, i.e.,
ij
is one if i = j and is zero otherwise. The
matrix P is the transition probability matrix of the modu-
lating Markov chain J, whereas is the matrix of Pois-
son arrival rates. If S has cardinality r, we say that (Y, J)
is a dMMPP of order r (r-dMMPP). When, in particular,
S = {1, 2, . . . , r} for some r IN, then
P =
_

_
p
11
p
12
. . . p
1r
p
21
p
22
. . . p
2r
. . . . . . . . . . . .
p
r1
p
r2
. . . p
rr
_

_
(6)
and
=
_

1
0 . . . 0
0
2
. . . 0
. . . . . . . . . . . .
0 0 . . .
r
_

_
(7)
and we write simply that (Y, J) r-dMMPP
r
(P, ).
Consider now the superposition of L independent
dMMPPs
(Y
(l)
, J
(l)
) r
l
-dMMPP
r
l
(P
(l)
,
(l)
) (8)
where l = 1, 2, . . . , L and J
(1)
, J
(2)
, . . . , J
(L)
are ergodic
chains in steady-state and the dimension of each dMMPP is
not necessarily the same. For l = 1, 2, . . . , L we denote by

(l)
=
_

(l)
1

(l)
2
. . .
(l)
r
l
_
the stationary distribution of J
(l)
.
The result of the superposition is the process
(Y, J) =
_
L

l=1
Y
(l)
, (J
(1)
, J
(2)
, . . . , J
(L)
)
_
(9)
dMMPP
S
(P, )
where
S = {1, 2, . . . , r
1
} . . . {1, 2, . . . , r
L
} (10)
P = P
(1)
P
(2)
. . . P
(L)
(11)
=
(1)

(2)
. . .
(L)
(12)
where r
1
, r
2
, . . . , r
L
, represent the dimensions of each one
of the L dMMPPs, with and denoting the Kronecker
sum and the Kronecker product, respectively. Note that the
Markov chain J is also in steady-state.
In our approach L, that represents the number of time
scales considered, is xed a priori and the dimensions of the
dMMPPs, r
1
, r
2
, . . . , r
L
, are computed as part of the tting
procedure.
The dMMPP is a particular case of the discrete-time
batch Markovian arrival process (usually denoted by D-
BMAP) proposed by Blondia and Casals [24]. These pro-
cesses and queues fed by them have received a great deal
of attention (see, e.g., [25][33] and references therein). In
particular, we note their use in explaining long range depen-
dence [28].
4. Inference Procedure
The inference procedure estimates one dMMPP for each
time scale that matches a probability mass function (PMF)
characteristic of that time scale. The resulting dMMPP is
obtained from the superposition of all dMMPPs inferred for
each time scale. This inference procedure is closely related
to the notion of distributional self-similarity. The owchart
of the inference method is represented in gure 2 where,
basically, four steps can be identied: (i) compute the data
sequences (corresponding to the average number of arrivals
per time interval) at each time scale; (ii) calculate the em-
pirical PMF at the largest time scale and infer its dMMPP;
(iii) for all other time scales (going from the largest to the
smallest one), calculate the empirical PMF, deconvolve it
from the empirical PMF of the previous time scale and infer
a dMMPP that matches the resulting PMF; and (iv) calculate
the nal dMMPP through superposition of the dMMPPs in-
ferred for each time scale. We will describe these steps in
detail in the next subsections. The time interval at the small-
est time scale, t, the number of time scales, L, and the
level of aggregation, a, are given apriori.
4.1 Calculation of the data aggregates
The procedure starts by computing the data sequence cor-
responding to the average number of arrivals in the small-
est time scale, D
(1)
(k), k = 1, 2, . . . , N. Then, it cal-
culates the data sequences of the remaining time scales,
D
(l)
(k), l = 2, ..., L, corresponding to the average number
of arrivals in intervals of length ta
(l1)
. This is given by
D
(l)
(k) =
_
_
_

_
1
a
a1

i=0
D
(l1)
(k + i)
_
,
k1
a
IN
0
D
(l)
(k 1),
k1
a
/ IN
0
(13)
where (x) represents round toward the integer nearest x.
Note that the block length of equation (1) is related with
a and l by m = a
l1
. Note also that all data sequences
have the same length N and that D
(l)
(k) is formed by
sub-sequences of a
l1
successive equal values; these sub-
sequences will be called l-sequences. The empirical distri-
bution of D
(l)
(k) will be denoted by p
(l)
(x).
4.2 Calculation of the PMFs
This step infers the PMFs that, at each time scale, must be
tted to a dMMPP. For the largest time scale, this PMF is
simply the empirical one. For all other time scales l, l =
1, 2, ..., L 1, the associated dMMPP will model only the
trafc components due to that scale. For time scale l, these
trafc components can be obtained through deconvolution
of the empirical PMFs of this time scale and of previous
time scale l + 1, i.e.,

f
(l)
p
(x) =
_
p
(l)

1
p
(l+1)
_
(x) (14)
However, this may result in negative arrival rates for the
dMMPP
(l)
, which will occur whenever
min
_
x : p
(l+1)
(x) > 0
_
< min
_
x : p
(l)
(x) > 0
_
(15)
To correct this, the dMMPP
(l)
will be tted to

f
(l)
(x) =

f
(l)
p
_
x + e
(l)
_
(16)
where e
(l)
= min
_
0, min
_
x :

f
(l)
p
(x) > 0
__
, which as-
sures

f
(l)
(x) = 0, x < 0. These additional factors are
removed in the nal step of the inference procedure.
4.3 Parameter inference
The rst step in the inference of the dMMPP
(l)
parameters,
l = 1, 2, ..., L, is the approximation of

f
(l)
by a weighted
sum of Poisson probability functions. This resorts to an al-
gorithm, introduced in [21], that progressively subtracts a
Poisson probability function from

f
(l)
. The main steps of
this algorithm are depicted in the owchart of gure 3 and
will be explained in the next paragraphs.
Let the i
th
Poisson probability function, with mean

(l)
i
, be represented by g

(l)
i
(x) and dene h
(l)
(i)
(x) as the
difference between

f
(l)
(x) and the weighted sum of Pois-
son probability functions at the i
th
iteration. Initially,
we set h
(l)
(1)
(x) =

f
(l)
(x) and, in each step, we rst
detect the maximum of h
(l)
(i)
(x). The corresponding x-
value,
i
= [h
(l)
(i)
]
1
_
max h
(l)
(i)
(x)
_
, will be considered
the i
th
Poisson rate of the dMMPP
(l)
. We then cal-
culate the weights of each Poisson probability function,
w
(l)
i
=
_
w
(l)
1i
, w
(l)
2i
, ..., w
(l)
ii
_
, through the following set of
linear equations:

f
(l)
(
(l)
m
) =
i

j=1
w
(l)
ji
g

(l)
j
(
(l)
m
) (17)
for m = 1, ..., i and l = 1, ..., L. This assures that the tting
between

f
(l)
(x) and the weighted sum of Poisson proba-
bility functions is exact at
(l)
m
points, for m = 1, 2, . . . , i.
The nal step in each iteration is the calculation of the new
difference function
aggregate data for all time
scales, l (l=2,...,L)
calculation of the empirical
PMF for time scale l=L
approximation by a weighted
sum of Poisson distributions
inference of the dMMPP
(l)
parameters
l = l - 1
l = 0?
calculation of the empirical
PMF for time scale l
deconvolve the PMF of the
previous scale from this PMF
calculation of the equivalent
dMMPP parameters
Yes
No
( )
( )
( )
( )
( )
( ) k k k
L 2 1
D ,..., D , D
( )
( ) k
1
D
) ( ) (
, w
l
i
l
i

( ) ( ) ( ) x p x p x f
l l l
p
) 1 ( 1 ) ( ) (

+
=
( ) x f
l
p
) (

( ) x p
l ) (

( ) ( ) l l
P ,
P ,
( ) x f
L) (
shift the empirical PMF of time scale l
( ) x f
l ) (

( ) ( ) { } ( ) ( ) 0

: min , 0 min

) ( ) ( ) (
> + = x f x x f x f
l
p
l
p
l
Fig. 2 Flow diagram of the inference procedure.
h
(l)
(i+1)
(x) =

f
(l)
(x)
i

j=1
w
(l)
ji
g

(l)
j
(x). (18)
The algorithm stops when the maximum of h
(l)
(i)
(x) is lower
than a pre-dened percentage of the maximum of

f
(l)
(x)
and r
l
, the number of states of the dMMPP, is made equal to
i.
Note that the number of states of each dMMPP depends
on the level of accuracy employed in the approximation of
the empirical PMF by the weighted sum of Poisson proba-
bility functions.
After r
l
has been determined, the parameters
(l)
j
and

(l)
j
, j = 1, 2, . . . , r
l
, of the r
l
dMMPP, are set equal to

(l)
j
= w
(l)
jr
l
and
(l)
j
=
(l)
j
. (19)
The next step is to associate, for each time scale l, one
of the dMMPP
(l)
states with each time slot of the arriving
process. Recall the data sequences aggregated at time scale
l have a
l1
successive equal values called l-sequences. The
state assignment process considers only the rst time inter-
val of each l-sequence, dened by i = a
l1
(k 1) +1, k
i=1
( ) { } [ ] x h h
l
i
l
i
l
i
) (
) (
1
) (
) (
) (
max

=
( ) ( ) x f x h
l l ) ( ) (
) 1 (

=
( )
( )
( ) ( )

=
+
=
i
j
l
ji
l l
i
x g w x f x h l
j
1
) ( ) (
) 1 ( ) (

{ } ( ) { } x f x h
l l
i
) ( ) (
) 1 (

max ) ( max
+
i=i+1
No
Yes
i M =
i m g w f
i
j
l
m
l
ji
l
m
l
l
j
,..., 1 , ) ( ) (

1
) ( ) ( ) ( ) (
) ( = =

) ( ) (
and compute
l l

Fig. 3 Algorithm for calculating the number of states and the Poisson
arrival rates of the dMMPP
(l)
.
IN, i N. The state that is assigned to l-sequence i
is calculated randomly according to the probability vector

(l)
(i) =
_

(l)
1
(i) , . . . ,
(l)
r
l
(i)
_
, with

(l)
i
(n) =
g

(l)
i
_
D
l
(i)
_

r
l
j=1
g

(l)
j
(D
l
(i))
, (20)
where n = 1, ..., r
l
, and g

(y) represents a Poisson prob-


ability distribution function with mean . The elements of
this vector represent the probability that the state j had orig-
inated the number of arrivals D
(l)
(k) at time slot k from
time scale l.
After this step, we infer the dMPPP
(l)
transition proba-
bilities, p
(l)
ij
, i, j = 1, ..., r
l
, by counting the number of tran-
sitions between each pair of states. If n
(l)
ij
represents the
number of transitions from state i to state j, corresponding
to the dMPPP
(l)
, then
p
(l)
ij
=
n
(l)
ij

r
l
m=1
n
(l)
mj
, j = 1, ..., r
l
(21)
The transition probability and the Poisson arrival rate
matrices of the dMPPP
(l)
are then given by
P
(l)
=
_

_
p
(l)
11
p
(l)
12
. . . p
(l)
1r
l
p
(l)
21
p
(l)
22
. . . p
(l)
2r
l
. . . . . . . . . . . .
p
(l)
r
l
1
p
(l)
r
l
2
. . . p
(l)
r
l
r
l
_

_
(22)
and

(l)
=
_

(l)
1
0 . . . 0
0
(l)
2
. . . 0
. . . . . . . . . . . .
0 0 . . .
(l)
r
l
_

_
(23)
with l = 1, ..., L.
4.4 Construction of the nal dMMPP model
The nal dMMPP process is constructed using equations
(11) and (12), where the matrices
(l)
and P
(l)
, l = 1, ..., L,
were calculated in the last subsection. However, the ad-
ditional factors introduced in sub-section 4.2 must be re-
moved. Thus,
=
L1

l=1
e
(l)
I (24)
where I is the identity matrix.
5. Measured trafc traces
The evaluation of tting procedure will be based on three
measured trafc traces. Two traces are of aggregated IP traf-
c: (i) the well-known and publicly available pOct LAN
trace from Bellcore [1] and (ii) a trace corresponding to
the downstream Internet access trafc of approximately 65
simultaneous users, measured at the access link of a Por-
tuguese ISP (to an ADSL network). The third trace is from
the Kazaa application (10 simultaneous users) and was also
measured at the above access link. We have included this
application given the fact that an increasing percentage of
the overall Internet trafc belongs to peer-to-peer protocols
of the same type as Kazaa. For all our measurements, the
trafc analyzer was a 1.2 GHz AMD Athlon PC, with 1.5
Gbytes of RAM and running WinDump; we recorded the
arrival instant and the IP header of each packet. No packet
drops were reported by WinDump in all measurements. The
main characteristics of the selected traces are described in
Table 1.
All traces exhibit self-similar behavior. For example,
taking the case of trace pOct, the analysis of its autocovari-
ance function (Figure 4) lead us to suspect that it exhibits
LRD, due to the slow decay for large time lags. This is
conrmed by the scaling analysis, since the y
j
values in the
Logscale Diagram, are aligned between a medium octave
(7) and octave 14, the highest one present in data (Figure 5).
Recall that the y
j
values are logarithms of the variance esti-
mates of the discrete wavelet transform coefcients that rep-
resent the trafc process. A similar analysis was made for
the other traces, also revealing the same self-similar (LRD)
behavior.
10
-1
10
0
10
1
10
2
100
150
200
250
300
350
400
450
sec
a
u
t
o
c
o
v
a
r
ia
n
c
e
Fig. 4 Autocovariance of packet counts, trace pOct.
2 4 6 8 10 12 14
4
6
8
10
12
14
Octave j
y
j
Fig. 5 Second order Logscale Diagram, trace pOct.
6. Numerical Results
The suitability of the proposed dMMPP tting procedure is
assessed using several criteria: (i) comparing the Hurst pa-
rameters of the original and synthesized (from the inferred
dMMPP) data traces; (ii) comparing the probability func-
tions of the average number of arrivals in different time
scales, obtained for the original and synthesized traces and
(iii) comparing the queuing behavior, in terms of packet loss
ratio (PLR) and average waiting time in queue (AWT), of
the original and synthesized traces, using trace-driven simu-
lation. We have resorted to simulation since, in our case, the
packet processing times are not necessarily multiples of the
sampling interval, as would be required for using the avail-
able theoretical results of dMMPPs.
All simulations were carried out using a xed packet
length equal to the mean packet length of the trace. For
all traces, the sampling interval of the counting process was
chosen to be 0.1s and three different time scales were con-
sidered: 0.1s, 0.2s and 0.4s. For each trace, the estimation
procedure took less than 1 minute, using a MATLAB imple-
mentation running in the PC described above, which shows
that the procedure is computationally very efcient.
In order to verify that the proposed tting approach
captures the self-similar behavior, we compare in Table 2
Trace Capture period Trace size Mean rate Mean pkt size
name (pkts) (byte/s) (bytes)
October Bellcore trace 1 million 322790 568
ISP 10.26pm to 10.49pm, October 18
th
2002 1 million 583470 797
Kaaza 10.26pm to 11.31pm, October 18
th
2002 0.5 million 131140 1029
Table 1 Main characteristics of measured traces.
Trace original tted
October 0.941 (6,12) 0.962 (6,12)
ISP 0.745 (6,13) 0.784 (4,13)
Kazaa 0.783 (6,12) 0.773 (2,12)
Table 2 Comparison between Hurst parameter values
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0.2
0
0.2
0.4
0.6
0.8
1
original traffic quantiles
f
it
t
e
d

t
r
a
f
f
ic

q
u
a
n
t
ile
s
Fig. 6 Q-Q plot of the cumulative probability function at the smallest
time scale, trace pOct.
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0.2
0
0.2
0.4
0.6
0.8
1
original traffic quantiles
f
it
t
e
d

t
r
a
f
f
ic

q
u
a
n
t
ile
s
Fig. 7 Q-Q plot of the cumulative probability function at the intermedi-
ate time scale, trace pOct.
the Hurst parameters estimated for the original and tted
trafc, for each one of the three selected data traces. Table 2
also includes the range of time scales where the y
j
follow a
straight line, inside brackets near to the corresponding Hurst
parameter value. There is a very good agreement between
the Hurst parameter values of the original and tted trafc,
so LRD behavior is indeed well captured by our model.
The next evaluation criteria is based on the comparison
between the PMFs of the original and tted traces, for dif-
ferent time scales. Starting with trace pOct, it can be seen
in Figures 6, 7, 8, 9, 10, 11 and 12 that there is a good
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0.2
0
0.2
0.4
0.6
0.8
1
1.2
original traffic quantiles
fitte
d
tr
a
ffic
q
u
a
n
tile
s
Fig. 8 Q-Q plot of the cumulative probability function at the largest time
scale, trace pOct.
pkts/sec
0 200 400 600 800 1000 1200 1400
p
r
o
b
a
b
i
l
i
t
y
0.000
0.005
0.010
0.015
0.020
0.025
original traffic
fitted traffic
Fig. 9 Probability mass function at the smallest time scale, trace pOct.
pkts/sec
0 200 400 600 800 1000 1200 1400
p
r
o
b
a
b
i
l
i
t
y
0.000
0.002
0.004
0.006
0.008
0.010
0.012
0.014
original traffic
fitted traffic
Fig. 10 Probability mass function at the intermediate time scale, trace
pOct.
agreement between the probability functions of the original
and tted traces, for all time scales. We used three differ-
ent types of graphical comparison to show the accuracy of
pkts/sec
0 200 400 600 800 1000 1200 1400
p
r
o
b
a
b
i
l
i
t
y
0.000
0.002
0.004
0.006
0.008
0.010
original traffic
fitted traffic
Fig. 11 Probability mass function at the largest time scale, trace pOct.
0 200 400 600 800 1000 1200 1400
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
pkts/sec
c
u
m
u
la
t
iv
e

p
r
o
b
a
b
ilit
y
original traffic
fitted traffic
Fig. 12 Cumulative probability function at the largest time scale, trace
pOct.
Fig. 13 Probability mass function at the smallest time scale, trace ISP.
the inference procedure in the tting of the original trace
PMF: PMF plot, QQ-plot and CDF plot. Recall that the t-
ting procedure explicitly aimed at matching the PMF at the
various time scales, so the obtained results conrm that the
procedure is effective in performing this task. Due to space
limitations, for the case of the ISP and Kazaa traces we only
show the comparison between the probability functions at
the smallest time scale, in Figures 13 and 14. However, a
good agreement also exists at the three times scales.
Considering now the queuing behavior, we compare
the PLR and the AWT obtained, through trace-driven simu-
lation, with the original and tted traces. Two different sets
Fig. 14 Probability mass function at the smallest time scale, trace Kazaa.
queue length (byte)
1e+3 1e+4 1e+5 1e+6 1e+7 1e+8
p
a
c
k
e
t

lo
s
s

r
a
t
io
1e-7
1e-6
1e-5
1e-4
1e-3
1e-2
1e-1
1e+0
original traffic (=0.6)
fitted traffic (=0.6)
original traffic (=0.7)
fitted traffic (=0.7)
original traffic (=1.0)
fitted traffic (=1.0)
original traffic (=1.2)
fitted traffic (=1.2)
Fig. 15 Packet loss ratio, trace pOct.
of utilization ratios were used in the simulations: for traces
pOct and Kazaa, we used = 0.6 and = 0.7 and for trace
ISP the selected values were = 0.8 and = 0.9. This
is due to the lower burstiness of the ISP trafc, which leads
to lower packet losses for the same link utilization. Besides
these utilization ratio values, we have also run simulations
with = 1.0 and = 1.2, for all traces, in order to as-
sess how the generated dMMPPs capture the performance
at high utilization ratio values.
From gures 15 and 16 it can be seen that, for trace
pOct, the PLR is very well approximated by the tted
dMMPP for the lower utilization ratios ( = 0.6 and =
0.7) but the accuracy degrades at higher utilization ratios.
The same behavior occurs for the AWT, but generally the
agreement of the AWT curves is less accurate than that of
the PLR curves, even for low utilization ratios. For trace ISP
the results are illustrated in gures 17 and 18. The agree-
ment between the curves corresponding to the original and
tted traces is good for both performance metrics and for all
utilization ratio values, specially for = 0.6 and = 0.7.
For trace Kazaa the results are depicted in gures 19 and 20.
In this case, the agreement between the curves is only good
for the = 0.6 and = 0.7 utilization ratios, beginning
to degrade as the utilization ratio increases. This behavior
is similar to the one observed for trace pOct. Thus, in gen-
eral, the results are good except for the highest utilization
ratios where some deviations can occur. At these values,
queue length (byte)
1e+4 1e+5 1e+6 1e+7 1e+8
a
v
e
r
a
g
e

w
a
it
in
g

t
im
e

(
s
)
0.001
0.01
0.1
1
10
100
1000
original traffic (=0.6)
fitted traffic (=0.6)
original traffic (=0.7)
fitted traffic (=0.7)
original traffic (=1.0)
fitted traffic (=1.0)
original traffic (=1.2)
fitted traffic (=1.2)
Fig. 16 Average waiting time in queue, trace pOct.
buffer length (byte)
1e+2 1e+3 1e+4 1e+5 1e+6 1e+7 1e+8
p
a
c
k
e
t

lo
s
s

r
a
t
io
1e-6
1e-5
1e-4
1e-3
1e-2
1e-1
original traffic (=0.8)
fitted traffic (=0.8)
original traffic (=0.9)
fitted traffic (=0.9)
original traffic (=1.0)
fitted traffic (=1.0)
original traffic (=1.2)
fitted traffic (=1.2)
Fig. 17 Packet loss ratio, trace ISP.
buffer length (byte)
1e+3 1e+4 1e+5 1e+6 1e+7 1e+8
a
v
e
r
a
g
e

w
a
it
in
g

t
im
e

(
s
)
0.001
0.01
0.1
1
10
100
original traffic (=0.8)
fitted traffic (=0.8)
original traffic (=0.9)
fitted traffic (=0.9)
original traffic (=1.0)
fitted traffic (=1.0)
original traffic (=1.2)
fitted traffic (=1.2)
Fig. 18 Average waiting time in queue, trace ISP.
the buffer occupancies are very high and the sensitivity of
the results becomes larger; however, the packet losses and
waiting times are very high, even unrealistic for operational
networks.
As a nal remark, we can say that the proposed t-
ting approach provides a close match of the Hurst parame-
ters and probability mass functions at each time scale, and
this agreement reveals itself sufcient to drive a good queu-
ing performance in terms of packet loss ratio and average
waiting time in queue. The computational complexity of the
tting method is also very small.
queue length (byte)
1e+1 1e+2 1e+3 1e+4 1e+5 1e+6 1e+7 1e+8
p
a
c
k
e
t

lo
s
s

r
a
t
io
1e-6
1e-5
1e-4
1e-3
1e-2
1e-1
1e+0
original traffic (=0.6)
fitted traffic (=0.6)
original traffic (=0.7)
fitted traffic (=0.7)
original traffic (=1.0)
fitted traffic (=1.0)
original traffic (=1.2)
fitted traffic (=1.2)
Fig. 19 Packet loss ratio, trace Kazaa.
queue length (byte)
1e+3 1e+4 1e+5 1e+6 1e+7 1e+8
a
v
e
r
a
g
e

w
a
it
in
g

t
im
e

(
s
)
0.001
0.01
0.1
1
10
100
original traffic (=0.6)
fitted traffic (=0.6)
original traffic (=0.7)
fitted traffic (=0.7)
original traffic (=1.0)
fitted traffic (=1.0)
original traffic (=1.2)
fitted traffic (=1.2)
Fig. 20 Average waiting time in queue, trace Kazaa.
7. Conclusions
We have proposed a new parameter tting procedure for
a superposition of Markov Modulated Poisson Processes
(MMPPs), which is able to capture self-similarity over a
range of time scales. The tting procedure matches the com-
plete distribution of the arrival process at each time scale of
interest. We evaluated the procedure by comparing the Hurst
parameter, the probability mass function at each time scale,
and the queuing behavior (as assessed by the loss probabil-
ity and average waiting time), corresponding to measured
trafc traces and to traces synthesized according to the pro-
posed model. Three measured trafc traces were consid-
ered, all exhibiting self-similar behavior: the well-known
pOct Bellcore trace, a trace of aggregated IP WAN trafc,
and a trace corresponding to the popular le sharing appli-
cation Kazaa. Our results show that the proposed tting pro-
cedure is able to match closely the distribution over the time
scales present in data, leading to an accurate prediction of
the queuing behavior.
Acknowledgement
The authors would like to thank the anonymous referees for
their valuables comments and suggestions. This research
was supported in part by Fundac ao para a Ci encia e a Tec-
nologia, the project POSI/42069/CPS/2001, and the grant
BD/19781/99.
References
[1] W. Leland, M. Taqqu, W. Willinger, and D. Wilson, On the self-
similar nature of Ethernet trafc (extended version), IEEE/ACM
Transactions on Networking, vol. 2, no. 1, pp. 115, Feb. 1994.
[2] J. Beran, R. Sherman, M. Taqqu, and W. Willinger, Long-range
dependence in variable-bit rate video trafc, IEEE Transactions on
Communications, vol. 43, no. 2/3/4, pp. 15661579, 1995.
[3] M. Crovella and A. Bestavros, Self-similarity in World Wide Web
trafc: Evidence and possible causes, IEEE/ACM Transactions on
Networking, vol. 5, no. 6, pp. 835846, Dec. 1997.
[4] V. Paxson and S. Floyd, Wide-area trafc: The failure of Poisson
modeling, IEEE/ACM Transactions on Networking, vol. 3, no. 3,
pp. 226244, June 1995.
[5] B. Ryu and A. Elwalid, The importance of long-range dependence
of VBR video trafc in ATM trafc engineering: Myths and real-
ities, ACM Computer Communication Review, vol. 26, pp. 314,
Oct. 1996.
[6] W. Willinger, M. Taqqu, R. Sherman, and D. Wilson, Self-similarity
through high-variability: Statistical analysis of Ethernet LAN trafc
at the source level, IEEE/ACM Transactions on Networking, vol. 5,
no. 1, pp. 7186, Feb. 1997.
[7] W. Willinger, V. Paxson, and M. Taqqu, Self-similarity and Heavy
Tails: Structural Modeling of Network Trafc, A Practical Guide
to Heavy Tails: Statistical Techniques and Applications. Birkhauser,
1998.
[8] D. P. Heyman and T. V. Lakshman, What are the implications
of long range dependence for VBR video trafc engineering?,
IEEE/ACM Transactions on Networking, vol. 4, no. 3, pp. 301317,
June 1996.
[9] A. Neidhardt and J. Wang, The concept of relevant time scales and
its application to queuing analysis of self-similar trafc, in Pro-
ceedings of SIGMETRICS1998/PERFORMANCE1998, 1998, pp.
222232.
[10] M. Grossglauser and J. C. Bolot, On the relevance of long-range
dependence in network trafc, IEEE/ACM Transactions on Net-
working, vol. 7, no. 5, pp. 629640, Oct. 1999.
[11] A. Nogueira and R. Valadas, Analyzing the relevant time scales in
a network of queues, in SPIEs International Symposium ITCOM
2001, Aug. 2001.
[12] K. Meier-Hellstern, Atting algorithmfor Markov-modulated Pois-
son process having two arrival rates, European Journal of Opera-
tional Research, vol. 29, 1987.
[13] P. Skelly, M. Schwartz, and S. Dixit, A histogram-based model for
video trafc behaviour in an ATM multiplexer, IEEE/ACM Trans-
actions on Networking, pp. 446458, Aug. 1993.
[14] R. Gr unenfelder and S. Robert, Which arrival law parameters are
decisive for queueing system performance, in International Tele-
trafc Congress, ITC 14, 1994.
[15] S. Kang and D Sung, Two-state MMPP modelling of ATM super-
posed trafc streams based on the characterisation of correlated in-
terarrival times, IEEE GLOBECOM95, pp. 14221426, Nov. 1995.
[16] A. Andersen and B. Nielsen, A Markovian approach for modeling
packet trafc with long-range dependence, IEEE Journal on Se-
lected Areas in Communications, vol. 16, no. 5, pp. 719732, June
1998.
[17] S. Li and C. Hwang, On the convergence of trafc measurement and
queuing analysis: A statistical-match and queuing (SMAQ) tool,
IEEE/ACM Transactions on Networking, pp. 95110, Feb. 1997.
[18] C. Nunes and A. Pacheco, Parametric estimation in MMPP(2) using
time discretization, Proceedings of the 2nd Internation Symposium
on Semi-Markov Models: Theory and Applications, Dec. 1998.
[19] T. Yoshihara, S. Kasahara, and Y. Takahashi, Practical time-scale
tting of self-similar trafc with Markov-modulated Poisson pro-
cess, Telecommunication Systems, vol. 17, no. 1-2, pp. 185211,
2001.
[20] P. Salvador and R. Valadas, A tting procedure for Markov modu-
lated Poisson processes with an adaptive number of states, in Pro-
ceedings of the 9th IFIP Working Conference on Performance Mod-
elling and Evaluation of ATM & IP Networks, June 2001.
[21] P. Salvador, A. Pacheco, and R. Valadas, Multiscale tting proce-
dure using Markov modulated Poisson processes, Telecommunica-
tions Systems, vol. 23, no. 1-2, pp. 123148, June 2003.
[22] S. Kasahara, Internet trafc modeling: Markovian approach to self-
similar trafc and prediction of loss probability for nite queues, IE-
ICE Transactions on Communications, vol. E84-B, no. 8, pp. 2134
2141, 2001.
[23] D. Veitch and P. Abry, A wavelet based joint estimator for the pa-
rameters of LRD, IEEE Transactions on Information Theory, vol.
45, no. 3, Apr. 1999.
[24] C. Blondia and O. Casals, Statistical multiplexing of VBR sources:
A matrix-analytic approach, Performance Evaluation, vol. 16, no.
1-3, pp. 520, 1992.
[25] C. Blondia, A discrete-time batch Markovian arrival process as
B-ISDN trafc model, Belgian Journal of Operations Research,
Statistics and Computer Science, vol. 32, pp. 323, 1993.
[26] A.S. Alfa and S. Chakravarthy, A discrete queue with the Marko-
vian arrival process and phase type primary and secondary services,
Stochastic Models, vol. 10, no. 2, pp. 437451, 1994.
[27] N. Rananand, Markov approximations to D-BMAPs: information-
theoretic bounds on queueing performance, Stochastic Models, vol.
11, no. 4, pp. 713734, 1995.
[28] F. Geerts and C. Blondia, Superposition of Markov sources and
long range dependence, in Information Network and Data Commu-
nications: Proceedings of the IFIP/ICCC International Conference
on Information Network and Data Communication, F.A. Aagesen,
H. Botnevik, and D. Khakhar, Eds. Chapman & Hall, 1996.
[29] I. Frigui, A.S. Alfa, and X. Xu, Algorithms for computing wait-
ing time distributions under different queue disciplines for the D-
BMAP/PH/1, Naval Research Logistics, vol. 44, no. 6, pp. 559576,
1997.
[30] A.S. Alfa, Discrete time analysis of a MAP/PH/1 vacation queue
with gated time-limited service, Queueing Systems, vol. 29, pp. 35
54, 1998.
[31] Ad Ridder, Fast simulation of discrete time queues with Markov
modulated batch arrivals and batch departures, AEU International
Journal of Electronics and Communications, vol. 52, pp. 127132,
1998.
[32] C. Herrmann, The complete analysis of the discrete time nite
DBMAP/G/1/N queue, Performance Evaluation, vol. 43, pp. 95
121, 2001.
[33] N.K. Kim, S.H. Chang, and K.C. Chae, On the relationships
among queue lengths at arrival, departure, and random epochs in the
discrete-time queue with D-BMAP arrivals, Operations Research
Letters, vol. 30, no. 6, pp. 2532, 2002.
Ant onio Nogueira was born in Mozam-
bique, on March 12, 1969. He graduated in
Electronics and Telecommunications Engineer-
ing in 1992 and received the MSc. degree in
Electronics and Telecommunications Engineer-
ing in 1997, both from University of Aveiro,
Aveiro, Portugal. In 1995, he joined Institute
of Telecommunications - pole of Aveiro as a re-
searcher. He is currently an Assistant and a PhD.
student at University of Aveiro. His main re-
search areas of interest are trafc modeling and
network planning.
Paulo Salvador was born in Coimbra,
Portugal, on February 17, 1975. He gradu-
ated in Electronics and Telecommunications En-
gineering in 1998 from University of Aveiro,
Aveiro, Portugal. In 1998, he joined Institute of
Telecommunications, Aveiro, Portugal, as a re-
searcher. He is currently a PhD. student at Uni-
versity of Aveiro. His main research areas of
interest are trafc modeling, network planning
and Internet security.
Rui Valadas graduated in Electrical and
Computers Engineering, in 1986, from the In-
stituto Superior T ecnico, Lisbon, Portugal, and
received the Ph.D. degree from the University of
Aveiro, Aveiro, Portugal, in 1996. He joined the
University of Aveiro in 1986, where he is now
an Associate Professor of the Department of
Electronics and Telecommunications. He is also
leader of the Networks and Multimedia Com-
munications Group at the Institute of Telecom-
munications, Aveiro, Portugal. His main re-
search interests are in the area of trafc engineering.
Ant onio Pacheco graduated in probabil-
ity and statistics at Faculdade de Ci encias, Uni-
versity of Lisbon, Portugal, in 1987, received
the M.Sc. degree in applied mathematics from
the IST (Instituto Superior T ecnico, Technical
University of Lisbon, Portugal) in 1991, and ob-
tained the Ph.D. degree in operations research at
Cornell University in 1994. He has been work-
ing at IST since 1987, where he is now Asso-
ciate Professor at the Mathematics Department,
and has been a full member of the Center for
Mathematics and its Applications from IST since 1994. He has been teach-
ing courses in stochastic processes, time series and engineering statistics
at IST. His current research interests include stochastic processes, queue-
ing theory, telecommunications networks, stochastic process control and
stochastic automata. He is a member of INFORMS, SPE and AB-FLAD.

You might also like